Shape from Shading: A Survey
Ruo Zhang, Ping-Sing Tsai, James Edwin Cryer, and Mubarak Shah

Abstract: Since the first shape-from-shading (SFS) technique was developed by Horn in the early 1970s, many different approaches have emerged. In this paper, six well-known SFS algorithms are implemented and compared. The performance of the algorithms was analyzed on synthetic images using the mean and standard deviation of the depth (Z) error, the mean of the surface gradient (p, q) error, and CPU timing. Each algorithm works well for certain images, but performs poorly for others. In general, minimization approaches are more robust, while the other approaches are faster. The implementation of these algorithms in C and the images used in this paper are available by anonymous ftp under the pub/tech_paper/survey directory at eustis.cs.ucf.edu (132.170.108.42). These are also part of the electronic version of the paper.
Index Terms: Shape from shading, analysis of algorithms, Lambertian model, survey of shape from shading algorithms.
1 INTRODUCTION
Shape recovery is a classic problem in computer vision. The goal is to derive a 3D scene description from one or more 2D images. The recovered shape can be expressed in several ways: depth $Z(x, y)$, surface normal $(n_x, n_y, n_z)$, surface gradient $(p, q)$, and surface slant, $\phi$, and tilt, $\theta$. The depth can be considered either as the relative distance from the camera to surface points, or as the relative surface height above the x-y plane. The surface normal is the orientation of a vector perpendicular to the tangent plane on the object surface. The surface gradient, $(p, q) = (\partial z / \partial x, \partial z / \partial y)$, is the rate of change of depth in the x and y directions. The surface slant, $\phi$, and tilt, $\theta$, are related to the surface normal as $(n_x, n_y, n_z) = (l \sin\phi \cos\theta, l \sin\phi \sin\theta, l \cos\phi)$, where $l$ is the magnitude of the surface normal.
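Since these representations are used throughout the paper, the following minimal sketch (ours, not from the paper) shows how a gradient estimate $(p, q)$ can be converted to a unit surface normal and to slant and tilt angles; the sign convention $\vec{N} \propto (-p, -q, 1)$ is an assumption of the sketch.

import numpy as np

def gradient_to_normal(p, q):
    """Unit normal of the tangent plane for a gradient (p, q) = (dz/dx, dz/dy)."""
    n = np.array([-p, -q, 1.0])
    return n / np.linalg.norm(n)

def normal_to_slant_tilt(n):
    """Slant: angle between the normal and the viewing (z) axis; tilt: its image-plane angle."""
    slant = np.arccos(n[2])           # n is assumed to be a unit vector
    tilt = np.arctan2(n[1], n[0])
    return slant, tilt

# Example: a plane rising along x with dz/dx = 1 and dz/dy = 0.
n = gradient_to_normal(1.0, 0.0)
print(n, normal_to_slant_tilt(n))     # slant = pi/4, tilt = pi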
In computer vision, the techniques to recover shape are called shape-from-X techniques, where X can be shading, stereo, motion, texture, etc. Shape-from-shading (SFS) deals with the recovery of shape from a gradual variation of shading in the image. Artists have long exploited lighting and shading to convey vivid illusions of depth in paintings. To solve the SFS problem, it is important to study how the images are formed. A simple model of image formation is the Lambertian model, in which the gray level at a pixel in the image depends on the light source direction and the surface normal. In SFS, given a gray level image, the aim is to recover the light source and the surface shape at each pixel in the image. However, real images do not always follow the Lambertian model. Even if we assume Lambertian reflectance and a known light source direction, and if the brightness can be described as a function of surface shape and light source direction, the problem is still not simple. This is because, if the surface shape is described in terms of the surface normal, we have a linear equation with three unknowns, and if the surface shape is described in terms of the surface gradient, we have a nonlinear equation with two unknowns. Therefore, finding a unique solution to SFS is difficult; it requires additional constraints.
This paper is about the comparison and performance analysis of SFS techniques. We have implemented six well-known SFS algorithms and compared them in terms of timing (CPU time) and accuracy (mean and standard deviation of the depth error and mean of the surface gradient error) in order to analyze the advantages and disadvantages of these approaches. Two synthetic images, each with two different light sources, and three real images were used in our experiments. We found that none of the algorithms has consistent performance for all images, since they work well for certain images but perform poorly for others. In general, minimization approaches are more robust, while the other approaches are faster. The implementation of these algorithms in C and the images used in this paper are available by anonymous ftp.
2 LITERATURE REVIEW
Shading plays an important role in human perception of surface shape. Researchers in human vision have attempted to understand and simulate the mechanisms by which our eyes and brains actually use the shading information to recover 3D shapes. Ramachandran [45] demonstrated that the brain recovers shape information not only from shading, but also from outlines, elementary features, and the visual system's knowledge of objects. The extraction of SFS by the visual system is also strongly affected by stereoscopic processing. Barrow and Tenenbaum discovered that it is the line drawing of the shading pattern that seems to play a central role in the interpretation of shaded patterns [2]. Mingolla and Todd's study of the human visual system based on the perception of solid shape [30] indicated that the traditional assumptions in SFS (Lambertian
reflectance, known light source direction, and local shape recovery) are not valid from a psychological point of view. One can observe from the above discussion that the human visual system uses SFS differently than computer vision normally does.
Recently, Horn et al. [19] discovered that some impossibly shaded images exist, which could not be shading images of any smooth surface under the assumption of uniform reflectance properties and lighting. For this kind of image, SFS will not provide a correct solution, so it is necessary to detect impossibly shaded images.
SFS techniques can be divided into four groups: minimization approaches, propagation approaches, local approaches, and linear approaches (see Fig. 1). Minimization approaches obtain the solution by minimizing an energy function. Propagation approaches propagate the shape information from a set of surface points (e.g., singular points) to the whole image. Local approaches derive shape based on the assumption of surface type. Linear approaches compute the solution based on the linearization of the reflectance map.
2.1 Minimization Approaches
One of the earlier minimization approaches, which recovered the surface gradients, was by Ikeuchi and Horn [21]. Since each surface point has two unknowns for the surface gradient and each pixel in the image provides one gray value, we have an underdetermined system. To overcome this, they introduced two constraints: the brightness constraint and the smoothness constraint. The brightness constraint requires that the reconstructed shape produce the same brightness as the input image at each surface point, while the smoothness constraint ensures a smooth surface reconstruction. The shape was computed by minimizing an energy function which consists of the above two constraints. To ensure a correct convergence, the shape at the occluding boundary was given for the initialization. Since the gradient at the occluding boundary has at least one infinite component, stereographic projection was used to transform the error function to a different space. Also using these two constraints, Brooks and Horn [5] minimized the same energy function in terms of the surface normal. Frankot and Chellappa [11] enforced integrability in Brooks and Horn's algorithm in order to recover integrable surfaces (surfaces for which $z_{xy} = z_{yx}$). Surface slope estimates from the iterative scheme were expressed in terms of a linear combination of a finite set of orthogonal Fourier basis functions. The enforcement of integrability was done by projecting the nonintegrable surface slope estimates onto the nearest (in terms of distance) integrable surface slopes. This projection was accomplished by finding the closest set of coefficients which satisfy integrability in the linear combination. Their results showed improvements in both accuracy and efficiency over Brooks and Horn's algorithm [5]. Later, Horn [18] also replaced the smoothness constraint in his approach with an integrability constraint. The major problem with Horn's method is its slow convergence. Szeliski [48] sped it up using a hierarchical basis preconditioned conjugate gradient descent algorithm. Based on the geometrical interpretation of Brooks and Horn's algorithm, Vega and Yang [51] applied heuristics to the variational approach in an attempt to improve the stability of Brooks and Horn's algorithm.
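To make the projection step concrete, here is a minimal Fourier-domain sketch in the spirit of Frankot and Chellappa's integrability enforcement; it assumes periodic boundaries and uses the discrete Fourier basis, and the function name and interface are ours rather than the authors'.

import numpy as np

def integrable_projection(p, q):
    """Project a (possibly nonintegrable) gradient field (p, q) onto the
    nearest integrable surface using a Fourier basis; returns a depth map
    defined up to an additive constant."""
    rows, cols = p.shape
    wx = 2.0 * np.pi * np.fft.fftfreq(cols)      # angular frequencies along x
    wy = 2.0 * np.pi * np.fft.fftfreq(rows)      # angular frequencies along y
    u, v = np.meshgrid(wx, wy)                   # u varies along columns, v along rows

    P, Q = np.fft.fft2(p), np.fft.fft2(q)
    denom = u ** 2 + v ** 2
    denom[0, 0] = 1.0                            # avoid dividing by zero at the DC term
    Z = (-1j * u * P - 1j * v * Q) / denom
    Z[0, 0] = 0.0                                # the mean height is unconstrained
    return np.real(np.fft.ifft2(Z))

Feeding an already integrable field (p = Z_x, q = Z_y) through this projection returns the original surface up to an additive constant, which is one way to sanity-check such an implementation.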
Instead of the smoothness constraint, Zheng and Chellappa [54] introduced an intensity gradient constraint, which specifies that the intensity gradients of the reconstructed image and the input image are close to each other in both the x and y directions.
All of the above techniques use variational calculus. Leclerc and Bobick [25] solved directly for depth by using a discrete formulation and employing a conjugate gradient technique. The brightness constraint and smoothness constraint were applied to ensure convergence, and a stereo depth map was used as an initial estimate. Recently, Lee and Kuo [28] also proposed an approach to recover depth using the brightness and smoothness constraints. They approximated surfaces by a union of triangular patches. This approach did not require depth initialization. The approaches described so far deal with a single smooth surface. Malik and Maydan [29] developed a solution for piecewise smooth surfaces. They combined the line drawing and shading constraints in an energy function and recovered both the surface normal and line labeling through the minimization of the energy function.

Fig. 1. Four categories of shape from shading approaches. The number in parentheses beside the authors' names is the year of publication.
2.2 Propagation Approaches
Horn's characteristic strip method [17] is essentially a propagation method. A characteristic strip is a line in the image along which the surface depth and orientation can be computed if these quantities are known at the starting point of the line. Horn's method constructs initial surface curves around the neighborhoods of singular points (singular points are the points with maximum intensity) using a spherical approximation. The shape information is propagated simultaneously along the characteristic strips outward, assuming no crossover of adjacent strips. The direction of the characteristic strips is identified as the direction of the intensity gradients. In order to get a dense shape map, new strips have to be interpolated when neighboring strips are not close to each other.
Rouy and Tourin [46] presented a solution to SFS based on Hamilton-Jacobi-Bellman equations and viscosity solution theories in order to obtain a unique solution. A link between viscosity solutions and optimal control theories was given via dynamic programming. Moreover, conditions for the existence of both continuous and smooth solutions were provided.
Oliensis [33] observed that the surface shape can be reconstructed from singular points instead of the occluding boundary. Based on this idea, Dupuis and Oliensis [9], [34] formulated SFS as an optimal control problem and solved it using numerical methods. Bichsel and Pentland [3] simplified Dupuis and Oliensis's approach and proposed a minimum downhill approach for SFS which converged in less than 10 iterations.
Similar to Horn's and Dupuis and Oliensis's approaches, Kimmel and Bruckstein [23] reconstructed the surface through layers of equal height contours from an initial closed curve. Their method applied techniques from differential geometry, fluid dynamics, and numerical analysis, which enabled good recovery of nonsmooth surfaces. The algorithm used a closed curve in the areas of singular points for initialization.
2.3 Local Approaches
Pentland's local approach [37] recovered shape information from the intensity and its first and second derivatives. He used the assumption that the surface is locally spherical at each point. Under the same spherical assumption, Lee and Rosenfeld [26] computed the slant and tilt of the surface in the light source coordinate system using the first derivative of the intensity.
2.4 Linear Approaches
The approaches by Pentland and by Tsai and Shah are linear approaches, which linearize the reflectance map and solve for shape.
Pentland [38] used the linear approximation of the reflectance function in terms of the surface gradient and applied a Fourier transform to the linear function to get a closed-form solution for the depth at each point.
Tsai and Shah [50] applied the discrete approximation of the gradient first, then employed the linear approximation of the reflectance function in terms of the depth directly. Their algorithm recovered the depth at each point using a Jacobi iterative scheme.
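To illustrate the flavor of these linear approaches, the sketch below discretizes the gradient with backward differences and applies a pointwise Newton/Jacobi-style correction to the depth. It is a hedged illustration only: the name linear_sfs is ours, the derivative of the residual is estimated numerically instead of analytically, and it is not claimed to reproduce either published algorithm.

import numpy as np

def lambertian_R(p, q, s):
    """Lambertian reflectance for gradient (p, q) and a unit light direction s."""
    return (-s[0] * p - s[1] * q + s[2]) / np.sqrt(1.0 + p ** 2 + q ** 2)

def linear_sfs(I, s, iterations=100, eps=1e-4):
    """Toy linearized SFS: update depth pointwise so the rendered brightness
    approaches the image I (hypothetical sketch, not the published method)."""
    s = np.asarray(s, dtype=float)
    s = s / np.linalg.norm(s)
    Z = np.zeros_like(I, dtype=float)
    for _ in range(iterations):
        p = Z - np.roll(Z, 1, axis=1)        # backward difference for dZ/dx
        q = Z - np.roll(Z, 1, axis=0)        # backward difference for dZ/dy
        f = I - lambertian_R(p, q, s)        # brightness residual at each pixel
        # dp/dZ = dq/dZ = 1 at the pixel, so df/dZ is estimated by perturbing
        # p and q together (an analytic derivative is used in the literature).
        dfdZ = -(lambertian_R(p + eps, q + eps, s) - lambertian_R(p, q, s)) / eps
        step = np.where(np.abs(dfdZ) > 1e-6, f / dfdZ, 0.0)
        Z = Z - step                         # Newton/Jacobi-style pointwise update
    return Z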
2.5 Interreflections
None of the above methods deals with interreflections, the mutual illumination between surface facets.
Nayar et al. [31], [32] addressed the shape-from-interreflection problem using photometric stereo. They observed that the erroneous shape extracted by shape-from-photometric-stereo algorithms in the presence of interreflections was shallower than the real shape. Therefore, they proposed a method to iteratively refine the shape. A similar formulation of interreflection was also discussed by Forsyth and Zisserman [10].
2.6 Convergence, Uniqueness, and Existence
Little work has been done on proving the uniqueness or existence of a solution to SFS. The uniqueness of SFS can be proven under the condition that the light source direction is equal to, or symmetric around, the viewing direction [33]. With an initial known curve, the method of characteristic strips yields a unique solution if the first derivative of surface depth is continuous. For general cases, the uniqueness is unknown. However, Lee and Kuo [28] showed that, given the depth at a reference point, the addition of the smoothness constraint and successive linearization of the reflectance map (based on the local gradients obtained from the previous iteration) provides a unique solution for their approach in most cases. Blake et al. [4], [6] and Rouy and Tourin [46] also provided sufficient conditions for uniqueness. Their conditions involve either the singular points or the occluding boundary.
If we consider local uniqueness instead of global uniqueness over the entire image, the uniqueness of a solution can be easily determined at singular points and occluding boundaries, provided that the reflectance map is given. These are the points at which we can determine the surface orientation directly from the image brightness. The brightness pattern in any arbitrary region could arise from an infinite number of different surfaces. However, the information at singular points and at occluding boundaries can be used to constrain the possible solutions.
2.7 The Organization of the Paper
The organization of the remainder of this paper is as follows: The next section introduces background knowledge related to reflectance models. Section 4 describes the six SFS approaches studied in detail in this paper. Section 5 is devoted to the description of the synthetic and real images used in this study. We present our experimental results for the six different SFS algorithms in Section 6. The error analysis is presented in Section 7. Section 8 summarizes the CPU timing of all six algorithms for different images. Finally, conclusions and future research are covered in Section 9.
3 REFLECTANCE MODELS
Depending on their physical properties, surfaces can be categorized as pure Lambertian, pure specular, hybrid, or more sophisticated surfaces. In this section, we will describe the reflectance models and discuss their properties related to shape from shading.
3.1 Lambertian and Specular Reflectance Models
Lambertian surfaces are surfaces having only diffuse reflectance, i.e., surfaces which reflect light in all directions. The brightness of a Lambertian surface is proportional to the energy of the incident light. The amount of light energy falling on a surface element is proportional to the area of the surface element as seen from the light source position (the foreshortened area). The foreshortened area is a cosine function of the angle between the surface orientation and the light source direction. Therefore, the Lambertian surface can be modeled as the product of the strength of the light source $E$, the albedo of the surface $\rho$, and the foreshortened area $\cos\theta_i$, as follows:

$$ I_L = E \rho \cos\theta_i, \quad (1) $$

where $I_L$ serves as the reflectance map $R$ for a Lambertian surface and $\theta_i$ is the angle between the surface normal $\vec{N} = (n_x, n_y, n_z)$ and the source direction $\vec{S} = (s_x, s_y, s_z)$ (see Fig. 2). If we let the surface normal and the light source direction both be unit vectors, we can rewrite the above formula as:

$$ I_L = E \rho\, \vec{N} \cdot \vec{S}, \quad (2) $$

where $\cdot$ represents the dot product.
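For illustration, a minimal rendering of (1) and (2) follows (our own sketch; the array layout, the unit light vector, and the clamping of back-facing points to zero brightness are assumptions of the sketch, not part of the model).

import numpy as np

def lambertian_image(normals, light, albedo=1.0, strength=1.0):
    """Gray levels I_L = E * rho * (N . S) as in (1)-(2); negative values
    (points facing away from the light) are clamped to zero.
    normals: H x W x 3 array of unit surface normals."""
    S = np.asarray(light, dtype=float)
    S = S / np.linalg.norm(S)
    return strength * albedo * np.clip(normals @ S, 0.0, None)

# Example: a Lambertian sphere under frontal illumination (0, 0, 1).
h = w = 128
y, x = np.mgrid[-1:1:h * 1j, -1:1:w * 1j]
z = np.sqrt(np.clip(1.0 - x ** 2 - y ** 2, 0.0, None))
normals = np.dstack([x, y, z]) * (x ** 2 + y ** 2 <= 1.0)[..., None]
image = lambertian_image(normals, light=(0.0, 0.0, 1.0))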
Specularity only occurs when the incident angle of the light source is equal to the reflected angle. It is formed by two components: the specular spike and the specular lobe. The specular spike is zero in all directions except for a very narrow range around the direction of specular reflection. The specular lobe spreads around the direction of specular reflection.
The simplest model for specular reflection is described by the following delta function:

$$ I_S = B\, \delta(\theta_s - 2\theta_r), \quad (3) $$

where $I_S$ is the specular brightness, $B$ is the strength of the specular component, $\theta_s$ is the angle between the light source direction and the viewing direction, and $\theta_r$ is the angle between the surface normal and the viewing direction. This model assumes that the highlight caused by specular reflection is only a single point, but in real life this assumption is not true. Another model, developed by Phong [40], represents the specular component of reflection as powers of the cosine of the angle between the perfect specular direction and the viewing direction. This model is capable of predicting specularities which extend beyond a single point; however, the parameters have no physical meaning. A more refined model, the Torrance-Sparrow model [49], assumes that a surface is composed of small, randomly oriented, mirror-like facets. It describes the specular brightness as the product of four components: energy of incident light, Fresnel coefficient, facet orientation distribution function, and geometrical attenuation factor adjusted for foreshortening. On the basis of the Torrance-Sparrow model, Healey and Binford [15] derived a simplified model by using the Gaussian distribution as the facet orientation function and considering the other components as constants. It can be described as:

$$ I_S = K e^{-(\alpha/m)^2}, \quad (4) $$

where $K$ is a constant, $\alpha$ is the angle between the surface normal $\vec{N}$ and the bisector $\vec{H}$ of the viewing direction and the source direction, and $m$ indicates the surface roughness (Fig. 3).
Most surfaces in the real world are neither purely Lambertian nor purely specular; they are a combination of both. That is, they are hybrid surfaces. One straightforward equation for a hybrid surface is:

$$ I = (1 - \omega) I_L + \omega I_S, \quad (5) $$

where $I$ is the total brightness of the hybrid surface, $I_S$ and $I_L$ are the specular brightness and Lambertian brightness, respectively, and $\omega$ is the weight of the specular component. Nayar et al. [32] proposed a reflectance model which consists of three components: the diffuse lobe, the specular lobe, and the specular spike. The Lambertian model was used to represent the diffuse lobe, the specular component of the Torrance-Sparrow model was used to model the specular lobe, and the spike component of the Beckmann-Spizzichino model was used to describe the specular spike. The resulting hybrid model is given as:

$$ I = K_{dl} \cos\theta_i + K_{sl}\, e^{-\alpha^2 / 2\sigma^2} + K_{ss}\, \delta(\theta_i - \theta_r)\, \delta(\phi_r), \quad (6) $$

where $K_{dl}$, $K_{sl}$, and $K_{ss}$ are the strengths of the three components, $\alpha$ is the angle between the surface normal of a micro-facet on a patch and the mean normal of this surface patch, and $\sigma$ is its standard deviation. If we consider the surface normal to be in the $Z$ direction, then $(\theta_i, \phi_i)$ is the direction of the incident light in terms of the slant and tilt in 3D, and $(\theta_r, \phi_r)$ is the direction of the reflected light.
Although the Lambertian model is widely used because of its simplicity, it is a poor approximation to the diffuse component of rough surfaces. See [20], [24], [7], [35] for more sophisticated approaches.
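As a small worked example of the hybrid model (5), the sketch below combines the Lambertian term (1) with the simplified specular term (4); the default parameter values are arbitrary illustrations rather than values taken from the cited models.

import numpy as np

def hybrid_brightness(cos_theta_i, alpha, omega, E=1.0, rho=1.0, K=1.0, m=0.1):
    """Hybrid brightness I = (1 - omega) * I_L + omega * I_S as in (5),
    with I_L = E * rho * cos(theta_i) from (1) and I_S = K * exp(-(alpha/m)^2)
    from (4).
    cos_theta_i: cosine of the angle between surface normal and light direction
    alpha:       angle (radians) between the normal and the bisector of the
                 viewing and source directions
    omega:       weight of the specular component
    m:           surface roughness parameter"""
    I_L = E * rho * np.clip(cos_theta_i, 0.0, None)
    I_S = K * np.exp(-(alpha / m) ** 2)
    return (1.0 - omega) * I_L + omega * I_S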
4 SELECTED SHAPE FROM SHADING ALGORITHMS
Most SFS algorithms assume that the light source direction is known. In the case of an unknown light source direction, there are algorithms [36], [26], [54] which can estimate the light source direction without knowledge of the surface shape. However, some assumptions about the surface shape are required, such as a locally spherical surface or a uniform and isotropic distribution of the surface orientation. Once the light source direction is known, the 3D shape can be estimated.
We have implemented two minimization methods, one propagation method, one local method, and two linear methods. Ease of finding or making an implementation was an important criterion. In addition, this selection was guided by several other reasons. First, we have attempted to focus on more recent algorithms. Since authors improve their own algorithms or others' algorithms, newer algorithms, in general, perform better than older algorithms. Second, some papers deal with theoretical issues related to SFS but do not provide any particular algorithm. Such papers have not been dealt with in detail in this paper. Third, some approaches combine shape from shading with stereo, line drawings, etc. We have mentioned such approaches in the review section for the sake of completeness, but have not discussed them in detail. Finally, since papers related to interreflection and specularity consider special and more complex situations of shape from shading, they have not been dealt with in detail.
4.1 Minimization Approaches
Minimization approaches compute the solution which minimizes an energy function over the entire image. The function can involve the brightness constraint and other constraints, such as the smoothness constraint, the integrability constraint, the gradient constraint, and the unit normal constraint. In this section, we first briefly describe these constraints and then discuss the SFS methods which use them.
The Brightness constraint is derived directly from the image irradiance (2). It indicates the total brightness error of the reconstructed image compared with the input image, and is given by

$$ \iint (I - R)^2 \, dx \, dy, \quad (7) $$

where $I$ is the measured intensity and $R$ is the estimated reflectance map.
The Smoothness constraint ensures a smooth surface in order to stabilize the convergence to a unique solution, and is given by

$$ \iint (p_x^2 + p_y^2 + q_x^2 + q_y^2) \, dx \, dy. \quad (8) $$
Here, $p$ and $q$ are the surface gradients along the x and y directions. Another version of the smoothness term is less restrictive, requiring constant change of depth only in the x and y directions:

$$ \iint (p_x^2 + q_y^2) \, dx \, dy. \quad (9) $$
The smoothness constraint can also be described in terms of the surface normal $\vec{N}$:

$$ \iint \left( \|\vec{N}_x\|^2 + \|\vec{N}_y\|^2 \right) dx \, dy. \quad (10) $$

This means that the surface normal should change gradually.
The Integrability constraint ensures valid surfaces, that is, $Z_{xy} = Z_{yx}$. It can be described by either

$$ \iint (p_y - q_x)^2 \, dx \, dy, \quad (11) $$

or

$$ \iint \left( (Z_x - p)^2 + (Z_y - q)^2 \right) dx \, dy. \quad (12) $$

The Intensity Gradient constraint requires that the intensity gradient of the reconstructed image be close to the intensity gradient of the input image in both the x and y directions:

$$ \iint \left( (R_x - I_x)^2 + (R_y - I_y)^2 \right) dx \, dy. \quad (13) $$

The Unit Normal constraint forces the recovered surface normals to be unit vectors:

$$ \iint \left( \|\vec{N}\|^2 - 1 \right) dx \, dy. \quad (14) $$
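As a concrete reading of (7) and (8), the following sketch evaluates discrete versions of the brightness and smoothness terms for a candidate gradient field; the Lambertian reflectance map and the forward-difference discretization are our illustrative choices, not prescribed by the paper.

import numpy as np

def brightness_energy(I, p, q, s):
    """Discrete brightness constraint (7): squared difference between the
    image I and the Lambertian reflectance map evaluated at (p, q)."""
    s = np.asarray(s, dtype=float)
    s = s / np.linalg.norm(s)
    R = (-s[0] * p - s[1] * q + s[2]) / np.sqrt(1.0 + p ** 2 + q ** 2)
    return np.sum((I - R) ** 2)

def smoothness_energy(p, q):
    """Discrete smoothness constraint (8), with forward differences standing
    in for the partial derivatives of p and q."""
    px, py = np.diff(p, axis=1), np.diff(p, axis=0)
    qx, qy = np.diff(q, axis=1), np.diff(q, axis=0)
    return (px ** 2).sum() + (py ** 2).sum() + (qx ** 2).sum() + (qy ** 2).sum()

# A minimization approach then searches for the (p, q) (or Z) minimizing,
# e.g., brightness_energy(I, p, q, s) + lam * smoothness_energy(p, q).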
4.1.1 Zheng and Chellappa (91)
Zheng and Chellappa [54] applied the intensity gradient constraint, instead of a smoothness constraint (used by Ikeuchi and Horn [21], Brooks and Horn [5], and Horn [18]); therefore, their energy function became:
$$ \iint \left( (I - R)^2 + \left( (R_x - I_x)^2 + (R_y - I_y)^2 \right) + \mu \left( (Z_x - p)^2 + (Z_y - q)^2 \right) \right) dx \, dy. \quad (15) $$

Fig. 2. Lambertian reflection geometry.

Fig. 3. Specular reflection geometry.
The minimization of the above function was done through variational calculus. In variational calculus, the Euler equations, which are differential equations, are first computed and then solved discretely to minimize the energy function. Zheng and Chellappa simplified the Euler equations by taking the first-order Taylor series of the reflectance map and representing the depth, the gradients, and their derivatives in discrete form. Then, a new iterative scheme was derived which updates depth and gradients simultaneously. The algorithm was implemented using a hierarchical structure (pyramid) in order to speed up the computation. There was no special requirement for the initialization of the boundary. The initial values for both depth and gradient can be zero.
The implementation of Zheng and Chellappa's method is very straightforward. We used the forward difference approximation to compute the partial derivatives [54]. For the boundary points, where the forward approximation could not be applied, we switched to the backward difference approximation for the first-order partial derivatives and set the second-order partial derivatives to zero. In all the experiments, we set $\mu$ to 1, as suggested by the authors [54].
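A small sketch of the difference scheme just described (our own code, not the authors' implementation): forward differences in the interior, with a backward-difference fallback on the last row and column where no forward neighbor exists.

import numpy as np

def partial_derivatives(Z):
    """First-order partials Z_x and Z_y by forward differences, falling back
    to backward differences at the image boundary."""
    Zx = np.empty_like(Z)
    Zy = np.empty_like(Z)
    Zx[:, :-1] = Z[:, 1:] - Z[:, :-1]     # forward difference in x
    Zx[:, -1] = Z[:, -1] - Z[:, -2]       # backward difference at the last column
    Zy[:-1, :] = Z[1:, :] - Z[:-1, :]     # forward difference in y
    Zy[-1, :] = Z[-1, :] - Z[-2, :]       # backward difference at the last row
    return Zx, Zy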
4.1.2 Lee and Kuo (91)
Lee and Kuo [28] used the brightness constraint and the smoothness constraint. In their approach, surfaces were approximated by the union of triangular surface patches. The vertices of the triangles were called nodal points, and only the nodal depths were recovered. Depths at the pixels which are not nodal points were obtained through interpolation. For each triangular patch, the intensity of the triangle was taken as the average intensity of all pixels in the triangle, and the surface gradient of the triangle was approximated by the cross product of any two adjacent edges of the triangle. This established a relationship between the triangle's intensity and the depth at its three nodal points. By linearizing the reflectance map in terms of the surface gradient $(p, q)$, a linear relationship between the intensity and the depth at the nodal points was derived. The surface depths at the nodal points were computed using optimization. The optimization problem was reduced to the solution of a sparse linear system, and a multigrid computational algorithm was applied. There was no initialization required.
Lee and Kuo's algorithm was implemented using the V-cycle multigrid scheme to solve the linear system, as reported in [28]. We used Gauss-Seidel relaxation as the smoothing operator and as the exact solver for the finest grid. Full-weighting restriction was applied to transfer the residual from finer grids to coarser grids, and bilinear interpolation was applied to perform the prolongation from coarser grids to finer grids. The same template was used for the smoothness term as given in their paper. The nodal points in the finest grid were chosen to be the image pixels. Successive linearizations were done through a maximum of 10 successive iterations, and the number of V-cycles was set to 10 for the first iteration, 2 for the second, and 1 for the rest. The initial values for the depth of the finest grid and the corrections for the coarser grids were all zero. Since the algorithm does not work for the light source direction $(0, 0, 1)$, we used $(0.001, 0.001, 1.0)$ as the input light source direction instead. We used a smoothing factor of 2,000 and computed the number of grid levels as $L = \log_2 M - 1$, where $M$ is the size of the image. Therefore, we have seven levels for 256 by 256 images and six levels for 128 by 128 images. The depth maps after the first iteration contain more detail but have a smaller range. After 10 iterations, details are smoothed out, but the depth range is wider. This means that more iterations will provide more low frequency information, which overtakes the high frequency information from the initial iterations.
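For reference, a toy sketch of two ingredients mentioned above: the grid-level formula and a plain Gauss-Seidel relaxation of the kind used as a multigrid smoother. This is a dense, illustrative version only; the actual system in [28] is sparse and embedded in a V-cycle.

import numpy as np

def multigrid_levels(M):
    """Number of grid levels L = log2(M) - 1 (7 for M = 256, 6 for M = 128)."""
    return int(np.log2(M)) - 1

def gauss_seidel(A, b, x, sweeps=3):
    """Gauss-Seidel relaxation sweeps for A x = b, updating x in place."""
    n = len(b)
    for _ in range(sweeps):
        for i in range(n):
            sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (b[i] - sigma) / A[i, i]
    return x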
4.2 Propagation Approaches
Propagation approaches start from a single reference surface point, or a set of surface points, where the shape either is known or can be uniquely determined (such as singular points), and propagate the shape information across the whole image. We discuss one algorithm in this section.
4.2.1 Bichsel and Pentland (92)
Following the main idea of Dupuis and Oliensis, Bichsel and Pentland [3] developed an efficient minimum downhill approach which directly recovers depth and guarantees a continuous surface. Given initial values at the singular points (brightest points), the algorithm looks in eight discrete directions in the image and propagates the depth information away from the light source to ensure the proper termination of the process. Since the slopes at surface points in low brightness regions are close to zero for most directions (except the directions which form a very narrow angle with the illumination direction), the image was initially rotated to align the light source direction with one of the eight directions. The inverse rotation was performed on the resulting depth map in order to get the original orientation back. Assuming the constraint of parallel slope, the surface gradient, $(p, q)$, was precomputed by taking the derivative of the reflectance map with respect to $q$ in the rotated coordinate system, setting it to zero, and
Fig. 4. Depth maps for the synthetic images: (a) synthetic vase, (b) Mozart.