Mosaicing on Adaptive Manifolds
Shmuel Peleg, Member, IEEE, Benny Rousso, Alex Rav-Acha, and Assaf Zomet
Abstract: Image mosaicing is commonly used to increase the visual field of view by pasting together many images or video frames. Existing mosaicing methods are based on projecting all images onto a predetermined single manifold: A plane is commonly used for a camera translating sideways, a cylinder is used for a panning camera, and a sphere is used for a camera which is both panning and tilting. While different mosaicing methods should therefore be used for different types of camera motion, more general types of camera motion, such as forward motion, are practically impossible for traditional mosaicing. A new methodology to allow image mosaicing in more general cases of camera motion is presented. Mosaicing is performed by projecting thin strips from the images onto manifolds which are adapted to the camera motion. While the limitations of existing mosaicing techniques are a result of using predetermined manifolds, the use of more general manifolds overcomes these limitations.
Index Terms: Mosaicing, motion analysis, image alignment.
1 INTRODUCTION
Creating pictures having a larger field of view by combining many smaller images has been common since the beginning of photography, as the camera's field of view is smaller than the human field of view. In addition, some large objects cannot be captured in a single picture, as is the case in aerial photography. Using omnidirectional cameras [19] can provide a partial solution, but capturing a wide field of view with the limited resolution of a video camera compromises image resolution. A common solution is photo-mosaicing: aligning and pasting pictures, or frames in a video sequence, to create a wider view. Digital photography enabled new implementations for mosaicing [17], [18], [20], [3], [10], [28], which were first applied to aerial and satellite images and later used for scene and object representation.
The simplest mosaics are created by panning the camera around its optical center, in which case the panoramic image can be created on a cylindrical or a spherical manifold [15], [5], [16], [29], [12], [28]. The original images, which are formed by a perspective projection onto a plane, are warped to be perspectively projected onto an appropriate cylinder, where they can be combined into a full 360-degree panorama, as in Fig. 1. While the limitation to pure sideways rotation enables easy mosaicing without the problems of motion parallax, this approach cannot be used with other camera motions.
Simple mosaicing is also possible from a set of images whose mutual displacements are pure image-plane translations. This is the case for a translating camera orthogonally viewing a planar scene. For somewhat more general camera motions, more general transformations for image alignment can be used, such as a global affine transformation or a planar-projective transformation [4], [7], [11], [26], [10]. In most cases, images are aligned pairwise using the global parametric transformation, a reference frame is selected, and all images are aligned to this reference frame and combined to create the panoramic mosaic. Such methods imply the perspective projection of all the images onto the planar manifold corresponding to the image plane of the reference frame. Using a planar manifold and aligning all frames to a single reference frame is reasonable only when there are no considerable depth differences in the scene and the camera motion is mainly a sideways translation and rotation around the optical axis. Significant distortions are created, for example, in more general camera motion which includes sideways rotation or when there are scale changes in the image due to camera translation, as shown in Fig. 2.
Most restrictions on the motion of the camera used for mosaicing can be eliminated by using a manifold whose shape is determined adaptively during the mosaicing process. To enable undistorted mosaicing, the selected manifold should have the property that, after projecting the images onto the manifold, the optical flow¹ vectors become approximately uniform: parallel to each other and of equal magnitude. Typical cases for this optical flow are sideways image translation, where the manifold is a plane, and a panning camera, where the manifold is a vertical cylinder and the optical flow in the center of the image is approximately uniform. Different manifolds relating to more general camera motions will be presented in this paper. It will be shown, for example, that the general case of a translating camera can be handled using a cylindrical manifold whose axis is the direction of motion.
When a moving camera captures a general static scene, the optical flow depends on the scene depth, making mosaicing difficult. A technique for mosaicing such scenes is the "slit camera," or the "pushbroom camera," used in aerial photography [8]. This camera can be modeled as a 1D sensor array which collects strips by "sweeping" the scene, as described in Fig. 3.

• S. Peleg, A. Rav-Acha, and A. Zomet are with the School of Computer Science and Engineering, The Hebrew University of Jerusalem, 91904 Jerusalem, Israel. E-mail: {peleg, alexis, zomet}@cs.huji.ac.il.
• B. Rousso is with Impulse Dynamic, Israel. E-mail:il.
Manuscript received 4 Dec. 1998; revised 19 Apr. 2000; accepted 23 July 2000.
Recommended for acceptance by R. Szeliski.
For information on obtaining reprints of this article, please send e-mail to: , and reference IEEECS Log Number 108405.
1. Image motion is represented by the optical flow: the displacement vectors associated with each image point which specify the location of the image point in the next frame relative to its location in the current frame.
0162-8828/00/$10.00 © 2000 IEEE
The imaging process of the pushbroom camera can be modeled by a multiperspective projection: For each strip, the projection is perspective, while different strips may be acquired from different viewpoints. Thus, in the direction of the strips, the projection is perspective, while, in the scanning direction, the projection is parallel. Since under parallel projection there is no parallax, the strips in the resulting mosaic are aligned at the seams.
All mosaicing techniques described in this paper process video sequences acquired by perspective cameras moving on a smooth route. They approximate the mosaic image which would have been acquired by a pushbroom camera moving on the same route. This is done by reprojecting thin strips from the images onto a manifold such that the optical flow becomes approximately uniform: parallel and of equal magnitude. Usually, both the manifold and the reprojection transformation are computed implicitly.
The implementations of manifold mosaicing presented here are based on the motion computed between the images. Unlike other methods for multiperspective mosaics [22], [30], the mosaics are usually constructed without knowing or recovering the structure of the scene and without knowing explicitly the full motion and calibration of the camera.
The warping of the strips to induce parallel and uniform optical flow can sometimes be achieved in two steps: First, the images are rectified [9], [14] to get a parallel and horizontal optical flow. After rectification, the images are resampled by interpolating the coordinates linearly along the flow. This method has several drawbacks: 1) Image rectification assumes camera translation, whereas mosaicing must also handle pure rotations. 2) Image rectification rotates the image planes to be parallel to the camera translation vector. In order to rectify more than two images, it should be assumed that the camera translates in a constant direction.
The projection of the strips onto the manifold can be viewed as a "rectification" onto a nonplanar manifold. Each region in the mosaic is taken from that image where it is captured at highest resolution. While this could have been neglected in traditional mosaicing, which does not allow any scale changes, it is critical for general camera motions where, for example, a region is seen at higher resolution when closer.
Examples of manifold mosaicing using strips will be given for cases of almost uniform image translations caused by a panning camera [21], for a forward moving camera [24], [23], and for 2D planar projective transformation caused by a tilted panning camera or a tilted camera translating in a planar scene [33]. Mosaics generated in this manner can be considered as similar to the vertical "slits" [31] or the "linear push-broom cameras" [8]. However, unlike the straight "slit" or "broom," the broom in manifold mosaicing may change its shape from a straight line to a circular arc to become mostly perpendicular to the optical flow. This demonstrates the flexibility in mosaicing with strips, and its adaptation to changes in camera motion.
2 MOSAICING WITH STRIPS
Most existing mosaicing systems align and combine full images or video frames [28], [10], [26], [13]. The combination of full frames into mosaics introduces some difficulties:
• It is almost impossible to accurately align complete frames due to lens distortion, motion parallax, moving objects, etc. This results in "ghosting" or blurring when the mosaic is constructed. Similar artifacts may appear due to accumulative error in the motion between nonsuccessive frames.
• It is difficult to determine the mosaicing manifold; for example, if all images are aligned to one reference image, different reference images will give different mosaics. In the case of a projection onto a cylinder, it is important that the camera motion is a pure sideways rotation.

Fig. 1. A panoramic image can be generated from a panning camera by combining the images on the surface of a cylinder.
Fig. 2. Mosaicing images of a planar scene under a tilted view by warping the input images to the coordinate system of the reference image. The resulting mosaic will be curled.
Fig. 3. An aerial pushbroom camera.
Fig. 4. A panoramic image generated from a vertical "slit" moving on a smooth path on a horizontal plane.
In order to overcome the above difficulties, we propose using mosaicing with strips. Thin strips are taken from the input images and placed, after warping, onto the mosaic. We present implementations of manifold mosaicing for several types of scenarios. All the implementations satisfy the following principles:
• The width of the strips should be proportional to the motion.
• The borders of the strips should match. For example, when moving to the right, the right border of the strip taken from image $I_n$ should correspond to the left border of the strip taken from image $I_{n+1}$. This is necessary in order to get a continuous mosaic.
• The collected strips should be warped and pasted into the mosaic image such that, after warping, their optical flow becomes parallel to the direction in which the panoramic image is constructed and of equal magnitude.
• In order to avoid global resizing, each image strip includes a feature (the anchor) which does not change under the warping.
• It is recommended to have the anchor perpendicular to the optical flow. This maximizes the information collected by the virtual 1D sensor array.
• It is recommended to take the strips from the center of the image to reduce effects of lens distortion.

Fig. 5. The mosaicing process and the direction of the optical flow. (a) A vertical slit is optimal when the optical flow is horizontal. (b) A vertical slit is useless when the optical flow is vertical. (c) A circular slit is optimal when the optical flow is radial. (d) For general motion, optimal slits should be perpendicular to the optical flow and bent accordingly.
Fig. 6. Determining the shape of the slit with camera translation. In this case, the optimal slit will be the longest circular section having its center at the FOE and passing through the field of view. This is the longest curve in the FOV that is perpendicular to the optical flow.
Fig. 7. The mosaicing manifold is formed by the motion of the camera and the shape of the anchors in the images.
Fig. 8. The pipe projection geometry. The axis of the pipe $\hat{s}$ passes through the optical center $O = (0, 0, 0)$ and through the FOE $S$. $\hat{d}$ and $\hat{r}$ are unit vectors chosen to form a Cartesian coordinate system together with $\hat{s}$. The image point $P = (x, y, f_c)$ is projected onto its corresponding point $Q$ on the pipe. $Q$ is represented by $k$, the position along the axis $\hat{s}$, and $\alpha$, the angle from $\hat{d}$.

The warping of the strips into the mosaic is equivalent to reprojection onto a manifold without the explicit computation of the manifold. No accumulative distortions are encountered as each strip contains an anchor and is warped to match just its neighboring strips. The anchors are placed parallel along the mosaic, completing a parallel projection in the direction of the motion of the camera. Thus, the anchors are the realization of the "slit" or "broom" of the virtual camera. Assuming the motion between successive frames is small, canceling the parallax by linear interpolation of the coordinates along the flow is a satisfactory approximation for the narrow gaps between the anchors. When the strips are wide, it is possible to reduce the parallax and simulate the parallel projection by generating intermediate views [27], [6]. The introduction of intermediate views simulates a denser image sequence, where the strips are narrower, with smaller discontinuities due to motion parallax.
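As a rough illustration of resampling a strip by linear interpolation, the following minimal Python sketch stretches or shrinks one row of a strip to a target width while keeping both end samples fixed. It assumes a purely horizontal flow and a row stored as a list of gray values; `resample_row` is a hypothetical helper introduced here, not a routine from the implementations described in this paper.

```python
def resample_row(row, new_width):
    """Resample one strip row to `new_width` samples by linear interpolation.

    Assumes len(row) >= 2 and new_width >= 2; the first and last samples
    of the row are preserved, so neighbouring strips still meet at them.
    """
    scale = (len(row) - 1) / (new_width - 1)
    out = []
    for j in range(new_width):
        pos = j * scale                      # source coordinate of output sample j
        i0 = min(int(pos), len(row) - 2)     # left neighbour, clamped at the end
        t = pos - i0                         # fractional distance to the right neighbour
        out.append(row[i0] * (1 - t) + row[i0 + 1] * t)
    return out
```

In a real strip the interpolation would follow the measured flow rather than a uniform stretch, so this should be read only as the one-dimensional core of the idea.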
In order to maximize the information collection rate of the 1D sensor, it is recommended to have the slit perpendicular to the optical flow, as illustrated in Fig. 5. An example for the determination of the shape of the slit is given for image motion generated by pure translation of the camera, as shown in Fig. 6. In this case, the image motion can be described by a radial optical flow emanating from the focus of expansion (FOE) and the field of view of the camera (FOV) can be described as a circle on the image plane. The optimal slit is the longest circular section having its center at the FOE and passing through the FOV. This is the longest curve in the FOV that is perpendicular to the optical flow.
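The perpendicularity of a circular slit to a radial flow can be checked numerically with a short Python sketch. It assumes, for simplicity, a rectangular image spanning [0, width) x [0, height) rather than the circular FOV used in the text, and it takes the slit radius as given instead of searching for the longest arc. The function names are illustrative only.

```python
import math

def circular_slit(foe, radius, width, height, samples=360):
    """Sample the circle of `radius` around `foe` and keep the points inside the image.

    Each returned entry is (x, y, angle), where `angle` is the polar angle
    of the point as seen from the FOE.
    """
    fx, fy = foe
    points = []
    for i in range(samples):
        a = 2 * math.pi * i / samples
        x = fx + radius * math.cos(a)
        y = fy + radius * math.sin(a)
        if 0 <= x < width and 0 <= y < height:
            points.append((x, y, a))
    return points

def flow_dot_tangent(point, foe):
    """Dot product of the radial flow direction and the slit tangent at `point`."""
    x, y, a = point
    flow = (x - foe[0], y - foe[1])          # radial flow emanating from the FOE
    tangent = (-math.sin(a), math.cos(a))    # tangent of the circle at angle a
    return flow[0] * tangent[0] + flow[1] * tangent[1]
```

For an FOE outside the image, only an arc of the circle survives the bounds test, which matches the circular-arc slit of Fig. 5d.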
The definition of the scanning slit as perpendicular to the optical flow is very simple for some cases.
• In sideways image motion, the optimal slit is vertical (Fig. 5a).
• In image scaling (zoom) and in forward motion, the optimal slit is a circle (Fig. 5c).
• In image motion generated by camera translation, the optimal slit is a circular arc (Fig. 5d).
Image motion is usually more general than these simple special cases. However, in most cases, slits that are straight lines, circular curves, or elliptic curves are sufficient for mosaicing.
The shape of the slit determines the shape of the manifold on which the mosaic is created. The circular slit, for example (Fig. 5c), forms a cylindrical manifold. In Fig. 7, it is demonstrated how the anchors form the manifold.

When mosaicing with strips, changes in image brightness, usually caused by automatic gain control, cause visible brightness seams between strips. These illumination discontinuities can be eliminated by blending the different images, for example, using the Laplacian pyramid [3].
3 IMPLEMENTATION EXAMPLES
In this section, implementations of manifold mosaicing are described. First, we describe the simple case of a camera moving on a horizontal path [31], [21], where the optical flow can be approximated at the center as horizontal and uniform. Then, we describe two algorithms for mosaicing from a forward moving camera. The first algorithm constructs the mosaic by explicitly reprojecting the images onto the pipe manifold [24]. The second algorithm performs the reprojection implicitly by pasting strips onto the mosaic image [23]. Finally, an algorithm is described for cases in which the image motion can be described by a 2D homography. This algorithm handles the case of a tilted panning camera and the case of a tilted camera translating in a planar scene.

Fig. 9. The relation between pipe projection and mosaicing with strips. The strip on the pipe corresponds to a strip on the image that is warped to achieve parallel optical flow. Strips are taken from the images in which best resolution is obtained.
Fig. 10. Pipe projection with a forward moving camera and a planar scene. The focus of expansion is inside the image. (a) and (b) are individual frames from a video sequence. (c) The pipe mosaic of the sequence. The road is always at full resolution.
Fig. 11. Cutting and pasting strips. (a), (b), and (c) Strips are perpendicular to the optical flow. (d) Strips are warped and pasted so that their back side is fixed and their front side is warped to match the back side of the next strip.
Fig. 12. The image of vertical and horizontal lines: (a) Tilted camera moving horizontally and viewing a planar scene. (b) Tilted camera panning horizontally. The lines are on a cylinder centered on the rotation axis.
3.1 Horizontal Motion
A mosaicing approach which creates, for the first time, multiperspective panoramic views on general manifolds has been described in [31]. It was assumed that the camera motion and calibration are known from an external device and that its motion is a combination of translation sideways on a plane and panning. When the camera pans, the motion in the center of the image is approximately horizontal and uniform. Similar image motion occurs when the camera translates sideways, assuming there are no considerable depth differences. Thus, a mosaic can be constructed by copying thin vertical strips from the input images and pasting them side by side onto the mosaic image. This is equivalent to projecting the strips onto a manifold which is a combination of cylindrical patches (when the camera pans, as in Fig. 1) and planar patches (when the camera only translates). For a more general motion combining panning and horizontal translation, the manifold follows the center strip of the images as shown in Fig. 1 and Fig. 4 and the process can be viewed as a 1D vertical slit camera which is scanning the scene.
Manifold mosaicing can be implemented without knowing the camera motion and internal parameters from an external device [21]. First, the motion between the images is computed pairwise, using a 2D rotation and translation model. Then, the rotations about the Z-axis are canceled and narrow vertical strips from the centers of the images are pasted onto the mosaic image.
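The strip-pasting step for horizontal motion can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation of [21]: frames are lists of pixel rows, the rotation about the Z-axis is assumed to be already canceled, the pairwise horizontal displacements are given as positive integers, and the names `center_strip` and `build_mosaic` are hypothetical.

```python
def center_strip(frame, width):
    """Return `width` columns of `frame`, starting at its centre column."""
    start = len(frame[0]) // 2
    return [row[start:start + width] for row in frame]

def build_mosaic(frames, shifts):
    """Paste centre strips side by side.

    shifts[i] is the horizontal displacement (in pixels) between frame i and
    frame i + 1. The strip taken from frame i is shifts[i] columns wide, so
    its right border meets the left border of the strip from frame i + 1.
    The last frame contributes no strip in this sketch.
    """
    mosaic = [[] for _ in frames[0]]
    for frame, dx in zip(frames, shifts):
        strip = center_strip(frame, dx)
        for mosaic_row, strip_row in zip(mosaic, strip):
            mosaic_row.extend(strip_row)
    return mosaic
```

Because each strip is exactly as wide as the motion to the next frame, the pasted strips tile the scene without gaps or overlaps when the displacement is uniform across the strip.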
The strips are taken from the center of the image to minimize effects of misalignment. There are three reasons for that selection:
• The approximation of the motion generated by a panning camera as uniform is optimal at the center of the image.
• Lens distortion is minimal at the center of the images.
• Alignment is usually better at the center than at the edges of the pictures.
This selection corresponds to the Voronoi tessellation [1]. Using the Voronoi tessellation for image cut-and-paste reduces visible misalignment due to lens distortions. Voronoi tessellation causes every seam to be at the same distance from the two corresponding image centers. As lens distortions are radial, features that are perpendicular to the seam will be distorted equally on the seam and, therefore, will remain aligned regardless of lens distortion.

The construction of the mosaic is very fast and has been demonstrated live on a PC [21]. Results are impressive in most cases and have the desired feature of manifold mosaicing: Each object in the mosaic appears in the same size as it appears in the video frames, avoiding any scaling and, therefore, avoiding distortions and loss of resolution. Mosaicing is done without the explicit assumption of pure rotation and without the need to project the images onto a cylinder before mosaicing. Fig. 13 and Fig. 14 show panoramic mosaic images created with an implementation of the manifold mosaicing on the PC [21].
Fig. 13. Manifold mosaicing with vertical scanning. The curved boundary is created by the unstabilized motion of the hand-held camera.
Fig. 14. An example of panoramic imaging using manifold mosaicing with straight strips. The curved boundary is created by the unstabilized motion of the hand-held camera.
3.2 Forward Motion Using Pipe Projection
Forward camera motion used to be the classical case where traditional mosaicing fails. A theoretical camera model which handles the case of forward motion is a slit camera which scans the scene through a circular slit. This slit is symmetrical around the FOE and gives a wide FOV [32]. Manifold mosaicing can approximately simulate the "circular slit" camera using a cylindrical manifold along the trajectory of the camera. We call the 3D projection on such a manifold "Pipe Projection" and its definition is described in this section.
The translation of the camera (and also zoom) induces radial optical flow which emerges from the FOE, except for the singular case of sideways translation in which the optical flow vectors are parallel. Cases of radial optical flow are much more complicated for mosaicing as the optical flow is not parallel and may depend on the structure of the scene. The pipe projection described in this section simplifies the mosaicing in these cases.
The internal parameters of the camera are assumed to be known throughout this presentation. It should be mentioned though that an error in the focal length results only in a scaling of the mosaic.
Given a sequence of images taken by a translating camera, we would like to transform the images such that the radial optical flow will turn into a parallel optical flow in the transformed representation. In order to do that, we project the 2D planar image onto a 3D cylindrical manifold, which we call a pipe (see Fig. 8). The axis of the pipe $\hat{s}$ is chosen to pass through the optical center $O = (0, 0, 0)$ and through the FOE $S = (s_x, s_y, f_c)$, where $f_c$ is the focal length, and, thus, $\hat{s} = S/|S|$. Each image point $P = (x, y, f_c)$ is projected onto its corresponding point $Q$ on the pipe. The point $Q$ is collinear with $O$ and $P$ and its distance from the pipe's axis $\hat{s}$ is $R$, where $R$ is the radius of the pipe. The pipe projection is similar to the pipe-rectification proposed in [25] for stereo matching, but the projection from image to cylinder is different.
In the pipe representation of the image, the optical flow of each pixel $Q$ is now parallel to the pipe's axis $\hat{s}$. This does not solve the problem of motion parallax, but limits the distortion caused by parallax to be only along the trajectory of the camera. When the frame rate is high and the depth differences are not significant, the pipe projection gives a good approximation of the mosaic generated from the "circular slit" model. View interpolation can be used to reduce the parallax effects when the frame rate is not high enough.
The position in the pipe of a point $Q$ is represented by $k$, the position along the axis $\hat{s}$, and $\alpha$, the angle from $\hat{d}$. $\hat{d}$ and $\hat{r}$ are unit vectors chosen to form a Cartesian coordinate system together with $\hat{s}$. The 3D position of a point $(k, \alpha)$ on the pipe is

$$Q = (Q_x, Q_y, Q_z) = k\,\hat{s} + R\cos(\alpha)\,\hat{d} + R\sin(\alpha)\,\hat{r}$$

and the corresponding pixel in the image plane for the point $Q$ is

$$P = (x, y, f_c) = \left(f_c\,Q_x/Q_z,\; f_c\,Q_y/Q_z,\; f_c\right).$$
Pixels in the image whose original distance from the axis $\hat{s}$ is less than $R$ become magnified on the pipe, but, when projected back to the image, they restore their resolution. However, pixels with distance greater than $R$ shrink on the pipe, thus losing their original resolution. Note that selecting

$$R = \sqrt{f_c^2 + \left(\frac{w}{2}\right)^2 + \left(\frac{h}{2}\right)^2},$$

where $w$ and $h$ are the width and height of the image, ensures that no pixel will reduce its resolution at the projection, as the intersection of the pipe with the image plane never occurs within the image boundaries.
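The choice of $R$ can be checked with a few lines of Python. The sketch assumes image coordinates centered on the principal point, so the image corners lie at $(\pm w/2, \pm h/2, f_c)$; `pipe_radius` and `distance_from_axis` are illustrative names.

```python
import math

def pipe_radius(fc, w, h):
    """Radius equal to the distance from the optical center to an image corner."""
    return math.sqrt(fc ** 2 + (w / 2) ** 2 + (h / 2) ** 2)

def distance_from_axis(point, s_hat):
    """Distance of a 3D point from the line through the origin with unit direction s_hat."""
    along = sum(p * s for p, s in zip(point, s_hat))
    squared = sum(p * p for p in point) - along * along
    return math.sqrt(max(squared, 0.0))
```

Since the distance of a point from any axis through the origin cannot exceed its distance from the origin, no image point lies farther than this $R$ from the pipe axis, whatever the direction of $\hat{s}$.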
Most regions on the pipe are covered by projections from several images. For every point in the pipe, the projected values are taken from the image having the best resolution among all projected images. The best preserved resolution is around the intersection of the pipe with the image plane ($Q_z = f_c$) and the resolution decreases as $|Q_z - f_c|$ increases. This definition forms a strip which will be taken from the image having the best resolution. An example is in Fig. 9.

The pipe representation is a natural generalization of the planar manifold and the cylindrical manifold: The planar manifold is analogous to the side of the pipe and the cylindrical manifold is analogous to a pipe with zero trajectory.
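The best-resolution rule, taking each pipe point from the image that minimizes $|Q_z - f_c|$, can be sketched as follows. The sketch makes a strong simplifying assumption that is not part of the text: the camera translates purely along its optical axis, so the pipe axis coincides with the Z-axis and $Q_z$ for frame $i$ is simply the axial position of the pipe point minus the axial position of camera $i$. The name `best_frame` is hypothetical.

```python
def best_frame(k, camera_positions, fc):
    """Index of the frame in which the pipe point at axial position k is best resolved.

    camera_positions[i] is the position of camera i along the pipe axis,
    in the same units as k and fc. For frame i the depth of the point is
    Q_z = k - camera_positions[i], and the selected frame minimizes |Q_z - fc|.
    """
    return min(range(len(camera_positions)),
               key=lambda i: abs((k - camera_positions[i]) - fc))
```

Evaluating this rule for consecutive values of k yields runs of pipe points assigned to the same frame, which correspond to the strips described above.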
Cases like oblique view, forward motion, and zoom can be handled well using the pipe projection and give good results, while traditional mosaicing methods may fail in these cases. The pipe projection best fits cases with close-to-linear translation. When the motion is more complicated, it is recommended to use the strip mechanism described in the next section. An example for the use of pipe projection is shown in Fig. 10.
3.3 Curved Strips for Forward Motion
Manifold mosaicing with curved strips can be used for the case of varying forward motion by implicitly simulating projection of strips onto a mosaicing manifold. The manifold is determined by the motion and can vary from one image to another, as demonstrated in Fig. 7. For simplicity, we show here the implementation of the scheme when the motion can be described by a parametric model.
3.3.1 Image Alignment
The image alignment used in our implementation for forward camera motion was a transformation describing a perspective projection of a plane (homography). While this is a transformation having only eight parameters, it gave reasonable overall results in our examples, as in Fig. 15. This transformation is described by the following equations:

$$\begin{pmatrix} x_{n-1} \\ y_{n-1} \end{pmatrix} = \begin{pmatrix} \dfrac{a + b\,x_n + c\,y_n}{1 + g\,x_n + h\,y_n} \\[2ex] \dfrac{d + e\,x_n + f\,y_n}{1 + g\,x_n + h\,y_n} \end{pmatrix}. \qquad (1)$$
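Equation (1) translates directly into code. The following Python sketch maps a point of image $n$ to image $n-1$ given the eight parameters; the function name and the tuple ordering `(a, b, c, d, e, f, g, h)` are conventions chosen here for illustration.

```python
def apply_homography(params, x_n, y_n):
    """Map (x_n, y_n) in image n to (x_{n-1}, y_{n-1}) in image n-1 using (1)."""
    a, b, c, d, e, f, g, h = params
    denom = 1 + g * x_n + h * y_n            # shared projective denominator
    x_prev = (a + b * x_n + c * y_n) / denom
    y_prev = (d + e * x_n + f * y_n) / denom
    return x_prev, y_prev
```

With `g = h = 0` the mapping reduces to an affine transformation, and with `b = f = 1` and all other parameters zero it is the identity.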
Image alignment is performed between every two consecutive images using one of many available methods (e.g., [2]). Since rotation about the optical axis does not introduce new information, we derotate the images by an approximation to such a rotation:² $\omega_z \approx (e - c)/2$. After the derotation, the homography is recomputed. The process of computing the
2. In a small rotation $\omega_z$ about the Z-axis, $x' = x - \omega_z y$ and $y' = y + \omega_z x$. Therefore, in (1), $e \approx \omega_z$ and $c \approx -\omega_z$.
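The rotation estimate $\omega_z \approx (e - c)/2$ can be illustrated with a short Python sketch. It only shows the estimate on a synthetic small-rotation homography built from footnote 2; warping the image by the inverse rotation and recomputing the homography are not shown. The function names and the parameter ordering `(a, b, c, d, e, f, g, h)` are assumptions made here.

```python
def small_rotation_params(omega_z):
    """Parameters of (1) for a small rotation omega_z about the Z-axis.

    From footnote 2: x' = x - omega_z * y and y' = y + omega_z * x,
    so c = -omega_z and e = omega_z, with no translation or projective terms.
    """
    return (0.0, 1.0, -omega_z, 0.0, omega_z, 1.0, 0.0, 0.0)

def estimate_rotation(params):
    """Approximate rotation about the optical axis from the homography parameters."""
    a, b, c, d, e, f, g, h = params
    return (e - c) / 2
```

Averaging `e` and `-c` uses both off-diagonal terms, so the estimate is less sensitive to an error in either one than using `e` or `c` alone.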
