Adaptive Thresholding Using the Integral Image
Derek Bradley∗Carleton University,Canada derek@derekbradley.ca
Gerhard Roth
National Research Council of Canada
Gerhard.a
Figure1:Real-time adaptive image thresholding.Left:Input image.Center:Wellner’s previous technique.Right: Our technique.
Abstract.Image thresholding is a common task in many computer vision and graphics applications.The goal of thresholding an image is to classify pixels as either“dark”or“light”.Adaptive thresholding is a form of thresholding that takes into account spatial variations in illumination.We present a technique for real-time adaptive thresholding using the integral image of the input.Our technique is an extension of a previous method.However,our solution is more robust to illumination changes in the image.Additionally,our method is simple and easy to implement. Our technique is suitable for processing live video streams at a real-time frame-rate,making it a valuable tool for interactive applications such as augmented reality.Source code is available online.
1Introduction
Image thresholding segments a digital image based on a certain characteristic of the pixels(for example,intensity value).The goal is to create a binary representation of the image,classifying each pixel into one of two categories, such as“dark”or“light”.This is a common task in many image processing applications,and some computer graphics applications.For example,it is often one of thefi
rst steps in marker-based augmented reality systems[Billinghurst et al.2001;Bradley and Roth2004;Fiala2005],and it has been used in high dynamic range photography[Ward 2003].The most basic thresholding method is to choose afixed threshold value and compare each pixel to that value.These techniques have been described and evaluated extensively in a number of survey papers[Weszka and Rosenfeld1978;Palumbo et al.1986;Sahoo et al.1988;Lee et al.1990;Glasbey1993;Trier and Jain1995;Sezgin and Sankur2004].However,fixed thresholding often fails if the illumination varies spatially in the image or over time in a video stream.
In order to account for variations in illumination,the common solution is adaptive thresholding.The main difference here is that a different threshold value is computed for each pixel in the image.This technique provides more robust-ness to changes in illumination.A number of adaptive thresholding methods exist[White and Rohrer1983;Bernsen 1986;Parker1991;Wellner1993;Yang et al.1994;Shen and Ip1997;Chan et al.1998;Savakis1998;Sauvola and Pietikainen2000;Yang and Yan2000].Further examples and comparisons can be found in[Venkateswarlu and Boyle1995;Sezgin and Sankur2004].We present a very simple and clear technique using integral images.Our ∗Currently at The University of British Columbia,bradleyd@cs.ubc.ca
method is easy to implement for real-time performance on a live video stream.Though our technique
is an extension to a previous method[Wellner1993],we increase robustness to strong illumination changes.In addition,we present a clear and tidy solution without increasing the complexity of the implementation.Our technique is also similar to the thresholding method of White and Rohrer for optical character recognition[White and Rohrer1983],however we present an implementation designed for real-time video.The motivation for this work isfindingfiducials in augmented reality applications.Pintaric also presents an adaptive thresholding algorithm specifically for augmented reality markers[Pintaric2003],however his method requires that afiducial has been located in a previous frame in order for the technique to threshold correctly.Our algorithm makes no assumptions and is more general,suitable for use in any application.The source code is available online at the address listed at the end of this paper.
2Background
2.1Real-Time Adaptive Thresholding
In this paper we focus on adaptively thresholding images from a live video stream.In order to maintain real-time performance,the thresholding algorithm must be limited to a small constant number of iterations through each image.Thresholding is often a sub-task that makes up part of a lar
ger process.For instance in augmented reality, input images must be segmented to locate known markers in the scene that are used to dynamically establish the pose of the camera.A simple and fast adaptive thresholding technique is therefore an important tool.
2.2Integral Images
An integral image(also known as a summed-area table)is a tool that can be used whenever we have a function from pixels to real numbers f(x,y)(for instance,pixel intensity),and we wish to compute the sum of this function over a rectangular region of the image.Examples of where integral images have been applied include texture mapping[Crow1984],face detection in images[Viola and Jones2004],and stereo correspondence[Veksler2003]. Without an integral image,the sum can be computed in linear time per rectangle by calculating the value of the function for each pixel individually.However,if we need to compute the sum over multiple overlapping rectangular windows,we can use an integral image and achieve a constant number of operations per rectangle with only a linear amount of preprocessing.
To compute the integral image,we store at each location,I(x,y),the sum of all f(x,y)terms to the left and above the pixel(x,y).This is accomplished in linear time using the following equation for each pixel(taking into account the border cases),
I(x,y)=f(x,y)+I(x−1,y)+I(x,y−1)−I(x−1,y−1).(1) Figure2(left and center)illustrates the computation of an integral image.Once we have the integral image,the sum of the function for any rectangle with upper left corner(x1,y1),and lower right corner(x2,y2)can be computed in constant time using the following equation,
x2∑x=x1
y2
∑
y=y1
f(x,y)=I(x2,y2)−I(x2,y1−1)−I(x1−1,y2)+I(x1−1,y1−1).(2)
Figure2(right)illustrates that computing the sum of f(x,y)over the rectangle D using Equation2is equivalent to computing the sums over the rectangles(A+B+C+D)-(A+B)-(A+C)+A.
Figure2:The integral image.Left:A simple input of image values.Center:The computed integral image.Right: Using the integral image to calculate the sum over rectangle D.
3The Technique
Our adaptive thresholding technique is a simple extension of Wellner’s method[Wellner1993].The main idea in Wellner’s algorithm is that each pixel is compared to an average of the surrounding pixels.Specifically,an approximate moving average of the last s pixels seen is calculated while traversing the image.If the value of the current pixel is t percent lower than the average then it is set to black,otherwise it is set to white.This method works because comparing a pixel to the average of nearby pixels will preserve hard contrast lines and ignore soft gradient changes.The advantage of thi
s method is that only a single pass through the image is required.Wellner uses1/8th of the image width for the value of s and15for the value of t.However,a problem with this method is that it is dependent on the scanning order of the pixels.In addition,the moving average is not a good representation of the surrounding pixels at each step because the neighbourhood samples are not evenly distributed in all directions.By using the integral image(and sacrificing one additional iteration through the image),we present a solution that does not suffer from these problems.Our technique is clean,straightforward,easy to code,and produces the same output independently of how the image is processed.Instead of computing a running average of the last s pixels seen,we compute the average of an s x s window of pixels centered around each pixel.This is a better average for comparison since it considers neighbouring pixels on all sides.The average computation is accomplished in linear time by using the integral image.We calculate the integral image in thefirst pass through the input image.In a second pass,we compute the s x s average using the integral image for each pixel in constant time and then perform the comparison. If the value of the current pixel is t percent less than this average then it is set to black,otherwise it is set to white. The following pseudocode demonstrates our technique for input image in,output binary image out,image width w and image height h.
procedure AdaptiveT hreshold(in,out,w,h)
1:for i=0to w do
2:sum←0
3:for j=0to h do
4:sum←sum+in[i,j]
5:if i=0then
6:intImg[i,j]←sum
7:else
8:intImg[i,j]←intImg[i−1,j]+sum
9:end if
10:end for
11:end for
12:
for i =0to w do 13:
for j =0to h do 14:
x 1←i −s /2{border checking is not shown }15:
x 2←i +s /216:
y 1←j −s /217:
y 2←j +s /218:
count ←(x 2−x 1)×(y 2−y 1)19:
sum ←intImg [x 2,y 2]−intImg [x 2,y 1−1]−intImg [x 1−1,y 2]+intImg [x 1−1,y 1−1]20:
if (in [i ,j ]×count )≤(sum ×(100−t )/100)then 21:
out [i ,j ]←022:
else 23:
out [i ,j ]←25524:
end if 25:
end for 26:end for
4Performance and Examples
We tested our real-time adaptive thresholding technique on a Pentium 4processor at 3.4Ghz with a Point Grey Dragonfly camera,capturing at a resolution of 640x 480pixels.On average our technique operates at approximately 15milliseconds per frame,yielding a frame-rate of 65frames per second.Since we require two iterations through each image rather than just one as in Wellner’s method,we expect to perform slower.In practice we measured our technique at approximately 2.5times slower than that of Wellner,however we still achieve a real-time frame-rate and a better segmentation.
Two examples of our adaptive thresholding result are presented in Figure 1and Figure 3.The results of Wellner’s method are also shown.Figure 1illustrates a text example with a very dark shadow.Our
method is able to segment all of the text in the image.Figure 3illustrates an augmented reality example with a number of planar markers that are tracked in a video stream.In this example our technique provides near perfect segmentation despite the strong illumination changes in the image.Wellner’s technique fails at the specular reflection in the center of the image at the bottom,and the dark shadow in the top corners.
Figure 3:Augmented reality marker example.Left:Input image.Center:Wellner’s method.Right:Our technique.5Discussion
Our previous work in augmented reality [Bradley and Roth 2004]demonstrates a need for robust adaptive threshold-ing of real-time video streams.Our technique is well-suited for scenes with strong spatial changes in illumination.
Temporal variations in illumination are also handled automatically,which is not the case for global thresholding methods.The main drawback of our method is that we must process the image twice.However,this is not a sig-nificant drawback with the current speed of processors and our technique is still suitable for real-time applications. The amount of processing can also be minimized by performing the thresholding step at pixel(i−s/2,j−s/2)at the same time as computing the integral image at pixel(i,j).In this case,a constant amount of processing must be performed for the following number of pixels,
(w×h)+(s
2
×w)+(
s
2
×h)−(
s2
4
).(3)
This technique for thresholding makes the assumption that the image contains primarily background pixels(to be segmented as white)and that the foreground pixels are rather distributed in the image.In the case of a foreground component that is larger than s x s pixels,the center of the component will be incorrectly classified as background. This problem can be alleviated by using a larger kernel(ie:by increasing s),howeverfine details in the segmentation may then be lost.The kernel size should be set appropriately according to the application.
References
B ERNSEN,J.1986.Dynamic thresholding of gray-level images.In Int.Conf.Pattern Recognition,vol.2,1251–1255.
B ILLINGHURST,M.,K ATO,H.,AND P OUPYREV,I.2001.The magicbook-moving seamlessly between reality and virtuality.IEEE Comput.Graph.Appl.21,3,6–8.
B RADLEY,D.,AND R OTH,G.2004.Augmenting non-rigid objects with realistic lighting.Tech.Rep.NRC47398, National Research Council of Canada,October.
C HAN,F.H.Y.,L AM,F.K.,AN
D Z HU,H.1998.Adaptive thresholding by variational method.IEE
E Transactions on Image Processing7,3(March),468–473.
C ROW,F.C.1984.Summed-area tables for texture mapping.SIGGRAPH Comput.Graph.18,3,207–212.
F IALA,M.2005.ARTag,afiducial marker system using digital techniques.In Proc.of Computer Vision and Pattern Recognition,vol.2,590–596.
G LASBEY,C.A.1993.An analysis of histogram-based thresholding algorithms.CVGIP:Graph.Models Image Process.55,6,532–537.
L EE,S.,C HUNG,S.,AND P ARK,R.1990.A comparative performance study of several global thresholding techniques for segmentation.Computer Vision Graphics Image Processing52,2(Nov.),171–190.
P ALUMBO,P.W.,S WAMINATHAN,P.,AND S RIHARI,S.N.1986.Document image binarization:Evaluation of algorithms.SPIE697,278–285.
P ARKER,J.R.1991.Gray level thresholding in badly illuminated images.IEEE Trans.Pattern Anal.Mach.Intell. 13,8,813–819.
P INTARIC,T.2003.An adaptive thresholding algorithm for the augmented reality toolkit.In IEEE Int.Augmented Reality Toolkit Workshop.
S AHOO,P.K.,S OLTANI,S.,W ONG,A.K.,AND C HEN,Y.C.1988.A survey of thresholding techniques. Comput.Vision Graph.Image Process.41,2,233–260.
S AUVOLA,J.,AND P IETIKAINEN,M.2000.Adaptive document image binarization.PR33,2(February),225–236.
S AVAKIS,A.E.1998.Adaptive document image thresholding using foreground and background cluste
ring.In ICIP(3),785–789.
S EZGIN,M.,AND S ANKUR,B.2004.Survey over image thresholding techniques and quantitative performance evaluation.Journal of Electronic Imaging13,1,146–168.
S HEN,D.,AND I P,H.H.S.1997.A hopfield neural network for adaptive image segmentation:An active surface paradigm.PRL18,37–48.
T RIER,O.D.,AND J AIN,A.K.1995.Goal-directed evaluation of binarization methods.PAMI17,12(December), 1191–1201.
V EKSLER,O.2003.Fast variable window for stereo correspondence using integral images.In IEEE Conference on Computer Vision and Pattern Recognition(CVPR2003),556–661.
adaptiveV ENKATESWARLU,N.B.,AND B OYLE,R.D.1995.New segmentation techniques for document image analysis. IVC13,7(September),573–583.
V IOLA,P.,AND J ONES,M.J.2004.Robust real-time face detection.Int.J.Comput.Vision57,2,137–154.
W ARD,G.2003.Fast,robust image registration for compositing high dynamic range photographs from hand-held exposures.journal of graphics tools8,2,17–30.
W ELLNER,P.D.1993.Adaptive thresholding for the digitaldesk.Tech.Rep.EPC-93-110,EuroPARC.
W ESZKA,J.S.,AND R OSENFELD,A.1978.Threshold evaluation techniques. System,Man and Cybernetics SMC-8,8,622–629.
W HITE,J.M.,AND R OHRER,G.D.1983.Image thresholding for optical character recognition and other applica-tions requiring character image extraction.IBMRD27,4(July),400–411.
Y ANG,Y.,AND Y AN,H.2000.An adaptive logical method for binarization of degraded document images.PR33, 5(May),787–807.
Y ANG,J.D.,C HEN,Y.S.,AND H SU,W.H.1994.Adaptive thresholding algorithm and its hardware implemen-tation.PRL15,141–150.
Web Information:
Source code for our technique and additional image examples are available on the web.The web address is www.derekbradley.ca/AdaptiveThresholding/index.html.
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系QQ:729038198,我们将在24小时内删除。
发表评论