arXiv:cs/0703003v1 [cs.NA] 28 Feb 2007

Functions to Support Input and Output of Intervals
M. H. van Emden∗   B. Moa†   S. C. Somosan‡
Research Report DCS-311-IR
Department of Computer Science
University of Victoria, Canada
Abstract

Interval arithmetic is hardly feasible without directed rounding as provided, for example, by the IEEE floating-point standard. Equally essential for interval methods is directed rounding for conversion between the external decimal and internal binary numerals. This is not provided by the standard I/O libraries. Conversion algorithms exist that guarantee identity upon conversion followed by its inverse. Although it may be possible to adapt these algorithms for use in decimal interval I/O, we argue that outward rounding in radix conversion is computationally a simpler problem than guaranteeing identity. Hence it is preferable to develop decimal interval I/O ab initio, which is what we do in this paper.

1 Introduction

Interval arithmetic endows every computation with the authority of proof, the theorem being that the real-valued solution x belongs to a set of reals [a,b], where a and b are IEEE-standard floating-point numbers. Yet it can easily happen that we get as output something obviously wrong such as [0.33333, 0.33333] when x = 1/3. It may well be that a correct interval has been computed. For example, if [a,b] is the narrowest single-length IEEE-standard floating-point interval containing 1/3, and if we do not allow the output to be truncated, we get [a,b] as
[0.333333313465118408203125,
0.3333333432674407958984375],
which is best written as
0.3333333[13465118408203125, 432674407958984375],

a notation proposed and analyzed in [9]. This is the unabridged version of [0.33333, 0.33333],
which is what we get with a typical default precision. The untruncated decimal numerals are exact representations of [a,b] because every binary floating-point number can be represented as a decimal numeral. However, if one wants to ensure exact representation, then we get a decimal digit for every bit, which is not an economical representation. Then we might as well write the bits themselves, which gives
0.010101010101010101010101[0,1].
On output the only problem is that any reasonable choice of display precision causes the standard output routines to shorten the numerals for the bounds to the same result. The improvement needed here is output that is aware of whether an upper or a lower bound is to be displayed.
On input, however, we have a more serious problem. Suppose we want to initialize an interval variable at 0.1. Initializing the lower and upper bounds with
lb=ub=0.1
is guaranteed to generate an interval that does not contain 0.1, for the simple reason that there is no floating-point number equal to 0.1. Thus the best the compiler can do is to produce one of the floating-point numbers closest to 0.1. We don't know which one it is. The table in Figure 1 shows for each of 1/2, 1/3, ..., 1/10 that the compiler picks the upper bound of the narrowest interval containing the fraction concerned. To remind us that we cannot count on this to happen, the other choice is made for 1/11. Thus, for input we need an algorithm that produces, for every fraction or numeral as input, a narrow interval containing it; ideally the narrowest.
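The effect is easy to observe. The following is a minimal sketch (ours, not part of the library developed in this paper): it prints the single-length value actually stored for 0.1 and checks the degenerate interval against the much closer double approximation of 1/10.

#include <cstdio>

int main() {
  float lb = 0.1f, ub = 0.1f;   // both bounds "initialized at 0.1"
  // Printing with many digits shows that neither bound is equal to 1/10:
  // the compiler has silently picked one of the two neighbouring floats.
  printf("lb = %.30f\n", lb);
  // 0.1 as a double is itself rounded, but it is so much closer to 1/10
  // than any float that it serves to show that the interval misses 1/10.
  printf("contains 1/10? %s\n", (lb <= 0.1 && 0.1 <= ub) ? "yes" : "no");
  return 0;
}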
Floating-point numbers are normalized as a power of 2 multiplied by a mantissa that is between 1 and 2, like this:
2^(-2) * 1.0101010101010101010101[0,1].
This shows the single-length format, which has 23 bits after the binary point.
Long strings of bits are usually shown in hexadecimal notation, which uses the 16 characters

0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f

to represent four consecutive bits at a time. As there are only 23 bits to be displayed, we represent the first three bits in octal notation, with the characters

0, 1, 2, 3, 4, 5, 6, 7.

Thus the 23 bits after the binary point are shown as an octal character followed by five hexadecimal characters. For example, the narrowest interval containing 1/3 is
2^(-2) * 1.2aaaa[a,b].
In this way we get the table in Figure 1.
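For concreteness, here is a small sketch (ours, not the code used to produce Figure 1) that prints a positive, normal single-length float in this notation, assuming the IEEE 754 layout of one sign bit, eight exponent bits and 23 fraction bits:

#include <cstdint>
#include <cstdio>
#include <cstring>

// Print a positive, normal single-length float as 2^(E) * 1.OHHHHH, where
// O is one octal digit (the first 3 fraction bits) and HHHHH are five
// hexadecimal digits (the remaining 20 fraction bits).
void print_figure1_style(float x) {
  uint32_t bits;
  memcpy(&bits, &x, sizeof bits);                    // reinterpret the IEEE 754 layout
  int exponent = (int)((bits >> 23) & 0xFF) - 127;   // unbiased exponent
  uint32_t fraction = bits & 0x7FFFFF;               // the 23 bits after the binary point
  printf("2^(%d) * 1.%o%05x\n",
         exponent,
         fraction >> 20,                             // first 3 bits as one octal digit
         fraction & 0xFFFFF);                        // remaining 20 bits as 5 hex digits
}

int main() {
  print_figure1_style(1.0f / 3.0f);   // prints 2^(-2) * 1.2aaaab
  print_figure1_style(1.0f / 7.0f);   // prints 2^(-3) * 1.124925
  return 0;
}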
In this paper we develop algorithms to support software that performs correct input and output of intervals.
1/i   floating-point number        narrowest interval
      produced by standard         containing 1/i
      I/O library via compiler
--------------------------------------------------------
1/2   2^(-1) * 1.000000            2^(-1) * 1.000000[,]
1/3   2^(-2) * 1.2aaaab            2^(-2) * 1.2aaaa[a,b]
1/4   2^(-2) * 1.000000            2^(-2) * 1.000000[,]
1/5   2^(-3) * 1.4ccccd            2^(-3) * 1.4cccc[c,d]
1/6   2^(-3) * 1.2aaaab            2^(-3) * 1.2aaaa[a,b]
1/7   2^(-3) * 1.124925            2^(-3) * 1.12492[4,5]
1/8   2^(-3) * 1.000000            2^(-3) * 1.000000[,]
1/9   2^(-4) * 1.638e39            2^(-4) * 1.638e3[8,9]
1/10  2^(-4) * 1.4ccccd            2^(-4) * 1.4cccc[c,d]
1/11  2^(-4) * 1.3a2e8c            2^(-4) * 1.3a2e8[c,d]

Figure 1: Why standard input cannot be relied on to obtain an interval for some of the common fractions.
2 Previous work
The problems described in the introduction have motivated Rump [8] to develop a method for decimal input and output for intervals. He computes two 2-dimensional arrays low and upp of double-length floating-point numbers that satisfy
low[d, e] ≤ d × 10^e ≤ upp[d, e]
for d ∈ {1, 2, ..., 9} and for e ∈ {−340, −339, ..., 308}. These two arrays contain a total of 11682 (that is, 2 × 9 × 649) double-length floating-point numbers.
With these precomputed numbers available, a decimal 0.d_1 d_2 ... d_k × 10^e is converted on input to the interval

  [ Σ_{i=1}^{k} low[d_i, e−i] ,  Σ_{i=1}^{k} upp[d_i, e−i] ]        (1)
of double-length floating-point numbers, where the inner brackets enclose array subscripts. The additions for the lower (upper) bound are performed in downward (upward) rounding mode. As a result of the roundings that may occur, the interval obtained is not in general as narrow as possible. To minimize the inevitable widening, Rump recommends performing the additions starting with the smallest array elements.
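The rounding discipline in (1) can be sketched with the C99/C++11 floating-point environment functions. The code below is our illustration, not Rump's: the terms are arbitrary stand-ins for the table entries low[d_i, e−i] and upp[d_i, e−i], and only the rounding modes of the two summations differ.

#include <cfenv>
#include <cstdio>

// Sum the same terms twice, once with every addition rounded downward and
// once rounded upward, so that the exact sum of the stored doubles lies in
// [lo, hi].  In Rump's method the two summations would also use different
// terms (low versus upp), and the additions start with the smallest terms.
int main() {
  // In principle #pragma STDC FENV_ACCESS ON is required for fesetround to
  // be respected; aggressive optimization may also interfere.
  double terms[] = {1e-5, 1e-4, 1e-3, 1e-2, 1e-1};   // smallest first
  int saved = fegetround();

  fesetround(FE_DOWNWARD);
  double lo = 0.0;
  for (int i = 0; i < 5; ++i) lo += terms[i];

  fesetround(FE_UPWARD);
  double hi = 0.0;
  for (int i = 0; i < 5; ++i) hi += terms[i];

  fesetround(saved);                                 // restore the caller's mode
  printf("lo = %.17g\nhi = %.17g\n", lo, hi);        // lo <= exact sum <= hi
  return 0;
}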
The work of Rump should be compared and contrasted with what we will call here general-purpose conversion algorithms. Though these are not specifically intended for interval conversions, they can perhaps be adapted to this purpose because of their guarantees on accuracy.
Let us briefly review these algorithms. Steele and White [5] formulated identity requirements for conversions between internal and external numerals. The internal identity requirement is that conversion of an internal numeral to external and back results in the same internal numeral. If the external numerals are unlimited in length, this requirement can always be met. Similarly, the external identity requirement is that conversion of an external numeral to internal and back results in the same external numeral. Usually, the internal numerals have a fixed length, so this requirement can only be met when the external numeral is not longer than is warranted by this fixed length.
Every binary numeral is representable as a decimal one. Hence, if the internal numerals are binary and the external ones are decimal, then it is trivial to satisfy the internal identity requirement by using the decimal equivalent of the internal binary numeral. The problem addressed by Steele and White is to do so with as few decimal digits as possible.
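A quick illustration of the internal identity requirement (our example, not from the cited papers): 17 significant decimal digits are known to be sufficient for an IEEE double to round-trip, whereas Steele and White's algorithm emits as few digits as will still round-trip.

#include <cstdio>
#include <cstdlib>

int main() {
  double x = 1.0 / 3.0;
  char buf[64];
  snprintf(buf, sizeof buf, "%.17g", x);   // internal (binary) -> external (decimal)
  double y = strtod(buf, nullptr);         // external (decimal) -> internal (binary)
  printf("%s round-trips: %s\n", buf, x == y ? "yes" : "no");   // prints "yes"
  return 0;
}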
To transfer binary floating-point numerals from one computer to another by means of binary files, one needs to be assured that these files are written and read in a compatible way. Moreover, the layout of the floating-point numerals of both machines needs to be the same. Adherence to the IEEE floating-point format is not enough: in addition, both machines have to be big-endian or both need to be little-endian. Because of these difficulties it is attractive to convert internal binary numerals to an external text file containing decimal numerals and to have an algorithm that guarantees faithful translation back to internal format.
Such use of decimal I/O requires that the conversion be efficient. For this reason, Gay [3] and Burger and Dybvig [1] devised faster decimal output. The identity requirement assumes sufficiently accurate decimal input; this was addressed by Clinger [2] and by Gay [3].
Compared to Rump's approach, this has the advantage of not requiring a database of precomputed numbers.
The identity requirements satisfied by [5, 3, 1, 2] are convincing for the purposes envisaged by these authors. However, the requirements for decimal interval I/O are equally compelling and quite different:

1. On input, compute the narrowest floating-point interval containing the decimal numeral.

2. On output, compute the narrowest interval bounded by decimal numerals of specified length that contains the floating-point number.
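For example (our illustration, using the single-length neighbours of 1/10): on input, the numeral 0.1 must yield the interval

  [0.0999999940395355224609375, 0.100000001490116119384765625],

whose bounds are the two floats enclosing 1/10; on output with five decimal digits, the float 0.100000001490116119384765625 must yield [0.10000, 0.10001].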
To satisfy these requirements it seemed simplest to develop our algorithms from first principles rather than to attempt to adapt the work referenced above.
From [4] it seems that XSC does the conversions between binary and decimal numerals correctly, but the authors do not say how this was done. Although the source code of XSC is publicly available (GPL license), it contains more than 150 files. Moreover, the code is not documented in such a way as to facilitate finding the pieces of code that do the conversion, gathering them, and understanding how the conversion was done.
Our task, then, is to explain how to do the conversion, and to show how to implement it. As the utility of this kind of work requires executable programs, we needed to pick an implementation language. The primary language for numerical work is Fortran. However, this work is more in the direction of systems programming, for which C/C++ is a reasonable choice of language.
3 Decimal input
The "scientific notation" for a number consisting of a decimal fraction and an exponent is almost universally used. Thus it would seem that one only has to cater to this format in I/O routines. Yet for input there is a strong case to be made for the pre-scientific notation of a fraction as a pair of integers. Therefore we consider these two in turn.
3.1 Rational fractions
In many situations the most convenient way to input a number is as a rational fraction p/q, where p and q are integers. Not only is it convenient, but there is also a convincing correctness criterion: as there exists, in a given format, a unique least floating-point interval that contains p/q, the input function should yield this interval.
An algorithm for this purpose is one that has been widely, but not universally, taught to children for at least two centuries. We will illustrate this algorithm with p/q = 3/7. As this number is less than 1, we know that the binary fraction has the form 0.… How do we get the missing digits? Multiply by two to get 6/7. As the result is less than one, the result has the form 0.0… Multiply by two to get 12/7. As the result is not less than one, the result has the form 0.01…, and we subtract 1 from 12/7, so that we continue with 5/7. Double again, get 10/7, so that we continue with 3/7 after noting that the result has the form 0.011… Since we are back at 3/7, the digits repeat, and the binary fraction is 0.011011011…
Let us check our computation. The first group of three digits after the binary point is worth 3/8. Every next group is worth 1/8 times the previous group of three. So the value of the infinite string is (3/8) ∗ (1 + (1/8) + (1/8)^2 + ...). Let S = 1 + (1/8) + (1/8)^2 + .... Then S/8 = (1/8) + (1/8)^2 + ... = S − 1. Hence S = 8/7 and we find that the infinite string is (3/8) ∗ (8/7) = 3/7.
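The hand computation is easy to mechanize. Here is a minimal sketch (ours) that prints the first n binary digits of p/q by repeated doubling; for 3/7 it prints the repeating pattern 011011…

#include <cstdio>

// Print the first n binary digits (after the binary point) of p/q,
// for 0 < p < q, using the doubling procedure described above.
void binary_digits(int p, int q, int n) {
  for (int i = 0; i < n; ++i) {
    p = 2 * p;                 // multiply the remaining fraction by two
    if (p >= q) {              // digit is 1: note it and subtract 1
      putchar('1');
      p = p - q;
    } else {                   // digit is 0
      putchar('0');
    }
  }
  putchar('\n');
}

int main() {
  binary_digits(3, 7, 24);     // prints 011011011011011011011011
  return 0;
}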
Let us now translate this procedure to a machine-executable algorithm. We can decide whether the next binary digit should be a 0 or a 1 by computing in a variable pwr (from "power") the successive powers 2^(−k) for k = 0, 1, 2, ... and adding some of these powers in a variable named frac (from "fraction"). We compute the next value of pwr at every step, but only add it to frac when deciding to write a 1 in the algorithm described above. This idea is embodied in the following code.
float frac = 0.0;   // fraction to be built up out of powers of two
float pwr = 1.0;    // power of 2 to add to fraction
// let r = p/q
while (frac + pwr > frac && p > 0) {
  // p > 0 and r - frac = (p/q) * pwr
  pwr = pwr/2.0; p = 2*p;
  if (p >= q) {
    p = p - q; frac = frac + pwr;
  }
  // if p = 0 then r = frac
  // if p > 0 then 0 < r - frac <= pwr
}
Typically, there are infinitely many binary digits. Only a finite initial segment can be accommodated in a floating-point number. At every iteration pwr is halved. At some point this quantity becomes insignificant. The criterion for this is whether adding pwr to frac makes any difference to frac. As soon as that is not the case, the iteration terminates.
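For instance, in single precision (a quick check of the termination criterion, not part of the algorithm; the volatile qualifiers merely keep the arithmetic in single precision):

#include <cstdio>

int main() {
  volatile float frac = 1.0f;
  volatile float pwr  = 1.0e-8f;     // well below half an ulp of 1.0f (about 6e-8)
  volatile float sum  = frac + pwr;  // rounds back to 1.0f
  printf("frac + pwr > frac ? %s\n", (sum > frac) ? "yes" : "no");   // prints "no"
  return 0;
}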
The last assertion states that every time around the loop we have that frac is a lower bound for p/q and that the difference between the two is at most 2^(−i), where i is the number of times around the loop. This code builds in frac a lower bound to p/q that approaches p/q as closely as the precision allows. If p/q can be represented in the floating-point number format, this shows up in p becoming 0.
The above segment of code is the kernel of the function shown in Figure 2 for the IEEE standard 754 single-length format. It produces the least floating-point interval that contains the fraction p/q.
Interval* convert(int p, int q) {
  // Assumes p and q are positive and less than 2^{30}.
  // Returns the least floating-point interval containing
  // the rational r = p/q.
  float sf = scaleFactor(p, q);   // scale factor; see Figure 3
  float frac = 0.0;   // fraction to be built up out of powers of two
  float pwr = 1.0;    // power of 2 to add to fraction
  while (frac + pwr > frac && p > 0) {
    // p > 0 and r - frac = (p/q) * pwr
    pwr = pwr/2.0; p = 2*p;
    if (p >= q) {
      p = p - q; frac = frac + pwr;
    } // p > 0 and r - frac = (p/q) * pwr
  }
  frac = frac*sf;
  if (p == 0)   // r = frac
    return new Interval(frac, frac);
  else
    return new Interval(frac, next(frac));
}

Figure 2: A C++ function to convert a rational p/q to a floating-point interval. The function next() produces the least floating-point number greater than its argument.
The function assumes that 0.5 ≤ p/q < 1, which is of course not in general the case. It therefore needs the function in Figure 3.
3.2 Input of decimal floating-point numerals
Many programming languages and data files use a similar format for fractional numbers. Though details may vary, it is easy to extract from such files an integer e containing the
