Module std::num::dec2flt
[Unstable] (dec2flt): internal routines only exposed for testing
Converting decimal strings into IEEE 754 binary floating point numbers.
Problem statement
We are given a decimal string such as 12.34e56. This string consists of integral (12), fractional (34), and exponent (56) parts. All parts are optional and interpreted as zero when missing.
We seek the IEEE 754 floating point number that is closest to the exact value of the decimal string. It is well known that many decimal strings do not have terminating representations in base two, so we round to 0.5 units in the last place (in other words, as accurately as possible). Ties (decimal values exactly halfway between two consecutive floats) are resolved with the half-to-even strategy, also known as banker's rounding.
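As a concrete illustration of the rounding rule (a minimal sketch, not part of this module; it assumes an f64 target and writes out 1 + 2^-53, the exact midpoint between 1.0 and the next f64):

```rust
fn main() {
    // 1 + 2^-53 written out exactly: the midpoint between 1.0 and the next
    // representable f64, 1 + 2^-52. Half-to-even resolves the tie towards the
    // neighbor whose significand is even, which is 1.0.
    let halfway = "1.00000000000000011102230246251565404236316680908203125";
    assert_eq!(halfway.parse::<f64>().unwrap(), 1.0);

    // Any extra non-zero digit pushes the value strictly past the midpoint,
    // so the correctly rounded result becomes the next float up, 1 + 2^-52.
    let above = "1.000000000000000111022302462515654042363166809082031250001";
    assert_eq!(above.parse::<f64>().unwrap(), 1.0 + f64::EPSILON);
}
```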
Needless to say, this is quite hard, both in terms of implementation complexity and in terms of CPU cycles taken.
Implementation
First, we ignore the sign. Or rather, we remove it at the very beginning of the conversion process and re-apply it at the very end. This is correct in all edge cases since IEEE floats are symmetric around zero: negating one simply flips the first bit.
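This symmetry is easy to check on the bit level (an illustration only, not code from this module):

```rust
fn main() {
    // Negating an IEEE 754 double only flips the topmost (sign) bit of its
    // bit pattern, so the magnitude can be converted on its own and the sign
    // re-applied at the very end.
    let x = 12.34e56_f64;
    assert_eq!((-x).to_bits(), x.to_bits() ^ (1u64 << 63));
    // The same holds at zero: -0.0 and 0.0 differ only in the sign bit.
    assert_eq!((-0.0_f64).to_bits(), 0.0_f64.to_bits() ^ (1u64 << 63));
}
```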
Then we remove the decimal point by adjusting the exponent: conceptually, 12.34e56 turns into 1234e54, which we describe with a positive integer f = 1234 and an integer e = 54. The (f, e) representation is used by almost all code past the parsing stage.
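As a toy illustration of that step (the names and types here are made up; the real decomposition lives in the parse submodule):

```rust
fn main() {
    // 12.34e56: integral part "12", fractional part "34", exponent 56.
    let (integral, fractional, exp) = ("12", "34", 56_i64);
    // Dropping the decimal point scales the significand up by one power of
    // ten per fractional digit, so the exponent shrinks by the same amount.
    let f: u64 = format!("{}{}", integral, fractional).parse().unwrap();
    let e = exp - fractional.len() as i64;
    assert_eq!((f, e), (1234, 54));
}
```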
We then try a long chain of progressively more general and expensive special cases using machine-sized integers and small, fixed-size floating point numbers (first f32/f64, then a type with a 64 bit significand, Fp). When all these fail, we bite the bullet and resort to a simple but very slow algorithm that involves computing f * 10^e fully and doing an iterative search for the best approximation.
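The cheapest of these special cases is the classic fast path from Clinger's paper: when both f and 10^|e| are exactly representable in the target type, a single correctly rounded multiplication or division already gives the correctly rounded result. A minimal sketch for f64 (not this module's actual code; the bounds f <= 2^53 and |e| <= 22 are the usual f64 ones):

```rust
/// Sketch of Clinger's fast path for f64 (hypothetical helper, not this
/// module's API): if `f` fits the 53-bit significand exactly and 10^|e| is
/// also exact in f64, one correctly rounded multiplication or division
/// already produces the correctly rounded result.
fn fast_path(f: u64, e: i32) -> Option<f64> {
    let abs_e = e.unsigned_abs();
    // 10^0 through 10^22 are exactly representable in f64 (5^22 < 2^53),
    // and f must fit into the 53-bit significand without rounding.
    if f > (1u64 << 53) || abs_e > 22 {
        return None;
    }
    // Every intermediate product is an exact power of ten <= 10^22, so this
    // loop introduces no rounding error.
    let pow10 = (0..abs_e).fold(1.0_f64, |p, _| p * 10.0);
    let f = f as f64; // exact, since f <= 2^53
    Some(if e >= 0 { f * pow10 } else { f / pow10 })
}

fn main() {
    assert_eq!(fast_path(1234, 2), Some(123400.0));
    assert_eq!(fast_path(1234, -2), Some(12.34));
    assert_eq!(fast_path(1234, 54), None); // exponent too large for this path
}
```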
Primarily, this module and its children implement the algorithms described in: "How to Read Floating Point Numbers Accurately" by William D. Clinger, available online: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.45.4152
In addition, there are numerous helper functions that are used in the paper but not available in Rust (or at least in core). Our version is additionally complicated by the need to handle overflow and underflow and the desire to handle subnormal numbers. Bellerophon and Algorithm R have trouble with overflow, subnormals, and underflow. We conservatively switch to Algorithm M (with the modifications described in section 8 of the paper) well before the inputs get into the critical region.
Another aspect that needs attention is the RawFloat trait by which almost all functions are parametrized. One might think that it's enough to parse to f64 and cast the result to f32. Unfortunately this is not the world we live in, and this has nothing to do with using base two or half-to-even rounding.
Consider for example two types d2 and d4 representing decimal types with two and four decimal digits respectively, and take "0.01499" as input. Let's use half-up rounding. Going directly to two decimal digits gives 0.01, but if we round to four digits first, we get 0.0150, which is then rounded up to 0.02. The same principle applies to other operations as well: if you want 0.5 ULP accuracy you need to do everything in full precision and round exactly once, at the end, by considering all truncated bits at once.
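The same effect shows up with the real float types. The following sketch (illustrative only, not taken from this module) uses a decimal that lies just above the midpoint between 1.0 and the next f32; parsing directly must round up, while a detour through f64 lands on the midpoint and the subsequent cast resolves the tie downwards:

```rust
fn main() {
    // 1.000000059604644775390625 is exactly 1 + 2^-24, the midpoint between
    // 1.0 and the next f32 (1 + 2^-23). The trailing "1" puts the true value
    // strictly above that midpoint.
    let s = "1.0000000596046447753906251";

    // A correctly rounded conversion to f32 must therefore round up.
    let direct: f32 = s.parse().unwrap();
    assert_eq!(direct, 1.0 + f32::EPSILON); // 1 + 2^-23

    // Parsing to f64 first collapses the value onto the midpoint itself (the
    // excess is far below f64's precision), and the cast to f32 then resolves
    // the tie to the even neighbor: double rounding yields 1.0 instead.
    let indirect = s.parse::<f64>().unwrap() as f32;
    assert_eq!(indirect, 1.0);

    assert_ne!(direct, indirect);
}
```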
FIXME: Although some code duplication is necessary, perhaps parts of the code could be shuffled around such that less code is duplicated. Large parts of the algorithms are independent of the float type to output, or only need access to a few constants, which could be passed in as parameters.
Other
The conversion should never panic. There are assertions and explicit panics in the code, but they should never be triggered and only serve as internal sanity checks. Any panics should be considered a bug.
There are unit tests but they are woefully inadequate at ensuring correctness; they only cover a small percentage of possible errors. Far more extensive tests are located in the directory src/etc/test-float-parse as a Python script.
A note on integer overflow: many parts of this file perform arithmetic with the decimal exponent e. Primarily, we shift the decimal point around: before the first decimal digit, after the last decimal digit, and so on. This could overflow if done carelessly. We rely on the parsing submodule to only hand out sufficiently small exponents, where "sufficient" means "such that the exponent +/- the number of decimal digits fits into a 64 bit integer". Larger exponents are accepted, but we don't do arithmetic with them; they are immediately turned into {positive,negative} {zero,infinity}.
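A rough sketch of that policy (the function name, signature, and cutoff below are hypothetical, chosen only to illustrate the idea; the real bounds live in the parsing submodule):

```rust
/// Hypothetical early-out (names and cutoff invented for illustration): when
/// the decimal exponent is so extreme that the digits cannot change the
/// outcome, produce the result immediately instead of doing arithmetic on it.
fn extreme_exponent(e: i64, num_digits: u64, negative: bool) -> Option<f64> {
    // Finite f64 values live between roughly 1e-324 and 1.8e308, so a bound
    // far outside that range is safe no matter how many digits were given.
    const BOUND: i64 = 0x10000;
    let sign = if negative { -1.0_f64 } else { 1.0_f64 };
    if e > BOUND {
        Some(sign * f64::INFINITY) // overwhelmingly large: +/- infinity
    } else if e.saturating_add(num_digits as i64) < -BOUND {
        Some(sign * 0.0) // overwhelmingly small even after all digits: +/- zero
    } else {
        None // small enough that exponent arithmetic in i64 cannot overflow
    }
}
```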
Modules
parse | [Unstable] Validating and decomposing a decimal string of the form: |
rawfp | [Unstable] Bit fiddling on positive IEEE 754 floats. Negative numbers aren't and needn't be handled. Normal floating point numbers have a canonical representation as (frac, exp) such that the value is 2^exp * (1 + sum(frac[N-i] / 2^i)) where N is the number of bits. Subnormals are slightly different and weird, but the same principle applies. |
Structs
ParseFloatError | An error which can be returned when parsing a float. |