by bishala on 8/28/19, 4:38 AM with 144 comments
by coldtea on 8/28/19, 10:58 AM
From the comment in protobuf source (which does the same thing as Python), mentioned in the Twitter thread:
(...) An arguably better strategy would be to use the algorithm described in "How to Print Floating-Point Numbers Accurately" by Steele & White, e.g. as implemented by David M. Gay's dtoa(). It turns out, however, that the following implementation is about as fast as DMG's code. Furthermore, DMG's code locks mutexes, which means it will not scale well on multi-core machines. DMG's code is slightly more accurate (in that it will never use more digits than necessary), but this is probably irrelevant for most users.
Rob Pike and Ken Thompson also have an implementation of dtoa() in third_party/fmt/fltfmt.cc. Their implementation is similar to this one in that it makes guesses and then uses strtod() to check them. (...)
https://github.com/protocolbuffers/protobuf/blob/ed4321d1cb3...
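The guess-then-verify idea quoted above can be sketched in Python (a simplified illustration, not the protobuf or fltfmt.cc code; shortest_repr is a made-up name):

```python
def shortest_repr(x: float) -> str:
    # Simplified sketch of the guess-then-verify strategy: try
    # increasing precision and keep the first string that survives a
    # strtod()-style round trip. 17 significant digits always suffice
    # to round-trip a double.
    for precision in range(1, 18):
        s = f"{x:.{precision}g}"
        if float(s) == x:
            return s
    return repr(x)

print(shortest_repr(0.1))  # 0.1
```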
by fs111 on 8/28/19, 10:21 AM
by Noe2097 on 8/28/19, 11:27 AM
by bhouston on 8/28/19, 11:34 AM
I always implemented rounding to a specific digit based on the built-in roundss/roundsd functions, which are native x86-64 assembler instructions (i.e. https://www.felixcloutier.com/x86/roundsd).
I do not understand why this would not be preferable to the string method.
float round( float x, int digits, int base ) {
    float factor = pow( base, digits );
    return roundss( x * factor ) / factor;
}
I guess this has the effect of not working for numbers near the edge of its range.
One could check this and fall back to the string method. Or alternatively use higher precision doubles internally:
float round( float x, int digits, int base ) {
    double factor = pow( base, digits );
    return (float)( roundsd( x * factor ) / factor );
}
But then what do you do if you have a double to round and want to maintain all precision? I think there is likely some way to do that by unpacking the double into a manual mantissa and exponent, each of which is itself a double, and doing this manually - or maybe by using some type of float128 library (https://www.boost.org/doc/libs/1_63_0/libs/multiprecision/do...)...
But changing this implementation now could cause slight differences, and if someone was rounding and then hashing, this type of change could be horrible if not placed behind some type of opt-in.
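A Python sketch of the same scale/round/unscale idea (round_scaled is a hypothetical name; math.floor(v + 0.5) rounds half away from zero here, whereas roundsd's default mode rounds half to even):

```python
import math

def round_scaled(x: float, digits: int) -> float:
    # Hypothetical helper mirroring the C version above: scale, round
    # to an integer, unscale.
    factor = 10.0 ** digits
    return math.copysign(math.floor(abs(x) * factor + 0.5) / factor, x)

# The intermediate x * factor is itself rounded to the nearest double,
# so results can differ from Python's correctly-rounded round():
print(round_scaled(2.5, 0))  # 3.0, while round(2.5) == 2 (half to even)
```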
by bishala on 8/28/19, 4:39 AM
by shellac on 8/28/19, 11:03 AM
by latchkey on 8/28/19, 10:10 AM
by zelly on 8/28/19, 11:58 AM
by analog31 on 8/28/19, 12:27 PM
In some cases, rounding is performed for the primary purpose of displaying a number as a string, in which case it can't be any less complicated than the string conversion function itself.
by jancsika on 8/28/19, 3:43 PM
Is there a phrase for the ratio between the frequency of an apparent archetype of a bug/feature and the real-world occurrences of said bug/feature? If not, then perhaps the "Fudderson-Hypeman ratio", in honor of its namesakes.
For example, I'm sure every C programmer on here has their favored way to quickly demo what bugs may come from C's null-terminated strings. But even though C programmers are quick to cite that deficiency, I'd bet there's a greater occurrence of C string bugs in the wild. Thus we get a relatively low Fudderson-Hypeman ratio.
On the other hand: "0.1 + 0.2 != 0.3"? I'm just thinking back through the mailing list and issue tracker for a realtime DSP environment that uses single-precision floats exclusively as the numeric data type. My first approximation is that there are significantly more didactic quotes of that example than reports of problems due to the class of bugs that archetype represents.
Does anyone have some real-world data to trump my rank speculation? (Keep in mind that simply replying with more didactic examples will raise the Fudderson-Hypeman ratio.)
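For reference, the archetype quoted above is a one-liner to reproduce:

```python
# 0.1, 0.2 and 0.3 are all binary approximations, and the accumulated
# error in the sum shows up in the repr.
print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False
```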
by d--b on 8/28/19, 12:03 PM
by ericfrederich on 8/28/19, 1:27 PM
I remember even writing a program that tested every possible floating point number (it must have only been 32-bit). I think I used ctypes and interpreted every binary combination of 32 bits as a float, turned it into a string, then back, and checked equality. A lot of them were NaN.
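That exhaustive test can be sketched with struct instead of ctypes (roundtrip_ok is a made-up name; the full loop over all 2**32 patterns is slow, so this spot-checks two):

```python
import struct

def roundtrip_ok(bits: int) -> bool:
    # Reinterpret the 32-bit pattern as a float (as ctypes would),
    # convert it to a string and back, and compare.
    (x,) = struct.unpack("<f", struct.pack("<I", bits))
    return float(repr(x)) == x  # every NaN pattern fails: nan != nan

print(roundtrip_ok(0x3F800000))  # 1.0 round-trips: True
print(roundtrip_ok(0x7FC00000))  # a quiet NaN: False
```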
by deckar01 on 8/28/19, 1:23 PM
by ChrisSD on 8/28/19, 10:15 AM
by dahart on 8/28/19, 5:08 PM
by kstenerud on 8/29/19, 5:46 AM
The only silly part of IEEE 754-2008 is the fact that it specified two decimal representations (DPD, championed by IBM, and BID, championed by Intel) with no way to tell them apart.
by science404 on 8/28/19, 1:59 PM
CPython rounds float values by converting them to a string and then back
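Conceptually, that looks like the sketch below (an illustration only, not CPython's actual C implementation, which uses David Gay's dtoa/strtod; round_via_string is a made-up name):

```python
def round_via_string(x: float, ndigits: int) -> float:
    # Format with ndigits decimal places (correctly rounded), then
    # parse back. Only valid for ndigits >= 0 in this sketch; CPython
    # handles negative ndigits and overflow separately.
    return float(f"{x:.{ndigits}f}")

print(round_via_string(2.675, 2))  # 2.67 -- 2.675 is stored as 2.67499...
print(round(2.675, 2))             # 2.67 as well
```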
by Jenz on 8/28/19, 10:07 AM
by acoye on 8/28/19, 3:59 PM
by seamyb88 on 8/28/19, 4:14 PM