c - Conversion between float and int, byte representation?

深蓝 · Answer 1 · 2021-10-23T20:07:05+0000

The most common encoding of floating-point numbers uses IEEE 754. For single-precision numbers, there is a sign bit (s), 8 exponent bits (e), and 23 fraction bits (f).

For most values of s, e, and f, the value represented is (-1)^s × 2^(e-127) × F, where F is the number you get by writing “1.” followed by the 23 bits of f and then interpreting that string as a binary numeral. E.g., if f is 10000000000000000000000, then the binary numeral is 1.10000000000000000000000, which is (in decimal) 1.5, so F is 1.5.

The above holds whenever 0 < e < 255. The values 0 and 255 are special.

When e is 0, the value represented is (-1)^s × 2^(-126) × F, where F now starts with “0.” instead of “1.”. Note that the exponent is fixed at -126 here, not 0 - 127 = -127. In particular, if f is zero, then the value represented is zero (+0 or -0, according to the sign bit). If f is not zero, these are called denormal (or subnormal) numbers, because they are smaller in magnitude than the normal values represented in the primary way above.

When e is 255 and f is 0, the value represented is +infinity or -infinity, according to the sign bit, s. When e is 255 and f is not zero, the value represented is called a NaN, Not a Number, which is used for debugging or catching errors or other special purposes. There are quiet NaNs (which do not cause traps; they are typically used when you want to continue calculations to get a final result, then figure out what to do about a NaN) and signaling NaNs (which raise an invalid-operation exception when used in arithmetic and cause a trap if traps are enabled; they are typically used when you want to abort a calculation because an error has occurred).

There may be variations in how the encoding appears on different platforms, especially the ordering of bytes within the 32 bits. And some platforms do not use IEEE 754 encodings.

Double-precision encoding uses the same scheme, except e is 11 bits, the 127 (called the exponent bias) is changed to 1023, and f is 52 bits. Also the special value for the exponent is its 11-bit maximum, 2047, rather than the 8-bit maximum, 255.
