0

Unlike integers, decimal fractions cannot in general be represented exactly in binary. What, then, is the procedure for finding how many bits are sufficient to express a given fraction in binary?

Basically, I am trying to understand how many bits are needed, at minimum, to store a quantity in a given range. It is simple for integers, but appears complex for fractions. I am talking about fixed-point representation.

I write VHDL, so I decide how many bits I use.

quantum231
  • 1,207
  • wtf is fixed-point-arithmetics ??? – mercio Dec 02 '16 at 12:02
  • 2
    @mercio The opposite of floating-point arithmetic. Calm down with the question marks, please. – Arthur Dec 02 '16 at 12:05
  • "Sufficient" for what purpose? If not exact representation then what is your actual requirement? – Erick Wong Dec 02 '16 at 12:17
  • Let's see, if I have X bits, what is the closest quantity I can get to my required quantity so I can calculate the error? Is using recursion the only way to do this? Or is there a formula? – quantum231 Dec 02 '16 at 12:20

2 Answers

1

There are many potential ways to represent fractions in binary. It really comes down to how you want to do it.

When you state that decimal fractions cannot be directly represented in binary, I think you mean that numbers written with a decimal point cannot always be represented in a finite number of binary digits. However, this is true in any base. The fraction $\frac{1}{3}$ cannot be represented exactly in decimal or binary if you have a fixed number of digits after the point. However, using recurring notation it can be written in both:

$$\frac{1}{3}={0.\overline{3}}_{_{10}}={0.\overline{01}}_{_2}$$

When you talk about storing integers in a particular range, there is a finite number of values. However, for any range there is an infinite number of fractions. If you then limit the size of the denominator, you can do exactly the same as you do with a decimal fraction, i.e. you write the numerator and denominator as integers.

Update:

Ok so for a decimal fraction you mean a number with a finite number of decimal places, i.e. $$\frac{x}{10^n};x,n\in\mathbb{Z},x<10^n$$

If you want to store it exactly, you could store the value of $x$, assuming that $n$ is a fixed constant. This would require $\lceil\log_2{10^{n}}\rceil=\lceil n\log_2{10}\rceil$ bits to store the value of $x$. So for your example $0.114256$, assuming 6 decimal digits maximum, you would need 20 bits to store numerators up to $999,999$.
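As a rough sketch of that bit count in Python (the helper name `bits_for_decimal_digits` is my own, not something from the question), integer arithmetic avoids any floating-point error in the logarithm:

```python
def bits_for_decimal_digits(n: int) -> int:
    """Bits needed to store any numerator x with 0 <= x < 10**n."""
    if n <= 0:
        return 0
    # For n >= 1, 10**n is never a power of two, so the bit length of the
    # largest numerator, 10**n - 1, equals ceil(log2(10**n)).
    return (10**n - 1).bit_length()


print(bits_for_decimal_digits(6))  # 20
```

This only counts the bits for the numerator; it assumes the number of decimal places $n$ is fixed and known, so it does not need to be stored.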

If you use fewer bits than this, you will not be able to represent every value. For example, with 6 decimal digits you have a million possible distinct values, but with only 6 binary digits you can store only 64 distinct values. The simplest approach with fewer bits is to store just the first 6 binary digits of the value. You can convert a decimal fraction to a binary fraction as follows:

  • start with your value
  • multiply it by two and record the integer part
  • discard the integer part and repeat the above step with the fractional part
  • repeat as many times as desired, then read off all the integer parts you recorded, in order.
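The steps above can be sketched in Python roughly as follows. The function name `fraction_bits` is hypothetical, and I am assuming the value is passed as a string so that `fractions.Fraction` can hold it exactly instead of as an already-rounded float:

```python
from fractions import Fraction


def fraction_bits(value: str, count: int) -> str:
    """Return the first `count` binary digits after the point for 0 <= value < 1."""
    frac = Fraction(value)  # exact rational, e.g. Fraction("0.625") == 5/8
    if not 0 <= frac < 1:
        raise ValueError("value must be in [0, 1)")
    digits = []
    for _ in range(count):
        frac *= 2            # multiply by two
        bit = int(frac)      # record the integer part (0 or 1)
        digits.append(str(bit))
        frac -= bit          # discard the integer part, keep the fraction
    return "".join(digits)


print(fraction_bits("0.625", 3))  # 101
```

The digits come out truncated, not rounded, so the result is always at or below the true value.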

E.g., to convert 0.114256 to binary:

$0.114256\times2=0+0.228512$

$0.228512\times2=0+0.457024$

$0.457024\times2=0+0.914048$

$0.914048\times2=1+0.828096$

$0.828096\times2=1+0.656192$

$0.656192\times2=1+0.312384$

$0.312384\times2=0+0.624768$

$0.624768\times2=1+0.249536$

$0.249536\times2=0+0.499072$

$0.499072\times2=0+0.998144$

(Note: as this is very close to $1$, the next several digits will be $1$.)

$0.998144\times2=1+0.996288$

$0.996288\times2=1+0.992576$

$0.992576\times2=1+0.985152$

Reading off the recorded integer parts in order, we can conclude that $0.114256_{_{10}}=0.0001110100111\ldots_{_2}$

This can then be rounded to the desired number of digits. If the next digit is a 0 then round down; if the next digit is a 1 then round up.

$0.114256_{_{10}}\approx0.000111_{_2}$ to six bits (rounded down)

$0.114256_{_{10}}\approx0.00011101_{_2}$ to eight bits (rounded down)

$0.114256_{_{10}}\approx0.0001110101000_{_2}$ to thirteen bits (rounded up)
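For the follow-up in the comments (given X bits, what is the closest value and what is the error?), no recursion is needed: scale by $2^X$, round to the nearest integer, and compare. A minimal sketch in Python, assuming a non-negative value given as a string and a hypothetical helper name `quantize`:

```python
from fractions import Fraction


def quantize(value: str, bits: int) -> tuple:
    """Round a non-negative value to the nearest multiple of 2**-bits.

    Returns (code, error): code / 2**bits is the stored value and
    error is the stored value minus the exact value.
    """
    exact = Fraction(value)
    scale = 2 ** bits
    # Adding one half and truncating rounds to nearest (ties go up).
    code = int(exact * scale + Fraction(1, 2))
    error = Fraction(code, scale) - exact
    return code, error


for bits in (6, 8, 13):
    code, error = quantize("0.114256", bits)
    # Show the fractional bit pattern and the signed error as a float.
    print(bits, format(code, f"0{bits}b"), float(error))
```

With round-to-nearest the error magnitude is at most $2^{-(X+1)}$, so you can pick X from the error you are willing to tolerate. Note that rounding up a value very close to 1 can carry into the integer part, which this sketch does not handle specially.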

Ian Miller
  • 12,140
0

Fractions can be stored in single or double precision format, so the bits used for their representation can vary:

Single (float) precision: needs 32 bits -> 1 bit for the sign, 8 for the exponent and 23 for the fraction part.

Double: needs 64 bits -> 1 bit for the sign, 11 for the exponent and 52 for the fraction part.

kub0x
  • 2,143
  • I don't think this question asks about the conventional floating point standards. It doesn't ask "How many bits do you usually use", but rather "How many bits is sufficient, and how would you determine that?" – Arthur Dec 02 '16 at 12:09