Questions tagged [floating-point]

Approximate representation of real numbers as a fixed number of significant digits scaled by an exponent.

Floating point is a representation that approximates real numbers as a fixed number of significant digits (the significand) multiplied by an exponential scale factor.


252 questions
25
votes
7 answers

Why does floating-point representation use a sign bit instead of 2's complement to indicate negative numbers?

Consider a fixed-point representation, which can be regarded as a degenerate case of a floating-point number. It is entirely possible to use 2's complement for negative numbers. But why is a sign bit necessary for floating-point numbers, shouldn't the mantissa…
koo
  • 351
  • 1
  • 3
  • 5
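The sign-magnitude layout is easy to inspect directly. A minimal Python probe (standard `struct` module only) that dumps the stored bits of a 32-bit float:

```python
import struct

# Pack -1.5 as an IEEE 754 single and read the raw bit pattern back.
bits = struct.unpack('>I', struct.pack('>f', -1.5))[0]
print(f'{bits:032b}')   # sign | 8-bit exponent | 23-bit significand

pos = struct.unpack('>I', struct.pack('>f', 1.5))[0]
assert bits >> 31 == 1          # the sign is one dedicated bit...
assert bits & 0x7FFFFFFF == pos  # ...and everything else matches +1.5:
                                 # sign-magnitude, not 2's complement
```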
15
votes
4 answers

Inequality caused by float inaccuracy

At least in Java, if I write this code: float a = 1000.0F; float b = 0.00004F; float c = a + b + b; float d = b + b + a; boolean e = c == d; the value of $e$ would be $false$. I believe this is caused by the fact that floats are very limited in the…
Known Zeta
  • 327
  • 1
  • 2
  • 8
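The non-associativity described in the question is easy to reproduce outside Java; a minimal sketch using NumPy's `float32`, the same 32-bit format as Java's `float`:

```python
import numpy as np

a = np.float32(1000.0)
b = np.float32(0.00004)

c = (a + b) + b   # each tiny addend is rounded against 1000.0 separately
d = (b + b) + a   # the addends combine first, then round once against 1000.0

print(c == d)     # False: the rounding happens at different points
```

Each addition rounds its exact result to the nearest `float32`, so the order in which the roundings occur changes the answer; `+` on floats is commutative but not associative.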
14
votes
1 answer

Floating point rounding

Can an IEEE-754 floating point number < 1 (i.e. generated with a random number generator which generates a number >= 0.0 and < 1.0) ever be multiplied by some integer (in floating point form) to get a number equal to or larger than that integer due…
Cade Roux
  • 323
  • 2
  • 7
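Under round-to-nearest this can be probed empirically; a quick check (not a proof) with the largest double below 1.0, assuming Python 3.9+ for `math.nextafter`:

```python
import math

r = math.nextafter(1.0, 0.0)   # largest double strictly below 1.0
# Empirically, r * n stays below n: the exact product n - n * 2**-53 is
# more than half an ulp below n when n is not a power of two, and exactly
# representable when it is, so round-to-nearest never rounds up to n.
assert all(r * n < n for n in range(1, 100_000))
```

Any smaller `r < 1` gives an even smaller exact product, so the same argument applies; other rounding modes (round-up) would need a separate analysis.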
11
votes
1 answer

Understanding denormalized numbers in floating point representation

I am confused about how denormalized numbers work in floating point representation. I was referring to Stallings book and this article. The book initially explains floating point number format in general and then explains IEEE 754 floating point…
RajS
  • 1,737
  • 5
  • 28
  • 50
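Subnormals can be poked at directly from Python; a small illustration of gradual underflow (`sys.float_info` describes the IEEE 754 double format):

```python
import sys

tiny = sys.float_info.min   # smallest positive *normal* double, 2**-1022
sub = tiny / 2              # exponent is already minimal, so the implicit
                            # leading significand bit becomes 0: subnormal

assert sub > 0.0            # gradual underflow instead of a jump to zero
assert sub == 2.0 ** -1023
```

Without subnormals, every result below `2**-1022` would flush to zero, and `x - y == 0` would no longer imply `x == y` for tiny operands.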
10
votes
1 answer

Implementation of Naive Bayes

I am implementing a Naive Bayes algorithm for text categorization with Laplacian smoothing. The problem I am having is that the probability approaches zero because I am multiplying many small fractions. Therefore, the probability eventually yields…
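The standard fix is to add log-probabilities instead of multiplying probabilities; a minimal sketch of the idea:

```python
import math

probs = [1e-5] * 300                 # many small per-word likelihoods

direct = math.prod(probs)            # 1e-1500 underflows to 0.0
print(direct)                        # 0.0

log_score = sum(math.log(p) for p in probs)   # stays comfortably in range
print(log_score)                     # about -3453.88, i.e. 300 * ln(1e-5)
```

Since `log` is monotonic, comparing classes by log-score gives the same argmax as comparing the underflowing products would.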
10
votes
8 answers

Represent a real number without loss of precision

Current floating point (ANSI C float, double) allow to represent an approximation of a real number. Is there any way to represent real numbers without errors? Here's an idea I had, which is anything but perfect. For example, 1/3 is…
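For the rational subset of the reals this is a solved problem; Python's `fractions` module stores exact numerator/denominator pairs:

```python
from fractions import Fraction

third = Fraction(1, 3)
assert third + third + third == 1                 # no 0.333... rounding error
assert Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10)   # "0.1 + 0.2 == 0.3"
```

Irrational values such as sqrt(2) still cannot be stored this way; they need symbolic, constructive, or interval representations instead.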
10
votes
1 answer

Why does floating-point modulus exactness matter?

Most Smalltalk dialects currently implement a naive, inexact floating-point modulus (fmod/remainder). I just changed this to improve Squeak/Pharo and eventually other Smalltalk dialects' adherence to standards (IEEE 754, ISO/IEC 10967), as I already did for other…
aka.nice
  • 201
  • 1
  • 4
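Exactness matters because the "obvious" formula `x - floor(x/y)*y` commits three separate roundings, while IEEE-style fmod is defined to return the exact remainder. A small demonstration (`math.fmod` wraps the C `fmod`):

```python
import math

x, y = 1e16, 3.0                    # 10**16 is exactly representable,
                                    # and 10**16 mod 3 == 1 mathematically
exact = math.fmod(x, y)             # fmod is computed without rounding error
naive = x - math.floor(x / y) * y   # the division and multiplication each round

assert exact == 1.0
assert naive != exact               # the naive formula loses the answer
```

The exact remainder is always representable (it is no larger than `y` in magnitude), which is why the standards can demand exactness in the first place.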
9
votes
5 answers

Number of FLOPs (floating point operations) for exponentiation

What is the number of floating point operations needed to perform exponentiation (power of)? Assuming multiplication of two floats use one FLOP, the number of operations for $x^n$ will be $n-1$. However, is there a faster way to do this? How does…
Mr. Eivind
  • 201
  • 2
  • 7
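The n-1 multiplications can be cut to O(log n) with exponentiation by squaring; a standard sketch:

```python
def fast_pow(x: float, n: int) -> float:
    """Compute x**n with O(log n) multiplications (square-and-multiply)."""
    result = 1.0
    while n > 0:
        if n & 1:            # low bit of n set: fold the current square in
            result *= x
        x *= x               # square for the next binary digit
        n >>= 1
    return result

assert fast_pow(2.0, 10) == 1024.0   # a handful of multiplies instead of 9
```

Note that each multiplication rounds, so `fast_pow` can differ from repeated multiplication in the last bits; for non-integer exponents, libraries compute `x**y` as `exp(y * log(x))` with extra internal precision instead.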
7
votes
1 answer

Difference between ways to compare floating-point numbers

There seems to be many approaches to judge whether two floating-point numbers are identical. Here are some examples I've found: fabs(x - y) < n * FLT_EPSILON * fabs(x) OR fabs(x - y) < n * FLT_EPSILON * fabs(y) fabs(x - y) < n * FLT_EPSILON *…
nalzok
  • 1,111
  • 11
  • 21
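A common consolidation of those variants is a symmetric relative test with an absolute floor for values near zero; Python's `math.isclose` implements essentially this:

```python
import math

def nearly_equal(x, y, rel=1e-9, abs_tol=0.0):
    # symmetric relative tolerance, with abs_tol as a floor near zero
    return abs(x - y) <= max(rel * max(abs(x), abs(y)), abs_tol)

assert nearly_equal(0.1 + 0.2, 0.3)          # off by ~5.6e-17: passes
assert not nearly_equal(1.0, 1.0 + 1e-6)     # genuinely different
assert math.isclose(0.1 + 0.2, 0.3)          # the stdlib equivalent
```

No single tolerance suits every scale: purely relative tests reject everything near zero (where `rel * max(...)` vanishes), which is what the `abs_tol` floor is for.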
7
votes
1 answer

Confused by Floating Point Spacing

I'm currently taking a numerical analysis class in college and we're covering floating point systems. For the most part, I have a good grasp on it. However, something I can't seem to visualize, and haven't seen any totally lucid explanations about…
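The spacing is easiest to see numerically: the gap between adjacent doubles (one ulp) doubles at every power of two. Assuming Python 3.9+ for `math.ulp`:

```python
import math

print(math.ulp(1.0))        # 2**-52: gap between 1.0 and the next double
print(math.ulp(2.0))        # 2**-51: twice as wide just past 2.0
print(math.ulp(2.0**52))    # 1.0:   doubles this large are 1 apart
print(math.ulp(2.0**53))    # 2.0:   beyond 2**53 not every integer exists
```

Within one binade `[2**e, 2**(e+1))` the values are evenly spaced `2**(e-52)` apart; crossing into the next binade doubles the spacing, which is exactly the "floating" in floating point.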
6
votes
1 answer

Imaginary numbers and negative zero

I've been studying the low-level hardware implementations of floating point numbers and doing an exercise to design a custom floating point implementation. I know that being able to represent negative zero is important for some purposes, but ran…
Rory O'Hare
  • 213
  • 1
  • 5
6
votes
1 answer

Simple algorithm for IEEE-754 division on 8-bit CPU?

IEEE Std 754-2008 is the modern definition of Floating-Point Arithmetic. It requires that division (among other operations) performs as if it first produced an intermediate result correct to infinite precision (..), and then rounded that…
fgrieu
  • 519
  • 3
  • 14
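The core of a software implementation is plain restoring division on the integer significands, one quotient bit per iteration — exactly the shift/subtract loop an 8-bit CPU can run over multi-byte registers. A hypothetical sketch (not the full IEEE algorithm: sign, exponent handling, normalization, and the final rounding step are omitted):

```python
def divide_significands(a: int, b: int, bits: int = 26):
    """Restoring division: returns (q, rem) with q = floor(a * 2**bits / b).

    Requires 0 <= a < b. With 24-bit significands, bits = 26 yields the
    quotient plus guard and round bits; rem != 0 supplies the sticky bit,
    which is all the information correct rounding needs.
    """
    assert 0 <= a < b
    q, rem = 0, a
    for _ in range(bits):
        q <<= 1
        rem = (rem << 1) - b     # tentative subtract...
        if rem < 0:
            rem += b             # ...restore on underflow
        else:
            q |= 1
    return q, rem
```

Because the remainder tells you exactly whether the discarded tail is zero, below, at, or above the halfway point, the "as if to infinite precision, then rounded" requirement can be met without ever computing infinitely many bits.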
6
votes
2 answers

Program transformations for numeric stability

There's tons of research on program transformations for optimization. Is there any research on transformations that improve numeric stability? Examples of such transformations might include: Transform $\log(\exp(a)+\exp(b))$ into…
Mike Izbicki
  • 444
  • 2
  • 9
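The log-sum-exp rewrite mentioned in the excerpt is the canonical example; the stable form factors out the maximum so the remaining exponential cannot overflow:

```python
import math

def logsumexp2(a: float, b: float) -> float:
    """Stable log(exp(a) + exp(b)): factor out max(a, b) first."""
    m = max(a, b)
    return m + math.log1p(math.exp(-abs(a - b)))

# The naive form fails for large inputs (math.exp(1000) overflows);
# the transformed one does not.
assert math.isfinite(logsumexp2(1000.0, 999.0))
assert abs(logsumexp2(0.0, 0.0) - math.log(2.0)) < 1e-15
```

Using `log1p` rather than `log(1 + ...)` also preserves accuracy when the two inputs are far apart and the correction term is tiny.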
6
votes
3 answers

What does normalizing with hidden bit really mean?

I have a question related to representing numbers in base 2 with floating point. For example, if I have such a number $$0.000011 \cdot 2^3$$ then is its normalized form this? $$1.1\cdot 2^{-2}$$ Generally speaking about normalizing, normalizing…
wonderingdev
  • 165
  • 1
  • 1
  • 6
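The normalization in the question checks out numerically: moving the binary point five places right means the exponent drops by five, from 3 to -2, and the leading 1 that every normalized significand then starts with is the "hidden bit" that need not be stored. A quick check:

```python
# 0.000011(base 2) x 2**3  ==  1.1(base 2) x 2**-2
# (point moved 5 places right, exponent 3 - 5 = -2)
value      = (2**-5 + 2**-6) * 2**3    # 0.000011 in binary, times 2**3
normalized = (1 + 2**-1) * 2**-2       # 1.1 in binary, times 2**-2
assert value == normalized == 0.375
```

Since the hidden 1 is implicit, IEEE 754 stores only the `.1` fraction part and the biased exponent, buying one extra bit of precision for free.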
6
votes
2 answers

Numerical methods: why doesn't this python code return 1.0?

I typed the following into the python console: >>>from numpy import float64 >>>x=float64(1.98682855148322934369) >>>x np.float64(1.9868285514832293) >>>y=float64(1)/x >>>x*y np.float64(0.9999999999999999) My argument for why the above code should…
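The behaviour is expected: `1/x` is rounded once and `x*y` is rounded again, and the two errors need not cancel. Plain Python floats (the same IEEE doubles as `numpy.float64`) reproduce it:

```python
x = 1.9868285514832293
y = 1.0 / x          # correctly rounded, but not the exact reciprocal
print(x * y)         # 0.9999999999999999: the second rounding lands one ulp low
```

No rounding mode makes `x * (1/x) == 1.0` hold for all `x`; recovering the identity requires either higher intermediate precision or a fused multiply-add to see the exact product.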