Questions tagged [floating-point]

Approximate representation of real numbers as a fixed number of significant digits scaled by an exponent.

Floating point is a representation that approximates real numbers as a fixed number of significant digits (the significand) multiplied by an exponential scale factor.


252 questions
25
votes
7 answers

Why does floating-point representation use a sign bit instead of 2's complement to indicate negative numbers?

Consider a fixed-point representation, which can be regarded as a degenerate case of a floating-point number. It is entirely possible to use 2's complement for negative numbers. But why is a sign bit necessary for floating-point numbers, shouldn't the mantissa…
koo
  • 351
  • 1
  • 3
  • 5
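The sign-magnitude layout is easy to inspect directly. A minimal Python probe (standard `struct` module only) that dumps the stored bits of a 32-bit float:

```python
import struct

# Pack -1.5 as an IEEE 754 single and read the raw bit pattern back.
bits = struct.unpack('>I', struct.pack('>f', -1.5))[0]
print(f'{bits:032b}')   # sign | 8-bit exponent | 23-bit significand

pos = struct.unpack('>I', struct.pack('>f', 1.5))[0]
assert bits >> 31 == 1          # the sign is one dedicated bit...
assert bits & 0x7FFFFFFF == pos  # ...and everything else matches +1.5:
                                 # sign-magnitude, not 2's complement
```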
15
votes
4 answers

Inequality caused by float inaccuracy

At least in Java, if I write this code: float a = 1000.0F; float b = 0.00004F; float c = a + b + b; float d = b + b + a; boolean e = c == d; the value of $e$ would be $false$. I believe this is caused by the fact that floats are very limited in the…
Known Zeta
  • 327
  • 1
  • 2
  • 8
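The non-associativity described in the question is easy to reproduce outside Java; a minimal sketch using NumPy's `float32`, the same 32-bit format as Java's `float`:

```python
import numpy as np

a = np.float32(1000.0)
b = np.float32(0.00004)

c = (a + b) + b   # each tiny addend is rounded against 1000.0 separately
d = (b + b) + a   # the addends combine first, then round once against 1000.0

print(c == d)     # False: the rounding happens at different points
```

Each addition rounds its exact result to the nearest `float32`, so the order in which the roundings occur changes the answer; `+` on floats is commutative but not associative.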
14
votes
1 answer

Floating point rounding

Can an IEEE-754 floating point number < 1 (i.e. generated with a random number generator which generates a number >= 0.0 and < 1.0) ever be multiplied by some integer (in floating point form) to get a number equal to or larger than that integer due…
Cade Roux
  • 323
  • 2
  • 7
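Under round-to-nearest this can be probed empirically; a quick check (not a proof) with the largest double below 1.0, assuming Python 3.9+ for `math.nextafter`:

```python
import math

r = math.nextafter(1.0, 0.0)   # largest double strictly below 1.0
# Empirically, r * n stays below n: the exact product n - n * 2**-53 is
# more than half an ulp below n when n is not a power of two, and exactly
# representable when it is, so round-to-nearest never rounds up to n.
assert all(r * n < n for n in range(1, 100_000))
```

Any smaller `r < 1` gives an even smaller exact product, so the same argument applies; other rounding modes (round-up) would need a separate analysis.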
11
votes
1 answer

Understanding denormalized numbers in floating point representation

I am confused about how denormalized numbers work in floating point representation. I was referring to Stallings book and this article. The book initially explains floating point number format in general and then explains IEEE 754 floating point…
RajS
  • 1,737
  • 5
  • 28
  • 50
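Subnormals can be poked at directly from Python; a small illustration of gradual underflow (`sys.float_info` describes the IEEE 754 double format):

```python
import sys

tiny = sys.float_info.min   # smallest positive *normal* double, 2**-1022
sub = tiny / 2              # exponent is already minimal, so the implicit
                            # leading significand bit becomes 0: subnormal

assert sub > 0.0            # gradual underflow instead of a jump to zero
assert sub == 2.0 ** -1023
```

Without subnormals, every result below `2**-1022` would flush to zero, and `x - y == 0` would no longer imply `x == y` for tiny operands.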
10
votes
1 answer

Implementation of Naive Bayes

I am implementing a Naive Bayes algorithm for text categorization with Laplacian smoothing. The problem I am having is that the probability approaches zero because I am multiplying many small fractions. Therefore, the probability eventually yields…
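The standard fix is to add log-probabilities instead of multiplying probabilities; a minimal sketch of the idea:

```python
import math

probs = [1e-5] * 300                 # many small per-word likelihoods

direct = math.prod(probs)            # 1e-1500 underflows to 0.0
print(direct)                        # 0.0

log_score = sum(math.log(p) for p in probs)   # stays comfortably in range
print(log_score)                     # about -3453.88, i.e. 300 * ln(1e-5)
```

Since `log` is monotonic, comparing classes by log-score gives the same argmax as comparing the underflowing products would.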
10
votes
8 answers

Represent a real number without loss of precision

Current floating point (ANSI C float, double) allow to represent an approximation of a real number. Is there any way to represent real numbers without errors? Here's an idea I had, which is anything but perfect. For example, 1/3 is…
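For the rational subset of the reals this is a solved problem; Python's `fractions` module stores exact numerator/denominator pairs:

```python
from fractions import Fraction

third = Fraction(1, 3)
assert third + third + third == 1                 # no 0.333... rounding error
assert Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10)   # "0.1 + 0.2 == 0.3"
```

Irrational values such as sqrt(2) still cannot be stored this way; they need symbolic, constructive, or interval representations instead.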
10
votes
1 answer

Why does floating-point modulus exactness matter?

Most Smalltalk dialects currently implement a naive, inexact floating-point modulus (fmod/remainder). I just changed this to improve Squeak/Pharo and eventually other Smalltalk dialects' adherence to standards (IEEE 754, ISO/IEC 10967), as I already did for other…
aka.nice
  • 201
  • 1
  • 4
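Exactness matters because the "obvious" formula `x - floor(x/y)*y` commits three separate roundings, while IEEE-style fmod is defined to return the exact remainder. A small demonstration (`math.fmod` wraps the C `fmod`):

```python
import math

x, y = 1e16, 3.0                    # 10**16 is exactly representable,
                                    # and 10**16 mod 3 == 1 mathematically
exact = math.fmod(x, y)             # fmod is computed without rounding error
naive = x - math.floor(x / y) * y   # the division and multiplication each round

assert exact == 1.0
assert naive != exact               # the naive formula loses the answer
```

The exact remainder is always representable (it is no larger than `y` in magnitude), which is why the standards can demand exactness in the first place.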
9
votes
5 answers

Number of FLOPs (floating point operations) for exponentiation

What is the number of floating point operations needed to perform exponentiation (power of)? Assuming multiplication of two floats use one FLOP, the number of operations for $x^n$ will be $n-1$. However, is there a faster way to do this? How does…
Mr. Eivind
  • 201
  • 2
  • 7
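The n-1 multiplications can be cut to O(log n) with exponentiation by squaring; a standard sketch:

```python
def fast_pow(x: float, n: int) -> float:
    """Compute x**n with O(log n) multiplications (square-and-multiply)."""
    result = 1.0
    while n > 0:
        if n & 1:            # low bit of n set: fold the current square in
            result *= x
        x *= x               # square for the next binary digit
        n >>= 1
    return result

assert fast_pow(2.0, 10) == 1024.0   # a handful of multiplies instead of 9
```

Note that each multiplication rounds, so `fast_pow` can differ from repeated multiplication in the last bits; for non-integer exponents, libraries compute `x**y` as `exp(y * log(x))` with extra internal precision instead.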
7
votes
1 answer

Difference between ways to compare floating-point numbers

There seems to be many approaches to judge whether two floating-point numbers are identical. Here are some examples I've found: fabs(x - y) < n * FLT_EPSILON * fabs(x) OR fabs(x - y) < n * FLT_EPSILON * fabs(y) fabs(x - y) < n * FLT_EPSILON *…
nalzok
  • 1,111
  • 11
  • 21
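A common consolidation of those variants is a symmetric relative test with an absolute floor for values near zero; Python's `math.isclose` implements essentially this:

```python
import math

def nearly_equal(x, y, rel=1e-9, abs_tol=0.0):
    # symmetric relative tolerance, with abs_tol as a floor near zero
    return abs(x - y) <= max(rel * max(abs(x), abs(y)), abs_tol)

assert nearly_equal(0.1 + 0.2, 0.3)          # off by ~5.6e-17: passes
assert not nearly_equal(1.0, 1.0 + 1e-6)     # genuinely different
assert math.isclose(0.1 + 0.2, 0.3)          # the stdlib equivalent
```

No single tolerance suits every scale: purely relative tests reject everything near zero (where `rel * max(...)` vanishes), which is what the `abs_tol` floor is for.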
7
votes
1 answer

Confused by Floating Point Spacing

I'm currently taking a numerical analysis class in college and we're covering floating point systems. For the most part, I have a good grasp on it. However, something I can't seem to visualize, and haven't seen any totally lucid explanations about…
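The spacing is easiest to see numerically: the gap between adjacent doubles (one ulp) doubles at every power of two. Assuming Python 3.9+ for `math.ulp`:

```python
import math

print(math.ulp(1.0))        # 2**-52: gap between 1.0 and the next double
print(math.ulp(2.0))        # 2**-51: twice as wide just past 2.0
print(math.ulp(2.0**52))    # 1.0:   doubles this large are 1 apart
print(math.ulp(2.0**53))    # 2.0:   beyond 2**53 not every integer exists
```

Within one binade `[2**e, 2**(e+1))` the values are evenly spaced `2**(e-52)` apart; crossing into the next binade doubles the spacing, which is exactly the "floating" in floating point.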
6
votes
1 answer

Imaginary numbers and negative zero

I've been studying the low-level hardware implementations of floating point numbers and doing an exercise to design a custom floating point implementation. I know that being able to represent negative zero is important for some purposes, but ran…
Rory O'Hare
  • 213
  • 1
  • 5
6
votes
1 answer

Simple algorithm for IEEE-754 division on 8-bit CPU?

IEEE Std 754-2008 is the modern definition of Floating-Point Arithmetic. It requires that division (among other operations) performs as if it first produced an intermediate result correct to infinite precision (..), and then rounded that…
fgrieu
  • 519
  • 3
  • 14
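The core of a software implementation is plain restoring division on the integer significands, one quotient bit per iteration — exactly the shift/subtract loop an 8-bit CPU can run over multi-byte registers. A hypothetical sketch (not the full IEEE algorithm: sign, exponent handling, normalization, and the final rounding step are omitted):

```python
def divide_significands(a: int, b: int, bits: int = 26):
    """Restoring division: returns (q, rem) with q = floor(a * 2**bits / b).

    Requires 0 <= a < b. With 24-bit significands, bits = 26 yields the
    quotient plus guard and round bits; rem != 0 supplies the sticky bit,
    which is all the information correct rounding needs.
    """
    assert 0 <= a < b
    q, rem = 0, a
    for _ in range(bits):
        q <<= 1
        rem = (rem << 1) - b     # tentative subtract...
        if rem < 0:
            rem += b             # ...restore on underflow
        else:
            q |= 1
    return q, rem
```

Because the remainder tells you exactly whether the discarded tail is zero, below, at, or above the halfway point, the "as if to infinite precision, then rounded" requirement can be met without ever computing infinitely many bits.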
6
votes
2 answers

Program transformations for numeric stability

There's tons of research on program transformations for optimization. Is there any research on transformations that improve numeric stability? Examples of such transformations might include: Transform $\log(\exp(a)+\exp(b))$ into…
Mike Izbicki
  • 444
  • 2
  • 9
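The log-sum-exp rewrite mentioned in the excerpt is the canonical example; the stable form factors out the maximum so the remaining exponential cannot overflow:

```python
import math

def logsumexp2(a: float, b: float) -> float:
    """Stable log(exp(a) + exp(b)): factor out max(a, b) first."""
    m = max(a, b)
    return m + math.log1p(math.exp(-abs(a - b)))

# The naive form fails for large inputs (math.exp(1000) overflows);
# the transformed one does not.
assert math.isfinite(logsumexp2(1000.0, 999.0))
assert abs(logsumexp2(0.0, 0.0) - math.log(2.0)) < 1e-15
```

Using `log1p` rather than `log(1 + ...)` also preserves accuracy when the two inputs are far apart and the correction term is tiny.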
6
votes
3 answers

What does normalizing with hidden bit really mean?

I have a question related to representing numbers in base 2 with floating point. For example, if I have such a number $$0.000011 \cdot 2^3$$ then is its normalized form this? $$1.1\cdot 2^{-2}$$ Generally speaking about normalizing, normalizing…
wonderingdev
  • 165
  • 1
  • 1
  • 6
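The normalization in the question checks out numerically: moving the binary point five places right means the exponent drops by five, from 3 to -2, and the leading 1 that every normalized significand then starts with is the "hidden bit" that need not be stored. A quick check:

```python
# 0.000011(base 2) x 2**3  ==  1.1(base 2) x 2**-2
# (point moved 5 places right, exponent 3 - 5 = -2)
value      = (2**-5 + 2**-6) * 2**3    # 0.000011 in binary, times 2**3
normalized = (1 + 2**-1) * 2**-2       # 1.1 in binary, times 2**-2
assert value == normalized == 0.375
```

Since the hidden 1 is implicit, IEEE 754 stores only the `.1` fraction part and the biased exponent, buying one extra bit of precision for free.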
6
votes
2 answers

Numerical methods: why doesn't this python code return 1.0?

I typed the following into the python console: >>>from numpy import float64 >>>x=float64(1.98682855148322934369) >>>x np.float64(1.9868285514832293) >>>y=float64(1)/x >>>x*y np.float64(0.9999999999999999) My argument for why the above code should…
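The behaviour is expected: `1/x` is rounded once and `x*y` is rounded again, and the two errors need not cancel. Plain Python floats (the same IEEE doubles as `numpy.float64`) reproduce it:

```python
x = 1.9868285514832293
y = 1.0 / x          # correctly rounded, but not the exact reciprocal
print(x * y)         # 0.9999999999999999: the second rounding lands one ulp low
```

No rounding mode makes `x * (1/x) == 1.0` hold for all `x`; recovering the identity requires either higher intermediate precision or a fused multiply-add to see the exact product.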