9

In theory, there are infinite inputs, that you can hash with SHA-256. So theoretically it would be possible that one hash string would read 0xaaaaaaaa...

But would that also be possible practically, or do the algorithms check that this is not happening?

6 Answers

21

First of all, the output of SHA-256 is binary and consists of 32 bytes (256 denotes the output size in bits). What you are talking about is apparently the hexadecimal encoding of these bytes.
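To make the bytes-versus-hex distinction concrete, here is a minimal Python sketch using the standard `hashlib` module. The input `b"hello"` and the helper name `is_single_hex_digit` are illustrative choices only, not something from the question.

```python
import hashlib

def is_single_hex_digit(hex_digest: str) -> bool:
    """Return True if the hex string repeats one character throughout."""
    return len(set(hex_digest)) == 1

raw = hashlib.sha256(b"hello").digest()          # the actual output: 32 raw bytes
hex_form = hashlib.sha256(b"hello").hexdigest()  # the same bytes as 64 hex characters

print(len(raw), len(hex_form))
print(is_single_hex_digit(hex_form))   # check the real digest
print(is_single_hex_digit("a" * 64))   # the pattern the question asks about
```

The value `0xaaaa...` from the question is simply the byte `0xaa` repeated 32 times once the hex encoding is undone.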

The property relevant to your question is called (1st) pre-image resistance (Wikipedia):

Given a hash value $h$, it should be difficult to find any message $m$ such that $h = \text{H}(m)$.

("Difficult" is a non-technical term here; generally we use "computationally infeasible". Messages that map to a given hash value almost certainly exist; the difficulty is finding them for a one-way hash.)

No, the algorithms do not check this explicitly, because the algorithm by itself needs to be resistant against it. Furthermore, the repetition of certain bits is not that special all by itself. It would be unclear what you would need to test for.

"But would that also be possible practically": well, no, unless SHA-2 gets broken. Generally it is collision resistance that gets broken first. Breaking it means finding two distinct messages $m$ and $m'$ such that $\text{H}(m) = \text{H}(m')$. This is easier to attack because an attacker can try to find weaknesses in the algorithm that create an internal collision while controlling both $m$ and $m'$. SHA-256 is still considered secure in this regard.
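The gap between the two attacks can be felt on a toy scale. The sketch below is only an illustration under stated assumptions: `tiny_hash` is a hypothetical, deliberately insecure hash made by truncating SHA-256 to 2 bytes, and the candidate messages are just decimal counters. Brute-forcing a pre-image of `0xaaaa` typically takes on the order of $2^{16}$ tries, while a collision usually shows up after a few hundred.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes, nbytes: int = 2) -> bytes:
    """Toy hash (hypothetical, insecure): SHA-256 truncated to nbytes."""
    return hashlib.sha256(data).digest()[:nbytes]

def find_preimage(target: bytes):
    """Brute-force a message whose truncated hash equals target."""
    for i in count():
        msg = str(i).encode()
        if tiny_hash(msg, len(target)) == target:
            return msg, i + 1  # message and number of tries

def find_collision(nbytes: int = 2):
    """Find two different messages with the same truncated hash."""
    seen = {}
    for i in count():
        msg = str(i).encode()
        h = tiny_hash(msg, nbytes)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg

msg, tries = find_preimage(b"\xaa\xaa")
print("pre-image:", msg, "after", tries, "tries")
a, b, tries = find_collision()
print("collision:", a, b, "after", tries, "tries")
```

Each extra output byte multiplies the pre-image work by 256, which is why the same loop is hopeless against the full 32-byte output.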

tum_
Maarten Bodewes
15
  1. Yes, it's possible.
  2. Given the size of the input space (not actually infinite, but still very, very large), it's also likely, for any given 256-bit value, that several inputs that hash to that value exist.
  3. No, there's nothing special in the construction of the algorithm that prevents it (restricting the output space would probably be bad for security).
  4. Nonetheless, as long as SHA-256 isn't broken, there is no practical way to find an input that hashes to a given value.
hobbs
6

But would that also be possible practically, or do the algorithms check that this is not happening?

Finding an input whose SHA-256 digest is 64 hex $a$'s is practically beyond anybody. It would take either pure luck or a break of SHA-256's pre-image resistance, and no such break is known.

Is it possible that a SHA256 hash has the same character 64 times?

Yes and no. We don't know whether such an input exists, since we cannot try all possible inputs.

Let's see what to expect when SHA-256 is restricted to 256-bit inputs.

Model SHA-256 as a uniform random map $F:\{0,1\}^{256} \to \{0,1\}^{256}$, i.e. limit the input.

Such a map almost certainly has fewer than $N=2^{256}$ distinct outputs, since among the $N^N$ possible functions only $N!$ are permutations, and $$N!/N^N\to0.$$

Now write $k = 256$ for the bit length. For a given input $x$, each output $y$ has a $1/2^k$ chance of appearing, so $\Pr[F(x) = y] = 1/2^k$. Since the values $F(x)$ for different $x$ are independent random variables, we have

\begin{align} \Pr&[\exists x. F(x) = y] = 1 - \Pr[\forall x. F(x) \neq y] \\ &= 1 - \Pr[F(0) \neq y]\,\Pr[F(1) \neq y]\cdots\Pr[F(2^k - 1) \neq y] \\ &= 1 - (1 - 1/2^k)^{2^k}. \end{align}

This is also the expected fraction of distinct outputs, by linearity of expectation. As $k \to \infty$, this converges to $1-e^{-1} \approx 0.632$. Therefore about $e^{-1} \approx 37\%$ of the output values, nearly 4 out of 10, are not expected to occur if we limit the inputs this way.
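The limit can be checked numerically on small domains. This is a sketch under the same uniform random map assumption, not a statement about SHA-256 itself; the function names and the fixed seed are illustrative choices.

```python
import math
import random

def expected_coverage(k: int) -> float:
    """Closed form 1 - (1 - 1/2^k)^(2^k) for a random map on k-bit strings."""
    n = 2 ** k
    return 1 - (1 - 1 / n) ** n

def simulated_coverage(k: int, seed: int = 0) -> float:
    """Sample one random map on k-bit strings; return the fraction of outputs hit."""
    rng = random.Random(seed)
    n = 2 ** k
    outputs = {rng.randrange(n) for _ in range(n)}
    return len(outputs) / n

for k in (4, 8, 16):
    print(k, round(expected_coverage(k), 4), round(simulated_coverage(k), 4))
print("limit 1 - 1/e:", round(1 - math.exp(-1), 4))
```

Already at $k = 16$ the closed form agrees with $1 - e^{-1}$ to several decimal places, so the 256-bit case is indistinguishable from the limit.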

When the input size is increased beyond 256 bits, the expected fraction of distinct outputs approaches 1 under the uniform random model. Even for inputs of 512 bits or more, this doesn't mean that all outputs actually occur. We don't know, and we have no way to check. We don't even know whether SHA-256 attains all of the 64-bit integers.

In theory there are infinite inputs, that you can hash with SHA256

No, not infinitely many inputs. Due to the length padding this is not possible.

The standard FIPS.180-4 defines a padding scheme that limits the upper input size.

Then append the 64-bit block that is equal to the number $l$ expressed using a binary representation.

Here $l$ is the message length in bits. Therefore, according to the standard, you can only hash messages shorter than $2^{64}$ bits. This gives SHA-256 a total of $2^{2^{64}} - 1$ different possible input messages.
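For illustration, here is a minimal sketch of that padding rule in Python, assuming the message is a whole number of bytes (the standard itself is defined on bit strings). The function name `sha256_pad` is hypothetical; this only builds the padded message and does not compute the hash.

```python
def sha256_pad(message: bytes) -> bytes:
    """Pad a byte string the way SHA-256 does before processing 64-byte blocks."""
    bit_len = len(message) * 8
    if bit_len >= 2 ** 64:
        # The length must fit into the 64-bit field, hence the input size limit.
        raise ValueError("message too long for SHA-256")
    padded = message + b"\x80"                           # a single 1 bit, then zeros
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)   # fill up to 56 mod 64 bytes
    padded += bit_len.to_bytes(8, "big")                 # 64-bit big-endian length l
    return padded

padded = sha256_pad(b"abc")
print(len(padded), padded[-8:].hex())
```

The fixed 8-byte length field at the end is what caps the message size.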

This upper limit is actually due to the Merkle-Damgård (MD) design of the SHA series. The length padding guards against the MOV attack (Handbook of Applied Cryptography, Chapter 9, Example 9.23).

kelalaka
1

But would that be possible practically

Is it possible that the SHA-256 of some input would be a repetition of the same hex digit?

Yes, most definitely. In theory, any of the values in the output space is possible (though I don't think there's any proof that ALL values are actually possible).

Is it possible to find an input, which, once hashed with SHA-256, yields a repetition of the same hex digit?

In theory, if you had infinite time and resources, nothing prevents it. Practically, no. There are 16 values which are the repetition of a given hex digit, so it's "only" 16 times less difficult than finding the input for a single specific output.

Let's consider for a minute that SHA-256 would be bijective between the set of 256-bit numbers and itself (i.e. that the input space is the same size as the output space, and that each value in the input space yields a different value in the output space, and of course vice-versa).

The current hash rate of the whole bitcoin network (we're talking a LOT of resources, many strongly optimised just for this task) is a bit above 170,000 PH/s on the good days. That's $1.7 \times 10^{20}$ hashes per second. $2^{256}$ is $1.15 \times 10^{77}$ values. So on average it would take over $2 \times 10^{55}$ seconds to find one of the 16 possible values. That's over $6 \times 10^{47}$ years. For reference, the universe is thought to be $1.5 \times 10^{10}$ years old, so that's a very, very, very long time.
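The order of magnitude can be reproduced with a few lines of Python. This is a back-of-the-envelope sketch under assumptions: the hash rate is the figure quoted above, the search is assumed to cover half the space on average, the 16 target values divide the work by 16, and a year is taken as 365.25 days.

```python
HASH_RATE = 1.7e20                  # hashes per second (figure quoted above)
TARGETS = 16                        # one all-same-digit value per hex digit
SECONDS_PER_YEAR = 365.25 * 24 * 3600
UNIVERSE_AGE_YEARS = 1.5e10         # rough age used in the text

space = 2 ** 256
# Assumption: on average half the space is searched, split across 16 targets.
expected_hashes = space / 2 / TARGETS
seconds = expected_hashes / HASH_RATE
years = seconds / SECONDS_PER_YEAR

print(f"{seconds:.2e} seconds")
print(f"{years:.2e} years")
print(f"{years / UNIVERSE_AGE_YEARS:.2e} times the age of the universe")
```

Changing any of the assumptions by a factor of two or ten does not alter the conclusion.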

But we don't even know if SHA-256 is bijective on that space. For all we know, maybe all the inputs that yield those outputs are bit strings 1000 bits long, not 256. That could make things a lot harder.

But nothing tells us that you won't find a file tomorrow with a hash which is the repetition of a hex number, just by random chance. The likelihood of it happening is just sooooooooo tiny anyone would say it's impossible.

jcaron
1

In theory, there are infinite inputs, that you can hash with SHA-256. So theoretically it would be possible that one hash string would read 0xaaaaaaaa...

But would that also be possible practically, or do the algorithms check that this is not happening?

The algorithm has $2^{256}$ possible outputs.

Let us assume that you don't want the algorithm to output a hash 0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ever. Ok, you could implement a check that it will never be outputted. You probably would need to do the same for the 15 other hex digits. So after these modifications, the algorithm would have $2^{256}-16$ possible outputs.

But wait! The algorithm could output 0xfaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa. There are 64 possible positions for the differing digit, 16 possible non-differing digits and 15 possible differing digits. So 15360 such hashes.

So perhaps we need an algorithm that can produce only $2^{256}-16-15360$ possible outputs?

Oops, there could be 2 differing digits. Or 3. Or 4. Or ...
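The counting generalises easily. As a sketch, assuming fewer than 32 differing digits so that the repeated digit is unambiguous, the number of 64-digit hex strings with exactly `d` digits differing from the repeated one is $\binom{64}{d} \cdot 16 \cdot 15^d$. The function name below is an illustrative choice.

```python
from math import comb

HEX_LEN = 64  # hex digits in a SHA-256 digest

def near_repdigit_count(d: int) -> int:
    """Count 64-digit hex strings with exactly d digits differing from the repeated digit."""
    if not 0 <= d < HEX_LEN // 2:
        raise ValueError("d must be below 32 so the repeated digit is unambiguous")
    # choose positions, choose the repeated digit, choose each differing digit
    return comb(HEX_LEN, d) * 16 * 15 ** d

excluded = 0
for d in range(5):
    n = near_repdigit_count(d)
    excluded += n
    print(d, n, "running total:", excluded)
```

The totals grow quickly with `d`, which is the point: there is no natural place to stop excluding "suspicious" outputs.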

Every step would reduce the possible amount of outputs that the hash function can create, making the hash function worse. It would be no longer a hash function that's a true 256-bit hash.

Why would you want to do this, to make the hash function worse?

The role of a hash function is to be unpredictable: to make it impossible to find an input that creates the hash 0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa. To make this problem of finding the input harder, the hash needs as many bits as it can have. By reducing the effective bit amount, the hash function becomes worse: it's easier to find an input creating the desired output.

Someday, I'm sure SHA256 will be broken. Thus, someday we will know a number of inputs that all hash to the same output: 0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa. That day has not arrived yet. It is possible that quantum computers could accelerate the arrival of the day SHA256 will be broken. It is also possible some mathematical weakness is found in SHA256, allowing finding an input producing the desired output.

Before that day arrives, you can safely use SHA256 knowing that, with a probability very close to 1, nobody will find an input producing an output consisting of only one hex digit.

juhist
-4

It is possible. However, hashes are supposed to look random, and 0xaaaaaa... doesn't seem random at all. Make sure that the algorithm, software, code, etc. is working properly, because it looks like a security bug.