Suppose an owner has a file $F$ stored on a server and wants proof that the server still holds the full file. The owner does not necessarily keep a copy of the full file. I was thinking about the following simple challenge-response scheme that uses only hashes:

Preparation

  1. The owner generates a key $k$ randomly.
  2. The owner calculates and stores the commitment $c=hash(hash(k||F))$ ("$||$" stands for concatenation).

Verification

  1. The owner sends the key $k$ to the server.
  2. The server calculates $p=hash(k||F)$ and sends $p$ to the owner.
  3. The owner verifies that $hash(p) = c$.
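The preparation and verification steps above can be sketched as follows. This is only an illustration, assuming SHA-256 as $hash$, a 32-byte key, and plain byte concatenation for "$||$"; the function names `prepare`, `server_respond` and `owner_verify` are hypothetical.

```python
import hashlib
import hmac
import secrets


def H(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the generic "hash" of the question.
    return hashlib.sha256(data).digest()


def prepare(file_bytes: bytes):
    # Owner side: pick a random key k and keep only (k, c), not the file.
    k = secrets.token_bytes(32)
    c = H(H(k + file_bytes))
    return k, c


def server_respond(k: bytes, file_bytes: bytes) -> bytes:
    # Server side: p = hash(k || F), computed over the stored file.
    return H(k + file_bytes)


def owner_verify(p: bytes, c: bytes) -> bool:
    # Owner side: accept if hash(p) equals the stored commitment c.
    return hmac.compare_digest(H(p), c)
```

Note that once $k$ has been revealed the server could cache $p$, so each $(k, c)$ pair can be used for a single check only.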

I was researching challenge-response PoR and could not find this simple scheme; related work usually relies on more sophisticated cryptographic functions. The scheme of Deswarte et al. seems to be a more general case of this one. I would appreciate any reference to previous work on this scheme.

Suppose the hash function $hash$ is a cryptographic hash that is collision resistant and one-way. Is this PoR scheme secure? Are there any attacks?

Edit

The scheme was fixed by changing $p=hash(hash(F)||hash(k))$ to $p=hash(k||F)$, as pointed out by Titanlord and Paŭlo Ebermann.

Rafael

2 Answers

In your case, the owner can check only once whether the server stores the file. But I think the owner cannot even do that.

My dishonest server obtains file $F$, computes $t = H(F)$ with hash function $H$, and stores $t$ but not $F$. On challenge $k$, my server computes $t' = H(k)$ and $c' = H(t || t')$, and sends $c'$ back. The owner checks $c' = c$ and is happy, but my server never (really) stored $F$.
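A minimal sketch of this cheating server, assuming the response in the original (pre-edit) scheme was $p = H(H(F) || H(k))$, with SHA-256 as $H$ and byte concatenation for "$||$". The names `honest_response` and `CheatingServer` are made up for illustration.

```python
import hashlib


def H(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the generic hash function H.
    return hashlib.sha256(data).digest()


def honest_response(k: bytes, file_bytes: bytes) -> bytes:
    # What a server holding the whole file would send: H(H(F) || H(k)).
    return H(H(file_bytes) + H(k))


class CheatingServer:
    def __init__(self, file_bytes: bytes):
        # Keep only t = H(F); the file itself is not retained.
        self.t = H(file_bytes)

    def respond(self, k: bytes) -> bytes:
        # t' = H(k), c' = H(t || t'): identical to the honest response.
        return H(self.t + H(k))
```

The cheating server keeps 32 bytes regardless of the file size, yet its answer is bit-for-bit the same as the honest one for every challenge $k$.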

I think one fix would be to precompute $n$ checks $c_1, \dots, c_n$ with $c_i = H(F || k_i)$, where each $k_i$ is a secret random nonce.
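As a rough sketch of that fix: the owner precomputes the $n$ pairs $(k_i, c_i)$ while still holding $F$, and spends one pair per audit. SHA-256, 32-byte nonces, and the helper names `make_checks` and `run_check` are assumptions made for the example.

```python
import hashlib
import hmac
import secrets


def H(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the generic hash function H.
    return hashlib.sha256(data).digest()


def make_checks(file_bytes: bytes, n: int):
    # Owner side, done once while the file is still available:
    # n secret nonces k_i with their expected answers c_i = H(F || k_i).
    nonces = [secrets.token_bytes(32) for _ in range(n)]
    return [(k, H(file_bytes + k)) for k in nonces]


def run_check(checks, respond) -> bool:
    # Spend one unused (k_i, c_i) pair; `respond` models the server's reply.
    k, c = checks.pop()
    return hmac.compare_digest(respond(k), c)
```

Each nonce is revealed only when it is used, so the owner gets at most $n$ audits before the list runs out.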

If you think about it the other way around, where the owner has the complete file (e.g., a password) but the server does not, you are entering the field of authentication schemes. Additionally, exploring Merkle trees might be valuable for your case.

Titanlord

No, even the "fixed" scheme is not safe when instantiated with most hash functions in common use today. That's because hash functions tend to be designed sequentially: They process the input in blocks one after another, while only keeping a small internal state.

This means the server could just partially evaluate the hash function on $F$, stopping before any finalization step happens, keeping note of the final internal state $S$ (if the file length is not divisible by the hash function's block length, a few bytes $F'$ corresponding to the last partial block also need to be saved separately). It can then discard $F$ itself. The check can still be passed by performing a final update to the hash function state $S$ with the bytes $F'\|k$ and then the finalization step, resulting in the desired output $p$.

Note that this is applicable even when the hash function is not susceptible to length extension attacks, because we don't need to extend a finalized hash value, but just the internal state of the hash function, which we can always do by design as long as the hash function works sequentially.
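The following sketch illustrates the idea for a check of the form $H(F\|k)$, using SHA-256 from Python's `hashlib` as an assumed instantiation. The hash object kept by the server holds the internal state $S$ together with any buffered partial block $F'$; the class name `MidstateServer` is hypothetical.

```python
import hashlib


def honest_response(file_bytes: bytes, k: bytes) -> bytes:
    # What a server holding the whole file would compute: H(F || k).
    return hashlib.sha256(file_bytes + k).digest()


class MidstateServer:
    def __init__(self, file_bytes: bytes):
        # Absorb F once and keep only the un-finalized hash object.
        # It contains the internal state S plus the bytes of the last
        # partial block F'; the file itself is not retained.
        self._state = hashlib.sha256(file_bytes)

    def respond(self, k: bytes) -> bytes:
        # Copy the saved state, feed in the challenge, then finalize.
        h = self._state.copy()
        h.update(k)
        return h.digest()
```

Here the saved state lives only in memory; a server that wanted to persist it would need a hash implementation that exposes its internal state, which is an implementation detail and not an obstacle in principle.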

ManfP