
In the Bouncy Castle libraries, the GCM cipher implementation has an interesting property that does not seem to be described in the GCM papers (neither the NIST specification nor the original paper):

Some AAD was sent after the cipher started. We determine the difference b/w the hash value we actually used when the cipher started (S_atPre) and the final hash value calculated (S_at). Then we carry this difference forward by multiplying by H^c, where c is the number of (full or partial) cipher-text blocks produced, and adjust the current hash.

This scheme is present in the final calculations (in the doFinal() method).

Now what I understand is that addition within the polynomial field is equivalent to XOR. I can also see why exponentiation is required to carry the difference forward. What I don't see is how the complete scheme works, especially for partial 16-byte blocks.

Can somebody show how the adjustment in the doFinal method is defined mathematically?


Note that the $\operatorname{GHASH}$ function is performed over the following data within GCM:

$S = \operatorname{GHASH}_H (A || 0^v || C || 0^u || [len(A)]_{64} || [len(C)]_{64})$

where

  • $0^v$ and $0^u$ are padding (0 to 127 zero bits each) up to the block size

$\operatorname{GHASH}$ itself is defined as follows:

Steps:

  1. Let $X_1, X_2, ... , X_{m-1}, X_m$ denote the unique sequence of blocks such that $X = X_1 || X_2 || ... || X_{m-1} || X_m$.
  2. Let $Y_0$ be the “zero block,” $0^{128}$.
  3. For $i = 1, ..., m$, let $Y_i = (Y_{i-1} \oplus X_i) • H$.
  4. Return $Y_m$

and

  • $X•Y$ is the product of two blocks, $X$ and $Y$, regarded as elements within the binary Galois field

So in this scheme additional AAD ($A$) is sent after ciphertext ($C$) has already been included in the calculation.

neubert
Maarten Bodewes

1 Answer


Well, $\operatorname{GHASH}$ might be better understood as the polynomial:

$$\operatorname{GHASH}_H(X_1, X_2, ... , X_{m-1}, X_m) = X_1 H^{m} + X_2 H^{m-1} + ... + X_{m-1} H^2 + X_m H^1$$

where addition, multiplication and exponentiation are in the field $GF(2^{128})$. These addition, multiplication and exponentiation operations act algebraically quite a lot like the operators in a more traditional field (say, the real numbers); the usual tricks for rearranging polynomials work exactly the same.
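To make the polynomial view concrete, here is a minimal Python sketch that compares the iterative definition of $\operatorname{GHASH}$ with the polynomial form. Blocks are modelled as 128-bit integers whose most significant bit is the leftmost bit of the block, and multiplication follows the shift-and-reduce method from the NIST specification. The function names (`gf_mul`, `gf_pow`, `ghash_horner`, `ghash_poly`) are made up for this illustration and are not taken from Bouncy Castle.

```python
# Reduction constant R = 11100001 || 0^120 used by GCM.
R = 0xE1 << 120
# Multiplicative identity: the leftmost bit is the coefficient of x^0.
ONE = 1 << 127


def gf_mul(x: int, y: int) -> int:
    """Multiply two 128-bit blocks in GF(2^128) using GCM's bit ordering."""
    z = 0
    v = y
    for i in range(128):
        if (x >> (127 - i)) & 1:   # bit i of x, counted from the left
            z ^= v
        if v & 1:                  # rightmost bit set: shift and reduce
            v = (v >> 1) ^ R
        else:
            v >>= 1
    return z


def gf_pow(h: int, n: int) -> int:
    """Compute h^n by repeated multiplication (fine for small n)."""
    result = ONE
    for _ in range(n):
        result = gf_mul(result, h)
    return result


def ghash_horner(h: int, blocks: list) -> int:
    """Iterative definition: Y_i = (Y_{i-1} xor X_i) * H."""
    y = 0
    for x in blocks:
        y = gf_mul(y ^ x, h)
    return y


def ghash_poly(h: int, blocks: list) -> int:
    """Polynomial form: X_1 H^m + X_2 H^(m-1) + ... + X_m H."""
    m = len(blocks)
    total = 0
    for i, x in enumerate(blocks, start=1):
        total ^= gf_mul(x, gf_pow(h, m - i + 1))
    return total


if __name__ == "__main__":
    h = 0x66E94BD4EF8A2C3B884CFA59CA342B2E   # arbitrary example key
    blocks = [0x0123456789ABCDEF0123456789ABCDEF, 0x80, 0x1]
    print(ghash_horner(h, blocks) == ghash_poly(h, blocks))
```

The iterative form is simply Horner's rule applied to the polynomial, which is why the two functions should agree for any key and any block sequence.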

What Bouncy Castle is doing is taking the vector $X_1, X_2, ... , X_{m-1}, X_m$ in two pieces, the AAD vector $A_1, A_2, ..., A_a = A || 0^v$ and the ciphertext $C_1, C_2, ..., C_c = C || 0^u$, and evaluating $\operatorname{GHASH}_H(A_1, A_2, ..., A_a, C_1, C_2, ..., C_c, X_{final})$ (where $X_{final}=[len(A)]_{64} || [len(C)]_{64}$ encodes the length of the AAD and the ciphertext). One way to rearrange this is:

$\operatorname{GHASH}_H(A_1, A_2, ..., A_a, C_1, C_2, ..., C_c, X_{final}) =$ $H^{c+1}\operatorname{GHASH}_H(A_1, A_2, ..., A_a) + H\operatorname{GHASH}_H(C_1, C_2, ..., C_c) + HX_{final}$

With this rearrangement, we can obviously evaluate $\operatorname{GHASH}_H(C_1, C_2, ..., C_c)$ before we see any AAD data (or even before we know how long it is).
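The rearrangement can be checked numerically. The sketch below is a simplified model, not Bouncy Castle's code: it assumes the AAD and ciphertext are already padded into whole 128-bit blocks (so each length is 128 bits per block), and all function names are hypothetical. `hash_rearranged` computes the ciphertext part first, without looking at the AAD, and combines the three terms afterwards.

```python
R = 0xE1 << 120      # GCM reduction constant 11100001 || 0^120
ONE = 1 << 127       # multiplicative identity in GCM's bit ordering


def gf_mul(x: int, y: int) -> int:
    """Multiply two 128-bit blocks in GF(2^128) using GCM's bit ordering."""
    z, v = 0, y
    for i in range(128):
        if (x >> (127 - i)) & 1:
            z ^= v
        v = (v >> 1) ^ R if v & 1 else v >> 1
    return z


def gf_pow(h: int, n: int) -> int:
    """Compute h^n by repeated multiplication."""
    result = ONE
    for _ in range(n):
        result = gf_mul(result, h)
    return result


def ghash(h: int, blocks: list) -> int:
    """Iterative GHASH over a list of 128-bit blocks."""
    y = 0
    for x in blocks:
        y = gf_mul(y ^ x, h)
    return y


def length_block(num_aad_blocks: int, num_ct_blocks: int) -> int:
    """X_final = [len(A)]_64 || [len(C)]_64, lengths in bits (full blocks assumed)."""
    return ((128 * num_aad_blocks) << 64) | (128 * num_ct_blocks)


def hash_direct(h: int, aad: list, ct: list) -> int:
    """GHASH over A || C || X_final in the usual order."""
    return ghash(h, aad + ct + [length_block(len(aad), len(ct))])


def hash_rearranged(h: int, aad: list, ct: list) -> int:
    """H^(c+1) GHASH(A) + H GHASH(C) + H X_final, ciphertext part first."""
    c = len(ct)
    ct_part = gf_mul(h, ghash(h, ct))                    # needs no AAD at all
    aad_part = gf_mul(gf_pow(h, c + 1), ghash(h, aad))   # can be added later
    len_part = gf_mul(h, length_block(len(aad), c))
    return aad_part ^ ct_part ^ len_part


if __name__ == "__main__":
    h = 0x66E94BD4EF8A2C3B884CFA59CA342B2E   # arbitrary example key
    aad = [0xFEEDFACEDEADBEEF, 0xABADDAD2]
    ct = [0x42831EC2217774244B7221B784D0D49C, 0x1, 0x2]
    print(hash_direct(h, aad, ct) == hash_rearranged(h, aad, ct))
```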

Now, you ask about partial blocks. It turns out that we don't need to worry about them: GCM is defined so that both the ciphertext and AAD vectors are zero-padded to the next full block, so every $A_i$ and $C_i$ is a full block. If you are wondering about a collision when we append a zero byte to the AAD, well, that's what $X_{final}$ is there for: appending a 0 changes the AAD length, and hence the $X_{final}$ value.
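To tie this back to the quoted comment, here is a sketch of the adjustment. It assumes that $S_{atPre}$ is the GHASH state over the AAD blocks seen before the cipher started, that $S_{at}$ is the state over all AAD blocks, and that $S$ is the running state that was seeded with $S_{atPre}$ and has since absorbed the $c$ padded ciphertext blocks. These readings of the variable names are my assumptions based on the comment, not something checked against the Bouncy Castle source. Under them, just before the length block is absorbed:

$$S = S_{atPre} H^{c} + \operatorname{GHASH}_H(C_1, C_2, ..., C_c)$$

The state we actually want is the same expression with $S_{at}$ in place of $S_{atPre}$. Because addition is XOR, adding $S_{atPre} H^{c}$ a second time cancels it:

$$S_{at} H^{c} + \operatorname{GHASH}_H(C_1, C_2, ..., C_c) = S + (S_{at} + S_{atPre}) H^{c}$$

So the adjustment is to XOR $(S_{at} + S_{atPre}) H^{c}$ into $S$. Absorbing $X_{final}$ afterwards (add it, then multiply by $H$) gives $H^{c+1} S_{at} + H\operatorname{GHASH}_H(C_1, C_2, ..., C_c) + HX_{final}$, which is the rearranged expression with $S_{at} = \operatorname{GHASH}_H(A_1, A_2, ..., A_a)$. Here $c$ counts full or partial blocks because a partial final block is zero-padded and hashed as one full block.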

poncho