ZKP: Prove that >18 while hiding age

Question

I am relatively new to cryptography, but I've been programming for a while. Here's a story that sets well the problem I'm trying to solve:

Alice has a digital passport that's signed with her government's private key. Each property is signed separately, and it would still be verifiable that, for example, her first name is "Alice", without saying that her last name is "Smith".

From here, knowing that she can prove that her date of birth is 05/04/1975, can she prove that she is over 18 while neither revealing her birthdate nor asking a third party for help?

My guess is that she can't, but I'm hoping you'll surprise me!

score 11 · Accepted Answer · answered Nov 23 '21 at 23:26

AFAICT, no, she can't do that using just the information on her passport, as you describe it,* and treating the digital signature algorithm as a black box.

However, if the government has foreseen this use case and included just a little bit more information on her passport, there is a way to do it using e.g. the hash chain method described in this earlier Q&A thread (based on a 2013 paper by Angel & Walfish).

In particular, what Alice needs is:

A preimage-resistant cryptographic hash function $H$ standardized by the government. Something like SHA-256 would be fine.
A secret government-issued token $s$, computationally indistinguishable from an output of $H$. (For example, $s$ may be generated by selecting a random string $r$ with at least as many bits of entropy as the output length of $H$ and letting $s = H(r)$.)
A statement signed by the government, of the form:

Alice** was born on or before 2050-01-01 (hash = $H^a(s)$).

where $a$ is Alice's age in days on 2050-01-01 and $H^a$ denotes $H$ iterated $a$ times, i.e. $H^0(s) = s$ and $H^{n+1}(s) = H(H^n(s))$

Now, to prove to Bob that she is currently (2021-11-24) 18 years old or older, Alice needs to produce the signed statement above, together with $h = H^b(s)$, where $b$ is Alice's age in days exactly 18 years ago (i.e. on 2003-11-24).

Bob can then calculate $H^c(h)$, where $c = 16840$ is the number of days between 2003-11-24 and 2050-01-01, and compare it with the hash in Alice's signed statement. If Alice's claim is valid, they should match, since $a = b + c$, and thus $H^c(h) = H^c(H^b(s)) = H^a(s)$.

(Of course, Bob should also verify that Alice's statement really is correctly signed with the government's signing key and that it really belongs to her and not, say, her older sister.)

Note that, for computational convenience and to mitigate timing attacks, it may be desirable for Alice to precalculate and store $H^{b'}(s)$ for various values of $b'$ corresponding to, say, her age in days at the beginning of each year so far. That way, she will only need to perform at most 365 hash evaluations on the spot to calculate $H^b(s)$ for any $b$.

To reduce Bob's computational demand, it may be desirable for the government to issue Alice multiple signed statements of the form above, but with different dates and corresponding values of $a$, so that Alice can send Bob a signed statement for, say, 2004-01-01 instead of 2050-01-01. Then again, a few thousand SHA-256 evaluations really isn't much these days, either, and Bob has no need to worry about side channel attacks.

However, issuing multiple signed statements also helps preserve the zero-knowledge property in the presence of rollover. For example, let's imagine that the government originally picked 2030-01-01 as the fixed date for the signed statements, but then decided a few years ago to move it to 2050 as they started to issue passports valid up to 2030 and later. This would mean that, at a minimum, Bob could learn from the date in Alice's statement whether her passport was issued before or after the transition, which might correlate with other information including her age. By giving Alice signed statements for both 2030-01-01 and 2050-01-01 (and possibly other applicable dates), this leak can be avoided, since Alice can just choose the statement with the earlier suitable date for Bob's request (but not, obviously, earlier than the date she wishes to prove she was born on or before).

Ps. As noted in my answer to the question linked above, the same protocol can also be used to allow Alice to prove her exact age. To do this, Alice needs to know not just $s = H(r)$ but also the original random token $r$, which should be chosen by the government so that it is not a valid output of $H$.

Then, if Alice wants to prove to Bob that she was born on 1975-05-04, and no earlier, she can reveal that date together with $r$ and at least one signed statement of the form above. Bob can then calculate Alice's claimed age in days $a$ at the date indicated in the statement, calculate $H^a(s) = H^{a+1}(r)$ and compare it with the hash in the signed statement.

Of course, with this variant of the protocol, Bob also need to verify the format of the $s$ or $r$ token sent by Alice (i.e. that it is a valid output of $H$ for "born on or before" claims, and not a valid output of $H$ for "born on this date" claims).

All that said, in your scenario a separate government signature for "Alice was born on 1975-05-04" would serve this purpose just as well.

Footnotes:

*) Your description that "[e]ach property is signed separately" is actually a bit problematic if taken at face value, since having the government sign an individual property such as "first name: Alice" means that anyone who has seen Alice's signed first name can make a copy of the signature and use it to claim that their first name is also Alice. Also, for example, if Alice Andrews and Bob Barker had passports with such separately signed first and last names, they could combine them to create a fake passport for "Alice Barker" or "Bob Andrews".

A slightly better solution would be to have one of the properties on the passport be a unique ID, and include that ID in all signed properties. Thus, for example, someone who has seen a signed statement of the form "ID: #123456789; first name: Alice" could now verify that Alice, bearing a passport with ID #123456789, indeed has the first name "Alice", but they cannot use the signature to claim anyone else's first name to be Alice.

Of course, this scheme still has its own shortcomings: for example, Alice might want to be able to prove to Bob that her name is Alice without Bob being able to prove that fact to anyone else. Implementing such a deniable authentication scheme seems outside the scope of this answer, although you might want to look at e.g. this earlier one.

**) Here I am using "Alice" as a stand-in for Alice's unique government-issued ID, as described in the previous footnote above.

knaccc · Answer 2 · 2021-12-27T17:06:12.313

Using an elliptic curve: The passport could contain a unix timestamp as a Pedersen Commitment in the form $C = bG + tH$, where $b$ is a scalar blinding factor, $t$ is the passport holder's date of birth as a unix timestamp (seconds since the epoch), and $G$ and $H$ are well known base points chosen such that their discrete log with respect to one another is unknown and unknowable.

$C$ is signed by the government. The passport makes the timestamp and blinding factor known to the passport holder.

To prove the passport holder is older than 18, we first compute $s$ as the latest possible timestamp the passport holder can have been born on to currently be 18 or over.

The passport holder needs to demonstrate that $s>=t$. We can do that by proving that $C' = sH - C == (s-t)H - bG$ is a commitment to a positive number. Because of the modular arithmetic nature of EC algebra, this means we need to ensure that the commitment is to a relatively small number, and not to a huge number that would result from $s-t$ being "negative".

To achieve this, we need something called a range proof.

The range proof will prove that $C'$ can be constructed by a sequence of 43 bits (which will allow the scheme to work for the next 200 years). Each 'bit' will be either zero or a power of 2.

First, we will declare 43 Pedersen commitments $C_i$ for values of $i$ between 0 and 42 inclusive, where each commitment will have its own blinding factor $b_i$ and each commitment will be either to the value zero or $2^i$.

Now, we need to prove that $C'$ is a commitment to the same number as $\sum_{i=0}^{42} C_i$. This can be done by calculating the public key $C'' = C' - \sum_{i=0}^{42} C_i$. Since the number committed to by $C'$ will equal the sum of numbers committed to by the 43 commitments, the result will be of the form $C'' == b'G$ where $b' == -b - \sum_{i=0}^{42} b_i$. Therefore providing a signature on $C''$ using the private key $b'$ will prove that there cannot be a non-zero $H$ component of $C''$, because a signature using the base point $G$ is not possible unless the discrete log of $H$ w.r.t. $G$ is known (and $H$ has specifically been chosen such that it is unknowable).

Finally, we need to prove that each of the components $C_i$ are either commitments to zero or commitments to $2^i$. We do this by providing a ring signature for each commitment for the two public keys $(C_i - 0H, C_i - 2^iH)$. A ring signature will only be possible if one of those two possibilities produces an EC point with no $H$ component, thus proving that $C_i$ is a commitment to either zero or $2^i$.

In summary, we have proven that $s>=t$ because we've proven that $sH -C$ is a commitment to a positive number. We proved it was a positive number by proving it can be built up as a series of 43 bits, where each bit is either zero or a specified power of 2.

ZKP: Prove that >18 while hiding age

2 Answers2

Linked