90

Applications like WhatsApp use end-to-end encryption. WhatsApp says that only the users share a specific key and no third party can view the messages. But I do not understand how the two users agree on the shared key. It must have been transferred via WhatsApp servers. In that case, WhatsApp would know the shared key, right? Please help me understand how keys are shared in end-to-end encryption.

AV94

10 Answers

141

End-to-end encryption over a channel with an eavesdropper, like the WhatsApp server, works by using a mathemagical spell called Diffie–Hellman key agreement. What follows is not actually how WhatsApp works,* but explores some of the high-level ideas to concretely answer the question without getting lost in the full gory details of everything about the protocol, which defends against many more threats than what you asked about.

Alice and Bob agree on public parameters $p$ and $g$, say $$p = 2^{2048} - 2^{1984} - 1 + 2^{64} (\lfloor2^{1918}\pi\rfloor + 124476)$$ and $g = 2$. When Alice wants to have a conversation with Bob, she picks an integer $a$ with $0 \leq a < (p - 1)/2$ uniformly at random, computes the remainder after dividing $g^a$ by $p$, written $g^a \bmod p$, and transmits $g^a \bmod p$ over the channel. Bob does similarly: picks $b$ uniformly at random in the same range, computes $g^b \bmod p$, and transmits $g^b \bmod p$ over the channel.

By a remarkable feat of mathematics, when Alice receives $g^b \bmod p$, she can compute $(g^b \bmod p)^a \bmod p$, and it turns out to be the same as $g^{ab} \bmod p$. Bob does the same: given $g^a \bmod p$, computes $(g^a \bmod p)^b \bmod p$, and obtains the same number as $g^{ab} \bmod p$.
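The exchange described above can be sketched in a few lines of Python. This is a toy illustration only: the small safe prime `P = 2039` and the helper names `keypair` and `shared_secret` are my own assumptions for the example, not the 2048-bit parameters given earlier and not anything WhatsApp ships.

```python
import secrets

# Toy public parameters (assumption for illustration): P = 2q + 1 with q = 1019.
# Far too small for real use; the 2048-bit p above plays this role in practice.
P = 2039
G = 2

def keypair(p=P, g=G):
    """Pick a secret exponent in [0, (p - 1)/2) and return (secret, g^secret mod p)."""
    secret = secrets.randbelow((p - 1) // 2)
    return secret, pow(g, secret, p)

def shared_secret(their_public, my_secret, p=P):
    """Raise the other party's public value to my secret exponent, modulo p."""
    return pow(their_public, my_secret, p)

a, A = keypair()  # Alice keeps a, transmits A
b, B = keypair()  # Bob keeps b, transmits B

alice_view = shared_secret(B, a)  # (g^b mod p)^a mod p
bob_view = shared_secret(A, b)    # (g^a mod p)^b mod p

# Both sides arrive at g^(ab) mod p without ever transmitting a or b.
assert alice_view == bob_view == pow(G, a * b, P)
```

Only `A` and `B` cross the channel; the eavesdropper sees those two numbers and nothing else.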

Nobody has figured out the remarkable feat of mathematics that would allow someone who knows $g^a \bmod p$ and $g^b \bmod p$ but not $a$ or $b$ themselves to also compute $g^{ab} \bmod p$.

Thus can Alice and Bob share a secret, $g^{ab} \bmod p$, over a channel without revealing to an eavesdropper what the secret is. They will then hash this into a short secret key, say the 256-bit $k = \operatorname{SHA3-256}(g^{ab} \bmod p)$, and proceed to use it for symmetric-key encryption such as AES-CTR.
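Hashing the shared number into a 256-bit key might look like the sketch below. The fixed-width big-endian encoding of the integer is an assumption I am making for the example; real protocols specify the encoding exactly, and both sides must use the same one.

```python
import hashlib

def derive_key(shared: int, p: int) -> bytes:
    """Hash g^(ab) mod p into a 32-byte key with SHA3-256.

    The integer is encoded big-endian, padded to the byte length of p
    (an encoding chosen for this sketch).
    """
    width = (p.bit_length() + 7) // 8
    return hashlib.sha3_256(shared.to_bytes(width, "big")).digest()

# Example with toy values: any shared value below the toy modulus 2039.
k = derive_key(1234, 2039)
assert len(k) == 32  # 256 bits
```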

But that's not all. What if it's not just an eavesdropper, but our friend Mallory who, aside from having an annoyingly ungendered name, actively intercepts messages in transit? Mallory could pretend to be Bob to Alice, and Alice to Bob, and do a pair of Diffie–Hellman key agreements, and establish shared secrets with Alice and Bob, and then eavesdrop on all the messages they exchange. The WhatsApp server is in a position to do this.
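To make Mallory's trick concrete, here is a toy sketch of the two parallel key agreements. The parameters `P = 2039`, `G = 2` and the helper `keypair` are assumptions for illustration only.

```python
import secrets

P, G = 2039, 2  # toy parameters (assumption), far too small for real use

def keypair():
    """Return (secret exponent, public value g^secret mod p)."""
    secret = secrets.randbelow((P - 1) // 2)
    return secret, pow(G, secret, P)

a, A = keypair()    # Alice
b, B = keypair()    # Bob
m1, M1 = keypair()  # Mallory's key for the conversation with Alice
m2, M2 = keypair()  # Mallory's key for the conversation with Bob

# Mallory intercepts A and B and forwards M1 to Alice and M2 to Bob instead.
alice_secret = pow(M1, a, P)  # Alice believes she shares this with Bob
bob_secret = pow(M2, b, P)    # Bob believes he shares this with Alice

# Mallory can compute both secrets and sit in the middle, re-encrypting.
mallory_with_alice = pow(A, m1, P)
mallory_with_bob = pow(B, m2, P)

assert alice_secret == mallory_with_alice
assert bob_secret == mallory_with_bob
```

Neither Alice nor Bob can tell anything is wrong from the arithmetic alone, which is why authentication is needed on top.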

To thwart this subterfuge of per-conversation shared secrets, Alice and Bob use long-term authentication in addition. Alice generates a number $\alpha$ just like $a$, and puts $g^\alpha \bmod p$ in the telephone book, or shows it to Bob on her phone screen; Bob does likewise with $\beta$. Thus they have a long-term shared secret $\kappa = \operatorname{SHA3-256}(g^{\alpha \beta} \bmod p)$. Now when Alice wants to have a conversation with Bob, she does the per-conversation protocol, but instead of transmitting just $g^a \bmod p$, she transmits $(g^a \bmod p, \operatorname{KMAC256}_\kappa(g^a \bmod p))$.

When Bob receives a candidate message $(A, t)$, he first checks whether $t = \operatorname{KMAC256}_\kappa(A)$ before processing it. If not, he drops the message on the floor and ignores it. Only Alice and Bob can compute the function $\operatorname{KMAC256}_\kappa$, because only Alice and Bob know the secret $\kappa$.
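Bob's check can be sketched as follows. Python's standard library has no KMAC256, so this sketch swaps in HMAC-SHA-256 as a stand-in MAC; the function names `tag` and `accept` and the 256-byte encoding of $A$ are likewise assumptions for the example.

```python
import hashlib
import hmac

def tag(kappa: bytes, public_value: int) -> bytes:
    """MAC over the public value. HMAC-SHA-256 stands in for KMAC256 here."""
    message = public_value.to_bytes(256, "big")  # fixed-width encoding (assumption)
    return hmac.new(kappa, message, hashlib.sha256).digest()

def accept(kappa: bytes, candidate) -> bool:
    """Bob's check on a candidate (A, t): recompute the tag, compare in constant time."""
    A, t = candidate
    return hmac.compare_digest(t, tag(kappa, A))

# Toy long-term secret standing in for SHA3-256(g^(alpha*beta) mod p).
kappa = hashlib.sha3_256(b"toy long-term secret").digest()

genuine = (1234, tag(kappa, 1234))           # from Alice, who knows kappa
forged = (4321, tag(b"\x00" * 32, 4321))     # from Mallory, who has to guess kappa

assert accept(kappa, genuine)
assert not accept(kappa, forged)
```

Mallory also cannot reuse Alice's tag on a different value: `accept(kappa, (4321, genuine[1]))` fails too.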

Next, instead of just encrypting their actual conversation with $k = \operatorname{SHA3-256}(g^{ab} \bmod p)$ using AES-CTR, they encrypt and authenticate their conversation using AES-GCM. The WhatsApp server can flip bits in messages encrypted using AES-CTR without either Alice or Bob being any the wiser even though only Alice and Bob know the secret key, but without knowing the secret key, the WhatsApp server cannot change anything in messages encrypted and authenticated using AES-GCM without Alice and Bob noticing.
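The bit-flipping problem is easy to demonstrate with standard-library stand-ins. In this sketch a SHA-256 counter keystream plays the role of AES-CTR, and encrypt-then-MAC with HMAC-SHA-256 plays the role of AES-GCM; neither is the real cipher, and all names and keys are assumptions for illustration.

```python
import hashlib
import hmac

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy counter-mode keystream (stand-in for AES-CTR, not the real thing)."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR with the keystream; the same function decrypts."""
    return bytes(x ^ y for x, y in zip(data, keystream(key, nonce, len(data))))

def seal(key, mac_key, nonce, plaintext):
    """Encrypt, then MAC the nonce and ciphertext (stand-in for AES-GCM)."""
    ct = encrypt(key, nonce, plaintext)
    return ct, hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()

def unseal(key, mac_key, nonce, ct, t):
    """Return the plaintext, or None if the tag does not verify."""
    expected = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return encrypt(key, nonce, ct) if hmac.compare_digest(t, expected) else None

key, mac_key, nonce = b"k" * 32, b"m" * 32, b"n" * 12
message = b"pay Bob $100"

# The server flips bits at the position of '1' without knowing the key.
ct, t = seal(key, mac_key, nonce, message)
i = message.index(b"1")
tampered = ct[:i] + bytes([ct[i] ^ ord("1") ^ ord("9")]) + ct[i + 1:]

# Unauthenticated stream cipher: the change goes through silently.
assert encrypt(key, nonce, tampered) == b"pay Bob $900"
# Authenticated encryption: the tampered message is rejected.
assert unseal(key, mac_key, nonce, tampered, t) is None
assert unseal(key, mac_key, nonce, ct, t) == message
```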

What if Alice and Bob can't exchange $g^\alpha \bmod p$ and $g^\beta \bmod p$ on their phone screens because they've never actually met in person? They have to consult the telephone book, which WhatsApp publishes. When Alice consults WhatsApp for Bob's long-term public identity key $g^\beta \bmod p$, WhatsApp could spoof it and lie about it, but if Alice ever later checks Bob's phone screen she will detect the subterfuge because it won't match the key that the WhatsApp server handed out.

On the other hand, if WhatsApp does not spoof Bob's long-term public identity in the telephone book, then Alice will remember it in her personal address book. If the WhatsApp server ever tries to impersonate Bob when Alice tries to start a conversation, Alice will detect that. This model of authentication is sometimes called TOFU, or trust on first use. You will also find it in the ssh protocol.
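Trust on first use amounts to pinning whatever key you see first and complaining if it ever changes. A minimal sketch, with a hypothetical `AddressBook` class and made-up key values:

```python
class AddressBook:
    """Alice's personal address book: remembers the first identity key seen per contact."""

    def __init__(self):
        self._pinned = {}

    def check(self, contact: str, identity_key: int) -> str:
        known = self._pinned.get(contact)
        if known is None:
            # First use: nothing to compare against, so trust and remember.
            self._pinned[contact] = identity_key
            return "first-use"
        # Later uses: any change is flagged rather than silently accepted.
        return "match" if known == identity_key else "MISMATCH"

book = AddressBook()
assert book.check("bob", 1111) == "first-use"
assert book.check("bob", 1111) == "match"
assert book.check("bob", 2222) == "MISMATCH"  # impersonation, or Bob got a new phone
```

The sketch cannot tell a spoofed key from a legitimate key change; that decision is left to the user or the app's policy.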

There's a catch: Sometimes Bob's identity key changes, like when he gets a new phone. What should Alice do in that case? In Signal, the app won't send any new messages on behalf of Alice; it will notify her that something is amiss and ask her what to do. In WhatsApp, by default,§ the app will automatically retry with Bob's new identity key, in an attempt to reduce usability hurdles for users who don't understand this. That choice led to an enormous kerfuffle over mainstream media reporting of it, which somehow dwarfed a much worse vulnerability in the app leaking the text of what you were typing if it looked like a URL.

In the end, since WhatsApp (part of Facebook) provides the software and doesn't let you scrutinize it, they certainly have the power to read and impersonate your messages, which they probably think very earnestly they won't ever abuse. That's better than letting everyone on the internet route between Alice and Bob read them, like the stalker who is sitting a few tables down at the coffee shop on the same wifi network as Alice, which is why it was so harmful for the Guardian to report breathlessly on a back door in WhatsApp the way it did.

Even if your phone spoke TLS to the WhatsApp servers with a pinned certificate, it's still easier to engineer audit trails for malicious changes to the telephone book than it is to engineer audit trails for eavesdropping by the WhatsApp servers—whether by a technical attacker exploiting a software vulnerability, by a LOVEINT disgruntled employee, or by a court-ordered wiretap.

So, can third parties read your WhatsApp messages? No, not likely, and it would take an active attack by WhatsApp itself—similar to the kind that Apple refused to do for the FBI—for them to read your messages.


* WhatsApp claims to use the Signal protocol (technical documentation).

Technical details: $p$ is a safe prime, meaning it is a prime of the form $p = 2q + 1$ where $q$ is also prime. $g$ is a quadratic residue modulo $p$, and thus has order $q$, meaning there are $q$ distinct possible remainders of $g^x$ when divided by $p$ for any integer $x$. The powers of $g$ modulo $p$ form a subgroup of the multiplicative group of integers $(\mathbb Z/p\mathbb Z)^\times$. None of this is important for you to understand how to share secrets over a public channel, but I provide the keywords for you to follow if you want to study further.
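These properties can be checked by brute force on a tiny safe prime. The values $p = 23$, $q = 11$, $g = 2$ below are my own toy choice, not the parameters used above.

```python
p, q, g = 23, 11, 2  # toy safe prime: 23 = 2 * 11 + 1

# Euler's criterion: g is a quadratic residue mod p iff g^((p-1)/2) = 1 (mod p).
is_quadratic_residue = pow(g, (p - 1) // 2, p) == 1

# Collect every remainder of g^x mod p; there should be exactly q of them.
powers = {pow(g, x, p) for x in range(p - 1)}

assert is_quadratic_residue
assert len(powers) == q
```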

Alice does not actually compute, e.g., 2^13805771959684693407656077397889219317288456747119690312451189306384849479687628613222950288427889322679415500741971589068616989911210949597114445259398229588157002772876797268276100622563299377498600497546320786879884333079126581727906347769889606788799518360227168951984468071470187490408276074397578464837282521956615118563389889631151319459158126320262667606793413409480951816493115818911703426164912115254095626026747790743791560327229116656590818054138360168383331595495242709153295834514181328053967320381842712608527965926684083141420258332671624779764031721576291538707703835661166957458717002972300906725181, and then divide the result by $p$; instead she uses a modular exponentiation algorithm so that it is possible to get an answer before the universe burns itself out.
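One such algorithm is square-and-multiply, sketched below under the name `modexp` (my own name for the example). Python's built-in three-argument `pow(g, a, p)` does the same job and is what you would use in practice.

```python
def modexp(base: int, exponent: int, modulus: int) -> int:
    """Right-to-left square-and-multiply for a non-negative exponent.

    Every intermediate value is reduced mod `modulus`, so numbers never grow
    beyond roughly modulus^2, and the loop runs once per bit of the exponent.
    """
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                  # this bit of the exponent is set
            result = (result * base) % modulus
        base = (base * base) % modulus    # square for the next bit
        exponent >>= 1
    return result

assert modexp(2, 10, 1000) == 24  # 2^10 = 1024
assert modexp(2, 123456789, 2039) == pow(2, 123456789, 2039)
```

For a 2048-bit exponent that is about 2048 squarings and at most as many multiplications, rather than a number with more digits than there are atoms in the universe.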

§ Some of this is configurable: conscientious users who must use WhatsApp, like a journalist whose only contact with a source is WhatsApp, should study WhatsApp's privacy and security options.

Squeamish Ossifrage
8

There are different types of encryption.

  • Symmetric encryption

This is where one key is used for encrypting and decrypting, often called a shared secret. This is what many of us are familiar with from our time playing with codes (technically ciphers) as kids.

To use symmetric encryption (e.g. AES), we need to agree on the key beforehand, which is the catch.

  • Asymmetric encryption

One key is used for encrypting, and another corresponding key is needed for decryption. The matching keys are referred to as a keypair, the encryption key as a public key and the decryption key as a private key.

When Alice wants to communicate with Bob secretly, she retrieves Bob's public key (which can be posted on the internet) and encrypts her message with it; only Bob, with his corresponding private key, can read it. This solves the problem symmetric encryption has of needing to agree on a secret first without anyone eavesdropping.

PGP uses asymmetric encryption to encrypt a random secret key used for symmetric encryption of the message. This hybrid use also avoids the problem of securely establishing a shared secret first.

  • There are also Key Agreement Schemes

These are similar to asymmetric encryption. Instead of encryption and decryption, however, they allow two users to establish a shared secret over an insecure channel (even if someone is eavesdropping).

The most common is Diffie–Hellman–Merkle. The math behind how and, more importantly, why it works is a bit heavy, but have a look at: https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange#Cryptographic_explanation .


Now, if WhatsApp used symmetric encryption alone, you'd be quite right in thinking that the key must have been sent over their server first and that it's not secure. However, WhatsApp uses the Signal protocol, which basically uses Diffie–Hellman to establish a shared secret session key which is then used for symmetric encryption (AES if I'm not mistaken). This allows secure end-to-end communication.

Pete
3

Public-key cryptography allows precisely the sort of thing that's puzzling you: two parties can, over a public channel that eavesdroppers listen on, come up with a shared secret key. “Diffie-Hellman Key Exchange” in plain English provides a few simplified explanations of this. What this comes down to is that if you and your contact really do have each other's public keys (which don't need to be confidential), then you can communicate over a public channel without being subject to eavesdroppers or message manipulation.

But that requires you and your contact to actually have each other's public keys, a problem that WhatsApp can't completely solve on its own. What WhatsApp provides on that front is what they call security codes, a mechanism that you and your contact can use to independently verify that you do have each other's correct public key, and not an impersonator's. Using these security codes and a trusted communications channel outside of WhatsApp (critical!), you and your contact can verify that your devices have the correct key for each other.

From the WhatsApp site:

What is the "Verify security code" screen in the contact info screen?

Each of your chats has its own security code used to verify that your calls and the messages you send to that chat are end-to-end encrypted.

Note: The verification process is optional and is used only to confirm that the messages you send are end-to-end encrypted.

This code can be found in the contact info screen, both as a QR code and a 60-digit number. These codes are unique to each chat and can be compared between people in each chat to verify that the messages you send to the chat are end-to-end encrypted. Security codes are just visible versions of the special key shared between you - and don't worry, it's not the actual key itself, that's always kept secret.

To verify that a chat is end-to-end encrypted

  1. Open the chat.
  2. Tap on the name of the contact to open the contact info screen.
  3. Tap Encryption to view the QR code and 60-digit number.

If you and your contact are physically next to each other, one of you can scan the other's QR code or visually compare the 60-digit number. If you scan the QR code, and the code is indeed the same, a green checkmark will appear. Since they match, you can be sure no one is intercepting your messages or calls.

[...] If you and your contact are not physically near each other, you can send them the 60-digit number. Let your contact know that once they receive your code, they should write it down and then visually compare it to the 60-digit number that appears in the contact info screen under Encryption. For Android, iPhone and Windows Phone, you can use the Share button from the QR code/60-digit number screen to send the 60-digit number via SMS, email, etc.

WhatsApp describes this verification process as "optional," but if you don't verify your contacts' security codes (and do it outside of WhatsApp), then you are indeed trusting the server as you surmise. What they offer is, broadly, the best that they can offer: a tool that you can use to independently verify they are doing what they claim. But they can't actually force you to do that verification; that you have to do on your own. The best they can do is make it easier to use (which they do with, e.g., the QR codes).
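To give a feel for what such a code is, here is a purely hypothetical sketch that turns two public identity keys into a 60-digit number both phones can compute independently. This is not WhatsApp's actual derivation; the function name `security_code`, the use of SHA-512, and the digit grouping are all assumptions made for illustration.

```python
import hashlib

def security_code(identity_key_1: bytes, identity_key_2: bytes) -> str:
    """Hypothetical 60-digit code derived from both parties' public keys.

    Assumes fixed-length keys; they are sorted so that both phones hash
    them in the same order regardless of who is "me" and who is "them".
    """
    material = b"".join(sorted([identity_key_1, identity_key_2]))
    digest = hashlib.sha512(material).digest()  # 64 bytes; the first 60 are used
    groups = []
    for i in range(12):
        chunk = int.from_bytes(digest[5 * i:5 * i + 5], "big")
        groups.append(f"{chunk % 100000:05d}")  # five decimal digits per group
    return " ".join(groups)

alice_key, bob_key, spoofed_key = b"A" * 32, b"B" * 32, b"M" * 32

# Alice and Bob see the same code when the server handed out the right keys...
assert security_code(alice_key, bob_key) == security_code(bob_key, alice_key)
# ...and different codes if the server substituted a key for Bob's.
assert security_code(alice_key, spoofed_key) != security_code(alice_key, bob_key)
```

The code reveals nothing secret, since it is computed from public keys only, which is why it is safe to read aloud or send over another channel.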

If this is something that bugs you in the least bit, you should get into the habit of verifying your contacts' security codes when you do meet them. One of the principles of the Signal protocol (the protocol that WhatsApp uses, which it adopted from the Signal Messenger application) is that we need a minority of users to be vigilant about security codes in order to defend the bulk of users against indiscriminate mass surveillance. If enough users do their part from time to time we can be confident that WhatsApp's servers are, at worst, being used only for targeted surveillance against narrow sets of users.

Luis Casillas
2

End-to-end encryption is secure even if the channel is not secure; that is kind of the point. You can also negotiate a key over an insecure channel. WhatsApp uses the well-reviewed Signal Protocol (https://en.m.wikipedia.org/wiki/Signal_Protocol), which internally uses Diffie–Hellman key exchange. So if the clients implement the protocol correctly, someone listening in cannot get the negotiated key.

However, we still need authentication, so that we know who we are talking to. We still rely on WhatsApp for that; they hold the public keys. WhatsApp also provides the software for the clients, and we need to trust that it did a competent and honest job. And the worst part is that on many devices the messages are encrypted in transit but stored in the clear on the device, which is then backed up to Google's or Apple's servers. So even if WhatsApp can't read your messages, Google or Apple probably can.

Meir Maor
1

End-to-end encryption safety depends on two elements (both of which were described above):

  • The security of the encryption on each end device
  • and the security of the key exchange algorithm

I suspect that the end devices are the weakest.

Billal BEGUERADJ
0

End-to-end encryption (E2EE) works in such a way that WhatsApp is NOT involved. If it were, that would be point-to-point encryption (P2PE). In E2EE both parties establish a mutual agreement on algorithms and such, based on the lowest common supported/allowed options, not unlike what your browser does with a web server.

linuxdev2013
0

Not exactly the server, but the overall software system you're using has to be trusted. Sure, the data may be encrypted and you might be able to verify that, but there is no way you can prove that the key isn't being leaked or deliberately communicated to some third party.

ddyer
0

It is impossible to communicate securely if all communication goes through an untrusted intermediary, as the intermediary could just pretend towards each party that it is the other party (the man-in-the-middle attack). So assuming that the parties cannot or don't want to communicate directly, even to the extent of swapping keys, the server does need to be trusted somewhat.

On the other hand, there are various asymmetric encryption schemes (described in detail in the other answers) which are secure as long as the intermediary does not actively interfere in the communication (i.e., secure under the assumption that the server might eavesdrop on the messages but will forward them faithfully). Since interference can easily be detected if the parties do talk directly, it is risky for an attacker to attempt, and so it is not too much of a concern in practice.

Tgr
-3

What you really need is to make sure your public key and the other party's are actually transmitted correctly. The only way to be sure of this is to hand over the public key personally and physically on a device you trust, preferably analog. The technology clearly exists (as in, I could do it, technologically speaking, if I wanted to) to adaptively switch public keys for fake ones in transit over the open internet to allow for MITM attacks, if you just manage to bribe the right people.


And once you become a parent you will be very susceptible to bribes. Also, most of the world's voting population are parents, so good luck changing any of that within the framework of democracy.

-3

If it's not end-to-end encrypted, it's not really encrypted at all! Writing a ciphertext right next to a plaintext is equivalent information-wise to just writing the plaintext and deleting the ciphertext. I.e., don't decrypt anything if it's decoded right in front of you. Writing the keys next to the ciphertext is only a little better. They will just combine the two to get the plaintext. Encrypt-At-Rest is a little better in that the key isn't sitting on disk; but it usually means that the key does pass through that machine, and that machine could have written the key off somewhere (e.g., to an S3 bucket of crypto keys). End-To-End means that the machine has never seen the key, and does not perform encrypt/decrypt... that must happen on a user's machine.

If you think about it, end-to-end is the only scenario in which you obeyed the first rule of crypto... to not give the key to somebody that isn't supposed to decrypt. Encrypt-At-Rest that did have the key at some point relies on a gentleman's agreement to not simply remember the key or the ciphertext.

Rob