
Suppose we define privacy as a game in which a machine $M$ has a coin $b$ and, on input a pair $(M_0, M_1)$, always replies with an encryption of $M_0$ if $b=0$ and of $M_1$ if $b=1$. The adversary can send as many pairs $(M_0, M_1)$ as they want. The goal is to guess $b$. If the adversary cannot do better than guessing at random (i.e., with probability $1/2$), then privacy holds (a simplified version of Benaloh's ballot privacy definition, if anyone is curious).

Our toy machine $M$ works as follows:

  • samples a random permutation $P$.
  • applies $P$ to the base mapping ["Yes" = $A$; "No" = $B$; "Maybe" = $C$].
  • replies with the new code corresponding to the queried answer.

Example: if $P = [2,3,1]$, then the new map is "Yes" = $C$, "No" = $A$, "Maybe" = $B$, and $M$ will reply with $C$ for "Yes", $A$ for "No" and $B$ for "Maybe".
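For concreteness, the machine can be sketched in a few lines of Python (the function name `machine_reply` is my own, not part of any real system):

```python
import random

CODES = ["A", "B", "C"]
ANSWERS = ["Yes", "No", "Maybe"]

def machine_reply(answer):
    """Toy machine M: sample a fresh uniform permutation of the codes,
    apply it to the base mapping Yes=A, No=B, Maybe=C, and return the
    code now assigned to `answer`."""
    perm = random.sample(CODES, k=3)      # uniform random permutation
    mapping = dict(zip(ANSWERS, perm))    # e.g. Yes=C, No=A, Maybe=B
    return mapping[answer]

print(machine_reply("Yes"))   # one of A, B or C, each with probability 1/3
```

Since a fresh permutation is sampled on every call, replies to repeated queries are independent.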

We know $M$ will use one of the $3!=6$ possible permutations, which means that in $2/3$ of cases "Yes" will not be encoded as $A$. However, I cannot construct a distinguisher for the privacy game from this fact.
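The $2/3$ figure can be confirmed by brute force over all six permutations:

```python
from itertools import permutations

CODES = ["A", "B", "C"]
perms = list(permutations(CODES))        # all 3! = 6 permutations

# Under the base mapping Yes=A, No=B, Maybe=C, "Yes" receives the first
# slot of each permutation.
not_a = sum(1 for p in perms if p[0] != "A")
print(not_a, "of", len(perms))           # 4 of 6, i.e. 2/3
```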

So, on the one hand, intuition says that the toy machine $M$ leaks information: looking at the code, we can make a reasonable guess. On the other hand, I cannot formally prove it is not private under the simplified definition.

I don't know if this is a drawback of the (simplified) definition, faulty intuition, or my poor understanding of conditional probabilities. Any ideas?

Edit: there is no leakage if we cannot distinguish between 'real usage' ($N$ different replies with (possibly) unequal, prior-known probabilities of the answers "Yes"/"No"/"Maybe") and the 'ideal case' (where $M$ outputs $N$ codes $A$, $B$ or $C$ uniformly at random, regardless of any probability distribution over the answers).
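For concreteness, the two cases can be compared directly (a quick sketch; the prior values are an arbitrary example of mine):

```python
# Example prior over answers (my arbitrary choice); the comparison below
# holds for any prior.
priors = {"Yes": 0.7, "No": 0.2, "Maybe": 0.1}

# 'Real usage': output distribution of M under this prior (P(o|i) = 1/3).
real = {o: sum(p * (1 / 3) for p in priors.values()) for o in "ABC"}
# 'Ideal case': codes emitted uniformly at random, ignoring the answers.
ideal = {o: 1 / 3 for o in "ABC"}

# Statistical (total variation) distance between the two distributions.
tvd = 0.5 * sum(abs(real[o] - ideal[o]) for o in "ABC")
print(tvd)   # ≈ 0: the two cases are identically distributed
```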

pintor

2 Answers


No information about the input of machine $M$ can leak from the output of machine $M$, in any game. That's because $M$ can be viewed as a probabilistic Turing machine (with no memory of earlier runs), and for each fixed input of $M$ the distribution of the output is the same. It's easy to check the latter by examination: there are three possible inputs, and for each there are six equally likely permutations, of which two yield output $A$, two yield output $B$, and two yield output $C$, so that each of these outputs has probability $1/3$.
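That examination can be carried out mechanically over the six permutations (a small Python sketch, not part of the original argument):

```python
from collections import Counter
from itertools import permutations

CODES = ["A", "B", "C"]
ANSWERS = ["Yes", "No", "Maybe"]

# For each fixed input, tally the output over all six equally likely
# permutations: each code appears exactly twice, i.e. with probability 1/3.
for i, answer in enumerate(ANSWERS):
    counts = Counter(p[i] for p in permutations(CODES))
    print(answer, dict(counts))
```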

Note: what matters is not that the output values have the same probability. What matters is that the probability of each output value for a given input is independent of the input.

Note: in the privacy game, the input of machine $M$ is restricted to two inputs A/Yes and B/No. Therefore, to prove that $M$ does not leak information in the privacy game, we only need to check that the distributions of the output of $M$ for each of these two inputs of $M$ are the same.

For a quantitative approach to information leakage, I think we could define it as the base-2 logarithm of the ratio of the probability of guessing the input correctly knowing the output of $M$ to the probability of guessing the input correctly without knowing the output of $M$. That could be computed from the distribution of inputs in the experiment and the conditional probabilities $P(o|i)$ of output $o$ given input $i$ for the machine under consideration. I wish I could tell exactly how, and hope it's in this.
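A sketch of that computation in Python (the prior over inputs is an arbitrary example of mine, and the optimal guessing strategy is assumed to be maximum a posteriori):

```python
from math import log2

# Example prior over inputs (my arbitrary choice) and the conditional
# probabilities P(o|i) for this particular machine: 1/3 for every pair.
priors = {"Yes": 0.5, "No": 0.3, "Maybe": 0.2}
cond = {(o, i): 1 / 3 for o in "ABC" for i in priors}

# Best guess without seeing the output: pick the a-priori most likely input.
p_blind = max(priors.values())

# Best guess after seeing the output: for each output o, pick the input
# maximising P(i|o); averaging over o gives sum over o of max_i P(o, i).
p_informed = sum(max(cond[(o, i)] * priors[i] for i in priors) for o in "ABC")

leakage_bits = log2(p_informed / p_blind)
print(leakage_bits)   # ≈ 0 bits for this machine (up to rounding)
```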

fgrieu

I’m going to assume that when you say $M$ samples a random permutation, it does so uniformly so that each permutation is equally likely. If this is not the case, then there is information leakage.

One way to measure information leakage is to compute the mutual information of the input and output. If we write $p_y$, $p_n$ and $p_m$ for the probability of input “Yes”, “No” and “Maybe” respectively, we can easily see that $$\mathbb P(\mathrm{input}=\mathrm{“Yes”},\mathrm{output}=A)=p_y\times\frac13=\mathbb P(\mathrm{input}=\mathrm{“Yes”})\mathbb P(\mathrm{output}=A)$$ and similar independence in the other eight possible combinations. It follows that $$\log\left(\frac{\mathbb P(\mathrm{input}=\mathrm{“Yes”},\mathrm{output}=A)}{\mathbb P(\mathrm{input}=\mathrm{“Yes”})\mathbb P(\mathrm{output}=A)}\right)=0$$ and likewise for the other 8 logarithms in the expression for mutual information. Hence $$I(\mathrm{input},\mathrm{output})=0.$$
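This can be checked numerically (the prior below is an arbitrary example of mine; any prior gives the same result for this machine):

```python
from math import log2

# Example prior over inputs (arbitrary); the conclusion holds for any prior.
p_in = {"Yes": 0.5, "No": 0.3, "Maybe": 0.2}

# Joint distribution: P(i, o) = P(i) * P(o|i), with P(o|i) = 1/3 throughout.
joint = {(i, o): p_in[i] * (1 / 3) for i in p_in for o in "ABC"}
p_out = {o: sum(joint[(i, o)] for i in p_in) for o in "ABC"}

# Mutual information: every log term is log(1) = 0, so the sum vanishes.
mi = sum(p * log2(p / (p_in[i] * p_out[o]))
         for (i, o), p in joint.items() if p > 0)
print(mi)   # ≈ 0 (zero up to floating-point rounding)
```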

Daniel S