6

The naturals $\mathbb{N}$ can be encoded with a binary unary code such as

$$ \begin{align} 1&=0_b\\ 2&=10_b\\ 3&=110_b\\ &... \end{align} $$ The length this the encoding grows linearly with the natural number it represents. For example the length of $110_b$ is 3 and it also represents the number 3.

I am looking for an upper bound on the optimality of the prefix free encoding. For example, can we design a prefix coding of $\mathbb{N}$ such that the length of the encoding grows according to ~$\sqrt{n}$, instead of $n$. Is there an upper bound on the optimality that we can prove?

Anon21
  • 355
  • 1
  • 9

1 Answers1

12

I wrote a paper on this. The short answer is that there is no optimal encoding, nor even an optimal sequence of better and better encodings.

Kraft's inequality states that there is a prefix code with word lengths $k_0,k_1,\ldots$ if and only if $$ \sum_{n=0}^\infty 2^{-k_i} \leq 1. $$ This gives a positive answer to your question. Concretely, Elias gamma coding has code word length $O(\log n)$. The idea is simple: first you encode the bit-length of the integer using your method (using roughly $\log_2 n + 1$ bits), then you encode the number itself (using another roughly $\log_2 n$ bits), for a total of roughly $2\log_2n + 1$ bits.

Yuval Filmus
  • 280,205
  • 27
  • 317
  • 514