
Suppose I have an infinite sequence of biased bits where the probability of $1$ is $2/3$ and the probability of $0$ is $1/3.$ If I view these as the digits in the binary expansion of a real number, then this sequence defines a real number in the interval $[0,1]$. So what kind of distribution does this real number have?

One observation so far is that the probability of landing between $0.5$ and $1$ should be twice the probability of landing between $0$ and $0.5.$ Similarly, the probability between $0.25$ and $0.5$ should be twice the probability between $0$ and $0.25.$ A general way of writing this recursive relationship is

$$F(2x) - F(x) = 2F(x).$$

Adding boundary conditions I get the three equations

$$F(0)=0\\ F(1)=1\\ F(2x)=3F(x)$$

which, if viewed as a recurrence relation, has the solution $F(x) = x^{\log_2(3)}$. My question is: Is this really airtight? Setting up these equations and using the solution from a recurrence relation felt a little hand-wavy. I can easily verify that $x^{\log_2(3)}$ satisfies the above conditions for real numbers in the interval $[0,1]$, but is this solution unique?

  • It looks like $F(x)$ is supposed to be the cumulative distribution function. Your argument for $F(2x)=3F(x)$ only applies when $x$ is of the form $2^{-n}$. – Ross Millikan Nov 15 '17 at 23:41
  • You're right. Is there a way to fix this? Or is the solution perhaps even wrong? – Sebastian Oberhoff Nov 15 '17 at 23:57
  • I'm not sure the distribution converges in any reasonable way. – Ross Millikan Nov 16 '17 at 00:43
  • No, the identity $F(2x)-F(x)=2F(x)$ is not valid for every $x$, not even for every $x$ in any small interval hence the rest of your reasoning does not hold. Of course the random variable $$Y_p=\sum_{n=1}^\infty 2^{-n}X_n^p$$ where $(X_n^p)$ is i.i.d. with $P(X_n^p=1)=p$ and $P(X_n^p=0)=1-p$ for every $n$, exists for every $p$. The distribution of $Y_p$ is the Lebesgue measure when $p=\frac12$ and is purely singular for every $p\ne\frac12$ (you are interested in the case $p=\frac23$). This means that there exists some Borel set $B_p$ with Lebesgue measure $0$ such that $$P(Y_p\in B_p)=1$$ – Did Nov 16 '17 at 01:15
  • ...and that $$P(Y_p=y)=0$$ for every $y$. Additionally, $$P(Y_p\in I)\ne0$$ for every interval $I\subseteq[0,1]$ with positive Lebesgue measure. A funny feature is that one can choose the family $(B_p)$ in a way such that, for every $p\ne q$ in $(0,1)$, $$B_p\cap B_q=\varnothing$$ and yet, for every $p$, $$P(Y_p\in B_p)=1$$ – Did Nov 16 '17 at 01:15
  • Can you provide a reference for these results? – Sebastian Oberhoff Nov 16 '17 at 06:16
  • I think $F(2x)=3F(x)$ may be correct but that this, together with $F(0)=0$ and $F(1)=1$, does not necessarily imply $F(x) = x^{\log_2(3)}$ – Henry Nov 16 '17 at 09:26
  • Indeed http://eqworld.ipmnet.ru/en/solutions/fe/fe1111.pdf suggests that the functional equation has a general solution involving an arbitrary periodic function of $\log(x)$ – Henry Nov 16 '17 at 11:49
  • Related: https://math.stackexchange.com/questions/1885633/distribution-of-a-random-real-with-i-i-d-bernoullip-binary-digits – Henry May 18 '18 at 07:42
  • And also https://mathoverflow.net/questions/250284/measure-induced-on-0-1-by-infinite-tosses-of-biased-coin – Henry May 18 '18 at 12:32

2 Answers


Your recursion gives a condition the cumulative distribution function satisfies (in a sense, you have fractal copies of the function in itself), but there are several functions which satisfy this.
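To make the non-uniqueness concrete, here is one illustrative family, in the spirit of the "arbitrary periodic function of $\log(x)$" mentioned in the comments. The name $F_\varepsilon$ and the parameter $\varepsilon$ are introduced here only for illustration:

$$F_\varepsilon(x) = x^{\log_2 3}\,\bigl(1+\varepsilon\sin(2\pi\log_2 x)\bigr)\quad\text{for } 0<x\le 1,\qquad F_\varepsilon(0)=0.$$

Since $\log_2(2x)=\log_2 x+1$ and the sine has period $2\pi$,

$$F_\varepsilon(2x) = (2x)^{\log_2 3}\,\bigl(1+\varepsilon\sin(2\pi\log_2 x+2\pi)\bigr) = 3\,x^{\log_2 3}\,\bigl(1+\varepsilon\sin(2\pi\log_2 x)\bigr) = 3F_\varepsilon(x),$$

and $F_\varepsilon(1)=1$ because $\sin(0)=0$. For small $\varepsilon$ (a derivative estimate suggests $\varepsilon=0.1$ is small enough) $F_\varepsilon$ is still increasing on $[0,1]$, so the three conditions are met by more than one increasing function and cannot single out $x^{\log_2 3}$.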

You would not expect the cumulative distribution function to be a smooth function since for example values of the binary form $0.0111xyz\ldots_2$ are four times as likely as those of the form $0.1000xyz\ldots_2$
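The factor of four can be checked with exact arithmetic. This is a minimal sketch; the helper name `prefix_probability` is hypothetical and introduced only for illustration.

```python
from fractions import Fraction


def prefix_probability(bits, p_one=Fraction(2, 3)):
    """Probability that the binary expansion starts with the given digits.

    `bits` is a string of '0' and '1'; digits are independent and each
    equals 1 with probability `p_one`.
    """
    prob = Fraction(1)
    for b in bits:
        prob *= p_one if b == "1" else 1 - p_one
    return prob


low = prefix_probability("0111")   # numbers just below 1/2
high = prefix_probability("1000")  # numbers just above 1/2
print(low, high, low / high)
```

Two adjacent intervals of the same length on either side of $\frac12$ therefore carry different mass, which is one way to see why the cumulative distribution function should not be expected to be smooth there.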

The cumulative distribution function seems to look like this red line while $x^{\log_2(3)}$ is the blue line

[Figure: the cumulative distribution function (red) plotted against $x^{\log_2(3)}$ (blue)]

and you can see that $x^{\log_2(3)}$ only gives the correct value when $x$ is of the form $2^{-n}$, as Ross Millikan commented.

When $x=\dfrac{k}{2^n}$ for integers $k,n$ with $0\le k\le 2^n$, you have $F(x)=\dfrac{a(k)}{3^n}$, where $a(k)$ is OEIS A006046 (the number of odd entries in the first $k$ rows of Pascal's triangle). Other values can be found by limits since $F(x)$ is increasing, and looking at Michael Hardy's example it seems that you should have $F(\frac15)=\frac{5}{77},\, F(\frac25)=\frac{15}{77},\, F(\frac35)=\frac{29}{77},\, F(\frac45)=\frac{45}{77}.$
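These values can be reproduced by walking the binary digits of $x$: each time a digit of $x$ is $1$, every number that shares the earlier digits but has a $0$ in that position lies below $x$. The following is a sketch under that reading, not a definitive implementation; the function name `cdf` and the cutoff `max_digits` are assumptions made for illustration, and $0\le x\le 1$ is assumed.

```python
from fractions import Fraction

P1 = Fraction(2, 3)  # probability that a digit is 1
P0 = 1 - P1          # probability that a digit is 0


def cdf(x, max_digits=80):
    """Approximate F(x) = P(X <= x) from the binary digits of x.

    Exact for dyadic rationals; otherwise a lower bound whose error is
    at most the probability of the first `max_digits` digits of x.
    """
    x = Fraction(x)
    if x >= 1:
        return Fraction(1)
    total = Fraction(0)   # mass known to lie below x
    prefix = Fraction(1)  # probability of matching x's digits so far
    for _ in range(max_digits):
        if x == 0:
            break
        x *= 2
        if x >= 1:
            # Digit is 1: same prefix followed by a 0 lies entirely below x.
            total += prefix * P0
            prefix *= P1
            x -= 1
        else:
            prefix *= P0
    return total


print(cdf(Fraction(3, 4)))         # dyadic, exact
print(float(cdf(Fraction(1, 5))))  # compare with 5/77
```

For example, this gives $F(\frac34)=\frac59\approx 0.556$, whereas $(\frac34)^{\log_2 3}\approx 0.634$.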

Henry

Suppose for $0\le x\le 1$ we have $F(x) = \Pr(X\le x) = x^{\log_2 3}.$

Then $F(0) = 0$ and $F(1/2) = 1/3$ and $F(1)=1,$ all as expected.

Let $D_1,D_2,D_3,\ldots$ be the binary digits of $X.$ We wanted these to be i.i.d. with each equal to $1$ with probability $2/3.$

Observe that this means $\displaystyle X = \sum_{k=1}^\infty \frac{D_k}{2^k}$ and $\displaystyle 2(X - D_1/2) = 2X-D_1 = \sum_{k=2}^\infty \frac{D_k}{2^{k-1}}$ both have the same distribution, since the two sequences $(D_1,D_2,D_3,\ldots)$ and $(D_2,D_3,D_4,\ldots)$ both have the same distribution. Since $2X-D_1$ is determined by $D_2,D_3,D_4,\ldots,$ it is independent of $D_1,$ so the conditional distribution of $2X-D_1$ given $D_1$ is this same distribution.

Since $2X-D_1$ is independent of $D_1$ and $2X-D_1$ has the same distribution that $X$ has, we can say \begin{align} F(x) & = \Pr(X\le x) = \Pr(2X-D_1\le x) = \Pr(2X-D_1\le x \mid D_1=1) \\[10pt] & = \frac{\Pr(2X-D_1 \le x\ \&\ D_1=1)}{\Pr(D_1=1)} = \frac{\Pr(1/2 \le X \le \frac{x+1} 2)}{2/3} = \frac{\left( \frac{x+1} 2 \right)^{\log_2 3} - 1/3}{2/3}. \end{align}

Is the following true? $$ x^{\log_2 3} \, \overset{\Large\text{?}} = \frac{\left( \frac{x+1} 2 \right)^{\log_2 3} - 1/3}{2/3} $$ If $F(x)=x^{\log_2 3}$ were the right distribution function, it would have to hold for every $x$ in $[0,1].$ But numerical computation gives counterexamples: the two sides are not equal when $x=0.2.$
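The numerical check is easy to reproduce. A minimal sketch, where the names `lhs` and `rhs` are introduced only for illustration and stand for the two sides of the questioned identity:

```python
import math

C = math.log2(3)  # the exponent log_2(3)


def lhs(x):
    """Left side: the candidate CDF x^(log_2 3)."""
    return x ** C


def rhs(x):
    """Right side: what self-similarity would force the candidate to equal."""
    return (((x + 1) / 2) ** C - 1 / 3) / (2 / 3)


for x in (0.0, 0.2, 0.5, 1.0):
    print(x, lhs(x), rhs(x))
```

The two sides agree at the endpoints $x=0$ and $x=1$ (up to floating-point rounding) but differ visibly at $x=0.2$, which is enough to rule out $x^{\log_2 3}$.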