
Let $(X_1, X_2)$ be uniform over the unit Sierpinski triangle (represented in Cartesian coordinates). What is its covariance matrix?

This is a question I saw in a job ad. I would love some leads on solving it.

  • Certainly, the covariance must be zero, due to the relationship between covariance and correlation and the symmetry of the standard Sierpinski triangle. – Mark McClure May 09 '18 at 15:25
  • @MarkMcClure any idea how one could show this "the long way around"? I'm very interested in probabilistic reasoning over fractal distributions! – Carl Patenaude Poulin May 09 '18 at 16:03
  • Yeah - numerical estimations are easy but theoretical computations involve a bit more work. Are you sure the question asks for the full covariance matrix? That essentially means you need to compute the $X$ and $Y$ variances. That can be done, but is not totally easy. – Mark McClure May 09 '18 at 16:10
  • This isn't homework, FWIW. I'd love to know how to do the variance of each distribution. I figure I should start with an iterative refinement and see if it converges? – Carl Patenaude Poulin May 09 '18 at 18:22
  • 1
    You know how to express the variances as integrals with respect to the uniform distribution? If so, then an approach similar to the one I outline in my answer to this question should work. I believe that both variances are $1/18$. – Mark McClure May 09 '18 at 18:31
  • 1
    Is this triangle bounded by the triangle with vertices $(0, 0), (1, 0), (0, 1)$? It's unclear to me. – Brian Tung May 09 '18 at 20:51
  • @BrianTung Fair question - which is another reason that it would be great to see the original question quoted in full. Having said that, I do think that most references to the Sierpinski triangle would construct it as an equilateral triangle with base on the unit interval. That's certainly how I took it when doing my computations. – Mark McClure May 10 '18 at 12:19
  • Find the original question here. It's less than clear, but to me it looks like an equilateral triangle with unit side length (as @MarkMcClure said). – Carl Patenaude Poulin May 10 '18 at 18:41

2 Answers


For this answer, I will interpret the "unit Sierpinski triangle $S$" as being the one with vertices $(0,0)$, $(1,0)$, $(0,1)$. If you want to use the triangle with vertices $(0,0)$, $(1,0)$, $(\frac{1}{2}, \frac{\sqrt{3}}{2})$ instead (i.e. beginning with an equilateral triangle) then a very similar method should work.

I will also interpret the "uniformity" condition on the probability measure $\mu$ on $S$ as meaning that the contraction of $S$ onto each of its three level-1 subtriangles, $S_{00}$, $S_{01}$, and $S_{10}$, preserves the measure up to a factor of $\frac{1}{3}$.

This implies that for any measurable $L^1$ function $f : S \to \mathbb{R}$, we have $$ \int_{S_{00}} f(x,y)\, d\mu = \frac{1}{3} \int_S f \left(\frac{1}{2}x, \frac{1}{2}y \right) d\mu; \\ \int_{S_{01}} f(x,y)\, d\mu = \frac{1}{3} \int_S f \left(\frac{1}{2} x, \frac{1}{2}y + \frac{1}{2} \right) d\mu; \\ \int_{S_{10}} f(x,y)\, d\mu = \frac{1}{3} \int_S f \left(\frac{1}{2} x + \frac{1}{2}, \frac{1}{2} y \right) d\mu.$$ The reason: the equations hold for the characteristic function of any measurable set by the uniformity condition; by linearity, they then hold for simple functions; and the definitions of the Lebesgue integral for nonnegative functions and for $L^1$ functions extend them to all $L^1$ functions.

We can also show that the function $x : S \to \mathbb{R}$ must be measurable: for any dyadic rational $q$, $x^{-1}([q, \infty)) = \{ (x, y) \in S \mid x \ge q \}$ is a finite union of subtriangles of $S$ and therefore measurable. However, the intervals $[q, \infty)$ with $q$ a dyadic rational generate the Borel $\sigma$-algebra on $\mathbb{R}$. Similarly, the function $y : S \to \mathbb{R}$ must be measurable. And then, since both $x$ and $y$ are bounded functions on $S$ and $\mu$ is a finite measure, it follows that they are also in $L^1(\mu)$.

Using this, we can now calculate: $$E(x) = \int_S x\,d\mu = \int_{S_{00}} x\,d\mu + \int_{S_{01}} x\,d\mu + \int_{S_{10}} x\,d\mu = \\ \frac{1}{3} \int_S \frac{1}{2}x\,d\mu + \frac{1}{3} \int_S \frac{1}{2}x\,d\mu + \frac{1}{3} \int_S \left( \frac{1}{2}x + \frac{1}{2} \right) d\mu = \\ \frac{1}{2} \int_S x\,d\mu + \frac{1}{6} = \frac{1}{2} E(x) + \frac{1}{6}.$$ It follows that $E(x) = \frac{1}{3}$. Very similarly, $E(y) = \frac{1}{3}$.
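As a quick sanity check (not part of the proof), $E(x)$ and $E(y)$ can be estimated by simulation with the chaos game on the triangle with vertices $(0,0)$, $(1,0)$, $(0,1)$. A minimal sketch; the code and variable names are mine:

```python
import random

# Chaos game on the right-triangle gasket with vertices (0,0), (1,0), (0,1):
# jump halfway toward a uniformly chosen vertex at each step. The iterates
# are distributed (asymptotically) according to the uniform measure mu.
vertices = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
random.seed(0)

x, y = 0.0, 0.0
n, burn_in = 200_000, 100
sum_x = sum_y = 0.0
for i in range(n + burn_in):
    vx, vy = random.choice(vertices)
    x, y = (x + vx) / 2, (y + vy) / 2
    if i >= burn_in:
        sum_x += x
        sum_y += y

print(sum_x / n, sum_y / n)  # both approach 1/3
```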

The calculation of $E(x^2), E(xy), E(y^2)$ will be very similar (using the previous results for $E(x)$ and $E(y)$ in intermediate steps).
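A sketch of how those recursions resolve, using $E(x) = E(y) = \frac{1}{3}$ and expanding $\left(\frac{1}{2}x + \frac{1}{2}\right)^2 = \frac{1}{4}x^2 + \frac{1}{2}x + \frac{1}{4}$ in the $S_{10}$ term: $$E(x^2) = \frac{1}{4} E(x^2) + \frac{1}{6} E(x) + \frac{1}{12} = \frac{1}{4} E(x^2) + \frac{5}{36},$$ so $E(x^2) = \frac{5}{27}$ and $\operatorname{Var}(x) = \frac{5}{27} - \frac{1}{9} = \frac{2}{27}$; and $$E(xy) = \frac{1}{4} E(xy) + \frac{1}{12}\bigl(E(x) + E(y)\bigr) = \frac{1}{4} E(xy) + \frac{1}{18},$$ so $E(xy) = \frac{2}{27}$ and $\operatorname{Cov}(x,y) = \frac{2}{27} - \frac{1}{9} = -\frac{1}{27}$.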

  • 1
    I concur with this and get$$\left(\begin{array}{cc} 2/27 & -1/27 \ -1/27 & 2/27 \ \end{array}\right)$$ for the final covariance matrix. The original ad did use an equilateral triangle however. – Mark McClure Nov 09 '18 at 00:05
  • 1
    Actually, I guess since the equilateral Sierpinski triangle is the image of this "binary representation" Sierpinski triangle under a linear transformation, you wouldn't need to restart the calculations from scratch. If the linear transformation is $P$, then the covariance matrix should transform as $\Omega' = P^T \Omega P$. – Daniel Schepler Feb 12 '20 at 16:22
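That observation is easy to check numerically. With column vectors, the map sending $(0,0)$, $(1,0)$, $(0,1)$ to the equilateral vertices $(0,0)$, $(1,0)$, $(\frac{1}{2}, \frac{\sqrt{3}}{2})$ has those images as its columns, and $\operatorname{Cov}(PX) = P\,\Omega\,P^T$ (the comment's $P^T \Omega P$ is the same statement in the row-vector convention). A sketch, with variable names of my own:

```python
import numpy as np

# Covariance matrix of the right-triangle gasket derived in this answer.
omega = np.array([[2.0, -1.0], [-1.0, 2.0]]) / 27

# Linear map taking (0,0), (1,0), (0,1) to (0,0), (1,0), (1/2, sqrt(3)/2);
# its columns are the images of the standard basis vectors.
P = np.array([[1.0, 0.5],
              [0.0, np.sqrt(3) / 2]])

# If Y = P X for a random column vector X, then Cov(Y) = P Cov(X) P^T.
omega_eq = P @ omega @ P.T
print(omega_eq)  # equals I/18, matching the 1/18 variances quoted in the comments
```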

As mentioned by @MarkMcClure in the comments (and the wonderful linked answer), both the numerical and the exact covariance matrix of points sampled from the Sierpinski triangle is $(1/18) \mathbf{I}_2$ when you treat the $N$ sampled points as rows of a data matrix $X = (x_1, x_2, \ldots, x_N)^T$ and compute the $2 \times 2$ covariance of the coordinates. If, however, you take the transpose $Y = X^T$ and compute the $N \times N$ covariance, you get the answer below. While not correct, it was fun to work through, and I'll leave it up as long as people think it has some value.

Original Answer

Not directly an answer, but sampling gives simple, direct insight into the problem. First, sample a few thousand points using the chaos game:

*(figure: points sampled via the chaos game)*
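The sampling step can be sketched as follows (my own code, assuming the equilateral unit triangle; the script used for the figures isn't shown in the answer):

```python
import numpy as np

# Chaos game: from any starting point, repeatedly jump halfway toward a
# uniformly chosen vertex; the iterates rapidly approach the Sierpinski
# triangle and are distributed according to its uniform measure.
rng = np.random.default_rng(0)
vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3) / 2]])

n = 100_000
points = np.empty((n, 2))
p = np.array([0.25, 0.25])
for i in range(n):
    p = (p + vertices[rng.integers(3)]) / 2
    points[i] = p

# 2x2 covariance of the coordinates (rows of `points` are observations).
C = np.cov(points, rowvar=False)
print(C)  # close to (1/18) * I, i.e. ~0.0556 on the diagonal
```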

Compute the covariance matrix $C = \textrm{Cov}(X_1, X_2)$ and then its eigenvectors, $Cv = \lambda v$. You'll quickly find that the largest eigenvalue dominates all the rest; in this sample, $\lambda_1 / \lambda_2 \approx 10^{15}$. Sort the rows and columns of $C$ by the entries of the corresponding eigenvector $v_1$ and you get a beautiful, smooth matrix:

*(figure: covariance matrix with rows and columns sorted by $v_1$)*

For the final visual, color the points by their components in $v_1$:

*(figure: sampled points colored by $v_1$)*

All of this suggests that the answer to the original question has a nice closed form.

Hooked
  • Hmm... shouldn't the covariance matrix be 2 by 2? – Mark McClure May 10 '18 at 20:24
  • @MarkMcClure it's possible that there is a nomenclature error on my part here. I'm using this definition https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.cov.html where N points in d dimensions gives me an NxN matrix. Was the question about the transpose giving a dxd matrix? – Hooked May 10 '18 at 20:43
  • 1
    @MarkMcClure taking the 2x2 matrix gives ~ $0.05 * \mathbf{I}$ – Hooked May 10 '18 at 20:45
  • That sounds right - I computed it to be $I/18$ using the techniques in the comments. If you modify, I'll upvote. :) – Mark McClure May 10 '18 at 21:05
  • The definition you pointed to in the numpy documentation looks good, by the way. In this situation, we have $N=2$. – Mark McClure May 10 '18 at 21:11
  • 1
    @MarkMcClure turns out, I totally misread the problem as stated in the prior comment. I like the pictures for the other problem I worked on so I thought I'd leave it up. I've brought attention to your comment/answer at the top and thanks for showing me the Strichartz paper (it's really interesting!). – Hooked May 11 '18 at 13:47