
Background

Grimmett & Stirzaker’s Probability and Random Processes (4th ed. 2020), exercise 4.2.5, reads:

Peripheral points. Let $P_i = (X_i, Y_i)$, $1\le i \le n$, be independent, uniformly distributed points in the unit square $[0,1]^2$. A point $P_i$ is called peripheral if, for all $r=1, 2, \dots, n$, either $X_r\le X_i$ or $Y_r\le Y_i$, or both. Show that the mean number of peripheral points is $n\left(\frac{3}{4}\right)^{n-1}$.

The proof they give is easy: Define an indicator function $I_i$ that is 1 if the point $P_i$ is peripheral. Then $\mathbb{E}(I_i) = \mathbb{P}(I_i=1) = \left(\frac{3}{4}\right)^{n-1}$, and setting the number of peripheral points $X:=\sum I_i$, the result follows from the linearity of expectation.

Here is a plot of $\mathbb{E}(X)= n\left(\frac{3}{4}\right)^{n-1}$:

[Figure: plot of $n\left(\frac{3}{4}\right)^{n-1}$]

Question

Notably, for $n\ge 9$, we have $\mathbb{E}(X) = n\left(\frac{3}{4}\right)^{n-1} < 1$.

This contradicts the following argument:

  • Assume $n$ points placed in the unit square (never mind how they are distributed).
  • Since the set of points is finite, there exists at least one point with a maximal X coordinate, and at least one point with a maximal Y coordinate. (These points might be the same.)
  • Therefore the number of peripheral points is $X \ge 1$ always, and thus $\mathbb{E}(X)\ge 1$.

Put differently, if for $n\ge 9$ we have $\mathbb{E}(X) < 1$ then there should exist a configuration of points such that no point is peripheral. I can’t see how that is true?

Any help clearing up my confusion would be greatly appreciated, thanks!

  • My third edition of Grimmett & Stirzaker only has four exercises for section 4.2. Your 4.2.5 solution seems to be wrong, and your check that the expected number should exceed $1$ shows it is wrong. – Henry Feb 18 '23 at 17:36
  • This is the 4th edition from 2020; I have edited my post accordingly. I don’t have the third edition, but it seems this exercise was newly added. Thank you for the very detailed answer below! – Julius Plenz Feb 18 '23 at 21:03
  • The correct answer below is the same as that in a couple of earlier questions https://math.stackexchange.com/questions/206866/expected-number-of-pareto-optimal-points and https://math.stackexchange.com/questions/691868/expected-number-of-half-corners-in-a-plane – Henry Feb 18 '23 at 22:53

2 Answers


The $\left(\frac34\right)^{n-1}$ probability in your calculation is wrong, as it assumes that the pairwise orderings are independent. If you were restricted to just looking at the $X_i$s, the same argument would give a probability of $\left(\frac12\right)^{n-1}$ and an expectation of $n\left(\frac12\right)^{n-1}$, when clearly the correct probability is $\frac1n$ and the correct expectation is $1$ peripheral point.
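To see where the $\frac1n$ comes from, here is a short worked computation. It assumes only that the $X_i$ are independent and uniform on $[0,1]$, so that ties have probability zero; conditioning on $X_1 = x$ gives

$$\mathbb{P}(X_1 > X_r \text{ for all } r \ne 1) = \int_0^1 \mathbb{P}(X_r < x \text{ for all } r \ne 1)\,dx = \int_0^1 x^{n-1}\,dx = \frac1n .$$

Multiplying the pairwise probabilities instead gives $\left(\frac12\right)^{n-1}$, which agrees with $\frac1n$ only for $n=1$ and $n=2$.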

If $n=3$:

  • the first point is the only peripheral point if $X_1\gt X_2$ and $X_1\gt X_3$ and $Y_1\gt Y_2$ and $Y_1\gt Y_3$, which has probability $\left(\frac{1}{3}\right)^2=\frac19$, so the overall probability there is just one peripheral point is three times this, i.e. $\frac13$

  • all three points are peripheral if $X_a \le X_b \le X_c$ and $Y_a \ge Y_b \ge Y_c$ for one of the six permutations of $1,2,3$ so has probability $\frac{6}{3!^2}= \frac16$

  • in any other case there are two peripheral points, with probability $1-\frac13-\frac16=\frac 12$

  • the expected number of peripheral points is then $1\times \frac13+2 \times \frac12 + 3\times \frac16 = \frac{11}{6} \approx 1.833$, not your $3(\frac34)^2=\frac{27}{16} = 1.6875$.
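The $n=3$ case analysis above can also be checked by exhaustive enumeration. The sketch below assumes there are no ties (which holds with probability 1 for continuous coordinates), so only the relative order of the points matters. The helper names `perms` and `count_periph` are introduced here for illustration and are not part of the original answer.

```r
# All permutations of a vector, returned as a list
# (small recursive helper; base R has no built-in for this).
perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# Number of peripheral points, given the X ranks and Y ranks of the points.
count_periph <- function(x, y) {
  sum(sapply(seq_along(x), function(i) all(x[i] >= x | y[i] >= y)))
}

n <- 3
# Label the points by their X rank; each ordering of the Y ranks
# is then equally likely.
counts <- sapply(perms(1:n), function(y) count_periph(1:n, y))

print(table(counts) / factorial(n))  # distribution of the number of peripheral points
print(mean(counts))                  # expected number of peripheral points
```

Of the six equally likely orderings, two give one peripheral point, three give two, and one gives three, which reproduces the probabilities $\frac13$, $\frac12$, $\frac16$ and the mean $\frac{11}{6}$.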

For a solution for general $n$:

  • the probability that the first point is the $k$th largest ranked on $X$ is $\frac{1}{n}$, since all $n$ positions are equally likely

  • given it is the $k$th largest ranked on $X$, the first point is peripheral with probability $\frac{1}{k}$, since it has to be ranked higher on $Y$ than each of the $k-1$ points above it on $X$, i.e. it must have the largest $Y$ among those $k$ points

  • so the probability the first point is peripheral is $\sum\limits_{k=1}^n \frac1{nk} = \frac1nH_n$ where $H_n$ is a harmonic number

  • and thus the expected number of peripheral points is $n$ times that, by linearity of expectation, so is $H_n \approx \log_e(n)+\gamma+ \frac1{2n}$
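As a quick numerical comparison of the two formulas, the following sketch tabulates $H_n$ next to the book's $n\left(\frac34\right)^{n-1}$ for $n = 1,\dots,10$. The variable names are illustrative only.

```r
n <- 1:10
harmonic <- cumsum(1 / n)        # H_n, the expectation derived above
book     <- n * (3/4)^(n - 1)    # the book's n(3/4)^(n-1)

print(data.frame(n, harmonic = round(harmonic, 4), book = round(book, 4)))
```

The two columns coincide for $n=1$ and $n=2$ and then diverge: $H_n$ keeps growing, while the book's value falls below $1$ from $n=9$ onwards.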

For $n=3$ this is $\frac11+\frac12+\frac13=\frac{11}{6}$ as before. For $n=9$ it is $\frac{7129}{2520} \approx 2.829$, well above $1$. Perhaps a simulation using R might be persuasive, confirming these results up to the noise of simulation:

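# Simulate n independent uniform points and count the peripheral ones
periph <- function(n){
  X <- runif(n)
  Y <- runif(n)
  isperiph <- logical(n)
  for (i in 1:n){
    # point i is peripheral if, against every point, it is at least as large on X or on Y
    isperiph[i] <- all(X[i] >= X | Y[i] >= Y)
  }
  sum(isperiph)
}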

set.seed(2023)

sims <- replicate(10^5, periph(3))
table(sims)
# sims
#     1     2     3
# 33475 49899 16626
mean(sims)
# 1.83151
sum(1/(1:3))
# 1.833333

sims <- replicate(10^5, periph(9))
table(sims)
# sims
#     1     2     3     4     5     6     7     8     9
# 11149 29951 32675 18613  6191  1282   132     6     1
mean(sims)
# 2.83156
sum(1/(1:9))
# 2.828968

Henry

I think the book answer is wrong.

Let $A_{i,j}$ be the event that $P_i$ "is peripheral with respect" to $P_j$, that is, $A_{i,j} = \{X_i \ge X_j\} \cup \{Y_i \ge Y_j\}$.

Obviously $P(A_{i,j})=1$ if $i=j$. And, by symmetry, $P(A_{i,j})=\frac34$ for $i\ne j$.

Now, we are interested in $P(I_i=1)=P( A_{i,1} \cap A_{i,2} \cap \cdots \cap A_{i,n})$.

The book seems to assume that this equals $\prod_{j=1}^n P(A_{i,j})=(3/4)^{n-1}$.

But this is wrong, because the events are not independent. (Knowing that $P_1$ is peripheral wrt $P_2$ increases the probability that $P_1$ is peripheral wrt other points).
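The dependence can be seen in a small simulation. This is only a sketch: the seed and sample size are arbitrary choices, and the variable names are introduced here. It estimates $P(A_{1,3})$, $P(A_{1,3} \mid A_{1,2})$ and $P(A_{1,2} \cap A_{1,3})$ for three independent uniform points.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
m <- 2e5                         # number of simulated triples (P_1, P_2, P_3)
X <- matrix(runif(3 * m), ncol = 3)
Y <- matrix(runif(3 * m), ncol = 3)

A12 <- X[, 1] >= X[, 2] | Y[, 1] >= Y[, 2]   # event A_{1,2}
A13 <- X[, 1] >= X[, 3] | Y[, 1] >= Y[, 3]   # event A_{1,3}

p_uncond <- mean(A13)            # estimate of P(A_{1,3}); exact value 3/4
p_cond   <- mean(A13[A12])       # estimate of P(A_{1,3} | A_{1,2})
p_joint  <- mean(A12 & A13)      # estimate of P(A_{1,2} and A_{1,3})

print(c(unconditional = p_uncond, conditional = p_cond, joint = p_joint))
```

Under independence the joint probability would be $(3/4)^2 = 0.5625$. The exact value for $n=3$ is $\frac13 H_3 = \frac{11}{18} \approx 0.611$, so the conditional probability is $\frac{11/18}{3/4} = \frac{22}{27} \approx 0.815$, larger than $\frac34$.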

Your contradiction argument is right. The actual number of peripheral points is always $1$ or $2$, hence the expected number must lie in that range. Actually, it's easy to conjecture that it should tend to $2$ for large $n$.

(Edit: As noted by Henry in the comments, the last sentences are wrong)

leonbloy
  • I got caught by the "actual number of peripheral points is always 1 or 2" interpretation, but I do not think it is correct. That says there is one maximal $X_i$ and one maximal $Y_j$, which may or may not satisfy $i=j$. In that case the expected number of maximal points is clearly $2-\frac1n$. But this question is subtly different: with $n=3$ you could have three peripheral points such as $(0.9, 0.2), (0.6,0.4), (0.3,0.8)$, since the second exceeds the first on $Y$ and exceeds the third on $X$, so it counts as peripheral. Your non-independence issue is still correct. – Henry Feb 18 '23 at 16:53
  • @Henry You're right, of course. – leonbloy Feb 18 '23 at 17:46
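The three-point example from the comment above can be checked directly against the definition in the exercise. A minimal sketch, with `is_periph` as an illustrative name:

```r
# The configuration from the comment: (0.9, 0.2), (0.6, 0.4), (0.3, 0.8)
X <- c(0.9, 0.6, 0.3)
Y <- c(0.2, 0.4, 0.8)

# Point i is peripheral if, for every r, X_r <= X_i or Y_r <= Y_i.
is_periph <- sapply(seq_along(X), function(i) all(X[i] >= X | Y[i] >= Y))

print(is_periph)
print(sum(is_periph))
```

All three points satisfy the condition, so the number of peripheral points is not bounded by 2.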