In a nutshell: what is the (fully rigorous) definition of a confidence interval?
On page $92$ of Wasserman's All of Statistics, it is written that
A $1 − α$ confidence interval for a parameter $θ$ is an interval $C_n = (a, b)$ where $a = a(X_1,...,X_n)$ and $b = b(X_1,...,X_n)$ are functions of the data such that $$P_θ(θ ∈ C_n) ≥ 1 − α, \ \ \ \ \text{ for all } θ ∈ Θ.$$ In words, $(a, b)$ traps $θ$ with probability $1 − α$. We call $1 − α$ the coverage of the confidence interval. Warning! $C_n$ is random and $θ$ is fixed.
I cannot understand the expression $P_\theta(\theta\in C_n)$. In general, if we have a random variable $X:(\Omega,\mathcal{F},P)\to (\mathbb{R},\mathcal{B})$, we define $$P(X\in S) := X_*P(S) = P(X^{-1}(S))$$ for $S$ in the Borel $\sigma$-algebra $\mathcal{B}$. Note the expression "$P(X\in S)$" requires that
- $X$ be a random variable.
- $S$ be a fixed set.
Neither of these conditions seems to be met by the expression "$P(\theta\in C_n)$", as
- $\theta$ is an element of the parameter space $\Theta$, which is itself a subset of $\mathbb{R}^k$ for some $k$. That is, it seems to me that $\theta$ is a (fixed) vector, not a function (and thus not a random variable either).
- As $a$ and $b$ are functions of $X_1,\ldots,X_n$, the interval $C_n := (a,b)$ seems to be "variable", when it should be a fixed set for the expression to make sense.
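To make concrete what I mean by the interval being "variable", here is a minimal simulation sketch of how I read the quoted definition operationally. The model is my own assumption, not Wasserman's example: $X_1,\ldots,X_n$ i.i.d. $N(\theta,\sigma^2)$ with known $\sigma$, and the usual interval $\bar X \pm z_{\alpha/2}\,\sigma/\sqrt{n}$. The helper names `interval` and `coverage` are hypothetical. Here $\theta$ stays fixed while $(a,b)$ is recomputed from each fresh sample.

```python
import random
from statistics import NormalDist, fmean


def interval(sample, sigma=1.0, alpha=0.05):
    """Return (a, b), both functions of the data, for a normal mean.

    Assumes the standard deviation sigma is known (illustrative model only).
    """
    n = len(sample)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 normal quantile
    half_width = z * sigma / n ** 0.5
    xbar = fmean(sample)
    return xbar - half_width, xbar + half_width


def coverage(theta, n=30, reps=20_000, sigma=1.0, alpha=0.05, seed=0):
    """Estimate the fraction of repeated samples whose interval contains theta.

    theta is the same fixed number in every repetition; only the data,
    and hence the endpoints a and b, change.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(theta, sigma) for _ in range(n)]
        a, b = interval(sample, sigma, alpha)
        hits += a < theta < b
    return hits / reps


if __name__ == "__main__":
    # The definition asks for the guarantee at every theta in the parameter
    # space, so a few different fixed values are tried here.
    for theta in (-2.0, 0.0, 5.0):
        print(theta, coverage(theta))
```

Under these assumptions the printed frequencies should sit near $1-\alpha$ for each fixed $\theta$, which is the behaviour the quote seems to describe. What I am missing is the measure-theoretic statement that this simulation is approximating.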
In case it is relevant, on page $89$ Wasserman explains that
... $P_\theta(X\in A) = \int_A f(x;\theta) dx$ ...
which does make sense: $\theta$ is fixed here, so $f(x;\theta)$ is a fixed density for the random variable $X$, while $A$ is a fixed set. However, the author seems to be using the expression $P_\theta$ differently in the main (first) quote of this post.