I was curious as to the autocorrelation of prime gaps when viewed as a time series; I understand this makes no sense, but I would love to hear your thoughts.
As per Raymond Manzoni's response to the question "Is the $n$-th prime smaller than $n(\log n + \log\log n-1+\frac{\log\log n}{\log n})$?":
$p_n\ge n\left(\ln n+\ln\ln n-1+\frac{\ln\ln n-21/10}{\ln n}\right)$ for $n\ge3$.
Let the right-hand side be $f(n)$. Assume that $g_n=p_{n+1}-p_n\sim f(n+1) - f(n)$.
Let $t_n = f(n+1)-f(n)$, with $t_n$ being the unconditional estimate, i.e. the asymptotic (long-run) approximation of the $n$-th prime gap.
To normalize the gaps, consider:
$x_n = \ln(g_n/t_n)$
That is, $x_n$ is the log of how many times wider the gap is than it "should be".
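For concreteness, here is a minimal Python sketch of the normalization above. It is an illustration only, not the code used for the figures in this post: the function names (`primes_up_to`, `f`, `normalized_log_gaps`) and the choice to start at $n=3$ (where the bound is stated to hold) are my own assumptions.

```python
import numpy as np

def primes_up_to(limit):
    """Sieve of Eratosthenes; returns all primes <= limit as an array."""
    sieve = np.ones(limit + 1, dtype=bool)
    sieve[:2] = False
    for i in range(2, int(limit**0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = False
    return np.flatnonzero(sieve)

def f(n):
    """Right-hand side of the bound above: the lower bound for p_n, n >= 3."""
    n = np.asarray(n, dtype=float)
    ln = np.log(n)
    lnln = np.log(ln)
    return n * (ln + lnln - 1 + (lnln - 2.1) / ln)

def normalized_log_gaps(primes, start=3):
    """x_n = ln(g_n / t_n) for n = start, ..., len(primes) - 1 (n is 1-indexed)."""
    n = np.arange(start, len(primes))   # p_n is primes[n - 1]
    g = primes[n] - primes[n - 1]       # g_n = p_{n+1} - p_n
    t = f(n + 1) - f(n)                 # t_n = f(n+1) - f(n)
    return np.log(g / t)

# Small demonstration; a limit of 15_485_863 (the 1,000,000th prime)
# would cover the sample size discussed in this post.
primes = primes_up_to(1000)
x = normalized_log_gaps(primes)
```

The array `x` has one entry per $n$ from `start` up to one less than the number of primes found, since the last prime has no successor in the sample.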
The histogram of $x_n$ for the first 1,000,000 primes is as follows:
Histogram of $x_n$
More interestingly, if we difference $x_n$, i.e. take $y_n = x_{n}-x_{n-1}$, the histogram of $y_n$ is symmetric (I don't think it's over-differencing, as I couldn't recreate it in simulations of Gaussian noise; I haven't tried exponential noise yet):
Histogram of $y_n = x_n - x_{n-1}$
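The differencing step can be sketched as follows, together with a simple numeric check of symmetry. The helper names (`difference`, `sample_skewness`) are hypothetical, and using the sample skewness as a symmetry check is my own choice for illustration; the post itself only relies on the histogram.

```python
import numpy as np

def difference(x):
    """y_n = x_n - x_{n-1}; the result is one element shorter than x."""
    x = np.asarray(x, dtype=float)
    return x[1:] - x[:-1]

def sample_skewness(y):
    """Third standardized moment; values near 0 are consistent with symmetry."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    return np.mean(d**3) / np.mean(d**2) ** 1.5

# Toy input only; in the setting above, x would be the series x_n.
y = difference([0.3, -0.1, 0.4, 0.0, -0.2])
skew = sample_skewness(y)
```

A skewness close to zero does not prove symmetry, but a value far from zero would contradict it.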
In addition, for the first 1,000,000 primes, the ACF of $x_n$ seems to have statistically significant autocorrelations for lags 1-5.
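A sample ACF with the usual approximate 95% white-noise band $\pm 1.96/\sqrt{N}$ can be computed along these lines. This is a sketch with hypothetical helper names (`sample_acf`, `white_noise_band`); the band is the standard large-sample approximation for an uncorrelated series, which I am assuming is the significance criterion meant here.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of the series x."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = np.dot(d, d)
    return np.array([np.dot(d[k:], d[:len(d) - k]) / denom
                     for k in range(max_lag + 1)])

def white_noise_band(n, z=1.96):
    """Half-width of the approximate 95% band for an uncorrelated series."""
    return z / np.sqrt(n)

# Toy input only; in the setting above, x would be the series x_n.
x = np.array([0.3, -0.1, 0.4, 0.0, -0.2, 0.1, -0.3, 0.2])
r = sample_acf(x, max_lag=5)
significant_lags = [k for k in range(1, 6)
                    if abs(r[k]) > white_noise_band(len(x))]
```

With $N$ around 1,000,000 the band is roughly $\pm 0.002$, so even very small autocorrelations will register as statistically significant.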
R estimated the optimal ARIMA model for $x_n$ to be an AR(5), with the following coefficients (standard errors in parentheses): ar1 -0.8928 (0.0035), ar2 -0.7302 (0.0046), ar3 -0.5494 (0.0049), ar4 -0.3618 (0.0046), ar5 -0.1788 (0.0035).
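For anyone who wants to cross-check AR coefficients outside R, here is a minimal Yule-Walker sketch in Python. Note that this is a swapped-in technique: the estimates above came from R's model selection and fitting, which this does not replicate, so the numbers need not match exactly, and no standard errors are produced. The function name `yule_walker_ar` is hypothetical.

```python
import numpy as np

def yule_walker_ar(x, order):
    """Yule-Walker estimate of AR(order) coefficients for the demeaned series.

    Model convention: x_t - mu = phi_1 (x_{t-1} - mu) + ... + e_t.
    Returns (phi, sigma2), where sigma2 is the innovation variance estimate.
    """
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    n = len(d)
    # Biased sample autocovariances gamma_0, ..., gamma_order
    gamma = np.array([np.dot(d[k:], d[:n - k]) / n for k in range(order + 1)])
    # Toeplitz matrix with entries gamma_{|i-j|}
    idx = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    phi = np.linalg.solve(gamma[idx], gamma[1:])
    sigma2 = gamma[0] - np.dot(phi, gamma[1:])
    return phi, sigma2

# Toy check on simulated data: an AR(1) series with coefficient -0.5.
rng = np.random.default_rng(0)
e = rng.standard_normal(5000)
sim = np.zeros(5000)
for t in range(1, 5000):
    sim[t] = -0.5 * sim[t - 1] + e[t]
phi_hat, sigma2_hat = yule_walker_ar(sim, order=1)
```

Applied to $x_n$ with `order=5`, the returned `phi` would be the analogue of ar1 through ar5 under this estimator.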
It seems that the autocorrelations for lags 1-5 of $x_n$ are significant. If prime gaps are indeed random, must this necessarily be a fluke attributable to the small sample size? In the limit as $n$ grows large, would these autocorrelations disappear? If they don't, what is the implication for the distribution of primes? Even if we are confident that these $\rho_i$'s are non-zero, is it possible to prove inductively that the autocorrelations are invariant (i.e., that adding the $(n+1)$-th prime to the sample will still imply non-zero AR coefficients)?