I am trying to understand how a Stochastic Process can be fit to real-world data. Specifically, does every Stochastic Process have an underlying Probability Distribution (i.e. a family of finite-dimensional distributions)? If so, I think it should be possible to write down a valid mathematical likelihood function based on this underlying Probability Distribution - and then estimate the required parameters using some estimation technique (e.g. Maximum Likelihood Estimation).
Now, I will add some more context to my question.
In a previous question (Simulating a Function (that is naturally contained) Within an Interval $a,b$), I tried to define a Stochastic Process that "naturally" only exists between points $(a,b)$. This process was based on a modified version of the Ornstein-Uhlenbeck process (which is itself driven by the Wiener Process).
Part 1: Definitions
Wiener Process: As I understand it, the Wiener Process $W_t$ is the standard mathematical model of Brownian Motion; its increments over disjoint intervals are independent Gaussian random variables. Here are some standard properties of Brownian Motion:
- $W_0 = 0$ almost surely.
- $W$ has independent increments: for every $t > 0$, the future increments $W_{t+u} - W_t$, $u \geq 0$, are independent of the past values $W_s$, $s < t$.
- $W$ has Gaussian increments: $W_{t+u} - W_t$ is normally distributed with mean 0 and variance $u$, $W_{t+u} - W_t \sim N(0, u)$.
Ornstein-Uhlenbeck Process: The Ornstein-Uhlenbeck process $X_t$ is defined by the following stochastic differential equation:
$$dX_t = \theta (\mu - X_t) \, dt + \sigma \, dW_t$$
where:
- $\theta > 0$ and $\sigma > 0$ are parameters
- $W_t$ denotes the Wiener process.
- $\mu$ is a drift constant.
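As a sanity check on this definition, the SDE can be simulated with a simple Euler-Maruyama discretisation (a minimal sketch; the step size, seed, and parameter values below are arbitrary choices of mine, not anything from the linked posts):

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_ou(theta, mu, sigma, x0, T, n):
    """Euler-Maruyama discretisation of dX_t = theta*(mu - X_t) dt + sigma dW_t."""
    dt = T / n
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):
        dw = rng.normal(0.0, np.sqrt(dt))  # Wiener increment ~ N(0, dt)
        x[i + 1] = x[i] + theta * (mu - x[i]) * dt + sigma * dw
    return x

# Starting far from mu, the path should revert towards mu over time.
path = simulate_ou(theta=1.5, mu=0.0, sigma=0.3, x0=2.0, T=10.0, n=1000)
```

A plot of `path` shows the mean-reverting behaviour controlled by $\theta$.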
Modifying the Ornstein-Uhlenbeck process to be constrained between two points $(a,b)$: in this answer (https://math.stackexchange.com/a/4828166/791334), I learned that the Ornstein-Uhlenbeck process can be transformed into a new process $Y_t$ that is contained in $(0,1)$:
$$Y_t = \frac{e^{X_t}}{1 + e^{X_t}}$$
By shifting and scaling this transform, I think we should be able to define a process between two points $(a,b)$:
$$Y_t = a + \left(\frac{e^{X_t}}{1 + e^{X_t}}\right) \cdot (b - a)$$
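To illustrate, the logistic transform and the $(a,b)$ rescaling can be checked numerically (a small sketch; the input here is a deterministic stand-in for an Ornstein-Uhlenbeck path, not simulated data):

```python
import numpy as np

def logistic_transform(x, a, b):
    """Map an unconstrained path x into (a, b) via Y = a + (b - a) * e^x / (1 + e^x)."""
    # 1 / (1 + e^{-x}) is the numerically stable form of e^x / (1 + e^x)
    return a + (b - a) / (1.0 + np.exp(-x))

x = np.linspace(-10, 10, 201)          # stand-in for an O-U path
y = logistic_transform(x, a=2.0, b=5.0)

# The transformed values never touch the boundaries a and b.
assert np.all((y > 2.0) & (y < 5.0))
```

Since the transform is strictly increasing, it also preserves the ordering of the original path, and $x = 0$ maps to the midpoint $(a+b)/2$.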
My Question: In this post (https://stats.stackexchange.com/questions/605530/estimate-parameters-in-brownian-motion-with-drift-dx-t-mu-dt-sigma-dw-t), an approach is outlined for estimating the parameters of a Brownian Motion with drift using a likelihood-based method: since consecutive increments of the driving Wiener Process are independent Normally distributed random variables, estimating the parameters of the Brownian Motion reduces to Maximum Likelihood Estimation for a Normal Distribution:
> It is well known that (note that $\{W_t\}$ by definition is a Gaussian process) for $0 < t_1 < \cdots < t_k$, the joint density of $(W_{t_1}, \ldots, W_{t_k})$ is (where $t_0 = w_0 = 0$)
> \begin{align}
> f_{t_1\cdots t_k}(w_1, \ldots, w_k) = \prod_{i = 1}^k\frac{1}{\sqrt{2\pi(t_i - t_{i - 1})}} \exp\left[-\frac{(w_i - w_{i - 1})^2}{2(t_i - t_{i - 1})}\right].
> \end{align}
> Since the transformation $\mathbf{X} = \mu\mathbf{t} + \sigma\mathbf{W}$ is affine (where $\mathbf{W} = (W_{t_1}, \ldots, W_{t_k})$, $\mathbf{X} = (X_{t_1}, \ldots, X_{t_k})$, $\mathbf{t} = (t_1, \ldots, t_k)$), the joint density of $(X_{t_1}, \ldots, X_{t_k})$ is then given by (where $t_0 = x_0 = 0$):
> \begin{align}
> g_{t_1\cdots t_k}(x_1, \ldots, x_k) &= \frac{1}{\sigma^k} \prod_{i = 1}^k\frac{1}{\sqrt{2\pi(t_i - t_{i - 1})}} \exp\left[-\frac{\left(\sigma^{-1}(x_i - \mu t_i) - \sigma^{-1}(x_{i - 1} - \mu t_{i - 1})\right)^2}{2(t_i - t_{i - 1})}\right] \\
> &= \frac{1}{\sigma^k} \prod_{i = 1}^k\frac{1}{\sqrt{2\pi(t_i - t_{i - 1})}} \exp\left[-\frac{(x_i - x_{i - 1} - \mu(t_i - t_{i - 1}))^2}{2\sigma^2(t_i - t_{i - 1})}\right].
> \end{align}
>
> This means that given data $x_1, \ldots, x_k$ observed at $0 < t_1 < \cdots < t_k$, the log-likelihood function of $(\mu, \sigma)$ is
> \begin{align}
> -k\log\sigma - \frac{1}{2}\sum_{i = 1}^k\log(2\pi(t_i - t_{i - 1})) - \frac{1}{2\sigma^2}\sum_{i = 1}^k\frac{(x_i - x_{i - 1} - \mu(t_i - t_{i - 1}))^2}{t_i - t_{i - 1}}. \tag{1}
> \end{align}
>
> From $(1)$ it is easy to determine the MLE of $\mu$ and $\sigma$.
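For what it's worth, setting the derivatives of $(1)$ to zero gives $\hat\mu = x_k / t_k$ and $\hat\sigma^2 = \frac{1}{k}\sum_i (x_i - x_{i-1} - \hat\mu\,\Delta t_i)^2 / \Delta t_i$, and these can be verified on simulated data (a sketch; the simulation settings are my own arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate Brownian motion with drift: increments are N(mu*dt_i, sigma^2*dt_i).
mu_true, sigma_true = 0.7, 1.3
t = np.cumsum(rng.uniform(0.01, 0.1, size=5000))   # irregular observation times
dt = np.diff(np.concatenate(([0.0], t)))
dx = rng.normal(mu_true * dt, sigma_true * np.sqrt(dt))
x = np.cumsum(dx)

# Closed-form maximisers of the log-likelihood (1):
mu_hat = x[-1] / t[-1]                             # = sum(dx) / sum(dt)
sigma_hat = np.sqrt(np.mean((dx - mu_hat * dt) ** 2 / dt))
```

With 5000 increments, `mu_hat` and `sigma_hat` land close to the true values 0.7 and 1.3.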
In my modified process $Y_t$, I have an additional parameter $\theta$ (note that $a$ and $b$ are pre-defined and do not need to be estimated).
- The increments of a Brownian Motion over disjoint intervals are independent Normal random variables - this is what makes it possible to construct a valid likelihood function. However, I am not sure whether the increments of my process $Y_t$ have any similarly tractable distribution, and therefore whether a valid likelihood function for $Y_t$ exists and can be constructed.
- Is it still somehow possible to construct a valid mathematical likelihood function corresponding to $Y_t$, such that all of its parameters can be estimated via Maximum Likelihood Estimation?
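To make the question concrete, here is the kind of construction I have in mind (this is my own sketch, not something established in the linked posts): since the map from $X_t$ to $Y_t$ is strictly monotone, it can be inverted as $X_t = \log\frac{Y_t - a}{b - Y_t}$, and the exact Gaussian transition density of the Ornstein-Uhlenbeck process, $X_{t+\Delta} \mid X_t \sim N\big(\mu + (X_t - \mu)e^{-\theta\Delta},\ \tfrac{\sigma^2}{2\theta}(1 - e^{-2\theta\Delta})\big)$, can then be evaluated on the recovered path:

```python
import numpy as np
from scipy.stats import norm

def neg_loglik(params, y, t, a, b):
    """Negative log-likelihood of (theta, mu, sigma) for observations y of the
    bounded process, using the exact O-U transition density after inverting
    Y = a + (b - a) * e^X / (1 + e^X)  =>  X = log((Y - a) / (b - Y))."""
    theta, mu, sigma = params
    if theta <= 0 or sigma <= 0:
        return np.inf
    x = np.log((y - a) / (b - y))      # recover the latent O-U path
    dt = np.diff(t)
    mean = mu + (x[:-1] - mu) * np.exp(-theta * dt)
    var = sigma**2 * (1.0 - np.exp(-2.0 * theta * dt)) / (2.0 * theta)
    # NOTE: this is the likelihood of the latent X-path; the likelihood of the
    # Y-path adds the Jacobian of the transform, which is free of
    # (theta, mu, sigma) and therefore does not change the argmax.
    return -np.sum(norm.logpdf(x[1:], loc=mean, scale=np.sqrt(var)))
```

The resulting function could then be handed to a numerical optimiser such as `scipy.optimize.minimize` to obtain estimates of $(\theta, \mu, \sigma)$ - but I am unsure whether this reasoning is sound, hence the question.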
Thanks!