I'm reading a paper about model uncertainty quantification. Specifically, it says that epistemic uncertainty is the uncertainty that arises from a lack of knowledge about a particular region of the input space.

It also gives a mathematical characterization of uncertainty. Denoting by $x$ and $y$ the input and output points, and by ${\cal D}$ the training data, the total uncertainty ${\cal U}$ of a model with respect to the input $x$ can be measured by the predictive entropy:

${\cal U}(x)={\cal H}(P(y|x,{\cal D}))=-\sum_y P(y|x,{\cal D})\log P(y|x,{\cal D})$
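
To check my reading of this, here is a minimal NumPy sketch (my own, not from the paper) of the predictive entropy, assuming a classification model whose posterior predictive $P(y|x,{\cal D})$ is available as a probability vector over classes:

```python
import numpy as np

def predictive_entropy(probs):
    """H(P(y|x,D)) for one input x.

    probs: (n_classes,) array holding the posterior predictive
    distribution P(y|x,D) (assumption: a classification model).
    """
    eps = 1e-12  # guard against log(0)
    return -np.sum(probs * np.log(probs + eps))

# A confident prediction has low total uncertainty...
print(predictive_entropy(np.array([0.98, 0.01, 0.01])))  # ~0.11
# ...while a uniform prediction has the maximal entropy log(3).
print(predictive_entropy(np.array([1/3, 1/3, 1/3])))     # ~1.10
```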

Denoting by $P(\theta|{\cal D})$ the posterior distribution over the model parameters $\theta$, it also decomposes the uncertainty ${\cal U}$ into two terms:

${\cal U}(x)=\left({\cal H}(P(y|x,{\cal D}))-\mathbb{E}_{P(\theta|{\cal D})}[{\cal H}(P(y|x,\theta))]\right)+\mathbb{E}_{P(\theta|{\cal D})}[{\cal H}(P(y|x,\theta))]$
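
To make sure I am parsing the decomposition correctly, here is a small NumPy sketch (my own Monte Carlo approximation, assuming the posterior expectation is estimated from samples $\theta_i \sim P(\theta|{\cal D})$, e.g. ensemble members or MC-dropout passes; the paper may use a different approximation):

```python
import numpy as np

def uncertainty_decomposition(member_probs):
    """Split total predictive entropy into the two terms above.

    member_probs: (n_samples, n_classes) array; row i is P(y|x, theta_i)
    for one posterior sample theta_i (assumption: e.g. one ensemble
    member or one MC-dropout forward pass).
    """
    eps = 1e-12  # guard against log(0)
    mean_probs = member_probs.mean(axis=0)  # approximates P(y|x,D)
    total = -np.sum(mean_probs * np.log(mean_probs + eps))
    # E_{P(theta|D)}[H(P(y|x,theta))]: the second term
    expected_entropy = -np.sum(member_probs * np.log(member_probs + eps),
                               axis=1).mean()
    first_term = total - expected_entropy  # the mutual information term
    return total, first_term, expected_entropy

# Two posterior samples that are individually confident but disagree:
# each H(P(y|x,theta_i)) is small, yet the averaged predictive is flat.
disagree = np.array([[0.99, 0.01],
                     [0.01, 0.99]])
print(uncertainty_decomposition(disagree))
```

On this toy example the total is about $0.69$ nats and almost all of it sits in the first term, i.e. the first term grows when the posterior models disagree with each other.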

My question is: why is the first term, which the paper calls the mutual information between $\theta$ and $y$, capable of expressing the epistemic uncertainty?
