I would like to understand, in simple terms, how the Hellinger distance works. What types of problems can it be used for, and what are the benefits of using it?
1 Answer
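The Hellinger distance is a metric that measures the difference between two probability distributions. It is the probabilistic analog of Euclidean distance: the Euclidean distance between the square roots of the two distributions, scaled so that it lies between 0 and 1.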
Given two probability distributions, $P$ and $Q$, Hellinger distance is defined as:
$$h(P,Q) = \frac1{\sqrt2}\cdot \|\sqrt{P}-\sqrt{Q}\|_2$$
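To make the formula concrete, here is a minimal sketch in plain Python. It assumes $P$ and $Q$ are discrete distributions given as lists of probabilities over the same outcomes, in the same order; the function name `hellinger` and the example values are only illustrative.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    Assumes p and q are sequences of probabilities over the same
    outcomes, listed in the same order, each summing to 1.
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same outcomes")
    # Squared Euclidean distance between the element-wise square roots.
    squared = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    # The 1/sqrt(2) factor scales the result into the range [0, 1].
    return math.sqrt(squared) / math.sqrt(2)

# Illustrative inputs (made up for this example).
identical = hellinger([0.5, 0.5], [0.5, 0.5])  # same distribution
disjoint = hellinger([1.0, 0.0], [0.0, 1.0])   # no shared outcomes
partial = hellinger([0.1, 0.9], [0.5, 0.5])    # somewhere in between
```

Identical distributions give a distance of 0, distributions with no outcomes in common give 1, and everything else falls in between. The distance is also symmetric, so swapping the arguments does not change the result.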
It is useful for quantifying the difference between two probability distributions. For example, suppose you estimate the distribution of each feature separately for users and non-users of a service. If the Hellinger distance between the two groups is small for some features, then those features are not statistically useful for segmentation.
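The users versus non-users idea might be sketched as follows. This is only an illustration under stated assumptions: the features are categorical, each group's distribution is estimated by simple relative frequencies, and the data, feature names, and helper names (`empirical_distribution`, `scores`) are all made up.

```python
import math
from collections import Counter

def hellinger(p, q):
    """Hellinger distance between two discrete distributions (same outcome order)."""
    squared = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(squared) / math.sqrt(2)

def empirical_distribution(values, categories):
    """Relative frequency of each category in `values` (hypothetical helper)."""
    counts = Counter(values)
    return [counts[c] / len(values) for c in categories]

# Made-up observations: one list of feature values per group.
users = {
    "device": ["mobile"] * 5 + ["desktop"] * 5,
    "region": ["urban"] * 9 + ["rural"] * 1,
}
non_users = {
    "device": ["mobile"] * 5 + ["desktop"] * 5,
    "region": ["urban"] * 2 + ["rural"] * 8,
}

scores = {}
for feature in users:
    # Use one shared, ordered category list so both distributions line up.
    categories = sorted(set(users[feature]) | set(non_users[feature]))
    p = empirical_distribution(users[feature], categories)
    q = empirical_distribution(non_users[feature], categories)
    scores[feature] = hellinger(p, q)

# Features with the largest distance separate the two groups best.
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy data, `device` is distributed identically in both groups and scores 0, while `region` differs sharply and ranks first. With real data, distances estimated from small samples are noisy, so a small nonzero value is not by itself evidence that a feature separates the groups.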
timleathart
Brian Spiering