
I was investigating scikit-learn's implementation of the EM algorithm for fitting Gaussian Mixture Models, and I was wondering how they came up with using the average log-likelihood instead of the sum of the log-likelihoods to test convergence.

I see that it should cause the algorithm to converge faster (given their default parameters), but where does that idea come from?

Does anyone know if they based this part of the implementation on a specific paper, or if they just came up with it and used it?

In most explanations of the EM algorithm I have come across, they use log_likelihoods.sum() instead of log_likelihoods.mean().
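For context, the convergence test being described can be sketched like this (a minimal illustration, not scikit-learn's actual code; the function name `em_converged` and the tolerance value are assumptions for the example):

```python
import numpy as np

def em_converged(log_likelihoods, prev_mean_ll, tol=1e-3):
    """Sketch of the convergence test described above: compare the
    change in the *average* per-sample log-likelihood against tol,
    rather than the change in the summed log-likelihood."""
    mean_ll = log_likelihoods.mean()  # many textbooks use .sum() here
    return abs(mean_ll - prev_mean_ll) < tol, mean_ll

# Example: per-sample log-likelihoods from one EM iteration
# (hypothetical values, for illustration only).
ll = np.array([-1.0, -1.0, -1.0])
converged, mean_ll = em_converged(ll, prev_mean_ll=-1.0005)
print(converged, mean_ll)
```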

1 Answer


It makes unit testing easier: the stopping criterion becomes invariant to the size of the sample, so the same tolerance works regardless of how many data points you have.

Reference: the GitHub discussion that led to the change.
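The invariance is easy to see numerically: if you repeat the same sample ten times, the summed log-likelihood scales by ten, but the mean is unchanged. A quick sketch (the log-likelihood values are hypothetical, for illustration only):

```python
import numpy as np

# Per-sample log-likelihoods for a small sample, and for the same
# sample repeated 10 times (hypothetical values).
ll_small = np.array([-2.0, -1.5, -2.5])
ll_large = np.tile(ll_small, 10)

# The sum scales with the sample size, so a fixed tolerance on the
# summed log-likelihood means something different for every dataset.
print(ll_small.sum(), ll_large.sum())    # -6.0 vs -60.0

# The mean does not, so one tolerance value works for any n.
print(ll_small.mean(), ll_large.mean())  # -2.0 vs -2.0
```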
