
I was investigating scikit-learn's implementation of the EM algorithm for fitting Gaussian Mixture Models, and I was wondering how they came up with using the average log-likelihood instead of the sum of the log-likelihoods to test convergence.

I see that it should cause the algorithm to converge faster (given their default parameters), but where does that idea come from?

Does anyone know if they based this part of the implementation on a specific paper, or if they just came up with it and used it?

In most explanations of the EM algorithm I have come across, they use log_likelihoods.sum() instead of log_likelihoods.mean().
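For context, the convergence test being described can be sketched like this (a minimal illustration, not scikit-learn's actual code; the function name `em_converged` and the tolerance value are assumptions for the example):

```python
import numpy as np

def em_converged(log_likelihoods, prev_mean_ll, tol=1e-3):
    """Sketch of the convergence test described above: compare the
    change in the *average* per-sample log-likelihood against tol,
    rather than the change in the summed log-likelihood."""
    mean_ll = log_likelihoods.mean()  # many textbooks use .sum() here
    return abs(mean_ll - prev_mean_ll) < tol, mean_ll

# Example: per-sample log-likelihoods from one EM iteration
# (hypothetical values, for illustration only).
ll = np.array([-1.0, -1.0, -1.0])
converged, mean_ll = em_converged(ll, prev_mean_ll=-1.0005)
print(converged, mean_ll)
```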

1 Answer


It makes unit testing easier: the stopping criterion becomes invariant to the size of the sample, so the same tolerance works regardless of how many data points you have.

Reference: the GitHub discussion that led to the change.
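The invariance is easy to see numerically: if you repeat the same sample ten times, the summed log-likelihood scales by ten, but the mean is unchanged. A quick sketch (the log-likelihood values are hypothetical, for illustration only):

```python
import numpy as np

# Per-sample log-likelihoods for a small sample, and for the same
# sample repeated 10 times (hypothetical values).
ll_small = np.array([-2.0, -1.5, -2.5])
ll_large = np.tile(ll_small, 10)

# The sum scales with the sample size, so a fixed tolerance on the
# summed log-likelihood means something different for every dataset.
print(ll_small.sum(), ll_large.sum())    # -6.0 vs -60.0

# The mean does not, so one tolerance value works for any n.
print(ll_small.mean(), ll_large.mean())  # -2.0 vs -2.0
```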
