Difference between using RMSE and nDCG to evaluate Recommender Systems

Question

What kind of error measures do RMSE and nDCG give while evaluating a recommender system, and how do I know when to use one over the other? If you could give an example of when to use each, that would be great as well!

score 9 · Accepted Answer · answered Jun 18 '14 at 22:58

nDCG is used to evaluate a golden ranked list (typically human judged) against your output ranked list. The more is the correlation between the two ranked lists, i.e. the more similar are the ranks of the relevant items in the two lists, the closer is the value of nDCG to 1.

RMSE (Root Mean Squared Error) is typically used to evaluate regression problems where the output (a predicted scalar value) is compared with the true scalar value output for a given data point.

So, if you are simply recommending a score (such as recommending a movie rating), then use RMSE. Whereas, if you are recommending a list of items (such as a list of related movies), then use nDCG.

score 3 · Answer 2 · answered Oct 09 '14 at 02:35

nDCG is a ranking metric and RMSE is not. In the context of recommender systems, you would use a ranking metric when your ratings are implicit (e.g., item skipped vs. item consumed) rather than explicit (the user provides an actual number, a la Netflix).

Difference between using RMSE and nDCG to evaluate Recommender Systems

2 Answers2