
I have a logistic regression classifier that shows differing levels of performance for precision and recall at different probability boundaries as follows:

[Figure: precision and recall plotted against the classifier's decision threshold]

The default threshold the classifier uses to decide which class an example belongs to is 0.5. However, am I right in understanding that, to get the best trade-off between precision and recall, I should set the decision threshold to about 0.82, where the two curves intersect in the plot above? That can be done in Scikit-Learn, but I want to make sure that I am drawing the correct conclusions. Any advice would be appreciated.
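For concreteness, here is a minimal sketch of applying a custom threshold in scikit-learn. The `make_classification` data is just a stand-in for the real dataset, and 0.82 is only the candidate value read off the plot:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy data so the sketch is runnable; replace with the real dataset.
X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression().fit(X_train, y_train)

threshold = 0.82                            # candidate threshold read off the plot
proba = clf.predict_proba(X_test)[:, 1]     # predicted probability of the positive class
y_pred = (proba >= threshold).astype(int)   # apply the custom threshold instead of the default 0.5
```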

Sandy Lee

1 Answer


The intersection of the precision and recall curves is certainly a good choice, but it's not the only one possible.

The choice depends primarily on the application: in some applications very high recall is crucial (e.g. a fire alarm system), whereas in others precision matters more (e.g. deciding whether somebody needs a risky medical treatment). Naturally, if your application needs high recall you would pick a threshold below 0.6, and if it needs high precision you would pick a threshold around 0.85-0.9.

If neither of these cases applies, people usually choose an evaluation metric to optimize: the F1-score is a common one, sometimes accuracy (but don't use accuracy if there is strong class imbalance). The F1-score is likely to be optimal around the point where the two curves intersect, but that's not guaranteed: for example, the optimum might sit a bit before 0.8, where recall decreases slowly while precision increases quickly (this is just an illustration, I'm not sure of course).
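As a rough sketch of what that could look like in practice (assuming you have the true labels `y_test` and the predicted positive-class probabilities `proba`, e.g. from the snippet in the question), you can scan the thresholds returned by `precision_recall_curve` and pick the one that maximizes F1:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# precision and recall are evaluated at every candidate threshold;
# they have one more entry than thresholds, so the last point is dropped below.
precision, recall, thresholds = precision_recall_curve(y_test, proba)

# F1 at each threshold (small epsilon avoids division by zero)
f1 = 2 * precision[:-1] * recall[:-1] / (precision[:-1] + recall[:-1] + 1e-12)

best = np.argmax(f1)
print(f"best threshold ~ {thresholds[best]:.2f}, F1 = {f1[best]:.3f}")
```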

My point is that, even though it's a perfectly reasonable choice in this case, in general there is no particular reason to automatically pick the point where precision and recall are equal.

Erwan