
ML models must strike a balance between predictive power and the ability to generalize. Therefore, I split the data into train/test sets and calculate metrics on both. I often see instructions in other people's code like "if the overfitting is too large, try reducing the learning rate".
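For example, the split-and-compare step looks roughly like this (a minimal sketch with a synthetic dataset and a stand-in model, not my actual pipeline):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Placeholder data and model, just to illustrate the workflow.
X, y = make_classification(n_samples=5000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = GradientBoostingClassifier().fit(X_train, y_train)

# ROC AUC on both splits; a large train/test gap signals overfitting.
auc_train = roc_auc_score(y_train, model.predict_proba(X_train)[:, 1])
auc_test = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"train AUC = {auc_train:.3f}, test AUC = {auc_test:.3f}")
```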

But what if the overfitted model scores higher on test? I am running a hyperparameter search with an overfitting penalty: `score = roc_auc_test - a * |roc_auc_train - roc_auc_test|`, with `a = 1/4`. In other words, the model is allowed 4 points of train/test gap for every extra point it gains on test. I also tried running the search on the test metric alone, and the overfitted model still beats the penalized one. For example, `roc_auc_train = 0.70, roc_auc_test = 0.70` vs `roc_auc_train = 0.95, roc_auc_test = 0.75`.
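For concreteness, the penalized objective looks like this (a minimal sketch; the function name is my own, and the two calls reproduce the example models above):

```python
def penalized_score(roc_auc_train: float, roc_auc_test: float, a: float = 0.25) -> float:
    """Test AUC minus a penalty proportional to the train/test gap."""
    return roc_auc_test - a * abs(roc_auc_train - roc_auc_test)

# The two candidate models above:
print(penalized_score(0.70, 0.70))  # 0.70: no gap, no penalty
print(penalized_score(0.95, 0.75))  # 0.75 - 0.25 * 0.20 = 0.70
```

Note that with `a = 1/4` these two examples tie exactly at 0.70, so the penalized search has no reason to prefer one over the other.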

Which model should I choose? What are the pitfalls of choosing the overfitted model? For example, might it be less stable over time, or are there other problems?

Andrew