
I am training a CNN model.

With the first model, I got a training accuracy of 87% (0.29 loss) and a validation accuracy of 87% (0.30 loss) at the 5th epoch. I kept training for a total of 15 epochs and, as expected, it started overfitting: training accuracy rose to 97% (0.01 loss) while validation accuracy stayed at 87% (0.35 loss).

For the second model, I used data augmentation and a dropout layer to deal with overfitting, and trained for a total of 10 epochs. These are the results. 5th epoch: training accuracy 77% (0.45 loss) and validation accuracy 77% (0.41 loss). 10th epoch: training accuracy 82% (0.38 loss) and validation accuracy 82% (0.35 loss).

From the loss and accuracy graphs below, it is clear that the model overfits in the first scenario but not in the second.

Scenario One

[Figure: loss and accuracy curves for the first model]

Scenario Two

[Figure: loss and accuracy curves for the second model]

My question is: which model is better in the real world, judging by accuracy? Model 1 stopped at epoch 5 with 87% validation accuracy, or model 2, which has not overfit, with 82% validation accuracy? I understand that on accuracy alone model 1 looks better, although it eventually overfits. If I stop its training with early stopping or something similar, will it be a better model than my second one?

Rishabh Sharma

2 Answers


Right now I would say model 1 is better, since it has better validation results (if you take the model at epoch 5, not the overfitted one).

However, I would proceed by training model 2 for more epochs and checking whether the results improve.
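To make "take the model at epoch 5" concrete, here is a minimal sketch of the early-stopping rule in plain Python. It is not tied to any framework. The function name `early_stopping_epoch`, the `patience` and `min_delta` parameters, and the validation losses are all hypothetical and made up for illustration; they are not the asker's actual curve.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (best_epoch, stopped_epoch), both 1-indexed.

    Training stops once the validation loss has failed to improve by more
    than `min_delta` for `patience` consecutive epochs. `best_epoch` is the
    epoch whose weights you would keep.
    """
    best_loss = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            # New best validation loss: remember it and reset the counter.
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return best_epoch, epoch
    # Never ran out of patience: training went through every epoch.
    return best_epoch, len(val_losses)


# Made-up validation losses, for illustration only.
illustrative_val_losses = [0.52, 0.41, 0.35, 0.32, 0.30, 0.31, 0.33, 0.34, 0.35]
best, stopped = early_stopping_epoch(illustrative_val_losses, patience=3)
print(best, stopped)  # 5 8
```

If you happen to be using Keras, the `EarlyStopping` callback with `monitor="val_loss"` and `restore_best_weights=True` applies the same idea during `fit`. Either way, compare the two models on the weights from their best validation epoch, not from their last one.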

Let's try

What you need to consider here is the cost of wrong predictions from your model, so look at the F1 score as well. Also consider your class distribution. Otherwise, these two models do not look very different at first glance.
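As a rough illustration of why class distribution matters, the sketch below computes accuracy and F1 by hand for a binary problem. The labels are made up (90 negatives, 10 positives) and the helper name `binary_metrics` is hypothetical. The point is only that a model with higher accuracy can still have a much worse F1 score.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall, f1) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Guard against division by zero when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1


# Made-up, imbalanced ground truth: 90 negatives followed by 10 positives.
y_true = [0] * 90 + [1] * 10

# Hypothetical model A: always predicts the majority class.
pred_a = [0] * 100
# Hypothetical model B: 12 false positives, finds 8 of the 10 positives.
pred_b = [1] * 12 + [0] * 78 + [1] * 8 + [0] * 2

acc_a, _, _, f1_a = binary_metrics(y_true, pred_a)
acc_b, _, _, f1_b = binary_metrics(y_true, pred_b)
print(f"A: accuracy={acc_a:.2f}, F1={f1_a:.2f}")  # A: accuracy=0.90, F1=0.00
print(f"B: accuracy={acc_b:.2f}, F1={f1_b:.2f}")  # B: accuracy=0.86, F1=0.53
```

In practice you would more likely use a library routine such as scikit-learn's `f1_score`. The hand-rolled version is here only to show what goes into the number.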