
I am using transfer learning to train a binary image classification model with Keras' pretrained VGG16 model. The code can be found below:

from glob import glob

training_dir = '/Users/rishabh/Desktop/CyberBoxer/data/train'
validation_dir = '/Users/rishabh/Desktop/CyberBoxer/data/validation'
image_files = glob(training_dir + '/*/*.jpg')
valid_image_files = glob(validation_dir + '/*/*.jpg')
# importing the libraries
from keras.models import Model
from keras.layers import Flatten, Dense
from keras.applications import VGG16
#from keras.preprocessing import image

IMAGE_SIZE = [64, 64]  # we will keep the image size as (64,64). You can increase the size for better results.

# loading the weights of VGG16 without the top layer. These weights are trained on the ImageNet dataset.
vgg = VGG16(input_shape = IMAGE_SIZE + [3], weights = 'imagenet', include_top = False)  # input_shape = (64,64,3) as required by VGG

# this will exclude the initial layers from the training phase as they have already been trained.
for layer in vgg.layers:
    layer.trainable = False

x = Flatten()(vgg.output)
#x = Dense(128, activation = 'relu')(x)   # we can add a new fully connected layer but it will increase the execution time.
num_classes = 2  # binary classification
x = Dense(num_classes, activation = 'softmax')(x)  # adding the output layer with a softmax activation

model = Model(inputs = vgg.input, outputs = x)

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
from keras.preprocessing.image import ImageDataGenerator
from keras.applications.vgg16 import preprocess_input

training_datagen = ImageDataGenerator(
                                rescale=1./255,   # all pixel values will be between 0 and 1
                                shear_range=0.2, 
                                zoom_range=0.2,
                                horizontal_flip=True,
                                preprocessing_function=preprocess_input)

validation_datagen = ImageDataGenerator(rescale = 1./255, preprocessing_function=preprocess_input)

training_generator = training_datagen.flow_from_directory(training_dir, target_size = IMAGE_SIZE, batch_size = 200, class_mode = 'categorical')
validation_generator = validation_datagen.flow_from_directory(validation_dir, target_size = IMAGE_SIZE, batch_size = 200, class_mode = 'categorical')

training_images = 3717
validation_images = 885

history = model.fit_generator(training_generator,
               steps_per_epoch = 3717,  # this should be equal to the total number of images in the training set. Change this for better results.
               epochs = 1,  # change this for better results
               validation_data = validation_generator,
               validation_steps = 885)  # this should be equal to total number of images in validation set.

I am training it on just 3700 images, but a single epoch is still taking around 10-12 hours. Is this supposed to happen? Am I doing anything wrong? I had to downgrade my Keras to 2.1.4 for the code to run, so could that be affecting the learning?


1 Answer


It might have to do with ImageDataGenerator. Data augmentation can be computationally expensive. Remove that code and see if the model trains faster.
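For reference, here is a minimal sketch of the same generators with the augmentation arguments stripped out, reusing the directories, IMAGE_SIZE, and batch size from the question (only the VGG16 preprocess_input step is kept):

from keras.preprocessing.image import ImageDataGenerator
from keras.applications.vgg16 import preprocess_input

# no shear/zoom/flip -- only the VGG16 preprocessing step
plain_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)

training_generator = plain_datagen.flow_from_directory(
    training_dir, target_size=IMAGE_SIZE, batch_size=200, class_mode='categorical')
validation_generator = plain_datagen.flow_from_directory(
    validation_dir, target_size=IMAGE_SIZE, batch_size=200, class_mode='categorical')

If the epoch time drops substantially with this version, the augmentation pipeline was the bottleneck.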

Brian Spiering