27

I am a newbie to machine learning and Keras, and I am now working on a multi-class image classification problem using Keras. The input is tagged images. After some pre-processing, the training data is represented in a Python list as:

[["dog", "path/to/dog/imageX.jpg"],["cat", "path/to/cat/imageX.jpg"], 
 ["bird", "path/to/cat/imageX.jpg"]]

the "dog", "cat", and "bird" are the class labels. I think one-hot encoding should be used for this problem but I am not very clear on how to deal it with these string labels. I've tried sklearn's LabelEncoder() in this way:

from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()
transformed_label = encoder.fit_transform(["dog", "cat", "bird"])
print(transformed_label)

And the output is [2 1 0], which is different from my expected output of something like [[1,0,0],[0,1,0],[0,0,1]]. It can be done with some extra coding, but I'd like to know if there is a "standard" or "traditional" way to deal with it?

Dracarys

3 Answers

20

Sklearn's LabelEncoder module finds all classes and assigns each a numeric id starting from 0. This means that whatever your class representations are in the original data set, you now have a simple, consistent way to represent each. It doesn't do one-hot encoding, although, as you correctly identify, it is pretty close, and you can use those ids to quickly generate one-hot encodings in other code, as in the sketch below.
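For example, here is a minimal sketch of turning those ids into one-hot vectors with plain NumPy (the identity-matrix trick is just one common way to do it):

import numpy as np
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()
ids = encoder.fit_transform(["dog", "cat", "bird"])  # numeric ids, e.g. [2 1 0]
one_hot = np.eye(len(encoder.classes_))[ids]  # pick identity-matrix rows as one-hot vectors
print(one_hot)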

If you want one-hot encoding, you can use LabelBinarizer instead. This works very similarly:

from sklearn.preprocessing import LabelBinarizer

encoder = LabelBinarizer()
transformed_label = encoder.fit_transform(["dog", "cat", "bird"])
print(transformed_label)

Output:

[[0 0 1]
 [0 1 0]
 [1 0 0]]
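
As a usage note, the fitted LabelBinarizer can also invert the encoding, which is handy for mapping model predictions back to label strings. A sketch, assuming encoder is the fitted binarizer from above and the prediction rows are hypothetical softmax outputs:

import numpy as np

predictions = np.array([[0.1, 0.2, 0.7],  # hypothetical softmax outputs
                        [0.8, 0.1, 0.1]])
# inverse_transform takes the argmax of each row and returns the original labels
print(encoder.inverse_transform(predictions))  # ['dog' 'bird']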
Neil Slater
0

With the ImageDataGenerator feature in Keras we can handle the labels directly. Sample code:

import tensorflow as tf

datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    validation_split=0.2)

img_size = 128
train_generator = datagen.flow_from_directory(
    'train',
    target_size=(img_size, img_size),
    subset='training',
    batch_size=32)
X, y = next(train_generator)

print('Input features shape', X.shape)
print('Actual labels shape', y.shape)

Another advantage of this approach is that when we do prediction on a new file, we can use train_generator.class_indices to map the predicted indices back to the actual string names, as in the sketch below.
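
A sketch of that mapping, assuming a hypothetical trained classifier named model and a batch of images X as above:

import numpy as np

# class_indices maps label name -> index; invert it to get index -> name
index_to_label = {v: k for k, v in train_generator.class_indices.items()}

pred = model.predict(X)  # model is a hypothetical trained classifier
pred_labels = [index_to_label[i] for i in np.argmax(pred, axis=1)]
print(pred_labels)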

Vivek
0

Alternatively, you can use sparse_categorical_crossentropy as the loss function, and then you don't need one-hot encoding at all; the integer ids from LabelEncoder can be used as targets directly. Sample code:

model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
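
A minimal end-to-end sketch of this, with placeholder layer sizes and dummy data just for illustration:

import numpy as np
import tensorflow as tf
from sklearn.preprocessing import LabelEncoder

y = LabelEncoder().fit_transform(["dog", "cat", "bird", "cat"])  # integer ids, e.g. [2 1 0 1]
X = np.random.rand(4, 32)  # dummy features, stand-ins for real image data

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32,)),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax'),  # one output per class
])
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
model.fit(X, y, epochs=1)  # integer labels work directly with the sparse loss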

More info at the Keras website.

Mo Abbasi