
I'm following Aurélien Géron's book on Machine Learning. The following code trains and evaluates a neural network with a sparse categorical cross-entropy loss function on the Fashion MNIST data set. How come I get such a strange value for the loss?

import tensorflow as tf
from tensorflow import keras
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

fashion_mnist=keras.datasets.fashion_mnist

(X_train_full,y_train_full),(X_test,y_test)=fashion_mnist.load_data()

X_valid , X_train = X_train_full[:5000]/255.0 , X_train_full[5000:]/255.0
y_valid , y_train = y_train_full[:5000] , y_train_full[5000:]

model=keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28,28]))
model.add(keras.layers.Dense(300,activation="relu"))
model.add(keras.layers.Dense(100,activation="relu"))
model.add(keras.layers.Dense(10,activation="softmax"))

model.compile(loss="sparse_categorical_crossentropy",optimizer="sgd",metrics=["accuracy"])
history=model.fit(X_train,y_train,epochs=30,validation_data=(X_valid,y_valid))

#some instructions with *history* for plotting a graph

model.evaluate(X_test,y_test)

and the output I get is [55.21640347443819, 0.8577]

How come I get a loss over 1?

1 Answer


The reason you are getting such a high loss is inconsistent normalization. You normalize X_train and X_valid, but you never normalize X_test, so the model is evaluated on pixel values in [0, 255] while it was trained on values in [0.0, 1.0]. After adding normalization of X_test to your code, the final test loss is much lower.

fashion_mnist = keras.datasets.fashion_mnist

(X_train_full,y_train_full), (X_test,y_test) = fashion_mnist.load_data()

X_valid , X_train = X_train_full[:5000]/255.0 , X_train_full[5000:]/255.0
y_valid , y_train = y_train_full[:5000] , y_train_full[5000:]

# Normalizing X_test as well
X_test = X_test/255.0

model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28,28]))
model.add(keras.layers.Dense(300,activation="relu"))
model.add(keras.layers.Dense(100,activation="relu"))
model.add(keras.layers.Dense(10,activation="softmax"))

model.compile(loss="sparse_categorical_crossentropy",optimizer="sgd",metrics=["accuracy"])
history = model.fit(X_train,y_train,epochs=30,validation_data=(X_valid,y_valid))

model.evaluate(X_test,y_test)
# [0.328696608543396, 0.8866999745368958]
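One way to avoid this class of bug is to route every split through a single preprocessing function instead of dividing by 255 inline at each use site. A minimal sketch (the `preprocess` helper is our own naming, not from the book):

```python
import numpy as np

def preprocess(images):
    """Scale uint8 pixel values in [0, 255] to float32 values in [0.0, 1.0]."""
    return images.astype("float32") / 255.0

# Apply the same transform to every split so train/valid/test all agree
raw = np.array([[0, 128, 255]], dtype="uint8")
scaled = preprocess(raw)
print(scaled.min(), scaled.max())  # values now lie in [0.0, 1.0]
```

Recent Keras versions also offer a `keras.layers.Rescaling(1./255)` layer you can place right after `Flatten`, which bakes the scaling into the model itself so raw images can be fed to `fit` and `evaluate` alike.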
Oxbowerce