I'm following Aurélien Géron's book on Machine Learning. The following code trains and evaluates a neural network with a sparse categorical cross-entropy loss function on the Fashion MNIST data set. How come I get such a strange value for the loss?
import tensorflow as tf
from tensorflow import keras
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()
X_valid, X_train = X_train_full[:5000] / 255.0, X_train_full[5000:] / 255.0
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]
model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28, 28]))
model.add(keras.layers.Dense(300, activation="relu"))
model.add(keras.layers.Dense(100, activation="relu"))
model.add(keras.layers.Dense(10, activation="softmax"))
model.compile(loss="sparse_categorical_crossentropy", optimizer="sgd", metrics=["accuracy"])
history = model.fit(X_train, y_train, epochs=30, validation_data=(X_valid, y_valid))
# some code using *history* to plot the learning curves
model.evaluate(X_test, y_test)
The output I get is [55.21640347443819, 0.8577]. How can the loss be greater than 1?
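For context on what I've checked so far: as I understand it, cross-entropy for one example is -log(p), where p is the probability the model assigns to the true class, so the loss is bounded below by 0 but not bounded above. A quick sketch (plain Python, no TensorFlow, and the probability values are made up for illustration):

```python
import math

# Cross-entropy for a single example is -log(p), where p is the
# probability the model assigns to the true class. Small p means
# a large loss; there is no upper bound of 1.
for p in (0.9, 0.5, 0.1, 1e-3):
    print(f"p = {p:g}  ->  loss = {-math.log(p):.4f}")

# A mean loss around 55 would correspond to the model assigning
# the true class a probability near e**-55 on average:
print(math.exp(-55.216))  # roughly 1e-24
```

So a loss above 1 is possible in principle; what puzzles me is why it is this extreme here while the accuracy is still 0.8577.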