I am trying to follow along with the guide at http://www.hackevolve.com/where-cnn-is-looking-grad-cam/, using TensorFlow's new eager execution mode. One line in particular has me stumped:
grads = K.gradients(class_output, last_conv_layer.output)[0]
I understand that it is finding the gradients between the last convolutional layer and the output for the particular class. However, I cannot figure out how to accomplish this using GradientTape, since (a) both are tensors and not variables, and (b) one is not directly derived from the other (their feature maps already exist, so without a graph they are effectively independent).
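To pin down the quantity I am after, here is a toy NumPy sketch of what I believe that line computes. Everything in it is made up for illustration and is not my model: the shapes, the random `weights`, and the head (global average pooling followed by a linear score, with no softmax). It estimates the derivative of one class score with respect to every entry of the feature maps by finite differences, then compares it with the closed form for this particular head.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy shapes: a 4x4 feature map with 3 channels, and 2 classes.
H, W, K, C = 4, 4, 3, 2
A = rng.normal(size=(H, W, K))     # stand-in for last_conv_layer.output (one image)
weights = rng.normal(size=(K, C))  # stand-in for a linear layer after pooling


def class_score(feature_maps, class_idx):
    """Linear class score: global average pool, then a dot product."""
    pooled = feature_maps.mean(axis=(0, 1))  # shape (K,)
    return float(pooled @ weights[:, class_idx])


def numeric_grad(feature_maps, class_idx, eps=1e-6):
    """Central finite differences of the score w.r.t. each feature-map entry."""
    grad = np.zeros_like(feature_maps)
    for idx in np.ndindex(*feature_maps.shape):
        up = feature_maps.copy()
        down = feature_maps.copy()
        up[idx] += eps
        down[idx] -= eps
        grad[idx] = (class_score(up, class_idx) - class_score(down, class_idx)) / (2 * eps)
    return grad


class_idx = 0
grads = numeric_grad(A, class_idx)  # same shape as the feature maps

# For this toy head the derivative is weights[k, class_idx] / (H * W) everywhere.
analytic = np.broadcast_to(weights[:, class_idx] / (H * W), A.shape)

# Averaging the gradient over the spatial axes gives one weight per channel.
channel_weights = grads.mean(axis=(0, 1))
```

So the result should have the same shape as the feature maps, with one derivative per spatial position and channel. What I am missing is how to get the same thing out of GradientTape for a real Keras model.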
Edit: Some more information. No takers yet on answering, so I'll go ahead and add what I have tried since I posted the question:
The obvious first step is reproducing the first part of the guide with eager execution:
import numpy as np
import cv2
import tensorflow as tf
tf.enable_eager_execution()
model = tf.keras.models.load_model("model.h5")
print(type(model))
# tensorflow.python.keras.engine.sequential.Sequential
from dataset import prepare_dataset
_, ds, _, _, _, _ = prepare_dataset() # ds is a tf.data.Dataset
print(type(ds))
# tensorflow.python.data.ops.dataset_ops.DatasetV1Adapter
it = ds.make_one_shot_iterator()
img, label = it.get_next()
print(type(img), img.shape)
# <class 'tensorflow.python.framework.ops.EagerTensor'> (192, 192, 3)
print(type(label), label.shape)
# <class 'tensorflow.python.framework.ops.EagerTensor'> (2,)
img = np.expand_dims(img, axis=0)
print(img.shape)
# (1, 192, 192, 3)
predictions = model.predict(img)
print(predictions)
# array([[0.9711799 , 0.02882008]], dtype=float32)
class_idx = np.argmax(predictions[0])
print(class_idx)
# 0
class_output = model.output[:, class_idx]
print(model.output, class_output)
# Tensor("Softmax:0", shape=(?, 2), dtype=float32) Tensor("strided_slice_5:0", dtype=float32)
# I use tf.keras.layers.Activation instead of the activation parameter of conv2d,
# so last_conv_layer actually points to the layer after the last conv layer.
# Is that not correct?
last_conv_layer = model.get_layer('activation_6')
"""
Now, the fun part: how do I compute the gradient of class_output with respect to
the output of the last convolutional layer?
"""
One attempt is using reduce_sum and multiply to get the desired gradient (ignore the class_output step):
with tf.GradientTape() as tape:
    print(label)
    # tf.Tensor([1. 0.], shape=(2,), dtype=float32)
    y_c = tf.reduce_sum(tf.multiply(model.output, label))
    print(y_c)
    # Tensor("Sum_4:0", shape=(), dtype=float32)
    last_conv_layer = model.get_layer('activation_6')

grad = tape.gradient(y_c, last_conv_layer.output)
However, grad is None in this setup.
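In case it helps narrow things down, here is a toy check that does not use my model. The names `features`, `kernel`, and `head` are hypothetical stand-ins: `features` plays the role of the conv feature maps and `head` the role of everything after the last conv layer. It compares a score computed before the tape starts recording with the same score computed while the tape is recording.

```python
import tensorflow as tf

# TF 1.x needs eager mode switched on explicitly; TF 2.x is eager by default.
if not tf.executing_eagerly():
    tf.compat.v1.enable_eager_execution()

# Hypothetical stand-ins for the feature maps and the layers that follow them.
features = tf.constant([[1.0, 2.0, 3.0]])
kernel = tf.constant([[0.5], [-1.0], [2.0]])


def head(x):
    """Toy 'rest of the network': a matmul reduced to a scalar score."""
    return tf.reduce_sum(tf.matmul(x, kernel))


# Case 1: the score already exists before the tape starts recording.
score_outside = head(features)
with tf.GradientTape() as tape:
    tape.watch(features)  # plain tensors can be watched, not only variables
grad_outside = tape.gradient(score_outside, features)

# Case 2: the same score, but computed while the tape is recording.
with tf.GradientTape() as tape:
    tape.watch(features)
    score_inside = head(features)
grad_inside = tape.gradient(score_inside, features)

print(grad_outside)
print(grad_inside)
```

As far as I can tell, the first gradient comes back as None and the second does not. That suggests point (a) is not the real blocker, since `tape.watch` accepts tensors, and that the None in my attempt comes from point (b): `model.output` and `last_conv_layer.output` are symbolic tensors that were never computed under the tape. I still do not see how to get at the intermediate activation of a loaded Keras model inside the tape.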