Tensorflow Eager Execution - Compute gradient between two layers of a sequential model

Question

I am trying to follow along with the guide at http://www.hackevolve.com/where-cnn-is-looking-grad-cam/, using Tensorflow's new eager execution mode. One line in particular has me stumped:

grads = K.gradients(class_output, last_conv_layer.output)[0]

I understand that it is finding the gradients between the last convolutional layer and the output for the particular class. However, I cannot figure out how to accomplish this using GradientTape, since (a) both are tensors and not variables, and (b) one is not directly derived from the other (their feature maps already exist, so without a graph they are effectively independent).

Edit: Some more information. No takers yet on answering, so I'll go ahead and add what I have tried since I posted the question:

The obvious steps are reproducing the first part with Eager execution.

import numpy as np
import cv2
import tensorflow as tf
tf.enable_eager_execution()

model = tf.keras.models.load_model("model.h5")
print(type(model))
# tensorflow.python.keras.engine.sequential.Sequential

from dataset import prepare_dataset
_, ds, _, _, _, _ = prepare_dataset() # ds is a tf.data.Dataset
print(type(ds))
# tensorflow.python.data.ops.dataset_ops.DatasetV1Adapter

it = ds.make_one_shot_iterator()
img, label = it.get_next()
print(type(img), img.shape)
# <class 'tensorflow.python.framework.ops.EagerTensor'> (192, 192, 3)

print(type(label), label.shape)
# <class 'tensorflow.python.framework.ops.EagerTensor'> (2,)

img = np.expand_dims(img, axis=0)
print(img.shape)
# (1, 192, 192, 3)

predictions = model.predict(img)
print(predictions)
# array([[0.9711799 , 0.02882008]], dtype=float32)

class_idx = np.argmax(predictions[0])
print(class_idx)
# 0

class_output = model.output[:, class_idx]
print(model.output, class_output)
# Tensor("Softmax:0", shape=(?, 2), dtype=float32) Tensor("strided_slice_5:0", dtype=float32)

# I use tf.keras.layers.Activation instead of the activation parameter of conv2d,
# so last_conv_layer actually points to the layer after the last conv layer.
# Is that not correct?
last_conv_layer = model.get_layer('activation_6') 

"""
Now, the fun part: how do I compute the gradient of class_output with respect to
the output of the last convolutional layer?
"""

One attempt is using reduce_sum and multiply to get the desired gradient (ignore the class_output step):

with tf.GradientTape() as tape: 
    print(label)
    # tf.Tensor([1. 0.], shape=(2,), dtype=float32)
    y_c = tf.reduce_sum(tf.multiply(model.output, label))
    print(y_c)
    # Tensor("Sum_4:0", shape=(), dtype=float32)
    last_conv_layer = model.get_layer('activation_6')

grad = tape.gradient(y_c, last_conv_layer.output)

However, grad is None in this setup.

score 0 · Answer 1 · answered Nov 06 '19 at 08:50

Have you tried putting code from predictions = model.predict(img) onwards into the GradientTape context manager?

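The thing is, the tape only knows about operations that run while it is recording. If the forward pass from last_conv_layer.output to model.output is not executed inside the GradientTape context, the backprop chain is effectively broken and tape.gradient returns None.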
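To illustrate the point, here is a minimal sketch, assuming TensorFlow 2.x where eager execution is on by default. The tensors x and w are made-up toy values and have nothing to do with the model in the question. It compares a gradient taken with respect to a tensor computed outside the tape with one taken with respect to an intermediate tensor computed inside it.

```python
import tensorflow as tf

# Toy values (assumptions for illustration only).
x = tf.constant([[1.0, 2.0]])    # plain tensor, not a tf.Variable
w = tf.constant([[3.0], [4.0]])

# Case 1: the intermediate tensor is computed OUTSIDE the tape.
hidden_outside = x * 2.0
with tf.GradientTape() as tape:
    tape.watch(x)
    out = tf.matmul(x * 2.0, w)  # recomputed; not derived from hidden_outside
grad_broken = tape.gradient(out, hidden_outside)
print(grad_broken)               # no recorded path from hidden_outside to out

# Case 2: the whole forward pass runs INSIDE the tape.
with tf.GradientTape() as tape:
    tape.watch(x)                # tensors must be watched explicitly
    hidden = x * 2.0             # intermediate tensor, recorded by the tape
    out = tf.matmul(hidden, w)
grad_ok = tape.gradient(out, hidden)
print(grad_ok)                   # d(out)/d(hidden) is w transposed
```

This also covers point (a) of the question: the target and the source may both be plain tensors, as long as the source was produced from something the tape is watching (trainable variables are watched automatically, constants need tape.watch).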

score 0 · Answer 2 · answered Dec 12 '22 at 16:14

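You were almost there! Build a second model that returns both the last conv layer's output and the predictions, then run it inside the tape. Try the code below.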

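# Name of the layer whose output feature map we differentiate against
# ('activation_6' in the question)
last_conv_layer_name = "activation_6"
pred_index = None  # set to an integer to pick a class instead of the top prediction

# Model that maps the input image to the activations of the last conv layer
# and to the final predictions
grad_model = tf.keras.models.Model(
    model.inputs, [model.get_layer(last_conv_layer_name).output, model.output]
)

# Then, we compute the gradient of the top predicted class for our input image
# with respect to the activations of the last conv layer
with tf.GradientTape() as tape:
    last_conv_layer_output, preds = grad_model(img)
    if pred_index is None:
        pred_index = tf.argmax(preds[0])
    class_channel = preds[:, pred_index]

# This is the gradient of the output neuron (top predicted or chosen)
# with regard to the output feature map of the last conv layer
grads = tape.gradient(class_channel, last_conv_layer_output)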
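For anyone who wants to check this without the original model.h5, here is a self-contained sketch of the same idea, assuming TensorFlow 2.x. The tiny CNN, the layer name "last_conv_act", the 32x32 input size, and the random image are all hypothetical stand-ins, not the asker's actual setup.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the model in the question: a tiny CNN with a
# separate Activation layer after the convolution and a 2-class softmax.
inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(4, 3, name="conv")(inputs)
x = tf.keras.layers.Activation("relu", name="last_conv_act")(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(2, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

# Model returning both the conv activations and the predictions.
grad_model = tf.keras.Model(
    model.inputs, [model.get_layer("last_conv_act").output, model.output]
)

# Random image with a batch dimension (placeholder for a real input).
img = np.random.rand(1, 32, 32, 3).astype("float32")

with tf.GradientTape() as tape:
    # The forward pass runs inside the tape, so the path from the conv
    # activations to the class score is recorded.
    conv_out, preds = grad_model(img)
    class_idx = int(tf.argmax(preds[0]))
    class_channel = preds[:, class_idx]

# Gradient of the chosen class score w.r.t. the conv feature map.
grads = tape.gradient(class_channel, conv_out)
print(grads.shape)  # same shape as conv_out
```

The gradient has the same shape as the feature map, which is what the Grad-CAM guide then averages per channel to weight the activations.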
