I am implementing a quasi-Newton optimizer for TensorFlow. My question: when the Optimizer's apply_gradients function is called inside the minimize function, are the gradients applied to whatever values the tensors happen to have at that moment?
Cheers, Sergey