
Gradient descent can be used to minimize an objective function $\Phi:\mathbb{R}^d \to \mathbb{R}$ when we can evaluate $\Phi$ (and hence estimate its gradient, e.g., by finite differences) at any input of our choice.

However, my situation is a little different. I have an objective function $\Phi$ of the form

$$\Phi(x) = \Phi_1(x) + \Phi_2(x),$$

where I can evaluate $\Phi_1$ at any input of my choice, but I can't do the same for $\Phi_2$. Instead, I have access only to a thresholded (quantized) version of $\Phi_2$: I can evaluate $f_2:\mathbb{R}^d \to \{0,1\}$ at any input of my choice, where $f_2$ is defined by

$$f_2(x) = \begin{cases} 0 &\text{if } \Phi_2(x)\le t\\ 1 &\text{if } \Phi_2(x) > t\\ \end{cases}$$

and $t$ is a fixed threshold. You can assume that $\Phi_2$ is smooth and has all the nice properties you might like, but I can only evaluate $f_2$, not $\Phi_2$. How can I search for an $x$ that is likely to make $\Phi(x)$ as small as possible in this situation? Is there any way to adapt gradient descent or another mathematical optimization method to this setting?


Why I think there might be some hope: if we find $x',\delta \in \mathbb{R}^d$ such that $f_2(x')=0$ and $f_2(x'+\delta)=1$, where $x'$ is near our current point and $\delta \approx 0$, then we've learned some information about $\Phi_2$: the directional derivative of $\Phi_2$ in the direction $\delta$ is likely to be large near $x'$. It seems like it might be possible to build an algorithm that exploits this kind of information (a sketch of one such probing scheme is below). Are there any techniques to handle this kind of situation?
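To make this concrete, here is a minimal sketch (the bisection count, probe radius, and number of probes are made-up parameters, not part of the problem statement). Bisecting between a point where $f_2 = 0$ and one where $f_2 = 1$ locates a point on the level set $\Phi_2(x) = t$; averaging random probe directions around that point, signed by which side of the threshold each probe lands on, estimates the level-set normal, which is parallel to $\nabla \Phi_2$ there.

```python
import numpy as np

def boundary_point(f2, x_in, x_out, iters=50):
    """Bisect between x_in (where f2 = 0) and x_out (where f2 = 1)
    to locate a point on the level set Phi_2(x) = t."""
    for _ in range(iters):
        mid = 0.5 * (x_in + x_out)
        if f2(mid) == 0:
            x_in = mid
        else:
            x_out = mid
    return 0.5 * (x_in + x_out)

def estimate_normal(f2, x_b, radius=1e-3, n_probes=200, rng=None):
    """Monte-Carlo estimate of the outward normal of {x : Phi_2(x) = t}
    at a boundary point x_b: probe random unit directions and average
    them, signed by which side of the threshold each probe lands on.
    The result is approximately parallel to grad Phi_2(x_b)."""
    rng = np.random.default_rng() if rng is None else rng
    normal = np.zeros(x_b.shape[0])
    for _ in range(n_probes):
        u = rng.standard_normal(x_b.shape[0])
        u /= np.linalg.norm(u)
        sign = 1.0 if f2(x_b + radius * u) == 1 else -1.0
        normal += sign * u
    return normal / np.linalg.norm(normal)
```

This recovers only the direction of $\nabla \Phi_2$ at boundary points, not its magnitude, so any descent scheme built on it would have to combine that direction with $\nabla \Phi_1$ heuristically.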


1 Answer

I guess $f_2(x)$ is some sort of oracle that tells you whether $\Phi_2(x)$ is greater than $t$? If the set $\{x : \Phi_2(x) \le t\}$ is convex, then the projected gradient algorithm might be helpful: minimize $\Phi_1$ subject to the constraint $\Phi_2(x) \le t$, using $f_2$ as a membership oracle for the feasible set. Could you provide more information on $\Phi_2(x)$? The question is somewhat confusing with the information provided.
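To illustrate (a sketch under that convexity assumption; `phi1_grad`, the step size, and the iteration counts are hypothetical placeholders): treat $f_2$ as a membership oracle for $C = \{x : \Phi_2(x) \le t\}$, take gradient steps on $\Phi_1$, and whenever a step leaves $C$, bisect back along the segment toward the last feasible iterate. For convex $C$ this retracts to (near) the boundary along the step direction; it is not the exact Euclidean projection, but it plays the same role.

```python
import numpy as np

def projected_gradient(phi1_grad, f2, x0, step=0.1, n_iters=200, bisect_iters=40):
    """Minimize Phi_1 over C = {x : Phi_2(x) <= t} using only the
    membership oracle f2 (f2(x) == 0 means x is in C).
    Assumes the starting point x0 is feasible."""
    x = np.asarray(x0, dtype=float)
    assert f2(x) == 0, "x0 must be feasible"
    for _ in range(n_iters):
        y = x - step * phi1_grad(x)   # unconstrained gradient step on Phi_1
        if f2(y) == 0:                # still feasible: accept the step
            x = y
            continue
        lo, hi = x, y                 # f2(lo) = 0, f2(hi) = 1
        for _ in range(bisect_iters): # bisect back to the boundary of C
            mid = 0.5 * (lo + hi)
            if f2(mid) == 0:
                lo = mid
            else:
                hi = mid
        x = lo                        # last feasible point found
    return x
```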
