I have two functions $g(x)$ and $h(z)$ where $h:\mathbb R^n\to \mathbb R$ and $g: \mathbb R\to \mathbb R$. Both are convex, neither are smooth. How can I apply the chain rule to find $\partial (g\circ h)$? (in terms of $\partial g$ and $\partial h$).
What I have so far: the definition for subdifferential (that's all) $$ \partial(g\circ h)(x) = \{ z : g(h(y)) \geq g(h(x)) + z^T(y-x), \forall y\} $$
If in fact $h$ was smooth, we do have this result: $$ \partial(g\circ h)(x) = \nabla h(x)^T \partial g(h(x)) $$
My guess is that the answer will be something like $\partial(g\circ h)(x)=S$ where $$ S = \{b \cdot a: a\in \partial h(x), b\in \partial g(h(x))\} $$ though I'm not sure...
Also does the problem become easier if I restrict the domain of $g$ to nonnegative scalars and claim that $g$ is monotonic?
Edit: Ok one of the special cases I am thinking of is that $h$ is a norm, and is nonsmooth only at 0, with $h(0)=0$. So, we only need to consider $\partial(g\circ h)(0)$; the rest follows the chain rule for smooth $h$. If, in addition, $g$ is monotonic, then $$ a\in \partial h(0) \iff a^Ty \leq h(y) \forall y. $$ $$ b\in \partial g(0) \Rightarrow b(a^Ty) \leq g(a^Ty) \leq g(h(y)) $$ which gives $S \subseteq \partial (g\circ h)$.
The general case is still open!