I'm trying to find the derivative of the following function: $$ f(\beta) = \textbf{1}^T h(X\beta) = \sum_{i=1}^n \ln(1+e^{\beta^T \textbf{x}_i}), \qquad h(t) = \ln(1+e^t), $$ where $\beta$ is a $(p,1)$ vector, $X$ is an $(n,p)$ matrix whose $i$-th row is $\textbf{x}_i^T$, and $h(X\beta)$ denotes $h$ applied element-wise, i.e. it is an $(n,1)$ vector.
I want to find the gradient $\nabla_{\beta}f$ and the Hessian $\nabla^2_{\beta}f$.
This is easy if I differentiate each term of the sum separately and add the results at the end. But I was wondering whether there is a way to do this strictly in matrix notation?
The results should come out as $X^T \sigma(X\beta)$ (where $\sigma$ is the element-wise sigmoid function) and $X^T \Sigma X$, where $\Sigma$ is a diagonal matrix with the entries of $\sigma(X\beta)\odot(1-\sigma(X\beta))$ (element-wise product) on its diagonal.
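As a sanity check on these target expressions (not a derivation), here is a minimal NumPy sketch that compares them against central finite differences. The helper names (`sigmoid`, `f`, `grad_f`, `hess_f`, `numerical_grad`), the random test data, and the step size are my own choices for illustration.

```python
import numpy as np

def sigmoid(t):
    # Element-wise logistic function sigma(t) = 1 / (1 + exp(-t))
    return 1.0 / (1.0 + np.exp(-t))

def f(beta, X):
    # f(beta) = 1^T h(X beta), with h(t) = ln(1 + e^t) = logaddexp(0, t)
    return np.sum(np.logaddexp(0.0, X @ beta))

def grad_f(beta, X):
    # Claimed gradient: X^T sigma(X beta)
    return X.T @ sigmoid(X @ beta)

def hess_f(beta, X):
    # Claimed Hessian: X^T Sigma X, with Sigma = diag(s * (1 - s))
    s = sigmoid(X @ beta)
    return X.T @ (X * (s * (1.0 - s))[:, None])

def numerical_grad(fun, beta, eps=1e-6):
    # Central finite differences, one coordinate at a time
    g = np.zeros_like(beta)
    for j in range(beta.size):
        e = np.zeros_like(beta)
        e[j] = eps
        g[j] = (fun(beta + e) - fun(beta - e)) / (2 * eps)
    return g

# Arbitrary test data (assumed sizes, fixed seed for reproducibility)
rng = np.random.default_rng(0)
n, p = 50, 4
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)

# Gradient check: analytic formula vs. finite differences of f
grad_err = np.max(np.abs(grad_f(beta, X) - numerical_grad(lambda b: f(b, X), beta)))

# Hessian check: finite-difference the analytic gradient, column by column
eps = 1e-6
H_num = np.column_stack([
    (grad_f(beta + e, X) - grad_f(beta - e, X)) / (2 * eps)
    for e in eps * np.eye(p)
])
hess_err = np.max(np.abs(hess_f(beta, X) - H_num))

print(grad_err, hess_err)
```

Both errors should be tiny compared with the size of the entries, which supports the two formulas numerically but of course does not show how to obtain them in matrix notation.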
I think this is related to this question, only instead of differentiating w.r.t. $A$, it's w.r.t. $x$.