The proof of the second derivative test at a critical point ($Df_a = 0$) runs as follows: for a given sufficiently smooth map $f: \Bbb{R}^n \to \Bbb{R}$ and a point $a \in \Bbb{R}^n$, we write the second-order Taylor expansion at $a$ (the first-order term $Df_a(h)$ vanishes because $a$ is a critical point):
\begin{align}
f(a+h) - f(a) &= \dfrac{1}{2}(D^2f_a)(h,h) + o(\lVert h\rVert^2).
\end{align}
In other words, there is a "remainder term", which is a function $\rho$, such that $\lim_{h \to 0} \rho(h) = 0$, and
\begin{align}
f(a+h) - f(a) &= \dfrac{1}{2}(D^2f_a)(h,h) + \rho(h) \lVert h\rVert^2.
\end{align}
If the Hessian $D^2f_a$ is positive definite, say, then there is a positive constant $\lambda$ such that for all $h \in \Bbb{R}^n$, $D^2f_a(h,h) \geq \lambda \lVert h\rVert^2$ (for instance, $\lambda$ can be taken to be the smallest eigenvalue of the Hessian). Hence,
\begin{align}
f(a+h) - f(a) &\geq \dfrac{\lambda}{2} \lVert h\rVert^2 + \rho(h) \lVert h\rVert^2 \\
&= \left( \dfrac{\lambda}{2} + \rho(h)\right) \lVert h\rVert^2.
\end{align}
Since $\rho(h) \to 0$ as $h \to 0$ and $\lambda > 0$, there is a $\delta > 0$ such that $\lvert \rho(h) \rvert < \lambda/2$ whenever $\lVert h \rVert < \delta$, so the term in brackets is strictly positive for all such $h$. Hence, for all $h$ sufficiently small in norm, $f(a+h) - f(a) \geq 0$ (with equality if and only if $h = 0$). This proves that a positive-definite Hessian at a critical point $a$ gives a strict local minimum there.
Of course, a similar proof holds for a negative-definite Hessian implying a strict local maximum.
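To see the positive-definite case of the argument in numbers, here is a minimal sketch using a made-up example that is not part of the question: $f(x,y) = x^2 + y^2 + x^3$, which has a critical point at the origin with Hessian $2I$, so $\lambda = 2$ works. The function names `f`, `rho` and `min_bracket` are purely illustrative. The script samples $h$ on circles of shrinking radius and reports the smallest value of the bracket $\frac{\lambda}{2} + \rho(h)$ found there.

```python
import math

def f(x, y):
    # Hypothetical example: critical point at the origin, Hessian there is 2*I.
    return x**2 + y**2 + x**3

def rho(x, y):
    # Remainder coefficient defined by
    #   f(h) - f(0) = (1/2) D^2f_0(h, h) + rho(h) * |h|^2,
    # where (1/2) D^2f_0(h, h) = x^2 + y^2 for this particular f.
    norm_sq = x**2 + y**2
    return (f(x, y) - f(0.0, 0.0) - norm_sq) / norm_sq

lam = 2.0  # smallest eigenvalue of the Hessian 2*I

def min_bracket(radius, samples=360):
    # Smallest sampled value of (lam/2 + rho(h)) over h with |h| = radius.
    values = []
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        x, y = radius * math.cos(t), radius * math.sin(t)
        values.append(lam / 2.0 + rho(x, y))
    return min(values)

for radius in (0.5, 0.1, 0.01):
    print(radius, min_bracket(radius))
```

For this $f$ one has $\rho(h) = x^3 / \lVert h \rVert^2$, so $\lvert \rho(h) \rvert \leq \lVert h \rVert$ and the bracket stays positive for every $\lVert h \rVert < 1$; the printed minima should approach $\lambda/2 = 1$ as the radius shrinks.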
Roughly speaking, the idea of the proof is that the local behaviour of $f(a+h) - f(a)$ is entirely determined by the behaviour of the Hessian, in the term $D^2f_a(h,h)$ (because the error term is "small"). So, to answer your questions,
The proof of the theorem above shows that we need to ensure that the entire term $D^2f_a(h,h)$ is positive (in fact bounded below by a positive multiple of $\lVert h \rVert^2$), so that we can conclude that $f(a+h) - f(a) \geq 0$. But just because an $n \times n$ matrix has all positive entries, it doesn't mean it is positive-definite (Robert's answer gives an explicit counterexample).
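As a quick sanity check of that last point, here is a small sketch with a matrix of my own choosing (not necessarily the one from Robert's answer): every entry is positive, yet the quadratic form $h^T A h$ takes a negative value. The helper `quadratic_form` is an illustrative name, not a library function.

```python
def quadratic_form(A, h):
    # Computes h^T A h for a square matrix A given as nested lists.
    n = len(h)
    return sum(h[i] * A[i][j] * h[j] for i in range(n) for j in range(n))

# Illustrative matrix: all entries positive, but not positive-definite.
A = [[1.0, 2.0],
     [2.0, 1.0]]

print(quadratic_form(A, [1.0, 1.0]))    # 6.0: positive along (1, 1)
print(quadratic_form(A, [1.0, -1.0]))   # -2.0: negative along (1, -1)
```

So if this $A$ were a Hessian at a critical point, $f$ would increase along $(1,1)$ and decrease along $(1,-1)$, despite all second partial derivatives being positive.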
Hopefully the proof I gave above justifies why definiteness comes into play (it's to ensure you have a good lower/upper bound on the $D^2f_a(h,h)$ term).
A symmetric matrix is positive (negative) definite if and only if all its eigenvalues are strictly positive (strictly negative). If some eigenvalues are positive and some are negative, then the matrix is indefinite. If this is the case for your Hessian, it means you have a saddle point (because the function is increasing along some directions while decreasing along others).
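The eigenvalue criterion can be turned into a small classifier. The sketch below is restricted, as a simplifying assumption, to $2 \times 2$ symmetric Hessians $\begin{pmatrix} a & b \\ b & c \end{pmatrix}$, where the eigenvalues have a closed form; the function names are illustrative. It also reports the case with a zero eigenvalue, where the second derivative test gives no conclusion.

```python
import math

def eigenvalues_sym_2x2(a, b, c):
    # Eigenvalues of the symmetric matrix [[a, b], [b, c]], smallest first.
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return mean - radius, mean + radius

def classify(a, b, c):
    # Classifies a critical point from its 2x2 Hessian [[a, b], [b, c]].
    lo, hi = eigenvalues_sym_2x2(a, b, c)
    if lo > 0:
        return "positive definite: strict local minimum"
    if hi < 0:
        return "negative definite: strict local maximum"
    if lo < 0 < hi:
        return "indefinite: saddle point"
    # At least one eigenvalue is zero and the others share a sign.
    return "semidefinite: test is inconclusive"

print(classify(2.0, 0.0, 2.0))
print(classify(-2.0, 0.0, -3.0))
print(classify(1.0, 2.0, 1.0))
print(classify(1.0, 1.0, 1.0))
```

The comparisons against zero are exact floating-point comparisons, so for Hessians computed numerically a small tolerance would be a sensible addition.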