Main prerequisites: Jordan normal form, basic properties of the ring of complex polynomials
The key fact is that in the Jordan normal form of $A$,
if there is more than one block, then these blocks must have distinct eigenvalues.
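Two blocks $B,C$ cannot have the same eigenvalue $\lambda$, or else $\text{Ker}(A-\lambda I)$ would contain at least two linearly independent vectors, one coming from $B$ and one coming from $C$. Then $A-\lambda I$ would have rank at most $n-2$, where $n$ is the size of $A$, and appending the single column $b$ could raise the rank of $[A-\lambda I\;|\;b]$ to at most $n-1$, contradicting the rank condition.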
By the way, any block that is $k$-by-$k$ for $k>1$ must have $1$'s above the diagonal. In other words, you can't have $\begin{bmatrix}\lambda &0\\0&\lambda\end{bmatrix}$ as one of the blocks, because this is really just two $1$-by-$1$ blocks with the same eigenvalue.
$\\$
So let's say we've written out the Jordan normal form with respect to a suitable basis of the vector space $V$. Let the blocks be $C_i$, where $C_i$ has size $c_i$ and eigenvalue $\mu_i$, and let $S_i$ be the subspace corresponding to $C_i$. Then $C_i$ is a linear transformation on $S_i$. Also, $V$ is the direct sum of the $S_i$, so according to that direct sum let $b_i$ be the $S_i$-component of $b$.
$\\$
First case: $c_i>1$. In other words, $C_i$ is not a $1$-by-$1$ block.
When you write out $b$ as a linear combination of the basis vectors of $V$, the coefficient attached to the bottom vector for $S_i$ (the last of the basis vectors belonging to $S_i$) cannot be $0$. Otherwise, if that coefficient were $0$, then when you write out $[A-\lambda I\;|\;b]$ with $\lambda=\mu_i$ in this basis, the row corresponding to that vector would have all its entries equal to $0$, and the rank of $[A-\lambda I\;|\;b]$ would be at most $n-1$.
If you're confused about what the bottom vector means, write out an example of a Jordan block with $1$'s above the diagonal.
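For instance, here is a small sketch with a hypothetical $3$-by-$3$ block (the size and the names $e_1,e_2,e_3,\beta_1,\beta_2,\beta_3$ are chosen only for illustration). Say $S_i$ has basis vectors $e_1,e_2,e_3$ and $b_i=\beta_1e_1+\beta_2e_2+\beta_3e_3$. Then the part of $[A-\mu_i I\;|\;b]$ belonging to $S_i$ is
$$\left[\begin{array}{ccc|c}0&1&0&\beta_1\\0&0&1&\beta_2\\0&0&0&\beta_3\end{array}\right]$$
and the entries of these three rows in the columns of the other blocks are all $0$, because the Jordan normal form is block diagonal. The bottom vector is $e_3$, and the last row is entirely $0$ exactly when $\beta_3=0$.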
Now we talk about polynomials. For a complex polynomial $p$, if $p(A)b=0$, then $p(C_i)b_i=0$: each $S_i$ is invariant under $A$, so the $S_i$-component of $p(A)b$ is $p(C_i)b_i$.
Which polynomials $p(z)\in\mathbb{C}[z]$ satisfy $p(C_i)b_i=0$? Call the set of such polynomials $P_i$. Then $P_i$ is an ideal in $\mathbb{C}[z]$ and therefore a principal ideal. There is a unique $t_i(z)\in\mathbb{C}[z]$ such that $t_i(z)$ has leading coefficient $1$ and $P_i$ is precisely the set of multiples of $t_i(z)$.
By the definition of a Jordan block, $(C_i-\mu_i I)^{c_i}=0$, so the polynomial $(z-\mu_i)^{c_i}$ is in $P_i\subset \mathbb{C}[z]$. Therefore, $t_i(z)$ is a factor of $(z-\mu_i)^{c_i}$. The only possibilities are $t_i(z)=(z-\mu_i)^{d}$, where $d\leq c_i$.
However, for $d$ less than $c_i$, because of the non-zero bottom-vector coefficient mentioned earlier, when you calculate $(C_i-\mu_i I)^{d}b_i$ you will still be left with a non-zero coefficient for one of the basis vectors, so the result is not $0$.
Therefore, $t_i(z)=(z-\mu_i)^{c_i}$.
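To see this in a small case, here is a sketch for a hypothetical block with $c_i=3$ (the names $N,e_1,e_2,e_3,\beta_1,\beta_2,\beta_3$ are introduced only for this illustration). Let $e_1,e_2,e_3$ be the basis vectors of $S_i$, let $b_i=\beta_1e_1+\beta_2e_2+\beta_3e_3$ with $\beta_3\neq 0$, and write $N=C_i-\mu_i I$, so that $Ne_1=0$, $Ne_2=e_1$ and $Ne_3=e_2$. Then
$$Nb_i=\beta_2e_1+\beta_3e_2,\qquad N^2b_i=\beta_3e_1,\qquad N^3b_i=0.$$
The first two vectors are non-zero because $\beta_3\neq 0$, so neither $z-\mu_i$ nor $(z-\mu_i)^2$ lies in $P_i$, and $t_i(z)=(z-\mu_i)^3$.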
$\\$
Second case: $C_i$ is a $1$-by-$1$ block. In other words, $c_i=1$.
Again, for similar reasons, when you write out $b$ as a linear combination of the basis vectors of $V$, the coefficient corresponding to the vector for $S_i$ cannot be $0$. So $b_i\neq 0$, while $(C_i-\mu_i I)b_i=0$, and therefore $t_i(z)=z-\mu_i$.
$\\$
Now, we go back to the point that for a complex polynomial $p$, if $p(A)b=0$, then $p(C_i)b_i=0$.
We can now deduce that if $p(A)b=0$, then $p(z)$ is a multiple of $(z-\mu_i)^{c_i}$ for each $i$. Remember that these $\mu_i$ are all distinct, so the factors $(z-\mu_i)^{c_i}$ are pairwise coprime and their product $\displaystyle\prod_i (z-\mu_i)^{c_i}$ also divides $p(z)$.
This product is the characteristic polynomial of $A$, which we denote as $\text{char}_A(z)$.
So if $p(A)b=0$, then $\text{char}_A(z)$ divides $p(z)$.
But equally, if $\text{char}_A(z)$ divides $p(z)$, then $p(A)=0$ by the Cayley-Hamilton theorem, and so $p(A)b=0$.
So $p(A)b=0$ if and only if $\text{char}_A(z)$ divides $p(z)$.
Let $n$ be the dimension of the vector space $V$; we may take $n>1$, because $n=1$ would be a trivial case. Since $\text{char}_A(z)$ has degree $n$, there is no non-zero polynomial $r(z)$ of degree less than $n$ such that $r(A)b=0$.
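This implies that $b,Ab,\ldots,A^{n-1}b$ are linearly independent: a non-trivial relation $a_0b+a_1Ab+\cdots+a_{n-1}A^{n-1}b=0$ would give a non-zero polynomial $r(z)=a_0+a_1z+\cdots+a_{n-1}z^{n-1}$ of degree less than $n$ with $r(A)b=0$.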
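
If you want to sanity-check the statement on a concrete matrix, here is a minimal sketch in plain Python with exact fractions. Everything in it is an assumption made for illustration: the helper names `rank`, `krylov_vectors` and `augmented_rank`, the $3$-by-$3$ matrix (already in Jordan normal form, with a $2$-by-$2$ block for eigenvalue $2$ and a $1$-by-$1$ block for eigenvalue $5$), and the two choices of $b$. It only checks the rank condition at the eigenvalues, since $A-\lambda I$ is invertible for every other $\lambda$.

```python
from fractions import Fraction


def mat_vec(A, v):
    """Multiply the square matrix A (list of rows) by the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def rank(rows):
    """Rank of a matrix given as a list of rows, by exact Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(M[0]) if M else 0
    for c in range(ncols):
        # Find a row at or below position r with a non-zero entry in column c.
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r


def krylov_vectors(A, b):
    """Return the list [b, Ab, ..., A^(n-1) b], where n is the size of A."""
    vecs, v = [], list(b)
    for _ in range(len(A)):
        vecs.append(v)
        v = mat_vec(A, v)
    return vecs


def augmented_rank(A, lam, b):
    """Rank of the n-by-(n+1) matrix [A - lam*I | b]."""
    n = len(A)
    return rank([[A[i][j] - (lam if i == j else 0) for j in range(n)] + [b[i]]
                 for i in range(n)])


# Illustrative matrix, already in Jordan normal form (an assumption of this sketch).
A = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 5]]
b_good = [0, 1, 1]  # non-zero bottom-vector coefficient in each block
b_bad = [1, 0, 1]   # bottom-vector coefficient of the first block is zero

for b in (b_good, b_bad):
    ranks_at_eigenvalues = [augmented_rank(A, lam, b) for lam in (2, 5)]
    print(ranks_at_eigenvalues, rank(krylov_vectors(A, b)))
```

For `b_good` this prints `[3, 3] 3`: the rank condition holds at both eigenvalues and $b,Ab,A^2b$ are linearly independent. For `b_bad` it prints `[2, 3] 2`: the rank condition fails at $\lambda=2$, and the three vectors are linearly dependent.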