In this lecture by Frederic Schuller, he introduces the covariant derivative axiomatically:
A connection $\nabla$ on a smooth manifold $(M,O,A)$ is a map that takes a pair consisting of a vector (field) $X$ and a $(p,q)$-tensor field $T$ and sends them to a $(p,q)$-tensor field $\nabla_X T$, satisfying these axioms:
For a scalar function $f$: $\nabla_X f = Xf$, where $Xf$ denotes the directional derivative of $f$ in the direction of $X$
$\nabla_X (T+S) =\nabla_X T + \nabla_X S$
$\nabla_X \big(T(\omega,Y)\big) = (\nabla_X T)(\omega,Y) + T(\nabla_X \omega, Y) + T(\omega, \nabla_X Y)$ (Leibniz law)
$\nabla_{fX+Z} T = f \nabla_X T+ \nabla_Z T$
Later in the lecture, Prof. Schuller remarks that for operators satisfying the above axioms, the only freedom left in saying how the connection behaves is how it acts on the basis vectors of the chart (see 34:22). This action is specified by the Christoffel symbols.
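To spell out how I read this remark (my own sketch, not a transcript of the lecture; the chart-induced basis $\partial_i$ and the index placement $\Gamma^k_{ij}$ are my assumptions and may differ from the lecture's convention): define the symbols by the action of the connection on the chart basis,

$$ \nabla_{\partial_j} \partial_i = \Gamma^k_{ij}\, \partial_k $$

Then, for vector fields $X = X^j \partial_j$ and $Y = Y^i \partial_i$, linearity in the lower slot, the scalar rule $\nabla_X f = Xf$, and the product rule on $Y^i \partial_i$ (which I take to follow from the Leibniz axiom) give

$$ \nabla_X Y = X^j\, \nabla_{\partial_j}\!\left(Y^i \partial_i\right) = X^j \left( \partial_j Y^k + \Gamma^k_{ij}\, Y^i \right) \partial_k $$

so everything except the functions $\Gamma^k_{ij}$ is already fixed by the axioms.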
The idea behind the above, to my understanding, is that when we specify the Christoffel symbols we say exactly how the surface sits in space. (Related)
Yet another approach is taken by Prof. Pavel Grinfeld in his lecture series on tensor analysis. In his engineering style of explanation, he says that the Christoffel symbols are defined through the derivative of the covariant basis with respect to the coordinates (1:07 here).
The Christoffel symbols then appear as:
$$ \frac{\partial \vec{Z_i} }{\partial Z^j} = \Gamma_{ij}^k \vec{Z_k} \tag{1}$$
Earlier in the lecture series, the covariant basis was defined as the derivative of the position vector, parameterized by the coordinates (4:52, where he talks about an example of this), i.e.:
$$ \vec{Z_i} =\frac{\partial \vec{R}(Z^1,Z^2,Z^3) }{\partial Z^i} \tag{2}$$
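As a concrete check (my own example, not taken from either lecture), take polar coordinates $(Z^1, Z^2) = (r, \theta)$ in the plane, and read (1) with the index convention $\partial \vec{Z_i} / \partial Z^j = \Gamma^k_{ij} \vec{Z_k}$. Writing the position vector in Cartesian components, (2) gives the covariant basis:

$$ \vec{R} = (r\cos\theta,\ r\sin\theta), \qquad \vec{Z_r} = (\cos\theta,\ \sin\theta), \qquad \vec{Z_\theta} = (-r\sin\theta,\ r\cos\theta) $$

Differentiating the basis vectors once more and expanding the results in the same basis:

$$ \frac{\partial \vec{Z_r}}{\partial r} = 0, \qquad \frac{\partial \vec{Z_r}}{\partial \theta} = \frac{\partial \vec{Z_\theta}}{\partial r} = \frac{1}{r}\, \vec{Z_\theta}, \qquad \frac{\partial \vec{Z_\theta}}{\partial \theta} = -r\, \vec{Z_r} $$

So in this convention the only non-zero symbols are $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$ and $\Gamma^r_{\theta\theta} = -r$. Note that this computation relies on the ambient Euclidean plane to differentiate and compare vectors at different points.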
When one looks at (2) and then (1), it seems that the Christoffel symbols arise naturally when one takes derivatives of a parameterized position vector. But in Prof. Schuller's lecture, it seems that one must first define what a connection is, and only once the action of the connection is specified do the Christoffel symbols get defined.
Could someone explain how these two different approaches fit together?