It is interesting that one of the earliest ways of thinking about linearity was also one of the most abstract. Fourier formulated the principle of superposition in a way that very closely follows the definition of a linear (vector) space that we use today, and he used this principle to formulate his theory of heat conduction.
Suppose that you are studying heat conduction in some system. You can imagine applying some source of heat $s$ and seeing what happens. This could be a point source of heat, or a general introduction of heat into the system, etc. You can double the source of heat by applying $2s$. You can add two different sources of heat, $s_1+s_2$, and you can form combinations such as $3s_1+1.2s_2$. Fourier formulated heat conduction using his principle of superposition in a critical way. If $E(s_1)$ represents the effect on temperature over time due to the application of heat source $s_1$, then his principle was that, if you doubled the source, you would double the effect: $E(2s_1)=2E(s_1)$. And if you added sources, that would lead to adding the effects: $E(s_1+s_2)=E(s_1)+E(s_2)$. This was a very abstract principle, but virtually every physical system has some linear regime where such basic rules apply: double the cause, and you double the effect; add causes, and you add their individual effects. The causes may have to be "small" for linearity to apply. The derivative of calculus is so important because it gives you a linear regime for the behavior of a function near a point; the derivative is a linear approximation, and this generalizes to any number of dimensions.
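To see the "linear regime near a point" numerically, here is a small Python sketch. The helper `linearization_error` and the choice of $\sin$ at $x=1$ are my own assumptions for illustration, not part of the historical story. It measures the gap between $f(x+h)$ and the linear approximation $f(x)+f'(x)h$, and divides that gap by $h$; if the derivative really is the best linear approximation, the ratio should shrink as $h$ shrinks.

```python
import math

def linearization_error(f, df, x, h):
    """Gap between f(x + h) and the linear approximation f(x) + f'(x) * h.

    `f` is the function and `df` its derivative (both supplied by the caller).
    """
    return abs(f(x + h) - (f(x) + df(x) * h))

x = 1.0                      # hypothetical base point
steps = (0.1, 0.01, 0.001)   # progressively smaller displacements

# error / h should tend to 0 as h -> 0 if the approximation is truly linear
ratios = []
for h in steps:
    err = linearization_error(math.sin, math.cos, x, h)
    ratios.append(err / h)
    print(h, err, err / h)
```

For a smooth function the error behaves roughly like $h^2$, so each tenfold reduction in $h$ should cut the ratio by about a factor of ten.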
So, one of the earliest linear space ideas would be the space of causes $s_1,s_2,s_3,\cdots$, and another space would be the space of effects. You can add causes, scale causes, and so on. And there is some linear operator $E$ that takes causes to effects through a principle of superposition: double the cause, and you double the effect; add two causes, and you add their effects.
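As a rough numerical sketch of this superposition rule (not Fourier's own formulation), the Python below uses a crude explicit finite-difference model of heat diffusion on a rod. The function `evolve`, the grid size, the step count, and the coefficient `alpha` are all assumptions made for illustration. A "source" here is simply an initial heat distribution, and the "effect" is the temperature profile after a fixed number of time steps.

```python
def evolve(source, steps=50, alpha=0.2):
    """Toy effect operator E: run `steps` explicit diffusion updates on a rod.

    `source` is the initial heat distribution; both ends of the rod are held
    at temperature 0. `alpha` <= 0.5 keeps this simple scheme stable.
    """
    u = list(source)
    n = len(u)
    for _ in range(steps):
        new = u[:]
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            new[i] = u[i] + alpha * (left - 2.0 * u[i] + right)
        u = new
    return u

def combine(a, x, b, y):
    """Form the linear combination a*x + b*y entry by entry."""
    return [a * xi + b * yi for xi, yi in zip(x, y)]

# Two hypothetical point sources of heat at different places on the rod.
s1 = [0.0] * 21
s1[5] = 1.0
s2 = [0.0] * 21
s2[14] = 2.0

# Effect of the combined cause 3*s1 + 1.2*s2 ...
lhs = evolve(combine(3.0, s1, 1.2, s2))
# ... versus the same combination of the individual effects.
rhs = combine(3.0, evolve(s1), 1.2, evolve(s2))

max_gap = max(abs(p - q) for p, q in zip(lhs, rhs))
print(max_gap)  # should be at the level of floating-point rounding
```

The two sides agree up to rounding because every update in `evolve` is itself a linear combination of the current temperatures, so the whole map from cause to effect is linear.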
To think in these terms, you just about have to use the most abstract, modern notion of Vector Space. There was a natural evolution leading to these ideas. And, once these ideas had sufficiently evolved, Vector Spaces became a natural setting for the theory of Quantum Mechanics. Oddly enough, Quantum Mechanics is one of the most linear of the physical models out there, and it requires complex scalars. You have little arrows in the complex plane that spin in time when you look at light, and to understand how to superimpose light, you have to use complex scalars. Feynman formulated his Quantum Electrodynamics using this understanding.
A subspace arises when you consider all the possible causes you can build from a fixed collection of sources using superposition, that is, by scaling and adding them. The corresponding effects then form a subspace of the space of effects: all the effects you can build by superposition from the effects of those same sources.