An equation of the form $a_1x_1 + \cdots + a_nx_n = d$ (with not all $a_i$ zero) defines an $(n-1)$-dimensional flat (affine subspace) in $\Bbb R^n$. Remember that it takes at least $n$ of these $(n-1)$-D flats to uniquely define a point. Let's try to build some intuition for that.
Consider $\Bbb R^2$. An $(n-1)$-D flat in $\Bbb R^2$ is a line. One line isn't enough to specify a point. But if you have two lines, there are three possibilities:
$(1)$ the lines coincide. Well, two lines right on top of each other aren't really any better than $1$ line. We can't specify a unique point with this. The equations of two lines that coincide are nonzero scalar multiples of each other. That is, if $l_1: ax+by=c$ is one of your lines, then your other line $l_2$ must be of the form $l_2: k(ax+by) = kc$ for some $k\ne 0$.
$(2)$ the lines are parallel, but don't coincide. If two lines in $\Bbb R^2$ are parallel then they never intersect, so there is no point that they share. In this case, the equations which specify these lines can be written (after rescaling one of them if necessary) in the form $l_1: ax+by=c$ and $l_2: ax+by=d$, where $c\ne d$.
$(3)$ the lines intersect, but not everywhere. Then they define a unique point in $\Bbb R^2$ because two lines that don't coincide can only intersect once.
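The three cases above can be told apart mechanically. Here is a minimal Python sketch, assuming each line is given as a tuple `(a, b, c)` of integers meaning $ax+by=c$ with $(a,b)\ne(0,0)$. The function name `classify_lines` and the tuple encoding are my own choices for illustration, not anything standard.

```python
from fractions import Fraction

def classify_lines(l1, l2):
    """Classify two lines, each given as (a, b, c) meaning a*x + b*y = c.

    Assumes integer coefficients and (a, b) != (0, 0) for each line.
    """
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    # The left-hand sides are proportional exactly when this quantity is zero.
    cross = a1 * b2 - a2 * b1
    if cross != 0:
        # Case (3): the lines meet in exactly one point; solve exactly.
        x = Fraction(c1 * b2 - c2 * b1, cross)
        y = Fraction(a1 * c2 - a2 * c1, cross)
        return ("unique point", (x, y))
    # Left-hand sides are proportional, so compare the right-hand sides too.
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return ("coincide", None)   # case (1): whole equations are proportional
    return ("parallel", None)       # case (2): same direction, different offset
```

For instance, `classify_lines((1, 1, 1), (2, 2, 2))` lands in case $(1)$, `classify_lines((1, 1, 1), (1, 1, 3))` in case $(2)$, and `classify_lines((1, 1, 2), (1, -1, 0))` in case $(3)$ with the point $(1,1)$.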
Now consider $\Bbb R^3$. An $(n-1)$-D flat in $\Bbb R^3$ is a plane. One plane can't specify a unique point. Two planes can't specify a unique point either: the intersection of two planes is always either empty (if they are parallel), a line, or the entire plane (if they coincide). We need at least three planes to specify a unique point. If you have three planes in $\Bbb R^3$, there are $5$ possibilities:
$(1)$ At least two of the planes coincide. Again, this is basically like having only $2$ (or fewer) planes. If $\pi_1$ and $\pi_2$ are two planes that coincide then their equations will be of the form $\pi_1: ax + by +cz = d$ and $\pi_2: k(ax+by+cz) = kd$ for some $k\ne 0$.
$(2)$ At least two of the planes are parallel. Well if two of the planes are parallel, then those two planes can't intersect each other. So each may intersect with the third plane, but those intersections will either be lines or the entire plane -- thus a unique point cannot be specified by these three planes. If $\pi_1$ and $\pi_2$ are parallel then their equations can be written as $\pi_1: ax+by+cz = d$ and $\pi_2: ax+by+cz = f$ for $f\ne d$.
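$(3)$ Each of the three planes intersects the other two, but all three lines of intersection are parallel and don't coincide. Then there can't be a point of intersection. This occurs when $\pi_1: a_1x+b_1y+c_1z = d_1$, $\pi_2: a_2x+b_2y+c_2z=d_2$, and $\pi_3: (ja_1 +ka_2)x + (jb_1 +kb_2)y + (jc_1 +kc_2)z = d_3$ with $j,k\ne 0$ and $d_3 \ne jd_1 +kd_2$. That is, the left-hand side of $\pi_3$ is a linear combination of the other two left-hand sides, but the right-hand side doesn't match.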
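$(4)$ The lines of intersection are parallel and all $3$ coincide. Then the planes share a whole line, which doesn't specify one point. Thus $\pi_1 \cap \pi_2 = \pi_2 \cap \pi_3$ for instance. This occurs when the entire equation of one plane, right-hand side included, is a linear combination of the other two: $\pi_3: (ja_1 +ka_2)x + (jb_1 +kb_2)y + (jc_1 +kc_2)z = jd_1 +kd_2$.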
$(5)$ None of the above. Then each of the three planes intersects the other two and the lines of intersection aren't all parallel. Because each plane is just a $2$-D flat, we can use what we know from our consideration of $\Bbb R^2$ earlier: two of these lines lie in a common plane and aren't parallel, so they intersect. The point where they meet lies on all three planes, so all three planes intersect at a single point.
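To make cases $(3)$ and $(4)$ concrete, here is a small worked example. The specific planes are my own choice, picked only because the arithmetic is easy. Take

$$\pi_1: z = 0,\qquad \pi_2: y = 0,\qquad \pi_3: y + z = d_3.$$

The left-hand side of $\pi_3$ is the sum of the left-hand sides of $\pi_1$ and $\pi_2$. The pairwise intersections are

$$\pi_1\cap\pi_2 = \{(t,0,0)\},\qquad \pi_1\cap\pi_3 = \{(t,d_3,0)\},\qquad \pi_2\cap\pi_3 = \{(t,0,d_3)\},\qquad t\in\Bbb R,$$

which are three lines parallel to the $x$-axis. If $d_3\ne 0$ the three lines are distinct and no point lies on all three planes, which is case $(3)$. If $d_3 = 0$ all three lines are the $x$-axis itself, which is case $(4)$.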
Solving a system of linear equations is just finding this one point of intersection of $(n-1)$-D flats. Hopefully you get the point from the above that you need at least $n$ $(n-1)$-D flats to specify a point -- but just because you have $n$ of them doesn't mean you will.
One other thing to notice is that every time the collection of flats didn't specify a unique point, the left-hand side of at least one of the equations was a linear combination of the left-hand sides of the others. I won't prove it, but this is a general result: the left-hand side of one of the $n$ equations is a linear combination of the other $n-1$ left-hand sides $\iff$ that collection of $n$ flats does not specify a unique point.
Now let's see how the determinant arises from analyzing whether a system of $n$ linear equations in $n$ variables has a unique solution.
You've already done the $2\times 2$ case, so let's look at the $3\times 3$ case then we'll see if we can figure out how to generalize it.
Consider the following system:
$$\begin{cases}ax_1 + bx_2 + cx_3 = y_1 \\ dx_1 + ex_2 + fx_3 = y_2 \\ gx_1 + hx_2 + ix_3 = y_3\end{cases}$$
where $a\ne 0$. If your system has $a=0$ in its first equation, then rearrange your equations so that the coefficient on $x_1$ in the first equation is nonzero. If you can't, then you already know that this system can't uniquely specify a point, so you're already done.
The only operations we will do on these are the Gaussian elimination operations. Then we can see that
$$\begin{cases}ax_1 + bx_2 + cx_3 = y_1 \\ dx_1 + ex_2 + fx_3 = y_2 \\ gx_1 + hx_2 + ix_3 = y_3\end{cases} \\ \implies \begin{cases}ax_1 + bx_2 + cx_3 = y_1 \\ (ae-bd)x_2 + (af-cd)x_3 = ay_2-dy_1 \\ (ah-bg)x_2 + (ai-cg)x_3 = ay_3-gy_1\end{cases} \\ \implies \begin{cases}ax_1 + bx_2 + cx_3 = y_1 \\ (ae-bd)x_2 + (af-cd)x_3 = ay_2-dy_1 \\ [(ai-cg)(ae-bd)-(ah-bg)(af-cd)]x_3 = (ay_3-gy_1)(ae-bd)-(ay_2-dy_1)(ah-bg)\end{cases} \\ \implies \begin{cases}ax_1 + bx_2 + cx_3 = y_1 \\ (ae-bd)x_2 + (af-cd)x_3 = ay_2-dy_1 \\ a[\color{red}{a(ei-fh)-b(di-fg)+c(dh-eg)}]x_3 = a[\color{blue}{a(ey_3-hy_2)-b(dy_3-gy_2)+y_1(dh-ge)}]\end{cases} $$
Because we already assumed that $a\ne 0$, we can cancel it off both sides. Then what are we left with? Cramer's rule. That last equation just says
$$\det(A)x_3 = \det(A_3)$$
where $A_3$ is the coefficient matrix of this system where $\begin{bmatrix} y_1 \\ y_2 \\ y_3\end{bmatrix}$ takes the place of the third column.
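If you'd like to sanity-check that algebra numerically, here is a small Python sketch. It performs the same two elimination steps on one integer example and confirms that the final $x_3$ equation is $a\det(A)\,x_3 = a\det(A_3)$. The helper names `det3` and `eliminate_to_x3`, and the particular matrix, are my own illustrative choices.

```python
def det3(m):
    """3x3 determinant by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def eliminate_to_x3(m, y):
    """Return (coefficient of x3, right-hand side) after the two elimination steps."""
    (a, b, c), (d, e, f), (g, h, i) = m
    y1, y2, y3 = y
    # Step 1: a*row2 - d*row1 and a*row3 - g*row1 remove x1.
    r2 = (a * e - b * d, a * f - c * d, a * y2 - d * y1)
    r3 = (a * h - b * g, a * i - c * g, a * y3 - g * y1)
    # Step 2: (ae-bd)*row3 - (ah-bg)*row2 removes x2.
    coeff = r2[0] * r3[1] - r3[0] * r2[1]
    rhs = r2[0] * r3[2] - r3[0] * r2[2]
    return coeff, rhs

# Example system (chosen so that the solution is x = (1, 2, 3)).
A = [[2, 1, -1], [1, 3, 2], [1, 0, 1]]
y = [1, 13, 4]
# A_3: replace the third column of A by y.
A3 = [row[:2] + [y[r]] for r, row in enumerate(A)]

coeff, rhs = eliminate_to_x3(A, y)
print(coeff, A[0][0] * det3(A))    # both are 20
print(rhs, A[0][0] * det3(A3))     # both are 60
```

Here $\det(A) = 10$ and $\det(A_3) = 30$, so $x_3 = 3$, which agrees with the solution the system was built from.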
What is this telling us? Well the value of $\det(A_3)$ doesn't affect whether or not $x_3$ has a unique solution -- even if it is zero, it doesn't rule out a unique $x_3$. On the other hand, $\det(A)$ does affect whether $x_3$ is uniquely determined. We can see that we can always solve for $x_3$ here UNLESS $\det(A)=0$. Thus if $\det(A)=0$, there is no number $x_3$ which is a unique solution to this system.
Now you should be able to see that just by performing the same Gaussian reduction methods that I used above to separate out a different variable, you'd find a very similar formula pops up (for instance if you tried solving for $x_2$, you'd eventually get $\det(A)x_2 = \det(A_2)$).
So in both the $2\times 2$ (look at your own question for verification) and in the $3\times 3$ case, Gaussian elimination eventually yields Cramer's rule. This continues to hold for any system of $n$ equations in $n$ unknowns. To verify this, simply find a good proof of Cramer's rule OR come up with one on your own.
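For the $3\times 3$ case, the resulting rule $\det(A)x_k = \det(A_k)$ can be sketched in a few lines of Python. This is only an illustration under my own assumptions (integer entries, rows given as lists, exact arithmetic via `fractions.Fraction`), and the names `det3` and `cramer3` are hypothetical. It returns `None` when $\det(A)=0$, since then no unique solution exists.

```python
from fractions import Fraction

def det3(m):
    """3x3 determinant by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def cramer3(m, y):
    """Solve a 3x3 integer system by Cramer's rule; None if det(A) == 0."""
    d = det3(m)
    if d == 0:
        return None  # the three planes do not meet in exactly one point
    solution = []
    for col in range(3):
        # A_k: the matrix A with column k replaced by the right-hand side y.
        m_k = [list(row[:col]) + [y[r]] + list(row[col + 1:])
               for r, row in enumerate(m)]
        solution.append(Fraction(det3(m_k), d))
    return solution
```

As a usage example, `cramer3([[2, 1, -1], [1, 3, 2], [1, 0, 1]], [1, 13, 4])` gives $x = (1, 2, 3)$, while a matrix whose second row is twice its first, such as `[[1, 2, 3], [2, 4, 6], [1, 0, 1]]`, gives `None` for any right-hand side.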