Synthesizing some of the other answers, I think many of these "magical" group actions come from just trying to rewrite common group theoretic concepts in the orbit-stabilizer language. (The fact that one can do so is, in my opinion, an unexpected small miracle.)
For instance (as I learned from the MSE answer "Orbit stabiliser theorem as an analogue to first isomorphism theorem"), the kernel and image of a group homomorphism $\phi: G \to H$ can be thought of (resp.) as the stabilizer and orbit of $e_H$ under the group action $G \curvearrowright H$ defined by $g \cdot h := \phi(g) h$. Then the orbit-stabilizer theorem gives us precisely the bijection $G/\ker\phi \to \operatorname{im}\phi$ of the 1st isomorphism theorem.
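To make the kernel/stabilizer and image/orbit dictionary concrete, here is a minimal sketch in Python. The choice of example is mine, not from the linked answer: $G = H = \mathbb{Z}/12$ written additively, with the hypothetical homomorphism $\phi(g) = 3g$, so the action $g \cdot h := \phi(g) h$ becomes $g \cdot h = 3g + h \bmod 12$.

```python
# Assumed toy example: G = H = Z/12 (additive), phi(g) = 3g mod 12.
n = 12
G = range(n)
e_H = 0  # identity of the additive group Z/12

def phi(g):
    """Hypothetical homomorphism Z/12 -> Z/12, multiplication by 3."""
    return (3 * g) % n

def act(g, h):
    """The action g . h := phi(g) h, written additively."""
    return (phi(g) + h) % n

# Stabilizer and orbit of the identity element under the action.
stabilizer = {g for g in G if act(g, e_H) == e_H}
orbit = {act(g, e_H) for g in G}

# Kernel and image computed directly from phi.
kernel = {g for g in G if phi(g) == e_H}
image = {phi(g) for g in G}

print(sorted(stabilizer), sorted(kernel))
print(sorted(orbit), sorted(image))
print(len(stabilizer) * len(orbit) == n)  # orbit-stabilizer count
```

The two descriptions agree element by element, and the orbit-stabilizer count $|G| = |\text{stab}| \cdot |\text{orb}|$ is the counting shadow of $G/\ker\phi \cong \operatorname{im}\phi$.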
The famous $p$-group fixed point theorem that Qiaochu enthuses over (if a finite $p$-group acts on a finite set $X$, then $|X| \equiv |\text{fix}(X)| \mod p$) essentially promises us non-trivial fixed points (in fact a decent number of them; in some applications it's furthermore important to know that the total number of fixed points is a multiple of $p$). So we can take advantage of this powerful theorem, if only we can manhandle important concepts into being related to fixed points of some action.
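As a sanity check of the congruence $|X| \equiv |\text{fix}(X)| \mod p$, here is a small Python sketch. The example is my own choice: the cyclic group $\mathbb{Z}/p$ acting by rotation on length-$p$ tuples with entries in $\{0,\dots,k-1\}$ (the helper names `rotate` and `count_points` are made up for illustration).

```python
from itertools import product

def rotate(t, s):
    """Rotate the tuple t cyclically by s positions."""
    return t[s:] + t[:s]

def count_points(p, k):
    """Return (|X|, |fix(X)|) for Z/p rotating tuples in {0..k-1}^p."""
    X = list(product(range(k), repeat=p))
    fixed = [t for t in X if all(rotate(t, s) == t for s in range(p))]
    return len(X), len(fixed)

for p, k in [(2, 3), (3, 4), (5, 2)]:
    size, n_fixed = count_points(p, k)
    # The theorem predicts that p divides |X| - |fix(X)|.
    print(p, k, size, n_fixed, (size - n_fixed) % p == 0)
```

Here the fixed points are the constant tuples, so the congruence reads $k^p \equiv k \mod p$, which is Fermat's little theorem.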
Example: the normalizer of a subgroup $H$ in a group $G$. Since
$$g\in \text{N}(H) \iff g^{-1}Hg=H,$$
it is immediate to see the normalizer as a stabilizer: $g\in \text{N}(H) \iff g\in \text{stab}_G(H)$ w.r.t. the action $G \curvearrowright 2^G$ (all subsets of $G$) given by $g \mathbin{‣} S := g^{-1} Sg$ (written this way it is really a right action, but that changes neither stabilizers nor orbits).
But slightly further meditation on the definition (for finite $H$, where $g^{-1}Hg \subseteq H$ already forces equality)
$$g\in \text{N}(H) \iff g^{-1}hg \in H \text{ for all } h\in H \iff hgH = gH \text{ for all } h\in H $$
reveals the normalizer as the fixed points: $g\in \text{N}(H) \iff gH\in \text{fix}((G/H) \curvearrowleft H)$ w.r.t the action $H \curvearrowright G/H$ (all left cosets of $H$) given by $h \cdot xH := h xH$.
Somehow, the 2nd perspective is "more naturally group theoretic" than the first "more obvious" perspective; the first one has a group action on the power set of $G$ (a set theoretic notion, not a group theoretic notion), but the 2nd one has a group action on left cosets (indeed a group theoretic notion). A little mysterious.
Conclusion: $\text{fix}((G/H) \curvearrowleft H) = \text{N}(H)/H$ (as sets).
If $H$ is a $p$-group, the $p$-group fixed point theorem tells us that $[G:H] \equiv [\text{N}(H):H] \mod p$, which in particular tells us that the quotient group $\text{N}(H)/H$ has order divisible by $p$, as long as $H$ is not a $p$-Sylow subgroup of $G$ (so that $p$ divides $[G:H]$).
This forms the basis of the induction step (base step is Cauchy's theorem) in this KConrad writeup of Sylow I.
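The normalizer-as-fixed-points picture can be checked by brute force. The following is only a sketch under my own choice of example: $G = S_4$ (permutations stored as tuples) and $H$ the cyclic subgroup generated by a 4-cycle, which is a $2$-group but not a $2$-Sylow subgroup. All helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(gen):
    """Cyclic subgroup generated by a single permutation."""
    identity = tuple(range(len(gen)))
    elems, x = {identity}, gen
    while x != identity:
        elems.add(x)
        x = compose(gen, x)
    return frozenset(elems)

G = list(permutations(range(4)))   # S_4
H = generate((1, 2, 3, 0))         # generated by the 4-cycle 0->1->2->3->0

# Normalizer straight from the definition g^{-1} H g = H.
normalizer = {
    g for g in G
    if frozenset(compose(compose(inverse(g), h), g) for h in H) == H
}

# Left cosets gH, and those fixed by left multiplication by every h in H.
cosets = {frozenset(compose(g, h) for h in H) for g in G}
fixed = {
    c for c in cosets
    if all(frozenset(compose(h, x) for x in c) == c for h in H)
}
union_of_fixed = set().union(*fixed)

print(len(cosets), len(fixed), len(normalizer))
print(union_of_fixed == normalizer)
print((len(cosets) - len(fixed)) % 2 == 0)   # [G:H] = [N(H):H] mod 2
```

In this example the fixed cosets are exactly the cosets making up $\text{N}(H)$, and since $H$ is not $2$-Sylow in $S_4$, the index $[\text{N}(H):H]$ comes out even, as the argument predicts.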
The above was about the stabilizers of the action $\mathbin{‣} : G \curvearrowright 2^G$. What about orbits? Two subgroups $H',H$ are conjugate iff $H' \in \text{orb}_G(H) \iff \exists g$ s.t. $g^{-1}H'g = H$. But doing the same "meditation" as above,
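$$g^{-1}H'g = H \iff g^{-1}h' g \in H \text{ for all } h'\in H' \iff h'g H = gH \text{ for all } h'\in H'.$$
(The first equivalence uses that $H'$ and $H$ are finite of the same order; in general the right-hand side only gives $g^{-1}H'g \subseteq H$.)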
Aha! We recognize again that this is a fixed point equation!
We have shown that $H',H$ (finite, of the same order) are conjugate $\iff$ there is a fixed point of the action $H' \curvearrowright G/H$ (all left cosets of $H$) given by $h' \cdot xH := h' xH$.
This is exactly what leads to Sylow II (as discussed in anon's answer, and also KConrad's writeup)
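Here is a brute-force sketch of the "conjugate iff fixed coset" criterion, again in $S_4$ and again with made-up helper names. I compare $H = \langle (0\,1\,2\,3) \rangle$ against another cyclic subgroup of order 4 (which is conjugate to $H$) and against the Klein four-group (same order, but not conjugate to $H$).

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(gen):
    """Cyclic subgroup generated by a single permutation."""
    identity = tuple(range(len(gen)))
    elems, x = {identity}, gen
    while x != identity:
        elems.add(x)
        x = compose(gen, x)
    return frozenset(elems)

def are_conjugate(H1, H2, G):
    """Is there g in G with g^{-1} H1 g = H2?"""
    return any(
        frozenset(compose(compose(inverse(g), h), g) for h in H1) == H2
        for g in G
    )

def has_fixed_coset(H1, H2, G):
    """Does H1, acting by left multiplication on G/H2, fix some coset g H2?"""
    for g in G:
        coset = frozenset(compose(g, h) for h in H2)
        if all(frozenset(compose(h1, x) for x in coset) == coset for h1 in H1):
            return True
    return False

G = list(permutations(range(4)))       # S_4
H = generate((1, 2, 3, 0))             # 4-cycle 0->1->2->3->0
H_prime = generate((1, 3, 0, 2))       # 4-cycle 0->1->3->2->0
klein = frozenset({(0, 1, 2, 3), (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)})

for K in (H_prime, klein):
    print(are_conjugate(K, H, G), has_fixed_coset(K, H, G))
```

For subgroups of the same finite order the two tests agree: both succeed for the second cyclic subgroup and both fail for the Klein four-group.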
Something very interesting I noticed: in Sylow I, we use that $0\equiv [G:H] \equiv [\text{N}(H):H] \mod p$ to show that in fact $\text{N}(H) \supsetneq H$ (i.e. strictly larger, i.e. non-trivial elements of $\text{N}(H)$).
But in Sylow II, we use that $0 \not\equiv [G:P] \equiv |\text{fix}((G/P) \curvearrowleft P')| \mod p$ to show existence of a fixed point.
So in both cases, we conclude non-trivial fixed points of some kind, but in one case we get the conclusion by being $\equiv 0 \mod p$ and in the other, we get the same conclusion by being $\not\equiv 0 \mod p$!!! That's kind of crazy!
Sylow II spoon feeds the correct next action: conjugation on $\mathcal {Syl}_p(G)$, the set of $p$-Sylow subgroups. Start with conjugation $G \curvearrowright \mathcal {Syl}_p(G)$ (which already gives us 2 results, one of which KConrad calls Sylow III*; see KConrad's table). Then, as anon's answer says, it is not a far walk to the idea of considering conjugation of just $P \curvearrowright \mathcal {Syl}_p(G)$, and then Sylow III follows.
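The two conjugation actions on $\mathcal{Syl}_p(G)$ can also be watched in a toy case. This is a minimal sketch for $G = S_4$ and $p = 3$ (my choice of example, with hypothetical helper names). It takes Sylow II for granted, so that the $G$-orbit of one $3$-Sylow subgroup $P$ is all of $\mathcal{Syl}_3(G)$, and then looks at the fixed points of $P$ acting on that set.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(gen):
    """Cyclic subgroup generated by a single permutation."""
    identity = tuple(range(len(gen)))
    elems, x = {identity}, gen
    while x != identity:
        elems.add(x)
        x = compose(gen, x)
    return frozenset(elems)

def conjugate(g, S):
    """Return g^{-1} S g."""
    g_inv = inverse(g)
    return frozenset(compose(compose(g_inv, s), g) for s in S)

G = list(permutations(range(4)))   # S_4, of order 24 = 2^3 * 3
P = generate((1, 2, 0, 3))         # 3-cycle 0->1->2->0, a 3-Sylow subgroup

# Orbit of P under conjugation by G; by Sylow II (assumed here, not
# verified by the code) this is the whole set of 3-Sylow subgroups.
sylow = {conjugate(g, P) for g in G}

# Fixed points of the restricted action of P on that set.
fixed_by_P = {Q for Q in sylow if all(conjugate(x, Q) == Q for x in P)}

n_p = len(sylow)
print(n_p, len(fixed_by_P))
print(n_p % 3 == 1, (len(G) // len(P)) % n_p == 0)
```

The only Sylow subgroup fixed by $P$ is $P$ itself, so the fixed point congruence gives $n_3 \equiv 1 \mod 3$, and the orbit count under all of $G$ gives $n_3 \mid 8$.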
Summary: the Sylow theorems pop right out, if you try to rephrase normalizers (and conjugation of subgroups) in terms of fixed points (O glorious $p$-group fixed point theorem!!!)