First, let us consider a special case rewriting of the theorem presented in your link.
Let $(X,\mathcal{M})$ be a measurable space, $(Y,\mathcal{N},\nu)$ a measure space, $F:X\to Y$ measurable map with measurable inverse. Also, let $F^*\nu:=(F^{-1})_{*}\nu$ be the pullback measure. Then, for any measurable function $f:Y\to [0,\infty]$, and any measurable subset $A\subset X$, we have
\begin{align}
\int_{F[A]}f\,d\nu=\int_A(f\circ F)\,d(F^*\nu).
\end{align}
I leave it to you to see how this follows from the version in the link. In order to ‘remember’ this theorem, I like to think of the following chain of equalities instead:
\begin{align}
\int_{F[A]}f\,d\nu=\int_{A}F^*(f\,d\nu)=\int_A(F^*f)\,d(F^*\nu):=\int_A(f\circ F)\,d(F^*\nu).
\end{align}
In this chain, the first and second terms are equal essentially by the definition of pullback of a measure. The first and last terms are equal by the quoted formula above. Finally, $F^*f:=f\circ F$ is standard notation for pullback of functions. I simply arranged the equalities in this manner because it is so reminiscent of the natural behavior exhibited by differential forms under pullback.
I'm also going to assume you know the following measure-theory results
- If $\phi:B\subset\Bbb{R}^n\to\Bbb{R}^m$, with $n\leq m$ is locally Lipschitz then it sends (Lebesgue) measure zero sets to (Lebesgue) measure zero sets, and hence Lebesgue-measurable sets to Lebesgue-measurable sets.
- If $T:\Bbb{R}^n\to\Bbb{R}^n$ is a linear transformation, then for every Lebesgue-measurable set $A\subset\Bbb{R}^n$, we have $\lambda(T(A))=|\det T|\lambda(A)$.
- The Radon-Nikodym theorem
- Lebesgue's differentiation theorem
Our hypothesis is that $F:\Omega\to F[\Omega]\subset\Bbb{R}^n$ is a $C^1$ diffeomorphism between open subsets of $\Bbb{R}^n$, and $f:F[\Omega]\to[0,\infty]$ is Lebesgue measurable.
Note that $F$ and its inverse are local diffeomorphisms and hence are locally-Lipschitz, so by (1), they are both Lebesgue-Lebesgue measurable (i.e preimages and direct images of Lebesgue-measurable sets are Lebesgue measurable), and they send Lebesgue measure-zero sets to Lebesgue measure-zero sets. In particular, this implies that for any Lebesgue-measurable subset $A\subset \Omega$, if $\lambda(A)=0$, then
\begin{align}
(F^*\lambda)(A):=((F^{-1})_{*}\lambda)(A):= \lambda\left((F^{-1})^{-1}(A)\right)=\lambda(F(A))=0.
\end{align}
Thus, $F^*\lambda\ll\lambda$. So, by the abstract change of variables theorem, and the Radon-Nikodym theorem, we have
\begin{align}
\int_{F[\Omega]}f\,d\lambda&=\int_{\Omega}(f\circ F)\cdot d(F^*\lambda)\tag{COV}\\
&=\int_{\Omega}(f\circ F)\cdot \frac{d(F^*\lambda)}{d\lambda}\,d\lambda.\tag{Radon-Nikodym}
\end{align}
Note that strictly speaking, the non-trivial Radon-Nikodym theorem is used to assert the existence of the Radon-Nikodym derivative $\frac{d(F^*\lambda)}{d\lambda}$; the subsequent equality of the integrals is actually a simple exercise.
By Lebesgue's differentiation theorem, for $\lambda$-a.e $x\in \Omega$, we have
\begin{align}
\frac{d(F^*\lambda)}{d\lambda}(x)&=\lim_{r\to 0^+}\frac{1}{\lambda(B(x,r))}\int_{B(x,r)}\frac{d(F^*\lambda)}{d\lambda}\,d\lambda\\
&=\lim_{r\to 0^+}\frac{1}{\lambda(B(x,r))}\int_{B(x,r)}d(F^*\lambda)\\
&=\lim_{r\to 0^+}\frac{(F^*\lambda)(B(x,r))}{\lambda(B(x,r))}\\
&=\lim_{r\to 0^+}\frac{\lambda(F(B(x,r)))}{\lambda(B(x,r))}.
\end{align}
Our job is thus to calculate this ratio. So, fix a point $x\in \Omega$, and observe that by writing $F=DF_x\circ \underbrace{(DF_x)^{-1}\circ F}_{:=\phi}$, and using point (2) above, we have
\begin{align}
\frac{\lambda(F(B(x,r)))}{\lambda(B(x,r))}=|\det DF_x|\frac{\lambda(\phi(B(x,r)))}{\lambda(B(x,r))}.
\end{align}
Now, observe that by the chain rule, and the fact that $(DF_x)^{-1}$ is a fixed linear transformation, we have for all points $y\in \Omega$, $D\phi_y=DF_x^{-1}\circ DF_y$. So, in particular, at the point $x$ we have $D\phi_x=\text{id}_{\Bbb{R}^n}$. Thus, our entire problem has been reduced to proving the following lemma (which is interesting in its own right):
Let $\phi:\Omega\to\Bbb{R}^n$ be a $C^1$ function, where $\Omega\subset\Bbb{R}^n$ is open, such that at a point $x\in \Omega$, we have $D\phi_x=\text{id}_{\Bbb{R}^n}$. Then, $\lim\limits_{r\to 0^+}\frac{\lambda(\phi(B(x,r)))}{\lambda(B(x,r))}=1$.
For the proof, fix $0<\epsilon<1$. Then, by definition of $D\phi_x=\text{id}_{\Bbb{R}^n}$, there exists a $\delta_1>0$ such that if $\|h\|\leq \delta_1$ then
\begin{align}
\|\phi(x+h)-\phi(x)-h\|\leq \epsilon\|h\|.
\end{align}
So (triangle inequality), for any $r<\delta_1$, we have $\phi(B(x,r))\subset B(\phi(x),(1+\epsilon)r)$.
Next, by the inverse function theorem, $\phi$ is a local $C^1$ diffeomorphism with $D(\phi^{-1})_{\phi(x)}=\text{id}_{\Bbb{R}^n}$ as well, so we can apply the same reasoning as above to deduce there exists $\delta_2>0$ such that for any $r<\delta_2$, we have $\phi^{-1}(B(\phi(x),r))\subset B(x,(1+\epsilon),r)$, or equivalently, $B(\phi(x),r)\subset \phi(B(x,(1+\epsilon)r))$. Thus, if $0<r<\min(\delta_1,\delta_2)$ then
\begin{align}
B(\phi(x),(1-\epsilon)r)&\subset \phi\bigg(B(x, (1+\epsilon)(1-\epsilon)r)\bigg)\\
&\subset\phi(B(x,r))\\
&\subset B(\phi(x),(1+\epsilon)r).
\end{align}
Thus, taking measures and dividing by the measure of $B(x,r)$, we get the inequality
\begin{align}
\frac{\lambda(B(\phi(x),(1-\epsilon)r))}{\lambda(B(x,r))}\leq \frac{\lambda(\phi(B(x,r)))}{\lambda(B(x,r))}
\leq \frac{\lambda(B(\phi(x),(1+\epsilon)r))}{\lambda(B(x,r))}.
\end{align}
Now, recall that the Lebesgue measure is translation-invariant and that the measure of balls scales as the $n^{th}$ power of the radius. Thus,
\begin{align}
(1-\epsilon)^n\leq \frac{\lambda(\phi(B(x,r)))}{\lambda(B(x,r))} \leq (1+\epsilon)^n.
\end{align}
Since $0<\epsilon<1$ was arbitrary, this shows $\lim\limits_{r\to 0^+}\frac{\lambda(\phi(B(x,r)))}{\lambda(B(x,r))}$ exists and equals $1$. Thus, the entire proof is complete.
To recap:
- We first invoked the general change of variables theorem.
- Next, we used the Radon-Nikodym theorem to express the integral with respect to $F^*\lambda$ as an integral with respect to $\lambda$, so our problem is now to calculate the Radon-Nikodym derivative $\frac{d(F^*\lambda)}{d\lambda}$.
- Lebesgue's differentiation theorem tells us that the Radon-Nikodym derivative can actually be calculated as a limit of a quotient of the two measures applied to balls (or really any other nicely shrinking set).
- We then used the fact that a linear transformation $T$ (in our case $DF_x$ for a fixed $x\in\Omega$) distort Lebesgue measure by a factor of $|\det T|$, in order to reduce to the case where $DF_x=\text{id}_{\Bbb{R}^n}$.
- Finally, we used that lemma to show the limit is $1$.