I don't know the answer, so this is just a long comment.
If I understand correctly, your formula only works if $f$ is bijective. I also think we need $f$ to be (continuously?) differentiable with non-zero derivative everywhere. If someone could comment on these requirements, that would be great.
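We could first try weakening this to allow $f$ to be a surjective, continuously differentiable function that is not necessarily injective. In this case, we expect that $p_{f(X)}(y)$ should be a sum over the preimages of $y$ (assuming there are only finitely or countably many, and that $f'$ is invertible at each of them). Like so: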
$$p_{f(X)}(y) = \sum_{x \in f^{-1}(y)} \frac{p_X(x)}{|\det f'(x)|}$$
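As a sanity check on the sum-over-preimages idea, here is a minimal sketch for one specific case that I picked myself: $X$ standard normal and $f(x) = x^2$, so every $y > 0$ has the two preimages $\pm\sqrt{y}$ and $f'(x) = 2x$. Each preimage contributes $p_X(x)$ divided by $|f'(x)|$. The function names are only illustrative. The result is compared with the known chi-square density with one degree of freedom, $e^{-y/2}/\sqrt{2\pi y}$.

```python
import math

def normal_pdf(x):
    # Density of the standard normal distribution (the assumed p_X).
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def density_of_square(y):
    # Density of Y = X^2 as a sum over the preimages of y under f(x) = x^2.
    # Each preimage x contributes p_X(x) / |f'(x)| with f'(x) = 2x.
    if y <= 0:
        return 0.0
    preimages = (math.sqrt(y), -math.sqrt(y))
    return sum(normal_pdf(x) / abs(2.0 * x) for x in preimages)

def chi2_1_pdf(y):
    # Known closed form: chi-square density with one degree of freedom.
    return math.exp(-0.5 * y) / math.sqrt(2.0 * math.pi * y)

for y in (0.1, 1.0, 2.5, 7.0):
    print(y, density_of_square(y), chi2_1_pdf(y))
```

The two columns should agree up to floating-point error. This only tests one example, of course; it says nothing about which regularity conditions on $f$ are needed in general.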
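For a function $f : \mathbb{R}^2 \rightarrow \mathbb{R}$, I expect that the appropriate variant on this would usually be some kind of an integral over the level set $f^{-1}(y)$, which is typically a curve. The derivative $f'(x)$ is no longer a square matrix, so its determinant is not defined; the coarea formula suggests using the norm of the gradient instead. Something like $$p_{f(X)}(y) = \int_{f^{-1}(y)} \frac{p_X(x)}{|\nabla f(x)|} \, dH^1(x)$$
where $H^1$ is the one-dimensional Hausdorff measure (arc length) on the level set, with possibly some further subtleties involved, for example at points where $\nabla f(x) = 0$.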
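To see what such a level-set integral looks like in practice, here is a rough numerical sketch for an example of my own choosing: $X$ standard normal on $\mathbb{R}^2$ and $f(x) = x_1^2 + x_2^2$. The level set $f^{-1}(y)$ is the circle of radius $\sqrt{y}$, and the integrand is taken to be $p_X(x)$ divided by the gradient norm $|\nabla f(x)| = 2|x|$, integrated with respect to arc length. The names and the midpoint rule are just assumptions for illustration. The known answer is the chi-square density with two degrees of freedom, $\tfrac{1}{2} e^{-y/2}$.

```python
import math

def p_X(x1, x2):
    # Density of the standard normal distribution on R^2 (the assumed p_X).
    return math.exp(-0.5 * (x1 * x1 + x2 * x2)) / (2.0 * math.pi)

def grad_norm_f(x1, x2):
    # f(x) = x1^2 + x2^2 has gradient (2*x1, 2*x2).
    return 2.0 * math.hypot(x1, x2)

def density_via_level_set(y, n=2000):
    # Integrate p_X / |grad f| over the circle f^{-1}(y) with respect to
    # arc length, using a midpoint rule in the angle theta.
    if y <= 0:
        return 0.0
    r = math.sqrt(y)
    d_theta = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * d_theta
        x1, x2 = r * math.cos(theta), r * math.sin(theta)
        arc_length_element = r * d_theta
        total += p_X(x1, x2) / grad_norm_f(x1, x2) * arc_length_element
    return total

for y in (0.5, 1.0, 3.0):
    print(y, density_via_level_set(y), 0.5 * math.exp(-0.5 * y))
```

Here the integrand is constant along the circle, so the quadrature is essentially exact; for a less symmetric $f$ one would need a parametrization of each level set, which is where the subtleties start.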
Another direction in which this might be generalized is to allow $f$ to be injective but not necessarily surjective. This is to cover cases like $f : \mathbb{R} \rightarrow \mathbb{R}^2$, where $f(X)$ is concentrated on a curve and therefore has no density with respect to the Lebesgue measure on $\mathbb{R}^2$. The notion of a Hausdorff measure seems relevant. For example, it might be possible to get a density function for $f(X)$ not with respect to $H^2_{\mathbb{R}^2}$ (the two-dimensional Hausdorff measure on $\mathbb{R}^2$, which agrees with the Lebesgue measure) but with respect to $H^1_{\mathbb{R}^2}$ (the one-dimensional Hausdorff measure, which measures length). In the comments, drhab suggests a different and more technical proposal of using "local Lebesgue measures". I'm not qualified to comment on such things, unfortunately, but that might be worth reading about.