If $X$ is a discrete random variable, its entropy $H(X)$ is usually defined as something along the lines of $-\sum_x \mathbb{P}(x) \log_2 \mathbb{P}(x)$, where the sum ranges over all the possible values $x$ of $X$.
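For concreteness, here is the sort of computation I have in mind: for a fair coin, with $\mathbb{P}(\text{heads}) = \mathbb{P}(\text{tails}) = 1/2$, this gives
$$ H(X) = -\tfrac{1}{2}\log_2\tfrac{1}{2} - \tfrac{1}{2}\log_2\tfrac{1}{2} = 1 \text{ bit}. $$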
I have seen a few expositions of the problem of extending this definition to continuous random variables. I take it that the standard counterpart to $H$ in the continuous case is the so-called differential entropy, denoted by $h(X)$, and defined as
$$ h(X) = -\int f(x)\log_2(f(x))\, dx\,, $$
where $f$ is the probability density of $X$, and the integral runs over the (continuous) set of values of $X$.
This definition is, of course, contingent on the existence of the integral; but even setting aside the existence question, I'm a bit confused about the proper way to interpret this integral.
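To make my confusion concrete: if $X$ is uniform on $[0, a]$, so that $f \equiv 1/a$ on that interval, then (unless I'm miscomputing) the definition gives
$$ h(X) = -\int_0^a \frac{1}{a}\log_2\!\frac{1}{a}\, dx = \log_2 a \,, $$
which is negative whenever $a < 1$. So whatever $h$ measures, it evidently cannot be read as a straightforward analogue of the discrete entropy.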
None of the derivations I've seen (of the passage to the continuous case) is terribly concerned with mathematical subtleties such as well-definedness, convergence, etc. A possible explanation for this attitude is that all of these derivations are based on the Riemann integral formalism, which strikes me as ill-suited for reasoning about the convergence of this particular integral.
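For reference, the derivation I keep seeing runs roughly as follows (my paraphrase, so the details may be off): quantize $X$ into bins of width $\Delta$, so that the bin containing $x_i$ has probability $\mathbb{P}(X_\Delta = x_i) \approx f(x_i)\Delta$, and then, using $\sum_i f(x_i)\Delta \approx 1$,
$$ H(X_\Delta) \approx -\sum_i f(x_i)\Delta \,\log_2\big(f(x_i)\Delta\big) \approx -\sum_i f(x_i)\Delta \,\log_2 f(x_i) \;-\; \log_2 \Delta \,. $$
As $\Delta \to 0$, the first term is treated as a Riemann sum converging to $h(X)$, while the $-\log_2\Delta$ term diverges and is simply discarded. It is precisely steps like these that I would like to see handled rigorously.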
I'm looking for a (hopefully mathematically rigorous) measure-theoretic approach to this generalization of the entropy to the continuous case. I'd appreciate any pointers.