Do probability distributions form a comonad?

Question

$\def\unit{{\rm unit}}\def\join{{\rm join}}$It's well known that (discrete) probability distributions form a monad. Specifically, if we let $PX$ be the set of discrete probability distributions on elements of $X$, and notate them as a set of pairs $(x,p)$ such that $\sum p=1$, then we have natural transformations

$$\begin{align} \unit : X & \to PX \\ \unit : x & \mapsto \{ (x,1) \} \\ \\ \join: P(PX) & \to PX \\ \join: D & \mapsto \{(y,pq)| (x,p) \in D, (y,q)\in x \} \end{align}$$

that satisfy the monad laws.
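
As a sanity check, the definitions above can be sketched in a few lines of Python. This is only an illustrative model, not anything from the question itself: the helper names `dist`, `fmap`, `unit` and `join` are hypothetical, probabilities are exact `Fraction`s, and like outcomes are assumed to be collected with their probabilities summed.

```python
from collections import defaultdict
from fractions import Fraction


def dist(pairs):
    """Build a distribution from (outcome, probability) pairs.

    Assumption: like outcomes are collected and their probabilities summed.
    The result is a frozenset, so a distribution can itself be an outcome.
    """
    acc = defaultdict(Fraction)
    for x, p in pairs:
        acc[x] += Fraction(p)
    return frozenset((x, p) for x, p in acc.items() if p != 0)


def unit(x):
    # The point distribution that always selects x.
    return dist([(x, 1)])


def fmap(f, d):
    # Action of P on a function f : X -> Y (the pushforward distribution).
    return dist((f(x), p) for x, p in d)


def join(dd):
    # Flatten a distribution over distributions by multiplying probabilities.
    return dist((y, p * q) for inner, p in dd for y, q in inner)


D = dist([("a", Fraction(1, 4)), ("b", Fraction(3, 4))])
E = dist([("b", Fraction(1, 2)), ("c", Fraction(1, 2))])
DD = dist([(D, Fraction(1, 3)), (E, Fraction(2, 3))])
DDD = dist([(DD, Fraction(1, 2)), (unit(D), Fraction(1, 2))])

flattened = join(DD)
left_unit_ok = join(unit(D)) == D
right_unit_ok = join(fmap(unit, D)) == D
assoc_ok = join(join(DDD)) == join(fmap(join, DDD))
print(left_unit_ok, right_unit_ok, assoc_ok)
```

Checking the laws on one example is of course not a proof; it only confirms that the formulas for $\unit$ and $\join$ behave as expected on this input.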

Can probability distributions be made into a comonad as well? For that, we would need to provide natural transformations

$$\begin{align} {\rm counit} : PX & \to X \\ {\rm cojoin} : PX & \to P(PX) \end{align}$$

that satisfy the comonad laws. It seems that the role of counit can be played by mathematical expectation (as long as $X$ is an $\mathbb{R}$-module), but in that case what is the correct definition of cojoin?

Edit:

Zhen Lin pointed out in the comments that if you want to have counit being expectation, then you need an $\mathbb{R}$-module structure on $PX$ as well as on $X$. The module operations on $PX$ are inherited from those on $X$ in the following way:

Addition

$$D_1 + D_2 = \{ (x+y,pq) | (x,p)\in D_1, (y,q)\in D_2\}$$

Multiplication by a scalar

$$qD = \{ (qx,p) | (x,p)\in D \}$$
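
A small Python sketch of these two operations may help when checking what structure they do and do not give. It assumes outcomes are exact rationals and that like terms are collected with their probabilities summed; the names `dist`, `add`, `scale` and `expect` are hypothetical helpers, not part of the question.

```python
from collections import defaultdict
from fractions import Fraction


def dist(pairs):
    """Collect like outcomes and sum their probabilities (an assumption)."""
    acc = defaultdict(Fraction)
    for x, p in pairs:
        acc[x] += Fraction(p)
    return frozenset((x, p) for x, p in acc.items() if p != 0)


def unit(x):
    return dist([(x, 1)])


def add(d1, d2):
    # D1 + D2: distribution of the sum of two independent draws.
    return dist((x + y, p * q) for x, p in d1 for y, q in d2)


def scale(c, d):
    # cD: rescale every outcome, keep the probabilities.
    return dist((c * x, p) for x, p in d)


def expect(d):
    # Expectation of a distribution over numbers.
    return sum(p * x for x, p in d)


D = dist([(0, Fraction(1, 2)), (1, Fraction(1, 2))])
difference = add(D, scale(-1, D))

# The expectation is additive, but D + (-1)D is not the point mass at 0.
print(expect(difference) == 0, difference == unit(0))
```

With these definitions the expectation is compatible with both operations, yet $D + (-1)D$ spreads over $-1$, $0$ and $1$ rather than collapsing to the point distribution at $0$.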

Your description of the unit of $P$ monad seems to be incorrect. Do you mean to have $1 / |{X}|$ instead of $1$? In that case, I suppose you must also assume that $X$ is finite. — Zhen Lin, Jun 28 '12 at 07:42
No - here $X$ is a set (the sample space), and ${\rm unit}(x)$ is the trivial distribution over $x\in X$, i.e. the distribution which always selects $x$. — Chris Taylor, Jun 28 '12 at 07:46
Ah, yes. That makes sense. I don't think you can make expectation into the comonad counit, because $P X$ is not a vector space even when $X$ is. — Zhen Lin, Jun 28 '12 at 07:49
It's possible to define a vector space structure on $PX$. I'll edit the question to include that. — Chris Taylor, Jun 28 '12 at 07:59
Your formula for $D_1 + D_2$ will misbehave when there are $x_1, x_2, y_1, y_2$ such that $x_1 + y_1 = x_2 + y_2$. — Zhen Lin, Jun 28 '12 at 09:13
Yes. There's an implicit assumption being made that like terms are collected and their probabilities summed. An alternative is to consider the containing data structure to be a multiset rather than a set, I suppose. You're touching on the problem of restricted monads which has been the bane of implementors of functional programming languages for the last 15 or so years :) — Chris Taylor, Jun 28 '12 at 09:17
Either way, it doesn't define a vector space structure. $D - D \ne 0$ in general. — Zhen Lin, Jun 28 '12 at 09:20
Good point. Perhaps I don't actually need the vector space structure at all - to compute expectations you just need scalar multiplication and addition. It seems as though an $\mathbb{R}$-module is probably sufficient. More edits coming... — Chris Taylor, Jun 28 '12 at 09:26
An $\mathbb{R}$-module is the same thing as a vector space over $\mathbb{R}$. I don't see any way of making scalar multiplication and addition behave nicely – even if we pass from rings to rigs. — Zhen Lin, Jun 28 '12 at 09:30
But crucially, it doesn't have a requirement for additive inverses, so the fact that $D-D\neq 0$ is not important, because we haven't even bothered to define $D-D$. — Chris Taylor, Jun 28 '12 at 09:31
Oops, I don't mean $\mathbb{R}$-module. I'm not exactly sure what structure I'm looking for. Something like an additive abelian group combined with an $\mathbb{R}$-set, I think. — Chris Taylor, Jun 28 '12 at 09:33
I've already explained why your proposed addition cannot be the addition operation of an abelian group. There is an algebraic structure for which your functor is a comonad – namely the structure of a $P$-algebra. But this is completely tautological. — Zhen Lin, Jun 28 '12 at 09:52
This is an interesting question. Could you tell us your motivation? Is it just plain curiosity or do you have a particular application in mind where you need some (comonadic) notion of composition on probabilities? I have not found an answer yet. One thought I had was, perhaps you could define cojoin to be a conditional probability with itself? e.g. cojoin [(10, 0.25), (5, 0.75)] = [([(10, 0.0625), (5, 0.1875)], 0.25), ([(10, 0.1875), (5, 0.5625)], 0.75)], but this does not satisfy the laws (only right identity) — dorchard, Jul 26 '13 at 10:05
Thinking "contextually" (as one might do to give a computational interpretation for comonads), cojoin is then like saying, "what is the probability of me following this path given that my 'context' is that I have already taken a particular path". — dorchard, Jul 26 '13 at 10:15
@dorchard I don't remember what my motivation was when I asked this - and I don't think I ever resolved the question one way or the other, either. I'll try to think about it over the weekend! — Chris Taylor, Jul 26 '13 at 10:24
@ChrisTaylor With the module definition you give in your question there is a comonad-like structure which has the associativity and right-unit laws, but not left unit, where $\delta \, D \mapsto \{ \{ (x, q) \, | \, (y, q) \in D \} \, | \, (x, p) \in D \}$, e.g., $\delta \{(1, 0.25), (2, 0.75)\} \mapsto \{\{(1, 0.25), (1, 0.75)\}, \{(2, 0.25), (2, 0.75)\}\}$. There are different definitions of the module and $\delta$ which give a comonad-like structure with a left-unit, however I have not yet found one that has both units. — dorchard, Aug 02 '13 at 11:31
It's worth noting that counit could also be sampling rather than expectation, if you use a sampling function-based probability monad à la Park et al 2008. — jtobin, Nov 22 '15 at 00:58
I don't think counit is valid. counit would have to be a natural transformation. Take a uniform distribution over $\{-1,1\}$. If you take the expected value you get $0$, and then square that and you get $0$. Square it first and you get a guaranteed value of $1$, and taking the expected value of that gives $1$. Therefore, counit isn't a natural transformation. — Christopher King, Mar 03 '16 at 11:21
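
The naturality objection in the last comment can be replayed mechanically. The sketch below is a hedged illustration with hypothetical helper names (`dist`, `fmap`, `expect`); it assumes exact rational probabilities and that like outcomes are merged.

```python
from collections import defaultdict
from fractions import Fraction


def dist(pairs):
    """Collect like outcomes and sum their probabilities (an assumption)."""
    acc = defaultdict(Fraction)
    for x, p in pairs:
        acc[x] += Fraction(p)
    return frozenset((x, p) for x, p in acc.items() if p != 0)


def fmap(f, d):
    # Pushforward of the distribution d along f.
    return dist((f(x), p) for x, p in d)


def expect(d):
    # Candidate counit: the expectation of a distribution over numbers.
    return sum(p * x for x, p in d)


def square(x):
    return x * x


D = dist([(-1, Fraction(1, 2)), (1, Fraction(1, 2))])

# Naturality would require these two values to agree for every function f.
expect_then_square = square(expect(D))
square_then_expect = expect(fmap(square, D))
print(expect_then_square, square_then_expect)
```

The two composites disagree for squaring, so expectation can only be natural with respect to a restricted class of maps.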

score 2 · Answer 1 · answered Jan 24 '20 at 18:33

As you point out, in order to have a counit $PX\to X$, you need $X$ to be equipped with an operation of taking "midpoints" of some kind. Otherwise, it is not clear what that map may be. In other words, it would work if you want $X$ to be an algebra over $P$.

Now, note that the algebras of the distribution monad $P$ have been characterized; they are sometimes called "convex spaces". A natural choice of objects on which $P$ induces a comonad is then obtained by restricting to the category of $P$-algebras (where $P$ is considered as a monad). The counit is then the algebra structure map $PX\to X$, which forms convex combinations. Note that if we want such a map to be natural, then we have to restrict the morphisms too: only the morphisms of $P$-algebras would do. In other words, $P$ induces a comonad on the Eilenberg-Moore category $\mathrm{Set}^P$.
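
To make this concrete, here is a hedged Python sketch that takes $X=\mathbb{Q}$ with expectation as the structure map. It checks the two $P$-algebra laws on one example and checks naturality against an affine map, which preserves convex combinations. The helper names and the particular map `affine` are illustrative assumptions.

```python
from collections import defaultdict
from fractions import Fraction


def dist(pairs):
    """Collect like outcomes and sum their probabilities (an assumption)."""
    acc = defaultdict(Fraction)
    for x, p in pairs:
        acc[x] += Fraction(p)
    return frozenset((x, p) for x, p in acc.items() if p != 0)


def unit(x):
    return dist([(x, 1)])


def fmap(f, d):
    return dist((f(x), p) for x, p in d)


def join(dd):
    return dist((y, p * q) for inner, p in dd for y, q in inner)


def expect(d):
    # Structure map PX -> X for X = Q: form the convex combination.
    return sum(p * x for x, p in d)


def affine(x):
    # Hypothetical example of a map that preserves convex combinations.
    return 2 * x + 1


D = dist([(0, Fraction(1, 4)), (4, Fraction(3, 4))])
E = dist([(1, Fraction(1, 2)), (3, Fraction(1, 2))])
DD = dist([(D, Fraction(1, 3)), (E, Fraction(2, 3))])

algebra_unit_ok = expect(unit(Fraction(5))) == 5
algebra_mult_ok = expect(join(DD)) == expect(fmap(expect, DD))
natural_for_affine = expect(fmap(affine, D)) == affine(expect(D))
print(algebra_unit_ok, algebra_mult_ok, natural_for_affine)
```

Affine maps are exactly the kind of morphism that commutes with taking expectations, which is why restricting the morphisms restores naturality.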

Here is how the comonad arises. Let's first see this abstractly. The forgetful functor $\mathrm{Set}^P\to\mathrm{Set}$ has a left adjoint induced by $P$. Every adjunction induces a monad as well as a comonad: the monad on $\mathrm{Set}$ is just $P$, and the comonad that you get on $\mathrm{Set}^P$ is the one we want.

Now more concretely: the functor of the comonad is really just $P$, except that we consider it on $\mathrm{Set}^P$. The counit of the comonad is the map we just explained, and the comultiplication $PX\to PPX$ is given by $P\delta$, where $\delta:X\to PX$ is the unit of the old monad.
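
The comonad laws for this structure can be checked on an example. The following Python sketch is an illustration under stated assumptions: $X=\mathbb{Q}$ with expectation as its algebra map, the free algebra $PX$ with `join` as its algebra map, and hypothetical helper names throughout (`cojoin` stands for $P\delta$).

```python
from collections import defaultdict
from fractions import Fraction


def dist(pairs):
    """Collect like outcomes and sum their probabilities (an assumption)."""
    acc = defaultdict(Fraction)
    for x, p in pairs:
        acc[x] += Fraction(p)
    return frozenset((x, p) for x, p in acc.items() if p != 0)


def unit(x):
    return dist([(x, 1)])


def fmap(f, d):
    return dist((f(x), p) for x, p in d)


def join(dd):
    # Algebra map of the free algebra PX, hence the counit at PX.
    return dist((y, p * q) for inner, p in dd for y, q in inner)


def expect(d):
    # Algebra map of X = Q, hence the counit at X.
    return sum(p * x for x, p in d)


def cojoin(d):
    # Comultiplication: apply the monad unit to every outcome.
    return fmap(unit, d)


D = dist([(Fraction(10), Fraction(1, 4)), (Fraction(5), Fraction(3, 4))])

counit_outer_ok = join(cojoin(D)) == D          # counit at PX after cojoin
counit_inner_ok = fmap(expect, cojoin(D)) == D  # P(counit at X) after cojoin
coassoc_ok = cojoin(cojoin(D)) == fmap(cojoin, cojoin(D))
print(counit_outer_ok, counit_inner_ok, coassoc_ok)
```

Note that the comultiplication only ever produces point distributions on the inside, which is consistent with the comonad being tautological in the sense mentioned in the comments.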

Note that in general $PX$ is not a vector space, but it's still a convex space, so it has a notion of midpoint (as a $P$-algebra, it is a free one). The operation of midpoint is given exactly by the multiplication $\mu:PPX\to PX$ of the monad.
