
I want to find the number of ordered, unlabeled binary rooted trees with $n$ nodes and $k$ leaves as an exercise.

To be more precise, I am interested in objects like the ones in this figure ((c) 2015 M. Fulmek, PS Kombinatorik), where the line below $n=4$ reads “… and the 7 trees above vertically flipped”.

Assign the weight $w(W) = z^n y^k$ to every rooted tree $W$ with $n$ nodes and $k$ leaves. The generating function in two indeterminates should then start like this:

$$ T(z, y) = \sum_W w(W) = zy + z^2 2y + z^3\left(y^2 + 4y\right) + z^4\left(6y^2 + 8y\right) + \ldots $$

I know that the number of ordered, unlabeled binary rooted trees with $n$ nodes is

$$ \bar T (z) = T(z, 1) = \sum_{n \geq 0} \frac{1}{n + 1} {2n \choose n} z^n. $$

However, I do not know how to proceed from here. How can I express the number of leaves in terms of species?

I am aware of this question. However, we did not cover Lagrange inversion in class. So I believe there should be a solution without applying it.

tim6her

2 Answers


With this question we run into the problem of determining exactly what the notation is supposed to mean and which family of trees from among the many possibilities is being referenced. Note that the quoted series with leaves not marked includes a tree of size zero, which does not match the quoted expansion. If we do use the quoted expansion as the problem definition it appears the species here is

$$\mathcal{T} = \mathcal{Z}\mathcal{Y} + \mathcal{Z}\mathcal{T} + \mathcal{Z}\mathcal{T} + \mathcal{Z}\mathcal{T}^2.$$

We chose this interpretation because the OP says that there are four trees on three nodes with one leaf, which upon making a diagram reveals itself to be four paths ending in a node marked with a leaf. For this to happen we must permit internal nodes that have one rather than two children, so these trees are not full. This is what the species equation does: we have the base case of a leaf node, an internal node with a left child, an internal node with a right child or an internal node having two children.
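As a sanity check on this interpretation, the trees can be enumerated directly. The following is a minimal sketch, assuming a tree is modelled as a pair `(left, right)` with `None` standing for a missing child; the function names `trees`, `leaves` and `leaf_distribution` are illustrative and not part of the original post.

```python
from collections import Counter
from functools import lru_cache


@lru_cache(maxsize=None)
def trees(n):
    """All trees with n nodes; a tree is (left, right), None is a missing child."""
    if n == 0:
        return (None,)  # the only "tree" with no nodes is the missing child
    result = []
    for left_size in range(n):
        for left in trees(left_size):
            for right in trees(n - 1 - left_size):
                result.append((left, right))
    return tuple(result)


def leaves(tree):
    """Number of nodes of the tree that have no children at all."""
    if tree is None:
        return 0
    left, right = tree
    if left is None and right is None:
        return 1
    return leaves(left) + leaves(right)


def leaf_distribution(n):
    """Map k -> number of trees with n nodes and k leaves."""
    counts = Counter(leaves(t) for t in trees(n))
    return dict(sorted(counts.items()))


print(leaf_distribution(3))  # {1: 4, 2: 1}
print(leaf_distribution(4))  # {1: 8, 2: 6}
```

The four cases of the species equation correspond to the four ways the pair `(left, right)` can have `None` entries, and the counts for $n=3$ and $n=4$ agree with the expansion quoted in the question.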

This yields the following equation for the bivariate generating function:

$$T(z,y) = yz + 2zT(z,y) + zT(z,y)^2.$$
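The functional equation can also be read as a recurrence on the coefficient of $z^n$: the term $yz$ gives the single tree on one node, $2zT$ contributes twice the previous row, and $zT^2$ contributes a convolution. Here is a minimal sketch of that recurrence; the name `bivariate_coefficients` and the dictionary representation of polynomials in $y$ are assumptions made for illustration.

```python
def bivariate_coefficients(max_n):
    """coeff[n][k] = [z^n y^k] T(z, y), computed from T = yz + 2zT + zT^2."""
    coeff = [dict() for _ in range(max_n + 1)]
    if max_n >= 1:
        coeff[1] = {1: 1}  # the term yz: one node, which is a leaf
    for n in range(2, max_n + 1):
        row = {}
        # contribution of 2 z T: root with exactly one child (left or right)
        for k, c in coeff[n - 1].items():
            row[k] = row.get(k, 0) + 2 * c
        # contribution of z T^2: root with two non-empty subtrees
        for i in range(1, n - 1):
            j = n - 1 - i
            for k1, c1 in coeff[i].items():
                for k2, c2 in coeff[j].items():
                    row[k1 + k2] = row.get(k1 + k2, 0) + c1 * c2
        coeff[n] = row
    return coeff


rows = bivariate_coefficients(7)
for n in range(1, 8):
    print(n, dict(sorted(rows[n].items())))
```

The rows for $n \le 4$ reproduce the expansion quoted in the question, which supports this choice of species.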

Now clearly Lagrange inversion or a similar technique is the preferred way to treat this, but according to the OP it may not be used. Solving this quadratic equation for $T(z,y)$ and choosing the branch with $T(0,y)=0$ (the one that yields Catalan numbers at $y=1$) we obtain

$$T(z, y) = \frac{1-2z-\sqrt{1-4z(1-z+yz)}}{2z}.$$

Extracting coefficients we have with $n\ge 1$ (no empty trees in this species):

$$[z^n] T(z, y) = [z^n] \frac{1-2z-\sqrt{1-4z(1-z+yz)}}{2z} \\ = [z^{n+1}] \frac{1-2z-\sqrt{1-4z(1-z+yz)}}{2} \\ = - [z^{n+1}] \frac{1}{2} \sqrt{1-4z(1-z+yz)}.$$

Observe that

$$\sqrt{1-4w} = 1 + \sum_{q\ge 1} {1/2\choose q} 4^q (-1)^q w^q$$

and

$$2^{2q} (-1)^q {1/2\choose q} = \frac{2^{2q}}{q!} (-1)^q \prod_{p=0}^{q-1} (1/2-p) = \frac{2^{q}}{q!} \prod_{p=0}^{q-1} (2p-1) \\ = - \frac{2^{q}}{q!} \prod_{p=1}^{q-1} (2p-1) = - \frac{2^{q}}{q!} \frac{(2q-2)!}{2^{q-1}\times (q-1)!} = - \frac{2}{q} {2q-2\choose q-1}.$$
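The identity for the generalized binomial coefficient can be confirmed with exact rational arithmetic. This is a small sketch using only the standard library; the helper names `binom_half`, `lhs` and `rhs` are made up for the illustration.

```python
from fractions import Fraction
from math import comb, factorial


def binom_half(q):
    """Generalized binomial coefficient (1/2 choose q) as an exact fraction."""
    product = Fraction(1)
    for p in range(q):
        product *= Fraction(1, 2) - p
    return product / factorial(q)


def lhs(q):
    """2^(2q) (-1)^q (1/2 choose q)."""
    return 4**q * (-1) ** q * binom_half(q)


def rhs(q):
    """-(2/q) (2q-2 choose q-1)."""
    return Fraction(-2, q) * comb(2 * q - 2, q - 1)


for q in range(1, 6):
    print(q, lhs(q), rhs(q))
```

Both sides agree for every $q \ge 1$ tried here; for instance $q=1$ and $q=2$ both give $-2$.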

Returning to the coefficient extraction we have

$$[z^{n+1}] \sum_{q\ge 1} \frac{1}{q} {2q-2\choose q-1} z^q (1-z+yz)^q \\ = \sum_{q=1}^{n+1} \frac{1}{q} {2q-2\choose q-1} [z^{n+1}] z^q (1-z+yz)^q \\ = \sum_{q=1}^{n+1} \frac{1}{q} {2q-2\choose q-1} [z^{n+1-q}](1-z(1-y))^q \\ = \sum_{q=1}^{n+1} \frac{1}{q} {2q-2\choose q-1} {q\choose n+1-q} (-1)^{n+1-q} (1-y)^{n+1-q}.$$

We thus have for the statistic of $n$ nodes and $k$ leaves the closed form

$$\bbox[5px,border:2px solid #00A000]{ (-1)^{n+1-k} \sum_{q=1}^{n+1} \frac{(-1)^q}{q} {2q-2\choose q-1} {q\choose n+1-q} {n+1-q\choose k}.}$$
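Before simplifying further, this alternating sum can be evaluated numerically. The sketch below assumes $1 \le k \le n$ and uses the fact that $\frac{1}{q}{2q-2\choose q-1}$ is the Catalan number $C_{q-1}$, so integer division is exact; the name `leaves_alternating_sum` is illustrative.

```python
from math import comb


def leaves_alternating_sum(n, k):
    """Evaluate the alternating-sum closed form for n nodes and k leaves (1 <= k <= n)."""
    total = 0
    for q in range(1, n + 2):
        catalan = comb(2 * q - 2, q - 1) // q  # (1/q) * binomial(2q-2, q-1), an integer
        total += (-1) ** q * catalan * comb(q, n + 1 - q) * comb(n + 1 - q, k)
    return (-1) ** (n + 1 - k) * total


for n in range(1, 8):
    print(n, [leaves_alternating_sum(n, k) for k in range(1, n + 1)])
```

The row for $n=4$ starts with $8, 6$, matching the coefficient $z^4(6y^2+8y)$ in the question.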

We continue by observing that

$$\frac{1}{q} {2q-2\choose q-1} {q\choose n+1-q} {n+1-q\choose k} \\ = \frac{(2q-2)!}{(q-1)!\times (2q-n-1)! \times k! \times (n+1-q-k)!} \\ = {n-k\choose q-1} \frac{(2q-2)!}{(n-k)! \times (2q-n-1)! \times k!} \\ = {n-k\choose q-1} {n\choose k} \frac{1}{2q-1} {2q-1\choose n}.$$

We get for the sum

$$(-1)^{n+1-k} {n\choose k} \sum_{q=1}^{n+1} \frac{(-1)^q}{2q-1} {n-k\choose q-1} {2q-1\choose n} \\ = (-1)^{n+1-k} {n\choose k} \sum_{q=0}^{n} \frac{(-1)^{q+1}}{2q+1} {n-k\choose q} {2q+1\choose n}$$

or (recall that $n\ge 1$)

$$\bbox[5px,border:2px solid #00A000]{ \frac{1}{n} (-1)^{n-k} {n\choose k} \sum_{q=0}^{n} (-1)^q {n-k\choose q} {2q\choose n-1}.}$$

Working on the inner term we find

$$\sum_{q=0}^{n} (-1)^q {n-k\choose q} [w^{n-1}] (1+w)^{2q} \\ = [w^{n-1}] \sum_{q=0}^{n} (-1)^q {n-k\choose q} (1+w)^{2q} \\ = [w^{n-1}] (1-(1+w)^2)^{n-k} = (-1)^{n-k} [w^{n-1}] w^{n-k} (2+w)^{n-k} \\ = (-1)^{n-k} [w^{k-1}] (2+w)^{n-k} = (-1)^{n-k} {n-k\choose k-1} 2^{n+1-2k}.$$
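The evaluation of the inner sum can be spot-checked numerically. In this sketch the names `inner_sum` and `inner_closed` are hypothetical, and the range $1 \le k \le n$ is assumed; the closed form is taken to be zero when $2k > n+1$, where the binomial coefficient vanishes.

```python
from math import comb


def inner_sum(n, k):
    """Sum over q of (-1)^q * binomial(n-k, q) * binomial(2q, n-1)."""
    return sum((-1) ** q * comb(n - k, q) * comb(2 * q, n - 1) for q in range(n + 1))


def inner_closed(n, k):
    """(-1)^(n-k) * binomial(n-k, k-1) * 2^(n+1-2k), zero when 2k > n+1."""
    if 2 * k > n + 1:
        return 0
    return (-1) ** (n - k) * comb(n - k, k - 1) * 2 ** (n + 1 - 2 * k)


for n in range(1, 6):
    print(n, [(inner_sum(n, k), inner_closed(n, k)) for k in range(1, n + 1)])
```

For example, $n=3$, $k=1$ gives $4$ on both sides and $n=3$, $k=2$ gives $-1$ on both sides.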

We thus obtain for the sum

$$\bbox[5px,border:2px solid #00A000]{ \frac{2^{n+1-2k}}{n} {n\choose k} {n-k\choose k-1}.}$$

This may be re-written one last time if desired:

$$2^{n+1-2k} \frac{(n-1)!}{k!\times (k-1)! \times (n+1-2k)!} \\ = \frac{2^{n+1-2k}}{k} {2k-2\choose k-1} {n-1\choose 2k-2}$$

and we have in terms of Catalan numbers

$$\bbox[5px,border:2px solid #00A000]{ {n-1\choose 2k-2} C_{k-1} 2^{n+1-2k}.}$$
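The Catalan form is convenient for tabulating the triangle. Here is a minimal sketch, with `catalan` and `trees_with_leaves` as illustrative names and $n \ge 1$ assumed.

```python
from math import comb


def catalan(m):
    """The m-th Catalan number."""
    return comb(2 * m, m) // (m + 1)


def trees_with_leaves(n, k):
    """binomial(n-1, 2k-2) * C_{k-1} * 2^(n+1-2k); zero outside 1 <= k <= (n+1)/2."""
    if k < 1 or 2 * k - 1 > n:
        return 0
    return comb(n - 1, 2 * k - 2) * catalan(k - 1) * 2 ** (n + 1 - 2 * k)


for n in range(1, 8):
    print(n, [trees_with_leaves(n, k) for k in range(1, (n + 1) // 2 + 1)])
```

The rows read $1$; $2$; $4, 1$; $8, 6$; and so on, in agreement with the expansion in the question.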

Surprising to see that we got this far without complex variables. We did consult OEIS A091894, which proved to be a valuable resource.

Post Scriptum. Does it sum to Catalan numbers? Start with

$$\frac{2^{n+1}}{n} \sum_{k\ge 1} {n\choose k} {n-k\choose k-1} 2^{-2k} = \frac{2^{n+1}}{n} \sum_{k\ge 1} {n\choose k} 2^{-2k} [w^{k-1}] (1+w)^{n-k} \\ = \frac{2^{n+1}}{n} \sum_{k\ge 1} {n\choose k} [w^0] \frac{(1+w)^n}{2^{2k} (1+w)^k w^{k-1}} \\ = [w^0] w(1+w)^n \frac{2^{n+1}}{n} \sum_{k\ge 1} {n\choose k} \frac{1}{2^{2k} (1+w)^k w^{k}}.$$

For $k=0$ we get $[w^0] w(1+w)^n 2^{n+1}/n = 0$ and we may lower the index to include zero, obtaining

$$[w^0] w(1+w)^n \frac{2^{n+1}}{n} \left(1+\frac{1}{4w(1+w)}\right)^n \\ = [w^0] w(1+w)^n \frac{2^{n+1}}{n} \frac{(1+2w)^{2n}}{2^{2n} w^n (1+w)^n} \\ = [w^0] \frac{1}{w^{n-1}} \frac{1}{n} \frac{(1+2w)^{2n}}{2^{n-1}} = [w^{n-1}]\frac{1}{n} \frac{(1+2w)^{2n}}{2^{n-1}} = \frac{1}{n} {2n\choose n-1} \\ = \frac{1}{n+1} {2n\choose n},$$

and indeed it does.
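The same row-sum identity can be checked numerically from the binomial form of the closed formula. In this sketch `row_sum` and `catalan` are illustrative names; the division by $n$ is exact because each summand counts trees.

```python
from math import comb


def catalan(n):
    """The n-th Catalan number."""
    return comb(2 * n, n) // (n + 1)


def row_sum(n):
    """Sum over k of 2^(n+1-2k)/n * binomial(n, k) * binomial(n-k, k-1)."""
    total = 0
    for k in range(1, (n + 1) // 2 + 1):
        total += 2 ** (n + 1 - 2 * k) * comb(n, k) * comb(n - k, k - 1) // n
    return total


for n in range(1, 11):
    print(n, row_sum(n), catalan(n))
```

For $n=7$ both columns show $429$.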

Checking these results with Maple's combstruct package, we have the following program.

with(combstruct);

GFENUM :=
proc(n)
    option remember;
    local trees, leaves;

    trees := { T=Union(Prod(Z, Y),
                       Prod(Z, T), Prod(Z, T),
                       Prod(Z, T, T)),
               Z=Atom, Y=Epsilon };

    leaves :=
    proc(struct)
        if struct = Y then return 1 fi;
        if struct = Z then return 0 fi;

        return add(leaves(op(q, struct)),
                   q=1..nops(struct));
    end;

    add(y^leaves(t), t in allstructs([T, trees], size=n));
end;

TZY := (1-2*z-sqrt(1-4*z*(1-z+y*z)))/2/z;

GFX := n -> coeftayl(TZY, z=0, n);

GFBINOM :=
n -> add(2^(n+1-2*k)/n*binomial(n,k)*binomial(n-k,k-1)*y^k,
         k=1..floor((n+1)/2));

This will produce e.g. for $n=7$ the generating function

$$5\,{y}^{4}+120\,{y}^{3}+240\,{y}^{2}+64\,y$$

which gives the same result in all three cases, from the closed form of $T(z, y)$, by the binomial coefficient formula and by enumeration.

Remark. I consulted the linked-to document and the diagram included there would seem to confirm that we have the correct species.

Marko Riedel

The number of ordered, unlabeled binary rooted trees with $n$ nodes and $k$ leaves equals the number of Dyck paths of semilength $n$ having $k-1$ ddu's (here u = (1,1) and d = (1,-1)). See https://oeis.org/A091894

Clear["Global`*"];

Join[{1}, Select[ Flatten[ Table[ 2^(n - 2 k - 1) Binomial[n - 1, 2 k] Binomial[2 k, k]/(k + 1), {n, 20}, {k, 0, n} ] ], # != 0 & ] ]

Table[2^(n - 2 k - 1)  Binomial[n - 1, 2 k]  Binomial[2 k, k]/(k + 1), 
    {n, 20}, {k, 0, n}
] // TableForm
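The Dyck-path statement at the start of this answer can be spot-checked by brute force for small $n$. The following sketch encodes a path as a string over `u` and `d`; the names `dyck_paths` and `ddu_distribution` are assumptions made for the illustration.

```python
from collections import Counter


def dyck_paths(n):
    """Yield every Dyck path of semilength n as a string of 'u' and 'd' steps."""
    def extend(prefix, ups, downs):
        if ups == n and downs == n:
            yield prefix
            return
        if ups < n:
            yield from extend(prefix + "u", ups + 1, downs)
        if downs < ups:  # never go below the starting level
            yield from extend(prefix + "d", ups, downs + 1)

    yield from extend("", 0, 0)


def ddu_distribution(n):
    """Map j -> number of Dyck paths of semilength n with exactly j occurrences of ddu."""
    counts = Counter(path.count("ddu") for path in dyck_paths(n))
    return dict(sorted(counts.items()))


print(ddu_distribution(3))  # {0: 4, 1: 1}
print(ddu_distribution(4))  # {0: 8, 1: 6}
```

Occurrences of `ddu` cannot overlap, so `str.count` counts them correctly. With $j = k-1$ the counts match the tree counts for $n=3$ and $n=4$ given in the question.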