I'm not aware of such a result in any paper, but that may be because it is relatively clear that this operation is non-polynomial.
And here is why. Assume that we have a satisfiability problem with $n$ variables and $c$ clauses. We allocate variables $v_1, \ldots, v_n$ and $v'_1, \ldots, v'_c$ in the BDD manager, build a BDD
$$B = \bigwedge_{i \in \{1, \ldots, c\}} v'_i$$
and also BDDs $x_1, \ldots, x_c$ that encode the clauses over $v_1, \ldots, v_n$. Note that all of these BDDs have size $O(c)$ or $O(n)$, and these bounds hold independently of the variable ordering.
Now computing
$$C = B[x_1/v'_1, x_2/v'_2, \ldots, x_c/v'_c]$$
gives us a BDD that represents the set of all solutions to the SAT problem (where $[\ldots]$ denotes the application of the synchronous composition operator). If we could compute $C$ in time polynomial in the size of the SAT instance, that would give us a polynomial-time algorithm for SAT. Even more, with $C$ we can count the number of solutions, a problem that lies in a higher complexity class than finding them (counting satisfying assignments is #P-complete). Also, by making use of a fixed variable ordering of the BDD, the argument can be extended to the QBF problem, which is PSPACE-complete.
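To make the reduction concrete, here is a minimal sketch in Python using the `dd` BDD package. The choice of this library, the placeholder names `p1, ..., pc` standing in for the $v'_j$, and the exact methods used (`declare`, `add_expr`, `let`, `count`) are my assumptions for illustration; other libraries expose the same operation under names like vector or simultaneous compose.

```python
# Sketch of the SAT-to-composition reduction, using the Python `dd` package.
# The variables p1..pc play the role of the primed variables v'_1..v'_c above.
from dd.autoref import BDD

# Tiny example CNF: (v1 OR NOT v2) AND (v2 OR v3), i.e. n = 3, c = 2.
clauses = [r'v1 \/ ~ v2', r'v2 \/ v3']
n, c = 3, len(clauses)

bdd = BDD()
bdd.declare(*[f'v{i}' for i in range(1, n + 1)])   # v_1 ... v_n
bdd.declare(*[f'p{j}' for j in range(1, c + 1)])   # v'_1 ... v'_c

# B = p1 AND ... AND pc  (size O(c))
B = bdd.true
for j in range(1, c + 1):
    B &= bdd.var(f'p{j}')

# x_j = BDD of the j-th clause over v_1 ... v_n (size O(n) each)
substitution = {f'p{j}': bdd.add_expr(e)
                for j, e in enumerate(clauses, start=1)}

# C = B[x_1/v'_1, ..., x_c/v'_c]: simultaneous composition (`let` in dd)
C = bdd.let(substitution, B)

# C encodes exactly the satisfying assignments, so counting its models
# solves #SAT for this instance (4 models here).
print(bdd.count(C, nvars=n))
```

Since every input BDD here has size linear in the instance, any polynomial bound on that single composition call would immediately yield a polynomial algorithm for SAT, and via the final count for #SAT as well.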
Obviously, this does not answer your question completely. But in light of this line of reasoning, I would find it surprising if the algorithm actually implemented in BDD libraries were polynomial-space (or even polynomial-time).