Introduction: I recently learned that a multi-tape Turing machine $\text{TM}_k$ is no more "powerful" than a single-tape Turing machine $\text{TM}$. The proof that $\text{TM}_k \equiv \text{TM}$ is based on the idea that a $\text{TM}$ can simulate a $\text{TM}_k$ by using a special separator character to mark off the region of its tape that represents each of the $k$ tapes.
Given this idea, how would we prove that a computation taking $t(n)$ time on a $\text{TM}_k$ can be simulated by a 2-tape Turing machine $\text{TM}_2$ in $O(t(n)\log t(n))$ time?
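
For concreteness, here is a minimal sketch of the separator-based encoding I have in mind from the introduction. The names `encode_tapes` and `read_heads`, the separator `#`, and the blank symbol `_` are my own assumptions for illustration, not part of any standard construction. Each cell carries a flag marking where that tape's virtual head sits, and reading the $k$ head symbols takes one sweep over the whole encoded tape.

```python
def encode_tapes(tapes, heads, sep="#", blank="_"):
    """Flatten k tapes into one list of cells.

    Each cell is a (symbol, is_head) pair; is_head marks the position
    of that tape's virtual head. Segments are delimited by `sep`.
    (Hypothetical helper, for illustration only.)
    """
    cells = [(sep, False)]
    for tape, head in zip(tapes, heads):
        # An empty tape is represented by a single blank cell.
        contents = list(tape) or [blank]
        for i, symbol in enumerate(contents):
            cells.append((symbol, i == head))
        cells.append((sep, False))
    return cells


def read_heads(cells):
    """One left-to-right sweep collecting the symbol under each virtual head."""
    return [symbol for symbol, is_head in cells if is_head]


cells = encode_tapes(["abc", "", "01"], heads=[1, 0, 0])
scanned = read_heads(cells)
```

Since each simulated step needs a sweep like `read_heads` over up to $O(t(n))$ cells, this layout seems to give only the quadratic $O(t(n)^2)$ bound on a single tape, which is why I suspect the 2-tape result needs a different arrangement of the tape contents.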