
I am trying to understand this expected linear time MST algorithm, and I have a problem implementing the Borůvka step.

My problem is with removing duplicate edges between merged connected components while keeping only the one with minimal weight. To get the right complexity, this has to be done in $\mathcal{O}(|V| + |E|)$ total time.

The original article by Karger, Klein and Tarjan and the Wikipedia article say nothing about this step. The book 'Randomized Algorithms' by Motwani and Raghavan leaves it as an exercise.

This answer is quite vague about how to do it:

look at all edges from the new component to each other new component, and record the smallest we have seen.

This document talks about a double radix sort in base $|V|$ (see paragraph 4.12, page 19 of the document, 21 of the pdf).
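As I understand the radix sort idea, it could be sketched as follows. This is only my reading of it, not code taken from that document. I assume the edges have already been relabelled as triples `(a, b, w)`, where `a` and `b` are integer component ids in `range(num_components)`; the name `dedup_edges_radix` is made up for this illustration.

```python
def dedup_edges_radix(num_components, edges):
    """Keep the lightest edge between each pair of components.

    Assumption: edges are (a, b, w) triples whose endpoints are
    component ids with 0 <= a, b < num_components.
    """
    # Normalise each edge so that a < b, and drop self-loops.
    norm = [(min(a, b), max(a, b), w) for a, b, w in edges if a != b]

    def bucket_sort(items, key_index):
        # Stable bucket sort on one endpoint: O(num_components + len(items)).
        buckets = [[] for _ in range(num_components)]
        for item in items:
            buckets[item[key_index]].append(item)
        return [item for bucket in buckets for item in bucket]

    # Two-pass LSD radix sort in base num_components:
    # sort on the second endpoint first, then stably on the first one.
    ordered = bucket_sort(bucket_sort(norm, 1), 0)

    # Parallel edges are now adjacent; one scan keeps the lightest of each run.
    result = []
    for a, b, w in ordered:
        if result and result[-1][0] == a and result[-1][1] == b:
            if w < result[-1][2]:
                result[-1] = (a, b, w)
        else:
            result.append((a, b, w))
    return result
```

Each pass costs $\mathcal{O}(|V| + |E|)$ in the worst case, with no hashing involved. In a real implementation each triple would presumably also carry the identity of the original edge, so that it can be reported in the MST.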

I also had the idea of using a hashtable, as follows: for each meta-vertex $C$ corresponding to a connected component:

  • create a hashtable $H$;
  • for each edge of weight $w$ to another connected component $C'$:
    • if there is no entry in $H$ with key $C'$ or if the entry in $H$ with key $C'$ has a weight $>w$, add or replace an entry with key $C'$ and value $w$.
  • keep only the edges recorded in the hashtable $H$.
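A minimal sketch of this hashtable idea is given here. It is slightly simplified: instead of one table per meta-vertex, it uses a single dictionary keyed by the unordered pair of components, which should be equivalent for the purpose of deduplication. I assume the edges are given as triples `(a, b, w)` with comparable component ids `a` and `b`; the name `dedup_edges_hash` is hypothetical.

```python
def dedup_edges_hash(edges):
    """Keep the lightest edge between each pair of components.

    Assumption: edges are (a, b, w) triples where a and b are
    component ids (any hashable, comparable labels).
    """
    best = {}  # maps (smaller id, larger id) -> lightest weight seen so far
    for a, b, w in edges:
        if a == b:
            continue  # edge inside one component: drop it
        key = (a, b) if a < b else (b, a)
        # Add the entry if it is missing, or replace it if w is smaller.
        if key not in best or w < best[key]:
            best[key] = w
    return [(a, b, w) for (a, b), w in best.items()]
```

The loop performs one dictionary lookup per edge, so the running time is $\mathcal{O}(|E|)$ in expectation.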

This gives linear time complexity, but only in expectation and not in the worst case (which is not a problem, since the MST algorithm is a Las Vegas algorithm anyway).

I wondered if there are simpler ways to eliminate duplicate edges in linear time.

Nathaniel
