Getting parallel items in dependency resolution

Question

I have implemented a topological sort based on the Wikipedia article which I'm using for dependency resolution, but it returns a linear list. What kind of algorithm can I use to find the independent paths?

score 21 · Accepted Answer · answered Jun 28 '12 at 12:19

I assume that an edge $(u,v)$ means that $u$ has to be executed before $v$. If this is not the case, turn around all edges. I furthermore assume that you are less interested in paths (those are already given by the DAG) than in a good execution strategy given the dependencies.

You can easily adapt the topological sort procedure: instead of appending, merge all items of the same "depth" to one set. You get a list of sets, each of which contains items you can execute/install in parallel. Formally, the sets $S_i$ are defined thus for the graph $G = (V,E)$:

$\qquad \displaystyle \begin{align} S_0 &= \{ v \in V \mid \forall u \in V. (u,v) \notin E \} \\ S_{i+1} &= \{v \in V \mid \forall u \in V. (u,v) \in E \to u \in \bigcup_{k=0}^i S_k \} \end{align}$

You can then execute your tasks like this (let's assume there are $k$ sets):

for i=0 to k
  parallel foreach T in S_k
    execute T

Of course, this does not yield maximum throughput if tasks take different amounts of time: two parallel, independent linear chains sync after each element. To circumvent this, I suggest you work directly on the DAG with a parallel traversal starting in the source nodes -- those in $S_0$ -- and syncing/forking in nodes with several incoming/outgoing edges:

parallel foreach T in S_0
  recursive_execute T

where

recursive_execute T {
  atomic { if T.count++ < T.indeg then return }
  execute T
  parallel foreach T' in T.succ
    recursive_execute T'
}

and T.count is a simple counter holding the number of predecessors of T which have been executed already, T.indeg the number of predeccessors and T.succ the set of successors.

Getting parallel items in dependency resolution

1 Answers1

Linked

Related