3

I've been researching ways of modeling and executing tasks that depend on each other (but in an acyclic way) and came up with task graphs. The question that's bugging me is how to find the maximum degree of concurrency in a given task graph.

In my case, the graph is relatively small, around 100 nodes, but the nodes represent long-running tasks. So the accuracy of such an algorithm matters more than its complexity.

Assuming I have determined such a degree, the second problem is how I should distribute the tasks. I've read about topological sort, and about transforming the result into a list of sets, with each set being run in parallel. But again, I doubt whether this is the best approach.

Raphael
SelimOber

3 Answers

5

If you turn an activity-on-node task graph into a partial order (by taking the transitive closure), then the largest independent set of tasks is what you are looking for.

(Taking a topological sort, as suggested in another answer, does not work in general. Consider the series-parallel task graph $((a|b)c)|(d(e|f))$, where $\alpha|\beta$ means parallel composition of task graphs and $\alpha\beta$ means every task in task graph $\alpha$ precedes every task in task graph $\beta$. Here $\{a,b,e,f\}$ is the largest independent set, yet the topological sort will produce $\{a,b,d\}$.)

Although finding largest independent sets is NP-complete in general, it can be done quickly for partial orders. This starts by noting the equivalence of an independent set in a poset with a set of witnesses that realise the width of the poset, and applying König's theorem to compute the witnesses by a maximum matching in a bipartite graph.
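As a rough illustration of the matching idea, here is a minimal Python sketch. It assumes the task graph is given as a list of nodes and a list of `(before, after)` edges; the function names `transitive_closure` and `poset_width` are made up for this example. It relies on the standard consequence of Dilworth's theorem that the width equals the number of tasks minus the size of a maximum matching in the bipartite graph built from the transitive closure. It returns only the size of the largest independent set, not the witnesses themselves.

```python
def transitive_closure(nodes, edges):
    """Map each task to the set of all tasks that must run after it."""
    reach = {u: set() for u in nodes}
    for u, v in edges:
        reach[u].add(v)
    # Warshall-style closure: cubic in the worst case, fine for ~100 nodes.
    for k in nodes:
        for u in nodes:
            if k in reach[u]:
                reach[u] |= reach[k]
    return reach


def poset_width(nodes, edges):
    """Size of the largest set of pairwise independent tasks."""
    reach = transitive_closure(nodes, edges)
    match = {}  # right-side task -> left-side task currently matched to it

    def augment(u, seen):
        # Try to find an augmenting path starting at left-side task u.
        for v in reach[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in match or augment(match[v], seen):
                match[v] = u
                return True
        return False

    matched = sum(augment(u, set()) for u in nodes)
    # Minimum chain cover = n - maximum matching, and width = minimum chain cover.
    return len(nodes) - matched


# The series-parallel example ((a|b)c)|(d(e|f)) from above.
example_nodes = ["a", "b", "c", "d", "e", "f"]
example_edges = [("a", "c"), ("b", "c"), ("d", "e"), ("d", "f")]
example_width = poset_width(example_nodes, example_edges)
```

On the example graph this evaluates to 4, which agrees with the independent set $\{a,b,e,f\}$.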

Some of these basic algorithms are already part of software toolkits, like the Graph CPAN module for Perl, and the Boost Graph Library for C++.

András Salamon
2

Running a topological sort on the graph is the right thing to do. Topological sorting is just the linear time version of the following very simple greedy approach:

while there are nodes left:
    let S be the set of all nodes with indegree 0
    run all tasks in S in parallel
    remove all nodes in S from the graph

(I use the convention that Task A depends on Task B if there is an edge B->A.)

It's fairly obvious that you can't do better than the above algorithm, and the maximum concurrency you can achieve is the size of the largest S that you encounter.
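The greedy loop above could be sketched in Python roughly as follows. This is an illustration only: the graph is assumed to be a list of nodes plus a list of `(B, A)` edges meaning that A depends on B, and the name `parallel_layers` is invented for the example. Instead of physically removing nodes it decrements indegree counters, which is what keeps the running time linear.

```python
def parallel_layers(nodes, edges):
    """Split a DAG into successive sets of tasks with no unmet dependencies."""
    indegree = {u: 0 for u in nodes}
    successors = {u: [] for u in nodes}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    layer = [u for u in nodes if indegree[u] == 0]
    layers = []
    while layer:
        layers.append(layer)
        next_layer = []
        for u in layer:
            # "Removing" u means its successors have one dependency fewer.
            for v in successors[u]:
                indegree[v] -= 1
                if indegree[v] == 0:
                    next_layer.append(v)
        layer = next_layer
    return layers


# Series-parallel example from the first answer: a, b before c; d before e, f.
demo_nodes = ["a", "b", "c", "d", "e", "f"]
demo_edges = [("a", "c"), ("b", "c"), ("d", "e"), ("d", "f")]
demo_layers = parallel_layers(demo_nodes, demo_edges)
demo_max = max(len(s) for s in demo_layers)
```

On that example the layers come out as `[a, b, d]` and `[c, e, f]`, so the largest S has size 3, which matches the $\{a,b,d\}$ result mentioned in the first answer.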

adrianN
1

Since you stated that, by hypothesis, your task dependency graph is acyclic, you can easily determine the maximum degree of concurrency by running the breadth-first search algorithm. The algorithm starts from the root node (if the initial level contains more than one node, then any node can be chosen as the root node) and proceeds towards the leaves level by level: before inspecting nodes located on the next level, it explores all of the nodes belonging to the current level. Just modify it trivially to count the number of nodes on each level, and update a variable holding the maximum number of nodes per level as the algorithm descends through the levels. The worst-case complexity is $O(V + E)$.
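A minimal Python sketch of this level counting is given below. It makes one assumption that goes beyond the description above: rather than picking a single root, it starts the search from every node without incoming edges at once, so that no task is left unvisited. The function name `max_bfs_level_width` and the `(before, after)` edge format are invented for the example.

```python
from collections import deque


def max_bfs_level_width(nodes, edges):
    """Largest number of nodes sharing the same BFS depth from the sources."""
    successors = {u: [] for u in nodes}
    has_parent = set()
    for u, v in edges:
        successors[u].append(v)
        has_parent.add(v)

    # Assumption: every node without predecessors is a level-0 starting point.
    level = {u: 0 for u in nodes if u not in has_parent}
    queue = deque(level)
    while queue:
        u = queue.popleft()
        for v in successors[u]:
            if v not in level:  # first visit fixes the BFS depth
                level[v] = level[u] + 1
                queue.append(v)

    counts = {}
    for depth in level.values():
        counts[depth] = counts.get(depth, 0) + 1
    return max(counts.values(), default=0)


bfs_nodes = ["a", "b", "c", "d", "e", "f"]
bfs_edges = [("a", "c"), ("b", "c"), ("d", "e"), ("d", "f")]
bfs_result = max_bfs_level_width(bfs_nodes, bfs_edges)
```

For the six-task example used here the result is 3. Note that a BFS depth is the shortest distance from a source, so in a graph with edges a->b, b->c and a->c the tasks b and c land on the same level even though c depends on b.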

Massimo Cafaro