2

Summary

I am using a DAG to compress a tree structure with many repeated nodes (repeated nodes almost always have the same outgoing edges as well).

Normally, when attempting to add an edge to a DAG that would cause a cycle, you instead detect the situation and abort. I'm seeking an algorithm that will instead attempt to add the new edge anyway, modifying the graph such that it still represents the same tree, potentially partially decompressing parts of the graph in order to avoid the cycle.

Does an efficient algorithm to do this already exist? I have been unable to find one in a perfunctory literature search, but I may simply not know the proper terminology for this operation, if one exists.

Simple Example

enter image description here

In this example, we are adding the edge F->C, which creates a cycle. To break this cycle, we can split the E node into one version of E with C as a parent and one version of E with D as a parent (notated E'), similarly with F and C.

Slightly more complicated example

enter image description here

In this example we have several more cases. The offending edge from F to D is highlighted. But there are several paths through the graph, some involving nodes upstream from D, some involving nodes downstream from D, and potentially some not involving D at all.

This graph is a compressed version of this tree:

enter image description here

where the highlighted copies of F are where placing D as a child is permissible. As you can see, these three nodes correspond to the three ways of reaching F' in the previous figure.

Motivating use case

In the game of Go, there is, in certain rulesets, the idea of superko, which forbids repetition of a previous board state. Such positions are usually very rare in practice. In order to efficiently search the game tree, we would want to take into account transpositions of sets of moves which leave the board the same, which is why a graph structure is useful, but in situations where a superko is possible the history of the position is also important, not just the current situation. So while F->C would be a legal move normally, it is only a legal move in situations where C is not part of the node's history, i.e. we went through node D instead of node C. So we would need to consider the cases separately.

Known Caveats

I am aware of the DAG cycle detection algorithm, and it seems like it might be easy to adapt this algorithm to perform this task, but I cannot seem to make it work. I am also aware that it is not always possible to split a graph in this manner to remove the cycle.
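For reference, the standard check I am starting from can be sketched as follows. This is only a minimal sketch: the graph is assumed to be a dict mapping each node to a set of its children, and the names `reachable` and `would_create_cycle` are made up for illustration. Adding the edge u -> v closes a cycle exactly when u is already reachable from v.

```python
def reachable(graph, start):
    """Return the set of nodes reachable from start, including start itself."""
    seen, stack = {start}, [start]
    while stack:
        for child in graph.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def would_create_cycle(graph, u, v):
    """True if adding the edge u -> v would create a cycle."""
    return u in reachable(graph, v)


# Hypothetical graph loosely modelled on the simple example:
# E has parents C and D, and F sits below E.
example = {"A": {"C", "D"}, "C": {"E"}, "D": {"E"}, "E": {"F"}}
cycle_f_to_c = would_create_cycle(example, "F", "C")
```

This only detects the problem. It says nothing about which nodes would have to be split, which is the part I cannot get to work.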

2 Answers

1

Any graph of this type can be represented using this simplified model (left):

enter image description here

where M1, M2, and M3 are metanodes that can represent 0 or more nodes, and edges involving M1, M2, and M3 can represent 0 or more edges.

To split the graph in the manner I am seeking, these nodes must be classified using a reachability algorithm. Write $x \le y$ to mean that $y$ is reachable from $x$. The members of M1 are all nodes $k \not\in \{A, B\}$ such that $A \not\le k$. The members of M2 are all nodes $k \not\in \{A, B\}$ such that $A \le k \land B \not\le k$. Finally, the members of M3 are the remaining nodes $k \not\in \{A, B\}$, that is, those such that $B \le k$.
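A minimal sketch of this classification, reading $x \le y$ as "y is reachable from x". It assumes the graph is stored as a dict mapping each node to a set of children; `reachable` and `classify` are illustrative names, not part of any library.

```python
def reachable(graph, start):
    """Return the set of nodes reachable from start, including start itself."""
    seen, stack = {start}, [start]
    while stack:
        for child in graph.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def classify(graph, a, b):
    """Split all nodes other than a and b into the metanodes (M1, M2, M3)."""
    nodes = set(graph).union(*graph.values())
    from_a = reachable(graph, a)
    from_b = reachable(graph, b)
    others = nodes - {a, b}
    m1 = {k for k in others if k not in from_a}                   # A does not reach k
    m2 = {k for k in others if k in from_a and k not in from_b}   # A reaches k, B does not
    m3 = others - m1 - m2                                         # the rest: B reaches k
    return m1, m2, m3
```

On the hypothetical graph `{"R": {"A", "X"}, "A": {"X"}, "X": {"B"}, "B": {"Z"}}` with the new edge running from `"B"` to `"A"`, this puts `R` in M1, `X` in M2, and `Z` in M3.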

The members of M2 are duplicated along with B and A. All edges from M1 to M2 or B are routed to M2' and B' instead. And the edges from M2 and B to M3 are duplicated for M2' and B'. Finally, an edge from B' to A' is made.
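The rewiring can be sketched roughly as below. This rests on several assumptions of mine rather than anything definitive: node names are strings so that a copy can be written as the name plus `'`, the graph is a dict of child sets, edges among M2 and B are duplicated among the copies, and A' is left without children because the description above does not say what it should point to. The name `split_for_edge` is hypothetical.

```python
from collections import defaultdict


def reachable(graph, start):
    """Return the set of nodes reachable from start, including start itself."""
    seen, stack = {start}, [start]
    while stack:
        for child in graph.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def split_for_edge(graph, a, b):
    """Return a new graph in which the edge b' -> a' replaces the cyclic b -> a."""
    def prime(k):
        return k + "'"          # assumption: nodes are strings

    nodes = set(graph).union(*graph.values())
    from_a, from_b = reachable(graph, a), reachable(graph, b)
    if b not in from_a:
        raise ValueError("b is not reachable from a, so b -> a creates no cycle")
    m1 = nodes - from_a         # not reachable from A
    m2 = from_a - from_b - {a}  # reachable from A but not from B
    copied = m2 | {b}           # nodes that receive a primed copy

    out = defaultdict(set)
    for x, children in graph.items():
        for y in children:
            if x in m1 and y in copied:
                out[x].add(prime(y))    # reroute M1 -> M2/B to the copies
                continue
            out[x].add(y)               # every other original edge is kept
            if x in copied:
                # the copy mirrors x: copied targets are primed, M3 targets are shared
                out[prime(x)].add(prime(y) if y in copied else y)
    out[prime(b)].add(prime(a))         # finally, the edge B' -> A'
    return dict(out)
```

With `{"R": {"A", "X"}, "A": {"X"}, "X": {"B"}, "B": {"Z"}}` and the new edge from `"B"` to `"A"`, `R` now points to `A` and `X'`, the chain `X' -> B'` mirrors `X -> B`, both `B` and `B'` point to `Z`, and `B'` also points to `A'`. The sketch does not check whether B' ends up reachable at all.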

The algorithm fails to place A' if and only if there are no edges connecting M1 to B and there are no edges from M1 to M2 or M2 to B, which represents the fact that there is no path to B without going through A, so there is no possible way to place A'.

So this problem can be solved with $O(n)$ lookups using pretty much any general reachability algorithm.

1

I think one possibility is to do the naive thing: whenever you find a backedge $u \to v$ (i.e., where $v$ is an ancestor of $u$ in the search tree), duplicate the subtree rooted at $v$, one duplicate per edge out of $v$.

Given an edge $u \to v$, there are various ways to test whether it is a backedge. The naive way is to traverse the path from $u$ to the root (by following parent pointers) and see if you ever visit $v$. A fancy way is to use an algorithm for least-common-ancestors on a dynamic tree.
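The naive test might look something like the sketch below. It assumes parent pointers are kept in a dict mapping each node to its parent in the search tree, with the root mapped to `None`. The name `is_back_edge` is just for illustration, and treating $u = v$ as a backedge is my own choice.

```python
def is_back_edge(parent, u, v):
    """True if v is u itself or an ancestor of u in the search tree."""
    node = u
    while node is not None:
        if node == v:
            return True
        node = parent.get(node)   # step to the parent; None once past the root
    return False


# Hypothetical search tree: A is the root, with branches A-C-E-F and A-D.
parent = {"A": None, "C": "A", "E": "C", "F": "E", "D": "A"}
f_to_c = is_back_edge(parent, "F", "C")
f_to_d = is_back_edge(parent, "F", "D")
```

Each query costs time proportional to the depth of $u$, which is why a dynamic-tree structure becomes attractive when the tree is deep.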

D.W.