I am trying to solve the longest repeated substring problem. First, I build a suffix tree, which takes O(n) time, and then I traverse the tree to find the deepest internal node. I am not sure whether traversing a suffix tree is also O(n).

These are the properties of a suffix tree:

  1. A suffix tree will contain n leaves which are numbered from 1 to n.
  2. Each internal node (except root) should have at least 2 children.

If each internal node can have more than 2 children, how can I say that the number of nodes in a suffix tree is O(n)? And how, then, is the complexity of the traversal O(n)?

Here is the code that I have written for traversal. I am aware that this platform is language independent so I have added the relevant comments along with my code.

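#include <string>
using namespace std;

struct StNode
{
    string str;                 //label of the edge leading into this node
    StNode* child[256] = {};    //one slot per byte value; nullptr means no child
};

//leafNode means no child
bool leafNode(const StNode* t)
{
    for(int i=0;i<256;i++)
        if(t->child[i]!=nullptr) return false;
    return true;
}

//s is the concatenation of the edge labels from the root down to t
void lrs(StNode* t,string s,string &res)
{
    if(leafNode(t)){
        //erasing the leafNode's string leaves the label of its parent,
        //which is an internal node and therefore a repeated substring
        s.erase(s.length()-t->str.length());
        if(s.length()>res.length())   res=s;
        return;
    }
    for(int i=0;i<256;i++){     //traversing through all children
        if(t->child[i]!=nullptr){
            lrs(t->child[i],s+t->child[i]->str,res);
        }
    }
}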
shiwang

1 Answer

Yes, the number of nodes in the suffix tree is $O(n)$.

Here's how you could infer that. The suffix tree can be constructed in $O(n)$ time. An algorithm cannot output a tree with asymptotically more nodes than its running time, since it takes at least $\Omega(1)$ time per node you construct; so from the fact that the running time of the suffix tree algorithm is $O(n)$, you can conclude that its output must have size $O(n)$, i.e., the suffix tree must have $O(n)$ nodes.

Here's another way you could have inferred that. There are $n$ leaves. Also, each internal node other than the root has at least two children. In such a tree, the number of internal nodes is at most the number of leaves, so in this case, there are at most $n$ internal nodes and $n$ leaves. It follows that the number of nodes in the tree is at most $2n = O(n)$.
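The counting step can be spelled out by counting edges. The symbols here are introduced only for this sketch: $L$ is the number of leaves, $I$ is the number of internal nodes including the root, and $c(v)$ is the number of children of node $v$. It assumes the root has at least one child. A tree with $L + I$ nodes has $L + I - 1$ edges, and every edge leads from an internal node to one of its children, so

$$L + I - 1 \;=\; \sum_{v \text{ internal}} c(v) \;\ge\; 2(I - 1) + 1 \;=\; 2I - 1,$$

where the $2(I-1)$ term covers the non-root internal nodes and the $+1$ covers the root. Rearranging gives $I \le L = n$, regardless of how many children any single node has.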

This information is actually found in the Wikipedia page on suffix trees, which explains that any suffix tree has at most $2n$ nodes. In the future, before asking a question here, please check standard resources, such as Wikipedia and a textbook.

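It follows that the tree can be traversed in $O(n)$ time, as long as the traversal does only $O(1)$ work per node (treating the alphabet size as a constant). Building a fresh copy of the path string at every recursive call, as the code in the question does, breaks that condition and can cost $\Theta(n^2)$ in the worst case.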
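A minimal sketch of a traversal that does constant work per node, under some assumptions of mine: the node layout `Node`, the helper `buildNaive`, the struct `Deepest`, and the terminator `'$'` are all hypothetical and are not taken from the question. Edge labels are stored as (offset, length) pairs into the text, and only the string depth is passed down the recursion, so no strings are copied. The builder here is a naive quadratic one used only to get a tree to traverse; a linear-time construction would produce the same tree.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical node layout: the incoming edge label is text[start, start+len).
struct Node {
    std::size_t start = 0;
    std::size_t len = 0;
    std::map<char, std::unique_ptr<Node>> child;  // only children that exist
};

// Naive O(n^2) suffix tree construction, for demonstration only.
// Assumes text ends with a terminator that occurs nowhere else in it.
std::unique_ptr<Node> buildNaive(const std::string& text) {
    auto root = std::make_unique<Node>();
    for (std::size_t i = 0; i < text.size(); ++i) {
        Node* cur = root.get();
        std::size_t pos = i;  // next unmatched character of suffix i
        while (pos < text.size()) {
            auto it = cur->child.find(text[pos]);
            if (it == cur->child.end()) {  // no matching edge: add a leaf
                auto leaf = std::make_unique<Node>();
                leaf->start = pos;
                leaf->len = text.size() - pos;
                cur->child[text[pos]] = std::move(leaf);
                break;
            }
            Node* nxt = it->second.get();
            std::size_t k = 0;  // number of characters matched along the edge
            while (k < nxt->len && pos + k < text.size() &&
                   text[nxt->start + k] == text[pos + k]) {
                ++k;
            }
            if (k < nxt->len) {  // mismatch inside the edge: split it at k
                auto mid = std::make_unique<Node>();
                mid->start = nxt->start;
                mid->len = k;
                std::unique_ptr<Node> old = std::move(it->second);
                old->start += k;
                old->len -= k;
                const char c = text[old->start];
                mid->child[c] = std::move(old);
                it->second = std::move(mid);
            }
            cur = it->second.get();
            pos += k;
        }
    }
    return root;
}

struct Deepest {
    std::size_t depth = 0;  // string depth of the deepest internal node so far
    std::size_t end = 0;    // offset in the text where its path label ends
    std::size_t nodes = 0;  // number of nodes visited
};

// Visits every node once. Apart from iterating over existing children,
// each call does O(1) work: only the depth is passed down, never a string.
void dfs(const Node* t, std::size_t depth, Deepest& best) {
    assert(t != nullptr);
    ++best.nodes;
    if (t->child.empty()) return;  // a leaf's path label occurs only once
    if (depth > best.depth) {
        best.depth = depth;
        best.end = t->start + t->len;
    }
    for (const auto& kv : t->child) {
        dfs(kv.second.get(), depth + kv.second->len, best);
    }
}

// Assumes '$' does not occur in text.
std::string longestRepeated(const std::string& text) {
    const std::string t = text + '$';
    const auto root = buildNaive(t);
    Deepest best;
    dfs(root.get(), 0, best);
    return t.substr(best.end - best.depth, best.depth);
}
```

For example, `longestRepeated("banana")` returns `"ana"`, and the tree for `banana$` has 7 leaves and 4 internal nodes, within the $2n$ bound. The recursion depth can reach $n$, so a very long input might call for an explicit stack instead.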

D.W.