Questions tagged [suffix-trees]
28 questions
6
votes
1 answer
Understanding Martin Farach's suffix tree algorithm
I feel stuck at this point. I have spent several days trying to get my head around the algorithm, but both resources I have [1] [2] seem to skip over the details that would make me comfortable trying to explain/implement it. I'm going to hope…
lsund
- 61
- 4
6
votes
1 answer
What are the effects of the alphabet size on construction algorithms for suffix trees?
For what size alphabet does it take longer to construct a suffix tree - for a really small alphabet size (because it has to go deep into the tree) or for a large alphabet size? Or is it dependent on the algorithm you use? If it is dependent, how…
John Smith
- 61
- 1
6
votes
1 answer
Finding a certain prefix of a string
Let $\Sigma = \{ \sigma_1, \ldots, \sigma_t \}$ and let $S$ be a string from $\Sigma^*$. Denote $n=|S|$, that is, $S$ has $n$ letters. I'd like to find the shortest prefix $T$ of $S$ such that $S$ is a prefix of $T^n$ ($T^n = T \cdot \ldots \cdot T$, $n$…
Eric_
- 535
- 2
- 13
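A minimal sketch of the problem as stated in the visible part of the excerpt above, assuming the condition is exactly "$S$ is a prefix of $T^n$". Under that reading, $T$ is the prefix whose length is the smallest period of $S$. The sketch uses the KMP prefix function rather than a suffix tree, and the function name `shortest_generating_prefix` is hypothetical:

```python
def shortest_generating_prefix(s: str) -> str:
    """Return the shortest prefix T of s such that s is a prefix of T repeated.

    Assumption: this swaps in the KMP prefix (failure) function, not a
    suffix tree. The smallest period of s is len(s) - fail[-1], where
    fail[i] is the length of the longest proper border of s[:i + 1].
    """
    n = len(s)
    if n == 0:
        return ""
    fail = [0] * n
    k = 0  # length of the current border
    for i in range(1, n):
        # Fall back to shorter borders until s[i] extends one.
        while k > 0 and s[i] != s[k]:
            k = fail[k - 1]
        if s[i] == s[k]:
            k += 1
        fail[i] = k
    period = n - fail[n - 1]
    return s[:period]
```

For instance, `shortest_generating_prefix("abcabcab")` yields `"abc"`, since `"abcabcabc"` starts with `"abcabcab"` and no shorter prefix works. The running time is linear in $n$.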
5
votes
1 answer
Suffix trees - "smaller half trick"
I have been reading the paper Finding Maximal Pairs with Bounded Gap.
In it, there is a sentence (page 6, second paragraph):
The “smaller-half trick” is used in several methods for finding tandem repeats,
e.g. [2, 5, 26]. It says that the sum…
user6697
- 347
- 1
- 2
- 6
4
votes
1 answer
Where did Don Knuth say that suffix trees were the "Algorithm of the Year 1973?"
If you do a quick search for suffix trees on Google, you'll find a bunch of sources saying that Don Knuth called them the "Algorithm of the Year 1973." However, I can't seem to find a source on this.
Is there a reference to a paper, communication,…
templatetypedef
- 9,302
- 1
- 32
- 62
3
votes
1 answer
Why does a suffix tree have a linear number of nodes (relative to input string size)?
Aren't there $n^2$ unique substrings of a string (irrespective of the alphabet size)? Perhaps the number of unique suffix substrings is less than the number of unique substrings of a string.
Wuschelbeutel Kartoffelhuhn
- 513
- 5
- 12
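A short counting argument may help with the question above. It is a sketch assuming the usual compressed suffix tree of a string $S$ with $|S| = n$ and a unique terminator appended, so that every suffix ends at its own leaf and every internal node has at least two children. Here $L$ is the number of leaves, $I$ the number of internal nodes (root included), and $E$ the number of edges:

$$
L = n + 1, \qquad E = L + I - 1, \qquad E \ge 2I
\;\Longrightarrow\; I \le L - 1 = n
\;\Longrightarrow\; L + I \le 2n + 1 .
$$

The $\Theta(n^2)$ substrings are still represented, but as positions along edges rather than as nodes; each edge label can be stored as a pair of indices into $S$, which keeps the total size linear.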
3
votes
1 answer
What is the time complexity of determining maximal overlaps between bitmap images?
Given two bitmap images (two arbitrarily sized two-dimensional matrices of integer values ranging from 0 to some maximum number), I want to determine their maximum overlaps.
An overlap is a relative positioning of the images (a translation of the…
reinierpost
- 6,294
- 1
- 24
- 40
3
votes
0 answers
Constructing Generalised Suffix Tree from a large set of strings
Is there a published method to construct a generalised suffix tree from a large set of strings (~ 500 000) without the need of concatenating them?
I would like to use the resulting suffix tree for a pattern search problem.
PABLO
- 41
- 1
3
votes
3 answers
Longest common substring many strings to one
I'd like to find the longest common substrings (occurrences, start index) between one string and many others.
For example:
source string - "abcdefghijklmncdop"
other strings - ["cd", "ghi", "mn", "zw", "cdewxyz"].
Expected result -(original string,…
Fimka
- 31
- 2
3
votes
1 answer
Find shortest prefix to generate original string by overlapping
Given a string $S$, I want to find the prefix string $P$ of shortest length, such that the original string $S$ can be generated by concatenating copies of $P$ (where overlapping is allowed).
For example, if $S = atgatgatatgat$, I want to find $P =…
Robert Lee
- 31
- 1
2
votes
1 answer
How to create a suffix tree by hand
I will soon have to be able to draw a suffix tree of a word on an exam. I tried to understand the different algorithms available, but they seem rather complicated to do "by hand".
I do not see a simple way to draw such a tree. Is there a simple…
user66875
- 203
- 1
- 2
- 6
2
votes
1 answer
Find repeated patterns in a string via lossy compression
Description
The task is to identify repeated patterns in a string and do lossy compression of the input string using the found patterns. The output is a list containing different ways of encoding the string (information loss vs. description…
dizcza
- 123
- 5
1
vote
1 answer
Longest repeated substring
I am trying to solve a problem: finding the longest repeated substring in a string. First, I built a suffix tree, which takes O(n) time, and then I traversed the suffix tree to find the deepest internal node. I am not sure whether traversing a suffix tree…
shiwang
- 481
- 1
- 9
- 24
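As a small reference point for the question above, here is a hedged sketch that computes the longest repeated substring without a suffix tree. It sorts all suffixes and takes the longest common prefix of adjacent ones, which corresponds to the string depth of the deepest internal node in the suffix tree. It is a naive stand-in (roughly $O(n^2 \log n)$ in the worst case), not the linear-time traversal the question describes, and the function name is hypothetical:

```python
def longest_repeated_substring(s: str) -> str:
    """Return a longest substring that occurs at least twice in s.

    Assumption: naive sorted-suffix approach, used here only to check
    answers on small inputs. Occurrences are allowed to overlap.
    """
    suffixes = sorted(s[i:] for i in range(len(s)))
    best = ""
    # Any repeated substring is a common prefix of two suffixes, and the
    # longest such prefix always appears between neighbours in sorted order.
    for a, b in zip(suffixes, suffixes[1:]):
        k = 0
        while k < min(len(a), len(b)) and a[k] == b[k]:
            k += 1
        if k > len(best):
            best = a[:k]
    return best
```

On `"banana"` this returns `"ana"`, whose two occurrences overlap. A string with no repeated character, such as `"abcd"`, gives the empty string.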
1
vote
1 answer
Return all strings containing a given string in a suffix tree
Here is my question:
Given a compressed suffix tree of a string S and a substring T,
I need to return all substrings of S that begin with the substring T,
sorted in lexicographic order.
My approach:
I can traverse the suffix tree and find the edge /…
ms_stud
- 135
- 5
1
vote
0 answers
Substring problems in suffix trees
Assume that we have some efficient way of creating a suffix tree of a string. Then I'm interested in efficient ways of finding:
Length of the longest repeated substring. Given some string $S$, a repeated substring is a substring $R$ of $S$ that…
Eff
- 219
- 1
- 9