The Ukkonen class
For an implementation of Ukkonen’s algorithm.
This class implements Ukkonen’s algorithm for constructing a generalised suffix
tree consisting of list[int]. The implementation in this class is based on:
https://cp-algorithms.com/string/suffix-tree-ukkonen.html
The suffix tree is updated when the member function ukkonen.add_word is
invoked. Every non-duplicate word added to the tree has a unique letter appended
to the end. If a duplicate word is added, then the tree is not modified, but the
multiplicity of the word is increased.
Contents
| For an implementation of Ukkonen's algorithm. | |
| 
 | Copy a  | 
| Returns the distance of a node from the root. | |
| Find the index of a word in the suffix tree. | |
| 
 | Initialize an existing Ukkonen object. | 
| Check if a state corresponds to a suffix. | |
| Returns the sum of the lengths of the distinct words in the suffix tree. | |
| Returns the sum of the lengths of all of the words in the suffix tree. | |
| Returns the maximum length of word in the suffix tree. | |
| Returns the multiplicity of a word by index. | |
| Returns the nodes in the suffix tree. | |
| Returns the number of distinct non-empty words in the suffix tree. | |
| Returns the number of non-empty words in the suffix tree. | |
| Throw if the word w contains a letter equal to any of the unique letters added to the end of words in the suffix tree. | |
| Returns the unique letter added to the end of the  | |
| Overloaded function. | 
Full API
- class Ukkonen
- 
- distance_from_root(self: Ukkonen, n: Ukkonen.Node) int
- Returns the distance of a node from the root. - Parameters:
- n (Ukkonen.Node) – the node. 
- Returns:
- The distance from the root. 
- Return type:
- Complexity:
- At worst linear in the distance of the node n from the root. 
 
 - index(self: Ukkonen, w: str | list[int]) int | Undefined
- Find the index of a word in the suffix tree. - If the w is one of the words that the suffix tree contains (the words added to the suffix tree via - ukkonen.add_word), then this function returns the index of that word. If the word w is not one of the words that the suffix tree represents, then- UNDEFINEDis returned.
 - init(self: Ukkonen) Ukkonen
- Initialize an existing Ukkonen object. - This function puts an - Ukkonenobject back into the same state as if it had been newly default constructed.- Returns:
- self. 
- Return type:
 - See also 
 - is_suffix(self: Ukkonen, st: Ukkonen.State) int | Undefined
- Check if a state corresponds to a suffix. - This function returns a an int if the state st corresponds to a suffix of any word in the suffix tree. The value returned is the index of the word which the state is a suffix of if such a word exists, and - UNDEFINEDotherwise.- Parameters:
- st (Ukkonen.State) – the state. 
- Returns:
- The index of a word for which st is a suffix, or - UNDEFINED.
- Return type:
 
 - length_of_distinct_words(self: Ukkonen) int
- Returns the sum of the lengths of the distinct words in the suffix tree. - Returns:
- The length of the distinct words. 
- Return type:
- Complexity:
- Constant. 
 
 - length_of_words(self: Ukkonen) int
- Returns the sum of the lengths of all of the words in the suffix tree. - This is the total length of all the words added to the suffix tree including duplicates, if any. - Returns:
- The length of the words. 
- Return type:
- Complexity:
- \(O(n)\) where \(n\) is the return value of - number_of_distinct_words.
 
 - max_word_length(self: Ukkonen) int
- Returns the maximum length of word in the suffix tree. - Returns:
- The maximum length of a word. 
- Return type:
- Complexity:
- Constant. 
 
 - multiplicity(self: Ukkonen, i: int) int
- Returns the multiplicity of a word by index. - This function returns the number of times that the word corresponding to the index i was added to the suffix tree. 
 - nodes(self: Ukkonen) list[Ukkonen.Node]
- Returns the nodes in the suffix tree. - Returns:
- A list of nodes. 
- Return type:
- Complexity:
- Constant. 
 
 - number_of_distinct_words(self: Ukkonen) int
- Returns the number of distinct non-empty words in the suffix tree. - This is the number of distinct non-empty words added via - ukkonen.add_word.- Returns:
- The number of distinct non-empty words. 
- Return type:
- Complexity:
- Constant. 
 
 - number_of_words(self: Ukkonen) int
- Returns the number of non-empty words in the suffix tree. - This is the number of all words added via - ukkonen.add_wordincluding duplicates, if any.- Returns:
- The number of words. 
- Return type:
- Complexity:
- \(O(n)\) where \(n\) is the return value of - number_of_distinct_words.
 
 - throw_if_contains_unique_letter(self: Ukkonen, w: str | list[int]) None
- Throw if the word w contains a letter equal to any of the unique letters added to the end of words in the suffix tree. - This function throws an exception if the word w contains a letter equal to any of the unique letters added to the end of words in the suffix tree. - Parameters:
- Raises:
- LibsemigroupsError – if w contains a letter equal to any of the unique letters added to the end of words in the suffix tree. 
- Complexity:
- Linear in the length of w. 
 
 - unique_letter(self: Ukkonen, i: int) int
- Returns the unique letter added to the end of the - i-th distinct word in the suffix tree.
 - word_index(self: Ukkonen, i: int) int
- Overloaded function. - word_index(self: Ukkonen, i: int) int
- Returns the index of the word corresponding to a position. - This function returns the least non-negative integer - jsuch that the i-th letter in the underlying string appears in the- j-th word added to the suffix tree.
 - word_index(self: Ukkonen, n: Ukkonen.Node) int
- Returns the index of the word corresponding to a node. - This function returns the least non-negative integer - isuch that the node n corresponds to the- i-th word added to the suffix tree.- Parameters:
- n (Ukkonen.Node) – the node. 
- Returns:
- The index of a word. 
- Return type:
- Complexity:
- Constant.