Ukkonen helper functions¶
This page contains the documentation for various helper functions for
manipulating Ukkonen
objects. All such functions are contained in the
submodule libsemigroups_pybind11.ukkonen
.
Contents¶
Add all words in a |
|
Add all words in a |
|
|
Returns a string containing a GraphViz representation of a suffix tree. |
Check if a word is a piece. |
|
Check if a word is a piece. |
|
Check if a word is a subword of any word in a suffix tree. |
|
Check if a word is a subword of any word in a suffix tree. |
|
Check if a word is a suffix of any word in a suffix tree. |
|
Check if a word is a suffix of any word in a suffix tree. |
|
Find the maximal piece prefix of a word. |
|
Find the maximal piece prefix of a word. |
|
Find the maximal piece suffix of a word. |
|
Find the maximal piece suffix of a word. |
|
Returns the number of distinct subwords of the words in a suffix tree. |
|
Find the number of pieces in a decomposition of a word (if any). |
|
Find the number of pieces in a decomposition of a word (if any). |
|
Find the pieces in a decomposition of a word (if any). |
|
Find the pieces in a decomposition of a word (if any). |
Full API¶
- add_words_no_checks(u: _libsemigroups_pybind11.Ukkonen, words: List[List[int]]) None ¶
Add all words in a
list
to a Ukkonen object.- Parameters
- Returns
None
Warning
This function does no checks on its arguments whatsoever. In particular, if any of the words in
words
contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.
- add_words(u: _libsemigroups_pybind11.Ukkonen, words: List[List[int]]) None ¶
Add all words in a
list
to a Ukkonen object.- Parameters
- Returns
None
Warning
This function does no checks on its arguments whatsoever. In particular, if any of the words in
words
contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.- Raises
RunTimeError
ifUkkonen.validate_word()
raises.
- dot(u: Ukkonen) str ¶
Returns a
graphviz.Digraph
containing a GraphViz representation of a suffix tree.- Parameters
u (Ukkonen) -- the Ukkonen object
- Returns
A value of type
graphviz.Digraph
.- Raises
RuntimeError -- if
u
does not contain any words.RuntimeError -- if the number of words in
u
is greater than 24.
- is_piece_no_checks(*args, **kwargs)¶
Overloaded function.
is_piece_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> bool
Check if a word is a piece (occurs in two distinct places in the words of the suffix tree).
Returns
True
if the wordw
that occurs in at least f$2f$ different (possibly overlapping) places in the words contained inu
. If no such prefix exists, thenFalse
is returned.- Parameters
- Returns
A value of type
bool
.- Complexity
Linear in the length of
w
.
Warning
This function does no checks on its arguments whatsoever. In particular, if the word
w
contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.is_piece_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: str) -> bool
Check if a word is a piece (occurs in two distinct places in the words of the suffix tree).
Returns
True
if the wordw
that occurs in at least f$2f$ different (possibly overlapping) places in the words contained inu
. If no such prefix exists, thenFalse
is returned.- Parameters
- Returns
A value of type
bool
.- Complexity
Linear in the length of
w
.
Warning
This function does no checks on its arguments whatsoever. In particular, if the word
w
contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.
- is_piece(*args, **kwargs)¶
Overloaded function.
is_piece(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> bool
Check if a word is a piece (occurs in two distinct places in the words of the suffix tree).
Returns
True
if the wordw
that occurs in at least f$2f$ different (possibly overlapping) places in the words contained inu
. If no such prefix exists, thenFalse
is returned.- Parameters
- Returns
A value of type
bool
.- Complexity
Linear in the length of
w
.- Raises
RunTimeError
ifUkkonen.validate_word()
raises.
is_piece(u: _libsemigroups_pybind11.Ukkonen, w: str) -> bool
Check if a word is a piece (occurs in two distinct places in the words of the suffix tree).
Returns
True
if the wordw
that occurs in at least f$2f$ different (possibly overlapping) places in the words contained inu
. If no such prefix exists, thenFalse
is returned.- Parameters
- Returns
A value of type
bool
.- Complexity
Linear in the length of
w
.- Raises
RunTimeError
ifUkkonen.validate_word()
raises.
- is_subword_no_checks(*args, **kwargs)¶
Overloaded function.
is_subword_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> bool
Check if a word is a subword of any word in a suffix tree.
Returns
True
if the wordw
is a subword of one of the words in the suffix tree represented by theUkkonen
instanceu
.- Parameters
- Returns
A value of type
bool
.
Warning
This function does no checks on its arguments whatsoever. In particular, if the word
w
contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.- Complexity
Linear in the length of
w
.
is_subword_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: str) -> bool
Check if a word is a subword of any word in a suffix tree.
Returns
True
if the wordw
is a subword of one of the words in the suffix tree represented by theUkkonen
instanceu
.- Parameters
- Returns
A value of type
bool
.
Warning
This function does no checks on its arguments whatsoever. In particular, if the word
w
contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.- Complexity
Linear in the length of
w
.
- is_subword(*args, **kwargs)¶
Overloaded function.
is_subword(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> bool
Check if a word is a subword of any word in a suffix tree.
Returns
True
if the wordw
is a subword of one of the words in the suffix tree represented by theUkkonen
instanceu
.- Parameters
- Returns
A value of type
bool
.- Complexity
Linear in the length of
w
.- Raises
RunTimeError
ifUkkonen.validate_word()
raises.
is_subword(u: _libsemigroups_pybind11.Ukkonen, w: str) -> bool
Check if a word is a subword of any word in a suffix tree.
Returns
True
if the wordw
is a subword of one of the words in the suffix tree represented by theUkkonen
instanceu
.- Parameters
- Returns
A value of type
bool
.- Complexity
Linear in the length of
w
.- Raises
RunTimeError
ifUkkonen.validate_word()
raises.
- is_suffix_no_checks(*args, **kwargs)¶
Overloaded function.
is_suffix_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> bool
Check if a word is a suffix of any word in a suffix tree.
Returns
True
if the wordw
is a suffix of one of the words in the suffix tree represented by the Ukkonen instanceu
.- Parameters
- Returns
A value of type
bool
.
Warning
This function does no checks on its arguments whatsoever. In particular, if the word
w
contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.- Complexity
Linear in the length of
w
.
is_suffix_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: str) -> bool
Check if a word is a suffix of any word in a suffix tree.
Returns
True
if the wordw
is a suffix of one of the words in the suffix tree represented by the Ukkonen instanceu
.- Parameters
- Returns
A value of type
bool
.
Warning
This function does no checks on its arguments whatsoever. In particular, if the word
w
contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.- Complexity
Linear in the length of
w
.
- is_suffix(*args, **kwargs)¶
Overloaded function.
is_suffix(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> bool
Check if a word is a suffix of any word in a suffix tree.
Returns
True
if the wordw
is a suffix of one of the words in the suffix tree represented by the Ukkonen instanceu
.- Parameters
- Returns
A value of type
bool
.- Complexity
Linear in the length of
w
.- Raises
RunTimeError
ifUkkonen.validate_word()
raises.
is_suffix(u: _libsemigroups_pybind11.Ukkonen, w: str) -> bool
Check if a word is a suffix of any word in a suffix tree.
Returns
True
if the wordw
is a suffix of one of the words in the suffix tree represented by the Ukkonen instanceu
.- Parameters
- Returns
A value of type
bool
.- Complexity
Linear in the length of
w
.- Raises
RunTimeError
ifUkkonen.validate_word()
raises.
- maximal_piece_prefix_no_checks(*args, **kwargs)¶
Overloaded function.
maximal_piece_prefix_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> List[int]
Find the maximal prefix of a word occurring in two different places in a word in a suffix tree.
Returns the maximal length prefix of the word
w
that occurs in at least two different (possibly overlapping) places in the words contained inu
. If no such prefix exists, then an empty list is returned.- Parameters
- Returns
A value of type
bool
.- Complexity
Linear in the length of
w
.
Warning
This function does no checks on its arguments whatsoever. In particular, if the word
w
contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.maximal_piece_prefix_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: str) -> str
Find the maximal prefix of a word occurring in two different places in a word in a suffix tree.
Returns the maximal length prefix of the word
w
that occurs in at least two different (possibly overlapping) places in the words contained inu
. If no such prefix exists, then an empty list is returned.- Parameters
- Returns
A value of type
bool
.- Complexity
Linear in the length of
w
.
Warning
This function does no checks on its arguments whatsoever. In particular, if the word
w
contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.
- maximal_piece_prefix(*args, **kwargs)¶
Overloaded function.
maximal_piece_prefix(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> List[int]
Find the maximal prefix of a word occurring in two different places in a word in a suffix tree.
Returns the maximal length prefix of the word
w
that occurs in at least two different (possibly overlapping) places in the words contained inu
. If no such prefix exists, then an empty list is returned.- Parameters
- Returns
A value of type
bool
.- Complexity
Linear in the length of
w
.- Raises
RunTimeError
ifUkkonen.validate_word()
raises.
maximal_piece_prefix(u: _libsemigroups_pybind11.Ukkonen, w: str) -> str
Find the maximal prefix of a word occurring in two different places in a word in a suffix tree.
Returns the maximal length prefix of the word
w
that occurs in at least two different (possibly overlapping) places in the words contained inu
. If no such prefix exists, then an empty list is returned.- Parameters
- Returns
A value of type
bool
.- Complexity
Linear in the length of
w
.- Raises
RunTimeError
ifUkkonen.validate_word()
raises.
- maximal_piece_suffix_no_checks(*args, **kwargs)¶
Overloaded function.
maximal_piece_suffix_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> List[int]
Find the maximal suffix of a word occurring in two different places in a word in a suffix tree.
Returns the maximal length suffix of the word
w
that occurs in at least two different (possibly overlapping) places in the words contained inu
. If no such suffix exists, then an empty list is returned.- Parameters
- Returns
A value of type
bool
.- Complexity
At worst \(O(m ^ 2)\) or \(O(n)\) where \(m\) is the length of
w
and \(n\) is the return value ofUkkonen.length_of_distinct_words()
.
Warning
This function does no checks on its arguments whatsoever. In particular, if the word
w
contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.maximal_piece_suffix_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: str) -> str
Find the maximal suffix of a word occurring in two different places in a word in a suffix tree.
Returns the maximal length suffix of the word
w
that occurs in at least two different (possibly overlapping) places in the words contained inu
. If no such suffix exists, then an empty list is returned.- Parameters
- Returns
A value of type
bool
.- Complexity
At worst \(O(m ^ 2)\) or \(O(n)\) where \(m\) is the length of
w
and \(n\) is the return value ofUkkonen.length_of_distinct_words()
.
Warning
This function does no checks on its arguments whatsoever. In particular, if the word
w
contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.
- maximal_piece_suffix(*args, **kwargs)¶
Overloaded function.
maximal_piece_suffix(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> List[int]
Find the maximal suffix of a word occurring in two different places in a word in a suffix tree.
Returns the maximal length suffix of the word
w
that occurs in at least two different (possibly overlapping) places in the words contained inu
. If no such suffix exists, then an empty list is returned.- Parameters
- Returns
A value of type
bool
.- Complexity
At worst \(O(m ^ 2)\) or \(O(n)\) where \(m\) is the length of
w
and \(n\) is the return value ofUkkonen.length_of_distinct_words()
.- Raises
RunTimeError
ifUkkonen.validate_word()
raises.
maximal_piece_suffix(u: _libsemigroups_pybind11.Ukkonen, w: str) -> str
Find the maximal suffix of a word occurring in two different places in a word in a suffix tree.
Returns the maximal length suffix of the word
w
that occurs in at least two different (possibly overlapping) places in the words contained inu
. If no such suffix exists, then an empty list is returned.- Parameters
- Returns
A value of type
bool
.- Complexity
At worst \(O(m ^ 2)\) or \(O(n)\) where \(m\) is the length of
w
and \(n\) is the return value ofUkkonen.length_of_distinct_words()
.- Raises
RunTimeError
ifUkkonen.validate_word()
raises.
- number_of_distinct_subwords(u: _libsemigroups_pybind11.Ukkonen) int ¶
Returns the number of distinct subwords of the words in a suffix tree.
Returns the total number of distinct subwords of the words in the suffix tree
u
.- param u
the Ukkonen object
- type u
Ukkonen
- returns
A value of type
size_t
.- complexity
Linear in
Ukkonen.length_of_distinct_words()
.
- Raises
RunTimeError
ifUkkonen.validate_word()
raises.
- number_of_pieces_no_checks(*args, **kwargs)¶
Overloaded function.
number_of_pieces_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> int
Find the number of pieces in a decomposition of a word (if any).
Returns minimum number of pieces whose product equals the word
w
if such a product exists, and0
if no such product exists. Recall that a piece is a word that occurs in two distinct positions (possibly overlapping) of the words in the suffix treeu
.- Parameters
- Returns
A value of type
bool
.- Complexity
Linear in the length of
w
.
Warning
This function does no checks on its arguments whatsoever. In particular, if the word
w
contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.number_of_pieces_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: str) -> int
Find the number of pieces in a decomposition of a word (if any).
Returns minimum number of pieces whose product equals the word
w
if such a product exists, and0
if no such product exists. Recall that a piece is a word that occurs in two distinct positions (possibly overlapping) of the words in the suffix treeu
.- Parameters
- Returns
A value of type
bool
.- Complexity
Linear in the length of
w
.
Warning
This function does no checks on its arguments whatsoever. In particular, if the word
w
contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.
- number_of_pieces(*args, **kwargs)¶
Overloaded function.
number_of_pieces(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> int
Find the number of pieces in a decomposition of a word (if any).
Returns minimum number of pieces whose product equals the word
w
if such a product exists, and0
if no such product exists. Recall that a piece is a word that occurs in two distinct positions (possibly overlapping) of the words in the suffix treeu
.- Parameters
- Returns
A value of type
bool
.- Complexity
Linear in the length of
w
.- Raises
RunTimeError
ifUkkonen.validate_word()
raises.
number_of_pieces(u: _libsemigroups_pybind11.Ukkonen, w: str) -> int
Find the number of pieces in a decomposition of a word (if any).
Returns minimum number of pieces whose product equals the word
w
if such a product exists, and0
if no such product exists. Recall that a piece is a word that occurs in two distinct positions (possibly overlapping) of the words in the suffix treeu
.- Parameters
- Returns
A value of type
bool
.- Complexity
Linear in the length of
w
.- Raises
RunTimeError
ifUkkonen.validate_word()
raises.
- pieces_no_checks(*args, **kwargs)¶
Overloaded function.
pieces_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> int
Find the pieces in a decomposition of a word (if any).
Returns a list of the pieces whose product equals the word corresponding to
first
and p last if such a product exists, and an empty vector if no such product exists. Recall that a piece is a word that occurs in two distinct positions (possibly overlapping) of the words in the suffix treeu
.- Parameters
- Returns
A value of type
List[List[int]]
.- Complexity
Linear in the length of
w
.
Warning
This function does no checks on its arguments whatsoever. In particular, if the word
w
contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.pieces_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: str) -> List[str]
Find the pieces in a decomposition of a word (if any).
Returns a list of the pieces whose product equals the word corresponding to
first
and p last if such a product exists, and an empty vector if no such product exists. Recall that a piece is a word that occurs in two distinct positions (possibly overlapping) of the words in the suffix treeu
.- Parameters
- Returns
A value of type
List[List[int]]
.- Complexity
Linear in the length of
w
.
Warning
This function does no checks on its arguments whatsoever. In particular, if the word
w
contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.
- pieces(*args, **kwargs)¶
Overloaded function.
pieces(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> List[List[int]]
Find the pieces in a decomposition of a word (if any).
Returns a list of the pieces whose product equals the word corresponding to
first
and p last if such a product exists, and an empty vector if no such product exists. Recall that a piece is a word that occurs in two distinct positions (possibly overlapping) of the words in the suffix treeu
.- Parameters
- Returns
A value of type
List[List[int]]
.- Complexity
Linear in the length of
w
.- Raises
RunTimeError
ifUkkonen.validate_word()
raises.
pieces(u: _libsemigroups_pybind11.Ukkonen, w: str) -> List[str]
Find the pieces in a decomposition of a word (if any).
Returns a list of the pieces whose product equals the word corresponding to
first
and p last if such a product exists, and an empty vector if no such product exists. Recall that a piece is a word that occurs in two distinct positions (possibly overlapping) of the words in the suffix treeu
.- Parameters
- Returns
A value of type
List[List[int]]
.- Complexity
Linear in the length of
w
.- Raises
RunTimeError
ifUkkonen.validate_word()
raises.