Ukkonen helper functions

This page contains the documentation for various helper functions for manipulating Ukkonen objects. All such functions are contained in the submodule libsemigroups_pybind11.ukkonen.

Contents

add_words_no_checks()

Add all words in a list to a Ukkonen object.

add_words()

Add all words in a list to a Ukkonen object.

dot()

Returns a string containing a GraphViz representation of a suffix tree.

is_piece_no_checks()

Check if a word is a piece.

is_piece()

Check if a word is a piece.

is_subword_no_checks()

Check if a word is a subword of any word in a suffix tree.

is_subword()

Check if a word is a subword of any word in a suffix tree.

is_suffix_no_checks()

Check if a word is a suffix of any word in a suffix tree.

is_suffix()

Check if a word is a suffix of any word in a suffix tree.

maximal_piece_prefix_no_checks()

Find the maximal piece prefix of a word.

maximal_piece_prefix()

Find the maximal piece prefix of a word.

maximal_piece_suffix_no_checks()

Find the maximal piece suffix of a word.

maximal_piece_suffix()

Find the maximal piece suffix of a word.

number_of_distinct_subwords()

Returns the number of distinct subwords of the words in a suffix tree.

number_of_pieces_no_checks()

Find the number of pieces in a decomposition of a word (if any).

number_of_pieces()

Find the number of pieces in a decomposition of a word (if any).

pieces_no_checks()

Find the pieces in a decomposition of a word (if any).

pieces()

Find the pieces in a decomposition of a word (if any).

Full API

add_words_no_checks(u: _libsemigroups_pybind11.Ukkonen, words: List[List[int]]) None

Add all words in a list to a Ukkonen object.

Parameters
  • u (Ukkonen) -- the Ukkonen object

  • words (List[List[int]]) -- the words to add

Returns

None

Warning

This function does no checks on its arguments whatsoever. In particular, if any of the words in words contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

add_words(u: _libsemigroups_pybind11.Ukkonen, words: List[List[int]]) None

Add all words in a list to a Ukkonen object.

Parameters
  • u (Ukkonen) -- the Ukkonen object

  • words (List[List[int]]) -- the words to add

Returns

None

Warning

This function does no checks on its arguments whatsoever. In particular, if any of the words in words contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

Raises

RunTimeError if Ukkonen.validate_word() raises.

dot(u: Ukkonen) str

Returns a graphviz.Digraph containing a GraphViz representation of a suffix tree.

Parameters

u (Ukkonen) -- the Ukkonen object

Returns

A value of type graphviz.Digraph.

Raises
is_piece_no_checks(*args, **kwargs)

Overloaded function.

  1. is_piece_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> bool

    Check if a word is a piece (occurs in two distinct places in the words of the suffix tree).

    Returns True if the word w that occurs in at least f$2f$ different (possibly overlapping) places in the words contained in u. If no such prefix exists, then False is returned.

    Parameters
    • u (Ukkonen) -- the Ukkonen object

    • w (List[int]) -- the possible subword

    Returns

    A value of type bool.

    Complexity

    Linear in the length of w.

    Warning

    This function does no checks on its arguments whatsoever. In particular, if the word w contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

  2. is_piece_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: str) -> bool

    Check if a word is a piece (occurs in two distinct places in the words of the suffix tree).

    Returns True if the word w that occurs in at least f$2f$ different (possibly overlapping) places in the words contained in u. If no such prefix exists, then False is returned.

    Parameters
    Returns

    A value of type bool.

    Complexity

    Linear in the length of w.

    Warning

    This function does no checks on its arguments whatsoever. In particular, if the word w contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

is_piece(*args, **kwargs)

Overloaded function.

  1. is_piece(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> bool

    Check if a word is a piece (occurs in two distinct places in the words of the suffix tree).

    Returns True if the word w that occurs in at least f$2f$ different (possibly overlapping) places in the words contained in u. If no such prefix exists, then False is returned.

    Parameters
    • u (Ukkonen) -- the Ukkonen object

    • w (List[int]) -- the possible subword

    Returns

    A value of type bool.

    Complexity

    Linear in the length of w.

    Raises

    RunTimeError if Ukkonen.validate_word() raises.

  2. is_piece(u: _libsemigroups_pybind11.Ukkonen, w: str) -> bool

    Check if a word is a piece (occurs in two distinct places in the words of the suffix tree).

    Returns True if the word w that occurs in at least f$2f$ different (possibly overlapping) places in the words contained in u. If no such prefix exists, then False is returned.

    Parameters
    Returns

    A value of type bool.

    Complexity

    Linear in the length of w.

    Raises

    RunTimeError if Ukkonen.validate_word() raises.

is_subword_no_checks(*args, **kwargs)

Overloaded function.

  1. is_subword_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> bool

    Check if a word is a subword of any word in a suffix tree.

    Returns True if the word w is a subword of one of the words in the suffix tree represented by the Ukkonen instance u.

    Parameters
    • u (Ukkonen) -- the Ukkonen object

    • w (List[int]) -- the possible subword

    Returns

    A value of type bool.

    Warning

    This function does no checks on its arguments whatsoever. In particular, if the word w contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

    Complexity

    Linear in the length of w.

  2. is_subword_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: str) -> bool

    Check if a word is a subword of any word in a suffix tree.

    Returns True if the word w is a subword of one of the words in the suffix tree represented by the Ukkonen instance u.

    Parameters
    Returns

    A value of type bool.

    Warning

    This function does no checks on its arguments whatsoever. In particular, if the word w contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

    Complexity

    Linear in the length of w.

is_subword(*args, **kwargs)

Overloaded function.

  1. is_subword(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> bool

    Check if a word is a subword of any word in a suffix tree.

    Returns True if the word w is a subword of one of the words in the suffix tree represented by the Ukkonen instance u.

    Parameters
    • u (Ukkonen) -- the Ukkonen object

    • w (List[int]) -- the possible subword

    Returns

    A value of type bool.

    Complexity

    Linear in the length of w.

    Raises

    RunTimeError if Ukkonen.validate_word() raises.

  2. is_subword(u: _libsemigroups_pybind11.Ukkonen, w: str) -> bool

    Check if a word is a subword of any word in a suffix tree.

    Returns True if the word w is a subword of one of the words in the suffix tree represented by the Ukkonen instance u.

    Parameters
    Returns

    A value of type bool.

    Complexity

    Linear in the length of w.

    Raises

    RunTimeError if Ukkonen.validate_word() raises.

is_suffix_no_checks(*args, **kwargs)

Overloaded function.

  1. is_suffix_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> bool

    Check if a word is a suffix of any word in a suffix tree.

    Returns True if the word w is a suffix of one of the words in the suffix tree represented by the Ukkonen instance u.

    Parameters
    • u (Ukkonen) -- the Ukkonen object

    • w (List[int]) -- the possible subword

    Returns

    A value of type bool.

    Warning

    This function does no checks on its arguments whatsoever. In particular, if the word w contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

    Complexity

    Linear in the length of w.

  2. is_suffix_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: str) -> bool

    Check if a word is a suffix of any word in a suffix tree.

    Returns True if the word w is a suffix of one of the words in the suffix tree represented by the Ukkonen instance u.

    Parameters
    Returns

    A value of type bool.

    Warning

    This function does no checks on its arguments whatsoever. In particular, if the word w contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

    Complexity

    Linear in the length of w.

is_suffix(*args, **kwargs)

Overloaded function.

  1. is_suffix(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> bool

    Check if a word is a suffix of any word in a suffix tree.

    Returns True if the word w is a suffix of one of the words in the suffix tree represented by the Ukkonen instance u.

    Parameters
    • u (Ukkonen) -- the Ukkonen object

    • w (List[int]) -- the possible subword

    Returns

    A value of type bool.

    Complexity

    Linear in the length of w.

    Raises

    RunTimeError if Ukkonen.validate_word() raises.

  2. is_suffix(u: _libsemigroups_pybind11.Ukkonen, w: str) -> bool

    Check if a word is a suffix of any word in a suffix tree.

    Returns True if the word w is a suffix of one of the words in the suffix tree represented by the Ukkonen instance u.

    Parameters
    Returns

    A value of type bool.

    Complexity

    Linear in the length of w.

    Raises

    RunTimeError if Ukkonen.validate_word() raises.

maximal_piece_prefix_no_checks(*args, **kwargs)

Overloaded function.

  1. maximal_piece_prefix_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> List[int]

    Find the maximal prefix of a word occurring in two different places in a word in a suffix tree.

    Returns the maximal length prefix of the word w that occurs in at least two different (possibly overlapping) places in the words contained in u. If no such prefix exists, then an empty list is returned.

    Parameters
    • u (Ukkonen) -- the Ukkonen object

    • w (List[int]) -- the possible subword

    Returns

    A value of type bool.

    Complexity

    Linear in the length of w.

    Warning

    This function does no checks on its arguments whatsoever. In particular, if the word w contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

  2. maximal_piece_prefix_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: str) -> str

    Find the maximal prefix of a word occurring in two different places in a word in a suffix tree.

    Returns the maximal length prefix of the word w that occurs in at least two different (possibly overlapping) places in the words contained in u. If no such prefix exists, then an empty list is returned.

    Parameters
    Returns

    A value of type bool.

    Complexity

    Linear in the length of w.

    Warning

    This function does no checks on its arguments whatsoever. In particular, if the word w contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

maximal_piece_prefix(*args, **kwargs)

Overloaded function.

  1. maximal_piece_prefix(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> List[int]

    Find the maximal prefix of a word occurring in two different places in a word in a suffix tree.

    Returns the maximal length prefix of the word w that occurs in at least two different (possibly overlapping) places in the words contained in u. If no such prefix exists, then an empty list is returned.

    Parameters
    Returns

    A value of type bool.

    Complexity

    Linear in the length of w.

    Raises

    RunTimeError if Ukkonen.validate_word() raises.

  2. maximal_piece_prefix(u: _libsemigroups_pybind11.Ukkonen, w: str) -> str

    Find the maximal prefix of a word occurring in two different places in a word in a suffix tree.

    Returns the maximal length prefix of the word w that occurs in at least two different (possibly overlapping) places in the words contained in u. If no such prefix exists, then an empty list is returned.

    Parameters
    Returns

    A value of type bool.

    Complexity

    Linear in the length of w.

    Raises

    RunTimeError if Ukkonen.validate_word() raises.

maximal_piece_suffix_no_checks(*args, **kwargs)

Overloaded function.

  1. maximal_piece_suffix_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> List[int]

    Find the maximal suffix of a word occurring in two different places in a word in a suffix tree.

    Returns the maximal length suffix of the word w that occurs in at least two different (possibly overlapping) places in the words contained in u. If no such suffix exists, then an empty list is returned.

    Parameters
    • u (Ukkonen) -- the Ukkonen object

    • w (List[int]) -- the possible subword

    Returns

    A value of type bool.

    Complexity

    At worst \(O(m ^ 2)\) or \(O(n)\) where \(m\) is the length of w and \(n\) is the return value of Ukkonen.length_of_distinct_words().

    Warning

    This function does no checks on its arguments whatsoever. In particular, if the word w contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

  2. maximal_piece_suffix_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: str) -> str

    Find the maximal suffix of a word occurring in two different places in a word in a suffix tree.

    Returns the maximal length suffix of the word w that occurs in at least two different (possibly overlapping) places in the words contained in u. If no such suffix exists, then an empty list is returned.

    Parameters
    Returns

    A value of type bool.

    Complexity

    At worst \(O(m ^ 2)\) or \(O(n)\) where \(m\) is the length of w and \(n\) is the return value of Ukkonen.length_of_distinct_words().

    Warning

    This function does no checks on its arguments whatsoever. In particular, if the word w contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

maximal_piece_suffix(*args, **kwargs)

Overloaded function.

  1. maximal_piece_suffix(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> List[int]

    Find the maximal suffix of a word occurring in two different places in a word in a suffix tree.

    Returns the maximal length suffix of the word w that occurs in at least two different (possibly overlapping) places in the words contained in u. If no such suffix exists, then an empty list is returned.

    Parameters
    • u (Ukkonen) -- the Ukkonen object

    • w (List[int]) -- the possible subword

    Returns

    A value of type bool.

    Complexity

    At worst \(O(m ^ 2)\) or \(O(n)\) where \(m\) is the length of w and \(n\) is the return value of Ukkonen.length_of_distinct_words().

    Raises

    RunTimeError if Ukkonen.validate_word() raises.

  2. maximal_piece_suffix(u: _libsemigroups_pybind11.Ukkonen, w: str) -> str

    Find the maximal suffix of a word occurring in two different places in a word in a suffix tree.

    Returns the maximal length suffix of the word w that occurs in at least two different (possibly overlapping) places in the words contained in u. If no such suffix exists, then an empty list is returned.

    Parameters
    Returns

    A value of type bool.

    Complexity

    At worst \(O(m ^ 2)\) or \(O(n)\) where \(m\) is the length of w and \(n\) is the return value of Ukkonen.length_of_distinct_words().

    Raises

    RunTimeError if Ukkonen.validate_word() raises.

number_of_distinct_subwords(u: _libsemigroups_pybind11.Ukkonen) int

Returns the number of distinct subwords of the words in a suffix tree.

Returns the total number of distinct subwords of the words in the suffix tree u.

param u

the Ukkonen object

type u

Ukkonen

returns

A value of type size_t.

complexity

Linear in Ukkonen.length_of_distinct_words().

Raises

RunTimeError if Ukkonen.validate_word() raises.

number_of_pieces_no_checks(*args, **kwargs)

Overloaded function.

  1. number_of_pieces_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> int

    Find the number of pieces in a decomposition of a word (if any).

    Returns minimum number of pieces whose product equals the word w if such a product exists, and 0 if no such product exists. Recall that a piece is a word that occurs in two distinct positions (possibly overlapping) of the words in the suffix tree u.

    Parameters
    • u (Ukkonen) -- the Ukkonen object

    • w (List[int]) -- the possible subword

    Returns

    A value of type bool.

    Complexity

    Linear in the length of w.

    Warning

    This function does no checks on its arguments whatsoever. In particular, if the word w contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

  2. number_of_pieces_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: str) -> int

    Find the number of pieces in a decomposition of a word (if any).

    Returns minimum number of pieces whose product equals the word w if such a product exists, and 0 if no such product exists. Recall that a piece is a word that occurs in two distinct positions (possibly overlapping) of the words in the suffix tree u.

    Parameters
    Returns

    A value of type bool.

    Complexity

    Linear in the length of w.

    Warning

    This function does no checks on its arguments whatsoever. In particular, if the word w contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

number_of_pieces(*args, **kwargs)

Overloaded function.

  1. number_of_pieces(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> int

    Find the number of pieces in a decomposition of a word (if any).

    Returns minimum number of pieces whose product equals the word w if such a product exists, and 0 if no such product exists. Recall that a piece is a word that occurs in two distinct positions (possibly overlapping) of the words in the suffix tree u.

    Parameters
    • u (Ukkonen) -- the Ukkonen object

    • w (List[int]) -- the possible subword

    Returns

    A value of type bool.

    Complexity

    Linear in the length of w.

    Raises

    RunTimeError if Ukkonen.validate_word() raises.

  2. number_of_pieces(u: _libsemigroups_pybind11.Ukkonen, w: str) -> int

    Find the number of pieces in a decomposition of a word (if any).

    Returns minimum number of pieces whose product equals the word w if such a product exists, and 0 if no such product exists. Recall that a piece is a word that occurs in two distinct positions (possibly overlapping) of the words in the suffix tree u.

    Parameters
    Returns

    A value of type bool.

    Complexity

    Linear in the length of w.

    Raises

    RunTimeError if Ukkonen.validate_word() raises.

pieces_no_checks(*args, **kwargs)

Overloaded function.

  1. pieces_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> int

    Find the pieces in a decomposition of a word (if any).

    Returns a list of the pieces whose product equals the word corresponding to first and p last if such a product exists, and an empty vector if no such product exists. Recall that a piece is a word that occurs in two distinct positions (possibly overlapping) of the words in the suffix tree u.

    Parameters
    • u (Ukkonen) -- the Ukkonen object

    • w (List[int]) -- the possible subword

    Returns

    A value of type List[List[int]].

    Complexity

    Linear in the length of w.

    Warning

    This function does no checks on its arguments whatsoever. In particular, if the word w contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

  2. pieces_no_checks(u: _libsemigroups_pybind11.Ukkonen, w: str) -> List[str]

    Find the pieces in a decomposition of a word (if any).

    Returns a list of the pieces whose product equals the word corresponding to first and p last if such a product exists, and an empty vector if no such product exists. Recall that a piece is a word that occurs in two distinct positions (possibly overlapping) of the words in the suffix tree u.

    Parameters
    Returns

    A value of type List[List[int]].

    Complexity

    Linear in the length of w.

    Warning

    This function does no checks on its arguments whatsoever. In particular, if the word w contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

pieces(*args, **kwargs)

Overloaded function.

  1. pieces(u: _libsemigroups_pybind11.Ukkonen, w: List[int]) -> List[List[int]]

    Find the pieces in a decomposition of a word (if any).

    Returns a list of the pieces whose product equals the word corresponding to first and p last if such a product exists, and an empty vector if no such product exists. Recall that a piece is a word that occurs in two distinct positions (possibly overlapping) of the words in the suffix tree u.

    Parameters
    • u (Ukkonen) -- the Ukkonen object

    • w (List[int]) -- the possible subword

    Returns

    A value of type List[List[int]].

    Complexity

    Linear in the length of w.

    Raises

    RunTimeError if Ukkonen.validate_word() raises.

  2. pieces(u: _libsemigroups_pybind11.Ukkonen, w: str) -> List[str]

    Find the pieces in a decomposition of a word (if any).

    Returns a list of the pieces whose product equals the word corresponding to first and p last if such a product exists, and an empty vector if no such product exists. Recall that a piece is a word that occurs in two distinct positions (possibly overlapping) of the words in the suffix tree u.

    Parameters
    Returns

    A value of type List[List[int]].

    Complexity

    Linear in the length of w.

    Raises

    RunTimeError if Ukkonen.validate_word() raises.