Ukkonen

This page describes the functionality in the Ukkonen class.

Ukkonen instance can be used to construction generalised suffix trees using Ukkonen's algorithm. Some related functionality is available in libsemigroups_pybind11.ukkonen.

Contents

Ukkonen

Overloaded function.

Ukkonen.add_word

Overloaded function.

Ukkonen.add_word_no_checks

Overloaded function.

Ukkonen.is_unique_letter

Check if a letter is a unique letter added to the end of a word in the suffix tree.

Ukkonen.length_of_distinct_words

Returns the sum of the lengths of the distinct words in the suffix tree.

Ukkonen.length_of_words

Returns the sum of the lengths of all of the words in the suffix tree.

Ukkonen.max_word_length

Returns the maximum length of word in the suffix tree.

Ukkonen.multiplicity

Returns the multiplicity of a word by index.

Ukkonen.number_of_distinct_words

Returns the number of distinct non-empty words in the suffix tree.

Ukkonen.number_of_words

Returns the number of non-empty words in the suffix tree.

Ukkonen.unique_letter

Returns the unique letter added to the end of a word in the suffix tree.

Ukkonen.validate_word

Validate a word.

Ukkonen.word_index

Returns the index of the word corresponding to a position.

Full API

class Ukkonen(*args, **kwargs)

Overloaded function.

  1. __init__(self: _libsemigroups_pybind11.Ukkonen) -> None

  2. __init__(self: _libsemigroups_pybind11.Ukkonen, arg0: _libsemigroups_pybind11.Ukkonen) -> None

add_word(*args, **kwargs)

Overloaded function.

  1. add_word(self: _libsemigroups_pybind11.Ukkonen, arg0: List[int]) -> None

    Add a word to the suffix tree.

    Calling this function immediately invokes Ukkonen's algorithm to add the given word to the suffix tree (if it is not already contained in the tree). If an identical word is already in the tree, then this function does nothing except increase the multiplicity of that word. If len(w) == 0, then this function does nothing.

    Parameters

    w (List[int]) -- the word to add.

    Complexity

    Linear in the length of w.

    Returns

    None

    Raises

    RunTimeError if Ukkonen.validate_word() raises.

  2. add_word(self: _libsemigroups_pybind11.Ukkonen, arg0: str) -> None

    Add a word to the suffix tree.

    Calling this function immediately invokes Ukkonen's algorithm to add the given word to the suffix tree (if it is not already contained in the tree). If an identical word is already in the tree, then this function does nothing except increase the multiplicity of that word. If len(w) == 0, then this function does nothing.

    Parameters

    w (str) -- the word to add.

    Complexity

    Linear in the length of w.

    Returns

    None

    Raises

    RunTimeError if Ukkonen.validate_word() raises.

add_word_no_checks(*args, **kwargs)

Overloaded function.

  1. add_word_no_checks(self: _libsemigroups_pybind11.Ukkonen, arg0: List[int]) -> None

    Add a word to the suffix tree.

    Calling this function immediately invokes Ukkonen's algorithm to add the given word to the suffix tree (if it is not already contained in the tree). If an identical word is already in the tree, then this function does nothing except increase the multiplicity of that word. If len(w) == 0, then this function does nothing.

    Parameters

    w (List[int]) -- the word to add.

    Complexity

    Linear in the length of w.

    Returns

    None

    Warning

    This function does no checks on its arguments whatsoever. In particular, if the word w contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

  2. add_word_no_checks(self: _libsemigroups_pybind11.Ukkonen, arg0: str) -> None

    Add a word to the suffix tree.

    Calling this function immediately invokes Ukkonen's algorithm to add the given word to the suffix tree (if it is not already contained in the tree). If an identical word is already in the tree, then this function does nothing except increase the multiplicity of that word. If len(w) == 0, then this function does nothing.

    Parameters

    w (str) -- the word to add.

    Complexity

    Linear in the length of w.

    Returns

    None

    Warning

    This function does no checks on its arguments whatsoever. In particular, if the word w contains any of the unique letters appended to the end of any existing word in the tree, then bad things will happen.

is_unique_letter(self: _libsemigroups_pybind11.Ukkonen, l: int) bool

Check if a letter is a unique letter added to the end of a word in the suffix tree.

Parameters

l (int) -- the letter to check.

Returns

A value of type bool.

length_of_distinct_words(self: _libsemigroups_pybind11.Ukkonen) int

Returns the sum of the lengths of the distinct words in the suffix tree.

Returns

A value of type int.

length_of_words(self: _libsemigroups_pybind11.Ukkonen) int

Returns the sum of the lengths of all of the words in the suffix tree.

Returns

A value of type int

max_word_length(self: _libsemigroups_pybind11.Ukkonen) int

Returns the maximum length of word in the suffix tree.

Returns

A value of type int.

multiplicity(self: _libsemigroups_pybind11.Ukkonen, i: int) int

Returns the multiplicity of a word by index.

Parameters

i (int) -- the index

Returns

A value of type int.

number_of_distinct_words(self: _libsemigroups_pybind11.Ukkonen) int

Returns the number of distinct non-empty words in the suffix tree.

Returns

A value of type int.

number_of_words(self: _libsemigroups_pybind11.Ukkonen) int

Returns the number of non-empty words in the suffix tree.

Returns

A value of type int.

unique_letter(self: _libsemigroups_pybind11.Ukkonen, i: int) int

Returns the unique letter added to the end of a word in the suffix tree.

Parameters

i (int) -- the index of an added word

Returns

A value of type int.

validate_word(self: _libsemigroups_pybind11.Ukkonen, arg0: List[int]) None

Validate a word.

word_index(self: _libsemigroups_pybind11.Ukkonen, i: int) int

Returns the index of the word corresponding to a position.

Parameters

i (int) - the position.

Returns

A value of type word_index_type.