lambeq.tokeniser
- class lambeq.tokeniser.SpacyTokeniser
  Bases: Tokeniser

  Tokeniser class based on spaCy.
- split_sentences(text: str) → list[str]
Split input text into a list of sentences.
- Parameters:
  - text : str
    A single string that contains one or multiple sentences.
- Returns:
  - list of str
    List of sentences, one sentence per string.
- tokenise_sentence(sentence: str) → list[str]
Tokenise a sentence.
- Parameters:
  - sentence : str
    An untokenised sentence.
- Returns:
  - list of str
    The tokenised sentence, given as a list of tokens (strings).
- class lambeq.tokeniser.Tokeniser
  Bases: ABC

  Base class for all tokenisers.
- abstract split_sentences(text: str) → list[str]
Split input text into a list of sentences.
- Parameters:
  - text : str
    A single string that contains one or multiple sentences.
- Returns:
  - list of str
    List of sentences, one sentence per string.
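Custom tokenisers subclass `Tokeniser` and implement `split_sentences`. The sketch below uses only the standard library; `RegexTokeniser` is a hypothetical stand-in that splits on sentence-final punctuation instead of running a spaCy pipeline, and the ABC here merely mirrors the documented contract:

```python
import re
from abc import ABC, abstractmethod


class Tokeniser(ABC):
    # Mirrors the abstract contract documented above.
    @abstractmethod
    def split_sentences(self, text: str) -> list[str]:
        """Split input text into a list of sentences."""


class RegexTokeniser(Tokeniser):
    # Hypothetical illustration, not part of lambeq: a crude
    # regex-based replacement for a spaCy pipeline.
    def split_sentences(self, text: str) -> list[str]:
        # Split after sentence-final punctuation followed by whitespace.
        parts = re.split(r"(?<=[.!?])\s+", text.strip())
        return [p for p in parts if p]

    def tokenise_sentence(self, sentence: str) -> list[str]:
        # Separate word characters from punctuation marks.
        return re.findall(r"\w+|[^\w\s]", sentence)


tok = RegexTokeniser()
tok.split_sentences("First sentence. Second one!")
# → ['First sentence.', 'Second one!']
tok.tokenise_sentence("First sentence.")
# → ['First', 'sentence', '.']
```

A real subclass would delegate both methods to a proper NLP pipeline; the regex rules here break on abbreviations such as "e.g." and are only meant to show the interface.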