LemmatizerInterface
Interface for lemmatization strategies.
Implementations can use dictionaries, rule-based stemming, or external NLP services to find the base form (lemma) of words.
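As a rough illustration of the dictionary strategy mentioned above, the following sketch implements the interface with an in-memory lookup table. The interface declaration is reconstructed from the signatures documented on this page (its real namespace is not shown here), and `DictionaryLemmatizer`, its constructor, and the sample data are hypothetical names introduced only for this example.

```php
<?php
declare(strict_types=1);

// Reconstructed from the signatures on this page; the real declaration
// may live in a namespace and carry additional docblocks.
interface LemmatizerInterface
{
    /** @return array<string|int, string> */
    public function getSupportedLanguages(): array;

    public function lemmatize(string $word, string $languageCode): ?string;

    /**
     * @param array<string|int, string> $words
     * @return array<string, string|null>
     */
    public function lemmatizeBatch(array $words, string $languageCode): array;

    public function supportsLanguage(string $languageCode): bool;
}

// Hypothetical implementation: exact-match lookup in a per-language dictionary.
final class DictionaryLemmatizer implements LemmatizerInterface
{
    /** @var array<string, array<string, string>> language code => (word form => lemma) */
    private array $dictionaries;

    public function __construct(array $dictionaries)
    {
        $this->dictionaries = $dictionaries;
    }

    public function getSupportedLanguages(): array
    {
        return array_keys($this->dictionaries);
    }

    public function lemmatize(string $word, string $languageCode): ?string
    {
        // Unknown languages and unknown words both yield null.
        return $this->dictionaries[$languageCode][$word] ?? null;
    }

    public function lemmatizeBatch(array $words, string $languageCode): array
    {
        $lemmas = [];
        foreach ($words as $word) {
            $lemmas[$word] = $this->lemmatize($word, $languageCode);
        }
        return $lemmas;
    }

    public function supportsLanguage(string $languageCode): bool
    {
        return isset($this->dictionaries[$languageCode]);
    }
}

$lemmatizer = new DictionaryLemmatizer(['en' => ['mice' => 'mouse', 'ran' => 'run']]);
var_dump($lemmatizer->lemmatize('mice', 'en')); // string(5) "mouse"
var_dump($lemmatizer->lemmatize('zzz', 'en'));  // NULL
```

Because `lemmatize()` may return null, callers typically need a fallback, for example `$lemmatizer->lemmatize($word, 'en') ?? $word` to keep the original word when no lemma is known.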
Table of Contents
Methods
- getSupportedLanguages() : array<string|int, string>
- Get the list of supported language codes.
- lemmatize() : string|null
- Find the lemma (base form) of a word.
- lemmatizeBatch() : array<string, string|null>
- Lemmatize multiple words in batch.
- supportsLanguage() : bool
- Check if this lemmatizer supports a given language.
Methods
getSupportedLanguages()
Get the list of supported language codes.
public
getSupportedLanguages() : array<string|int, string>
Return values
array<string|int, string> - Array of ISO language codes
lemmatize()
Find the lemma (base form) of a word.
public
lemmatize(string $word, string $languageCode) : string|null
Parameters
- $word : string
  The word to lemmatize
- $languageCode : string
  ISO language code (e.g., 'en', 'de', 'fr')
Return values
string|null - The lemma, or null if no lemma is found for the word
lemmatizeBatch()
Lemmatize multiple words in batch.
public
lemmatizeBatch(array<string|int, string> $words, string $languageCode) : array<string, string|null>
Parameters
- $words : array<string|int, string>
  Array of words to lemmatize
- $languageCode : string
  ISO language code (e.g., 'en', 'de', 'fr')
Return values
array<string, string|null> - Map of each input word to its lemma, or to null if no lemma is found
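One simple way to satisfy this contract is to loop over `lemmatize()`; the sketch below does that in a trait. `LoopingBatchLemmatizer` and the toy anonymous class are hypothetical names for illustration and are not part of the documented interface. A service-backed implementation would more likely send all words in a single request.

```php
<?php
declare(strict_types=1);

// Hypothetical helper trait: builds lemmatizeBatch() on top of lemmatize().
trait LoopingBatchLemmatizer
{
    abstract public function lemmatize(string $word, string $languageCode): ?string;

    /**
     * @param array<string|int, string> $words
     * @return array<string, string|null>
     */
    public function lemmatizeBatch(array $words, string $languageCode): array
    {
        $lemmas = [];
        foreach ($words as $word) {
            // Input keys are ignored; the word itself becomes the result key,
            // so repeated words collapse into a single entry.
            $lemmas[$word] = $this->lemmatize($word, $languageCode);
        }
        return $lemmas;
    }
}

// Toy lemmatizer used only to exercise the trait.
$batchDemo = new class {
    use LoopingBatchLemmatizer;

    public function lemmatize(string $word, string $languageCode): ?string
    {
        $known = ['geese' => 'goose', 'went' => 'go'];
        return $known[$word] ?? null;
    }
};

$batchResult = $batchDemo->lemmatizeBatch(['geese', 'went', 'geese', 'blorp'], 'en');
// $batchResult === ['geese' => 'goose', 'went' => 'go', 'blorp' => null]
```

Two consequences of keying the result by word: duplicates in the input produce one entry, and PHP casts numeric-string keys such as '42' to integers, so purely numeric tokens come back under int keys.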
supportsLanguage()
Check if this lemmatizer supports a given language.
public
supportsLanguage(string $languageCode) : bool
Parameters
- $languageCode : string
  ISO language code (e.g., 'en', 'de', 'fr')
Return values
bool - True if the language is supported
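To keep `supportsLanguage()` consistent with `getSupportedLanguages()`, an implementation could derive one from the other. The trait below is a minimal sketch of that idea; `ListBackedLanguageSupport` and the anonymous class are hypothetical, and the case-sensitive comparison is an assumption rather than documented behavior.

```php
<?php
declare(strict_types=1);

// Hypothetical helper trait: derives supportsLanguage() from getSupportedLanguages().
trait ListBackedLanguageSupport
{
    /** @return array<string|int, string> */
    abstract public function getSupportedLanguages(): array;

    public function supportsLanguage(string $languageCode): bool
    {
        // Strict mode compares type and value, so 'DE' does not match 'de'.
        return in_array($languageCode, $this->getSupportedLanguages(), true);
    }
}

// Toy object used only to exercise the trait.
$supportDemo = new class {
    use ListBackedLanguageSupport;

    public function getSupportedLanguages(): array
    {
        return ['en', 'de', 'fr'];
    }
};

var_dump($supportDemo->supportsLanguage('de')); // bool(true)
var_dump($supportDemo->supportsLanguage('DE')); // bool(false)
```

Callers can use this check as a guard before calling `lemmatize()` or `lemmatizeBatch()`, since the page does not specify what those methods do for an unsupported language.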