
LemmatizerInterface

Interface for lemmatization strategies.

Implementations can use dictionaries, rule-based stemming, or external NLP services to find the base form (lemma) of words.

Tags
since 3.0.0
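
As a rough illustration of the dictionary strategy mentioned above, the following sketch implements the four documented methods on top of an in-memory array. The class name DictionaryLemmatizer, its constructor, and the sample dictionary are hypothetical, and the interface is redeclared locally (without a namespace) only so the snippet runs on its own; real code would import the library's interface instead.

```php
<?php
declare(strict_types=1);

// Local stand-in mirroring the documented signatures (assumption: the real
// interface lives in the library's namespace, which this page does not show).
interface LemmatizerInterface
{
    public function getSupportedLanguages(): array;
    public function lemmatize(string $word, string $languageCode): ?string;
    public function lemmatizeBatch(array $words, string $languageCode): array;
    public function supportsLanguage(string $languageCode): bool;
}

// Hypothetical dictionary-backed implementation, for illustration only.
final class DictionaryLemmatizer implements LemmatizerInterface
{
    /** @param array<string, array<string, string>> $dictionaries language code => (word form => lemma) */
    public function __construct(private array $dictionaries)
    {
    }

    public function getSupportedLanguages(): array
    {
        return array_keys($this->dictionaries);
    }

    public function lemmatize(string $word, string $languageCode): ?string
    {
        // Unknown languages and unknown words both yield null.
        return $this->dictionaries[$languageCode][$word] ?? null;
    }

    public function lemmatizeBatch(array $words, string $languageCode): array
    {
        $lemmas = [];
        foreach ($words as $word) {
            $lemmas[$word] = $this->lemmatize($word, $languageCode);
        }
        return $lemmas;
    }

    public function supportsLanguage(string $languageCode): bool
    {
        return isset($this->dictionaries[$languageCode]);
    }
}

$lemmatizer = new DictionaryLemmatizer([
    'en' => ['running' => 'run', 'mice' => 'mouse'],
]);
```

A rule-based or service-backed implementation would keep the same method signatures and change only how lemmatize() resolves a word.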

Table of Contents

Methods

getSupportedLanguages()  : array<string|int, string>
Get the list of supported language codes.
lemmatize()  : string|null
Find the lemma (base form) of a word.
lemmatizeBatch()  : array<string, string|null>
Lemmatize multiple words in batch.
supportsLanguage()  : bool
Check if this lemmatizer supports a given language.

Methods

getSupportedLanguages()

Get the list of supported language codes.

public getSupportedLanguages() : array<string|int, string>
Return values
array<string|int, string>

Array of supported ISO language codes (e.g., 'en', 'de', 'fr')
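
Because the return type allows both integer and string keys, callers that only need the codes may want to ignore the keys. The sketch below shows one way to normalize the result into a sorted list without duplicates. The helper supportedLanguageList() and the stub implementation are hypothetical, and the interface is a local stand-in for the documented one.

```php
<?php
declare(strict_types=1);

// Local stand-in mirroring the documented signatures (assumption).
interface LemmatizerInterface
{
    public function getSupportedLanguages(): array;
    public function lemmatize(string $word, string $languageCode): ?string;
    public function lemmatizeBatch(array $words, string $languageCode): array;
    public function supportsLanguage(string $languageCode): bool;
}

/**
 * Hypothetical helper: return the supported codes as a sorted list,
 * discarding whatever keys the implementation happened to use.
 *
 * @return list<string>
 */
function supportedLanguageList(LemmatizerInterface $lemmatizer): array
{
    $codes = array_values(array_unique($lemmatizer->getSupportedLanguages()));
    sort($codes);
    return $codes;
}

// Throwaway stub that returns mixed keys and a duplicate code.
$stub = new class implements LemmatizerInterface {
    public function getSupportedLanguages(): array
    {
        return ['primary' => 'en', 3 => 'de', 7 => 'en'];
    }

    public function lemmatize(string $word, string $languageCode): ?string
    {
        return null;
    }

    public function lemmatizeBatch(array $words, string $languageCode): array
    {
        return array_fill_keys($words, null);
    }

    public function supportsLanguage(string $languageCode): bool
    {
        return in_array($languageCode, $this->getSupportedLanguages(), true);
    }
};

$codes = supportedLanguageList($stub);
```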

lemmatize()

Find the lemma (base form) of a word.

public lemmatize(string $word, string $languageCode) : string|null
Parameters
$word : string

The word to lemmatize

$languageCode : string

ISO language code (e.g., 'en', 'de', 'fr')

Return values
string|null

The lemma, or null if not found
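
Since lemmatize() may return null, callers usually need a fallback. A common choice, sketched here, is to keep the original word when no lemma is found. The helper lemmaOrWord() and the stub are hypothetical, and the interface is a local stand-in for the documented one.

```php
<?php
declare(strict_types=1);

// Local stand-in mirroring the documented signatures (assumption).
interface LemmatizerInterface
{
    public function getSupportedLanguages(): array;
    public function lemmatize(string $word, string $languageCode): ?string;
    public function lemmatizeBatch(array $words, string $languageCode): array;
    public function supportsLanguage(string $languageCode): bool;
}

// Hypothetical helper: fall back to the input word when no lemma is found.
function lemmaOrWord(LemmatizerInterface $lemmatizer, string $word, string $languageCode): string
{
    return $lemmatizer->lemmatize($word, $languageCode) ?? $word;
}

// Throwaway stub that knows a single English word, so the sketch runs alone.
$stub = new class implements LemmatizerInterface {
    public function getSupportedLanguages(): array
    {
        return ['en'];
    }

    public function lemmatize(string $word, string $languageCode): ?string
    {
        return ($languageCode === 'en' && $word === 'geese') ? 'goose' : null;
    }

    public function lemmatizeBatch(array $words, string $languageCode): array
    {
        $lemmas = [];
        foreach ($words as $word) {
            $lemmas[$word] = $this->lemmatize($word, $languageCode);
        }
        return $lemmas;
    }

    public function supportsLanguage(string $languageCode): bool
    {
        return $languageCode === 'en';
    }
};

$known = lemmaOrWord($stub, 'geese', 'en');
$unknown = lemmaOrWord($stub, 'xyzzy', 'en');
```

Whether falling back to the surface form is appropriate depends on the caller; an indexer may prefer it, while a spell checker may want to treat null as a signal.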

lemmatizeBatch()

Lemmatize multiple words in batch.

public lemmatizeBatch(array<string|int, string> $words, string $languageCode) : array<string, string|null>
Parameters
$words : array<string|int, string>

Array of words to lemmatize

$languageCode : string

ISO language code

Return values
array<string, string|null>

Map from each input word to its lemma, or null if no lemma was found
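
Two consequences of a result keyed by word are worth keeping in mind: repeated input words can only appear once in the result, and isset() reports false for words whose lemma is null, so array_key_exists() is the safer membership test. The sketch below splits a batch result into found and missing words. The stub and its tiny dictionary are hypothetical, and the interface is a local stand-in for the documented one.

```php
<?php
declare(strict_types=1);

// Local stand-in mirroring the documented signatures (assumption).
interface LemmatizerInterface
{
    public function getSupportedLanguages(): array;
    public function lemmatize(string $word, string $languageCode): ?string;
    public function lemmatizeBatch(array $words, string $languageCode): array;
    public function supportsLanguage(string $languageCode): bool;
}

// Throwaway stub with a two-word dictionary, so the sketch runs alone.
$stub = new class implements LemmatizerInterface {
    private const LEMMAS = ['cats' => 'cat', 'dogs' => 'dog'];

    public function getSupportedLanguages(): array
    {
        return ['en'];
    }

    public function lemmatize(string $word, string $languageCode): ?string
    {
        return self::LEMMAS[$word] ?? null;
    }

    public function lemmatizeBatch(array $words, string $languageCode): array
    {
        $lemmas = [];
        foreach ($words as $word) {
            $lemmas[$word] = $this->lemmatize($word, $languageCode);
        }
        return $lemmas;
    }

    public function supportsLanguage(string $languageCode): bool
    {
        return $languageCode === 'en';
    }
};

// 'cats' is passed twice but can only occupy one key in the result.
$result = $stub->lemmatizeBatch(['cats', 'dogs', 'qwrtz', 'cats'], 'en');

// Words with a lemma, and words for which the lemmatizer returned null.
$found = array_filter($result, fn (?string $lemma): bool => $lemma !== null);
$missing = array_keys(array_diff_key($result, $found));
```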

supportsLanguage()

Check if this lemmatizer supports a given language.

public supportsLanguage(string $languageCode) : bool
Parameters
$languageCode : string

ISO language code

Return values
bool

True if the language is supported
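
One use of supportsLanguage() is to choose among several lemmatizers at runtime. The sketch below returns the first candidate that supports a language, or null if none does. The helpers firstSupporting() and makeStub() are hypothetical, and the interface is a local stand-in for the documented one.

```php
<?php
declare(strict_types=1);

// Local stand-in mirroring the documented signatures (assumption).
interface LemmatizerInterface
{
    public function getSupportedLanguages(): array;
    public function lemmatize(string $word, string $languageCode): ?string;
    public function lemmatizeBatch(array $words, string $languageCode): array;
    public function supportsLanguage(string $languageCode): bool;
}

/**
 * Hypothetical helper: pick the first lemmatizer that supports the language.
 *
 * @param iterable<LemmatizerInterface> $lemmatizers candidates in priority order
 */
function firstSupporting(iterable $lemmatizers, string $languageCode): ?LemmatizerInterface
{
    foreach ($lemmatizers as $lemmatizer) {
        if ($lemmatizer->supportsLanguage($languageCode)) {
            return $lemmatizer;
        }
    }
    return null;
}

// Hypothetical factory for stubs that only differ in the languages they claim.
function makeStub(string ...$languages): LemmatizerInterface
{
    return new class($languages) implements LemmatizerInterface {
        public function __construct(private array $languages)
        {
        }

        public function getSupportedLanguages(): array
        {
            return $this->languages;
        }

        public function lemmatize(string $word, string $languageCode): ?string
        {
            return null; // the stub never finds a lemma
        }

        public function lemmatizeBatch(array $words, string $languageCode): array
        {
            return array_fill_keys($words, null);
        }

        public function supportsLanguage(string $languageCode): bool
        {
            return in_array($languageCode, $this->languages, true);
        }
    };
}

$english = makeStub('en');
$germanFrench = makeStub('de', 'fr');

$forFrench = firstSupporting([$english, $germanFrench], 'fr');
$forJapanese = firstSupporting([$english, $germanFrench], 'ja');
```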


        