LemmatizerFactory
Factory for creating lemmatizers based on language configuration.
This factory implements a hybrid approach:
- Dictionary-based lookup (fast, predictable)
- NLP service fallback (spaCy, high accuracy)
The lemmatizer type can be configured per language in the database.
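The hybrid approach described above can be pictured roughly as follows. This is a minimal sketch, not the library's implementation: the `Sketch*` names and the `lemmatize()` method are hypothetical stand-ins, because this page does not show the members of `LemmatizerInterface`. The NLP class here is a local placeholder and makes no call to spaCy.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in; the real LemmatizerInterface is not documented on this page.
interface SketchLemmatizer
{
    public function lemmatize(string $word): ?string;
}

// Dictionary lookup: fast and predictable, but only knows listed words.
final class SketchDictionary implements SketchLemmatizer
{
    /** @param array<string, string> $entries word => lemma */
    public function __construct(private array $entries) {}

    public function lemmatize(string $word): ?string
    {
        return $this->entries[$word] ?? null;
    }
}

// Placeholder for the NLP service: it only strips a trailing "s".
final class SketchNlp implements SketchLemmatizer
{
    public function lemmatize(string $word): ?string
    {
        return str_ends_with($word, 's') ? substr($word, 0, -1) : $word;
    }
}

// Hybrid: try the dictionary first, fall back to the second lemmatizer on a miss.
final class SketchHybrid implements SketchLemmatizer
{
    public function __construct(
        private SketchLemmatizer $dictionary,
        private SketchLemmatizer $fallback
    ) {}

    public function lemmatize(string $word): ?string
    {
        return $this->dictionary->lemmatize($word) ?? $this->fallback->lemmatize($word);
    }
}

$hybrid = new SketchHybrid(new SketchDictionary(['mice' => 'mouse']), new SketchNlp());
echo $hybrid->lemmatize('mice'), "\n"; // dictionary hit: mouse
echo $hybrid->lemmatize('cats'), "\n"; // dictionary miss, fallback: cat
```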
Table of Contents
Constants
- TYPE_DICTIONARY = 'dictionary'
- TYPE_HYBRID = 'hybrid'
- TYPE_NONE = 'none'
- Lemmatizer types
- TYPE_SPACY = 'spacy'
Properties
- $dictionaryLemmatizer : DictionaryLemmatizer|null
- $instances : array<string, LemmatizerInterface>
- $nlpLemmatizer : NlpServiceLemmatizer|null
Methods
- clearCache() : void
- Clear cached instances (useful for testing).
- createHybridLemmatizer() : LemmatizerInterface
- Create a hybrid lemmatizer (dictionary + NLP fallback).
- createLemmatizer() : LemmatizerInterface
- Create a lemmatizer by type.
- getAllNlpLanguages() : array<string|int, string>
- Get all potentially supported NLP languages (including uninstalled).
- getBestAvailable() : LemmatizerInterface
- Get the best available lemmatizer for a language code.
- getDictionaryLemmatizer() : DictionaryLemmatizer
- Get the dictionary-based lemmatizer.
- getForLanguage() : LemmatizerInterface|null
- Get the appropriate lemmatizer for a language.
- getNlpServiceLemmatizer() : NlpServiceLemmatizer
- Get the NLP service lemmatizer (spaCy).
- getNlpSupportedLanguages() : array<string|int, string>
- Get list of languages supported by the NLP service.
- isNlpServiceAvailable() : bool
- Check if NLP service is available.
- getLanguageLemmatizerType() : string
- Get the configured lemmatizer type for a language.
Constants
TYPE_DICTIONARY
public mixed TYPE_DICTIONARY = 'dictionary'
TYPE_HYBRID
public mixed TYPE_HYBRID = 'hybrid'
TYPE_NONE
Lemmatizer types
public mixed TYPE_NONE = 'none'
TYPE_SPACY
public mixed TYPE_SPACY = 'spacy'
Properties
$dictionaryLemmatizer
private static DictionaryLemmatizer|null $dictionaryLemmatizer = null
Cached dictionary lemmatizer
$instances
private static array<string, LemmatizerInterface> $instances = []
Cached lemmatizer instances
$nlpLemmatizer
private static NlpServiceLemmatizer|null $nlpLemmatizer = null
Cached NLP service lemmatizer
Methods
clearCache()
Clear cached instances (useful for testing).
public static clearCache() : void
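Why clearing the cache matters in tests can be illustrated with a small memoizing factory. This is a sketch under assumptions: `SketchFactory` and its `get()` method are hypothetical, and only the idea of a static `$instances` array that `clearCache()` resets is taken from this page.

```php
<?php
declare(strict_types=1);

final class SketchFactory
{
    /** @var array<string, object> cached instances, keyed by an arbitrary string */
    private static array $instances = [];

    // Hypothetical accessor: create the object once per key, then reuse it.
    public static function get(string $key): object
    {
        return self::$instances[$key] ??= new stdClass();
    }

    // Drop every cached instance so the next get() builds a fresh object.
    public static function clearCache(): void
    {
        self::$instances = [];
    }
}

$first  = SketchFactory::get('en');
$second = SketchFactory::get('en');   // served from the cache
SketchFactory::clearCache();
$third  = SketchFactory::get('en');   // rebuilt after the cache was cleared

var_dump($first === $second); // bool(true)
var_dump($first === $third);  // bool(false)
```

Because the cache is static, state created in one test would otherwise leak into the next, so calling `clearCache()` in test setup or teardown is a common way to keep tests isolated.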
createHybridLemmatizer()
Create a hybrid lemmatizer (dictionary + NLP fallback).
public static createHybridLemmatizer() : LemmatizerInterface
Return values
LemmatizerInterface
createLemmatizer()
Create a lemmatizer by type.
public static createLemmatizer(string $type) : LemmatizerInterface
Parameters
- $type : string
  Lemmatizer type
Return values
LemmatizerInterface
getAllNlpLanguages()
Get all potentially supported NLP languages (including uninstalled).
public static getAllNlpLanguages() : array<string|int, string>
Return values
array<string|int, string>
getBestAvailable()
Get the best available lemmatizer for a language code.
public static getBestAvailable(string $languageCode) : LemmatizerInterface
Uses a fallback chain:
- Try NLP service (spaCy) if available for this language
- Fall back to dictionary-based lemmatizer
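The fallback chain above can be sketched as a small decision function. The function name and its parameters are hypothetical and chosen for illustration; only the order of the two steps and the `'spacy'` and `'dictionary'` type values come from this page.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper mirroring the documented fallback chain.
 *
 * @param string       $languageCode ISO language code
 * @param bool         $nlpAvailable whether the NLP service is reachable (assumed input)
 * @param list<string> $nlpLanguages languages the NLP service supports (assumed input)
 */
function pickLemmatizerType(string $languageCode, bool $nlpAvailable, array $nlpLanguages): string
{
    // Step 1: prefer the NLP service when it is up and supports this language.
    if ($nlpAvailable && in_array($languageCode, $nlpLanguages, true)) {
        return 'spacy';
    }

    // Step 2: otherwise fall back to the dictionary-based lemmatizer.
    return 'dictionary';
}

echo pickLemmatizerType('de', true, ['en', 'de']), "\n";  // spacy
echo pickLemmatizerType('fi', true, ['en', 'de']), "\n";  // dictionary
echo pickLemmatizerType('de', false, ['en', 'de']), "\n"; // dictionary
```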
Parameters
- $languageCode : string
  ISO language code
Return values
LemmatizerInterface
getDictionaryLemmatizer()
Get the dictionary-based lemmatizer.
public static getDictionaryLemmatizer() : DictionaryLemmatizer
Return values
DictionaryLemmatizer
getForLanguage()
Get the appropriate lemmatizer for a language.
public static getForLanguage(int $languageId[, string|null $type = null]) : LemmatizerInterface|null
Parameters
- $languageId : int
  Language ID
- $type : string|null = null
  Force a specific lemmatizer type (null = use the language's configured type)
Return values
LemmatizerInterface|null: the lemmatizer, or null if none is configured
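How the optional `$type` argument interacts with the per-language configuration can be sketched as below. Both functions are hypothetical, and strings stand in for real lemmatizer objects. Mapping `'none'` to null is an assumption based on the documented return value, and treating unknown types as null is a choice made only for this sketch.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of type resolution: an explicit $type wins,
 * otherwise the language's configured type is used.
 */
function resolveLemmatizerType(?string $type, string $configuredType): string
{
    return $type ?? $configuredType;
}

/**
 * Assumption: 'none' (and, in this sketch, any unknown type) yields no lemmatizer.
 * The returned strings are placeholders for lemmatizer instances.
 */
function describeLemmatizer(string $type): ?string
{
    return match ($type) {
        'dictionary' => 'dictionary lemmatizer',
        'spacy'      => 'NLP service lemmatizer',
        'hybrid'     => 'hybrid lemmatizer',
        default      => null,
    };
}

var_dump(describeLemmatizer(resolveLemmatizerType(null, 'hybrid')));    // configured type is used
var_dump(describeLemmatizer(resolveLemmatizerType('spacy', 'hybrid'))); // forced type wins
var_dump(describeLemmatizer(resolveLemmatizerType(null, 'none')));      // no lemmatizer
```

Going by the signature documented here, the real calls would take the form `LemmatizerFactory::getForLanguage($languageId)` or `LemmatizerFactory::getForLanguage($languageId, LemmatizerFactory::TYPE_SPACY)`, and callers should check the result for null before using it.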
getNlpServiceLemmatizer()
Get the NLP service lemmatizer (spaCy).
public static getNlpServiceLemmatizer() : NlpServiceLemmatizer
Return values
NlpServiceLemmatizer
getNlpSupportedLanguages()
Get list of languages supported by the NLP service.
public static getNlpSupportedLanguages() : array<string|int, string>
Return values
array<string|int, string>
isNlpServiceAvailable()
Check if NLP service is available.
public static isNlpServiceAvailable() : bool
Return values
bool
getLanguageLemmatizerType()
Get the configured lemmatizer type for a language.
private static getLanguageLemmatizerType(int $languageId) : string
Parameters
- $languageId : int
  Language ID
Return values
string: the lemmatizer type