ParserInterface
Interface for text parsers that tokenize text into words and sentences.
Implementations of this interface provide language-specific parsing strategies. Each parser handles text tokenization differently based on the language's characteristics (e.g., space-separated words, character-based, morphological).
Table of Contents
Methods
- getAvailabilityMessage() : string
- Get a description of why this parser might not be available.
- getName() : string
- Get human-readable name for UI display.
- getType() : string
- Get unique identifier for this parser type.
- isAvailable() : bool
- Check if this parser is available on the current system.
- parse() : ParserResult
- Parse text into a structured result with sentences and tokens.
Methods
getAvailabilityMessage()
Get a description of why this parser might not be available.
public
getAvailabilityMessage() : string
Called when isAvailable() returns false to provide helpful feedback.
Return values
string: Description of the missing dependencies, or an empty string if the parser is available
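As a rough illustration of how a caller might combine isAvailable() and getAvailabilityMessage(), the sketch below uses a trimmed-down, hypothetical interface (`AvailabilityAware`), a hypothetical parser (`MissingToolParser`), and a hypothetical helper (`availabilityWarning`). None of these names come from the documented API; only the two method signatures do.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in containing only the availability methods of ParserInterface.
interface AvailabilityAware
{
    public function isAvailable(): bool;
    public function getAvailabilityMessage(): string;
}

// Hypothetical parser whose external dependency is assumed to be missing.
final class MissingToolParser implements AvailabilityAware
{
    public function isAvailable(): bool
    {
        return false;
    }

    public function getAvailabilityMessage(): string
    {
        // Empty string when available, explanation otherwise.
        return $this->isAvailable() ? '' : 'The required external tool was not found.';
    }
}

// Hypothetical helper: null when the parser is usable, otherwise feedback for the user.
function availabilityWarning(AvailabilityAware $parser): ?string
{
    if ($parser->isAvailable()) {
        return null;
    }
    $message = $parser->getAvailabilityMessage();
    return $message !== '' ? $message : 'Parser is unavailable.';
}

$warning = availabilityWarning(new MissingToolParser());
```

Asking for the message only after isAvailable() has returned false mirrors the calling order described above.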
getName()
Get human-readable name for UI display.
public
getName() : string
This name is shown to users when selecting a parser for a language. Examples: 'Standard (Regex)', 'Character-by-Character', 'MeCab (Japanese)'.
Return values
string: Human-readable parser name
getType()
Get unique identifier for this parser type.
public
getType() : string
Used for registration in the ParserRegistry and for selection by Language entities. Examples: 'regex', 'character', 'mecab'.
Return values
string: Parser type identifier
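The real ParserRegistry API is not shown on this page, so the following is only a sketch of how a type identifier could key a lookup table. `SimpleRegistry`, `TypedParser`, and `RegexParserStub` are hypothetical names introduced for illustration, not part of the documented code.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in containing only the identification methods of ParserInterface.
interface TypedParser
{
    public function getType(): string;
    public function getName(): string;
}

// Hypothetical registry; the actual ParserRegistry may work differently.
final class SimpleRegistry
{
    /** @var array<string, TypedParser> */
    private array $parsers = [];

    public function register(TypedParser $parser): void
    {
        $type = $parser->getType();
        // The identifier must be unique, so reject duplicates.
        if (isset($this->parsers[$type])) {
            throw new InvalidArgumentException("Parser type '$type' is already registered.");
        }
        $this->parsers[$type] = $parser;
    }

    public function get(string $type): ?TypedParser
    {
        return $this->parsers[$type] ?? null;
    }
}

// Hypothetical parser stub using the example values quoted above.
final class RegexParserStub implements TypedParser
{
    public function getType(): string
    {
        return 'regex';
    }

    public function getName(): string
    {
        return 'Standard (Regex)';
    }
}

$registry = new SimpleRegistry();
$registry->register(new RegexParserStub());
$selected = $registry->get('regex');
```

A language setting would then store only the short identifier ('regex'), while the UI shows the value returned by getName().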
isAvailable()
Check if this parser is available on the current system.
public
isAvailable() : bool
Some parsers depend on external tools (e.g., the MeCab binary). This method checks whether all such dependencies are satisfied.
Return values
bool: True if the parser can be used, false otherwise
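One plausible way to implement such a dependency check is to look for the tool's executable on the PATH. The sketch below assumes that approach; `findExecutable` and `ExternalToolCheck` are hypothetical names, and the actual parsers may detect their dependencies differently (on Windows, for instance, the file name would also need an extension such as `.exe`).

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns the full path of $binary if found on the PATH, else null.
function findExecutable(string $binary, ?string $pathEnv = null): ?string
{
    $pathEnv ??= (string) getenv('PATH');
    foreach (explode(PATH_SEPARATOR, $pathEnv) as $dir) {
        if ($dir === '') {
            continue;
        }
        $candidate = rtrim($dir, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR . $binary;
        if (is_file($candidate) && is_executable($candidate)) {
            return $candidate;
        }
    }
    return null;
}

// Hypothetical availability check for a parser that wraps an external command.
final class ExternalToolCheck
{
    public function __construct(private string $binary)
    {
    }

    public function isAvailable(): bool
    {
        return findExecutable($this->binary) !== null;
    }

    public function getAvailabilityMessage(): string
    {
        return $this->isAvailable()
            ? ''
            : "The '{$this->binary}' executable was not found on the PATH.";
    }
}

// A deliberately nonexistent binary name, so the outcome does not depend on the machine.
$check = new ExternalToolCheck('no-such-binary-xyz');
$available = $check->isAvailable();
$message = $check->getAvailabilityMessage();
```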
parse()
Parse text into a structured result with sentences and tokens.
public
parse(string $text, ParserConfig $config) : ParserResult
Parameters
- $text : string
- Text to parse (already preprocessed)
- $config : ParserConfig
- Parser configuration from language settings
Return values
ParserResult: Parsing result containing sentences and tokens
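To make the contract more concrete, here is a minimal sketch of a space-separated, regex-based implementation. The interface method signatures are taken from this page, but the shapes of `ParserConfig` and `ParserResult` are not documented here, so the classes below are simplified hypothetical stand-ins (including the `sentenceEndChars` and `sentences` properties), and `SimpleRegexParser` is an illustrative name rather than the project's own parser. It assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in; the real ParserConfig likely carries more settings.
final class ParserConfig
{
    public function __construct(public string $sentenceEndChars = '.!?')
    {
    }
}

// Hypothetical stand-in: one list of word tokens per sentence.
final class ParserResult
{
    /** @param list<list<string>> $sentences */
    public function __construct(public array $sentences)
    {
    }
}

// Method signatures as documented on this page.
interface ParserInterface
{
    public function getAvailabilityMessage(): string;
    public function getName(): string;
    public function getType(): string;
    public function isAvailable(): bool;
    public function parse(string $text, ParserConfig $config): ParserResult;
}

final class SimpleRegexParser implements ParserInterface
{
    public function getAvailabilityMessage(): string
    {
        return ''; // No external dependencies, so always available.
    }

    public function getName(): string
    {
        return 'Standard (Regex)';
    }

    public function getType(): string
    {
        return 'regex';
    }

    public function isAvailable(): bool
    {
        return true;
    }

    public function parse(string $text, ParserConfig $config): ParserResult
    {
        // Split after any sentence-ending character (assumed non-empty) followed by whitespace.
        $ends = preg_quote($config->sentenceEndChars, '/');
        $chunks = preg_split('/(?<=[' . $ends . '])\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];

        $sentences = [];
        foreach ($chunks as $chunk) {
            // Tokens are runs of letters, digits, and apostrophes.
            preg_match_all('/[\p{L}\p{N}\']+/u', $chunk, $matches);
            if ($matches[0] !== []) {
                $sentences[] = $matches[0];
            }
        }
        return new ParserResult($sentences);
    }
}

$result = (new SimpleRegexParser())->parse('Hello world. How are you?', new ParserConfig());
```

A character-based or morphological parser would keep the same five methods and change only the body of parse() and the availability checks.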