MecabParser
implements ParserInterface
MeCab-based parser for Japanese language.
Uses the MeCab morphological analyzer to tokenize Japanese text into words. MeCab must be installed on the system.
Table of Contents
Interfaces
- ParserInterface
- Interface for text parsers that tokenize text into words and sentences.
Properties
- $availabilityMessage : string
- $mecabAvailable : bool|null
- $parsingService : TextParsingService
Methods
- __construct() : mixed
- getAvailabilityMessage() : string
- Get a description of why this parser might not be available.
- getName() : string
- Get human-readable name for UI display.
- getType() : string
- Get unique identifier for this parser type.
- isAvailable() : bool
- Check if this parser is available on the current system.
- parse() : ParserResult
- Parse text into a structured result with sentences and tokens.
- parseMecabOutput() : ParserResult
- Parse MeCab output into ParserResult.
- preprocessText() : string
- Preprocess text before MeCab parsing.
- runMecab() : string
- Run MeCab on the text and return output.
- checkAvailability() : void
- Check if MeCab is available on the system.
Properties
$availabilityMessage
private string $availabilityMessage = ''
$mecabAvailable
private bool|null $mecabAvailable = null
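The `bool|null` type with a `null` default suggests a lazily cached check, where `null` means "not checked yet". The source does not confirm this, so the following is only a minimal sketch of that pattern under that assumption. The class `LazyAvailability`, its `$probe` callable, and the message text are hypothetical names introduced for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical illustration of a tri-state availability cache.
 * null = not checked yet; true/false = cached result of the check.
 */
final class LazyAvailability
{
    private ?bool $available = null;
    private string $message = '';

    /** @var callable(): bool Assumed probe that performs the real check. */
    private $probe;

    public function __construct(callable $probe)
    {
        $this->probe = $probe;
    }

    public function isAvailable(): bool
    {
        // Run the (possibly expensive) probe only on the first call.
        if ($this->available === null) {
            $this->available = (bool) ($this->probe)();
            $this->message = $this->available ? '' : 'Required tool was not found.';
        }
        return $this->available;
    }

    public function getMessage(): string
    {
        // Make sure the check has run before reporting its outcome.
        $this->isAvailable();
        return $this->message;
    }
}
```

With this shape, repeated calls to `isAvailable()` cost one probe at most, and the message stays empty whenever the check succeeds.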
$parsingService
private TextParsingService $parsingService
Methods
__construct()
public __construct([TextParsingService|null $parsingService = null]) : mixed
Parameters
- $parsingService : TextParsingService|null = null
getAvailabilityMessage()
Get a description of why this parser might not be available.
public getAvailabilityMessage() : string
Return values
string —Description of missing dependencies or empty if available
getName()
Get human-readable name for UI display.
public getName() : string
Return values
string —Human-readable parser name
getType()
Get unique identifier for this parser type.
public getType() : string
Return values
string —Parser type identifier
isAvailable()
Check if this parser is available on the current system.
public isAvailable() : bool
Return values
bool —True if parser can be used, false otherwise
parse()
Parse text into a structured result with sentences and tokens.
public parse(string $text, ParserConfig $config) : ParserResult
Parameters
- $text : string
  Text to parse (already preprocessed)
- $config : ParserConfig
  Parser configuration from language settings
Return values
ParserResult —Parsing result containing sentences and tokens
parseMecabOutput()
Parse MeCab output into ParserResult.
protected parseMecabOutput(string $mecabOutput) : ParserResult
Parameters
- $mecabOutput : string
  MeCab output text
Return values
ParserResult —Parsed result
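The output format this method expects is not documented here. As a rough sketch only, the following assumes MeCab's default format: one `surface<TAB>feature,feature,...` line per token, and a line containing only `EOS` at the end of each sentence. The function `parseDefaultMecabOutput()` and the array shape it returns are hypothetical. The real method returns a `ParserResult` and may rely on a custom output format.

```php
<?php
declare(strict_types=1);

/**
 * Split MeCab's default output into sentences of tokens.
 * Assumption: default format, "surface<TAB>comma,separated,features" per
 * line, with a line containing only "EOS" closing each sentence.
 *
 * @return array<int, array<int, array{surface: string, features: string[]}>>
 */
function parseDefaultMecabOutput(string $mecabOutput): array
{
    $sentences = [];
    $current = [];

    // Split on explicit newline bytes so multibyte UTF-8 text stays intact.
    foreach (preg_split('/\r\n|\n|\r/', $mecabOutput) as $line) {
        if ($line === '') {
            continue;
        }
        if ($line === 'EOS') {
            // End of sentence: keep it only if it contains tokens.
            if ($current !== []) {
                $sentences[] = $current;
                $current = [];
            }
            continue;
        }
        // Everything before the first tab is the surface form.
        [$surface, $featureText] = array_pad(explode("\t", $line, 2), 2, '');
        $current[] = [
            'surface'  => $surface,
            'features' => $featureText === '' ? [] : explode(',', $featureText),
        ];
    }

    // Tolerate output that lacks a trailing EOS marker.
    if ($current !== []) {
        $sentences[] = $current;
    }
    return $sentences;
}
```

Each inner array is one sentence, so the surface forms of the first sentence can be read with `array_column($sentences[0], 'surface')`.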
preprocessText()
Preprocess text before MeCab parsing.
protected preprocessText(string $text) : string
Parameters
- $text : string
  Raw text
Return values
string —Preprocessed text
runMecab()
Run MeCab on the text and return output.
protected runMecab(string $text) : string
Parameters
- $text : string
  Text to parse
Return values
string —MeCab output
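How this method invokes MeCab is not shown in this reference. One common approach is to pipe the text to the process over standard input and read the result from standard output. The sketch below illustrates that approach with PHP's `proc_open()`. The function `runWithStdin()` and its `$command` parameter are hypothetical; for MeCab the command would presumably be `mecab` plus any output-format options.

```php
<?php
declare(strict_types=1);

/**
 * Pipe $input to an external command and return its standard output.
 * Assumption: inputs are small. Very large inputs can fill the pipe
 * buffers and stall, in which case a temporary file is safer.
 *
 * @throws RuntimeException if the process cannot start or exits non-zero
 */
function runWithStdin(string $command, string $input): string
{
    $descriptors = [
        0 => ['pipe', 'r'], // child's stdin (we write to it)
        1 => ['pipe', 'w'], // child's stdout (we read from it)
        2 => ['pipe', 'w'], // child's stderr (we read from it)
    ];

    $process = proc_open($command, $descriptors, $pipes);
    if (!is_resource($process)) {
        throw new RuntimeException("Could not start: $command");
    }

    // Send the text, then close stdin so the child sees end-of-input.
    fwrite($pipes[0], $input);
    fclose($pipes[0]);

    $stdout = (string) stream_get_contents($pipes[1]);
    $stderr = (string) stream_get_contents($pipes[2]);
    fclose($pipes[1]);
    fclose($pipes[2]);

    $exitCode = proc_close($process);
    if ($exitCode !== 0) {
        throw new RuntimeException("Command failed ($exitCode): $stderr");
    }
    return $stdout;
}
```

Throwing an exception instead of terminating the script lets the caller decide how to report a failed run.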
checkAvailability()
Check if MeCab is available on the system.
private checkAvailability() : void
This method checks for MeCab directly instead of calling getMecabPath(), which would call die() and terminate the script if MeCab is not found.
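A non-fatal check of this kind could look like the sketch below. It is not the class's actual implementation. The helper `commandExists()` is hypothetical, and it assumes MeCab is installed as an executable on `PATH`, which may not match every installation.

```php
<?php
declare(strict_types=1);

/**
 * Report whether an executable can be found on PATH.
 * Returns false instead of terminating the script when it is missing.
 */
function commandExists(string $name): bool
{
    $isWindows = PHP_OS_FAMILY === 'Windows';
    // "where" on Windows and "command -v" in POSIX shells both exit
    // with status 0 only when the command can be resolved.
    $finder = $isWindows ? 'where' : 'command -v';
    $devNull = $isWindows ? 'NUL' : '/dev/null';

    $output = [];
    $exitCode = 1;
    exec($finder . ' ' . escapeshellarg($name) . ' 2>' . $devNull, $output, $exitCode);

    return $exitCode === 0;
}

// Record the outcome instead of aborting, so the caller can show a message.
$mecabAvailable = commandExists('mecab');
$availabilityMessage = $mecabAvailable ? '' : 'MeCab was not found on PATH.';
```

Storing the result and a message in this way matches the intent of the `isAvailable()` and `getAvailabilityMessage()` pair: the caller can explain what is missing rather than have the request end abruptly.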