ParsingCoordinator
Coordinates parsing operations, providing a facade for the parser system.
This class handles parser selection, preprocessing, and database persistence. It serves as the main entry point for text parsing operations.
Table of Contents
Properties
- $registry : ParserRegistry
Methods
- __construct() : mixed
- getRegistry() : ParserRegistry
  Get the parser registry.
- parseAndSave() : void
  Parse text and save to database.
- parseForPreview() : ParserResult
  Parse text and return the result without database operations.
- splitIntoSentences() : array<string|int, string>
  Split text into sentences without database operations.
- checkExpressions() : bool
  Check for multi-word expressions and populate tempexprs.
- insertTokensToTemp() : void
  Insert tokens into temp_word_occurrences table.
- preprocess() : string
  Preprocess text before parsing.
- registerSentencesTextItems() : void
  Register sentences and text items in the database.
- saveToDatabase() : void
  Save parsing result to database.
Properties
$registry
private
ParserRegistry $registry
Methods
__construct()
public
__construct(ParserRegistry $registry) : mixed
Parameters
- $registry : ParserRegistry
getRegistry()
Get the parser registry.
public
getRegistry() : ParserRegistry
Return values
ParserRegistry: Parser registry
parseAndSave()
Parse text and save to database.
public
parseAndSave(string $text, Language $language, int $textId) : void
Parameters
- $text : string
  Text to parse
- $language : Language
  Language entity
- $textId : int
  Text ID (must be positive)
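The positivity requirement on $textId can be illustrated with a small guard. This is a minimal sketch, not the class's actual code: the helper names assertPositiveTextId() and tryTextId() are hypothetical, and the choice of InvalidArgumentException is an assumption, since this reference does not say how an invalid ID is reported.

```php
<?php
declare(strict_types=1);

// Hypothetical guard; the exception type is an assumption, not taken
// from the ParsingCoordinator source.
function assertPositiveTextId(int $textId): void
{
    if ($textId <= 0) {
        throw new InvalidArgumentException("Text ID must be positive, got $textId");
    }
}

// Small wrapper that turns the guard's outcome into a string for display.
function tryTextId(int $textId): string
{
    try {
        assertPositiveTextId($textId);
        return 'ok';
    } catch (InvalidArgumentException $e) {
        return $e->getMessage();
    }
}

echo tryTextId(5), PHP_EOL;
echo tryTextId(0), PHP_EOL;
```

Validating before any parsing work starts keeps a bad ID from producing partially written rows.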
parseForPreview()
Parse text and return the result without database operations.
public
parseForPreview(string $text, Language $language) : ParserResult
Parameters
- $text : string
  Text to parse
- $language : Language
  Language entity
Return values
ParserResult: Parsing result
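A preview call plausibly runs only the non-persistent stages named in the class description: parser selection, preprocessing, and parsing. The sketch below illustrates that flow under stated assumptions. Language, ParserResult, and ParserRegistry are hypothetical stand-ins, because their real shapes are not shown in this reference. CoordinatorSketch, register(), get(), the parserType field, and the whitespace-collapsing preprocessing are likewise invented for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins: the real Language, ParserResult and ParserRegistry
// classes are not documented here, so these shapes are assumptions.
final class Language
{
    public function __construct(public int $id, public string $parserType) {}
}

final class ParserResult
{
    /** @param array<int, string> $sentences */
    public function __construct(public array $sentences) {}
}

final class ParserRegistry
{
    /** @var array<string, callable> */
    private array $parsers = [];

    public function register(string $type, callable $parser): void
    {
        $this->parsers[$type] = $parser;
    }

    public function get(string $type): callable
    {
        return $this->parsers[$type]
            ?? throw new RuntimeException("No parser registered for '$type'");
    }
}

// Facade sketch: select a parser, preprocess, parse. Nothing is persisted.
final class CoordinatorSketch
{
    public function __construct(private ParserRegistry $registry) {}

    public function getRegistry(): ParserRegistry
    {
        return $this->registry;
    }

    public function parseForPreview(string $text, Language $language): ParserResult
    {
        // Parser selection, keyed on a hypothetical per-language parser type.
        $parser = $this->registry->get($language->parserType);
        // Stand-in preprocessing: collapse whitespace runs and trim.
        $clean = trim(preg_replace('/\s+/u', ' ', $text) ?? $text);
        return $parser($clean);
    }
}

$registry = new ParserRegistry();
// Toy parser: split after sentence-ending punctuation.
$registry->register('regex', static fn (string $t): ParserResult => new ParserResult(
    preg_split('/(?<=[.!?])\s+/u', $t, -1, PREG_SPLIT_NO_EMPTY) ?: []
));

$coordinator = new CoordinatorSketch($registry);
$result = $coordinator->parseForPreview("Hello  world.\nHow are you?", new Language(1, 'regex'));
print_r($result->sentences);
```

Injecting the registry through the constructor, as the documented __construct() signature does, lets a test register a toy parser like the one above without touching the database.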
splitIntoSentences()
Split text into sentences without database operations.
public
splitIntoSentences(string $text, Language $language) : array<string|int, string>
Parameters
- $text : string
  Text to parse
- $language : Language
  Language entity
Return values
array<string|int, string>: Array of sentences
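To show the shape of the return value, here is a naive regex-based splitter. It is only an illustration: the function name naiveSplitIntoSentences() is hypothetical, and the real method presumably delegates to the parser selected for the language, which this sketch ignores entirely.

```php
<?php
declare(strict_types=1);

/**
 * Naive illustration only: split after '.', '!' or '?' followed by whitespace.
 * The actual splitting rules depend on the language's parser (an assumption).
 *
 * @return array<int, string>
 */
function naiveSplitIntoSentences(string $text): array
{
    $parts = preg_split('/(?<=[.!?])\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    // preg_split() returns false on failure; normalize that to an empty list.
    return $parts === false ? [] : array_values($parts);
}

print_r(naiveSplitIntoSentences("One. Two! Three?"));
```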
checkExpressions()
Check for multi-word expressions and populate tempexprs.
protected
checkExpressions(int $lid) : bool
Parameters
- $lid : int
  Language ID
Return values
bool: True if multi-word expressions were found
insertTokensToTemp()
Insert tokens into temp_word_occurrences table.
protected
insertTokensToTemp(ParserResult $result, int $startSeID) : void
Parameters
- $result : ParserResult
  Parsing result
- $startSeID : int
  Starting sentence ID
preprocess()
Preprocess text before parsing.
protected
preprocess(string $text, Language $language) : string
Applies character substitutions and other text cleanup.
Parameters
- $text : string
  Raw text
- $language : Language
  Language entity
Return values
string: Preprocessed text
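Character substitution of this kind can be sketched with strtr() and a replacement map. The function name applySubstitutions(), the map format, and the extra whitespace cleanup are assumptions for illustration. This reference does not say how a Language stores its substitutions or which cleanup steps preprocess() actually performs.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: apply a search => replacement map, then tidy spacing.
 *
 * @param array<string, string> $substitutions
 */
function applySubstitutions(string $text, array $substitutions): string
{
    // strtr() with an array applies every pair in a single pass and never
    // re-substitutes text it has already replaced.
    $text = strtr($text, $substitutions);
    // Assumed cleanup step: collapse runs of spaces and tabs.
    $text = preg_replace('/[ \t]+/u', ' ', $text) ?? $text;
    return trim($text);
}

$clean = applySubstitutions(
    "  “Hi”,\tshe said… ",
    ['“' => '"', '”' => '"', '…' => '...']
);
echo $clean, PHP_EOL;
```

Using strtr() rather than chained str_replace() calls avoids one substitution accidentally rewriting the output of another.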
registerSentencesTextItems()
Register sentences and text items in the database.
protected
registerSentencesTextItems(int $tid, int $lid, bool $hasmultiword) : void
Parameters
- $tid : int
  Text ID
- $lid : int
  Language ID
- $hasmultiword : bool
  Whether to process multi-word expressions
saveToDatabase()
Save parsing result to database.
protected
saveToDatabase(ParserResult $result, Language $language, int $textId) : void
Parameters
- $result : ParserResult
  Parsing result
- $language : Language
  Language entity
- $textId : int
  Text ID