Documentation

Token

Represents a single token produced by parsing text.

A token can be either a word (learnable content) or a non-word (punctuation, whitespace, symbols). Tokens maintain their position within the text for proper reconstruction and display.

Tags
since
3.0.0
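The description above says tokens keep their position so that text can be reconstructed. A minimal sketch of that idea follows. The `Token` class here is a hypothetical stand-in that mirrors only the documented constructor and getters (the real class and its namespace are not shown on this page), and `reconstructText()` is an illustrative helper, not part of the documented API.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in mirroring the documented constructor and the getters
// used below; the real Token class (and its namespace) may differ.
final class Token
{
    public function __construct(
        private string $text,
        private int $sentenceIndex,
        private int $order,
        private bool $isWord,
        private int $wordCount = 1,
        private string $reading = ''
    ) {
    }

    public function getText(): string { return $this->text; }
    public function getSentenceIndex(): int { return $this->sentenceIndex; }
    public function getOrder(): int { return $this->order; }
}

/**
 * Illustrative helper (not part of the documented API): rebuild the
 * original text from tokens supplied in any order.
 *
 * @param Token[] $tokens
 */
function reconstructText(array $tokens): string
{
    // Sort by sentence index first, then by order within the sentence.
    usort(
        $tokens,
        fn (Token $a, Token $b): int =>
            [$a->getSentenceIndex(), $a->getOrder()]
            <=> [$b->getSentenceIndex(), $b->getOrder()]
    );

    // Concatenate the token texts; whitespace is carried by non-word tokens.
    return implode('', array_map(fn (Token $t): string => $t->getText(), $tokens));
}

$tokens = [
    new Token('world', 0, 2, true),
    new Token('Hello', 0, 0, true),
    new Token('!', 0, 3, false),
    new Token(' ', 0, 1, false),
];

$text = reconstructText($tokens);
echo $text, PHP_EOL; // Hello world!
```

Because whitespace and punctuation are stored as non-word tokens, plain concatenation in (sentence index, order) sequence is enough to recover the text in this sketch.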

Table of Contents

Properties

$isWord  : bool
$order  : int
$reading  : string
$sentenceIndex  : int
$text  : string
$wordCount  : int

Methods

__construct()  : mixed
Create a new token.
getOrder()  : int
Get the order/position within the sentence.
getReading()  : string
Get the phonetic reading.
getSentenceIndex()  : int
Get the sentence index this token belongs to.
getText()  : string
Get the token text content.
getWordCount()  : int
Get the word count (for multi-word expressions).
isWord()  : bool
Check if this token is a learnable word.
nonWord()  : self
Create a non-word token (punctuation, whitespace, etc.).
word()  : self
Create a word token.

Properties

$isWord

private bool $isWord

$reading

private string $reading = ''

$sentenceIndex

private int $sentenceIndex

$wordCount

private int $wordCount = 1

Methods

__construct()

Create a new token.

public __construct(string $text, int $sentenceIndex, int $order, bool $isWord[, int $wordCount = 1 ][, string $reading = '' ]) : mixed
Parameters
$text : string

The token text content

$sentenceIndex : int

Index of the sentence this token belongs to (0-based)

$order : int

Position of this token within its sentence (0-based)

$isWord : bool

True if this is a learnable word, false for punctuation/whitespace

$wordCount : int = 1

Number of words the token spans (1 for a single word, greater than 1 for multi-word expressions)

$reading : string = ''

Optional phonetic reading (e.g., furigana for Japanese)
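As an illustration of the constructor parameters listed above, the sketch below builds a multi-word expression, a token with a phonetic reading, and a punctuation token. The `Token` class is a hypothetical stand-in that follows the documented signature; the named-argument call assumes PHP 8.0 or later and that the real parameter names match those shown on this page.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in following the documented constructor signature;
// the real Token class (and its namespace) may differ.
final class Token
{
    public function __construct(
        private string $text,
        private int $sentenceIndex,
        private int $order,
        private bool $isWord,
        private int $wordCount = 1,
        private string $reading = ''
    ) {
    }

    public function getText(): string { return $this->text; }
    public function isWord(): bool { return $this->isWord; }
    public function getWordCount(): int { return $this->wordCount; }
    public function getReading(): string { return $this->reading; }
}

// A multi-word expression: one token that spans three words.
$expression = new Token('in spite of', 0, 4, true, 3);

// A word with a phonetic reading. Named arguments (PHP 8.0+) let the call
// skip $wordCount and keep its default of 1.
$japanese = new Token(
    text: '日本語',
    sentenceIndex: 1,
    order: 0,
    isWord: true,
    reading: 'にほんご'
);

// Punctuation: a non-word token that relies on both defaults.
$comma = new Token(',', 0, 5, false);

echo $expression->getWordCount(), PHP_EOL; // 3
echo $japanese->getReading(), PHP_EOL;     // にほんご
```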

getOrder()

Get the order/position within the sentence.

public getOrder() : int
Return values
int

Order within sentence (0-based)

getReading()

Get the phonetic reading.

public getReading() : string
Return values
string

Phonetic reading or empty string if not available

getSentenceIndex()

Get the sentence index this token belongs to.

public getSentenceIndex() : int
Return values
int

Sentence index (0-based)

getText()

Get the token text content.

public getText() : string
Return values
string

Token text

getWordCount()

Get the word count (for multi-word expressions).

public getWordCount() : int
Return values
int

Word count (1 for single words)

isWord()

Check if this token is a learnable word.

public isWord() : bool
Return values
bool

True for words, false for punctuation/whitespace
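One plausible use of `isWord()` together with `getWordCount()` is to pick out the learnable tokens from a sentence and count the words they cover. The sketch below assumes a hypothetical stand-in `Token` class with the documented constructor and getters; the sample sentence and variable names are illustrative only.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in with the documented constructor and the getters
// used below; the real Token class (and its namespace) may differ.
final class Token
{
    public function __construct(
        private string $text,
        private int $sentenceIndex,
        private int $order,
        private bool $isWord,
        private int $wordCount = 1,
        private string $reading = ''
    ) {
    }

    public function getText(): string { return $this->text; }
    public function isWord(): bool { return $this->isWord; }
    public function getWordCount(): int { return $this->wordCount; }
}

// Tokens for the sentence "I give up." with "give up" as one expression.
$tokens = [
    new Token('I', 0, 0, true),
    new Token(' ', 0, 1, false),
    new Token('give up', 0, 2, true, 2),
    new Token('.', 0, 3, false),
];

// Keep only learnable tokens; array_values() reindexes the result from 0.
$words = array_values(array_filter($tokens, fn (Token $t): bool => $t->isWord()));

// Collect their texts and add up the words they span.
$wordTexts = array_map(fn (Token $t): string => $t->getText(), $words);
$totalWords = array_sum(array_map(fn (Token $t): int => $t->getWordCount(), $words));

echo implode(', ', $wordTexts), PHP_EOL; // I, give up
echo $totalWords, PHP_EOL;               // 3
```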

nonWord()

Create a non-word token (punctuation, whitespace, etc.).

public static nonWord(string $text, int $sentenceIndex, int $order) : self
Parameters
$text : string

The token text content

$sentenceIndex : int

Index of the sentence this token belongs to (0-based)

$order : int

Position of this token within its sentence (0-based)

Return values
self

New non-word token

word()

Create a word token.

public static word(string $text, int $sentenceIndex, int $order[, string $reading = '' ]) : self
Parameters
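$text : string

The word text content

$sentenceIndex : int

Index of the sentence this token belongs to (0-based)

$order : int

Position of this token within its sentence (0-based)

$reading : string = ''

Optional phonetic reading (e.g., furigana for Japanese)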

Return values
self

New word token
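The two static factories, `word()` and `nonWord()`, can be sketched as thin wrappers around the constructor. The stand-in `Token` class below is hypothetical. It assumes that `word()` sets `$isWord` to true, that `nonWord()` sets it to false, and that both keep the default word count of 1. The page documents the signatures but not these internals.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in: factory bodies are assumptions based on the
// documented signatures, not the real implementation.
final class Token
{
    public function __construct(
        private string $text,
        private int $sentenceIndex,
        private int $order,
        private bool $isWord,
        private int $wordCount = 1,
        private string $reading = ''
    ) {
    }

    // Assumed behavior: a learnable word with an optional reading.
    public static function word(
        string $text,
        int $sentenceIndex,
        int $order,
        string $reading = ''
    ): self {
        return new self($text, $sentenceIndex, $order, true, 1, $reading);
    }

    // Assumed behavior: punctuation or whitespace, never learnable.
    public static function nonWord(string $text, int $sentenceIndex, int $order): self
    {
        return new self($text, $sentenceIndex, $order, false);
    }

    public function getText(): string { return $this->text; }
    public function isWord(): bool { return $this->isWord; }
    public function getReading(): string { return $this->reading; }
}

// First sentence (index 0): a word with furigana, then closing punctuation.
$cat = Token::word('猫', 0, 0, 'ねこ');
$stop = Token::nonWord('。', 0, 1);

echo $cat->getText(), ' (', $cat->getReading(), ')', PHP_EOL; // 猫 (ねこ)
var_dump($stop->isWord()); // bool(false)
```

Using the factories rather than the constructor keeps call sites free of the bare boolean `$isWord` argument, which is easy to misread.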


        