Documentation

EpubParserService
in package Lwt\Modules\Book\Application\Services
Service for parsing EPUB files and extracting content.

Uses the kiwilan/php-ebook library to read EPUB files and extract metadata and chapter content for import into LWT.

Methods

cleanHtmlContent() : string: Clean HTML content to plain text suitable for LWT.
getMetadata() : array{title: string, author: string|null, description: string|null, language: string|null}|null: Get just the metadata without parsing chapters.
isValidEpub() : bool: Validate that a file is an EPUB.
parse() : array{metadata: array{title: string, author: string|null, description: string|null, language: string|null, sourceHash: string}, chapters: array{num: int, title: string, content: string}[]}: Parse an EPUB file and extract metadata and chapters.
extractAuthor() : string|null: Extract the primary author name from an ebook.
extractChapters() : array<string|int, array{num: int, title: string, content: string}>: Extract chapters from an ebook.
extractFromHtmlFiles() : array<string|int, array{num: int, title: string, content: string}>: Extract content from HTML files in the EPUB as fallback.
extractTitleFromContent() : string: Extract a title from content if possible.
getEpubModule() : EpubModule|null: Get the EpubModule from an Ebook.

cleanHtmlContent()

Clean HTML content to plain text suitable for LWT.


    public cleanHtmlContent(string $html) : string

Strips HTML tags while preserving paragraph structure with double newlines for paragraph breaks.

Parameters

$html : string: The HTML content

Return values

string: Clean plain text
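The following is a minimal sketch of the kind of cleanup described above, not the service's actual code. The function name `cleanHtmlSketch` and the list of block-level tags are assumptions chosen for illustration; the real method may handle more cases.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for cleanHtmlContent(): converts HTML to plain
 * text, separating paragraphs with a double newline.
 */
function cleanHtmlSketch(string $html): string
{
    // Drop <script> and <style> blocks together with their contents.
    $text = preg_replace('~<(script|style)\b[^>]*>.*?</\1>~is', '', $html) ?? $html;
    // Turn <br> into a single newline.
    $text = preg_replace('~<br\s*/?>~i', "\n", $text) ?? $text;
    // Turn closing block-level tags into paragraph breaks (assumed tag list).
    $text = preg_replace('~</(p|div|h[1-6]|li|blockquote)>~i', "\n\n", $text) ?? $text;
    // Remove the remaining tags, then decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($text), ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Collapse runs of spaces and tabs, and trim spaces around newlines.
    $text = preg_replace('~[ \t]+~', ' ', $text) ?? $text;
    $text = preg_replace('~ ?\n ?~', "\n", $text) ?? $text;
    // Allow at most one blank line between paragraphs.
    $text = preg_replace('~\n{3,}~', "\n\n", $text) ?? $text;

    return trim($text);
}
```

With this sketch, `<p>Hello <b>world</b></p><p>Second &amp; last</p>` becomes two paragraphs, `Hello world` and `Second & last`, separated by one blank line.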

getMetadata()

Get just the metadata without parsing chapters.


    public getMetadata(string $filePath) : array{title: string, author: string|null, description: string|null, language: string|null}|null

Parameters

$filePath : string: Path to the EPUB file

Return values

array{title: string, author: string|null, description: string|null, language: string|null}|null: Metadata or null on failure

isValidEpub()

Validate that a file is an EPUB.


    public isValidEpub(string $filePath) : bool

Parameters

$filePath : string: Path to the file

Return values

bool: True if valid EPUB
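This page does not say how the check is performed. As one possible approach, the sketch below inspects the start of the file: an EPUB is a ZIP archive whose first entry is an uncompressed file named `mimetype` containing `application/epub+zip`. The function name `looksLikeEpub` is hypothetical, and the actual service may rely on the kiwilan/php-ebook library instead.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical header check, not the service's implementation.
 * Strict by design: it rejects archives whose first entry carries a
 * ZIP extra field, even if a lenient reader would accept them.
 */
function looksLikeEpub(string $filePath): bool
{
    if (!is_file($filePath) || !is_readable($filePath)) {
        return false;
    }

    // 30-byte ZIP local file header + "mimetype" (8) + media type (20).
    $header = file_get_contents($filePath, false, null, 0, 58);
    if ($header === false || strlen($header) < 58) {
        return false;
    }

    return substr($header, 0, 4) === "PK\x03\x04"          // ZIP local header signature
        && substr($header, 30, 8) === 'mimetype'            // name of the first entry
        && substr($header, 38, 20) === 'application/epub+zip'; // stored, uncompressed content
}
```

A check like this only confirms the container signature; it does not guarantee that the package document or the chapters can be parsed.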

parse()

Parse an EPUB file and extract metadata and chapters.


    public parse(string $filePath) : array{metadata: array{title: string, author: string|null, description: string|null, language: string|null, sourceHash: string}, chapters: array{num: int, title: string, content: string}[]}

Parameters

$filePath : string: Absolute path to the EPUB file

Return values

array{metadata: array{title: string, author: string|null, description: string|null, language: string|null, sourceHash: string}, chapters: array{num: int, title: string, content: string}[]}
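To illustrate the documented return shape, the sketch below walks over an array with the same keys. The helper `summarizeParseResult` and the sample values (including the `sourceHash` string) are invented for illustration; only the array structure comes from the signature above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical consumer of the array shape returned by parse().
 *
 * @param array{metadata: array{title: string, author: string|null, description: string|null, language: string|null, sourceHash: string}, chapters: array{num: int, title: string, content: string}[]} $result
 * @return string[] One line for the book, then one line per chapter
 */
function summarizeParseResult(array $result): array
{
    $meta = $result['metadata'];
    // author is nullable, so fall back to a placeholder.
    $lines = [sprintf('%s by %s', $meta['title'], $meta['author'] ?? 'unknown author')];

    foreach ($result['chapters'] as $chapter) {
        $lines[] = sprintf(
            '%d. %s (%d bytes)',
            $chapter['num'],
            $chapter['title'],
            strlen($chapter['content'])
        );
    }

    return $lines;
}

// Hand-written sample in the documented shape (all values are made up).
$sample = [
    'metadata' => [
        'title' => 'Example Book',
        'author' => null,
        'description' => null,
        'language' => 'en',
        'sourceHash' => 'hypothetical-hash-value',
    ],
    'chapters' => [
        ['num' => 1, 'title' => 'Chapter 1', 'content' => 'First chapter text.'],
    ],
];

$summary = summarizeParseResult($sample);
```

Because `author`, `description` and `language` may be null, callers should supply defaults as shown rather than assume a string.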

extractAuthor()

Extract the primary author name from an ebook.


    private extractAuthor(Ebook $ebook) : string|null

Parameters

$ebook : Ebook: The ebook object

Return values

string|null: Author name or null if not found

extractChapters()

Extract chapters from an ebook.


    private extractChapters(Ebook $ebook) : array<string|int, array{num: int, title: string, content: string}>

Parameters

$ebook : Ebook: The ebook object

Return values

array<string|int, array{num: int, title: string, content: string}>

extractFromHtmlFiles()

Extract content from HTML files in the EPUB as fallback.


    private extractFromHtmlFiles(Ebook $ebook) : array<string|int, array{num: int, title: string, content: string}>

Parameters

$ebook : Ebook: The ebook object

Return values

array<string|int, array{num: int, title: string, content: string}>

extractTitleFromContent()

Extract a title from content if possible.


    private extractTitleFromContent(string $content, int $num) : string

Parameters

$content : string: The text content
$num : int: Default chapter number

Return values

string: The extracted or default title
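The heuristic this method uses is not documented here. One plausible version, sketched below under stated assumptions, takes the first non-empty line when it is short enough to be a heading and otherwise falls back to a numbered default. The name `titleFromContentSketch`, the 80-byte limit and the `Chapter N` format are all assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for extractTitleFromContent().
 *
 * @param string $content   Plain-text chapter content
 * @param int    $num       Chapter number used for the default title
 * @param int    $maxLength Assumed upper bound (in bytes) for a heading line
 */
function titleFromContentSketch(string $content, int $num, int $maxLength = 80): string
{
    $default = 'Chapter ' . $num;

    // \R matches any newline sequence (\n, \r\n, \r).
    foreach (preg_split('~\R~', $content) ?: [] as $line) {
        $line = trim($line);
        if ($line === '') {
            continue;
        }
        // Only the first non-empty line is considered as a title candidate.
        return strlen($line) <= $maxLength ? $line : $default;
    }

    // Empty or whitespace-only content.
    return $default;
}
```

For example, content beginning with a short `Prologue` line yields `Prologue`, while empty content for chapter 3 yields `Chapter 3`.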

getEpubModule()

Get the EpubModule from an Ebook.


    private getEpubModule(Ebook $ebook) : EpubModule|null

Parameters

$ebook : Ebook: The ebook object

Return values

EpubModule|null: The EPUB module or null if not an EPUB
