
CuratedDictImportService

Service for importing curated dictionaries from remote URLs.

Validates URLs against the curated registry (SSRF defense), downloads the archive, extracts it, delegates to the appropriate importer, and cleans up temp files.

Tags
since
3.0.0

Table of Contents

Constants

DOWNLOAD_TIMEOUT  = 120
HTTP download timeout in seconds.
MAX_DOWNLOAD_BYTES  = 100 * 1024 * 1024
Maximum download size (100 MB).
MAX_ZIP_FILES  = 500
Maximum files allowed in a ZIP archive.

Properties

$facade  : DictionaryFacade
$registryPath  : string|null

Methods

__construct()  : mixed
findImportFile()  : string
Find the importable file within an extracted directory.
importFromUrl()  : array{success: bool, dictId?: int, imported?: int, vocabCreated?: int, error?: string}
Import a curated dictionary from a remote URL.
isCuratedUrl()  : bool
Check whether a URL appears in the curated dictionaries registry.
cleanup()  : void
Clean up temporary files and directories.
downloadToTemp()  : string
Download a URL to a temporary file.
extractTar()  : string
Extract a tar archive (.tar.xz, .tar.gz, .tar.bz2) to a temporary directory.
extractZip()  : string
Extract a ZIP archive to a temporary directory.
isTarArchive()  : bool
Check whether a URL points to a tar archive (.tar.xz, .tar.gz, .tar.bz2).
loadRegistry()  : array<int, array<string, mixed>>|null
Load the curated dictionaries registry.
removeDir()  : void
Recursively remove a directory.

Methods

findImportFile()

Find the importable file within an extracted directory.

public findImportFile(string $directory, string $format) : string
Parameters
$directory : string

Extracted directory path

$format : string

Expected format (stardict, csv)

Tags
throws
RuntimeException

If no importable file is found

Return values
string

Path to the importable file
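
The lookup described above can be sketched as a recursive search by file extension. This is only an illustration: the function name findImportFileSketch and the extension map (.ifo for stardict, .csv for csv) are assumptions, not details taken from the service.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: return the first file whose extension matches the format.
 * The extension map is an assumption, not the service's actual rule.
 */
function findImportFileSketch(string $directory, string $format): string
{
    // Assumed mapping: a StarDict dictionary is identified by its .ifo file.
    $extensions = ['stardict' => 'ifo', 'csv' => 'csv'];
    if (!isset($extensions[$format])) {
        throw new RuntimeException("Unsupported format: $format");
    }

    // Walk the extracted tree, including subdirectories.
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($directory, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->isFile() && strtolower($file->getExtension()) === $extensions[$format]) {
            return $file->getPathname();
        }
    }

    throw new RuntimeException("No importable $format file found in $directory");
}
```

Searching recursively matters because archives often wrap their contents in a top-level folder.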

importFromUrl()

Import a curated dictionary from a remote URL.

public importFromUrl(int $languageId, string $url, string $format, string $name) : array{success: bool, dictId?: int, imported?: int, vocabCreated?: int, error?: string}
Parameters
$languageId : int

Target language ID

$url : string

Download URL (must be in curated registry)

$format : string

Dictionary format (stardict, csv)

$name : string

Dictionary name

Return values
array{success: bool, dictId?: int, imported?: int, vocabCreated?: int, error?: string}
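
Because every key except success is optional, callers may want to read the result defensively. The helper below is a hypothetical sketch (describeImportResult is not part of the service), and it assumes from the key names that imported counts dictionary entries and vocabCreated counts newly created vocabulary items.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: turn the documented result array into a status line.
 *
 * @param array{success: bool, dictId?: int, imported?: int, vocabCreated?: int, error?: string} $result
 */
function describeImportResult(array $result): string
{
    if (!$result['success']) {
        // 'error' is optional in the documented shape, so fall back to a generic message.
        return 'Import failed: ' . ($result['error'] ?? 'unknown error');
    }

    // The optional counters default to 0 when absent.
    return sprintf(
        'Dictionary #%d: %d entries imported, %d vocabulary items created',
        $result['dictId'] ?? 0,
        $result['imported'] ?? 0,
        $result['vocabCreated'] ?? 0
    );
}
```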

isCuratedUrl()

Check whether a URL appears in the curated dictionaries registry.

public isCuratedUrl(string $url) : bool
Parameters
$url : string
Return values
bool
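
An allow-list check of this kind might look like the sketch below. It assumes each registry entry stores its download address under a url key. That key name and the function isCuratedUrlSketch are illustrative, since the registry layout is not documented here.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of an allow-list check against registry entries.
 *
 * @param array<int, array<string, mixed>> $registry Entries as returned by a registry loader
 */
function isCuratedUrlSketch(string $url, array $registry): bool
{
    foreach ($registry as $entry) {
        // Exact, case-sensitive match: no prefix or host-only comparison.
        if (isset($entry['url']) && $entry['url'] === $url) {
            return true;
        }
    }
    return false;
}
```

Comparing the whole URL rather than just the host keeps the check strict: a different path on an allowed host is still rejected.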

cleanup()

Clean up temporary files and directories.

private cleanup(string ...$paths) : void
Parameters
$paths : string

downloadToTemp()

Download a URL to a temporary file.

private downloadToTemp(string $url) : string
Parameters
$url : string
Tags
throws
RuntimeException

On download failure

Return values
string

Path to the downloaded temp file
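
The MAX_DOWNLOAD_BYTES constant suggests that downloads are size-capped, but this page does not show how. One possible approach, sketched here with a hypothetical copyWithLimit helper, is to copy the response stream in chunks and abort once the cap is exceeded.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: copy one stream to another, failing past $maxBytes.
 *
 * @param resource $source Readable stream (for example an HTTP response body)
 * @param resource $dest   Writable stream (for example a temp file handle)
 * @return int Number of bytes copied
 */
function copyWithLimit($source, $dest, int $maxBytes): int
{
    $total = 0;
    while (!feof($source)) {
        $chunk = fread($source, 8192);
        if ($chunk === false) {
            throw new RuntimeException('Read error during download');
        }
        $total += strlen($chunk);
        // Check before writing so the destination never holds more than the cap.
        if ($total > $maxBytes) {
            throw new RuntimeException("Download exceeds $maxBytes bytes");
        }
        fwrite($dest, $chunk);
    }
    return $total;
}
```

Counting bytes while streaming avoids trusting a Content-Length header, which a server may omit or misreport.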

extractTar()

Extract a tar archive (.tar.xz, .tar.gz, .tar.bz2) to a temporary directory.

private extractTar(string $tarPath) : string

Uses the system tar command, which auto-detects xz, gzip, and bzip2 compression.

Parameters
$tarPath : string
Tags
throws
RuntimeException

On extraction failure

Return values
string

Path to the extraction directory

extractZip()

Extract a ZIP archive to a temporary directory.

private extractZip(string $zipPath) : string
Parameters
$zipPath : string
Tags
throws
RuntimeException

On extraction failure

Return values
string

Path to the extraction directory

isTarArchive()

Check whether a URL points to a tar archive (.tar.xz, .tar.gz, .tar.bz2).

private isTarArchive(string $url) : bool
Parameters
$url : string
Return values
bool
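
A suffix test like this can be sketched as follows. The function isTarArchiveSketch is hypothetical. Inspecting only the URL path and ignoring letter case are assumptions about sensible behaviour, not documented facts.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: detect .tar.xz, .tar.gz and .tar.bz2 URLs by suffix.
 */
function isTarArchiveSketch(string $url): bool
{
    // Look only at the path so a query string or fragment cannot hide the suffix.
    $path = strtolower((string) parse_url($url, PHP_URL_PATH));
    foreach (['.tar.xz', '.tar.gz', '.tar.bz2'] as $suffix) {
        if (substr($path, -strlen($suffix)) === $suffix) {
            return true;
        }
    }
    return false;
}
```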

loadRegistry()

Load the curated dictionaries registry.

private loadRegistry() : array<int, array<string, mixed>>|null
Return values
array<int, array<string, mixed>>|null

removeDir()

Recursively remove a directory.

private removeDir(string $dir) : void
Parameters
$dir : string
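
Recursive removal is commonly written with a child-first directory iterator, as in this sketch. The function removeDirSketch is illustrative, and the service's own implementation may differ.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: delete a directory and everything beneath it.
 */
function removeDirSketch(string $dir): void
{
    if (!is_dir($dir)) {
        return; // Nothing to remove.
    }

    // CHILD_FIRST yields files and subdirectories before their parent directory.
    $items = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::CHILD_FIRST
    );
    foreach ($items as $item) {
        // Symlinks are unlinked rather than followed, even when they point at directories.
        if ($item->isDir() && !$item->isLink()) {
            rmdir($item->getPathname());
        } else {
            unlink($item->getPathname());
        }
    }
    rmdir($dir);
}
```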
