CuratedDictImportService
Service for importing curated dictionaries from remote URLs.
Validates URLs against the curated registry (SSRF defense), downloads the archive, extracts it, delegates to the appropriate importer, and cleans up temp files.
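The SSRF defense described above amounts to an allowlist check: a URL is only fetched if it already appears in the curated registry. A minimal sketch of that idea follows. The function name `isUrlInRegistry`, the `url` key, and the sample registry entry are assumptions made for illustration; the real registry layout is not documented on this page.

```php
<?php
declare(strict_types=1);

/**
 * Return true only when $url exactly matches an entry in the registry.
 * The 'url' key and the entry layout are assumptions made for this sketch.
 *
 * @param array<int, array<string, mixed>> $registry
 */
function isUrlInRegistry(string $url, array $registry): bool
{
    foreach ($registry as $entry) {
        // Strict comparison: no prefix, host-only, or case-insensitive matching.
        if (isset($entry['url']) && $entry['url'] === $url) {
            return true;
        }
    }
    return false;
}

// Hypothetical registry content, for illustration only.
$registry = [
    ['name' => 'Example dictionary', 'url' => 'https://example.org/dicts/example.zip'],
];
```

Exact matching is the conservative choice here: a URL that differs only by a query string or host is treated as unknown and never downloaded.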
Table of Contents
Constants
- DOWNLOAD_TIMEOUT = 120
- HTTP download timeout in seconds.
- MAX_DOWNLOAD_BYTES = 100 * 1024 * 1024
- Maximum download size (100 MB).
- MAX_ZIP_FILES = 500
- Maximum files allowed in a ZIP archive.
Properties
- $facade : DictionaryFacade
- $registryPath : string|null
Methods
- __construct() : mixed
- findImportFile() : string
- Find the importable file within an extracted directory.
- importFromUrl() : array{success: bool, dictId?: int, imported?: int, vocabCreated?: int, error?: string}
- Import a curated dictionary from a remote URL.
- isCuratedUrl() : bool
- Check whether a URL appears in the curated dictionaries registry.
- cleanup() : void
- Clean up temporary files and directories.
- downloadToTemp() : string
- Download a URL to a temporary file.
- extractTar() : string
- Extract a tar archive (.tar.xz, .tar.gz, .tar.bz2) to a temporary directory.
- extractZip() : string
- Extract a ZIP archive to a temporary directory.
- isTarArchive() : bool
- Check whether a URL points to a tar archive (.tar.xz, .tar.gz, .tar.bz2).
- loadRegistry() : array<int, array<string, mixed>>|null
- Load the curated dictionaries registry.
- removeDir() : void
- Recursively remove a directory.
Constants
DOWNLOAD_TIMEOUT
HTTP download timeout in seconds.
private
mixed DOWNLOAD_TIMEOUT = 120
MAX_DOWNLOAD_BYTES
Maximum download size (100 MB).
private
mixed MAX_DOWNLOAD_BYTES = 100 * 1024 * 1024
MAX_ZIP_FILES
Maximum files allowed in a ZIP archive.
private
mixed MAX_ZIP_FILES = 500
Properties
$facade
private
DictionaryFacade $facade
$registryPath
private
string|null $registryPath
Overrides the registry path; intended for testing.
Methods
__construct()
public
__construct(DictionaryFacade $facade[, string|null $registryPath = null ]) : mixed
Parameters
- $facade : DictionaryFacade
- $registryPath : string|null = null
findImportFile()
Find the importable file within an extracted directory.
public
findImportFile(string $directory, string $format) : string
Parameters
- $directory : string
  Extracted directory path
- $format : string
  Expected format (stardict, csv)
Return values
string —Path to the importable file
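One way such a lookup could work is to walk the extracted directory and return the first file whose extension matches the requested format. The sketch below assumes that `stardict` maps to the `.ifo` file of a StarDict dictionary and `csv` to a `.csv` file; the function name `findFirstImportable`, the extension map, and the exception type are illustrative, not taken from the service.

```php
<?php
declare(strict_types=1);

/**
 * Walk $directory recursively and return the first file whose extension
 * matches the format. The extension map is an assumption for this sketch.
 *
 * @throws RuntimeException when the format is unknown or no file matches
 */
function findFirstImportable(string $directory, string $format): string
{
    $extensions = ['stardict' => 'ifo', 'csv' => 'csv'];
    if (!isset($extensions[$format])) {
        throw new RuntimeException("Unsupported format: $format");
    }

    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($directory, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        // Compare extensions case-insensitively so "WORDS.IFO" is also found.
        if ($file->isFile() && strtolower($file->getExtension()) === $extensions[$format]) {
            return $file->getPathname();
        }
    }
    throw new RuntimeException("No $format file found in $directory");
}
```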
importFromUrl()
Import a curated dictionary from a remote URL.
public
importFromUrl(int $languageId, string $url, string $format, string $name) : array{success: bool, dictId?: int, imported?: int, vocabCreated?: int, error?: string}
Parameters
- $languageId : int
  Target language ID
- $url : string
  Download URL (must be in curated registry)
- $format : string
  Dictionary format (stardict, csv)
- $name : string
  Dictionary name
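Because every key except `success` is optional in the documented result shape, callers probably need to guard each access. The following sketch shows one way to consume such a result; `describeImportResult` and its message wording are hypothetical and not part of the service.

```php
<?php
declare(strict_types=1);

/**
 * Turn the documented result array into a one-line status message.
 *
 * @param array{success: bool, dictId?: int, imported?: int, vocabCreated?: int, error?: string} $result
 */
function describeImportResult(array $result): string
{
    if (!$result['success']) {
        // 'error' is optional in the documented shape, so fall back to a generic message.
        return 'Import failed: ' . ($result['error'] ?? 'unknown error');
    }
    return sprintf(
        'Dictionary #%d: imported=%d, vocabCreated=%d',
        $result['dictId'] ?? 0,
        $result['imported'] ?? 0,
        $result['vocabCreated'] ?? 0
    );
}
```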
Return values
array{success: bool, dictId?: int, imported?: int, vocabCreated?: int, error?: string}
isCuratedUrl()
Check whether a URL appears in the curated dictionaries registry.
public
isCuratedUrl(string $url) : bool
Parameters
- $url : string
Return values
bool
cleanup()
Clean up temporary files and directories.
private
cleanup(string ...$paths) : void
Parameters
- $paths : string
downloadToTemp()
Download a URL to a temporary file.
private
downloadToTemp(string $url) : string
Parameters
- $url : string
Return values
string —Path to the downloaded temp file
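This page does not say how the download enforces MAX_DOWNLOAD_BYTES. One common approach is to copy the response stream in chunks and abort as soon as the running total exceeds the cap, as sketched below. The helper `copyWithLimit`, its chunk size, and its exception type are assumptions for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Copy $source to $dest in chunks, aborting once more than $maxBytes were read.
 *
 * @param resource $source readable stream
 * @param resource $dest   writable stream
 * @return int number of bytes copied
 * @throws RuntimeException when reading fails or the source exceeds $maxBytes
 */
function copyWithLimit($source, $dest, int $maxBytes): int
{
    $total = 0;
    while (!feof($source)) {
        $chunk = fread($source, 8192);
        if ($chunk === false) {
            throw new RuntimeException('Read failed');
        }
        $total += strlen($chunk);
        // Check before writing so an oversized body is never fully stored.
        if ($total > $maxBytes) {
            throw new RuntimeException("Download exceeds $maxBytes bytes");
        }
        fwrite($dest, $chunk);
    }
    return $total;
}
```

Counting bytes while streaming avoids trusting a `Content-Length` header, which a server may omit or misreport.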
extractTar()
Extract a tar archive (.tar.xz, .tar.gz, .tar.bz2) to a temporary directory.
private
extractTar(string $tarPath) : string
Uses the system tar command, which auto-detects xz, gzip, and bzip2 compression.
Parameters
- $tarPath : string
Return values
string —Path to the extraction directory
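When shelling out to tar, both paths should be escaped before they reach the shell. The sketch below only builds the command string; the helper name `buildTarExtractCommand` and the exact flags are assumptions, since this page does not show the invocation the service uses.

```php
<?php
declare(strict_types=1);

/**
 * Build a shell command that extracts $tarPath into $destDir.
 * With -xf, GNU tar and bsdtar detect the compression format on their own.
 */
function buildTarExtractCommand(string $tarPath, string $destDir): string
{
    // escapeshellarg() quotes each path so spaces and metacharacters stay inert.
    return sprintf(
        'tar -xf %s -C %s',
        escapeshellarg($tarPath),
        escapeshellarg($destDir)
    );
}
```

The resulting string could then be passed to `exec()`, checking the exit code before using the extraction directory.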
extractZip()
Extract a ZIP archive to a temporary directory.
private
extractZip(string $zipPath) : string
Parameters
- $zipPath : string
Return values
string —Path to the extraction directory
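MAX_ZIP_FILES suggests that the archive is inspected before extraction. A sketch of such a pre-check follows, operating on a plain list of entry names (which could be collected with `ZipArchive::getNameIndex()`). The function name `assertSafeZipEntries` is hypothetical, and the path-traversal check is an extra precaution added for this sketch; this page does not state that the service performs it.

```php
<?php
declare(strict_types=1);

/**
 * Reject archives with too many entries or with entry names that would
 * escape the extraction directory.
 *
 * @param string[] $names entry names as stored in the archive
 * @throws RuntimeException on the first violation
 */
function assertSafeZipEntries(array $names, int $maxFiles): void
{
    if (count($names) > $maxFiles) {
        throw new RuntimeException("Archive has more than $maxFiles files");
    }
    foreach ($names as $name) {
        $normalized = str_replace('\\', '/', $name);
        // Absolute paths (POSIX or Windows drive letters) are not allowed.
        if (preg_match('#^(/|[A-Za-z]:)#', $normalized) === 1) {
            throw new RuntimeException("Absolute path in archive: $name");
        }
        // Assumption of this sketch: any ".." segment is treated as traversal.
        if (in_array('..', explode('/', $normalized), true)) {
            throw new RuntimeException("Path traversal in archive: $name");
        }
    }
}
```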
isTarArchive()
Check whether a URL points to a tar archive (.tar.xz, .tar.gz, .tar.bz2).
private
isTarArchive(string $url) : bool
Parameters
- $url : string
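A suffix check of this kind can be sketched as follows. Whether the actual method ignores query strings or letter case is not documented; both behaviors, and the name `looksLikeTarArchive`, are assumptions of this example.

```php
<?php
declare(strict_types=1);

/**
 * Return true when the path component of $url ends in a known tar suffix.
 */
function looksLikeTarArchive(string $url): bool
{
    // Look only at the path so "?download=1" does not hide the suffix.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return false;
    }
    $path = strtolower($path);
    foreach (['.tar.xz', '.tar.gz', '.tar.bz2'] as $suffix) {
        if (substr($path, -strlen($suffix)) === $suffix) {
            return true;
        }
    }
    return false;
}
```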
Return values
bool
loadRegistry()
Load the curated dictionaries registry.
private
loadRegistry() : array<int, array<string, mixed>>|null
Return values
array<int, array<string, mixed>>|null
removeDir()
Recursively remove a directory.
private
removeDir(string $dir) : void
Parameters
- $dir : string
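Recursive removal is usually done children-first, so that every directory is empty by the time it is deleted. A minimal sketch using SPL iterators follows; `removeDirectoryTree` is a hypothetical stand-in, and the real method may be implemented differently.

```php
<?php
declare(strict_types=1);

/**
 * Delete $dir and everything beneath it. Does nothing if $dir is not a directory.
 */
function removeDirectoryTree(string $dir): void
{
    if (!is_dir($dir)) {
        return;
    }
    $items = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::CHILD_FIRST
    );
    foreach ($items as $item) {
        // Symlinks are unlinked, never followed, so targets outside $dir survive.
        if ($item->isDir() && !$item->isLink()) {
            rmdir($item->getPathname());
        } else {
            unlink($item->getPathname());
        }
    }
    rmdir($dir);
}
```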