SubtitleParserService
Service for parsing subtitle files (SRT, VTT) and extracting text content.
Supports:
- SRT (SubRip) format
- VTT (WebVTT) format
Table of Contents
Methods
- detectFormat() : string|null
- Detect subtitle format from filename extension or content.
- isValidSubtitle() : bool
- Validate that content appears to be a valid subtitle file.
- parse() : array{success: bool, text: string, cueCount: int, error: string|null}
- Parse a subtitle file and extract text content.
- cleanText() : string
- Clean extracted text.
- isValidSrt() : bool
- Check if content appears to be valid SRT format.
- isValidVtt() : bool
- Check if content appears to be valid VTT format.
- parseSrt() : string
- Parse SRT format content.
- parseVtt() : string
- Parse VTT format content.
- stripVttTags() : string
- Strip VTT inline styling tags.
Methods
detectFormat()
Detect subtitle format from filename extension or content.
public
detectFormat(string $filename, string $content) : string|null
Parameters
- $filename : string
  File name
- $content : string
  File content (for WEBVTT header detection)
Return values
string|null —'srt', 'vtt', or null if unknown
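The method body is not shown in this reference, so the following is only a minimal sketch of how such detection could work. It assumes the extension is checked first and the WEBVTT header is used as a fallback; the function name sketchDetectFormat is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for detectFormat(); the check order is an assumption.
 */
function sketchDetectFormat(string $filename, string $content): ?string
{
    // Try the filename extension first (case-insensitive).
    $extension = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    if ($extension === 'srt' || $extension === 'vtt') {
        return $extension;
    }

    // Fall back to the content: drop an optional UTF-8 byte order mark,
    // then look for the "WEBVTT" signature at the very start.
    if (strncmp($content, "\xEF\xBB\xBF", 3) === 0) {
        $content = substr($content, 3);
    }
    if (strncmp($content, 'WEBVTT', 6) === 0) {
        return 'vtt';
    }

    // SRT has no signature, so unknown extensions without a WEBVTT header stay unknown.
    return null;
}
```

Under these assumptions, 'movie.SRT' resolves to 'srt' by extension alone, while a file named 'captions.txt' whose content starts with WEBVTT resolves to 'vtt'.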
isValidSubtitle()
Validate that content appears to be a valid subtitle file.
public
isValidSubtitle(string $content, string $format) : bool
Parameters
- $content : string
  File content
- $format : string
  Expected format ('srt' or 'vtt')
Return values
bool —True if content appears valid for the format
parse()
Parse a subtitle file and extract text content.
public
parse(string $content, string $format) : array{success: bool, text: string, cueCount: int, error: string|null}
Parameters
- $content : string
  Raw file content
- $format : string
  Format: 'srt' or 'vtt'
Return values
array{success: bool, text: string, cueCount: int, error: string|null}
cleanText()
Clean extracted text.
private
cleanText(string $text) : string
- Normalize whitespace
- Remove excessive blank lines
- Trim lines
Parameters
- $text : string
  Raw extracted text
Return values
string —Cleaned text
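The three cleanup steps listed for cleanText() could be sketched roughly as follows. This is an illustration, not the service's actual code: the function name sketchCleanText is hypothetical, and treating "excessive blank lines" as two or more consecutive blank lines is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for cleanText().
 */
function sketchCleanText(string $text): string
{
    // Normalize line endings so the steps below only deal with "\n".
    $text = str_replace(["\r\n", "\r"], "\n", $text);

    // Collapse runs of spaces and tabs inside each line, then trim the line.
    $lines = array_map(
        static fn (string $line): string => trim((string) preg_replace('/[ \t]+/', ' ', $line)),
        explode("\n", $text)
    );
    $text = implode("\n", $lines);

    // Assumption: two or more blank lines in a row collapse to a single blank line.
    $text = (string) preg_replace('/\n{3,}/', "\n\n", $text);

    return trim($text);
}
```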
isValidSrt()
Check if content appears to be valid SRT format.
private
isValidSrt(string $content) : bool
Parameters
- $content : string
  File content
Return values
bool —True if appears to be valid SRT
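One plausible heuristic for "appears to be valid SRT" is to look for at least one SRT timing line, which uses a comma before the milliseconds. The sketch below assumes that heuristic; the real check may be stricter or looser, and sketchIsValidSrt is a hypothetical name.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for isValidSrt(): true if any line looks like an SRT timing line.
 */
function sketchIsValidSrt(string $content): bool
{
    // Example of a matching line: 00:00:00,000 --> 00:00:05,000
    $timingLine = '/^\d{2}:\d{2}:\d{2},\d{3}\s+-->\s+\d{2}:\d{2}:\d{2},\d{3}/m';

    return preg_match($timingLine, $content) === 1;
}
```

Because the pattern requires commas, a WebVTT timing line such as 00:00:00.000 --> 00:00:05.000 does not pass this check.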
isValidVtt()
Check if content appears to be valid VTT format.
private
isValidVtt(string $content) : bool
Parameters
- $content : string
  File content
Return values
bool —True if appears to be valid VTT
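A WebVTT file begins with the signature WEBVTT, optionally preceded by a byte order mark and followed by whitespace or the end of the file. A minimal check built on that rule might look like this; whether isValidVtt() also requires a timing line is not stated here, and sketchIsValidVtt is a hypothetical name.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for isValidVtt(): checks only the file signature.
 */
function sketchIsValidVtt(string $content): bool
{
    // Drop an optional UTF-8 byte order mark.
    if (strncmp($content, "\xEF\xBB\xBF", 3) === 0) {
        $content = substr($content, 3);
    }

    // "WEBVTT" must be followed by a space, tab, line break, or the end of the input.
    return preg_match('/^WEBVTT(?:[ \t\r\n]|$)/', $content) === 1;
}
```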
parseSrt()
Parse SRT format content.
private
parseSrt(string $content) : string
SRT format:

    1
    00:00:00,000 --> 00:00:05,000
    Subtitle text here

    2
    00:00:05,100 --> 00:00:10,000
    Another subtitle
Parameters
- $content : string
  Raw SRT content
Return values
string —Extracted text with double newlines between cues
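To make the extraction concrete, here is a rough sketch of one way to pull cue text out of SRT content shaped like the example in this section. It assumes cues are separated by blank lines and that everything after the timing line belongs to the cue; sketchParseSrt is a hypothetical name and the real method may differ.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for parseSrt().
 */
function sketchParseSrt(string $content): string
{
    $content = str_replace(["\r\n", "\r"], "\n", $content);

    // Cues are separated by one or more blank (or whitespace-only) lines.
    $blocks = preg_split('/\n\s*\n/', trim($content)) ?: [];

    $cues = [];
    foreach ($blocks as $block) {
        $lines = explode("\n", $block);

        // Locate the timing line; the cue index before it is discarded.
        $timingIndex = null;
        foreach ($lines as $i => $line) {
            if (strpos($line, '-->') !== false) {
                $timingIndex = $i;
                break;
            }
        }
        if ($timingIndex === null) {
            continue; // Not a cue block.
        }

        // Everything after the timing line is the cue text.
        $text = trim(implode("\n", array_slice($lines, $timingIndex + 1)));
        if ($text !== '') {
            $cues[] = $text;
        }
    }

    // Double newline between cues, matching the documented return value.
    return implode("\n\n", $cues);
}
```

Locating the timing line, rather than skipping every all-digit line, keeps subtitle text such as "42" from being mistaken for a cue index.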
parseVtt()
Parse VTT format content.
private
parseVtt(string $content) : string
VTT format:

    WEBVTT

    00:00:00.000 --> 00:00:05.000
    Subtitle text here

    NOTE
    This is a comment

    00:00:05.100 --> 00:00:10.000
    Another subtitle
Parameters
- $content : string
  Raw VTT content
Return values
string —Extracted text with double newlines between cues
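For VTT, the header and NOTE comments have to be skipped as well. The following line-by-line sketch shows one possible approach, assuming text is collected only between a timing line and the next blank line. The name sketchParseVtt is hypothetical, and inline tag stripping is left out of this sketch.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for parseVtt().
 */
function sketchParseVtt(string $content): string
{
    $lines = explode("\n", str_replace(["\r\n", "\r"], "\n", $content));

    $cues = [];
    $current = null; // null outside a cue; array of text lines inside one

    foreach ($lines as $line) {
        $line = trim($line);

        if ($line === '') {
            // A blank line ends the current block.
            if ($current) {
                $cues[] = implode("\n", $current);
            }
            $current = null;
        } elseif (strpos($line, '-->') !== false) {
            // A timing line starts the text part of a cue.
            $current = [];
        } elseif ($current !== null) {
            $current[] = $line;
        }
        // Remaining lines (WEBVTT header, NOTE comments, cue identifiers) are ignored.
    }

    // Flush the last cue if the input does not end with a blank line.
    if ($current) {
        $cues[] = implode("\n", $current);
    }

    return implode("\n\n", $cues);
}
```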
stripVttTags()
Strip VTT inline styling tags.
private
stripVttTags(string $text) : string
Removes inline cue tags such as:
- <c.classname>
Parameters
- $text : string
  Text with potential VTT tags
Return values
string —Text with tags removed
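As an illustration, a single regular expression can remove most WebVTT cue tags. Apart from <c.classname>, the exact tag set handled by stripVttTags() is not recoverable from this page, so the sketch below simply assumes that anything shaped like <name ...>, </name>, or a <00:00:01.000> timestamp should go. The name sketchStripVttTags is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for stripVttTags().
 */
function sketchStripVttTags(string $text): string
{
    // "<", an optional "/", a letter or digit, then anything up to the closing ">".
    // Requiring a letter or digit right after "<" leaves text like "2 < 3" untouched.
    return (string) preg_replace('/<\/?[a-zA-Z0-9][^<>]*>/', '', $text);
}

echo sketchStripVttTags('<v Roger><c.loud>Hello</c> <i>world</i><00:00:01.000>!'), "\n";
// prints "Hello world!"
```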