Documentation

SubtitleParserService

Service for parsing subtitle files (SRT, VTT) and extracting text content.

Supports:

  • SRT (SubRip) format
  • VTT (WebVTT) format

Tags

since: 3.0.0

Table of Contents

Methods

detectFormat()  : string|null
Detect subtitle format from filename extension or content.
isValidSubtitle()  : bool
Validate that content appears to be a valid subtitle file.
parse()  : array{success: bool, text: string, cueCount: int, error: string|null}
Parse a subtitle file and extract text content.
cleanText()  : string
Clean extracted text.
isValidSrt()  : bool
Check if content appears to be valid SRT format.
isValidVtt()  : bool
Check if content appears to be valid VTT format.
parseSrt()  : string
Parse SRT format content.
parseVtt()  : string
Parse VTT format content.
stripVttTags()  : string
Strip VTT inline styling tags.

Methods

detectFormat()

Detect subtitle format from filename extension or content.

public detectFormat(string $filename, string $content) : string|null
Parameters
$filename : string

File name

$content : string

File content (for WEBVTT header detection)

Return values
string|null

'srt', 'vtt', or null if unknown
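
The detection order described above (extension first, then the WEBVTT header) could look roughly like the following. This is a minimal sketch, not the service's actual code: the function name `sketchDetectFormat` is hypothetical, and the byte order mark handling is an assumption.

```php
<?php

/**
 * Hypothetical stand-in for detectFormat(); the real method may differ.
 */
function sketchDetectFormat(string $filename, string $content): ?string
{
    // Prefer the filename extension when it names a supported format.
    $extension = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    if ($extension === 'srt' || $extension === 'vtt') {
        return $extension;
    }

    // Fall back to the WEBVTT header, skipping a UTF-8 BOM and leading whitespace
    // (assumption: the real method may be stricter about what precedes the header).
    $head = ltrim($content, "\xEF\xBB\xBF \t\r\n");
    if (strncmp($head, 'WEBVTT', 6) === 0) {
        return 'vtt';
    }

    return null;
}
```

Note that SRT has no header line, so in this sketch an SRT file is only recognized by its extension.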

isValidSubtitle()

Validate that content appears to be a valid subtitle file.

public isValidSubtitle(string $content, string $format) : bool
Parameters
$content : string

File content

$format : string

Expected format ('srt' or 'vtt')

Return values
bool

True if content appears valid for the format

parse()

Parse a subtitle file and extract text content.

public parse(string $content, string $format) : array{success: bool, text: string, cueCount: int, error: string|null}
Parameters
$content : string

Raw file content

$format : string

Format: 'srt' or 'vtt'

Return values
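array{success: bool, text: string, cueCount: int, error: string|null}

Result array: success indicates whether parsing succeeded, text holds the extracted text, cueCount is the number of cues found, and error is an error message or null.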
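
A caller would typically chain the three public methods and then inspect the returned array. The sketch below shows one way to do that. The interface `SubtitleParserLike` and the function `extractSubtitleText` are hypothetical names introduced only to keep the example self-contained; how the real service is constructed or injected is not covered by this page.

```php
<?php

/** Hypothetical interface mirroring the three documented public methods. */
interface SubtitleParserLike
{
    public function detectFormat(string $filename, string $content): ?string;

    public function isValidSubtitle(string $content, string $format): bool;

    /** @return array{success: bool, text: string, cueCount: int, error: string|null} */
    public function parse(string $content, string $format): array;
}

/**
 * Detect, validate, parse, and return the extracted text, or throw on failure.
 */
function extractSubtitleText(SubtitleParserLike $parser, string $filename, string $content): string
{
    $format = $parser->detectFormat($filename, $content);
    if ($format === null) {
        throw new InvalidArgumentException("Unrecognized subtitle format: {$filename}");
    }

    if (!$parser->isValidSubtitle($content, $format)) {
        throw new InvalidArgumentException("Content does not look like valid {$format}");
    }

    $result = $parser->parse($content, $format);
    if (!$result['success']) {
        // 'error' is string|null, so supply a fallback message.
        throw new RuntimeException($result['error'] ?? 'Unknown parse error');
    }

    return $result['text'];
}
```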

cleanText()

Clean extracted text.

private cleanText(string $text) : string
  • Normalize whitespace
  • Remove excessive blank lines
  • Trim lines
Parameters
$text : string

Raw extracted text

Return values
string

Cleaned text
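
The three cleaning steps listed above can be sketched as follows. This is an illustration under assumptions, not the actual private method: `sketchCleanText` is a hypothetical name, and the exact rules (for example, how many blank lines count as excessive) are guesses.

```php
<?php

/**
 * Hypothetical stand-in for cleanText(); the real rules may differ.
 */
function sketchCleanText(string $text): string
{
    // Unify line endings so the later steps only deal with "\n".
    $text = str_replace(["\r\n", "\r"], "\n", $text);

    // Collapse runs of spaces and tabs, then trim each line.
    $lines = array_map(
        static fn (string $line): string => trim(preg_replace('/[ \t]+/', ' ', $line)),
        explode("\n", $text)
    );

    // Assumption: "excessive" means more than one blank line in a row.
    $text = preg_replace('/\n{3,}/', "\n\n", implode("\n", $lines));

    return trim($text);
}
```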

isValidSrt()

Check if content appears to be valid SRT format.

private isValidSrt(string $content) : bool
Parameters
$content : string

File content

Return values
bool

True if the content appears to be valid SRT
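
One plausible heuristic for this check is to look for at least one SRT timing line, which uses a comma before the milliseconds. The sketch below assumes that heuristic; `sketchIsValidSrt` is a hypothetical name and the real method may test more or less than this.

```php
<?php

/**
 * Hypothetical stand-in for isValidSrt(): true if any line looks like
 * "00:00:00,000 --> 00:00:05,000".
 */
function sketchIsValidSrt(string $content): bool
{
    // The /m flag lets ^ match at the start of every line.
    $timing = '/^\d{2,}:\d{2}:\d{2},\d{3}\s*-->\s*\d{2,}:\d{2}:\d{2},\d{3}/m';

    return preg_match($timing, $content) === 1;
}
```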

isValidVtt()

Check if content appears to be valid VTT format.

private isValidVtt(string $content) : bool
Parameters
$content : string

File content

Return values
bool

True if the content appears to be valid VTT
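
A WebVTT file must begin with the string WEBVTT, optionally preceded by a byte order mark, so a header check is a natural heuristic here. The sketch below assumes that is what the validation amounts to; `sketchIsValidVtt` is a hypothetical name.

```php
<?php

/**
 * Hypothetical stand-in for isValidVtt(): checks only the WEBVTT header.
 */
function sketchIsValidVtt(string $content): bool
{
    // Drop an optional UTF-8 byte order mark.
    if (strncmp($content, "\xEF\xBB\xBF", 3) === 0) {
        $content = substr($content, 3);
    }

    // "WEBVTT" must be followed by whitespace or the end of the input.
    return preg_match('/^WEBVTT(?:[ \t\r\n]|$)/', $content) === 1;
}
```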

parseSrt()

Parse SRT format content.

private parseSrt(string $content) : string

SRT format:

1
00:00:00,000 --> 00:00:05,000
Subtitle text here

2
00:00:05,100 --> 00:00:10,000
Another subtitle
Parameters
$content : string

Raw SRT content

Return values
string

Extracted text with double newlines between cues
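
Given the SRT layout shown above, extraction amounts to splitting on blank lines and discarding the index and timing lines of each block. The following is a minimal sketch under that assumption; `sketchParseSrt` is a hypothetical name and the real method may handle malformed input differently.

```php
<?php

/**
 * Hypothetical stand-in for parseSrt(): returns cue text joined by blank lines.
 */
function sketchParseSrt(string $content): string
{
    // Strip a UTF-8 BOM and unify line endings.
    $content = str_replace(["\r\n", "\r"], "\n", ltrim($content, "\xEF\xBB\xBF"));

    $cues = [];
    // Cues are separated by one or more blank lines.
    foreach (preg_split('/\n{2,}/', trim($content)) as $block) {
        $lines = explode("\n", $block);

        // Drop the numeric index line if present.
        if (preg_match('/^\d+$/', trim($lines[0])) === 1) {
            array_shift($lines);
        }

        // The next line must be the timing line; otherwise skip the block.
        if (!isset($lines[0]) || strpos($lines[0], '-->') === false) {
            continue;
        }
        array_shift($lines);

        $text = trim(implode("\n", $lines));
        if ($text !== '') {
            $cues[] = $text;
        }
    }

    return implode("\n\n", $cues);
}
```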

parseVtt()

Parse VTT format content.

private parseVtt(string $content) : string

VTT format:

WEBVTT

00:00:00.000 --> 00:00:05.000
Subtitle text here

NOTE
This is a comment

00:00:05.100 --> 00:00:10.000
Another subtitle
Parameters
$content : string

Raw VTT content

Return values
string

Extracted text with double newlines between cues
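
For VTT, the header and NOTE blocks carry no timing line, so one simple approach is to keep only blocks that contain one and take the lines after it. The sketch below assumes that approach; `sketchParseVtt` is a hypothetical name, and inline tag stripping is left out here.

```php
<?php

/**
 * Hypothetical stand-in for parseVtt(): returns cue text joined by blank lines.
 */
function sketchParseVtt(string $content): string
{
    // Strip a UTF-8 BOM and unify line endings.
    $content = str_replace(["\r\n", "\r"], "\n", ltrim($content, "\xEF\xBB\xBF"));

    $cues = [];
    foreach (preg_split('/\n{2,}/', trim($content)) as $block) {
        $lines = explode("\n", $block);

        // Locate the timing line; an optional cue identifier may precede it.
        $timingIndex = null;
        foreach ($lines as $i => $line) {
            if (strpos($line, '-->') !== false) {
                $timingIndex = $i;
                break;
            }
        }

        // Blocks without a timing line (WEBVTT header, NOTE comments) are skipped.
        if ($timingIndex === null) {
            continue;
        }

        $text = trim(implode("\n", array_slice($lines, $timingIndex + 1)));
        if ($text !== '') {
            $cues[] = $text;
        }
    }

    return implode("\n\n", $cues);
}
```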

stripVttTags()

Strip VTT inline styling tags.

private stripVttTags(string $text) : string

Removes tags like:

  • Inline tags such as <b>, <i>, <u>, <v>, <c>, <lang>
  • Class-qualified tags such as <c.classname>
Parameters
$text : string

Text with potential VTT tags

Return values
string

Text with tags removed
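
Tag removal of this kind is usually a single regular expression. The sketch below is one possible pattern, not the service's own: `sketchStripVttTags` is a hypothetical name, and the decision to also remove inline timestamp tags is an assumption.

```php
<?php

/**
 * Hypothetical stand-in for stripVttTags().
 */
function sketchStripVttTags(string $text): string
{
    // First alternative: opening or closing tags whose name starts with a letter,
    // including classes (<c.loud>) and annotations (<v Roger>).
    // Second alternative (assumption): inline timestamps such as <00:00:01.500>.
    $pattern = '~</?[a-zA-Z][^>]*>|<\d{2}:\d{2}[\d:.]*>~';

    return preg_replace($pattern, '', $text);
}
```

Restricting the pattern to tag-like sequences, rather than removing everything between angle brackets, leaves ordinary text such as "1 < 2 > 0" untouched.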


        