Convert VTT to JSON Online Free
Convert WebVTT subtitle files to a structured JSON array. ChangeThisFile parses each VTT cue and outputs a JSON array of objects with index, start time, end time, and text fields.
By ChangeThisFile Team · Last updated: March 2026
ChangeThisFile converts WebVTT files to JSON on its server over an HTTPS connection. Each VTT cue becomes a JSON object with index, start, end, and text fields, and the objects are output as a single JSON array. This is useful for building subtitle APIs or processing caption data programmatically. Files are auto-deleted after processing. Free, no signup.
WebVTT vs JSON: Format Comparison
Key differences between the two formats
| Feature | WebVTT | JSON |
|---|---|---|
| Structure | Cue-based text blocks | Array of objects |
| Machine-readable | Requires VTT parser | Standard JSON parser in most languages |
| API-friendly | Needs text parsing | Direct REST API consumption |
| Timing format | HH:MM:SS.mmm strings (period before milliseconds) | Converted to SRT-style HH:MM:SS,mmm strings (comma before milliseconds) |
| Video player use | Direct as <track> src | Requires client rendering |
| Query/filter | Text processing needed | Use jq, JavaScript, Python directly |
| Best for | HTML5 video subtitle track | Subtitle APIs, data processing, databases |
When to Convert
Common scenarios where this conversion is useful
Build subtitle APIs
Convert VTT caption files to JSON to power subtitle API endpoints. Clients can request specific time ranges, search by text, or paginate cues as JSON objects, which avoids parsing VTT text on the client side.
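As a rough illustration of the time-range and text-search lookups such an endpoint might perform, here is a minimal Python sketch. It assumes the cue objects have the index, start, end, and text fields described on this page; the helper names `to_ms`, `cues_in_range`, and `search_cues` are hypothetical and not part of any ChangeThisFile API.

```python
def to_ms(ts: str) -> int:
    """Convert an 'HH:MM:SS,mmm' timestamp string to integer milliseconds."""
    hms, millis = ts.split(",")
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis)


def cues_in_range(cues, start_ms, end_ms):
    """Return cues that overlap the half-open window [start_ms, end_ms)."""
    return [
        cue for cue in cues
        if to_ms(cue["start"]) < end_ms and to_ms(cue["end"]) > start_ms
    ]


def search_cues(cues, query):
    """Return cues whose text contains the query, ignoring case."""
    needle = query.lower()
    return [cue for cue in cues if needle in cue["text"].lower()]


# Sample data in the shape of the converter's output (values are made up).
cues = [
    {"index": 1, "start": "00:00:01,000", "end": "00:00:04,500", "text": "Hello world"},
    {"index": 2, "start": "00:00:05,000", "end": "00:00:08,000", "text": "Second cue"},
    {"index": 3, "start": "00:01:23,456", "end": "00:01:25,000", "text": "Goodbye"},
]

window = cues_in_range(cues, 4000, 6000)   # cues visible between 4 s and 6 s
matches = search_cues(cues, "hello")
```

A web framework handler could call these functions with query-string parameters and return the resulting list as the JSON response body.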
Import subtitles into databases
Store subtitle data in PostgreSQL, MongoDB, or Elasticsearch. JSON is the universal import format for structured data stores. Convert VTT to JSON to load subtitle content with indexable timing fields.
Custom subtitle rendering in web apps
Building a custom video player? Load subtitle data as JSON, query the current playback time, and render the matching cue. This is simpler than parsing VTT in the client application.
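The lookup of "which cue matches the current playback time" can be sketched as a binary search over cue start times. The example below is in Python for illustration, and the same logic carries over to whatever language the player is written in. It assumes cues do not overlap; the `CueIndex` class and `to_seconds` helper are hypothetical names.

```python
from bisect import bisect_right


def to_seconds(ts: str) -> float:
    """Convert an 'HH:MM:SS,mmm' timestamp string to seconds."""
    hours, minutes, seconds = ts.replace(",", ".").split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


class CueIndex:
    """Finds the cue active at a playback time (assumes non-overlapping cues)."""

    def __init__(self, cues):
        self._cues = sorted(cues, key=lambda c: to_seconds(c["start"]))
        self._starts = [to_seconds(c["start"]) for c in self._cues]

    def active_cue(self, t: float):
        # Locate the last cue whose start time is <= t.
        i = bisect_right(self._starts, t) - 1
        if i < 0:
            return None
        cue = self._cues[i]
        # The cue is only active if playback has not passed its end time.
        return cue if t < to_seconds(cue["end"]) else None


cue_index = CueIndex([
    {"index": 1, "start": "00:00:01,000", "end": "00:00:04,500", "text": "Hello world"},
    {"index": 2, "start": "00:00:05,000", "end": "00:00:08,000", "text": "Second cue"},
])
```

Building the index once and searching it on each time update keeps the per-frame cost logarithmic in the number of cues.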
Subtitle search and analysis
Analyze caption data with jq, Python, or JavaScript. Find the longest cues, calculate words per minute, or search for specific dialogue. Each of these takes a few lines once the cues are JSON objects, whereas raw VTT has to be parsed first.
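For instance, the longest cue and an overall words-per-minute figure could be computed along these lines. This is a sketch under the assumption that timestamps are HH:MM:SS,mmm strings as described on this page; the function names are hypothetical, and the rate is measured over the time cues are on screen, not over the full video length.

```python
def duration_ms(cue) -> int:
    """Length of a cue in milliseconds, from its 'HH:MM:SS,mmm' start and end."""
    def to_ms(ts: str) -> int:
        hms, millis = ts.split(",")
        hours, minutes, seconds = map(int, hms.split(":"))
        return ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis)
    return to_ms(cue["end"]) - to_ms(cue["start"])


def longest_cue(cues):
    """Return the cue that stays on screen the longest."""
    return max(cues, key=duration_ms)


def words_per_minute(cues) -> float:
    """Words divided by total on-screen cue time, expressed per minute."""
    words = sum(len(cue["text"].split()) for cue in cues)
    shown_ms = sum(duration_ms(cue) for cue in cues)
    return words / (shown_ms / 60000) if shown_ms else 0.0


cues = [
    {"index": 1, "start": "00:00:01,000", "end": "00:00:04,500", "text": "Hello world"},
    {"index": 2, "start": "00:00:05,000", "end": "00:00:08,000", "text": "Line one\nLine two"},
]
```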
Content moderation pipelines
Run subtitle text through content moderation APIs that expect JSON input. Convert VTT captions to structured JSON, then pass to NLP services for classification, toxicity detection, or keyword extraction.
How to Convert WebVTT to JSON
1. Upload your VTT file
Drag and drop your WebVTT file onto the converter, or click browse to select it. The file is uploaded over an encrypted HTTPS connection.
2. Server-side conversion
The server parses each VTT cue block and produces a JSON array. Each element has an index (integer), start (timestamp string), end (timestamp string), and text (string) field. The WEBVTT header and NOTE blocks are ignored.
3. Download your JSON
Save the JSON file to your device. The server copy is automatically deleted. The output is a pretty-printed JSON array ready to import into any application.
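The conversion described in step 2 can be approximated locally with a short script. The sketch below is not ChangeThisFile's implementation. It is a minimal Python parser written to match the behaviour described on this page (sequential index, comma-style timestamps, identifiers and cue settings dropped, header and NOTE blocks skipped). It additionally assumes that any block without a timing line, such as STYLE or REGION, should be skipped, and that timestamps without an hours field should be padded to HH:MM:SS,mmm.

```python
import json
import re

# Matches "start --> end" at the start of a timing line; hours are optional in WebVTT.
TIMING = re.compile(
    r"^\s*((?:\d+:)?\d{2}:\d{2}\.\d{3})\s+-->\s+((?:\d+:)?\d{2}:\d{2}\.\d{3})"
)


def normalize(ts: str) -> str:
    """Pad a missing hours field and swap '.' for ',' to get HH:MM:SS,mmm."""
    if ts.count(":") == 1:
        ts = "00:" + ts
    hours, rest = ts.split(":", 1)
    return f"{int(hours):02d}:{rest}".replace(".", ",")


def vtt_to_cues(vtt_text: str) -> list:
    """Parse WebVTT text into a list of {index, start, end, text} dicts."""
    text = vtt_text.lstrip("\ufeff").replace("\r\n", "\n").replace("\r", "\n")
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.split("\n")
        for pos, line in enumerate(lines):
            m = TIMING.match(line)
            if m:
                break
        else:
            continue  # no timing line: WEBVTT header, NOTE, STYLE, or REGION block
        cues.append({
            "index": len(cues) + 1,             # sequential, ignores cue identifiers
            "start": normalize(m.group(1)),
            "end": normalize(m.group(2)),       # cue settings after the end time are dropped
            "text": "\n".join(lines[pos + 1:]),
        })
    return cues


sample = (
    "WEBVTT\n\n"
    "NOTE this is a comment\n\n"
    "intro\n"
    "00:00:01.000 --> 00:00:04.500 align:start position:10%\n"
    "Hello world\n\n"
    "00:05.000 --> 00:08.000\n"
    "Line one\nLine two\n"
)
pretty = json.dumps(vtt_to_cues(sample), indent=2)
```

Inline cue markup such as `<v Speaker>` or `<i>` tags is left in the text as-is in this sketch. How the hosted converter treats such tags is not stated on this page.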
Frequently Asked Questions
The output is a JSON array where each element represents one VTT cue: [{"index": 1, "start": "00:00:01,000", "end": "00:00:04,500", "text": "Hello world"}, ...]. Start and end times use the SRT-style HH:MM:SS,mmm format (comma before the milliseconds) rather than WebVTT's HH:MM:SS.mmm.
VTT cue identifiers (optional label lines before the timestamp) are not included in the output JSON. Each cue gets a sequential integer index (1, 2, 3...) regardless of whether the VTT file had named identifiers.
Cue settings such as 'line:', 'position:', 'align:', and 'region:' are stripped. The JSON output contains only timing and text. If you need position data, parse the VTT directly for those fields.
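If you do need the settings, they sit on the timing line after the end timestamp as space-separated name:value pairs. A minimal Python sketch for pulling them out of a single timing line (the `cue_settings` name is hypothetical):

```python
def cue_settings(timing_line: str) -> dict:
    """Return the name:value settings that follow the end timestamp."""
    _, _, after_arrow = timing_line.partition("-->")
    tokens = after_arrow.split()
    # tokens[0] is the end timestamp; everything after it is a setting.
    return dict(token.split(":", 1) for token in tokens[1:] if ":" in token)


settings = cue_settings("00:00:01.000 --> 00:00:04.500 line:0 position:10% align:start")
```

Values are kept as raw strings here, so a percentage like `10%` would still need to be interpreted by your application.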
Multi-line VTT cue text is preserved as a single string with newline characters (\n) embedded. For example: {"text": "Line one\nLine two"}. You can split on \n in your application if you need separate lines.
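Note that `\n` in the JSON file is an escape sequence. Once the file is parsed, it becomes a real newline character in the string. A small Python illustration, using a made-up cue:

```python
import json

# The raw string keeps the backslash, exactly as it would appear in the JSON file.
cue = json.loads(r'{"text": "Line one\nLine two"}')

# After parsing, the text holds an actual newline, so a plain split works.
lines = cue["text"].split("\n")
```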
This conversion runs server-side: your file is uploaded over HTTPS, converted, and automatically deleted after download. No content is retained afterward.
NOTE blocks in VTT files are comments, not cue content, so they are skipped during parsing. Only timed cue blocks appear in the JSON output.
Times are output as HH:MM:SS,mmm strings, the SRT-style format with a comma before the milliseconds. For example: '00:01:23,456'. This matches the internal representation used across the subtitle processing pipeline.
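Because the times are strings, numeric work needs a conversion step first. The Python sketch below shows two hypothetical helpers: one that turns a timestamp into a `datetime.timedelta`, and one that switches back to the WebVTT period style.

```python
from datetime import timedelta


def to_timedelta(ts: str) -> timedelta:
    """Convert an 'HH:MM:SS,mmm' string into a timedelta."""
    hms, millis = ts.split(",")
    hours, minutes, seconds = map(int, hms.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds,
                     milliseconds=int(millis))


def to_vtt_style(ts: str) -> str:
    """Swap the SRT-style comma for the period WebVTT uses."""
    return ts.replace(",", ".")


offset = to_timedelta("00:01:23,456")
```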
The JSON array maps directly to a database table with columns: index (integer), start (text), end (text), text (text). Use a JSON import tool or a simple INSERT loop in your application to load the data. Because index and end are reserved words in many SQL dialects, quote or rename those columns.
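A minimal sketch of such a load, using Python's built-in `sqlite3` module as a stand-in for whichever database you use. The table name `cues` and the `load_cues` helper are hypothetical, and the column names are double-quoted because `index` and `end` are SQL keywords.

```python
import sqlite3


def load_cues(conn: sqlite3.Connection, cues) -> None:
    """Create a cues table if needed and insert one row per cue."""
    conn.execute(
        'CREATE TABLE IF NOT EXISTS cues ('
        '"index" INTEGER PRIMARY KEY, "start" TEXT, "end" TEXT, "text" TEXT)'
    )
    # Named placeholders are filled from the keys of each cue dict.
    conn.executemany(
        'INSERT INTO cues ("index", "start", "end", "text") '
        'VALUES (:index, :start, :end, :text)',
        cues,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
load_cues(conn, [
    {"index": 1, "start": "00:00:01,000", "end": "00:00:04,500", "text": "Hello world"},
    {"index": 2, "start": "00:00:05,000", "end": "00:00:08,000", "text": "Second cue"},
])
rows = conn.execute(
    'SELECT "text" FROM cues WHERE "start" >= ? ORDER BY "index"',
    ("00:00:05,000",),
).fetchall()
```

Since the timestamps are zero-padded and fixed-width, plain string comparison on the start column orders cues chronologically, which is why the `>=` filter above works without converting to numbers.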
YouTube auto-generated VTT files are valid WebVTT and convert cleanly. Note that YouTube auto-captions use word-level cues with very short durations, so the JSON output will have many small entries corresponding to individual words.
To go back to a timed subtitle format, use ChangeThisFile's JSON to SRT converter, then SRT to VTT if you need WebVTT. The JSON structure is compatible with the reverse conversion pipeline.
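Because the timestamps are already in SRT style, rebuilding an SRT file from the JSON takes very little code. The Python sketch below is an illustration only, not the ChangeThisFile converter, and the `cues_to_srt` name is hypothetical.

```python
def cues_to_srt(cues) -> str:
    """Render cue dicts as SRT: index, timing line, text, blank line between cues."""
    blocks = [
        f'{cue["index"]}\n{cue["start"]} --> {cue["end"]}\n{cue["text"]}'
        for cue in cues
    ]
    return "\n\n".join(blocks) + "\n"


srt_text = cues_to_srt([
    {"index": 1, "start": "00:00:01,000", "end": "00:00:04,500", "text": "Hello world"},
    {"index": 2, "start": "00:00:05,000", "end": "00:00:08,000", "text": "Line one\nLine two"},
])
```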
Need to convert programmatically?
Use the ChangeThisFile API to convert WebVTT to JSON in your app. No rate limits, up to 500MB files, simple REST endpoint.
Ready to convert your file?
Convert WebVTT to JSON instantly — free, no signup required.
Start Converting