Convert VTT to JSON Online Free

Convert WebVTT subtitle files to a structured JSON array. ChangeThisFile parses each VTT cue and outputs a JSON array of objects with index, start time, end time, and text fields.

By ChangeThisFile Team · Last updated: March 2026

Quick Answer

ChangeThisFile converts WebVTT files to JSON on a secure server. Each VTT cue becomes a JSON object with index, start, end, and text fields — output as a JSON array. Useful for building subtitle APIs or processing caption data programmatically. Files are auto-deleted after processing. Free, no signup.

Free · No signup required · Encrypted transfer · Auto-deleted · Under 2 minutes · Updated April 2026

WebVTT vs JSON: Format Comparison

Key differences between the two formats

Feature | WebVTT | JSON
Structure | Cue-based text blocks | Array of objects
Machine-readable | Requires a VTT parser | Parsed by standard libraries in most languages
API-friendly | Needs text parsing | Direct REST API consumption
Timing format | HH:MM:SS.mmm strings | Converted to HH:MM:SS,mmm strings
Video player use | Direct as <track> src | Requires client rendering
Query/filter | Text processing needed | Use jq, JavaScript, Python directly
Best for | HTML5 video subtitle track | Subtitle APIs, data processing, databases

When to Convert

Common scenarios where this conversion is useful

Build subtitle APIs

Convert VTT caption files to JSON to power subtitle API endpoints. Clients can request specific time ranges, search by text, or paginate cues as JSON objects — much easier than parsing VTT text client-side.
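As a rough sketch of the time-range case, an endpoint handler could filter the converted cues as shown here. It assumes the cue shape described on this page (index, start, end, and text, with HH:MM:SS,mmm timestamps); the helper names `to_ms` and `cues_in_range` and the sample values are illustrative, not part of ChangeThisFile.

```python
def to_ms(timestamp: str) -> int:
    """Convert an 'HH:MM:SS,mmm' string to integer milliseconds."""
    clock, millis = timestamp.split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis)


def cues_in_range(cues: list[dict], start_ms: int, end_ms: int) -> list[dict]:
    """Return cues that overlap the half-open window [start_ms, end_ms)."""
    return [
        cue for cue in cues
        if to_ms(cue["start"]) < end_ms and to_ms(cue["end"]) > start_ms
    ]


# Sample data in the documented output shape (illustrative values).
cues = [
    {"index": 1, "start": "00:00:01,000", "end": "00:00:04,500", "text": "Hello world"},
    {"index": 2, "start": "00:00:05,000", "end": "00:00:08,000", "text": "Second cue"},
    {"index": 3, "start": "00:01:00,000", "end": "00:01:03,250", "text": "Much later"},
]

# Cues visible at any point between 4 s and 10 s.
window = cues_in_range(cues, 4_000, 10_000)
print([cue["index"] for cue in window])
```

The overlap test keeps a cue that starts before the window but is still on screen inside it, which is usually what a player requesting a range expects.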

Import subtitles into databases

Store subtitle data in PostgreSQL, MongoDB, or Elasticsearch. JSON is widely supported as an import format by structured data stores. Convert VTT to JSON to load subtitle content with indexable timing fields.

Custom subtitle rendering in web apps

Building a custom video player? Load subtitle data as JSON, query the current playback time, and render the matching cue. Much simpler than parsing VTT in the client application.
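The lookup itself is language-agnostic. A minimal Python sketch, assuming cues are sorted by start time and do not overlap, uses a binary search over the start times; `to_ms`, `active_cue`, and the sample cues are hypothetical names and values chosen for illustration.

```python
from bisect import bisect_right


def to_ms(timestamp: str) -> int:
    """Convert an 'HH:MM:SS,mmm' string to integer milliseconds."""
    clock, millis = timestamp.split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis)


def active_cue(cues: list[dict], starts: list[int], playback_ms: int):
    """Return the cue on screen at playback_ms, or None between cues.

    Assumes cues are sorted by start time and do not overlap.
    """
    position = bisect_right(starts, playback_ms) - 1
    if position < 0:
        return None  # playback is before the first cue
    cue = cues[position]
    return cue if playback_ms < to_ms(cue["end"]) else None


cues = [
    {"index": 1, "start": "00:00:01,000", "end": "00:00:04,500", "text": "Hello world"},
    {"index": 2, "start": "00:00:05,000", "end": "00:00:08,000", "text": "Second cue"},
]
starts = [to_ms(cue["start"]) for cue in cues]  # built once, reused on every tick

print(active_cue(cues, starts, 2_000))
print(active_cue(cues, starts, 4_750))
```

Precomputing the start times keeps each lookup logarithmic in the number of cues, which matters when the player polls on every time update.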

Subtitle search and analysis

Analyze caption data with jq, Python, or JavaScript. Find the longest cues, calculate words per minute, or search for specific dialogue — all trivial with JSON, but complex with raw VTT.
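For example, the two metrics mentioned above could be computed roughly like this in Python. The sketch assumes the documented cue shape; `longest_cue` and `words_per_minute` are illustrative names, and the rate is measured over time covered by cues, not total video length.

```python
def to_ms(timestamp: str) -> int:
    """Convert an 'HH:MM:SS,mmm' string to integer milliseconds."""
    clock, millis = timestamp.split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis)


def duration_ms(cue: dict) -> int:
    """Length of time the cue stays on screen."""
    return to_ms(cue["end"]) - to_ms(cue["start"])


def longest_cue(cues: list[dict]) -> dict:
    """Return the cue with the greatest on-screen duration."""
    return max(cues, key=duration_ms)


def words_per_minute(cues: list[dict]) -> float:
    """Words per minute, measured over cued time only."""
    total_words = sum(len(cue["text"].split()) for cue in cues)
    total_ms = sum(duration_ms(cue) for cue in cues)
    return total_words / (total_ms / 60_000) if total_ms else 0.0


cues = [
    {"index": 1, "start": "00:00:01,000", "end": "00:00:04,500", "text": "Hello world"},
    {"index": 2, "start": "00:00:05,000", "end": "00:00:08,000", "text": "This is the second cue"},
]

print(longest_cue(cues)["index"])
print(round(words_per_minute(cues), 1))
```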

Content moderation pipelines

Run subtitle text through content moderation APIs that expect JSON input. Convert VTT captions to structured JSON, then pass to NLP services for classification, toxicity detection, or keyword extraction.

How to Convert WebVTT to JSON

  1. Upload your VTT file

    Drag and drop your WebVTT file onto the converter, or click browse to select it. The file is uploaded over an encrypted HTTPS connection.

  2. Server-side conversion

    The server parses each VTT cue block and produces a JSON array. Each element has an index (integer), start (timestamp string), end (timestamp string), and text (string) field. The WEBVTT header and NOTE blocks are ignored.

  3. Download your JSON

    Save the JSON file to your device. The server copy is automatically deleted. The output is a pretty-printed JSON array ready to import into any application.
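To make the conversion step concrete, here is a minimal Python sketch of the parsing described in step 2. It is not ChangeThisFile's actual implementation; the function names, the regular expression, and the sample file are assumptions, and it handles only the common case of blank-line-separated cue blocks.

```python
import json
import re

# A WebVTT timing line; the hours component is optional in WebVTT.
TIMING = re.compile(
    r"^((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})[ \t]+-->[ \t]+((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})"
)


def srt_style(timestamp: str) -> str:
    """Normalize 'MM:SS.mmm' or 'HH:MM:SS.mmm' to 'HH:MM:SS,mmm'."""
    if timestamp.count(":") == 1:
        timestamp = "00:" + timestamp
    return timestamp.replace(".", ",")


def vtt_to_cues(vtt_text: str) -> list[dict]:
    """Parse WebVTT text into a list of {index, start, end, text} dicts."""
    text = vtt_text.lstrip("\ufeff").replace("\r\n", "\n").replace("\r", "\n")
    cues = []
    for block in re.split(r"\n(?:[ \t]*\n)+", text.strip()):
        lines = block.split("\n")
        # The timing line is the first line, or the second after an identifier.
        for position, line in enumerate(lines[:2]):
            match = TIMING.match(line.strip())
            if match:
                break
        else:
            continue  # WEBVTT header, NOTE, STYLE, or REGION block
        cues.append({
            "index": len(cues) + 1,  # sequential, ignores cue identifiers
            "start": srt_style(match.group(1)),
            "end": srt_style(match.group(2)),
            "text": "\n".join(lines[position + 1:]),
        })
    return cues


sample = """WEBVTT

NOTE This comment is skipped

intro
00:00:01.000 --> 00:00:04.500 align:start
Hello world

00:05.000 --> 00:08.000
Line one
Line two
"""

cues = vtt_to_cues(sample)
print(json.dumps(cues, indent=2))
```

The sample exercises a NOTE block, a cue identifier, a cue setting, a timestamp without hours, and a two-line cue, so the printed array shows how each is treated in this sketch.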

Frequently Asked Questions

What does the JSON output look like?

The output is a JSON array where each element represents one VTT cue: [{"index": 1, "start": "00:00:01,000", "end": "00:00:04,500", "text": "Hello world"}, ...]. Start and end times use the SRT-style HH:MM:SS,mmm format (a comma before the milliseconds, where WebVTT uses a period) for consistency with the rest of the subtitle ecosystem.

Are VTT cue identifiers preserved?

No. VTT cue identifiers (optional label lines before the timestamp) are not included in the output JSON. Each cue gets a sequential integer index (1, 2, 3...) regardless of whether the VTT file had named identifiers.

Are cue settings such as position and alignment preserved?

No. Cue settings like 'line:', 'position:', 'align:', and 'region:' are stripped. The JSON output contains only timing and text. If you need position data, you will need to parse the VTT directly for those fields.
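If you do need those settings, they sit on the timing line after the end timestamp as space-separated name:value pairs. A minimal Python sketch for pulling them out of a single timing line follows; `cue_settings` is an illustrative helper, not something the converter provides.

```python
def cue_settings(timing_line: str) -> dict[str, str]:
    """Extract 'name:value' cue settings from a WebVTT timing line."""
    _, _, after_arrow = timing_line.partition("-->")
    # The first token after the arrow is the end timestamp; the rest are settings.
    tokens = after_arrow.split()[1:]
    settings = {}
    for token in tokens:
        name, separator, value = token.partition(":")
        if separator and name and value:
            settings[name] = value
    return settings


line = "00:00:01.000 --> 00:00:04.500 line:85% position:50% align:center"
print(cue_settings(line))
```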

How is multi-line cue text handled?

Multi-line VTT cue text is preserved as a single string with newline characters (\n) embedded. For example: {"text": "Line one\nLine two"}. You can split on \n in your application if you need separate lines.

Is my file uploaded to a server?

Yes. This conversion runs server-side. Your file is uploaded over HTTPS, converted, and automatically deleted after download. No content is stored or retained.

Are NOTE comment blocks skipped?

Yes. NOTE blocks in VTT files are metadata comments, not cue content. They are automatically skipped during parsing. Only timed cue blocks appear in the JSON output.

What format are the timestamps in?

Times are output as HH:MM:SS,mmm strings, the SRT-style format with a comma before the milliseconds. For example: '00:01:23,456'. This matches the internal representation used across the subtitle processing pipeline.
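Because the times are strings, most processing starts by converting them. A small Python sketch, with illustrative helper names, turns one into a `timedelta` for arithmetic and back into WebVTT's period-separated form:

```python
from datetime import timedelta


def to_timedelta(timestamp: str) -> timedelta:
    """Parse an 'HH:MM:SS,mmm' string into a timedelta."""
    clock, millis = timestamp.split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return timedelta(
        hours=hours, minutes=minutes, seconds=seconds, milliseconds=int(millis)
    )


def to_vtt_style(timestamp: str) -> str:
    """Swap the SRT-style comma for WebVTT's period separator."""
    return timestamp.replace(",", ".", 1)


start = to_timedelta("00:01:23,456")
end = to_timedelta("00:01:25,000")
print(end - start)
print(to_vtt_style("00:01:23,456"))
```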

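Can I load the JSON into a database?

Yes. The JSON array maps directly to a database table with columns: index (integer), start (text), end (text), text (text). Note that index and end are reserved words in many SQL dialects, so quote or rename those columns. Use a JSON import tool or a simple INSERT loop in your application to load the data.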
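As one possible shape for that load, here is a sketch using Python's built-in sqlite3 module with an in-memory database. The table name `cues` and the sample rows are assumptions; in practice the list would come from `json.load` on the downloaded file.

```python
import sqlite3

# Illustrative rows in the documented output shape.
cues = [
    {"index": 1, "start": "00:00:01,000", "end": "00:00:04,500", "text": "Hello world"},
    {"index": 2, "start": "00:00:05,000", "end": "00:00:08,000", "text": "Second cue"},
]

connection = sqlite3.connect(":memory:")
# Column names are double-quoted because "index" and "end" are SQL keywords.
connection.execute(
    'CREATE TABLE cues ("index" INTEGER PRIMARY KEY, "start" TEXT, "end" TEXT, "text" TEXT)'
)
# Named placeholders are filled from each cue dict by key.
connection.executemany(
    'INSERT INTO cues ("index", "start", "end", "text") '
    "VALUES (:index, :start, :end, :text)",
    cues,
)
connection.commit()

# Fixed-width timestamps compare correctly as plain strings.
rows = connection.execute(
    'SELECT "index", "text" FROM cues WHERE "start" >= ? ORDER BY "index"',
    ("00:00:05,000",),
).fetchall()
print(rows)
```

String comparison on the start column works only while every timestamp has the same number of hour digits; storing milliseconds as an integer column is a sturdier choice for range queries.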

Does it work with YouTube auto-generated VTT files?

Yes. YouTube auto-generated VTT files are valid WebVTT and convert cleanly. Note that YouTube auto-captions use word-level cues with very short durations, so the JSON output will have many small entries corresponding to individual words.

Can I convert the JSON back to a subtitle format?

Yes. Use ChangeThisFile's JSON to SRT converter to go back to a timed subtitle format, then SRT to VTT if you need WebVTT again. The JSON structure is compatible with the reverse conversion pipeline.
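Because the timestamps are already SRT-style, the reverse step is also easy to sketch locally. The Python function below is an illustration under that assumption, not the converter's own code; `cues_to_srt` is a hypothetical name.

```python
def cues_to_srt(cues: list[dict]) -> str:
    """Render {index, start, end, text} cues as SubRip (SRT) text."""
    blocks = [
        f'{cue["index"]}\n{cue["start"]} --> {cue["end"]}\n{cue["text"]}'
        for cue in cues
    ]
    # SRT blocks are separated by a blank line.
    return "\n\n".join(blocks) + "\n"


cues = [
    {"index": 1, "start": "00:00:01,000", "end": "00:00:04,500", "text": "Hello world"},
    {"index": 2, "start": "00:00:05,000", "end": "00:00:08,000", "text": "Line one\nLine two"},
]

print(cues_to_srt(cues))
```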

Need to convert programmatically?

Use the ChangeThisFile API to convert WebVTT to JSON in your app. No rate limits, up to 500MB files, simple REST endpoint.
