The sync /v1/convert endpoint works well up to 20MB and 30s. Beyond that — large videos, 100-page PDFs, bulky PowerPoint files — you need the async jobs API. It decouples submission from completion, supports files up to 500MB, and plays well with serverless environments that have short request timeouts. This guide covers the two consumption patterns: polling and webhooks.
TL;DR — when to use /v1/jobs
Use the async jobs endpoint when:
- File is over 20MB (images, documents) or 100MB (video/audio)
- Expected conversion time exceeds 30s (long video transcode, 50+ page PDF)
- Your runtime has a request timeout (API Gateway in front of Lambda at 30s, Cloudflare Workers at 30s, most load balancers at 60s)
- You want webhook push instead of polling
Stick with /v1/convert for everything else — it's simpler and faster (no polling overhead).
Jobs API limits: 500MB max file size. Jobs expire after 24 hours. Download URLs are signed and expire after 1 hour — download promptly or re-poll to get a fresh URL.
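If you route files programmatically, the thresholds above reduce to a few lines. A minimal sketch; the extension set is illustrative, not the API's official list, and conversion-time estimates are still on you:

from pathlib import Path

AV_EXTS = {'.mp4', '.mov', '.avi', '.webm', '.mkv', '.mp3', '.wav', '.flac'}

def choose_endpoint(path: Path) -> str:
    """Route by the size limits above: 100MB for video/audio, 20MB otherwise."""
    size_mb = path.stat().st_size / 1e6
    limit = 100 if path.suffix.lower() in AV_EXTS else 20
    return '/v1/jobs' if size_mb > limit else '/v1/convert'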
Sync endpoint failure modes with large files
The sync endpoint hits three walls with large files:
- Timeout: A 200MB MP4 → WebM conversion takes 90-180s. The API's max sync timeout is 120s. Lambda's max is 15 min — but API Gateway in front of Lambda cuts it to 30s. You get a 504 gateway timeout with no output.
- Memory pressure: Streaming a 200MB file through your application, holding the response in memory, and piping it to S3 — all at the same time — requires ~600MB RAM peak. On a 1GB Lambda function, this leaves little headroom.
- No visibility: A slow sync request is a black box. You don't know if it's 10% done or 90% done. If you kill it, you waste the conversion.
The async jobs API solves all three: you submit the file, disconnect, and poll (or receive a webhook) when ready.
Polling pattern: submit → poll → download
import asyncio
import os
from pathlib import Path

import httpx

API_KEY = os.environ['CTF_API_KEY']
JOBS_URL = 'https://changethisfile.com/v1/jobs'
HEADERS = {'Authorization': f'Bearer {API_KEY}'}


async def submit_job(client: httpx.AsyncClient, file_path: Path, target: str) -> str:
    """Submit a conversion job. Returns job_id."""
    # read_bytes() holds the whole file in memory; for very large files,
    # pass an open file object as content to stream the upload instead.
    content = file_path.read_bytes()
    resp = await client.post(
        JOBS_URL,
        headers=HEADERS,
        content=content,
        params={'target': target, 'filename': file_path.name},
        timeout=60,  # upload timeout only
    )
    resp.raise_for_status()
    return resp.json()['job_id']


async def poll_job(
    client: httpx.AsyncClient,
    job_id: str,
    poll_interval: float = 5.0,
    max_wait: float = 600.0,
) -> dict:
    """Poll until the job completes. Returns the final status dict."""
    waited = 0.0
    while waited < max_wait:
        resp = await client.get(
            f'{JOBS_URL}/{job_id}',
            headers=HEADERS,
            timeout=10,
        )
        resp.raise_for_status()
        status = resp.json()
        if status['state'] == 'done':
            return status
        elif status['state'] == 'failed':
            raise RuntimeError(f"Job {job_id} failed: {status.get('error', 'unknown')}")
        elif status['state'] in ('queued', 'processing'):
            # Optional: log progress
            pct = status.get('progress_pct', '?')
            print(f'Job {job_id}: {status["state"]} ({pct}%)')
        await asyncio.sleep(poll_interval)
        waited += poll_interval
    raise TimeoutError(f'Job {job_id} did not complete within {max_wait}s')


async def download_result(client: httpx.AsyncClient, status: dict, out_path: Path):
    """Download the completed job output, streaming to disk."""
    dl_url = status['download_url']  # Pre-signed, expires in 1 hour
    # Stream to disk so a 200MB result never sits in memory
    async with client.stream('GET', dl_url, timeout=300) as resp:
        resp.raise_for_status()
        with out_path.open('wb') as f:
            async for chunk in resp.aiter_bytes():
                f.write(chunk)


async def convert_large_file(file_path: Path, target: str) -> Path:
    out_path = file_path.with_suffix(f'.{target}')
    async with httpx.AsyncClient() as client:
        job_id = await submit_job(client, file_path, target)
        print(f'Submitted job {job_id}')
        status = await poll_job(client, job_id)
        await download_result(client, status, out_path)
    print(f'Saved to {out_path}')
    return out_path


if __name__ == '__main__':
    import sys

    path = Path(sys.argv[1])
    target_fmt = sys.argv[2]
    asyncio.run(convert_large_file(path, target_fmt))
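The fixed 5-second interval is fine for short jobs; for long transcodes, exponential backoff cuts request volume. A drop-in variant of poll_job, sketched against the same endpoint assumptions as above:

async def poll_job_backoff(
    client: httpx.AsyncClient,
    job_id: str,
    max_wait: float = 600.0,
) -> dict:
    """Like poll_job, but with exponential backoff capped at 30s."""
    interval, waited = 2.0, 0.0
    while waited < max_wait:
        resp = await client.get(f'{JOBS_URL}/{job_id}', headers=HEADERS, timeout=10)
        resp.raise_for_status()
        status = resp.json()
        if status['state'] == 'done':
            return status
        if status['state'] == 'failed':
            raise RuntimeError(f"Job {job_id} failed: {status.get('error', 'unknown')}")
        await asyncio.sleep(interval)
        waited += interval
        interval = min(interval * 2, 30.0)  # 2s, 4s, 8s, ... capped at 30s
    raise TimeoutError(f'Job {job_id} did not complete within {max_wait}s')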
For batches of large files, submit all jobs first, then poll them concurrently:
async def batch_large_files(files: list[Path], target: str):
    async with httpx.AsyncClient() as client:
        # Submit all jobs concurrently
        job_ids = await asyncio.gather(
            *[submit_job(client, f, target) for f in files]
        )
        print(f'Submitted {len(job_ids)} jobs')
        # Poll all jobs concurrently
        statuses = await asyncio.gather(
            *[poll_job(client, jid) for jid in job_ids]
        )
        # Download all results concurrently
        await asyncio.gather(
            *[
                download_result(client, status, f.with_suffix(f'.{target}'))
                for f, status in zip(files, statuses)
            ]
        )
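Note that asyncio.gather fails fast: one failed job raises and abandons the rest. Pass return_exceptions=True to gather if you want the surviving jobs to finish. Usage, with the helpers above in scope:

asyncio.run(batch_large_files([Path('talk.mov'), Path('demo.avi')], 'webm'))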
Webhook pattern: no polling
If you control an HTTP server, register a webhook_url when submitting the job. The API POSTs the result to your URL when conversion completes — no polling required.
import json  # needed for webhook_metadata below

resp = await client.post(
    JOBS_URL,
    headers=HEADERS,
    content=file_bytes,
    params={
        'target': 'mp4',
        'filename': 'source.avi',
        'webhook_url': 'https://your-server.com/webhooks/ctf',
        # Echoed back verbatim in the webhook payload
        'webhook_metadata': json.dumps({'user_id': user_id, 'file_id': file_id}),
    },
)
The webhook POST body:
{
  "job_id": "job_abc123",
  "state": "done",
  "download_url": "https://...",
  "metadata": {"user_id": "u_123", "file_id": "f_456"},
  "duration_ms": 12450
}
Validate the webhook signature (see the webhook guide) before processing. Download the file from download_url within 1 hour — signed URLs expire. See the webhook-based async conversion pattern guide for the full receiver implementation.
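The exact header name and signing scheme are defined in the webhook guide; assuming the common pattern of HMAC-SHA256 over the raw request body, with a hypothetical X-CTF-Signature header, verification looks roughly like:

import hashlib
import hmac

def verify_webhook(raw_body: bytes, signature_header: str, secret: str) -> bool:
    # Header name and scheme are assumptions; confirm both in the webhook guide
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)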
Pattern for serverless environments
Lambda, Cloudflare Workers, and similar runtimes have hard request timeouts (15 min Lambda, 30s CF Workers). The sync endpoint isn't viable for large files in these environments. The correct pattern:
- Receive the upload request from the user
- Save the file to S3/R2
- Submit a job to /v1/jobs with a webhook_url pointing at your handler function
- Return a 202 Accepted + job_id to the user
- When the webhook fires, download the result from the signed URL and save back to storage
- Update your database and notify the user (email, push, poll endpoint)
// Cloudflare Worker — submit endpoint
export default {
  async fetch(request, env) {
    const formData = await request.formData();
    const file = formData.get('file');
    const target = formData.get('target');

    // Build the multipart body for the job submission
    const fd = new FormData();
    fd.append('file', file);
    fd.append('target', target);
    fd.append('webhook_url', `${env.WORKER_URL}/webhook/ctf`);

    // Submit async job
    const resp = await fetch('https://changethisfile.com/v1/jobs', {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${env.CTF_API_KEY}`,
      },
      body: fd,
    });
    const { job_id } = await resp.json();

    // Store job_id → user mapping in KV
    await env.JOBS_KV.put(job_id, JSON.stringify({ userId: 'u_123', target }));

    return Response.json({ job_id, status: 'processing' }, { status: 202 });
  },
};
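Steps 5 and 6 live in the webhook handler. Sketched here in Python for consistency with the rest of this guide; save_to_storage, mark_complete, and mark_failed are hypothetical stand-ins for your storage and database layers:

import httpx

async def handle_webhook(payload: dict) -> None:
    """Runs when the conversion webhook fires (payload shape shown earlier)."""
    if payload['state'] != 'done':
        mark_failed(payload['job_id'], payload.get('error'))  # hypothetical DB call
        return
    async with httpx.AsyncClient() as client:
        # Signed URL expires in 1 hour; download immediately
        resp = await client.get(payload['download_url'], timeout=300)
        resp.raise_for_status()
        save_to_storage(payload['metadata']['file_id'], resp.content)  # hypothetical
    mark_complete(payload['job_id'])  # hypothetical: update DB, notify the user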
Error handling and timeouts
Jobs can fail for three reasons: an invalid file (4xx, rejected up front), a conversion engine error (5xx-style job failure), or an engine timeout (the conversion ran past the engine's time budget). Each surfaces as a failed state with an error code. Handle each:
status = resp.json()
if status['state'] == 'failed':
    error = status.get('error', {})
    code = error.get('code', 'unknown')
    if code == 'file_invalid':
        # Bad file — skip and log
        log_failure(job_id, 'corrupted_or_unsupported')
    elif code == 'engine_error':
        # Conversion engine crashed — retry once
        new_job = await submit_job(client, file_path, target)
    elif code == 'timeout':
        # File took too long (very large video); log, or retry a lower-quality target
        log_failure(job_id, 'too_large_for_engine')
For jobs that simply expire (no result within 24h), your polling loop will raise TimeoutError long before that, but only if it's still running: if your poller crashes mid-wait, nothing is watching the job at all. Add the job_id to a persistent store before polling so you can resume monitoring after restarts.
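A minimal version of that persistence with sqlite3; the schema is an assumption, and any durable store works:

import sqlite3

db = sqlite3.connect('jobs.db')
db.execute('CREATE TABLE IF NOT EXISTS jobs (job_id TEXT PRIMARY KEY, state TEXT)')

def record_job(job_id: str) -> None:
    # Persist before the first poll so a crashed poller can resume
    db.execute("INSERT OR IGNORE INTO jobs VALUES (?, 'pending')", (job_id,))
    db.commit()

def pending_jobs() -> list[str]:
    # On startup, resume polling everything that never reached a terminal state
    return [row[0] for row in db.execute("SELECT job_id FROM jobs WHERE state = 'pending'")]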
The async jobs API removes the hard walls that the sync endpoint hits: timeouts, memory pressure, and serverless function duration limits. For files that routinely exceed 20MB or 30s conversion time, the polling or webhook patterns here are the correct architecture. Free API keys include access to the jobs endpoint.