File conversion is one of the worst fits for synchronous serverless functions. A Lambda handler processing a large video with a sync API call will timeout, hit memory limits, or both. The fix is architectural: decouple the three operations — upload, convert, store — into separate functions that each do one thing quickly. This guide shows the complete serverless file conversion architecture, with working code for both Lambda (Python) and Cloudflare Workers (JavaScript).
TL;DR — the four-function architecture
- Upload function (200ms): Generates a presigned S3/R2 upload URL, returns it to the client. The client uploads directly to storage — the function never touches the file bytes.
- Submit function (1-2s): Reads the uploaded file URL, submits a /v1/jobs request with the file URL and webhook callback, and stores the job_id → user mapping. Returns 202 immediately.
- Webhook receiver (100ms): Verifies the signature, stores the result URL or downloads the output. Returns 200 immediately and processes in the background.
- Status function (50ms): User polls this to check conversion status. Returns done/processing/failed + output URL.
Total user-visible latency: upload (client-side) + submit (~2s) + conversion (~5-30s depending on file) + webhook (~1s). The user sees a spinner; the functions never block.
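From the client's side, the Lambda variant of this flow is just three HTTP calls: request an upload URL, upload, then poll. A hypothetical client sketch (the gateway base URL and route names are assumptions, not part of the API):

```python
# Hypothetical client for the four-function flow. The gateway base URL and
# route names are assumptions -- substitute your deployed endpoints.
import time

API = "https://api.example.com"  # placeholder for your API Gateway URL

def poll_interval(attempt: int, base: float = 1.0, cap: float = 10.0) -> float:
    """Exponential backoff for status polling, capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))

def convert(path: str, target: str) -> str:
    import requests  # imported lazily so poll_interval stays dependency-free

    # 1. Ask the upload function for a presigned POST
    up = requests.post(f"{API}/upload-url", json={"target": target}, timeout=10).json()
    # 2. Upload the bytes directly to storage -- no function touches them
    with open(path, "rb") as f:
        requests.post(up["upload"]["url"], data=up["upload"]["fields"],
                      files={"file": f}, timeout=120).raise_for_status()
    # 3. Poll the status function until the job settles
    for attempt in range(30):
        s = requests.get(f"{API}/status", params={"fileId": up["file_id"]}, timeout=10).json()
        if s["status"] == "done":
            return s["downloadUrl"]
        if s["status"] == "failed":
            raise RuntimeError(f"conversion failed: {s.get('error')}")
        time.sleep(poll_interval(attempt))
    raise TimeoutError("conversion did not finish in time")
```

Note there is no /submit call here: in the Lambda design below, the S3 upload event triggers job submission automatically.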
The blocking pattern that breaks serverless
```python
# Lambda — THIS BREAKS for files over 10MB or conversions over 30s
import base64
import json
import os

import requests

API_KEY = os.environ['CTF_API_KEY']

def handler(event, context):
    file_bytes = base64.b64decode(event['body'])  # API Gateway base64-encodes binary bodies
    resp = requests.post(
        'https://changethisfile.com/v1/convert',
        headers={'Authorization': f'Bearer {API_KEY}'},
        files={'file': file_bytes},
        data={'target': 'jpg'},
        timeout=240,  # 4 min — Lambda max is 15 min, but API Gateway cuts it to 29s
    )
    return {'statusCode': 200, 'body': resp.content}  # Large response body hits Lambda limits
```
This fails for three reasons: (1) API Gateway enforces a 29-second integration timeout regardless of Lambda's 15-minute limit; (2) Lambda function responses have a 6MB payload limit, so a large converted file can't be returned inline; (3) you pay for the full Lambda duration even while idling on the conversion engine.
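The third point is worth quantifying. A back-of-envelope sketch (the per-GB-second price is approximate and the memory sizes are assumptions):

```python
# Back-of-envelope Lambda cost of blocking on a conversion vs. the async pattern.
# The per-GB-second price is approximate; check current AWS pricing.
GB_SECOND_PRICE = 0.0000166667  # USD, approx. x86 Lambda price

def blocking_cost(conversions: int, wait_s: float, mem_gb: float = 1.0) -> float:
    """Cost when the handler idles for the whole conversion."""
    return conversions * wait_s * mem_gb * GB_SECOND_PRICE

def async_cost(conversions: int, handler_s: float = 2.0, mem_gb: float = 0.25) -> float:
    """Cost when the handler only submits the job and returns."""
    return conversions * handler_s * mem_gb * GB_SECOND_PRICE
```

At 10,000 conversions averaging a 30-second wait, blocking burns roughly $5/month of pure idle compute versus pennies for a submit-and-return handler. The dollar amounts are small; the blocked concurrency and timeout risk are the real cost.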
Lambda architecture: presigned upload + async job
```python
# Lambda function 1: generate upload URL
import json
import os
import uuid

import boto3

s3 = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')
SESSIONS_TABLE = os.environ['SESSIONS_TABLE']  # DynamoDB table name

def generate_upload_url(event, context):
    """Returns a presigned S3 upload URL. The client uploads directly."""
    file_id = str(uuid.uuid4())
    target = json.loads(event['body'])['target']
    presigned = s3.generate_presigned_post(
        Bucket=os.environ['UPLOAD_BUCKET'],
        Key=f'uploads/{file_id}',
        ExpiresIn=600,
    )
    # Store the pending conversion
    dynamodb.Table(SESSIONS_TABLE).put_item(
        Item={'file_id': file_id, 'target': target, 'status': 'awaiting_upload'}
    )
    return {
        'statusCode': 200,
        'body': json.dumps({'file_id': file_id, 'upload': presigned}),
    }
```
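For completeness, here is how a client might consume what function 1 returns. boto3's generate_presigned_post yields a URL plus form fields that must accompany the file part; a small sketch (the helper name is hypothetical):

```python
# Hypothetical client-side use of the presigned POST returned by function 1.
def build_post_args(presigned: dict, file_bytes: bytes):
    """Shape boto3's {'url', 'fields'} response into (url, form_data, files)."""
    return presigned["url"], dict(presigned["fields"]), {"file": file_bytes}

def upload(presigned: dict, file_bytes: bytes) -> None:
    import requests  # any multipart-capable HTTP client works
    url, fields, files = build_post_args(presigned, file_bytes)
    # The presigned fields must be sent as ordinary form data alongside the file
    resp = requests.post(url, data=fields, files=files, timeout=120)
    resp.raise_for_status()
```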
```python
# Lambda function 2: S3 trigger — submit conversion job when upload completes
import json
import os
from urllib.parse import unquote_plus

import boto3
import httpx

CTF_API_KEY = os.environ['CTF_API_KEY']
WEBHOOK_URL = os.environ['WEBHOOK_URL']
dynamodb = boto3.resource('dynamodb')

def submit_conversion_job(event, context):
    """Triggered by an S3 ObjectCreated event."""
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = unquote_plus(record['s3']['object']['key'])  # S3 URL-encodes keys in events
        file_id = key.split('/')[-1]  # Extract file_id from the key
        # Get the target format from DynamoDB
        table = dynamodb.Table(os.environ['SESSIONS_TABLE'])
        session = table.get_item(Key={'file_id': file_id})['Item']
        # Generate a presigned URL so the API can fetch the file
        presigned_get = boto3.client('s3').generate_presigned_url(
            'get_object',
            Params={'Bucket': bucket, 'Key': key},
            ExpiresIn=3600,
        )
        # Submit async job — returns in <2s
        resp = httpx.post(
            'https://changethisfile.com/v1/jobs',
            headers={'Authorization': f'Bearer {CTF_API_KEY}'},
            data={
                'file_url': presigned_get,
                'target': session['target'],
                'webhook_url': WEBHOOK_URL,
                'webhook_metadata': json.dumps({'file_id': file_id}),
            },
            timeout=10,
        )
        resp.raise_for_status()
        job_id = resp.json()['job_id']
        # Update session ('status' is a DynamoDB reserved word, so alias it)
        table.update_item(
            Key={'file_id': file_id},
            UpdateExpression='SET #s = :s, job_id = :j',
            ExpressionAttributeNames={'#s': 'status'},
            ExpressionAttributeValues={':s': 'processing', ':j': job_id},
        )
```
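Function 2 is easiest to unit-test by feeding it a hand-built S3 event. A minimal sketch of the event shape and the key parsing it relies on (the helper name is mine, not AWS's):

```python
# Minimal sketch of the S3 ObjectCreated event shape function 2 receives,
# plus the key parsing it relies on -- handy for a local unit test.
from urllib.parse import unquote_plus

def parse_record(record: dict):
    """Pull (bucket, key, file_id) out of one S3 event record."""
    bucket = record["s3"]["bucket"]["name"]
    key = unquote_plus(record["s3"]["object"]["key"])  # S3 URL-encodes keys in events
    return bucket, key, key.split("/")[-1]

sample_event = {
    "Records": [{
        "s3": {
            "bucket": {"name": "my-upload-bucket"},
            "object": {"key": "uploads/123e4567-e89b-12d3-a456-426614174000"},
        }
    }]
}
```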
```python
# Lambda function 3: webhook receiver
import base64
import hashlib
import hmac
import json
import os

import boto3
import httpx

dynamodb = boto3.resource('dynamodb')

def receive_webhook(event, context):
    body = event['body'].encode() if isinstance(event['body'], str) else event['body']
    if event.get('isBase64Encoded'):
        body = base64.b64decode(body)
    signature = event['headers'].get('x-ctf-signature', '')
    # Verify signature
    secret = os.environ['CTF_WEBHOOK_SECRET'].encode()
    expected = 'sha256=' + hmac.new(secret, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        return {'statusCode': 401, 'body': 'Unauthorized'}
    payload = json.loads(body)
    file_id = payload.get('metadata', {}).get('file_id')
    table = dynamodb.Table(os.environ['SESSIONS_TABLE'])
    if payload['state'] == 'done':
        # Download result and store in S3
        dl = httpx.get(payload['download_url'], timeout=300)
        boto3.client('s3').put_object(
            Bucket=os.environ['OUTPUT_BUCKET'],
            Key=f'converted/{file_id}',
            Body=dl.content,
        )
        table.update_item(
            Key={'file_id': file_id},
            UpdateExpression='SET #s = :s, output_key = :k',
            ExpressionAttributeNames={'#s': 'status'},  # 'status' is reserved in DynamoDB
            ExpressionAttributeValues={':s': 'done', ':k': f'converted/{file_id}'},
        )
    elif payload['state'] == 'failed':
        table.update_item(
            Key={'file_id': file_id},
            UpdateExpression='SET #s = :s, #e = :e',
            ExpressionAttributeNames={'#s': 'status', '#e': 'error'},
            ExpressionAttributeValues={':s': 'failed', ':e': json.dumps(payload.get('error', {}))},
        )
    return {'statusCode': 200, 'body': json.dumps({'ok': True})}
```
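To exercise the webhook receiver locally, you can forge the signature the same way the sender computes it. The secret and payload below are test values:

```python
# Forge a signed webhook event (sha256= hex HMAC over the raw body) so the
# receiver can be tested without real traffic. Secret/payload are test values.
import hashlib
import hmac
import json

def sign(secret: str, body: bytes) -> str:
    """Compute the signature header value for a given secret and raw body."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"

body = json.dumps({"state": "done", "metadata": {"file_id": "abc"},
                   "download_url": "https://example.com/out"}).encode()
event = {"body": body.decode(), "headers": {"x-ctf-signature": sign("test-secret", body)}}
```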
Cloudflare Workers architecture
```javascript
// Cloudflare Worker — complete file conversion service
// Four routes: /upload-url, /submit, /webhook/ctf, /status
export default {
  async fetch(request, env, ctx) {
    const url = new URL(request.url);
    if (url.pathname === '/upload-url' && request.method === 'POST') {
      return handleUploadUrl(request, env);
    }
    if (url.pathname === '/submit' && request.method === 'POST') {
      return handleSubmit(request, env);
    }
    if (url.pathname === '/webhook/ctf' && request.method === 'POST') {
      return handleWebhook(request, env, ctx);
    }
    if (url.pathname === '/status' && request.method === 'GET') {
      return handleStatus(request, env);
    }
    return new Response('Not found', { status: 404 });
  },
};

async function handleUploadUrl(request, env) {
  const { target, filename } = await request.json();
  const fileId = crypto.randomUUID();
  const r2Key = `uploads/${fileId}/${filename}`;
  // Note: R2 bindings can't generate presigned URLs; presign via R2's
  // S3-compatible API (e.g. aws4fetch) or accept the upload on a Worker PUT route.
  // Store the pending session in KV
  await env.SESSIONS_KV.put(
    fileId,
    JSON.stringify({ target, r2Key, status: 'awaiting_upload' }),
    { expirationTtl: 3600 },
  );
  return Response.json({ fileId, r2Key });
}

async function handleSubmit(request, env) {
  const { fileId } = await request.json();
  const session = JSON.parse((await env.SESSIONS_KV.get(fileId)) || 'null');
  if (!session) return Response.json({ error: 'Session not found' }, { status: 404 });
  // Fetch from R2 and submit to CTF
  const r2Object = await env.CONVERSION_BUCKET.get(session.r2Key);
  if (!r2Object) return Response.json({ error: 'File not uploaded yet' }, { status: 400 });
  const form = new FormData();
  const blob = new Blob([await r2Object.arrayBuffer()]);
  form.append('file', blob, session.r2Key.split('/').pop());
  form.append('target', session.target);
  form.append('webhook_url', `${env.WORKER_URL}/webhook/ctf`);
  form.append('webhook_metadata', JSON.stringify({ fileId }));
  const resp = await fetch('https://changethisfile.com/v1/jobs', {
    method: 'POST',
    headers: { Authorization: `Bearer ${env.CTF_API_KEY}` },
    body: form,
  });
  if (!resp.ok) return Response.json({ error: 'Job submission failed' }, { status: 502 });
  const { job_id } = await resp.json();
  await env.SESSIONS_KV.put(
    fileId,
    JSON.stringify({ ...session, jobId: job_id, status: 'processing' }),
    { expirationTtl: 86400 },
  );
  return Response.json({ fileId, jobId: job_id, status: 'processing' }, { status: 202 });
}

async function handleWebhook(request, env, ctx) {
  const bodyBytes = await request.arrayBuffer();
  const signature = request.headers.get('X-CTF-Signature') || '';
  // Verify the HMAC in constant time via crypto.subtle.verify
  const key = await crypto.subtle.importKey(
    'raw', new TextEncoder().encode(env.CTF_WEBHOOK_SECRET),
    { name: 'HMAC', hash: 'SHA-256' }, false, ['verify'],
  );
  const hex = signature.replace(/^sha256=/, '');
  const sigBytes = new Uint8Array((hex.match(/../g) || []).map((h) => parseInt(h, 16)));
  const valid = await crypto.subtle.verify('HMAC', key, sigBytes, bodyBytes);
  if (!valid) return new Response('Unauthorized', { status: 401 });
  const payload = JSON.parse(new TextDecoder().decode(bodyBytes));
  const { fileId } = payload.metadata || {};
  // Process in background — return 200 immediately
  ctx.waitUntil((async () => {
    const session = JSON.parse((await env.SESSIONS_KV.get(fileId)) || 'null');
    if (!session) return;
    if (payload.state === 'done') {
      const dl = await fetch(payload.download_url);
      const outKey = `converted/${fileId}`;
      await env.CONVERSION_BUCKET.put(outKey, await dl.arrayBuffer());
      await env.SESSIONS_KV.put(
        fileId,
        JSON.stringify({ ...session, status: 'done', outputKey: outKey }),
        { expirationTtl: 86400 },
      );
    } else if (payload.state === 'failed') {
      await env.SESSIONS_KV.put(
        fileId,
        JSON.stringify({ ...session, status: 'failed', error: payload.error }),
        { expirationTtl: 86400 },
      );
    }
  })());
  return Response.json({ ok: true });
}

async function handleStatus(request, env) {
  const fileId = new URL(request.url).searchParams.get('fileId');
  if (!fileId) return Response.json({ error: 'fileId required' }, { status: 400 });
  const session = JSON.parse((await env.SESSIONS_KV.get(fileId)) || 'null');
  if (!session) return Response.json({ error: 'Not found' }, { status: 404 });
  const response = { fileId, status: session.status };
  if (session.status === 'done') {
    // Serve outputs through a CDN or custom domain mapped to the bucket
    response.downloadUrl = `https://your-cdn.com/${session.outputKey}`;
  }
  return Response.json(response);
}
```
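The Worker above assumes several bindings. A wrangler.toml sketch might look like this (names, IDs, and the compatibility date are placeholders):

```toml
name = "file-converter"
main = "src/index.js"
compatibility_date = "2024-01-01"

[vars]
WORKER_URL = "https://file-converter.example.workers.dev"

[[kv_namespaces]]
binding = "SESSIONS_KV"
id = "<your-kv-namespace-id>"

[[r2_buckets]]
binding = "CONVERSION_BUCKET"
bucket_name = "conversion-files"

# Secrets (set with `wrangler secret put`): CTF_API_KEY, CTF_WEBHOOK_SECRET
```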
Cost and latency comparison
Serverless vs self-hosted conversion costs for 10,000 conversions/month (avg 5MB files):
| Option | Cost/month | Setup time | Maintenance |
|---|---|---|---|
| CTF API (Startup plan) | $99 | 1 day | Zero |
| Lambda + S3 + EC2 FFmpeg | $40-80 | 1 week | High (EC2, FFmpeg, security updates) |
| Self-hosted on VPS | $20-40 | 2 weeks | High (uptime, storage, codec support) |
The API cost premium buys zero maintenance and 690 supported format pairs. At under 50K conversions/month, it's cost-competitive with self-hosted after factoring in engineering time.
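The "factoring in engineering time" claim can be made concrete. Assuming a few maintenance hours per month at a typical contractor rate (both figures are assumptions, not from the table):

```python
# Break-even sketch: managed API vs. self-hosted once labor is priced in.
# Maintenance hours and hourly rate are illustrative assumptions.
API_COST = 99.0  # CTF Startup plan, per month (from the table above)

def self_hosted_total(infra: float = 60.0, maint_hours: float = 4.0,
                      rate: float = 75.0) -> float:
    """Monthly self-hosted cost: infrastructure plus maintenance labor."""
    return infra + maint_hours * rate
```

With roughly $60 of infrastructure and 4 hours of upkeep at $75/hour, self-hosting lands around $360/month, well above the $99 plan; the math only flips if maintenance is genuinely near zero.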
The submit-job-and-respond-immediately pattern works with any serverless runtime. The key insight is that file conversion is fundamentally async: the user doesn't need the result in the same HTTP response — they need it soon. Webhook plus status polling gives you that with zero blocking, and the free tier is enough to test the async jobs flow end to end.