The standard advice for containerized media processing is: install FFmpeg, LibreOffice, and Calibre in your image. That image ends up at 2-4GB, takes 10+ minutes to build, and requires patching when CVEs drop in those tools. If you're converting files as part of a pipeline — not in the hot path of a web request — there's a better approach: put only curl in the image, and delegate the actual conversion to an API.
The result is a ~20MB Alpine image that builds in under 30 seconds and has zero conversion-engine attack surface. The tradeoff is network latency per file and a dependency on the external API — which is fine for batch pipelines but not for sub-second interactive conversion.
TL;DR
Three files: a Dockerfile (Alpine + curl + bash), a conversion script, and a docker-compose.yml. Mount ./input and ./output as volumes. Inject the API key via environment variable. Run with docker compose up.
# Run one batch
docker compose up --abort-on-container-exit
# Run as a daemon that re-scans the input directory every 5 minutes (requires RUN_ONCE=false)
docker compose up -d
The use case
This pattern fits CI/CD pipelines and batch processing jobs where you need file conversion as a step in a larger workflow:
- A GitHub Actions job that converts design assets (PNG exports from Figma) to WebP before deploying a site
- A data pipeline that normalizes uploaded documents (DOCX, ODS, RTF) to PDF before archiving
- A media processing step in a Docker Compose stack that converts raw video uploads to MP4 before the transcription service picks them up
- A scheduled container that processes an S3-synced input directory nightly
In all of these cases, you want the conversion logic isolated in a container with a clean interface: files go into one directory and come out converted in another. The container shouldn't need OS-level media tools; they're a maintenance burden, and the ChangeThisFile API covers 690 conversion routes, including all of the FFmpeg-, LibreOffice-, and Calibre-backed ones.
Note: this pattern is for batch/async jobs. If you need conversion inline with a web request (sub-second response required), a direct API call from your app server is more appropriate than spinning up a container.
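For comparison, the inline path is a single request. A minimal sketch using the same endpoint, auth header, and form fields as the container script below (file names are placeholders):
# Direct synchronous conversion from an app server or shell, no container needed
curl -sf \
  --max-time 180 \
  -H "Authorization: Bearer $CTF_API_KEY" \
  -F "file=@report.docx" \
  -F "target=pdf" \
  -o report.pdf \
  https://changethisfile.com/v1/convert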
Dockerfile, conversion script, and docker-compose
1. Dockerfile
# Dockerfile
# Minimal Alpine image with curl + bash only.
# No FFmpeg, no LibreOffice — conversion delegated to ChangeThisFile API.
FROM alpine:3.19
RUN apk add --no-cache bash curl
WORKDIR /app
COPY convert.sh /app/convert.sh
RUN chmod +x /app/convert.sh
# Volumes for input/output (also settable via env)
VOLUME ["/input", "/output"]
# State directory for done log and logs
RUN mkdir -p /state /logs
CMD ["/app/convert.sh"]
2. Conversion script — save as convert.sh alongside the Dockerfile:
#!/usr/bin/env bash
# convert.sh — Runs inside the container.
# Loops over /input, converts each file via the ChangeThisFile API,
# writes results to /output.
set -euo pipefail
# ---- config (all injectable via env) ----------------------------------------
API_KEY="${CTF_API_KEY:?CTF_API_KEY not set}"
INPUT_DIR="${INPUT_DIR:-/input}"
OUTPUT_DIR="${OUTPUT_DIR:-/output}"
TARGET_FORMAT="${TARGET_FORMAT:-pdf}"
API_URL="https://changethisfile.com/v1/convert"
DONE_LOG="/state/converted.log"
LOG="/logs/convert.log"
RUN_ONCE="${RUN_ONCE:-true}" # Set to "false" to loop continuously
LOOP_INTERVAL="${LOOP_INTERVAL:-300}" # Seconds between loops
# ------------------------------------------------------------------------------
log() { echo "[$(date -u +%Y-%m-%dT%H:%M:%SZ)] $*" | tee -a "$LOG"; }
mkdir -p "$OUTPUT_DIR" "$(dirname "$DONE_LOG")" "$(dirname "$LOG")"
touch "$DONE_LOG"
run_conversion_pass() {
  local converted=0 skipped=0 failed=0
  for f in "$INPUT_DIR"/*; do
    [[ -f "$f" ]] || continue
    local filename stem out http_status out_size
    filename=$(basename "$f")
    stem="${filename%.*}"
    out="$OUTPUT_DIR/${stem}.${TARGET_FORMAT}"
    # -x matches the whole line, so "a.png" can't false-match "data.png"
    if grep -qxF "$filename" "$DONE_LOG" 2>/dev/null; then
      log "SKIP $filename"
      ((skipped++)) || true
      continue
    fi
    log "CONVERT $filename -> ${stem}.${TARGET_FORMAT}"
    http_status=$(curl -sf \
      --max-time 180 \
      --retry 3 \
      --retry-delay 5 \
      --retry-connrefused \
      -w "%{http_code}" \
      -o "$out" \
      -H "Authorization: Bearer $API_KEY" \
      -F "file=@$f" \
      -F "target=$TARGET_FORMAT" \
      "$API_URL") || {
      rm -f "$out"  # Discard any partial download
      log "ERROR curl failed for $filename"
      ((failed++)) || true
      continue
    }
    if [[ "$http_status" == "200" ]]; then
      echo "$filename" >> "$DONE_LOG"
      out_size=$(stat -c%s "$out" 2>/dev/null || echo "unknown")
      log "OK $filename -> ${stem}.${TARGET_FORMAT} (${out_size} bytes)"
      ((converted++)) || true
    else
      rm -f "$out"
      log "ERROR HTTP $http_status for $filename"
      ((failed++)) || true
    fi
  done
  log "PASS DONE: converted=$converted skipped=$skipped failed=$failed"
  [[ $failed -eq 0 ]]  # Pass status: non-zero if anything failed
}
if [[ "$RUN_ONCE" == "true" ]]; then
run_conversion_pass
else
log "Starting continuous mode (interval: ${LOOP_INTERVAL}s)"
while true; do
run_conversion_pass || true # Don't exit loop on failure
log "Sleeping ${LOOP_INTERVAL}s..."
sleep "$LOOP_INTERVAL"
done
fi
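The image can be smoke-tested before adding Compose; a minimal run with a placeholder key and local directories:
# One-off pass without compose
docker build -t ctf-converter .
docker run --rm \
  -e CTF_API_KEY="ctf_sk_your_key_here" \
  -v "$PWD/input:/input:ro" \
  -v "$PWD/output:/output" \
  ctf-converter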
3. docker-compose.yml
services:
converter:
build: .
image: ctf-converter:latest
restart: "no" # Change to "unless-stopped" for daemon mode
environment:
CTF_API_KEY: "${CTF_API_KEY}"
TARGET_FORMAT: "${TARGET_FORMAT:-pdf}"
RUN_ONCE: "true"
volumes:
# Input files to convert
- ./input:/input:ro
# Converted output files
- ./output:/output
# Persistent done log (survives container restarts)
- ctf-state:/state
# Logs (inspect with: docker compose logs or tail ./logs/convert.log)
- ./logs:/logs
# Optional: resource limits to prevent runaway containers
deploy:
resources:
limits:
memory: 256M
cpus: '0.5'
volumes:
ctf-state:
Create a .env file (not committed to version control):
CTF_API_KEY=ctf_sk_your_key_here
TARGET_FORMAT=pdf
Build and run:
# Build the image
docker compose build
# Drop files into ./input, then run one conversion pass
docker compose up --abort-on-container-exit
# Check output
ls ./output/
Error handling and retry strategy
The --retry 3 --retry-delay 5 --retry-connrefused flags on the curl call handle transient failures without any shell-level loop: --retry covers timeouts and transient HTTP errors (408, 429, and most 5xx responses), and --retry-connrefused extends that to refused connections.
For more control, such as retrying any non-200 response with an increasing backoff, replace the curl retry flags with an explicit retry function:
api_convert() {
local input_file="$1" output_file="$2" target="$3"
local attempt max_attempts=4 delay=5
for attempt in $(seq 1 $max_attempts); do
local status
status=$(curl -sf \
--max-time 180 \
-w "%{http_code}" \
-o "$output_file" \
-H "Authorization: Bearer $API_KEY" \
-F "file=@$input_file" \
-F "target=$target" \
"$API_URL") && [[ "$status" == "200" ]] && return 0
log "Attempt $attempt/$max_attempts failed (status: ${status:-curl-error}) for $(basename "$input_file")"
rm -f "$output_file"
[[ $attempt -lt $max_attempts ]] && sleep $((delay * attempt))
done
return 1
}
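Wired into run_conversion_pass, the inline curl block then collapses to one call per file, for example:
if api_convert "$f" "$out" "$TARGET_FORMAT"; then
  echo "$filename" >> "$DONE_LOG"
  ((converted++)) || true
else
  ((failed++)) || true
fi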
Exit codes matter for CI integration. The script exits non-zero if any conversion failed. To propagate that status to the calling shell, run docker compose up --exit-code-from converter (which implies --abort-on-container-exit); compose then exits with the converter container's status and your CI step fails cleanly.
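As a one-shot CI step (the service name converter comes from the compose file above):
# Propagate the converter's exit status to the CI job
docker compose up --exit-code-from converter
echo "converter exited with $?"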
For daemon mode (RUN_ONCE=false), failures are logged but the loop continues. Use Docker's healthcheck to surface persistent failures:
healthcheck:
  # $$ keeps Compose from interpolating the command substitution
  test: ["CMD", "bash", "-c", "[[ $$(tail -1 /logs/convert.log) != *ERROR* ]]"]
  interval: 5m
  timeout: 10s
  retries: 3
  start_period: 30s
Scheduling and orchestration patterns
Three common scheduling patterns for this container:
1. One-shot in CI (GitHub Actions)
- name: Convert assets
run: |
docker compose run --rm \
-e CTF_API_KEY=${{ secrets.CTF_API_KEY }} \
-e TARGET_FORMAT=webp \
converter
2. Cron-triggered via docker compose run
# crontab -e
0 3 * * * cd /opt/converter && docker compose run --rm converter >> /var/log/ctf-cron.log 2>&1
3. Continuous daemon with sidecar
Set RUN_ONCE=false and LOOP_INTERVAL=300 for a container that polls every 5 minutes. The example below tightens the interval to 60 seconds and pairs the converter with an upstream ingest service that writes into a shared volume:
services:
converter:
build: .
environment:
CTF_API_KEY: "${CTF_API_KEY}"
RUN_ONCE: "false"
LOOP_INTERVAL: "60"
restart: unless-stopped
volumes:
- shared-input:/input:ro
- ./output:/output
- ctf-state:/state
# Upstream service that populates /input
ingest:
image: your-ingest-service:latest
volumes:
- shared-input:/input
volumes:
shared-input:
ctf-state:
The shared-input named volume lets the ingest service write files that the converter picks up. No network calls between containers needed.
Production tips
- The image is ~20MB. Alpine + bash + curl. Compare this to a full FFmpeg image (~600MB) or LibreOffice image (~1.5GB). Build time is 15-30 seconds vs 5-10 minutes. If your pipeline builds images on every CI run, this difference compounds fast.
- Mount input as read-only (:ro). The conversion script only reads source files; it never writes to the input directory, and a read-only mount prevents accidental corruption if a bug writes to the wrong path.
- The done log lives in a named volume, not the output directory. Named volumes survive docker compose down && docker compose up, and keeping the log out of ./output means you can clear the output directory without losing the record of what was already converted.
- Secret injection via .env, never in the image. CTF_API_KEY is passed at runtime via the .env file or CI secrets; it's never baked into an image layer, so docker history won't expose it.
- Free tier: 1,000 conversions/month. A batch pipeline processing ~30 files/day (roughly 900/month) stays within the free tier. For high-volume pipelines, the $99/mo plan gives 100K conversions; at about $0.001/conversion, that's cheaper than running a dedicated LibreOffice instance on a $20/mo VPS.
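To spot-check the docker history claim, grep the image's build history for the key prefix (image tag and key prefix taken from the examples above; expect no match):
# The API key should never appear in any layer's creation command
docker history --no-trunc ctf-converter:latest | grep -i ctf_sk \
  && echo "LEAK: key found in image history" \
  || echo "OK: no key in image layers"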
A curl-only Docker image eliminates the 2-4GB FFmpeg/LibreOffice maintenance burden while supporting 690 conversion routes via the ChangeThisFile API. The pattern scales from one-shot CI steps to continuously-running conversion daemons with no architecture change — just flip RUN_ONCE. Get a free API key to wire it up today.