The standard advice for containerized media processing is: install FFmpeg, LibreOffice, and Calibre in your image. That image ends up at 2-4GB, takes 10+ minutes to build, and requires patching when CVEs drop in those tools. If you're converting files as part of a pipeline — not in the hot path of a web request — there's a better approach: put only curl in the image, and delegate the actual conversion to an API.
The result is a ~20MB Alpine image that builds in under 30 seconds and has zero conversion-engine attack surface. The tradeoff is network latency per file and a dependency on the external API — which is fine for batch pipelines but not for sub-second interactive conversion.
TL;DR
Three files: a Dockerfile (Alpine + curl + bash), a conversion script, and a docker-compose.yml. Mount ./input and ./output as volumes. Inject the API key via environment variable. Run with docker compose up.
# Run one batch
docker compose up --abort-on-container-exit
# Run as a daemon that re-scans the input directory every 5 minutes (requires RUN_ONCE=false)
docker compose up -d
The use case
This pattern fits CI/CD pipelines and batch processing jobs where you need file conversion as a step in a larger workflow:
- A GitHub Actions job that converts design assets (PNG exports from Figma) to WebP before deploying a site
- A data pipeline that normalizes uploaded documents (DOCX, ODS, RTF) to PDF before archiving
- A media processing step in a Docker Compose stack that converts raw video uploads to MP4 before the transcription service picks them up
- A scheduled container that processes an S3-synced input directory nightly
In all of these cases, you want the conversion logic isolated in a container with a clean interface: files go into one directory and come out converted in another. The container shouldn't need OS-level media tools; they're a maintenance burden, and the ChangeThisFile API covers 690 conversion routes, including all of the FFmpeg-, LibreOffice-, and Calibre-backed ones.
Note: this pattern is for batch/async jobs. If you need conversion inline with a web request (sub-second response required), a direct API call from your app server is more appropriate than spinning up a container.
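For comparison, the inline path is a single request. A minimal sketch using the same endpoint, auth header, and form fields as the container script below (file names are placeholders):
# Direct synchronous conversion from an app server or shell, no container needed
curl -sf \
  --max-time 180 \
  -H "Authorization: Bearer $CTF_API_KEY" \
  -F "file=@report.docx" \
  -F "target=pdf" \
  -o report.pdf \
  https://changethisfile.com/v1/convert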
Dockerfile, conversion script, and docker-compose
1. Dockerfile
# Dockerfile
# Minimal Alpine image with curl + bash only.
# No FFmpeg, no LibreOffice — conversion delegated to ChangeThisFile API.
FROM alpine:3.19
RUN apk add --no-cache bash curl
WORKDIR /app
COPY convert.sh /app/convert.sh
RUN chmod +x /app/convert.sh
# Volumes for input/output (also settable via env)
VOLUME ["/input", "/output"]
# State directory for done log and logs
RUN mkdir -p /state /logs
CMD ["/app/convert.sh"]
2. Conversion script — save as convert.sh alongside the Dockerfile:
#!/usr/bin/env bash
# convert.sh — Runs inside the container.
# Loops over /input, converts each file via the ChangeThisFile API,
# writes results to /output.
set -euo pipefail
# ---- config (all injectable via env) ----------------------------------------
API_KEY="${CTF_API_KEY:?CTF_API_KEY not set}"
INPUT_DIR="${INPUT_DIR:-/input}"
OUTPUT_DIR="${OUTPUT_DIR:-/output}"
TARGET_FORMAT="${TARGET_FORMAT:-pdf}"
API_URL="https://changethisfile.com/v1/convert"
DONE_LOG="/state/converted.log"
LOG="/logs/convert.log"
RUN_ONCE="${RUN_ONCE:-true}" # Set to "false" to loop continuously
LOOP_INTERVAL="${LOOP_INTERVAL:-300}" # Seconds between loops
# ------------------------------------------------------------------------------
log() { echo "[$(date -u +%Y-%m-%dT%H:%M:%SZ)] $*" | tee -a "$LOG"; }
mkdir -p "$OUTPUT_DIR" "$(dirname "$DONE_LOG")" "$(dirname "$LOG")"
touch "$DONE_LOG"
run_conversion_pass() {
  local converted=0 skipped=0 failed=0
  for f in "$INPUT_DIR"/*; do
    [[ -f "$f" ]] || continue
    local filename stem out http_status out_size
    filename=$(basename "$f")
    stem="${filename%.*}"
    out="$OUTPUT_DIR/${stem}.${TARGET_FORMAT}"
    # -x matches the whole line, so "a.png" can't false-match "data.png"
    if grep -qxF "$filename" "$DONE_LOG" 2>/dev/null; then
      log "SKIP $filename"
      ((skipped++)) || true
      continue
    fi
    log "CONVERT $filename -> ${stem}.${TARGET_FORMAT}"
    http_status=$(curl -sf \
      --max-time 180 \
      --retry 3 \
      --retry-delay 5 \
      --retry-connrefused \
      -w "%{http_code}" \
      -o "$out" \
      -H "Authorization: Bearer $API_KEY" \
      -F "file=@$f" \
      -F "target=$TARGET_FORMAT" \
      "$API_URL") || {
      rm -f "$out"  # Discard any partial download
      log "ERROR curl failed for $filename"
      ((failed++)) || true
      continue
    }
    if [[ "$http_status" == "200" ]]; then
      echo "$filename" >> "$DONE_LOG"
      out_size=$(stat -c%s "$out" 2>/dev/null || echo "unknown")
      log "OK $filename -> ${stem}.${TARGET_FORMAT} (${out_size} bytes)"
      ((converted++)) || true
    else
      rm -f "$out"
      log "ERROR HTTP $http_status for $filename"
      ((failed++)) || true
    fi
  done
  log "PASS DONE: converted=$converted skipped=$skipped failed=$failed"
  [[ $failed -eq 0 ]]  # Pass status: non-zero if anything failed
}
if [[ "$RUN_ONCE" == "true" ]]; then
run_conversion_pass
else
log "Starting continuous mode (interval: ${LOOP_INTERVAL}s)"
while true; do
run_conversion_pass || true # Don't exit loop on failure
log "Sleeping ${LOOP_INTERVAL}s..."
sleep "$LOOP_INTERVAL"
done
fi
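The image can be smoke-tested before adding Compose; a minimal run with a placeholder key and local directories:
# One-off pass without compose
docker build -t ctf-converter .
docker run --rm \
  -e CTF_API_KEY="ctf_sk_your_key_here" \
  -v "$PWD/input:/input:ro" \
  -v "$PWD/output:/output" \
  ctf-converter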
3. docker-compose.yml
services:
converter:
build: .
image: ctf-converter:latest
restart: "no" # Change to "unless-stopped" for daemon mode
environment:
CTF_API_KEY: "${CTF_API_KEY}"
TARGET_FORMAT: "${TARGET_FORMAT:-pdf}"
RUN_ONCE: "true"
volumes:
# Input files to convert
- ./input:/input:ro
# Converted output files
- ./output:/output
# Persistent done log (survives container restarts)
- ctf-state:/state
# Logs (inspect with: docker compose logs or tail ./logs/convert.log)
- ./logs:/logs
# Optional: resource limits to prevent runaway containers
deploy:
resources:
limits:
memory: 256M
cpus: '0.5'
volumes:
ctf-state:
Create a .env file (not committed to version control):
CTF_API_KEY=ctf_sk_your_key_here
TARGET_FORMAT=pdf
Build and run:
# Build the image
docker compose build
# Drop files into ./input, then run one conversion pass
docker compose up --abort-on-container-exit
# Check output
ls ./output/
Error handling and retry strategy
The --retry 3 --retry-delay 5 --retry-connrefused flags on the curl call handle transient failures without any shell-level loop: --retry covers timeouts and transient HTTP errors (408, 429, and most 5xx responses), and --retry-connrefused extends that to refused connections.
For more control, such as retrying any non-200 response with an increasing backoff, replace the curl retry flags with an explicit retry function:
api_convert() {
local input_file="$1" output_file="$2" target="$3"
local attempt max_attempts=4 delay=5
for attempt in $(seq 1 $max_attempts); do
local status
status=$(curl -sf \
--max-time 180 \
-w "%{http_code}" \
-o "$output_file" \
-H "Authorization: Bearer $API_KEY" \
-F "file=@$input_file" \
-F "target=$target" \
"$API_URL") && [[ "$status" == "200" ]] && return 0
log "Attempt $attempt/$max_attempts failed (status: ${status:-curl-error}) for $(basename "$input_file")"
rm -f "$output_file"
[[ $attempt -lt $max_attempts ]] && sleep $((delay * attempt))
done
return 1
}
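Wired into run_conversion_pass, the inline curl block then collapses to one call per file, for example:
if api_convert "$f" "$out" "$TARGET_FORMAT"; then
  echo "$filename" >> "$DONE_LOG"
  ((converted++)) || true
else
  ((failed++)) || true
fi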
Exit codes matter for CI integration. The script exits non-zero if any conversion failed. To propagate that status to the calling shell, run docker compose up --exit-code-from converter (which implies --abort-on-container-exit); compose then exits with the converter container's status and your CI step fails cleanly.
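As a one-shot CI step (the service name converter comes from the compose file above):
# Propagate the converter's exit status to the CI job
docker compose up --exit-code-from converter
echo "converter exited with $?"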
For daemon mode (RUN_ONCE=false), failures are logged but the loop continues. Use Docker's healthcheck to surface persistent failures:
healthcheck:
  # $$ keeps Compose from interpolating the command substitution
  test: ["CMD", "bash", "-c", "[[ $$(tail -1 /logs/convert.log) != *ERROR* ]]"]
  interval: 5m
  timeout: 10s
  retries: 3
  start_period: 30s
Scheduling and orchestration patterns
Three common scheduling patterns for this container:
1. One-shot in CI (GitHub Actions)
- name: Convert assets
run: |
docker compose run --rm \
-e CTF_API_KEY=${{ secrets.CTF_API_KEY }} \
-e TARGET_FORMAT=webp \
converter
2. Cron-triggered via docker compose run
# crontab -e
0 3 * * * cd /opt/converter && docker compose run --rm converter >> /var/log/ctf-cron.log 2>&1
3. Continuous daemon with sidecar
Set RUN_ONCE=false and LOOP_INTERVAL=300 for a container that polls every 5 minutes. The example below tightens the interval to 60 seconds and pairs the converter with an upstream ingest service that writes into a shared volume:
services:
converter:
build: .
environment:
CTF_API_KEY: "${CTF_API_KEY}"
RUN_ONCE: "false"
LOOP_INTERVAL: "60"
restart: unless-stopped
volumes:
- shared-input:/input:ro
- ./output:/output
- ctf-state:/state
# Upstream service that populates /input
ingest:
image: your-ingest-service:latest
volumes:
- shared-input:/input
volumes:
shared-input:
ctf-state:
The shared-input named volume lets the ingest service write files that the converter picks up. No network calls between containers needed.
Production tips
- The image is ~20MB. Alpine + bash + curl. Compare this to a full FFmpeg image (~600MB) or LibreOffice image (~1.5GB). Build time is 15-30 seconds vs 5-10 minutes. If your pipeline builds images on every CI run, this difference compounds fast.
- Mount input as read-only (:ro). The conversion script only reads source files; it never writes to the input directory, and a read-only mount prevents accidental corruption if a bug writes to the wrong path.
- The done log lives in a named volume, not the output directory. Named volumes survive docker compose down && docker compose up, and keeping the log out of ./output means you can clear the output directory without losing the record of what was already converted.
- Secret injection via .env, never in the image. CTF_API_KEY is passed at runtime via the .env file or CI secrets; it's never baked into an image layer, so docker history won't expose it.
- Free tier: 1,000 conversions/month. A batch pipeline processing ~30 files/day (roughly 900/month) stays within the free tier. For high-volume pipelines, the $99/mo plan gives 100K conversions; at about $0.001/conversion, that's cheaper than running a dedicated LibreOffice instance on a $20/mo VPS.
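To spot-check the docker history claim, grep the image's build history for the key prefix (image tag and key prefix taken from the examples above; expect no match):
# The API key should never appear in any layer's creation command
docker history --no-trunc ctf-converter:latest | grep -i ctf_sk \
  && echo "LEAK: key found in image history" \
  || echo "OK: no key in image layers"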
A curl-only Docker image eliminates the 2-4GB FFmpeg/LibreOffice maintenance burden while supporting 690 conversion routes via the ChangeThisFile API. The pattern scales from one-shot CI steps to continuously-running conversion daemons with no architecture change — just flip RUN_ONCE. Get a free API key to wire it up today.