Do I need to install requests or any other library?

No. The function uses only Python's built-in urllib.request for the API call and boto3 for S3 operations. boto3 is included in every Lambda Python runtime. The deployment package is the single lambda_function.py file — no pip install needed.

How do I handle S3 event notification delays?

S3 event notifications are typically delivered within seconds of a PUT completing. For time-sensitive pipelines, note that S3 event delivery is eventually consistent and can occasionally have delays of up to a few minutes. If you need guaranteed sub-minute latency, use SQS with a short visibility timeout rather than direct Lambda invocation.

Can I filter to only convert specific file types?

Two layers of filtering: S3 event suffix filters (e.g., .docx) and the SOURCE_EXT_FILTER Lambda environment variable. S3 suffix filters are more efficient because they prevent the Lambda from being invoked at all for non-matching files. The Lambda-level filter is useful when you need multiple-extension matching that S3 suffix filters don't support (S3 only allows one suffix per configuration).

What happens if the output bucket doesn't exist?

The s3.put_object call will fail with a NoSuchBucket error, which will be logged and re-raised, triggering Lambda's retry logic. Create the output bucket before deploying the function.

How do I test the Lambda locally before deploying?

Use a mock S3 event payload. The Records structure is well-documented in AWS docs. You can invoke locally with: python3 -c "import lambda_function; lambda_function.lambda_handler({'Records': [{'s3': {'bucket': {'name': 'test-bucket'}, 'object': {'key': 'uploads/test.docx'}}}]}, None)". Requires valid CTF_API_KEY, AWS credentials, and the test file in S3.

S3 Event → Lambda → ChangeThisFile API: Automated File Conversion Pipeline

S3 event notifications are one of the cleanest serverless triggers available: drop a file, something runs. The standard pattern for format conversion uses S3 → SNS/SQS → Lambda with an FFmpeg Lambda layer or a container image. That's 2-4GB of Lambda deployment package, cold start times measured in seconds, and a Lambda layer that needs updating whenever FFmpeg releases a CVE fix.

An alternative: a 20KB Lambda function that calls the ChangeThisFile API. Same event-driven trigger, same S3-to-S3 result flow, but the Lambda deployment is tiny and there are no native media tools to maintain. The tradeoff is an outbound HTTPS call per conversion — Lambda pays egress, you pay for API conversions.

TL;DR

S3 PUT → Lambda (Python) → ChangeThisFile API → S3 PUT (converted result). The Lambda function is ~60 lines of Python. Deployment package is ~20KB. No Lambda layers needed.

# Deploy with AWS CLI
zip function.zip lambda_function.py
aws lambda update-function-code \
  --function-name ctf-converter \
  --zip-file fileb://function.zip

The use case

S3-based conversion pipelines appear in:

Document ingest pipelines. Users upload DOCX/ODS/RTF to s3://company-uploads/raw/. Lambda converts to PDF, writes to s3://company-uploads/processed/. Downstream indexers only consume PDFs.
Image optimization for CDN delivery. Build pipeline pushes PNG assets to s3://assets-input/. Lambda converts to WebP, writes to s3://assets-cdn/webp/. CloudFront serves from the output bucket.
Media format normalization. Audio recordings land as WAV or AIFF in an S3 bucket. Lambda normalizes to MP3 for podcast RSS feeds and streaming.
Automated ebook format conversion. Authors upload EPUB manuscripts to S3. Lambda converts to MOBI, AZW3, and PDF for multi-format distribution.

The ChangeThisFile API covers all 690 routes, so a single Lambda function handles the entire range of formats your pipeline might encounter. You just change the target parameter.

Lambda function, IAM policy, and S3 event configuration

1. Lambda function — save as lambda_function.py:

"""lambda_function.py

S3 PUT → ChangeThisFile API → S3 PUT (converted result)

Environment variables:
  CTF_API_KEY       - ChangeThisFile API key (required)
  TARGET_FORMAT     - Output format, e.g. pdf, webp, mp3 (default: pdf)
  OUTPUT_BUCKET     - Destination S3 bucket (default: same as input bucket)
  OUTPUT_PREFIX     - Key prefix for converted files (default: converted/)
  SOURCE_EXT_FILTER - Comma-separated list of extensions to process (default: all)
                      e.g. "docx,odt,rtf" to only convert document files
"""

import json
import logging
import os
import urllib.request
import urllib.error
from io import BytesIO

import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)

APIURl = "https://changethisfile.com/v1/convert"


def get_env(key: str, required: bool = True, default: str = "") -> str:
    val = os.environ.get(key, default)
    if required and not val:
        raise ValueError(f"Environment variable {key} is not set")
    return val


def build_multipart(file_bytes: bytes, filename: str, target: str):
    """Build a multipart/form-data body without external libraries."""
    boundary = b"----CTFBoundary7MA4YWxkTrZu0gW"

    def part(name: str, value: bytes, fname: str = "") -> bytes:
        header = (
            f'--{boundary.decode()}\r\n'
            f'Content-Disposition: form-data; name="{name}"'
        )
        if fname:
            header += f'; filename="{fname}"'
        header += "\r\nContent-Type: application/octet-stream\r\n\r\n"
        return header.encode() + value + b"\r\n"

    body = (
        part("file", file_bytes, fname=filename) +
        part("target", target.encode()) +
        f"--{boundary.decode()}--\r\n".encode()
    )
    content_type = f"multipart/form-data; boundary={boundary.decode()}"
    return body, content_type


def convert_file(file_bytes: bytes, filename: str, target: str, api_key: str) -> bytes:
    """POST file to ChangeThisFile API, return converted bytes."""
    body, content_type = build_multipart(file_bytes, filename, target)

    req = urllib.request.Request(
        APIURl,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
        method="POST",
    )

    with urllib.request.urlopen(req, timeout=180) as response:
        if response.status != 200:
            raise RuntimeError(f"API returned HTTP {response.status}")
        return response.read()


def lambda_handler(event, context):
    api_key = get_env("CTF_API_KEY")
    target_format = get_env("TARGET_FORMAT", required=False, default="pdf")
    output_prefix = get_env("OUTPUT_PREFIX", required=False, default="converted/")
    source_ext_filter_raw = get_env("SOURCE_EXT_FILTER", required=False)
    source_ext_filter = (
        {e.strip().lower() for e in source_ext_filter_raw.split(",")}
        if source_ext_filter_raw else set()
    )

    s3 = boto3.client("s3")
    results = []

    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        filename = key.split("/")[-1]
        stem, ext = (filename.rsplit(".", 1) + [""])[:2]
        ext = ext.lower()

        # Apply extension filter
        if source_ext_filter and ext not in source_ext_filter:
            logger.info("Skipping %s (extension %s not in filter)", key, ext)
            continue

        output_bucket = get_env("OUTPUT_BUCKET", required=False, default=bucket)
        output_key = f"{output_prefix.rstrip('/')}/{stem}.{target_format}"

        logger.info("Processing s3://%s/%s -> s3://%s/%s",
                    bucket, key, output_bucket, output_key)

        try:
            # Download source file from S3
            response = s3.get_object(Bucket=bucket, Key=key)
            file_bytes = response["Body"].read()
            logger.info("Downloaded %d bytes from s3://%s/%s", len(file_bytes), bucket, key)

            # Convert via ChangeThisFile API
            converted_bytes = convert_file(file_bytes, filename, target_format, api_key)
            logger.info("Converted %s: %d bytes -> %d bytes",
                        filename, len(file_bytes), len(converted_bytes))

            # Upload converted file to S3
            s3.put_object(
                Bucket=output_bucket,
                Key=output_key,
                Body=converted_bytes,
                # Optional: tag for traceability
                Tagging=f"source-bucket={bucket}&source-key={key}&converted-from={ext}&converted-to={target_format}",
            )
            logger.info("Uploaded converted file to s3://%s/%s", output_bucket, output_key)

            results.append({"status": "ok", "source": key, "output": output_key})

        except urllib.error.HTTPError as e:
            logger.error("API HTTP error %d for %s: %s", e.code, key, e.read())
            results.append({"status": "error", "source": key, "error": f"HTTP {e.code}"})
            raise  # Re-raise to trigger Lambda retry from SQS/SNS

        except Exception as e:
            logger.error("Error processing %s: %s", key, str(e))
            results.append({"status": "error", "source": key, "error": str(e)})
            raise

    return {"statusCode": 200, "body": json.dumps(results)}

2. IAM policy for the Lambda execution role

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::your-input-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::your-output-bucket/converted/*"
    },
    {
      "Effect": "Allow",
      "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}

3. Deploy and configure S3 event notification via AWS CLI

# Create the Lambda function
zip function.zip lambda_function.py

aws lambda create-function \
  --function-name ctf-converter \
  --runtime python3.12 \
  --handler lambda_function.lambda_handler \
  --role arn:aws:iam::YOUR_ACCOUNT_ID:role/ctf-converter-role \
  --zip-file fileb://function.zip \
  --timeout 300 \
  --memory-size 512 \
  --environment Variables="{CTF_API_KEY=ctf_sk_your_key_here,TARGET_FORMAT=pdf,OUTPUT_BUCKET=your-output-bucket,OUTPUT_PREFIX=converted}"

# Grant S3 permission to invoke Lambda
aws lambda add-permission \
  --function-name ctf-converter \
  --statement-id s3-trigger \
  --action lambda:InvokeFunction \
  --principal s3.amazonaws.com \
  --source-arn arn:aws:s3:::your-input-bucket \
  --source-account YOUR_ACCOUNT_ID

# Configure S3 event notification
aws s3api put-bucket-notification-configuration \
  --bucket your-input-bucket \
  --notification-configuration '{
    "LambdaFunctionConfigurations": [{
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:YOUR_ACCOUNT_ID:function:ctf-converter",
      "Events": ["s3:ObjectCreated:Put"],
      "Filter": {
        "Key": {
          "FilterRules": [{"Name": "prefix", "Value": "uploads/"}]
        }
      }
    }]
  }'

Optional: Store API key in Parameter Store (more secure than Lambda env vars):

# Store the key
aws ssm put-parameter \
  --name /ctf/api-key \
  --value ctf_sk_your_key_here \
  --type SecureString

# In lambda_function.py, replace the get_env call with:
import boto3
ssm = boto3.client('ssm')
def get_api_key() -> str:
    response = ssm.get_parameter(Name='/ctf/api-key', WithDecryption=True)
    return response['Parameter']['Value']

Error handling and Lambda retry behavior

Lambda retries matter here. If the function raises an exception:

Direct S3 trigger: Lambda retries the event 2 more times automatically. The third failure drops the event (configure a Dead Letter Queue to capture it).
Via SQS: Lambda returns the message to the queue. The queue's VisibilityTimeout and MaxReceiveCount control retry behavior before the message goes to the DLQ.
Via SNS: SNS retries delivery to Lambda 3 times. After that, SNS moves it to the subscription DLQ if configured.

The function re-raises exceptions after logging them, which is intentional — it lets Lambda's native retry machinery handle the retry schedule rather than sleeping inside the function.

Idempotency. S3 PUT notifications can fire more than once for the same object (S3 event delivery is at-least-once). The Lambda will convert and re-upload the file each time it fires for the same key. This is safe for most pipelines — overwriting the converted output with an identical file is harmless. If you need exactly-once semantics, check if the output key already exists before converting:

try:
    s3.head_object(Bucket=output_bucket, Key=output_key)
    logger.info("Output already exists, skipping: %s", output_key)
    return {"statusCode": 200, "body": "already converted"}
except s3.exceptions.ClientError as e:
    if e.response['Error']['Code'] != '404':
        raise  # Real error, not "not found"

Large file handling. Lambda has 512MB of ephemeral /tmp storage by default (up to 10GB configurable). The function streams the file through memory rather than writing to /tmp. For very large files, increase Lambda's memory allocation — memory and CPU scale together in Lambda, so a 1GB memory allocation also gives more CPU for the download/upload.

Timeouts. Lambda timeout is set to 300s in the deployment command above. The API call has a 180s timeout. For large video files or dense documents, you may need to increase both. Lambda's maximum timeout is 15 minutes.

S3 event routing patterns

S3 event notifications support prefix and suffix filters, which lets a single bucket route different file types to different Lambda functions:

{
  "LambdaFunctionConfigurations": [
    {
      "LambdaFunctionArn": "arn:aws:lambda:...:function:ctf-to-pdf",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {
        "Key": {
          "FilterRules": [
            {"Name": "prefix", "Value": "docs/"},
            {"Name": "suffix", "Value": ".docx"}
          ]
        }
      }
    },
    {
      "LambdaFunctionArn": "arn:aws:lambda:...:function:ctf-to-webp",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {
        "Key": {
          "FilterRules": [
            {"Name": "prefix", "Value": "images/"},
            {"Name": "suffix", "Value": ".png"}
          ]
        }
      }
    }
  ]
}

Avoid trigger loops. If your input and output bucket are the same, the converted file PUT will trigger another Lambda invocation. Prevent this by using an OUTPUT_PREFIX that doesn't match the S3 event filter prefix, or use separate buckets (cleaner).

SNS fanout for multiple consumers. If you need the same S3 event to trigger multiple consumers (conversion + indexing + notification), route through SNS:

# S3 -> SNS -> Lambda (conversion) + Lambda (indexer) + SQS (notification queue)
aws s3api put-bucket-notification-configuration \
  --bucket your-input-bucket \
  --notification-configuration '{
    "TopicConfigurations": [{
      "TopicArn": "arn:aws:sns:us-east-1:ACCOUNT:file-uploaded",
      "Events": ["s3:ObjectCreated:Put"]
    }]
  }'

Production tips

Use Parameter Store or Secrets Manager for the API key, not Lambda env vars. Lambda environment variables are visible to anyone with lambda:GetFunctionConfiguration IAM permission. Parameter Store SecureString (KMS-encrypted) is the right place for secrets in Lambda.
Set concurrency limits on the Lambda function. S3 can fire many simultaneous PUTs (especially during bulk uploads). Without a concurrency limit, Lambda scales out and you hit the ChangeThisFile API with many parallel requests. Use aws lambda put-function-concurrency --function-name ctf-converter --reserved-concurrent-executions 10 to cap parallelism.
Configure a Dead Letter Queue. S3 events that fail all Lambda retries are silently dropped without a DLQ. Wire an SQS DLQ: aws lambda update-function-configuration --function-name ctf-converter --dead-letter-config TargetArn=arn:aws:sqs:...:ctf-dlq. Monitor the DLQ for unprocessed files.
Estimate cost before connecting a high-volume bucket. Lambda charges ~$0.20 per million invocations + $0.0000166667/GB-second. At 512MB and 10s average duration: ~$0.085 per 1,000 invocations. Add ChangeThisFile API costs: free tier covers 1,000/month, $29/mo for 10K, $99/mo for 100K. A pipeline processing 5,000 files/month costs roughly $0.50 Lambda + $29 API = ~$30/month total.
Tag converted objects with source metadata. The function already tags output objects with source-bucket, source-key, and format fields. This makes it easy to trace any converted file back to its source and audit the pipeline in S3 Inventory reports.

The S3 → Lambda → ChangeThisFile → S3 pipeline is roughly 60 lines of Python with zero Lambda layers, a ~20KB deployment package, and cold starts under 500ms. It handles 690 conversion routes with a single function controlled by an environment variable. Get a free API key — 1,000 conversions/month at no cost, enough to run this pipeline in low-volume environments without any spend.

S3 PUT Event to File Conversion: Lambda + ChangeThisFile API Pipeline