JSON-to-YAML is structurally trivial — both are tree-shaped data. The interesting parts are stylistic: block vs flow style, key ordering, multi-line strings, and (if you care) preserving comments. PyYAML covers 90% of cases; ruamel.yaml handles the round-trip cases.

Method 1: PyYAML (the standard option)

PyYAML is the canonical YAML library for Python. It's been around since 2006 and ships pre-installed in most data-science distributions.

pip install PyYAML

import json
import yaml

def json_to_yaml(in_path: str, out_path: str) -> None:
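    # JSON is normally UTF-8; be explicit so the platform default encoding does not interfere.
    with open(in_path, encoding="utf-8") as f:
        data = json.load(f)
    # allow_unicode=True writes raw non-ASCII characters, so the output file must be UTF-8 too.
    with open(out_path, "w", encoding="utf-8") as out:
        yaml.dump(data, out, sort_keys=False, default_flow_style=False, allow_unicode=True)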

json_to_yaml("config.json", "config.yaml")

Three flags worth knowing:

  • sort_keys=False: preserves the JSON insertion order. Without this, PyYAML alphabetizes every mapping, which makes config files harder to scan.
  • default_flow_style=False: produces block style (key: value pairs on separate lines, list items prefixed with -) rather than inline {key: value} flow style. This is the YAML most humans expect. It has been the default since PyYAML 5.1, so the flag mainly matters on older installs.
  • allow_unicode=True: keeps non-ASCII characters as-is instead of writing them as escape sequences such as \u00e9.
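
To see what the three flags change, here is a small sketch that dumps the same data once with PyYAML's defaults and once with the flags described above. It assumes PyYAML 5.1 or later (where sort_keys is available), and the sample dict is invented for illustration.

```python
import yaml

# Hypothetical sample data, not taken from any real config.
data = {"name": "café", "enabled": True, "ports": [80, 443]}

# Defaults: mapping keys are sorted and non-ASCII characters are escaped.
default_text = yaml.dump(data)

# Flags from the list above: insertion order, block style, raw Unicode.
tuned_text = yaml.dump(
    data, sort_keys=False, default_flow_style=False, allow_unicode=True
)

print(default_text)
print(tuned_text)
```

Both strings load back to the same data; only the presentation differs.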

To emit multi-line strings as literal block scalars (|), wrap them in a marker subclass of str and register a representer for it:

class LiteralStr(str): pass

def literal_representer(dumper, data):
    return dumper.represent_scalar("tag:yaml.org,2002:str", data, style="|")

yaml.add_representer(LiteralStr, literal_representer)

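# Wrap each multi-line value before dumping; plain str values keep the default style.
data["description"] = LiteralStr(data["description"])
# `out` is the open output file, as in json_to_yaml above.
yaml.dump(data, out, sort_keys=False, default_flow_style=False)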
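
Wrapping fields one by one gets tedious for nested documents. One possible approach, sketched below, is to walk the parsed JSON and wrap every string that contains a newline. The helper name mark_multiline and the sample data are hypothetical. PyYAML may still fall back to a quoted style for strings it cannot represent as a block scalar, for example lines with trailing spaces.

```python
import yaml


class LiteralStr(str):
    """Marker type: values of this type are dumped with the | block style."""


def literal_representer(dumper, data):
    return dumper.represent_scalar("tag:yaml.org,2002:str", data, style="|")


yaml.add_representer(LiteralStr, literal_representer)


def mark_multiline(node):
    """Return a copy of node with every multi-line string wrapped in LiteralStr."""
    if isinstance(node, dict):
        return {key: mark_multiline(value) for key, value in node.items()}
    if isinstance(node, list):
        return [mark_multiline(value) for value in node]
    if isinstance(node, str) and "\n" in node:
        return LiteralStr(node)
    return node


# Hypothetical input, standing in for the result of json.load().
data = {"name": "demo", "description": "line one\nline two\n"}
text = yaml.dump(mark_multiline(data), sort_keys=False)
print(text)
```

The original data is left untouched, so the same dict can still be serialized back to JSON afterwards.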

Method 2: ruamel.yaml (round-trip safe, preserves comments)

If you're converting JSON into YAML that will then be edited by humans (Helm charts, GitHub Actions configs, Ansible playbooks), ruamel.yaml is the better choice: once the file exists as YAML, it preserves comments, key ordering, and quoting style on later round-trips.

pip install ruamel.yaml

import json
from ruamel.yaml import YAML

def json_to_yaml(in_path: str, out_path: str) -> None:
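    yaml = YAML()  # round-trip mode is the default
    # 2-space mappings; list items are indented 4 columns with the dash 2 columns in,
    # so sequences sit nested under their parent key.
    yaml.indent(mapping=2, sequence=4, offset=2)
    yaml.preserve_quotes = True  # only affects YAML read via yaml.load(), not JSON input
    yaml.allow_unicode = True

    with open(in_path, encoding="utf-8") as f:
        data = json.load(f)
    with open(out_path, "w", encoding="utf-8") as out:
        yaml.dump(data, out)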

json_to_yaml("config.json", "config.yaml")

ruamel.yaml's killer feature: if a downstream user edits the YAML and adds comments, you can re-load and re-dump without losing them. PyYAML can't do this.

The trade-off: ruamel.yaml is slower and has a different API surface. If your YAML is purely machine-generated and never edited, PyYAML is fine.

Method 3: ChangeThisFile API (no library, handles edge cases)

If your JSON inputs are unpredictable — coming from third-party APIs, customer uploads, or different vendors — the API handles edge cases like JSON5, comments, trailing commas, and unusual nesting. Get a free API key.

import requests

API_KEY = "ctf_sk_your_key_here"

def json_to_yaml(in_path: str, out_path: str) -> None:
    with open(in_path, "rb") as f:
        response = requests.post(
            "https://changethisfile.com/v1/convert",
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"file": f},
            data={"source": "json", "target": "yaml"},
            timeout=30,
        )
    response.raise_for_status()
    with open(out_path, "wb") as out:
        out.write(response.content)

json_to_yaml("vendor_config.json", "vendor_config.yaml")

Because JSON5 and JSONC inputs are accepted directly, files that often contain comments, such as tsconfig.json or VS Code config files, can be converted without switching parsers.

When to use each

Approach | Best for | Tradeoff
PyYAML | One-shot scripts, machine-generated YAML, no comments to preserve | Loses comments and quoting on round-trips
ruamel.yaml | Editable configs (Helm, k8s, Ansible), round-trip safety | Slower; different (more complex) API
ChangeThisFile API | Unpredictable input (JSON5, JSONC), edge runtimes, multi-language teams | Per-call cost, network call

CLI alternative: yq

If you don't need Python, yq is a jq-style CLI for YAML/JSON conversion. It's the fastest option for one-off conversions.

brew install yq    # macOS
snap install yq    # Linux

# JSON to YAML, pretty-printed, with top-level keys sorted:
yq -P 'sort_keys(.)' -oy config.json > config.yaml

# Or just convert, keeping the original key order:
yq -oy config.json

# Reverse — YAML to JSON:
yq -ojson config.yaml

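The yq installed by the commands above is the Go implementation, built on Go's YAML library. Its defaults differ from PyYAML's, but the output is functionally equivalent. Note that pip install yq installs a different tool (a Python wrapper around jq) with different flags. Use yq for ad-hoc shell work; use Python when conversion is part of a larger pipeline.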

Common pitfalls

  • Strings that look like numbers. YAML parsers coerce unquoted scalars that look like numbers, dates, or booleans: PyYAML's safe_load reads an unquoted 01 as the int 1. safe_dump protects you by quoting such strings on output ('01'), so the value only changes if a later tool or hand edit strips the quotes.
  • Boolean strings get coerced. Unquoted yes / no / on / off all load as booleans under YAML 1.1, which is what PyYAML implements. PyYAML quotes these strings when dumping; keep the quotes if you edit the file by hand.
  • Tabs are forbidden for indentation in YAML. JSON allows tabs as whitespace between tokens; YAML parsers reject them in indentation. PyYAML always indents with spaces; if you're hand-massaging YAML, use spaces only.
  • Anchors and aliases. JSON has no equivalent to YAML's & / *, so data from json.load never produces them. PyYAML emits an anchor only when the same dict or list object appears more than once in the data. If you want deliberate anchors in the output, you have to construct them yourself, for example with ruamel.yaml's API.
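
The coercion pitfalls are easy to check directly. The sketch below uses invented sample values and PyYAML's safe_dump / safe_load to show that quoted output round-trips cleanly, while the same content without quotes loads as different types:

```python
import yaml

# Hypothetical values that are strings in the JSON but look like other types.
data = {"zip": "01", "enabled": "yes", "count": 1}

# safe_dump quotes the two ambiguous strings and leaves the real int bare.
text = yaml.safe_dump(data, sort_keys=False)
print(text)

# The round trip keeps the original types.
assert yaml.safe_load(text) == data

# The same content with the quotes stripped loads as an int and a bool.
unquoted = yaml.safe_load("zip: 01\nenabled: yes\n")
print(unquoted)
```

A check like the round-trip assertion is cheap to keep in a conversion script as a guard against silent type changes.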

For machine-generated YAML, PyYAML. For human-edited YAML, ruamel.yaml. For inputs of varying quality, the API. Free tier gives 100 conversions/month.