JSON-to-YAML is structurally trivial — both are tree-shaped data. The interesting parts are stylistic: block vs flow style, key ordering, multi-line strings, and (if you care) preserving comments. PyYAML covers 90% of cases; ruamel.yaml handles the round-trip cases.
Method 1: PyYAML (the standard option)
PyYAML is the most widely used YAML library for Python. It has been around since 2006 and is bundled with common data-science distributions such as Anaconda.
pip install PyYAML
import json
import yaml

def json_to_yaml(in_path: str, out_path: str) -> None:
    # Read and write as UTF-8 explicitly; the platform default encoding may differ.
    with open(in_path, encoding="utf-8") as f:
        data = json.load(f)
    with open(out_path, "w", encoding="utf-8") as out:
        yaml.dump(data, out, sort_keys=False, default_flow_style=False, allow_unicode=True)

json_to_yaml("config.json", "config.yaml")
Three flags worth knowing:
- sort_keys=False: preserves the JSON insertion order. Without it, PyYAML alphabetizes every mapping, which makes config files harder to scan. The parameter requires PyYAML 5.1 or later.
- default_flow_style=False: produces block style (key: value pairs on separate lines, lists with -) rather than inline {key: value} flow style. This is the YAML most humans expect.
- allow_unicode=True: keeps non-ASCII characters as-is instead of escaping them (for example as \xE9 or \u20AC).
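To make the three flags concrete, here is a small sketch that dumps the same dictionary with different settings. The sample data is made up for illustration, and the behavior described assumes PyYAML 5.1 or later.

```python
import yaml

# Hypothetical sample data: keys deliberately out of alphabetical order,
# plus one non-ASCII value.
data = {"name": "café", "version": 2, "features": ["a", "b"]}

# Defaults: keys are sorted and non-ASCII characters are escaped.
default_out = yaml.dump(data)

# The settings recommended above: insertion order, block style, raw Unicode.
tuned_out = yaml.dump(
    data, sort_keys=False, default_flow_style=False, allow_unicode=True
)

# Flow style, for comparison: the whole document on one line in braces.
flow_out = yaml.dump(data, default_flow_style=True)

print(default_out)
print(tuned_out)
print(flow_out)
```

Comparing the three printed documents side by side is usually the quickest way to decide which style suits a given config file.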
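To emit multi-line strings as literal block scalars (|), wrap them in a marker type and register a representer for it:
import yaml

class LiteralStr(str):
    """Marker type: strings of this class are dumped in literal block style."""

def literal_representer(dumper, data):
    return dumper.represent_scalar("tag:yaml.org,2002:str", data, style="|")

yaml.add_representer(LiteralStr, literal_representer)
# `data` is the parsed JSON and `out` the open output file, as in json_to_yaml above.
data["description"] = LiteralStr(data["description"])
yaml.dump(data, out, default_flow_style=False)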
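Wrapping fields one at a time gets tedious for larger documents. One possible extension, sketched below, walks the parsed JSON and wraps every string that contains a newline. The helper name mark_multiline and the sample document are assumptions for illustration, not part of PyYAML.

```python
import yaml


class LiteralStr(str):
    """Marker type: strings of this class are dumped in literal block style."""


def literal_representer(dumper, data):
    return dumper.represent_scalar("tag:yaml.org,2002:str", data, style="|")


# Registers on the default Dumper used by yaml.dump (not on SafeDumper).
yaml.add_representer(LiteralStr, literal_representer)


def mark_multiline(node):
    """Return a copy of `node` with every multi-line string wrapped in LiteralStr."""
    if isinstance(node, dict):
        return {key: mark_multiline(value) for key, value in node.items()}
    if isinstance(node, list):
        return [mark_multiline(value) for value in node]
    if isinstance(node, str) and "\n" in node:
        return LiteralStr(node)
    return node


# Hypothetical document for illustration.
doc = {"name": "demo", "description": "line one\nline two\n"}
text = yaml.dump(mark_multiline(doc), sort_keys=False)
print(text)
```

PyYAML may still fall back to a quoted scalar for strings it cannot represent in block style (for instance, lines with trailing spaces), so it is worth spot-checking the output.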
Method 2: ruamel.yaml (round-trip safe, preserves comments)
If you're converting JSON into YAML that humans will then edit (Helm charts, GitHub Actions configs, Ansible playbooks), ruamel.yaml is the better choice: it preserves comments, key ordering, and quoting style on round-trips.
pip install ruamel.yaml
import json
from ruamel.yaml import YAML

def json_to_yaml(in_path: str, out_path: str) -> None:
    yaml = YAML()
    # 2-space mappings; list dashes are indented 2 spaces under their parent key.
    yaml.indent(mapping=2, sequence=4, offset=2)
    # Only matters when re-loading existing YAML later; harmless for JSON input.
    yaml.preserve_quotes = True
    yaml.allow_unicode = True
    with open(in_path, encoding="utf-8") as f:
        data = json.load(f)
    with open(out_path, "w", encoding="utf-8") as out:
        yaml.dump(data, out)

json_to_yaml("config.json", "config.yaml")
ruamel.yaml's killer feature: if a downstream user edits the YAML and adds comments, you can re-load and re-dump without losing them. PyYAML can't do this.
The trade-off: ruamel.yaml is slower and has a different API surface. If your YAML is purely machine-generated and never edited, PyYAML is fine.
Method 3: ChangeThisFile API (no library, handles edge cases)
If your JSON inputs are unpredictable — coming from third-party APIs, customer uploads, or different vendors — the API handles edge cases like JSON5, comments, trailing commas, and unusual nesting. Get a free API key.
import requests

API_KEY = "ctf_sk_your_key_here"

def json_to_yaml(in_path: str, out_path: str) -> None:
    with open(in_path, "rb") as f:
        response = requests.post(
            "https://changethisfile.com/v1/convert",
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"file": f},
            data={"source": "json", "target": "yaml"},
            timeout=30,
        )
    response.raise_for_status()
    with open(out_path, "wb") as out:
        out.write(response.content)

json_to_yaml("vendor_config.json", "vendor_config.yaml")
The API also supports JSON5 (with comments) and JSONC inputs without you needing to switch parsers — useful for processing tsconfig.json or VS Code config files.
When to use each
| Approach | Best for | Tradeoff |
|---|---|---|
| PyYAML | One-shot scripts, machine-generated YAML, no comments to preserve | Loses comments and quoting on round-trips |
| ruamel.yaml | Editable configs (Helm, Kubernetes, Ansible), round-trip safety | Slower; different, more complex API |
| ChangeThisFile API | Unpredictable input (JSON5, JSONC), edge runtimes, multi-language teams | Per-call cost, network call |
CLI alternative: yq
If you don't need Python, yq (Mike Farah's Go implementation, not the Python wrapper of the same name) is a jq-style CLI for YAML/JSON conversion. It's the fastest option for one-off conversions.
brew install yq # macOS
snap install yq # Linux
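# JSON to YAML, with top-level keys sorted alphabetically (-P pretty-prints in block style):
yq -P 'sort_keys(.)' -oy config.json > config.yaml
# Or, keeping the original key order:
yq -oy config.json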
# Reverse — YAML to JSON:
yq -ojson config.yaml
yq uses Go's YAML library under the hood — different defaults than PyYAML but functionally equivalent output. Use yq for ad-hoc shell work; use Python when conversion is part of a larger pipeline.
Common pitfalls
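- A string like '01' can turn into the integer 1. YAML parsers coerce unquoted scalars that look like numbers, dates, or booleans, so id: 01 loads as 1 under PyYAML's safe_load (and 010 loads as 8, because YAML 1.1 reads a leading zero as octal). When dumping, PyYAML quotes such strings automatically ('01'), so the risk arises when the quotes are removed by hand or by another tool.
- Boolean-like strings get coerced. Unquoted yes / no / on / off load as booleans under YAML 1.1, which PyYAML implements. YAML 1.2 parsers (ruamel.yaml's default) treat only true / false as booleans. To keep these values as strings everywhere, make sure they stay quoted in the output.
- Tabs are forbidden for indentation in YAML. JSON permits tabs as whitespace between tokens, so tab-indented JSON is valid, but YAML parsers reject tab-indented YAML. PyYAML always indents with spaces; if you're hand-editing the result, use spaces only.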
- Anchors and aliases. JSON has no equivalent to YAML's & / *. If you want anchors in the output, you have to construct them manually with ruamel.yaml's API.
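The coercion and tab pitfalls are easy to reproduce. The sketch below uses PyYAML's safe_load and safe_dump on made-up snippets; the variable names are for illustration only.

```python
import yaml

# Unquoted scalars are coerced on load (PyYAML follows YAML 1.1 rules).
loaded_number = yaml.safe_load("id: 01")       # leading zero read as octal
loaded_octal = yaml.safe_load("id: 010")
loaded_bool = yaml.safe_load("enabled: yes")

# On dump, PyYAML quotes strings that would otherwise be misread,
# so the values survive a round trip as strings.
original = {"id": "01", "enabled": "yes"}
dumped = yaml.safe_dump(original, sort_keys=False)
round_tripped = yaml.safe_load(dumped)

# Tab-indented YAML is rejected by the parser.
try:
    yaml.safe_load("parent:\n\tchild: 1\n")
    tab_rejected = False
except yaml.YAMLError:
    tab_rejected = True

print(loaded_number, loaded_octal, loaded_bool)
print(dumped)
print(round_tripped == original, tab_rejected)
```

If a downstream tool strips the quotes from the dumped output, the first three loads show what the consumer will see instead of the original strings.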
For machine-generated YAML, PyYAML. For human-edited YAML, ruamel.yaml. For inputs of varying quality, the API. Free tier gives 100 conversions/month.