YAML is the format that made DevOps engineers into "YAML engineers." If you've written a Kubernetes manifest, a GitHub Actions workflow, an Ansible playbook, a Docker Compose file, or a CI/CD pipeline, you've written YAML. It's the dominant configuration language for infrastructure and automation tools, and for good reason: YAML is genuinely pleasant to read and write for small-to-medium config files.
But YAML has a dark side. Its implicit type coercion turns country codes into booleans. Its indentation rules create invisible bugs. Its specification is 86 pages long — longer than XML's. And its multiline string syntax has so many variants that even experienced users can't remember all of them.
This guide covers YAML's real strengths, its well-documented footguns, and when to use it versus JSON or TOML.
YAML Fundamentals: Syntax and Types
YAML represents data using three basic structures:
# Mapping (key-value pairs)
name: John Smith
age: 30
active: true

# Sequence (ordered list)
colors:
  - red
  - green
  - blue

# Scalar (single value)
title: "Hello World"
count: 42
pi: 3.14159

YAML's readability advantage over JSON is obvious. No braces, no brackets, no quoted keys, no trailing comma anxiety. Comments start with #. Nesting is expressed through indentation (spaces only; tabs are illegal in YAML).
YAML supports more types than JSON:
| Type | Example | Notes |
|---|---|---|
| String | name: hello | Quotes optional in most cases. This is also the source of most footguns. |
| Integer | count: 42 | Supports hex (0xFF), octal (0o77) |
| Float | pi: 3.14 | Also .inf, -.inf, .nan |
| Boolean | debug: true | This is the footgun. See Norway problem below. |
| Null | value: null or value: ~ | Empty value is also null: value: |
| Timestamp | date: 2026-03-19 | ISO 8601 dates are auto-parsed by many parsers (a YAML 1.1 type that is not in the YAML 1.2 core schema). JSON has no date type. |
| Sequence | [1, 2, 3] | Flow style (inline) or block style (indented list) |
| Mapping | {a: 1, b: 2} | Flow style (inline) or block style (indented key-value) |
The Norway Problem: Implicit Type Coercion
YAML's most notorious bug is implicit type coercion. Because YAML doesn't require quotes around strings, the parser must guess whether a bare value is a string, number, boolean, or null. These guesses are frequently wrong.
The Norway problem: in YAML 1.1, the country code NO is parsed as the boolean false. This happened because YAML 1.1 recognized an absurdly broad set of boolean values:
| Truthy (all parsed as true) | Falsy (all parsed as false) |
|---|---|
| y, Y, yes, Yes, YES | n, N, no, No, NO |
| true, True, TRUE | false, False, FALSE |
| on, On, ON | off, Off, OFF |
This means a YAML file listing countries would parse Norway's code as false while leaving Austria's AT as a string, producing a silent data corruption bug. Real-world victims include Ruby's CI/CD system (countries.yml), multiple GitHub Actions workflows, and Ansible inventories.
YAML 1.2 (2009) fixed this by reducing booleans to true and false (the core schema also accepts True/TRUE and False/FALSE, but nothing else). But adoption has been slow. PyYAML (Python's most popular YAML library) still defaults to YAML 1.1 as of 2026. Ruby's Psych wraps libyaml, which is a YAML 1.1 parser. Go's go-yaml supports most of YAML 1.2 but keeps some 1.1 behavior for backward compatibility. The result: the same YAML file can produce different data structures depending on which library parses it.
The fix: always quote strings that might be misinterpreted. country: "NO" is always a string. country: NO might be false. When in doubt, quote it.
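To make the difference concrete, here is a minimal Python sketch of how the two spec versions decide whether a bare scalar is a boolean. It models only the boolean rules from the table above and is not a YAML parser; the names resolve_bool_11 and resolve_bool_12 are made up for this illustration and do not belong to any library.

```python
# Simplified model of boolean resolution for unquoted scalars.
# Hypothetical helper names; real parsers apply many more rules.

YAML_11_TRUE = {"y", "Y", "yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
YAML_11_FALSE = {"n", "N", "no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}

# YAML 1.2 core schema: only true/false, in three capitalizations.
YAML_12_TRUE = {"true", "True", "TRUE"}
YAML_12_FALSE = {"false", "False", "FALSE"}


def _resolve(scalar, truthy, falsy):
    """Return a bool if the scalar is in either set, else the original string."""
    if scalar in truthy:
        return True
    if scalar in falsy:
        return False
    return scalar


def resolve_bool_11(scalar):
    """Model of YAML 1.1 boolean resolution for a bare scalar."""
    return _resolve(scalar, YAML_11_TRUE, YAML_11_FALSE)


def resolve_bool_12(scalar):
    """Model of YAML 1.2 (core schema) boolean resolution for a bare scalar."""
    return _resolve(scalar, YAML_12_TRUE, YAML_12_FALSE)


country_codes = ["NO", "AT", "SE"]
print([resolve_bool_11(code) for code in country_codes])  # [False, 'AT', 'SE']
print([resolve_bool_12(code) for code in country_codes])  # ['NO', 'AT', 'SE']
```

A quoted scalar never reaches this resolution step, which is why quoting is the portable fix.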
Other Type Coercion Traps
Booleans are the most famous, but not the only coercion issue:
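- Octal numbers: port: 0777 is parsed as octal (511 in decimal) in YAML 1.1. YAML 1.2 uses 0o777 syntax instead.
- Sexagesimal numbers: time: 1:30 was parsed as 90 (seconds, in base-60) in YAML 1.1. YAML 1.2 removed this.
- Version numbers: version: 3.10 becomes the float 3.1, not the string "3.10". This broke Python 3.10 version specifications in multiple projects.
- Scientific notation: value: 1e3 becomes the number 1000 in parsers that follow the YAML 1.2 core schema. PyYAML, by contrast, only treats it as a float when it has a decimal point and a signed exponent (1.0e+3), so the same value can be a number in one tool and a string in another.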
Every one of these is preventable by quoting: version: "3.10", port: "0777". The cost of unnecessary quotes is zero. The cost of unexpected type coercion is debugging time.
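The numeric traps above can be reproduced with a few lines of Python. This is a rough model of three YAML 1.1 rules (leading-zero octal, sexagesimal, plain decimal floats) plus ordinary integers; resolve_number_11 is a hypothetical name, and the regular expressions are simplified from what real parsers use.

```python
import re


def resolve_number_11(scalar):
    """Model a few YAML 1.1 numeric rules for a bare scalar.

    Returns an int or float when a rule matches, else the unchanged string.
    Illustrative only: signs, underscores, hex, and exponents are ignored.
    """
    if re.fullmatch(r"0[0-7]+", scalar):
        return int(scalar, 8)  # leading zero means octal
    if re.fullmatch(r"0|[1-9][0-9]*", scalar):
        return int(scalar)  # ordinary decimal integer
    if re.fullmatch(r"[1-9][0-9]*(:[0-5]?[0-9])+", scalar):
        total = 0
        for part in scalar.split(":"):
            total = total * 60 + int(part)  # base-60 digits
        return total
    if re.fullmatch(r"[0-9]+\.[0-9]+", scalar):
        return float(scalar)  # trailing zeros are lost here
    return scalar


for raw in ["0777", "1:30", "3.10", "v3.10"]:
    print(raw, "->", repr(resolve_number_11(raw)))
# 0777 -> 511
# 1:30 -> 90
# 3.10 -> 3.1
# v3.10 -> 'v3.10'
```

Note that the information is already gone by the time application code sees the value: nothing downstream can tell 3.1 from 3.10 once the scalar has become a float.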
Indentation: The Invisible Bug Factory
YAML uses indentation (spaces, never tabs) to express structure. This makes YAML visually clean but creates a class of bugs that are invisible in most editors:
# This is a mapping with nested sequence
server:
  ports:
    - 8080
    - 8443
  host: localhost

# This is a DIFFERENT structure (ports is a sibling, not nested)
server:
  host: localhost
ports:
  - 8080
  - 8443

Common indentation bugs:
- Mixed spaces and tabs. YAML forbids tabs for indentation but many editors insert them. A tab that looks like two spaces creates a parse error. Configure your editor to insert spaces for YAML files.
- Inconsistent indentation depth. YAML allows any consistent number of spaces per level (1, 2, 4, 8, or anything else), but switching between them within a document creates parse errors or, worse, valid-but-wrong structures.
- Invisible trailing whitespace. Trailing spaces are easy to miss and not always harmless: inside a literal block scalar they become part of the string. Related trap: key: with nothing after it is a null value, while key: followed by more-indented lines starts a nested mapping, so one misindented child line silently changes the result.
- Copy-paste misalignment. Pasting a YAML block from documentation or another file often introduces indentation mismatches that create structural bugs without syntax errors.
The industry-standard indent is 2 spaces for YAML. Kubernetes, Docker Compose, GitHub Actions, and Ansible all use 2-space indentation.
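The first two bugs in the list above are easy to catch mechanically. The following is a rough sketch of such a check, not a replacement for a real linter: find_indent_problems is a hypothetical helper, and it will report false positives inside block scalars, where any indentation is legal.

```python
def find_indent_problems(text, indent=2):
    """Return (line_number, message) pairs for suspicious indentation.

    Flags tabs in leading whitespace and indents that are not a multiple
    of the expected width. Blank and comment-only lines are skipped.
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.lstrip(" \t")
        if not stripped or stripped.startswith("#"):
            continue
        leading = line[: len(line) - len(stripped)]
        if "\t" in leading:
            problems.append((number, "tab in indentation"))
        elif len(leading) % indent != 0:
            problems.append(
                (number, f"indent of {len(leading)} is not a multiple of {indent}")
            )
    return problems


sample = "server:\n  host: localhost\n\tport: 8080\n   debug: true\n"
print(find_indent_problems(sample))
# [(3, 'tab in indentation'), (4, 'indent of 3 is not a multiple of 2')]
```

A check like this cannot detect the valid-but-wrong case, where a block is consistently indented at the wrong level; only reviewing the parsed structure catches that.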
Multiline Strings: | vs > and Their Variants
YAML has multiple ways to write multiline strings, and the differences are subtle:
| Syntax | Name | Behavior |
|---|---|---|
| \| | Literal block | Preserves newlines. Each line break in YAML becomes a newline in the string. |
| > | Folded block | Folds newlines into spaces (like HTML). Double newlines become single newlines. |
| \|- | Literal strip | Like \| but strips the trailing newline. |
| >- | Folded strip | Like > but strips the trailing newline. |
| \|+ | Literal keep | Like \| but keeps all trailing newlines. |
| >+ | Folded keep | Like > but keeps all trailing newlines. |
Example:
# Literal: preserves newlines
description: |
  Line one.
  Line two.
  Line three.
# Result: "Line one.\nLine two.\nLine three.\n"

# Folded: joins lines with spaces
description: >
  This is a long
  description that wraps
  across multiple lines.
# Result: "This is a long description that wraps across multiple lines.\n"

The | (literal) style is what you want for code blocks, scripts, and multi-line values where line breaks matter. The > (folded) style is for long text that wraps in the YAML file but should be a single paragraph. The - suffix strips the trailing newline, which is often what you want for config values.
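If the six variants are hard to keep apart, it can help to see them as two independent choices: how lines are joined, and what happens to the trailing newlines. The sketch below models that in Python. block_scalar is a hypothetical function, and it assumes the content has no blank lines or extra-indented lines in the middle, which real YAML folding treats specially.

```python
def block_scalar(lines, style="|", chomp="", trailing_blank_lines=0):
    """Model the string a YAML block scalar produces.

    lines: the content lines, already de-indented, with no blank lines.
    style: "|" joins lines with newlines, ">" joins them with spaces.
    chomp: "" clips to one final newline, "-" strips it, "+" keeps all.
    trailing_blank_lines: blank lines after the content (only "+" keeps them).
    """
    separator = "\n" if style == "|" else " "
    body = separator.join(lines)
    if chomp == "-":
        return body
    if chomp == "+":
        return body + "\n" * (1 + trailing_blank_lines)
    return body + "\n"


text = ["Line one.", "Line two."]
print(repr(block_scalar(text, "|")))        # 'Line one.\nLine two.\n'
print(repr(block_scalar(text, ">")))        # 'Line one. Line two.\n'
print(repr(block_scalar(text, "|", "-")))   # 'Line one.\nLine two.'
print(repr(block_scalar(text, ">", "+", trailing_blank_lines=1)))
# 'Line one. Line two.\n\n'
```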
Anchors and Aliases: DRY YAML
YAML supports anchors (&) and aliases (*) to avoid repeating the same data:
defaults: &defaults
  adapter: postgres
  host: localhost
  port: 5432

development:
  database: app_dev
  <<: *defaults

production:
  database: app_prod
  <<: *defaults
  host: db.production.com

The &defaults anchor marks the block. *defaults references it. The << merge key inserts the anchored keys into the current mapping, and keys written explicitly in that mapping take precedence over merged ones (so production uses db.production.com as its host).
Anchors are useful for eliminating repetition in Docker Compose files (shared service configs), CI/CD pipelines (common step definitions), and database configuration (shared connection defaults).
Caveat: The merge key (<<) is not part of the YAML 1.2 specification. It is a YAML 1.1-era extension that many libraries still support, so relying on it creates portability risks. Tool support for anchors and aliases also varies, so confirm that your target tool accepts them before depending on them.
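For readers who think in dictionaries, the effect of the merge key can be sketched in plain Python. apply_merge_key is a hypothetical helper that assumes the << value is a single mapping (the real merge key also accepts a list of mappings), and it illustrates the rule that explicit keys take precedence over merged ones.

```python
def apply_merge_key(mapping):
    """Expand a "<<" entry: start from the merged mapping, then let the
    keys written explicitly in `mapping` override it."""
    merged = dict(mapping.get("<<", {}))
    merged.update({key: value for key, value in mapping.items() if key != "<<"})
    return merged


defaults = {"adapter": "postgres", "host": "localhost", "port": 5432}

production = apply_merge_key(
    {"database": "app_prod", "<<": defaults, "host": "db.production.com"}
)
print(production["host"])     # db.production.com
print(production["adapter"])  # postgres
```

If portability matters more than brevity, the same expansion can be done once by a script and the fully expanded file committed instead.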
YAML 1.1 vs 1.2: The Compatibility Mess
YAML 1.2 (released in 2009) was a significant cleanup that made YAML a strict superset of JSON and fixed many type coercion issues. Key differences:
| Feature | YAML 1.1 | YAML 1.2 |
|---|---|---|
| Boolean values | yes/no, on/off, y/n, true/false | Only true/false |
| Octal syntax | 0777 | 0o777 |
| Sexagesimal | 1:30 = 90 | String "1:30" |
| JSON superset | No (some JSON is invalid YAML 1.1) | Yes (all JSON is valid YAML 1.2) |
The problem: YAML 1.2 is 17 years old and still not universally adopted. PyYAML defaults to 1.1. This means the same YAML file produces different results depending on the parser. The safe approach: write YAML that's valid in both versions by quoting ambiguous values and using only true/false for booleans.
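When YAML is generated by a script, the "quote anything ambiguous" rule can be automated. Below is a conservative sketch: needs_quotes and render are hypothetical helpers, the pattern is deliberately broader than either spec version requires, and values that themselves contain quotes, colons, or other special characters are out of scope.

```python
import re

# Bare scalars that at least one YAML 1.1 or 1.2 parser may not read as a string.
AMBIGUOUS = re.compile(
    r"""(?:
        y|n|yes|no|on|off|true|false              # booleans (YAML 1.1 set)
      | null|~                                    # nulls
      | [-+]?[0-9][0-9_]*                         # integers, incl. 0777-style octal
      | 0o[0-7]+ | 0x[0-9a-f]+                    # YAML 1.2 octal and hex
      | [-+]?[0-9]*\.?[0-9]+(?:e[-+]?[0-9]+)?     # floats, scientific notation
      | [-+]?\.inf | \.nan                        # special floats
      | [-+]?[0-9]+(?::[0-5]?[0-9])+              # sexagesimal such as 1:30
      | [0-9]{4}-[0-9]{2}-[0-9]{2}                # ISO dates
    )""",
    re.IGNORECASE | re.VERBOSE,
)


def needs_quotes(value):
    """True if a string value should be quoted to stay a string everywhere."""
    return value == "" or AMBIGUOUS.fullmatch(value) is not None


def render(key, value):
    """Emit one 'key: value' line, quoting the value only when needed."""
    return f'{key}: "{value}"' if needs_quotes(value) else f"{key}: {value}"


print(render("country", "NO"))      # country: "NO"
print(render("version", "3.10"))    # version: "3.10"
print(render("host", "localhost"))  # host: localhost
```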
The YAML Ecosystem: Where YAML Dominates
YAML's dominance in DevOps and infrastructure tooling is near-total:
- Kubernetes: all manifests, ConfigMaps, Secrets, CRDs, Helm charts
- Docker Compose: service definitions, networks, volumes
- GitHub Actions: workflow definitions
- GitLab CI: .gitlab-ci.yml
- CircleCI: .circleci/config.yml
- Ansible: playbooks, roles, inventories
- Terraform: some modules (though HCL is primary)
- Swagger/OpenAPI: API specifications (also supports JSON)
- Ruby on Rails: database.yml, locales/*.yml
- Spring Boot: application.yml
- Home Assistant: configuration.yaml
The "YAML engineer" meme exists because modern infrastructure work involves writing more YAML than application code. A Kubernetes deployment might require a Deployment manifest, a Service, an Ingress, a ConfigMap, a Secret, an HPA, and a PDB — all YAML files, totaling hundreds of lines for a single microservice.
When to Use YAML, When to Use Something Else
Use YAML when:
- The tool requires it (Kubernetes, GitHub Actions, Docker Compose — no choice)
- Humans will read and edit the file frequently
- You need comments in your data file
- The config is moderately complex (10-200 lines) with some nesting
Use JSON instead when:
- Machines are the primary consumer (API responses, data storage)
- You need guaranteed parse consistency across all implementations
- The data is generated, not hand-edited
Use TOML instead when:
- The config is flat or shallow (mostly key-value with a few sections)
- You want explicit types without YAML's coercion surprises
- The ecosystem prefers it (Rust, Python, Hugo)
The decision is often made for you by the tool. But for greenfield config files, TOML's simplicity and explicit typing make it the safer choice for anything that doesn't require YAML's deep nesting or DevOps ecosystem integration.
YAML is a genuinely good format for its intended purpose: human-readable configuration files with moderate complexity. The indentation-based syntax is cleaner than JSON's braces, and comments alone make it superior for anything humans will edit. But YAML's implicit typing, indentation sensitivity, and specification complexity mean it requires more care than simpler formats.
The practical rules: always quote strings that might be misinterpreted as booleans or numbers. Use a linter (yamllint) in your editor and CI pipeline. Configure your editor to insert spaces, not tabs, for YAML files. And when the config is simple enough for TOML, use TOML — you'll sleep better.