XML is the most hated format that everyone still uses. Developers mock its verbosity — <name>John</name> versus JSON's "name": "John" — and then spend their days working with DOCX files (ZIP archives of XML), SVG images (XML), RSS feeds (XML), Android layouts (XML), Maven builds (XML), and SAML authentication (XML).

The reason XML survives is not momentum or legacy compatibility, though both help. XML survives because it does things no other common format can do: schema validation at the parser level, namespace-based composition of multiple vocabularies, mixed content (text interleaved with markup), and declarative transformation via XSLT. These capabilities are irreplaceable in their domains.

This guide covers XML's actual strengths, its real costs, and the specific scenarios where no other format will do.

From SGML to XML: A Brief History

XML was born from SGML (Standard Generalized Markup Language), an ISO standard from 1986 that defined a framework for creating markup languages. SGML was powerful but monstrously complex — the specification ran to 500+ pages, and few implementations fully complied with it. HTML was defined as an SGML application, but browsers never actually parsed it as strict SGML.

In 1996, the W3C set out to create a simplified subset of SGML suitable for the web. The result was XML 1.0, published as a W3C Recommendation in February 1998. The design goals were explicit: XML should be straightforwardly usable over the internet, support a wide variety of applications, be compatible with SGML, make it easy to write programs that process XML documents, and keep optional features to a minimum (ideally zero). XML documents themselves should be human-legible and reasonably clear.

XML achieved most of these goals. The "minimum of optional features" goal resulted in strict syntax rules: every opening tag must have a closing tag (or be self-closing), attribute values must be quoted, elements must be properly nested. This strictness was intentional — it made parsers simpler and documents unambiguous.

Anatomy of an XML Document

A well-formed XML document combines a handful of building blocks, all visible in this example:

<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="http://example.com/catalog">
  <product id="P001" status="active">
    <name>Widget</name>
    <price currency="USD">29.99</price>
    <description><![CDATA[Price < $30 & ships free]]></description>
  </product>
</catalog>

Key structural concepts:

  • XML declaration (<?xml ... ?>): Opens the prolog and declares the XML version and encoding. Optional but recommended.
  • Elements: The building blocks. Can contain text, other elements, or both (mixed content). Case-sensitive.
  • Attributes: Key-value metadata on elements. Values must be quoted, as in id="P001" and status="active" above.
  • CDATA sections: <![CDATA[...]]> blocks where the parser treats content as raw text. No need to escape <, >, or & (the only sequence that cannot appear inside is ]]>). Useful for embedding code, HTML, or mathematical expressions.
  • Namespaces: The xmlns attribute identifies which vocabulary an element belongs to, preventing name collisions when combining schemas.
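
To make these concepts concrete, here is a minimal sketch that reads the example document with Python's standard-library xml.etree.ElementTree. The prefix "c" in the lookup map and the variable names are arbitrary choices made for this illustration; only the document itself comes from the example above.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="http://example.com/catalog">
  <product id="P001" status="active">
    <name>Widget</name>
    <price currency="USD">29.99</price>
    <description><![CDATA[Price < $30 & ships free]]></description>
  </product>
</catalog>"""

# The prefix "c" is local to this script; it only has to map to the same URI.
NS = {"c": "http://example.com/catalog"}

# Parse from bytes so the parser applies the declared encoding itself.
root = ET.fromstring(DOC.encode("utf-8"))
product = root.find("c:product", NS)
price = product.find("c:price", NS)

summary = {
    "root_tag": root.tag,                # namespace URI in braces + local name
    "id": product.get("id"),             # attribute lookup
    "status": product.get("status"),
    "price": price.text,                 # element text is always a string
    "currency": price.get("currency"),
    # CDATA content arrives as ordinary text, with < and & intact
    "description": product.find("c:description", NS).text,
}

for key, value in summary.items():
    print(f"{key}: {value}")
```

Note that the price comes back as the string "29.99". Without a schema, the parser assigns no data types, which is part of what the next section addresses.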

Schema Validation: XSD, DTD, and RelaxNG

XML's killer feature is schema validation: the ability to define exactly what a valid document looks like and have a validating parser enforce it before your application code ever sees the data. This differs from JSON, where the parser checks only syntax and any schema validation (for example with JSON Schema) is a separate, opt-in step in application code.

XSD (XML Schema Definition)

XSD is the most powerful and most widely used XML schema language. An XSD schema defines:

  • Which elements and attributes are allowed
  • The data type of each element/attribute (string, integer, decimal, date, boolean, and ~40 more built-in types)
  • Cardinality constraints (minOccurs, maxOccurs)
  • Element ordering (sequence, choice, all)
  • Pattern restrictions (regex validation on string content)
  • Enumeration constraints (value must be one of a defined set)

XSD schemas are themselves XML documents, which means they can be parsed, generated, and transformed with the same XML tools. This is powerful but also means XSD schemas are verbose — a schema definition is often longer than the documents it validates.
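
As a rough illustration of the point that schemas are ordinary XML, the sketch below parses a small XSD with the same standard-library parser used for any other document and lists the child elements it declares. The schema is a hypothetical one written for this example, and the code only reads the schema; it does not validate anything, since Python's standard library does not include an XSD validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema for a <product> element, written for this example.
XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="product">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="price" type="xs:decimal"/>
        <xs:element name="tag" type="xs:string"
                    minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

XS = {"xs": "http://www.w3.org/2001/XMLSchema"}


def describe(schema_root):
    """Return (name, type, minOccurs, maxOccurs) for each sequence member."""
    rows = []
    for el in schema_root.iterfind(".//xs:sequence/xs:element", XS):
        rows.append((
            el.get("name"),
            el.get("type"),
            el.get("minOccurs", "1"),   # XSD defaults both bounds to 1
            el.get("maxOccurs", "1"),
        ))
    return rows


declared = describe(ET.fromstring(XSD))
for row in declared:
    print(row)
```

The verbosity trade-off is visible here as well: the schema is several times longer than a typical product element it would describe.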

XSD is used by SOAP web services (WSDL files contain XSD schemas), XBRL (financial reporting), HL7 FHIR (healthcare), and most government data exchange standards.

DTD (Document Type Definition)

DTD is the original SGML schema language, inherited by XML. It's simpler than XSD but less powerful — DTDs can define elements and attributes but have limited type support (everything is essentially a string) and can't express complex constraints. DTDs use their own non-XML syntax, making them harder to process with standard XML tools.

The <!DOCTYPE html> declaration in HTML5 is a vestige of this mechanism: it keeps the DTD-style syntax but no longer references any DTD. DTDs are still used to define entities (character shortcuts like &copy; for ©, which XML itself does not predefine). For new schemas, XSD or RelaxNG are preferred.

RelaxNG

RelaxNG is a schema language designed to be simpler than XSD while remaining expressive. It comes in two syntaxes: an XML syntax and a compact non-XML syntax that's more human-readable. RelaxNG is technically more expressive than XSD in some areas (like supporting unordered content models) and is easier to learn.

RelaxNG is used by OpenDocument Format (ODF), DocBook, and several OASIS standards. It's less widely supported by tools than XSD but is often praised by developers who have used both.

Namespaces: Solving the Name Collision Problem

Namespaces are XML's answer to a practical problem: what happens when two different schemas both define an element called <title>? A book's title and an HTML page's title are different things, but they share a name.

XML namespaces use URIs (usually URLs, though they don't need to resolve) to uniquely identify vocabularies:

<invoice xmlns:cust="http://example.com/customer"
         xmlns:ship="http://shipper.com/schema">
  <cust:name>Acme Corp</cust:name>
  <ship:name>FedEx Ground</ship:name>
</invoice>

Both <name> elements coexist without ambiguity because they're in different namespaces. The prefix (cust:, ship:) is just a shorthand — the actual namespace is the URI.

Namespaces are powerful but confusing. Default namespaces (xmlns="..." without a prefix) apply to the element and all its descendants, which can cause unexpected behavior when nesting elements from different schemas. Namespace-unaware tools may break when encountering prefixed elements. This complexity is the single biggest complaint about XML from developers working with it for the first time.
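
The following sketch shows how this plays out in practice, using Python's standard-library ElementTree on the invoice example. The prefixes "a" and "b" in the lookup map are deliberately different from the ones in the document to show that only the URI matters; they are arbitrary names chosen for this illustration.

```python
import xml.etree.ElementTree as ET

INVOICE = """<invoice xmlns:cust="http://example.com/customer"
         xmlns:ship="http://shipper.com/schema">
  <cust:name>Acme Corp</cust:name>
  <ship:name>FedEx Ground</ship:name>
</invoice>"""

root = ET.fromstring(INVOICE)

# After parsing, prefixes are gone: each tag is "{namespace-URI}local-name".
tags = [child.tag for child in root]

# Query-side prefixes are independent of the prefixes used in the document.
ns = {"a": "http://example.com/customer", "b": "http://shipper.com/schema"}
customer = root.findtext("a:name", namespaces=ns)
shipper = root.findtext("b:name", namespaces=ns)

# A namespace-unaware lookup finds nothing, a common first-time surprise.
unqualified = root.find("name")

print(tags)
print(customer, "|", shipper, "|", unqualified)
```

The empty result for the unqualified lookup is the kind of behavior that tends to trip up developers new to namespaces: the elements are there, but their full names include the URI.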

No other common data format has namespaces. JSON, YAML, TOML, and CSV all assume a single vocabulary per document. This is fine for most applications but breaks down when you need to combine data from multiple independent schemas in one document — which is exactly what enterprise integration requires.

XSLT and XPath: Transformation and Query

XPath is a query language for selecting nodes in an XML document. It uses path expressions similar to file system paths: /catalog/product/name selects all <name> elements that are children of <product> elements that are children of the root <catalog> (assuming the elements are in no namespace; when a document declares a default namespace, the path must use a prefix bound to that namespace). XPath supports predicates (//product[@status='active']), functions (count(), string-length(), sum()), and axes (parent, ancestor, following-sibling, preceding-sibling, descendant). XPath is used by XSLT, XSD, and most XML processing libraries.
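
A small sketch of path expressions and predicates, using Python's standard-library ElementTree. ElementTree implements only a limited subset of XPath: paths are evaluated relative to an element, and functions such as count() and sum() are not available, so the aggregation below is done in Python instead. The three-product catalog is sample data invented for this example.

```python
import xml.etree.ElementTree as ET

# Sample data for illustration; no namespace, so plain names match.
CATALOG = """<catalog>
  <product id="P001" status="active"><name>Widget</name><price>29.99</price></product>
  <product id="P002" status="retired"><name>Gadget</name><price>9.50</price></product>
  <product id="P003" status="active"><name>Gizmo</name><price>14.00</price></product>
</catalog>"""

root = ET.fromstring(CATALOG)

# Relative form of /catalog/product/name, starting at the root element.
all_names = [n.text for n in root.findall("./product/name")]

# Attribute predicate: only products whose status attribute is 'active'.
active_ids = [p.get("id") for p in root.findall(".//product[@status='active']")]

# Stand-ins for XPath's count() and sum(), computed in Python.
active_count = len(active_ids)
total_price = sum(float(p.text) for p in root.iterfind("./product/price"))

print(all_names)
print(active_ids, active_count)
print(round(total_price, 2))
```

A full XPath engine would evaluate expressions like count(//product[@status='active']) directly; that requires a third-party library in Python.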

XSLT (Extensible Stylesheet Language Transformations) is a declarative language for transforming XML documents. An XSLT stylesheet defines template rules that match XPath patterns and produce output. XSLT can transform XML into different XML, HTML, plain text, or any text format. It's Turing-complete, meaning it can compute anything computable — though using XSLT for general computation is widely regarded as a war crime against readability.

Practical XSLT uses: transforming XML data feeds into HTML pages, converting between XML schemas (mapping one industry standard to another), generating reports from XML data, and converting XML to CSV or JSON. XSLT 3.0 (2017) added JSON support, streaming for large documents, and higher-order functions.

Where XML Remains Irreplaceable

XML lost the API format war to JSON around 2012-2015. New REST APIs return JSON. New databases store JSON. New config files use YAML or TOML. But XML dominates specific domains where its unique features are required:

| Domain | Format | Why XML |
|---|---|---|
| Office documents | DOCX, XLSX, PPTX (OOXML) | Mixed content (text + formatting). Document structure requires markup. |
| Vector graphics | SVG | Hierarchical element structure with attributes for styling/geometry. |
| Web feeds | RSS 2.0, Atom | Established standard. Self-describing with namespaces for extensions. |
| Enterprise APIs | SOAP + WSDL | Schema validation, namespace composition, formal contracts. |
| Authentication | SAML | XML Signature for cryptographic signing of assertions. |
| Financial reporting | XBRL | Extensible taxonomies via namespaces. Regulatory requirement. |
| Healthcare | HL7 FHIR, CDA | Complex data models with strict validation requirements. |
| Build systems | Maven (pom.xml), Ant, MSBuild | Established ecosystem, schema validation for configuration. |
| Android | Layouts, manifests, resources | Hierarchical UI structure with attribute-based properties. |
| Configuration | Java/.NET app configs, Spring | Schema-validated configuration with IDE support. |

The Verbosity Cost — and When It Doesn't Matter

XML is often 2-3x larger than equivalent JSON, though the exact ratio depends on the data. A JSON object {"name": "John", "age": 30} is 27 bytes. The XML equivalent <person><name>John</name><age>30</age></person> is 47 bytes, about 1.7 times as large (and nearly double the 24 bytes of the same JSON with whitespace removed). At scale, this means more bandwidth, more storage, and slower parsing.
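
The byte counts for this small record can be checked directly. The sketch below serializes the same data with Python's standard json and xml.etree.ElementTree modules and measures the UTF-8 length of each form; the variable names are arbitrary.

```python
import json
import xml.etree.ElementTree as ET

record = {"name": "John", "age": 30}

# JSON with default separators (", " and ": ") and with whitespace removed.
json_pretty = json.dumps(record)
json_min = json.dumps(record, separators=(",", ":"))

# The same record as element-only XML, serialized without a declaration.
person = ET.Element("person")
ET.SubElement(person, "name").text = record["name"]
ET.SubElement(person, "age").text = str(record["age"])
xml_bytes = ET.tostring(person)

sizes = {
    "json": len(json_pretty.encode("utf-8")),    # 27
    "json_min": len(json_min.encode("utf-8")),   # 24
    "xml": len(xml_bytes),                       # 47
}
print(sizes)
print(round(sizes["xml"] / sizes["json"], 2))    # 1.74
```

For records this small the overhead is dominated by the repeated tag names; longer field values bring the ratio down, while deeper nesting and short values push it up.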

But verbosity is often irrelevant:

  • Compressed transfer: XML and JSON compress to similar sizes with gzip. A 100KB XML file and a 50KB JSON file with the same data can both end up at roughly 10-15KB after compression, because XML's repetitive tag names compress extremely well.
  • Document formats: DOCX files are already ZIP-compressed. The internal XML is never transferred raw.
  • Enterprise integration: When processing a $10M financial transaction, nobody cares that the XML message is 3KB instead of 1.5KB.
  • Developer time: If XML's schema validation catches a malformed message before it enters your system, the bytes saved by using JSON are meaningless compared to the debugging hours saved.
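
The compression point can be tried out with a rough experiment. The sketch below builds the same synthetic records as JSON and as XML, gzips both with Python's standard library, and compares the size ratio before and after compression. The data set is made up for this illustration and is highly repetitive, which flatters both formats; real payloads will compress less well, so treat the numbers it prints as indicative only.

```python
import gzip
import json
import xml.etree.ElementTree as ET

# Synthetic records, invented for this comparison.
records = [{"name": f"User{i}", "age": 20 + i % 50} for i in range(2000)]

json_bytes = json.dumps(records).encode("utf-8")

root = ET.Element("people")
for rec in records:
    person = ET.SubElement(root, "person")
    ET.SubElement(person, "name").text = rec["name"]
    ET.SubElement(person, "age").text = str(rec["age"])
xml_bytes = ET.tostring(root)

json_gz = gzip.compress(json_bytes)
xml_gz = gzip.compress(xml_bytes)

# How much larger XML is than JSON, before and after gzip.
raw_ratio = len(xml_bytes) / len(json_bytes)
gz_ratio = len(xml_gz) / len(json_gz)

print(f"raw:  xml={len(xml_bytes)}  json={len(json_bytes)}  ratio={raw_ratio:.2f}")
print(f"gzip: xml={len(xml_gz)}  json={len(json_gz)}  ratio={gz_ratio:.2f}")
```

The gap narrows after compression because the repeated tag names become short back-references, leaving mostly the field values, which are identical in both formats.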

Converting XML to Other Formats

XML conversions are uniquely lossy because XML has features without equivalents in most target formats:

| Conversion | What's Lost |
|---|---|
| XML to JSON | Attributes vs. elements distinction, namespaces, processing instructions, comments, CDATA markers, mixed content |
| XML to CSV | All hierarchy, attributes, types. Only works for flat repeated elements. |
| XML to YAML | Same as JSON (YAML is a JSON superset). Attributes become special keys. |
| XML to TOML | Deep nesting, mixed content, attributes. Only works for simple config-like XML. |

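The reverse direction (JSON to XML, CSV to XML) preserves the data far better, because XML can represent everything these simpler formats contain, just more verbosely. The generated XML won't have attributes (JSON has no attribute concept) or namespaces. Some care is still needed for a clean round trip: JSON's distinction between numbers, strings, booleans, and null has to be encoded by convention, and JSON keys that are not valid XML names must be escaped or renamed.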
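
To illustrate where XML-to-JSON conversion loses information, here is a deliberately naive converter. It follows one common convention (attributes become "@"-prefixed keys, element text goes under "#text"); the function name to_dict and the sample fragment are inventions for this sketch, not a standard mapping.

```python
import json
import xml.etree.ElementTree as ET


def to_dict(elem):
    """Naive XML-to-dict mapping: '@attr' keys, '#text', children by tag."""
    node = {f"@{key}": value for key, value in elem.attrib.items()}
    for child in elem:
        # Children are grouped by tag name, so sibling order across
        # different tags is not preserved.
        node.setdefault(child.tag, []).append(to_dict(child))
    text = (elem.text or "").strip()
    if text:
        node["#text"] = text
    # Text that follows a child element (its "tail") is ignored here,
    # which is exactly how mixed content gets lost.
    return node


# Mixed content with a comment, invented for this example.
SRC = '<p id="x1">Ships <b>free</b> today<!-- promo --></p>'

converted = to_dict(ET.fromstring(SRC))
as_json = json.dumps(converted)
print(as_json)
```

In the output, the word "today" and the comment are gone, and nothing records that "free" sat between two runs of text. More careful conventions can keep some of this, but only by making the JSON harder to consume.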

XML isn't going anywhere. The domains where it dominates (document formats, enterprise integration, regulated industries, vector graphics) chose XML for features that no alternative provides. Parser-level schema validation, namespaces, mixed content, and XSLT have no built-in equivalent in JSON, YAML, or any other lightweight format. Until they do, XML will remain the backbone of these ecosystems.

The practical takeaway: don't use XML for new APIs (JSON won that battle), new config files (YAML or TOML are simpler), or simple data exchange (CSV or JSON). But when you encounter XML in the wild — and you will — understand that it's there for a reason. The verbosity you're paying for buys validation, composition, and transformation capabilities that save engineering time downstream.