Markdown occupies an unusual position among document formats: it is one of the few designed to be read comfortably without being rendered. A Markdown file is a plain text file that humans can read as-is, but that converts to structured HTML (or PDF, DOCX, or other formats) through simple parsing rules. # Heading becomes <h1>. **bold** becomes <strong>. [link](url) becomes <a href>.
John Gruber published the original Markdown syntax and a Perl conversion script on his site Daring Fireball in March 2004. His stated goal was creating a format that's "publishable as-is, as plain text, without looking like it's been marked up with tags or formatting instructions." Two decades later, Markdown is everywhere — GitHub READMEs, static site generators, note-taking apps (Obsidian, Bear, Notion), documentation platforms (GitBook, Docusaurus), and messaging apps (Slack, Discord).
But "Markdown" in 2026 is not a single format. It's a family of dialects with varying syntax rules, extension mechanisms, and rendering behaviors. This guide covers the core syntax, the standardization effort, the major dialects, and the conversion pipelines that make Markdown genuinely useful as a document format.
Core Markdown Syntax
Markdown's syntax maps to HTML elements. The mapping is intentional — Gruber designed it for web writing.
- Heading levels: # Heading 1 through ###### Heading 6
- Strong emphasis: **bold** or __bold__
- Emphasis: *italic* or _italic_
- Hyperlinks: [text](url)
- Images: ![alt](url)
- Block quotes: > blockquote
- Unordered lists: - item or * item
- Ordered lists: 1. item
- Inline code: `code`
- Code blocks: triple backticks or 4-space indent
- Horizontal rule: ---
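As a rough illustration of how this syntax maps to HTML, here is a toy converter in Python. It is a sketch under heavy assumptions, not a CommonMark-compliant parser: the function names `inline` and `to_html` are made up for this example, it handles only headings, paragraphs, bold, italic, inline code, and links, and it does no HTML escaping.

```python
import re


def inline(text):
    """Convert a few inline Markdown constructs to HTML (toy rules only)."""
    # Bold must run before italic so that ** is not consumed as two single *.
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    text = re.sub(r"`(.+?)`", r"<code>\1</code>", text)
    text = re.sub(r"\[([^\]]+)\]\(([^)\s]+)\)", r'<a href="\2">\1</a>', text)
    return text


def to_html(markdown):
    """Treat every non-blank line as either a heading or a paragraph."""
    blocks = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line:
            continue
        heading = re.match(r"(#{1,6})\s+(.*)", line)
        if heading:
            level = len(heading.group(1))
            blocks.append(f"<h{level}>{inline(heading.group(2))}</h{level}>")
        else:
            blocks.append(f"<p>{inline(line)}</p>")
    return "\n".join(blocks)


print(to_html("# Title\n\nSome **bold** and a [link](https://example.com)."))
```

Real parsers build a block tree first and resolve inline precedence afterwards, which is why a handful of regular expressions like these break down on nested or ambiguous input.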
Markdown also passes through raw HTML. Any HTML tag that appears in a Markdown file is included verbatim in the output. This escape hatch means Markdown can express anything HTML can — but once you're writing HTML in your Markdown, you've lost the readability advantage.
The original spec is deliberately minimal. It doesn't define tables, footnotes, definition lists, task checkboxes, math notation, or metadata. These are all extensions added by various implementations.
CommonMark: Standardizing the Ambiguities
Gruber's original Markdown spec left many edge cases undefined. What happens when you nest bold inside italic? How are list items with blank lines parsed? What's the precedence of different inline elements? Different implementations made different choices, meaning the same Markdown file could render differently in different tools.
CommonMark, launched in 2014 by John MacFarlane (creator of Pandoc) and others, addresses this by providing a comprehensive specification with over 600 examples defining exact behavior for every ambiguous case. The CommonMark spec (commonmark.org) is the closest thing to an authoritative Markdown standard.
Most modern Markdown parsers implement CommonMark as their base: markdown-it (JavaScript), cmark (C reference implementation), pulldown-cmark (Rust), goldmark (Go), and comrak (Rust). If a tool says it "supports Markdown," it almost certainly means CommonMark plus some extensions.
Gruber has not endorsed CommonMark and considers it a separate project from his original Markdown. The name was changed from "Standard Markdown" to "CommonMark" after Gruber objected. This political footnote matters because some tools still implement "original Markdown" rather than CommonMark, and the rendering differences, while usually minor, do exist.
GitHub Flavored Markdown (GFM)
GFM is CommonMark plus a set of extensions that GitHub uses. It's arguably the most widely used Markdown dialect because GitHub READMEs, issues, pull requests, and wiki pages use it by default. GFM adds:
- Tables: pipe-delimited syntax (| Header | Header |) with alignment support
- Task lists: - [ ] unchecked and - [x] checked
- Strikethrough: ~~deleted text~~
- Autolinks: bare URLs are automatically linked
- Fenced code blocks: triple backticks with language identifiers for syntax highlighting (the fences themselves are already part of CommonMark; GitHub adds the highlighting)
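The alignment support in GFM tables comes from the delimiter row under the header, where colons mark the aligned side of each column. The Python sketch below reads that row; the helper names `split_row` and `parse_alignments` are hypothetical, and the splitting is deliberately naive (it ignores escaped pipes and pipes inside code spans).

```python
from __future__ import annotations

import re


def split_row(row: str) -> list[str]:
    """Split one pipe-delimited table row into trimmed cell strings."""
    row = row.strip()
    if row.startswith("|"):
        row = row[1:]
    if row.endswith("|"):
        row = row[:-1]
    return [cell.strip() for cell in row.split("|")]


def parse_alignments(delimiter_row: str) -> list[str | None]:
    """Map each delimiter cell to 'left', 'center', 'right', or None."""
    alignments = []
    for cell in split_row(delimiter_row):
        # A delimiter cell is one or more hyphens with optional colons at the ends.
        if not re.fullmatch(r":?-+:?", cell):
            raise ValueError(f"not a delimiter cell: {cell!r}")
        left, right = cell.startswith(":"), cell.endswith(":")
        if left and right:
            alignments.append("center")
        elif left:
            alignments.append("left")
        elif right:
            alignments.append("right")
        else:
            alignments.append(None)
    return alignments


print(parse_alignments("| :--- | :---: | ---: | --- |"))
```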
GFM is formally specified (github.github.com/gfm/) and has its own test suite. The GitHub implementation uses cmark-gfm, a fork of the C reference implementation. Many non-GitHub tools also support GFM extensions because they've become de facto standard features.
The GFM spec does not include some popular Markdown extensions: footnotes, definition lists, table of contents generation, math notation, and metadata frontmatter. These are common in other dialects but not part of the GFM spec, although GitHub's own renderer has since added support for some of them, such as footnotes and math.
MDX: JSX Inside Markdown
MDX (Markdown + JSX) extends Markdown with the ability to import and render React components. A blog post in MDX can include an interactive chart component, a live code editor, or a custom callout box — all inline with regular Markdown text. MDX was created for documentation and blog platforms built on React (Next.js, Gatsby, Docusaurus).
An MDX file looks like regular Markdown with JSX blocks:
    # My Article

    Regular paragraph text.

    <Chart data={salesData} type="bar" />

    More paragraph text with **formatting**.

MDX compiles to a JavaScript module that exports a React component. Each Markdown element maps to a component that can be overridden: you can replace every <h1>, <p>, and <a> with custom React components. This makes MDX extremely powerful for design systems and documentation platforms.
The tradeoff: MDX files are not portable. They depend on React, specific component imports, and a build pipeline. A standard Markdown file works anywhere; an MDX file only works in an MDX-aware build system. Use MDX for documentation sites with interactive components. Use standard Markdown for everything else. Converting MDX to HTML requires the React build step — there's no simple parser that handles it.
Markdown for Note-Taking
Markdown has become the default format for knowledge management tools. Obsidian, Bear, Logseq, and Joplin are all built around Markdown, and Notion can export to it. The appeal is strongest in tools that keep notes as plain text files on disk: the files work with any text editor, version control system, or search tool, with no vendor lock-in and no proprietary database.
Obsidian has popularized its own Markdown extensions: [[wikilinks]] for internal links, ![[embedded notes]] for transclusion, %%comments%% for hidden text, and callout blocks. These extensions are not part of CommonMark or GFM, so most other Markdown tools won't render them correctly. This creates a tension: the notes are technically plain text files, but they use syntax that few tools besides Obsidian interpret.
For maximum portability: stick to CommonMark syntax, use standard [link](path) instead of [[wikilinks]], and keep media files in a /assets or /images directory with relative paths. This ensures your notes render correctly in any Markdown tool, any static site generator, and any future application you might switch to.
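For notes that already use wikilinks, the switch to standard links can be scripted. The following Python sketch is one possible approach, not Obsidian's own behavior: the function name `wikilinks_to_markdown` is hypothetical, it assumes every link target is a note saved as `<name>.md` next to the current file, and it leaves `![[embeds]]` untouched.

```python
import re
from urllib.parse import quote

# [[Target]] or [[Target|Label]], but not ![[Embed]] (negative lookbehind on "!").
WIKILINK = re.compile(r"(?<!!)\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def wikilinks_to_markdown(text):
    """Rewrite wikilinks as standard [label](path) links.

    Assumption: the target is a sibling note named "<target>.md".
    Heading links such as [[Note#Section]] are not handled.
    """
    def replace(match):
        target = match.group(1).strip()
        label = (match.group(2) or target).strip()
        # Percent-encode spaces so the path is a valid link destination.
        return f"[{label}]({quote(target)}.md)"

    return WIKILINK.sub(replace, text)


print(wikilinks_to_markdown("See [[Project Plan]] and [[ideas|my ideas]]."))
```

Running it over a copy of the vault first is the safer choice, since any link the pattern misinterprets is rewritten in place.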
Documentation as Code
The docs-as-code movement stores documentation in Markdown files alongside source code, using the same version control (Git), review process (pull requests), and build pipeline (CI/CD) as the codebase. Platforms like Docusaurus, MkDocs, GitBook, Sphinx (with MyST), and Hugo all build documentation sites from Markdown source files.
The workflow: write documentation in Markdown files in a /docs directory. Submit changes as pull requests. Reviewers check content the same way they review code. CI builds the documentation site. CD deploys it. The entire documentation history is in Git — every change is attributed, reversible, and diffable.
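One check a CI job in this workflow might run is a scan for broken relative links. The Python sketch below is an illustrative assumption rather than a feature of any particular platform: `find_broken_links` is a made-up name, it only understands inline `[text](target)` links, and it treats site-absolute paths such as `/assets/x.png` as filesystem paths, so those would need separate handling.

```python
import re
from pathlib import Path

# Inline links and images: captures the destination inside the parentheses.
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Anything that starts with a URL scheme (https:, mailto:, ...) is external.
SCHEME = re.compile(r"[a-z][a-z0-9+.-]*:", re.IGNORECASE)


def find_broken_links(docs_dir):
    """Return (file, target) pairs for relative links that point nowhere."""
    docs = Path(docs_dir)
    broken = []
    for md_file in sorted(docs.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in LINK.findall(text):
            if SCHEME.match(target) or target.startswith("#"):
                continue  # external URL or in-page anchor: not checked here
            path = target.split("#", 1)[0]  # drop any trailing #fragment
            if not (md_file.parent / path).exists():
                broken.append((md_file.relative_to(docs).as_posix(), target))
    return broken


if __name__ == "__main__":
    import sys

    problems = find_broken_links(sys.argv[1] if len(sys.argv) > 1 else "docs")
    for source, target in problems:
        print(f"{source}: broken link to {target}")
    sys.exit(1 if problems else 0)
```

A non-zero exit code fails the pipeline, so a pull request that breaks a link gets caught in review rather than after deployment.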
This approach scales well for technical documentation, API references, and developer guides. It works less well for documents that need rich formatting, complex tables, or precise visual layout. If you find yourself writing more HTML than Markdown in your documentation files, you've outgrown the format.
Popular Markdown-based static site generators: Hugo (Go, fast builds), Astro (JS, flexible), MkDocs (Python, Material theme), Docusaurus (React, versioned docs), Jekyll (Ruby, GitHub Pages native). Each adds its own extensions to Markdown — frontmatter for metadata, shortcodes for embeds, custom syntax for admonitions.
Converting Markdown: Pandoc and Beyond
Pandoc is the universal document converter, and Markdown is its native format. Pandoc reads Markdown (with extensive extension support) and outputs HTML, PDF (via LaTeX or wkhtmltopdf), DOCX, ODT, EPUB, LaTeX, RTF, and dozens of other formats. It's the Swiss Army knife for Markdown conversion.
Markdown to HTML (/md-to-html): This is Markdown's native conversion. Every Markdown parser outputs HTML. The result is semantic HTML with headings, paragraphs, lists, links, images, and code blocks. No styling — you add CSS separately.
Markdown to PDF (/md-to-pdf): Multiple paths. Pandoc can go through LaTeX (best typography) or wkhtmltopdf (HTML-based rendering). Direct tools like md-to-pdf (npm) use headless Chrome. The quality varies significantly by tool and template.
Markdown to DOCX (/md-to-docx): Pandoc generates clean DOCX from Markdown, mapping headings to Word heading styles, bold/italic to character formatting, and lists to Word list styles. The output is editable and looks professional. Custom reference documents let you control fonts and styles.
Markdown to LaTeX (/md-to-tex): Pandoc converts Markdown to LaTeX source, which you can then compile with pdflatex or xelatex. Useful for academic workflows where you want to write in Markdown but submit in LaTeX/PDF.
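The Pandoc conversions above can be driven from a script. This Python sketch wraps the `pandoc` command line and assumes Pandoc is installed and on the PATH; the helper names and the file names in the usage lines are hypothetical, while `-o`, `--standalone`, `--pdf-engine`, and `--reference-doc` are standard Pandoc options.

```python
import shutil
import subprocess


def pandoc_command(source, target, *, pdf_engine=None, reference_doc=None,
                   standalone=False):
    """Build the argument list; Pandoc infers formats from file extensions."""
    cmd = ["pandoc", source, "-o", target]
    if standalone:
        cmd.append("--standalone")  # full document instead of a fragment
    if pdf_engine:
        cmd.append(f"--pdf-engine={pdf_engine}")  # e.g. xelatex or wkhtmltopdf
    if reference_doc:
        cmd.append(f"--reference-doc={reference_doc}")  # DOCX styling template
    return cmd


def convert(source, target, **options):
    """Run Pandoc, raising if it is missing or exits with an error."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not installed or not on PATH")
    subprocess.run(pandoc_command(source, target, **options), check=True)


# Example invocations (file names are placeholders):
# convert("report.md", "report.html", standalone=True)
# convert("report.md", "report.pdf", pdf_engine="xelatex")
# convert("report.md", "report.docx", reference_doc="house-style.docx")
```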
Going the other direction: HTML to Markdown (via Turndown or similar), DOCX to Markdown (via Mammoth or Pandoc), and various import tools in note-taking apps. These conversions lose visual formatting but preserve structure.
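To make "lose visual formatting but preserve structure" concrete, here is a deliberately small HTML-to-Markdown sketch built on Python's standard `html.parser`. It is not how Turndown, Mammoth, or Pandoc work internally; the class and function names are invented for this example, and it covers only headings, paragraphs, bold, italic, and links.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class MiniHtmlToMarkdown(HTMLParser):
    """Emit Markdown for structural tags; ignore everything else."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.href = ""

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.parts.append("#" * int(tag[1]) + " ")
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag in ("em", "i"):
            self.parts.append("*")
        elif tag == "a":
            self.href = dict(attrs).get("href") or ""
            self.parts.append("[")
        # Tags such as <span style="..."> fall through: styling is dropped.

    def handle_endtag(self, tag):
        if tag in HEADINGS or tag == "p":
            self.parts.append("\n\n")
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag in ("em", "i"):
            self.parts.append("*")
        elif tag == "a":
            self.parts.append(f"]({self.href})")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_markdown(source):
    parser = MiniHtmlToMarkdown()
    parser.feed(source)
    parser.close()  # flush any buffered text
    return "".join(parser.parts).strip()
```

Feeding it a paragraph that contains a colored `<span>` returns the text without the color, while the heading level, emphasis, and link survive.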
Frontmatter: Metadata in Markdown Files
Frontmatter is a metadata block at the top of a Markdown file. It is most commonly YAML delimited by --- (some tools also accept TOML or JSON, with their own delimiters):
    ---
    title: My Article
    date: 2026-03-19
    author: Jane Smith
    tags: [markdown, writing]
    ---
    # The actual content starts here

Frontmatter is not part of any Markdown specification. It's a convention popularized by Jekyll and adopted by virtually every static site generator and documentation platform. The metadata in frontmatter is used by the build system for titles, dates, categories, SEO metadata, and template selection.
Different tools handle frontmatter differently. Some parse it and strip it from the output. Some render it as a table. Some don't recognize it at all and render the block as ordinary content. If you're sharing Markdown files across tools, know that frontmatter is a convention, not a standard, and may or may not be parsed correctly by the recipient's tool.
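A tool that wants to handle frontmatter predictably has to separate it from the body before parsing the Markdown. The Python sketch below shows one way to do that with only the standard library. The function name `split_frontmatter` is hypothetical, and the key/value parsing is a stand-in: it reads flat `key: value` lines as strings and is not a YAML parser, so lists and nested values stay as raw text.

```python
def split_frontmatter(text):
    """Return (metadata, body); metadata is {} when no frontmatter is found."""
    lines = text.splitlines()
    # Frontmatter must start on the very first line with a --- delimiter.
    if not lines or lines[0].strip() != "---":
        return {}, text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            metadata = {}
            for raw in lines[1:index]:
                if ":" in raw:
                    key, value = raw.split(":", 1)
                    metadata[key.strip()] = value.strip()
            body = "\n".join(lines[index + 1:])
            return metadata, body
    # No closing delimiter: treat the whole file as body.
    return {}, text


sample = """---
title: My Article
date: 2026-03-19
---
# The actual content starts here
"""
meta, body = split_frontmatter(sample)
print(meta["title"])
print(body)
```

In a real pipeline the lines between the delimiters would go to a proper YAML, TOML, or JSON parser, and the body would go to the Markdown parser.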
Markdown's dominance in technical writing is well-earned. It's simple enough to learn in 10 minutes, powerful enough for entire documentation sites, and portable enough that your content isn't locked into any single application. The format's minimalism is both its strength (easy to learn, easy to parse, easy to diff) and its limitation (no rich formatting, no precise layout, no visual design).
For most writing, Markdown is the right default. Write in Markdown, version control in Git, and convert to PDF or DOCX when you need to distribute. If you need rich formatting from the start, begin in a word processor. But if your content is destined for the web, documentation, or any text-first workflow, Markdown gets you there with less friction than any other format.