Every file format you use daily exists because someone, at some specific moment, decided the existing options weren't good enough. GIF was created because CompuServe needed a color image format for dial-up modems. MP3 was born from a Fraunhofer research project that almost lost its funding. PNG exists solely because Unisys decided to enforce a patent on GIF's compression algorithm.
Understanding file format history isn't nostalgia; it's context. Why do heavily compressed JPEGs show 8x8 block artifacts? Because of computational limits in 1992. Why are there three competing next-gen image formats? Because of patent licensing disputes in 2015. Why does the world still use ZIP when better compressors exist? Because Phil Katz published the specification in 1989.
This guide traces the timeline from the earliest digital formats to the standards being ratified today, decade by decade, with the people, organizations, and technical decisions that shaped how we store information.
1960s-1970s: The Foundations
ASCII (1963) — Bob Bemer of IBM was a driving force on the standards committee that created the American Standard Code for Information Interchange: 128 characters, 7 bits each. Nearly every text format since is either ASCII or an extension of it (UTF-8, for example, encodes the first 128 code points identically). ASCII defined not just letters and numbers but also control characters (carriage return, line feed, tab, bell) that still affect file compatibility in 2026.
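To make the control-character point concrete, here is a minimal Python sketch. The sample strings and the helper name `is_seven_bit` are illustrative assumptions, not part of any standard.

```python
# Illustrative sketch: ASCII's 7-bit range and the CR/LF line-ending difference.

def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte fits in ASCII's 7-bit range (0-127)."""
    return all(b < 128 for b in data)

# The same two-line text saved with Unix (LF) and Windows (CR LF) line endings.
unix_text = b"hello\nworld\n"
windows_text = b"hello\r\nworld\r\n"

# Both are pure ASCII, yet they differ byte for byte.
print(is_seven_bit(unix_text), is_seven_bit(windows_text))  # True True
print(len(unix_text), len(windows_text))                    # 12 14

# Control characters named in the text, with their ASCII code points.
controls = {"bell": 7, "tab": 9, "line feed": 10, "carriage return": 13}
for name, code in controls.items():
    print(f"{name}: {code} (0x{code:02X})")

# Normalizing CR LF to LF makes the two byte strings identical again.
normalized = windows_text.replace(b"\r\n", b"\n")
```

A diff tool or checksum that compares the two files without normalizing will report them as different, which is the compatibility problem in practice.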
TIFF (1986, developed from mid-1970s work) — Aldus Corporation (later acquired by Adobe) created the Tagged Image File Format for desktop publishing. TIFF's design was ahead of its time: tagged fields allow arbitrary metadata, multiple compression options (none, LZW, Deflate, JPEG), and multi-page documents. It's still a standard for archival imaging and print workflows, and its tag-based approach resembles the one used by DICOM, the separate standard for medical imaging.
PCM/WAV foundation: Pulse-code modulation, the technique of sampling analog audio at regular intervals and quantizing each sample, was developed by Alec Reeves in 1937 and became the basis for digital audio. The Nyquist-Shannon sampling theorem (1928/1949) proved that sampling at more than twice the highest frequency in a band-limited signal captures all of its information. CD audio (1982) settled on 44.1 kHz, 16-bit PCM, a rate that comfortably covers the roughly 20 kHz limit of human hearing and that still defines the uncompressed audio found in most WAV files today.
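The sampling-and-quantizing idea can be sketched with Python's standard `wave` module. The 440 Hz tone, the 0.1 second duration, the mono channel count, and the function names are assumptions chosen to keep the example short.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44_100   # CD sampling rate in Hz
SAMPLE_WIDTH = 2       # bytes per sample (16-bit)
CHANNELS = 1           # mono keeps the sketch short; CD audio is stereo

def sine_pcm(freq_hz: float, seconds: float, amplitude: float = 0.5) -> bytes:
    """Sample a sine wave at regular intervals and quantize to signed 16-bit."""
    n_samples = int(SAMPLE_RATE * seconds)
    frames = bytearray()
    for n in range(n_samples):
        value = amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
        frames += struct.pack("<h", int(round(value * 32767)))
    return bytes(frames)

def to_wav(pcm: bytes) -> bytes:
    """Wrap raw PCM frames in a WAV (RIFF) header, entirely in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return buffer.getvalue()

pcm = sine_pcm(440.0, 0.1)
wav_bytes = to_wav(pcm)

# Highest frequency this sampling rate can represent.
nyquist_hz = SAMPLE_RATE / 2
print(len(pcm), nyquist_hz)  # 8820 22050.0
```

The WAV file is little more than these raw samples with a short header in front, which is why uncompressed audio files are so large.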
1980s: The Desktop Revolution
GIF (1987) — Steve Wilhite at CompuServe created the Graphics Interchange Format for transmitting color images over slow modem connections. GIF pairs an indexed palette of at most 256 colors per frame with LZW (Lempel-Ziv-Welch) compression, which works very well on flat-color graphics. GIF 89a added animation support and binary transparency. The LZW patent, held by Unisys, would later trigger a crisis.
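For readers curious about what LZW actually does, here is a textbook version in Python. It is a sketch of the general algorithm only: GIF's real bitstream also uses variable-width codes, clear and end-of-information codes, and a 12-bit code limit, all of which are omitted here. The function names are illustrative.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Textbook LZW: emit a dictionary code for each longest known prefix."""
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate          # keep extending the match
        else:
            codes.append(table[current])
            table[candidate] = next_code  # learn the new sequence
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the same dictionary on the fly while decoding."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Code refers to the entry that is about to be created.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        out += entry
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return bytes(out)

sample = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(sample)
print(len(sample), len(codes))  # 24 16
```

Repeated sequences collapse into single codes, which is why LZW suits images with large areas of identical pixels.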
ZIP (1989) — Phil Katz created the ZIP format and PKZIP tool after a legal dispute with System Enhancement Associates over the ARC format. Critically, Katz published the .ZIP specification (APPNOTE) publicly, allowing anyone to create ZIP-compatible tools. This openness made ZIP the universal archive format. Katz died in 2000 at age 37; his open specification remains one of the most consequential decisions in file format history. ZIP is still the standard for cross-platform archives.
MIDI (1983) — Dave Smith (Sequential Circuits) and Ikutaro Kakehashi (Roland) created the Musical Instrument Digital Interface as a hardware protocol, but the .MID file format for storing note sequences became equally important. MIDI files store instructions (note on, note off, velocity, program change), not audio — a 3-minute MIDI file is typically 30-50KB.
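The difference between storing instructions and storing audio is easy to see in bytes. This sketch builds raw MIDI channel messages only; a real .MID file also carries chunk headers and delta times, which are left out. The helper names are illustrative.

```python
def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Build a 3-byte Note On message: status 0x90 plus channel, note, velocity."""
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("channel must be 0-15; note and velocity must be 0-127")
    return bytes([0x90 | channel, note, velocity])

def note_off(channel: int, note: int, velocity: int = 0) -> bytes:
    """Build a 3-byte Note Off message: status 0x80 plus channel."""
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("channel must be 0-15; note and velocity must be 0-127")
    return bytes([0x80 | channel, note, velocity])

# Middle C (note 60) on the first channel, struck at velocity 100.
press = note_on(0, 60, 100)
release = note_off(0, 60)
print(press.hex(), release.hex())  # 903c64 803c00

# One second of CD-quality stereo PCM, for comparison.
pcm_bytes_per_second = 44_100 * 2 * 2
print(len(press) + len(release), pcm_bytes_per_second)  # 6 176400
```

Six bytes describe the note no matter how long it is held, while PCM spends 176,400 bytes on every second of sound.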
DOC (1983) — Microsoft Word's binary document format debuted with Word 1.0 for MS-DOS. The format evolved through Word 95, 97, and 2003, becoming more complex with each version. Its binary structure made it very hard for other software to support fully, creating one of the first major cases of document format lock-in.
1990s: The Internet Explosion
The World Wide Web (1991) and Netscape Navigator (1994) created explosive demand for efficient file formats. Many of the formats still in everyday use date from this decade.
JPEG (1992)
The Joint Photographic Experts Group published the JPEG standard (ITU-T T.81, ISO/IEC 10918-1) in 1992. It used the Discrete Cosine Transform (DCT), which Nasir Ahmed first proposed in 1972 but which only became practical for image compression in the late 1980s as computing power caught up. JPEG's 8x8 block structure was a compromise: larger blocks can compress better but were too slow for 1992-era hardware. Because each 8x8 block is transformed and quantized independently, heavily compressed JPEGs show blocky artifacts to this day.
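As a rough illustration of the transform at the heart of JPEG, here is a direct, unoptimized 2D DCT of one 8x8 block in plain Python. It is a sketch only: a real encoder also shifts pixel values by 128, quantizes the coefficients, and uses a fast algorithm rather than four nested loops. The function and variable names are illustrative.

```python
import math

N = 8  # JPEG block size

def dct_2d(block):
    """Orthonormal 2D DCT-II of an N x N block (direct form, O(N^4))."""
    def alpha(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out

# A perfectly flat block: every pixel has the value 100.
flat = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat)

# All the energy lands in the top-left (DC) coefficient, about 800.0;
# the other 63 coefficients are zero up to floating-point error.
print(round(coeffs[0][0], 6))
```

Smooth image regions behave much like this flat block, so most of their 64 coefficients are near zero and cheap to store. That is where JPEG's compression comes from.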
JPEG became the default photo format within a few years and remains the most widely supported image format in existence. Every camera, phone, browser, and image viewer supports it. PNG to JPG and WebP to JPG remain among the most common conversions.
PNG (1996)
In December 1994, Unisys began enforcing its patent on LZW compression, demanding royalties from software that created GIF files. The internet community responded with a coordinated effort to create a patent-free replacement. Thomas Boutell posted the first PNG specification draft on January 4, 1995, just weeks after the Unisys announcement. By October 1996, PNG 1.0 was published as a W3C Recommendation.
PNG used DEFLATE compression (which Phil Katz had already released as royalty-free in the ZIP specification) and surpassed GIF in nearly every technical dimension for still images: up to 48-bit color, a full alpha channel (8 or 16 bits) instead of binary transparency, and lossless storage of true-color images. The one thing PNG deliberately omitted was animation; its creators didn't want to replicate GIF's crude frame-based animation. APNG would later fill this gap, though Chrome did not support it until 2017. BMP to PNG and GIF to PNG are common migration paths.
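To show how little machinery sits between DEFLATE and a valid PNG, here is a sketch that writes and re-reads a 1x1 RGBA image using only `struct` and `zlib`. The helper names (`chunk`, `tiny_png`, `read_chunks`) are illustrative, and the reader skips CRC verification for brevity.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def chunk(kind: bytes, data: bytes) -> bytes:
    """A PNG chunk: length, type, data, then a CRC-32 over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)

def tiny_png(r: int, g: int, b: int, a: int) -> bytes:
    """Encode a single RGBA pixel as a 1x1 PNG."""
    # IHDR: width, height, bit depth 8, color type 6 (RGBA), then the
    # compression, filter, and interlace methods, all 0.
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 6, 0, 0, 0)
    # Each scanline starts with a filter-type byte (0 means no filter).
    raw_scanline = bytes([0, r, g, b, a])
    idat = zlib.compress(raw_scanline)  # zlib stream wrapping DEFLATE
    return (PNG_SIGNATURE + chunk(b"IHDR", ihdr)
            + chunk(b"IDAT", idat) + chunk(b"IEND", b""))

def read_chunks(png: bytes):
    """Yield (type, data) for each chunk; CRCs are not checked here."""
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        kind = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        yield kind, data
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC

png = tiny_png(255, 0, 0, 128)  # half-transparent red
chunks = dict(read_chunks(png))
pixel = zlib.decompress(chunks[b"IDAT"])[1:]  # drop the filter byte
print(list(pixel))  # [255, 0, 0, 128]
```

The pixel comes back bit for bit, alpha included, which is the lossless, full-alpha behavior that set PNG apart from GIF.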
MP3 and PDF
MP3 (1993) — Karlheinz Brandenburg and the team at Fraunhofer IIS in Erlangen, Germany, published MPEG-1 Audio Layer III. The project nearly died multiple times due to funding concerns. The breakthrough was psychoacoustic modeling: discarding frequencies that are masked by louder nearby frequencies. MP3 at 128 kbps compresses CD audio by about 11:1 and changed the music industry forever. Napster (1999) and the iPod (2001) were both built on MP3. WAV to MP3 and FLAC to MP3 remain common audio conversions.
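The 11:1 figure follows directly from the CD audio parameters given earlier (44.1 kHz, 16-bit, two channels). A quick check in Python, with a 3-minute track length assumed purely for illustration:

```python
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2               # stereo
MP3_BITRATE_BPS = 128_000  # 128 kbps

cd_bitrate_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
ratio = cd_bitrate_bps / MP3_BITRATE_BPS
print(cd_bitrate_bps, ratio)  # 1411200 11.025

def size_mb(bitrate_bps: int, seconds: int) -> float:
    """File size in megabytes (10^6 bytes) for a constant-bitrate stream."""
    return bitrate_bps * seconds / 8 / 1_000_000

track_seconds = 180  # assumed 3-minute song
cd_mb = size_mb(cd_bitrate_bps, track_seconds)
mp3_mb = size_mb(MP3_BITRATE_BPS, track_seconds)
print(cd_mb, mp3_mb)  # 31.752 2.88
```

On a dial-up connection, the difference between roughly 32 MB and under 3 MB per song was the difference between impractical and feasible.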
PDF (1993) — John Warnock, co-founder of Adobe, created the Portable Document Format as part of the "Camelot" project (1991). The goal: a format that preserves exact document layout regardless of the viewing software, fonts, or operating system. PDF 1.0 was proprietary; Adobe gradually opened it, and PDF became ISO 32000 in 2008. DOCX to PDF is the most common document conversion on the web.
Other 1990s Formats
HTML (1993) — Tim Berners-Lee's markup language for the web. Not a file format in the traditional sense, but HTML defined how text, images, and links are structured for browsers.
AVI (1992) — Microsoft's Audio Video Interleave container, based on RIFF. Simple but limited: no modern features like multiple subtitle tracks, chapters, or variable frame rates.
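MP4/QuickTime (1998/1991) — Apple's QuickTime container (MOV, 1991) was chosen in 1998 as the basis for the MPEG-4 file format, which was later standardized as MPEG-4 Part 14 (MP4, 2003). MP4, built from nested atoms (called boxes in the ISO specification), is the most widely used video container in 2026. Because the two formats share that structure, MOV to MP4 is largely a container remux.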
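The atom/box structure is simple enough to walk by hand: each box starts with a 4-byte big-endian size and a 4-byte type. The sketch below lists top-level boxes from a hand-built byte string. The sample data is an assumption for illustration and is not a playable file, and the helper names are hypothetical.

```python
import struct

def iter_boxes(data: bytes):
    """Yield (type, payload) for each top-level box in the byte string."""
    pos = 0
    while pos + 8 <= len(data):
        size, kind = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            (size,) = struct.unpack(">Q", data[pos + 8:pos + 16])
            header = 16
        elif size == 0:
            # The box extends to the end of the data.
            size = len(data) - pos
        if size < header:
            raise ValueError("corrupt box size")
        yield kind.decode("ascii", "replace"), data[pos + header:pos + size]
        pos += size

def make_box(kind: bytes, payload: bytes) -> bytes:
    """Build a box with a plain 32-bit size field."""
    return struct.pack(">I4s", 8 + len(payload), kind) + payload

# Hand-built fragment: a file-type box, an empty free box, and dummy media data.
sample = (make_box(b"ftyp", b"isom" + b"\x00\x00\x02\x00" + b"isommp41")
          + make_box(b"free", b"")
          + make_box(b"mdat", b"\x00" * 16))

for kind, payload in iter_boxes(sample):
    print(kind, len(payload))
```

A remux between MOV and MP4 mostly rewrites this box-level packaging and leaves the encoded video and audio untouched.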
SWF/Flash (1996) — Macromedia's Flash format dominated web multimedia from roughly 1998 to 2015. Adobe ended support for Flash Player at the end of December 2020. SWF files are now unplayable in browsers without specialized emulators (Ruffle).
2000s: The Open Standards Push
DOCX/XLSX/PPTX (2006) — Microsoft introduced Office Open XML, replacing the binary DOC/XLS/PPT formats. OOXML is technically a ZIP archive containing XML files, making it more accessible to third-party tools. Microsoft submitted it as an ISO standard (ISO/IEC 29500), though the standardization process was controversial — IBM and others accused Microsoft of manipulating national standards bodies. DOC to DOCX remains a common migration.
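The "ZIP archive containing XML files" description can be checked with Python's `zipfile` module. The sketch below builds a toy package in memory rather than a valid Word document: the part names follow the usual DOCX layout, but the XML is a stripped-down placeholder and the function names are illustrative.

```python
import io
import zipfile

def build_toy_package() -> bytes:
    """Create a DOCX-shaped ZIP in memory (placeholder XML, not a valid document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("_rels/.rels", "<Relationships/>")
        package.writestr(
            "word/document.xml",
            "<w:document><w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p>"
            "</w:body></w:document>",
        )
    return buffer.getvalue()

def list_parts(package_bytes: bytes) -> list[str]:
    """Return the names of the files stored inside the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return package.namelist()

def read_part(package_bytes: bytes, name: str) -> str:
    """Return one part of the package as text."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return package.read(name).decode("utf-8")

package_bytes = build_toy_package()
print(package_bytes[:2])          # b'PK' (the ZIP signature, Phil Katz's initials)
print(list_parts(package_bytes))
print("Hello" in read_part(package_bytes, "word/document.xml"))  # True
```

The same `list_parts` call works on a real .docx file read from disk, which is what makes the format approachable for third-party tools.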
FLAC (2001) — Josh Coalson released the Free Lossless Audio Codec, providing CD-quality audio at 50-60% the file size of WAV, with no licensing fees. FLAC became the standard for music archival and audiophile distribution. WAV to FLAC halves storage with zero quality loss.
H.264/AVC (2003) — The Joint Video Team (JVT), a collaboration between ITU-T's VCEG and ISO/IEC's MPEG, published H.264, which became the dominant video codec for well over a decade. H.264 achieved roughly 50% better compression than MPEG-2 (DVD-era) through variable block sizes, multiple reference frames, and context-adaptive entropy coding. YouTube, Blu-ray, and virtually every streaming service adopted it.
EPUB (2007) — The International Digital Publishing Forum created EPUB as an open ebook format based on XHTML and CSS. It replaced the fragmented landscape of proprietary ebook formats (though Amazon's Kindle ecosystem still uses its own formats). EPUB to MOBI bridges the compatibility gap.
WebM (2010) — Google released WebM (VP8 video + Vorbis audio in a Matroska container) as a royalty-free alternative to H.264 for the web. The "HTML5 video codec war" between Google (VP8/WebM) and Apple/Microsoft (H.264) lasted years before both sides effectively won — browsers now support both.
2010s: The Patent War and Next-Gen Formats
The 2010s were defined by a backlash against patent-encumbered formats. H.265/HEVC (2013) came with steep royalty demands from three separate patent pools (MPEG LA, HEVC Advance, and Velos Media), prompting the industry to fund royalty-free alternatives.
WebP (2010) — Google's image format based on VP8. It took a decade to achieve full browser support (Safari 14, September 2020), but by 2026 it's the standard recommendation for web images. JPG to WebP and PNG to WebP are core web optimization conversions.
HEIC/HEIF (2015) — The MPEG group's High Efficiency Image File Format (HEIF) is a container; HEIC is the variant that uses HEVC for compression. Apple adopted HEIC as the default iPhone photo format in iOS 11 (2017), making it the first lossy image format to seriously challenge JPEG's dominance. But HEVC's patent licensing nightmare limits adoption outside Apple's ecosystem. HEIC to JPG is one of the most-searched conversions.
AV1 (2018) — The Alliance for Open Media (Google, Mozilla, Microsoft, Amazon, Netflix, and others) created AV1 as a royalty-free successor to VP9 and competitor to HEVC. AV1 achieves 30-50% better compression than H.264 and matches or modestly beats HEVC in most published comparisons. Early encoders were very slow (10-100x slower than H.264), though faster encoders and hardware decoders have since been rolling out.
AVIF (2019) — AV1 applied to still images. AVIF can produce files around 50% smaller than JPEG at equivalent visual quality, and it supports HDR, wide color gamut, and both lossy and lossless modes. Browser support: Chrome 85+ (2020), Firefox 93+ (2021), Safari 16.4+ (2023). JPG to AVIF and PNG to AVIF represent the future of image compression.
2020s: Convergence and Cloud-Native Formats
JPEG XL (2022, ISO/IEC 18181) — Developed by the JPEG committee, JXL was designed as the one format to replace everything: lossy, lossless, HDR, animation, transparency, progressive decode, and, uniquely, lossless recompression of existing JPEG files (about 20% smaller with a bit-perfect roundtrip). Chrome shipped experimental JXL support behind a flag, then announced its removal in late 2022 and dropped it in early 2023, which sharply limited adoption. Safari 17+ supports it, and Firefox has it behind a flag. The format is technically superior but may not achieve critical mass without Chrome.
Parquet and Arrow (2013/2016, gained mainstream 2020s) — Apache Parquet (columnar storage for analytics) and Apache Arrow (in-memory columnar format) became standard data interchange formats, replacing CSV for large datasets. CSV to Parquet often provides 5-10x compression along with columnar query performance, because values of the same type and column are stored together.
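The row-versus-column distinction can be sketched without any Parquet library. In the example below, plain Python lists and JSON stand in for the real encodings, so it illustrates the layout idea only, not Parquet's actual file format. The data set and function names are invented for the example.

```python
import json
import zlib

# 1,000 invented sales records in row-oriented form (one dict per row).
rows = [
    {"country": "DE" if i % 2 else "FR", "year": 2020 + i % 3, "sales": i * 10}
    for i in range(1000)
]

def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    return {key: [record[key] for record in records] for key in records[0]}

def to_rows(columns):
    """Inverse pivot: rebuild the row dicts from the column lists."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*columns.values())]

columns = to_columns(rows)

# A query over one column touches only that column's list.
total_sales = sum(columns["sales"])
print(total_sales)  # 4995000

# Compare compressed sizes of the two layouts (results vary with the data).
row_bytes = zlib.compress(json.dumps(rows).encode("utf-8"))
col_bytes = zlib.compress(json.dumps(columns).encode("utf-8"))
print(len(row_bytes), len(col_bytes))
```

Grouping similar values together is what lets columnar formats apply type-specific encodings and skip columns a query never reads.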
The trend: formats are converging toward open, royalty-free standards. AV1 and AVIF gained ground in the video and image codec wars largely through the breadth of the industry coalition behind them. Container formats are consolidating (MP4 for video, ZIP for archives). Data formats are splitting: JSON/YAML for human-readable config, Parquet/Arrow for machine-scale analytics.
The Recurring Patterns
Across six decades of file format history, the same patterns repeat:
- Proprietary first, open eventually. DOC → DOCX (sort of). GIF (LZW patent) → PNG (DEFLATE, royalty-free). H.265 (patent pools) → AV1 (royalty-free). PSD is still proprietary, but PSD to PNG conversion tools have reverse-engineered enough of the format.
- Patent disputes accelerate open alternatives. The LZW/GIF crisis directly created PNG. The HEVC patent nightmare directly funded AV1. The pattern is so reliable that aggressive patent enforcement is a leading indicator of an open competitor emerging.
- Network effects trump technical superiority. ZIP isn't the best compressor. JPEG isn't the most efficient image codec. MP3 isn't the best audio format. But they arrived early, they were widely adopted, and they're still dominant decades later. JPEG XL is arguably the most capable image format yet, but without Chrome support, that may not matter.
- Containers outlast codecs. The MP4 container, whose design dates to the late 1990s, holds AV1 video from 2018. The MKV container works with nearly any codec. Containers are infrastructure; codecs are interchangeable components.
File format history is a story of tradeoffs between technical quality, patent freedom, backward compatibility, and adoption momentum. The best format doesn't always win — the most available one does. That's why JPEG still dominates photos 34 years after publication, and why AV1 and AVIF are winning the next generation through industry coalition rather than individual technical merit.
The next decade will likely see AVIF replace JPEG for web images, AV1 replace H.264 for video (once hardware encoding is ubiquitous), and the survival or death of JPEG XL depending entirely on Google's Chrome team. The formats may change, but the pattern won't: openness beats patents, availability beats quality, and backward compatibility beats everything.