File conversion is one of those things everyone does and almost nobody understands. You've probably converted a PNG to JPG to make it smaller for email, or exported an Excel spreadsheet as CSV for a database import. But what actually happens when a file changes format? When is quality lost? When is it preserved perfectly? And why do some conversions take seconds while others take minutes?
This guide covers the entire landscape: nine format categories, the key formats in each, when and why you'd convert between them, and the technical details that matter when you need to get it right. Whether you're a designer optimizing images for the web, a developer wrangling data formats, or someone who just needs their HEIC photos to open on a Windows PC, this is the reference you'll want bookmarked.
At the lowest level, every file is just bytes. A format defines the structure -- which bytes mean what. Conversion reads those bytes according to one format's rules and writes them according to another's. Sometimes this is trivial repackaging (re-serializing the same keys and values in a JSON-to-YAML conversion). Sometimes it requires complex computation (resampling audio waveforms for an MP3 encode). The format pair determines the difficulty.
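As a small illustration of "which bytes mean what", many formats announce themselves with a fixed signature at the start of the file. The sketch below guesses a format from those leading bytes. The function name `sniff_format` and the short signature table are illustrative assumptions, not part of any particular tool, and real detectors check far more than this.

```python
# Leading "magic bytes" for a few formats mentioned in this guide.
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"%PDF-", "PDF"),
    (b"PK\x03\x04", "ZIP container (also DOCX, XLSX, EPUB)"),
    (b"fLaC", "FLAC"),
]


def sniff_format(data: bytes) -> str:
    """Guess a file's format from its first bytes, ignoring the extension."""
    for signature, name in MAGIC_SIGNATURES:
        if data.startswith(signature):
            return name
    return "unknown"
```

Passing the first few bytes of a file (for example `open(path, "rb").read(16)`) is enough for these signatures. Note that a ZIP signature alone cannot tell a DOCX from an EPUB; that needs a look inside the container.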
Images: Raster, Vector, and Everything Between
Image formats split into two fundamental types: raster (a grid of pixels) and vector (mathematical descriptions of shapes). Converting raster-to-raster is common and well-supported. Converting vector-to-raster is straightforward (render the math to pixels). Converting raster-to-vector is hard (you're trying to reverse-engineer shapes from pixel data).
Within raster formats, the big divide is lossy vs lossless:
| Format | Type | Best For | Typical File Size (12MP photo) |
|---|---|---|---|
| JPEG/JPG | Lossy | Photographs, complex images | 2-5 MB |
| PNG | Lossless | Screenshots, text, transparency | 8-15 MB |
| WebP | Both | Web delivery (lossy is 25-34% smaller than JPG, lossless is 26% smaller than PNG) | 1.5-4 MB |
| AVIF | Both | Next-gen web delivery (50% smaller than JPG) | 0.8-2 MB |
| GIF | Lossless (256 colors) | Simple animations, icons | Varies widely |
| HEIC | Lossy | iPhone photos (50% smaller than JPG) | 1-3 MB |
| BMP | Uncompressed | Legacy compatibility | 36+ MB |
| TIFF | Both | Print, archival, scientific imaging | 36+ MB (uncompressed) |
| SVG | Vector | Logos, icons, illustrations | 5-100 KB |
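The 36 MB figure for BMP and uncompressed TIFF follows directly from the pixel count. A minimal sketch of the arithmetic, assuming 8-bit RGB (3 bytes per pixel) and decimal megabytes; the function name is hypothetical:

```python
def uncompressed_size_mb(megapixels: float, bytes_per_pixel: int = 3) -> float:
    """Raw pixel data size in MB (10**6 bytes), ignoring headers and padding."""
    pixels = megapixels * 1_000_000
    return pixels * bytes_per_pixel / 1_000_000


print(uncompressed_size_mb(12))     # 36.0  (8-bit RGB)
print(uncompressed_size_mb(12, 4))  # 48.0  (8-bit RGBA, with an alpha channel)
```

Every compressed size in the table is a reduction from this baseline, which is why a 2-5 MB JPEG corresponds to roughly a 10:1 ratio.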
Common conversions and when to use them:
- Screenshot (PNG) to email (JPG) -- PNG screenshots are unnecessarily large for sharing. PNG to JPG at quality 85 cuts size by 70% with minimal visible difference.
- Any format to web (WebP) -- WebP is supported in all modern browsers and typically beats JPG and PNG on size at comparable quality. JPG to WebP or PNG to WebP is the standard web optimization move.
- iPhone photo (HEIC) to anything -- HEIC is Apple's default since iOS 11, but Windows and many web apps don't support it. HEIC to JPG is the most common conversion we see.
- Print prep (any to TIFF/PDF) -- Print shops often require TIFF or PDF at 300 DPI. Converting from JPG to TIFF doesn't add quality, but it meets format requirements.
All image-to-image conversions on ChangeThisFile run client-side using the browser Canvas API -- your images never leave your device. For a deeper format comparison, see PNG vs JPG vs WebP.
Video: Containers, Codecs, and Why It's Complicated
Video is the most complex format category because every video file has two layers: a container and one or more codecs.
The container (MP4, MKV, WebM, AVI, MOV) is the packaging. It defines how video, audio, subtitle tracks, and metadata are organized in the file. Think of it like a ZIP file that holds multiple streams.
The codec (H.264, H.265/HEVC, VP9, AV1) is the compression algorithm applied to the actual video data. The codec determines file size, quality, and playback compatibility.
| Container | Common Codecs | Best For |
|---|---|---|
| MP4 | H.264, H.265 | Universal compatibility (plays everywhere) |
| WebM | VP8, VP9, AV1 | Web embedding (royalty-free codecs) |
| MKV | Any codec | Archival (supports any codec + multiple tracks) |
| AVI | DivX, legacy | Legacy compatibility only |
| MOV | H.264, ProRes | Apple ecosystem, professional editing |
Common conversions:
- Phone video to shareable (MOV to MP4) -- iPhones record MOV by default. MOV to MP4 with H.264 encoding ensures playback on any device.
- Web video (any to WebM) -- MP4 to WebM with VP9 gives better compression for web embedding.
- Extract audio (MP4 to MP3) -- MP4 to MP3 pulls the audio track, useful for podcasts from video interviews.
- Legacy playback (any to AVI) -- Some older hardware and software only reads AVI. It's a compatibility fallback.
Video conversion always runs server-side using FFmpeg because transcoding requires substantial CPU power and memory. For the technical details, see How Video Compression Works.
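Because the container and the codec are separate layers, changing the container does not always require re-encoding. When the target container supports the codecs already in the file, FFmpeg can copy the streams as-is (a remux). The sketch below builds the argument list for both cases. The helper name `ffmpeg_args` and the default CRF of 23 are assumptions for illustration, not a description of how ChangeThisFile invokes FFmpeg.

```python
def ffmpeg_args(src: str, dst: str, reencode: bool, crf: int = 23) -> list[str]:
    """Build an FFmpeg argument list for either a remux or a transcode."""
    if reencode:
        # Transcode: decode, then re-compress video as H.264 and audio as AAC.
        codec_args = ["-c:v", "libx264", "-crf", str(crf), "-c:a", "aac"]
    else:
        # Remux: copy the existing streams into the new container untouched.
        codec_args = ["-c", "copy"]
    return ["ffmpeg", "-i", src, *codec_args, dst]


print(ffmpeg_args("clip.mov", "clip.mp4", reencode=False))
print(ffmpeg_args("clip.mov", "clip.mp4", reencode=True, crf=20))
```

A remux finishes in seconds and loses no quality; a transcode is the slow, lossy path and is only needed when the codec itself has to change.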
Audio: Lossy, Lossless, and the Perceptual Coding Trick
Audio formats divide cleanly into lossy and lossless. Lossy formats (MP3, AAC, OGG, Opus) discard audio data that most humans can't hear -- a technique called perceptual coding. Lossless formats (WAV, FLAC, AIFF) preserve every sample of the original recording.
| Format | Type | Typical Bitrate | 1 min stereo (44.1kHz) |
|---|---|---|---|
| WAV | Uncompressed | 1,411 kbps | ~10 MB |
| FLAC | Lossless compressed | ~900 kbps | ~6 MB |
| MP3 | Lossy | 128-320 kbps | 1-2.5 MB |
| AAC | Lossy | 128-256 kbps | 1-2 MB |
| OGG Vorbis | Lossy | 96-320 kbps | 0.7-2.5 MB |
| Opus | Lossy | 64-256 kbps | 0.5-2 MB |
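The numbers in this table can be reproduced from first principles. A short sketch, assuming CD-style PCM (44.1 kHz, 16-bit, stereo) and decimal megabytes; both function names are hypothetical:

```python
def pcm_bitrate_kbps(sample_rate_hz: int = 44_100, bit_depth: int = 16, channels: int = 2) -> float:
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


def size_mb(bitrate_kbps: float, seconds: float) -> float:
    """File size in MB (10**6 bytes) for a constant-bitrate stream."""
    return bitrate_kbps * 1000 * seconds / 8 / 1_000_000


print(pcm_bitrate_kbps())                   # 1411.2
print(round(size_mb(1411.2, 60), 1))        # 10.6  (one minute of WAV)
print(round(size_mb(192, 60), 2))           # 1.44  (one minute of 192 kbps MP3)
```

Dividing the two bitrates (1411.2 / 192) gives the roughly 7:1 size reduction that a 192 kbps MP3 offers over WAV.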
A critical principle: converting between lossy formats always loses quality. MP3-to-AAC re-encodes already-degraded audio, introducing a second generation of lossy compression. If you have the original WAV or FLAC, always convert from that source.
Common conversions:
- Podcast distribution (WAV to MP3) -- WAV to MP3 at 192 kbps is the standard for podcast hosting. Transparent quality at ~1/7 the file size.
- Archiving CDs (CD to FLAC) -- FLAC preserves full CD quality at ~60% the size of WAV, with the ability to reconstruct the original data bit-for-bit.
- Video audio extraction (MP4 to MP3) -- MP4 to MP3 strips the video track and re-encodes the audio. Useful for saving talks, lectures, or music videos as audio files.
Audio conversion runs server-side using FFmpeg. For a deeper dive into format tradeoffs, see Audio Formats Explained.
Documents: Structure vs Layout
Document formats fall on a spectrum from pure structure (Markdown, plain text) to pixel-precise layout (PDF). The further apart two formats are on this spectrum, the harder the conversion.
| Format | Model | Editable | Layout Fidelity |
|---|---|---|---|
| TXT | Plain text | Yes | None |
| Markdown | Semantic structure | Yes | None (rendered by viewer) |
| HTML | Semantic + CSS | Yes | Variable (depends on CSS) |
| RTF | Rich text with formatting | Yes | Basic |
| DOCX | XML-based (Office Open XML) | Yes | High |
| ODT | XML-based (Open Document) | Yes | High |
| PDF | Fixed layout | Barely | Exact |
Why PDF-to-DOCX is the hardest common conversion: PDF describes where to put each character on the page, not the document's logical structure. It doesn't know about paragraphs, headings, or tables -- just glyph positions. Converting PDF to DOCX requires reverse-engineering the structure from character positions, which is inherently lossy and imperfect. Simple PDFs convert well. Complex PDFs with multi-column layouts, tables, and images will have formatting issues.
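To make the "reverse-engineering from glyph positions" problem concrete, here is a toy heuristic that rebuilds text lines from positioned characters. It is a sketch under strong assumptions: the `Glyph` type, the coordinate convention, and the 2-unit tolerance are all invented for illustration, and real converters need much more (word spacing, columns, tables, fonts).

```python
from typing import NamedTuple


class Glyph(NamedTuple):
    char: str
    x: float  # horizontal position on the page
    y: float  # baseline position; larger y is assumed to be lower on the page


def glyphs_to_lines(glyphs: list[Glyph], y_tolerance: float = 2.0) -> list[str]:
    """Rebuild text lines from positioned glyphs (a toy heuristic)."""
    lines: list[list[Glyph]] = []
    for glyph in sorted(glyphs, key=lambda g: (g.y, g.x)):
        # Same line if the baseline is within tolerance of the line's first glyph.
        if lines and abs(glyph.y - lines[-1][0].y) <= y_tolerance:
            lines[-1].append(glyph)
        else:
            lines.append([glyph])
    # Within each line, reading order is left to right.
    return ["".join(g.char for g in sorted(line, key=lambda g: g.x)) for line in lines]


page = [Glyph("i", 10, 100.5), Glyph("H", 0, 100.0), Glyph("k", 10, 120.0), Glyph("o", 0, 120.4)]
print(glyphs_to_lines(page))  # ['Hi', 'ok']
```

Even this simple case depends on a guessed tolerance. A two-column page would break it immediately, because glyphs from both columns share the same baseline, which is one reason complex PDFs convert poorly.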
Common conversions:
- Submission (DOCX to PDF) -- DOCX to PDF for sending documents that should look the same everywhere.
- Editing (PDF to DOCX) -- PDF to DOCX when you need to edit a PDF. Expect formatting adjustments.
- Web publishing (DOCX to HTML) -- DOCX to HTML runs client-side using mammoth.js and extracts semantic HTML (headings, paragraphs, lists), discarding Word-specific formatting.
- Markdown workflow (MD to HTML) -- Markdown to HTML for static sites, documentation, and README rendering. Fully client-side.
Simple document conversions (DOCX to HTML, MD to HTML, HTML to MD) run client-side. Complex ones (DOCX to PDF, PDF to DOCX, DOC to anything) need server-side LibreOffice. For more, see Document Conversion Guide.
Data and Config: The Plumbing of Software
Data serialization formats are how software communicates and stores structured information. Converting between them is almost always lossless -- you're repackaging the same data in different syntax.
| Format | Structure | Human-Readable | Primary Use |
|---|---|---|---|
| JSON | Hierarchical | Yes | APIs, web data, configuration |
| YAML | Hierarchical | Very | DevOps config (K8s, Docker, CI/CD) |
| XML | Hierarchical | Somewhat | Enterprise, legacy systems, RSS |
| CSV | Flat (tabular) | Yes | Spreadsheets, databases, exports |
| TOML | Hierarchical | Very | Application config (Rust, Python) |
| TSV | Flat (tabular) | Yes | Database exports, scientific data |
| INI | Flat sections | Yes | Legacy Windows config |
The main conversion challenge with data formats is structural mismatch. Converting JSON (hierarchical) to CSV (flat) requires flattening nested objects. Converting CSV to JSON requires deciding whether column values should be strings, numbers, or booleans. These are interpretation choices, not data loss.
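Both interpretation choices can be sketched in a few lines. The dotted-key convention in `flatten` and the casting order in `infer_type` are assumptions chosen for illustration; other converters make different choices, which is exactly why results vary between tools.

```python
def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into one level with dotted keys (one possible convention)."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def infer_type(text: str):
    """Interpret a CSV cell as bool, int, float, or string (an arbitrary policy)."""
    if text.lower() in ("true", "false"):
        return text.lower() == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


print(flatten({"id": 1, "user": {"name": "Ada", "address": {"city": "London"}}}))
# {'id': 1, 'user.name': 'Ada', 'user.address.city': 'London'}
print(infer_type("42"), infer_type("3.5"), infer_type("true"), infer_type("abc"))
# 42 3.5 True abc
```

The type policy has consequences: under this sketch a ZIP code such as "007" becomes the integer 7, so columns that must stay textual need an explicit override.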
Common conversions:
- API integration (CSV to JSON) -- CSV to JSON for loading spreadsheet data into web applications.
- Config migration (JSON to YAML) -- JSON to YAML when moving from JSON-based config to YAML-based tools.
- Spreadsheet import (JSON to CSV) -- JSON to CSV for opening API data in Excel or Google Sheets.
- Cross-format (XML to JSON) -- XML to JSON for modernizing legacy data interchange.
All data format conversions on ChangeThisFile run client-side -- they're pure text transformations with no server dependency. For a detailed comparison, see CSV vs JSON vs XML.
Spreadsheets: Where Data Meets Formatting
Spreadsheet formats sit at the intersection of data and documents. A CSV is pure data. An XLSX file contains data plus formulas, formatting, charts, multiple sheets, cell styles, and conditional formatting. Conversion between them involves deciding what to preserve and what to discard.
| Format | Formulas | Multiple Sheets | Formatting | File Size (10K rows) |
|---|---|---|---|---|
| XLSX | Yes | Yes | Full | ~500 KB |
| XLS | Yes | Yes | Full (legacy) | ~800 KB |
| ODS | Yes | Yes | Full | ~400 KB |
| CSV | No | No | None | ~200 KB |
Common conversions:
- Data export (XLSX to CSV) -- XLSX to CSV extracts the active sheet's cell values. Formulas are evaluated and the results are exported. Formatting is discarded.
- Compatibility (XLS to XLSX) -- XLS to XLSX upgrades legacy Excel files to the modern format.
- Analysis prep (CSV to XLSX) -- CSV to XLSX brings raw data into Excel where you can add formulas and charts.
Spreadsheet conversions run client-side using the SheetJS library, which can parse and write XLSX, XLS, ODS, and CSV entirely in the browser.
Ebooks: Reflowable vs Fixed
Ebook formats divide into reflowable (text reflows to fit the screen) and fixed layout (pages look the same everywhere).
| Format | Layout | DRM | Ecosystem |
|---|---|---|---|
| EPUB | Reflowable | Optional | Universal (Apple Books, Kobo, most readers) |
| MOBI | Reflowable | Optional | Kindle (legacy) |
| AZW3/KF8 | Reflowable | Yes (Kindle DRM) | Kindle |
| FB2 | Reflowable | No | Russian-language readers |
| PDF | Fixed | Optional | Universal |
Common conversions:
- Kindle preparation (EPUB to MOBI) -- EPUB to MOBI for sideloading books onto older Kindles. (Amazon's Send to Kindle service now accepts EPUB and converts it on delivery.)
- Universal ebook (MOBI to EPUB) -- MOBI to EPUB to free a book from Kindle-only format.
- Print-ready (EPUB to PDF) -- Converting reflowable to fixed layout. Page breaks and layout may differ from the ebook reading experience.
Ebook conversion runs server-side using Calibre, the standard open-source ebook toolkit. For more detail, see Ebook Formats Guide.
Archives: Bundle and Compress
Archive formats do two things: bundle multiple files into one, and compress the data to reduce size. Some formats separate these concerns (TAR bundles, GZIP compresses, TAR.GZ does both). Others combine them (ZIP, 7Z, RAR).
| Format | Compression Ratio | Speed | Compatibility |
|---|---|---|---|
| ZIP | Good | Fast | Universal (built into every OS) |
| 7Z | Best (LZMA2) | Slower | Requires 7-Zip or compatible tool |
| TAR.GZ | Good (GZIP) | Fast | Unix/Linux native |
| TAR.BZ2 | Better (Bzip2) | Moderate | Unix/Linux |
| TAR.XZ | Best (LZMA2) | Slower | Modern Linux |
| RAR | Very good | Moderate | Proprietary (can extract, can't create without WinRAR) |
Common conversions:
- Maximum compression (ZIP to 7Z) -- ZIP to 7Z can reduce file size by 20-40% compared to ZIP's default DEFLATE compression.
- Compatibility (7Z to ZIP, RAR to ZIP) -- 7Z to ZIP and RAR to ZIP convert to the one format everyone can open without installing anything.
- Cross-platform (TAR.GZ to ZIP) -- Converting between Unix and Windows archive conventions.
Archive conversion extracts the source archive and re-archives in the target format. Directory structure and file permissions are preserved where both formats support them. All archive conversions run server-side using 7-Zip. For more, see Archive Formats Compared.
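The extract-and-re-archive step can be sketched with Python's standard library. This is an illustration of the idea only, not the 7-Zip pipeline described above: the function name `targz_to_zip` is hypothetical, everything is held in memory, and only regular files are carried over.

```python
import io
import tarfile
import zipfile


def targz_to_zip(targz_bytes: bytes) -> bytes:
    """Re-archive the regular files of a .tar.gz as a DEFLATE-compressed .zip."""
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(targz_bytes), mode="r:gz") as tar, \
            zipfile.ZipFile(out, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # directories, symlinks, and devices are skipped in this sketch
            extracted = tar.extractfile(member)
            # Paths are kept, so the directory layout survives; Unix permissions do not.
            zf.writestr(member.name, extracted.read())
    return out.getvalue()
```

The comment about permissions is the "where both formats support them" caveat in practice: anything the target format, or the tool writing it, does not record is dropped during the conversion.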
Fonts: Desktop vs Web
Font formats split into desktop formats (for installing on your OS) and web formats (for embedding in websites).
| Format | Use | Compression | Browser Support |
|---|---|---|---|
| TTF | Desktop, some web | None | All browsers |
| OTF | Desktop, professional | None | All browsers |
| WOFF | Web (first gen) | DEFLATE | All browsers |
| WOFF2 | Web (current standard) | Brotli (~30% smaller than WOFF) | All modern browsers |
Common conversions:
- Web optimization (TTF to WOFF2) -- TTF to WOFF2 compresses a desktop font for web use, typically reducing file size by 40-60%.
- Desktop install (WOFF2 to TTF) -- WOFF2 to TTF extracts a desktop-installable font from a web font.
Font conversions run server-side using Python fonttools. For more, see Font Formats Guide.
Lossy vs Lossless: When Quality Gets Lost
This is the single most important concept in file conversion. Lossy conversion permanently discards data to achieve smaller files. Lossless conversion preserves all original data exactly.
The analogy: lossless is like pouring water between different-shaped glasses -- the water (data) stays the same, only the container changes. Lossy is like distilling the water first -- you get something smaller and often good enough, but you can't un-distill it.
Lossy formats and what they discard:
- JPEG -- Discards high-frequency visual detail that humans are bad at perceiving. At quality 85, most people can't see the difference. At quality 50, compression artifacts become visible.
- MP3 -- Discards audio frequencies masked by louder sounds (psychoacoustic masking). At 192 kbps, most listeners can't distinguish from the original. At 64 kbps, the audio sounds "underwater."
- H.264/H.265 -- Discards visual detail between frames (temporal compression) and within frames (spatial compression). CRF 18-23 is visually transparent for most content.
The generation loss problem: Every lossy re-encoding degrades quality further. Converting JPG-to-PNG-to-JPG produces a worse JPG than the original. Converting MP3-to-WAV-to-MP3 produces a worse MP3. The intermediate lossless format doesn't restore what was already lost -- it just preserves the degraded version. Always convert from the highest-quality source available.
When lossy is the right choice: Sharing, streaming, web delivery, and storage-constrained scenarios. A 50MB WAV podcast episode is impractical; the same episode as a ~7MB MP3 at 192 kbps sounds identical to most listeners.
When lossless is essential: Archival, professional editing, medical/scientific imaging, and any workflow where the file will be processed further. Convert to lossy only at the final delivery step.
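The generation loss problem described above can be reproduced with a toy model. The sketch uses integer quantization as a stand-in for a lossy codec, which is an assumption made for simplicity: real codecs like JPEG and MP3 are far more elaborate, but the mechanism (each encode snaps data onto a coarser grid) is similar in spirit.

```python
def lossy_encode(samples: list[int], step: int) -> list[int]:
    """Toy lossy codec: snap every sample down to a multiple of `step`."""
    return [(s // step) * step for s in samples]


def mean_error(original: list[int], decoded: list[int]) -> float:
    """Average absolute difference from the original samples."""
    return sum(abs(a - b) for a, b in zip(original, decoded)) / len(original)


original = list(range(100))
first_gen = lossy_encode(original, step=10)       # first lossy encode
lossless_copy = list(first_gen)                   # "lossless" intermediate: an exact copy
second_gen = lossy_encode(lossless_copy, step=7)  # second lossy encode, different settings

print(mean_error(original, first_gen))   # 4.5
print(mean_error(original, second_gen))  # 7.5
```

The lossless intermediate copies the first-generation data exactly, so it cannot bring back anything the first encode discarded, and the second encode adds its own error on top.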
Metadata: What Survives Conversion
Files carry more than their primary content. Images have EXIF data (camera model, GPS coordinates, exposure settings). Audio files have ID3 tags (title, artist, album, cover art). Documents have properties (author, creation date, revision count). What happens to this metadata during conversion depends on the format pair and the conversion tool.
| Conversion | Metadata Outcome |
|---|---|
| JPG to PNG | EXIF is typically stripped (PNG gained an eXIf chunk only in 2017, and many tools don't write it) |
| PNG to JPG | Minimal metadata (no EXIF from PNG, JPG gets basic JFIF header) |
| MP3 to WAV | ID3 tags lost (WAV doesn't support ID3) |
| WAV to MP3 | New ID3 tags can be written, but originals may not transfer |
| DOCX to PDF | Document properties usually preserved (author, title) |
| CSV to JSON | No metadata to lose (CSV has no metadata standard) |
If metadata preservation matters for your use case (photographer workflows, music library management), verify that the specific conversion preserves what you need. Batch converting vacation photos from HEIC to JPG, for instance, may strip GPS data -- which might be exactly what you want for privacy, or might be a problem if you're organizing by location.
Quality Settings: What the Numbers Mean
Most lossy conversions expose a quality parameter. Understanding what it actually controls helps you make better tradeoffs:
| Format | Setting | Range | Sweet Spot |
|---|---|---|---|
| JPEG | Quality | 1-100 | 80-85 (visually transparent, ~10:1 compression) |
| WebP | Quality | 1-100 | 75-80 (better than JPG at same quality) |
| MP3 | Bitrate (kbps) | 64-320 | 192 kbps for music, 128 for speech |
| H.264 | CRF | 0-51 (lower = better) | 18-23 (visually transparent) |
| H.265 | CRF | 0-51 | 22-28 (similar quality at a higher CRF than H.264) |
| FLAC | Compression level | 0-8 | 5 (good balance; lossless at any level) |
The diminishing returns curve: Quality improvements follow a logarithmic curve. Going from JPEG quality 60 to 80 makes a visible difference. Going from 80 to 100 roughly doubles file size with barely perceptible improvement. The sweet spot is always on the knee of the curve -- the highest quality per byte.
For FLAC, the compression level doesn't affect quality at all -- every level is lossless. It only affects how much CPU time is spent finding optimal compression. Level 5 is the conventional default: significantly smaller than level 0, almost as small as level 8, and much faster to encode.
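The same principle holds for any lossless compressor. FLAC is not in Python's standard library, so the sketch below uses zlib (DEFLATE) as a stand-in, which is an assumption for illustration only: the level changes the output size and the CPU time spent, never the decoded data.

```python
import zlib

data = b"file conversion " * 4096  # highly repetitive sample input

for level in (1, 6, 9):
    packed = zlib.compress(data, level)
    # Every level round-trips to exactly the same bytes; only size and speed differ.
    assert zlib.decompress(packed) == data
    print(level, len(packed))
```

Contrast this with a JPEG quality slider or an MP3 bitrate, where a lower setting changes what comes back out.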
Client-Side vs Server-Side: Privacy and Speed Tradeoffs
Where a conversion runs -- in your browser or on a remote server -- affects both privacy and performance.
| Factor | Client-Side (Browser) | Server-Side (Remote) |
|---|---|---|
| Privacy | File never leaves your device | File uploaded to remote server |
| Speed (small files) | Faster (no upload/download) | Slower (network overhead) |
| Speed (large files) | Limited by browser memory/CPU | Faster (more resources) |
| Format support | Limited to browser-compatible libraries | Full (FFmpeg, LibreOffice, etc.) |
| Offline | Works offline once loaded | Requires internet |
| Max file size | ~1-2 GB (browser limit) | 50 MB (ChangeThisFile API limit) |
ChangeThisFile uses client-side processing for all image, data, spreadsheet, and simple document conversions (445 routes). Server-side processing handles video, audio, ebook, archive, RAW photo, and font conversions (546 routes). Each conversion page clearly indicates which method is used.
For a deep dive into the privacy implications and how to verify client-side claims, see Why Client-Side File Conversion Is Better for Privacy.
Batch Conversion: Processing Multiple Files
When you need to convert dozens or hundreds of files, single-file converters become impractical. Strategies for batch conversion:
- Client-side batch -- Drop multiple files on ChangeThisFile and they'll be converted sequentially in your browser. Good for up to ~20 files of moderate size.
- Command-line tools -- For larger batches, local tools are faster. `for f in *.png; do convert "$f" "${f%.png}.jpg"; done` (ImageMagick) or `ffmpeg -i input.mov output.mp4` (FFmpeg) can be scripted for thousands of files.
- API integration -- ChangeThisFile's `/v1/convert` API accepts programmatic conversion requests with API key authentication, useful for integrating conversion into automated workflows.
For batch image conversion, client-side processing has a unique advantage: since files never leave your device, you can process sensitive images in bulk without any of them touching a server.
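For the command-line route, a script can generate one command per file instead of relying on a shell loop. A minimal sketch, assuming FFmpeg is installed and that swapping the file extension is all the naming logic needed; the helper name `batch_commands` is hypothetical.

```python
from pathlib import Path


def batch_commands(sources: list[Path], target_ext: str) -> list[list[str]]:
    """Build one FFmpeg command per source file, swapping only the extension."""
    return [
        ["ffmpeg", "-i", str(src), str(src.with_suffix(target_ext))]
        for src in sources
    ]


# Usage (requires ffmpeg on PATH):
#   import subprocess
#   for cmd in batch_commands(sorted(Path(".").glob("*.mov")), ".mp4"):
#       subprocess.run(cmd, check=True)
```

Building the commands as argument lists rather than shell strings avoids quoting problems with filenames that contain spaces, and keeping the builder separate from `subprocess.run` makes it easy to inspect or log the batch before running it.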
File conversion is ultimately about moving data from where it is to where it needs to be, in the shape it needs to be in. Sometimes that's a photographer exporting HEIC photos as JPG for a client. Sometimes it's a developer converting XML config to YAML for a Kubernetes deployment. Sometimes it's a student converting a DOCX essay to PDF for submission.
The key to getting good results is understanding what's happening under the hood: whether quality is being lost, whether metadata is being preserved, and whether your file is being processed locally or sent to a remote server. Armed with that knowledge, you can make informed choices about formats, quality settings, and tools.
ChangeThisFile handles 991 conversion routes across 82 formats, with client-side processing for everything that doesn't absolutely require a server. Pick a conversion from any of the categories above and try it -- the fastest way to understand file conversion is to do one.