Audiobooks and ebooks contain the same words but deliver them through completely different technology stacks. An ebook is a text file with formatting. An audiobook is hours of compressed audio. They share authors, titles, and ISBNs, but the formats, distribution channels, production processes, and user experiences are distinct.
The interesting territory is where they overlap: text-to-speech that generates audio from ebook text, media overlays that synchronize narration with highlighted text, and AI voices that are making the line between "audiobook" and "TTS" increasingly blurry.
Audiobook File Formats
Audiobook formats are audio containers with metadata support for chapters, cover art, and bookmarking:
| Format | Container | Codec | DRM | Chapters | Used By |
|---|---|---|---|---|---|
| M4B | MPEG-4 Part 14 | AAC | Optional (FairPlay) | Yes | Apple Books, iTunes |
| AA | Audible (legacy) | Audible codec | Audible DRM | Yes | Audible (legacy) |
| AAX | MPEG-4 + Audible | AAC | Audible DRM | Yes | Audible (current) |
| MP3 | MPEG Audio Layer 3 | MP3 | None | Via ID3 CHAP frames | Universal |
| FLAC | FLAC container | FLAC (lossless) | None | Via Vorbis comments | Audiophile/archival |
| OGG | Ogg container | Opus/Vorbis | None | Via Vorbis comments | Libre audiobooks |
M4B: The De Facto Standard
M4B is a renamed M4A (MPEG-4 audio) file with the .m4b extension signaling "audiobook" to media players. The extension tells Apple's software to enable bookmarking (remember playback position) and file the content under Audiobooks instead of Music. Technically, M4B and M4A are identical containers with AAC audio.
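One way to see that the two are the same container is to look at the `ftyp` box that normally opens an MPEG-4 file. The sketch below reads the four-character "major brand" from that box. It assumes the `ftyp` box comes first in the file, and the function name is made up for illustration. Some tools write `M4B ` for audiobooks and others write `M4A ` or a generic brand regardless of the extension, which is why players generally rely on the extension instead.

```python
import struct
from typing import Optional


def read_major_brand(data: bytes) -> Optional[str]:
    """Return the major brand from a leading ftyp box, or None if not found.

    Layout of the box header: 4-byte big-endian size, 4-byte type ("ftyp"),
    then the 4-byte major brand.
    """
    if len(data) < 12:
        return None
    _size, box_type = struct.unpack(">I4s", data[:8])
    if box_type != b"ftyp":
        return None
    return data[8:12].decode("ascii", errors="replace")


# Synthetic headers for demonstration (not taken from real files).
m4b_header = struct.pack(">I4s4sI", 20, b"ftyp", b"M4B ", 0) + b"mp42"
m4a_header = struct.pack(">I4s4sI", 20, b"ftyp", b"M4A ", 0) + b"mp42"
print(read_major_brand(m4b_header), read_major_brand(m4a_header))
```

To try it on a real file, pass the first 12 bytes, for example `read_major_brand(open("book.m4b", "rb").read(12))`, where `book.m4b` is a placeholder path.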
M4B supports: AAC audio at up to 320kbps (most audiobooks use 64-128kbps, which is sufficient for speech), chapter markers with timestamps, cover art embedded as JPEG/PNG, and metadata (title, author, narrator, series). DRM-free M4B files play in Apple Books, VLC, Bookplayer, Smart Audiobook Player, and most modern media players.
Audible AA and AAX
Audible (Amazon's audiobook platform) uses two proprietary formats. AA is the legacy format using Audible's own codec at low bitrates (roughly 32 kbps at its best quality tier), acceptable for speech but noticeably compressed. AAX is the current format using AAC in an MPEG-4 container with Audible's DRM layer. AAX quality is good, typically 64-128 kbps AAC, comparable to M4B.
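Bitrate translates directly into file size: size is bitrate times duration, divided by 8 to get bytes. The sketch below applies that to a 10-hour book at a few speech-range bitrates. It ignores container overhead, cover art, and variable-bitrate encoding, so treat the results as rough estimates.

```python
def estimated_size_mb(bitrate_kbps: float, hours: float) -> float:
    """Approximate audio payload size in megabytes (1 MB = 1,000,000 bytes)."""
    bits = bitrate_kbps * 1000 * hours * 3600
    return bits / 8 / 1_000_000


for kbps in (32, 64, 128):
    print(f"{kbps} kbps x 10 h = {estimated_size_mb(kbps, 10):.0f} MB")
```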
Both formats are DRM-locked to your Audible account. They play in the Audible app, Amazon Music, and Kindle devices. They don't play in generic media players or other audiobook apps without DRM removal. This lock-in is the same strategy Amazon uses with ebook formats — purchase through Amazon, listen through Amazon.
Ebook Formats (For Context)
Ebook formats are covered in detail elsewhere in this guide series:
- EPUB: Open standard, reflowable HTML/CSS content. Universal except Kindle.
- MOBI/AZW3: Amazon's Kindle formats. MOBI is dead; AZW3 (KF8) is current.
- PDF: Fixed layout, wrong for reading on variable screens but necessary for layout-dependent content.
The key difference from audiobooks: ebook formats store text that devices render visually. Audiobook formats store audio that devices play back. Converting between them requires either text-to-speech (ebook to audio) or speech-to-text (audio to ebook) — both fundamentally different from format conversion like EPUB to MOBI where the content is the same in a different container.
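The first step of any ebook-to-audio pipeline is getting plain text out of the markup. EPUB chapters are XHTML documents, so a minimal sketch using only Python's standard library might look like the following. The class and function names are illustrative, and real books need more care (footnotes, tables, image alt text, reading order).

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, inserting line breaks after block-level elements."""

    BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6", "li", "br"}
    SKIP_TAGS = {"head", "script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def xhtml_to_text(xhtml: str) -> str:
    """Return the readable text of one XHTML chapter, one block per line."""
    parser = _TextExtractor()
    parser.feed(xhtml)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


sample = (
    "<html><head><title>T</title></head><body>"
    "<h1>Chapter 1</h1><p>It was a <em>dark</em> night.</p>"
    "<p>Rain &amp; wind.</p></body></html>"
)
print(xhtml_to_text(sample))
```

An EPUB is a ZIP archive, so the standard `zipfile` module can supply each chapter's XHTML to this function; which entries are chapters is listed in the package's OPF manifest.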
Text-to-Speech: The Bridge Between Formats
Text-to-speech converts ebook text to synthetic speech in real time. It's built into every major reading platform: Kindle's VoiceView/text-to-speech, Apple Books' spoken content (via iOS), Google Play Books' Read Aloud, and Kobo's Rakuten TTS. The quality varies by platform and TTS engine.
TTS Is Not an Audiobook
Traditional TTS has significant limitations versus professional narration: monotone or awkward prosody (emphasis on the wrong syllables or words); incorrect pronunciation of proper nouns, place names, and domain-specific terms; no character differentiation (all dialogue sounds the same); no emotional variation matching the text's tone; and unnatural pacing that doesn't pause at the right moments.
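The pronunciation problem is the most mechanical of these, and SSML (the W3C Speech Synthesis Markup Language) has a `<sub alias="...">` element for it: the engine speaks the alias instead of the written word. The sketch below wraps known terms from a small lexicon. The function name, the lexicon entries, and the respellings are made up for illustration, and SSML support varies by TTS engine.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def apply_lexicon(text: str, lexicon: dict) -> str:
    """Return an SSML <speak> document with lexicon terms wrapped in <sub>."""
    if not lexicon:
        return f"<speak>{escape(text)}</speak>"
    # Longest terms first so "New Haven" wins over "New" if both are present.
    terms = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")

    out, pos = [], 0
    for match in pattern.finditer(text):
        out.append(escape(text[pos:match.start()]))
        term = match.group(1)
        out.append(f"<sub alias={quoteattr(lexicon[term])}>{escape(term)}</sub>")
        pos = match.end()
    out.append(escape(text[pos:]))
    return "<speak>" + "".join(out) + "</speak>"


# Hypothetical pronunciation guide entry.
ssml = apply_lexicon("Anna met Siobhan & Tom.", {"Siobhan": "shiv-awn"})
print(ssml)
```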
Professional audiobook narration involves character voices, emotional delivery, pacing matched to content (faster for action, slower for reflection), correct pronunciation of every name and term (narrators receive pronunciation guides), and studio-quality audio production. A 10-hour audiobook costs $2,000-$10,000+ for professional narration, or roughly $200-$1,000+ per finished hour.
AI TTS: The Quality Gap Closes
Since 2023, AI-generated speech has dramatically improved. Services like ElevenLabs, OpenAI's TTS API, Google's WaveNet/Journey voices, and Amazon Polly Neural produce speech that's often indistinguishable from human narration in short samples. Features that were exclusive to human narrators — emotional variation, natural pauses, emphasis matching context — are now possible with AI.
The remaining gap: long-form consistency. A human narrator maintains character voice consistency across 10 hours. AI voices can drift or produce inconsistent character voices in extended passages. Dialogue-heavy fiction with many characters is still noticeably better with human narration. Non-fiction, technical content, and single-narrator prose are where AI TTS excels.
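One practical reason long-form consistency is hard: a whole book is usually synthesized in many separate requests rather than one, so each chunk is generated with limited context. A minimal sketch of sentence-boundary chunking follows. The 4,000-character default is a placeholder assumption, not any particular service's limit, and the sentence splitter is deliberately naive.

```python
import re


def chunk_for_tts(text: str, max_chars: int = 4000) -> list:
    """Split text into chunks of at most max_chars, breaking between sentences."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        # Last resort: hard-split a single sentence longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(chunk_for_tts("One. Two! Three? Four.", max_chars=10))
```

Breaking at sentence boundaries avoids cutting a word or clause mid-utterance, but it does nothing for voice drift between chunks; that still depends on the engine.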
Several publishers now offer "AI-narrated" audiobooks at lower price points. Amazon's Virtual Voice program generates AI narrations from Kindle ebooks. Apple partnered with publishers for AI-narrated titles in Apple Books. The ethical debate (displacing narrators, quality expectations, disclosure) is ongoing, but the technology is here.
Platform Comparison: Audio vs Text
| Platform | Ebook Support | Audiobook Support | Dual Format |
|---|---|---|---|
| Amazon Kindle | AZW3, KFX, EPUB (via email) | AAX (via Audible) | Whispersync (switch between Kindle+Audible) |
| Apple Books | EPUB, PDF | M4B (Apple audiobooks) | Some titles offer ebook+audio bundle |
| Google Play | EPUB, PDF | MP3/AAC (Google audiobooks) | Separate purchases |
| Kobo | EPUB, PDF | Kobo Audiobooks (streaming) | Combined subscription plans |
| Libby/OverDrive | EPUB (Adobe ADEPT DRM) | MP3 (time-locked) | Borrow both from library |
| Spotify | None | Streaming (select titles) | N/A |
Amazon Whispersync: The Dual-Format Experience
Amazon's Whispersync for Voice synchronizes reading position between a Kindle ebook and its Audible audiobook. Start reading Chapter 5 on your Kindle, switch to Audible in the car, and the audiobook picks up where you stopped reading. Switch back to the ebook that evening, and it's at the point where you stopped listening.
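Amazon does not publish how Whispersync works, so the following is only a generic illustration of the underlying idea: given a table of alignment points pairing text positions with audio timestamps, a reading position can be mapped to a playback position by interpolating between the nearest points. The function name and the sample alignment data are hypothetical.

```python
from bisect import bisect_right


def text_to_audio_seconds(char_offset: int, sync_points: list) -> float:
    """Map a character offset to an audio time by linear interpolation.

    sync_points: list of (char_offset, audio_seconds) pairs, sorted by
    strictly increasing char_offset.
    """
    if not sync_points:
        raise ValueError("need at least one sync point")
    offsets = [point[0] for point in sync_points]
    i = bisect_right(offsets, char_offset)
    if i == 0:  # before the first point: clamp to the start
        return sync_points[0][1]
    if i == len(sync_points):  # at or past the last point: clamp to the end
        return sync_points[-1][1]
    (c0, t0), (c1, t1) = sync_points[i - 1], sync_points[i]
    return t0 + (t1 - t0) * (char_offset - c0) / (c1 - c0)


# Hypothetical alignment: character offset -> seconds into the narration.
alignment = [(0, 0.0), (1000, 60.0), (3000, 200.0)]
print(text_to_audio_seconds(2000, alignment))
```

The reverse direction (audio time back to text position) works the same way with the pairs swapped.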
This requires purchasing both the Kindle ebook and the Audible audiobook of the same title. Amazon often offers a discounted Audible price if you own the Kindle version (the audio add-on is sometimes priced as low as $1.99, and often falls in the $1.99-$7.49 range). It's the most seamless dual-format experience available, and it's a genuine competitive advantage for Amazon's ecosystem.
Library Audiobooks and Ebooks
Public libraries lend both ebooks and audiobooks digitally through OverDrive/Libby (the dominant platform), hoopla, cloudLibrary, and Palace Project. Loans are DRM-protected with time-locked licenses: access to the file expires after the lending period (typically 7-21 days).
Ebook lending uses Adobe ADEPT DRM on EPUB files or Kindle format via Amazon's library lending program. Audiobook lending delivers MP3 files with DRM or streaming audio. Library audiobooks are the same professional narrations sold commercially — same narrator, same production quality, just time-limited access.
Library audiobook demand consistently outstrips supply. Popular titles have weeks-long wait lists because publishers limit the number of simultaneous licenses. Ebooks have the same licensing model but typically shorter waits because more copies are available. For budget-conscious readers, the library is the free access point for both formats.
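The link between license counts and wait times can be made concrete with a deliberately crude model: if every copy is checked out for the full loan period, a patron's wait is the number of loan cycles ahead of them times the loan length. The function and the sample numbers below are illustrative assumptions, and real waits are usually shorter because people return titles early or let holds lapse.

```python
import math


def estimated_wait_days(holds_ahead: int, copies: int, loan_days: int = 14) -> int:
    """Worst-case wait, assuming every loan runs its full term."""
    if copies <= 0:
        raise ValueError("copies must be positive")
    position = holds_ahead + 1  # your own place in the queue
    cycles = math.ceil(position / copies)
    return cycles * loan_days


# Hypothetical title: 30 holds ahead of you, 5 licensed copies, 14-day loans.
print(estimated_wait_days(30, 5, 14))
# The same queue with 15 copies.
print(estimated_wait_days(30, 15, 14))
```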
Dual Format: Ebook + Audiobook
Several publishers and platforms offer bundled ebook + audiobook purchases at a discount versus buying separately:
- Amazon: Whispersync discounts (buy Kindle ebook, get Audible audiobook at reduced price)
- Libro.fm: DRM-free audiobooks, some bundled with ebook purchases from partner bookstores
- Kobo: Combined ebook + audiobook subscription plans
- Chirp: Discounted DRM-free audiobooks (BookBub's audiobook platform)
The ideal format pair is EPUB + M4B: an open ebook and an open audiobook, both DRM-free, playable on any device. In practice, most commercial audiobooks are either DRM-locked (Audible) or platform-specific. DRM-free audiobooks are available from Libro.fm, Google Play (some titles), and directly from some publishers (Tor, Macmillan Audio).
Audiobooks and ebooks serve different consumption patterns: commuting vs. sitting, multitasking vs. focused reading, auditory learners vs. visual learners. They're complementary, not competing. The most satisfying setup is access to both formats of the same book — read at home, listen in the car — which is why Amazon's Whispersync and library dual lending are popular despite the friction.
The AI TTS revolution is blurring the boundary. When an ebook can generate professional-quality audio on the fly, the distinction between "ebook with TTS" and "audiobook" becomes semantic. For publishers, this means lower production costs for audio. For readers, it means more titles available in audio. For professional narrators, it means a shifting market. The technology is here; the industry is still adapting.