XLS and XLSX look almost identical when you open them in Excel. Same grid, same formulas, same charts. Under the hood, they're fundamentally different file formats with different architectures, different limits, and different security models. XLS is a binary format that needs a dedicated parser to read. XLSX is a ZIP archive of XML files that ordinary ZIP and XML tooling can open.
The format war ended years ago — XLSX won decisively. But XLS refuses to die. Financial institutions, government agencies, and manufacturing companies still generate millions of XLS files yearly, locked in by legacy macros, regulatory compliance, and sheer inertia. If you work with spreadsheets from multiple sources, you'll encounter both formats regularly.
Binary Format vs Open XML: The Architecture Gap
XLS uses the Binary Interchange File Format (BIFF), specifically BIFF8 for Excel 97-2003. It stores data as a stream of binary records inside a Compound File Binary Format (CFBF) container, the same OLE2 container used by old .doc and .ppt files. Each record has a type identifier, a length, and a payload. Reading a BIFF8 file means parsing a sequence of binary records: BOF (beginning of file), DIMENSIONS (used range), ROW, LABELSST (string cell), NUMBER (numeric cell), FORMULA, EOF.
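The record layout can be made concrete with a short Python sketch. It assumes each record starts with a 2-byte type identifier followed by a 2-byte payload length, both little-endian, and it walks a synthetic stream built inside the example. `iter_records` and `make_record` are illustrative names, not part of any library. In a real .xls file the workbook stream first has to be extracted from the CFBF container, which this sketch does not attempt.

```python
import struct

# A small subset of BIFF8 record type identifiers.
RECORD_NAMES = {
    0x0809: "BOF",
    0x0200: "DIMENSIONS",
    0x0208: "ROW",
    0x00FD: "LABELSST",
    0x0203: "NUMBER",
    0x0006: "FORMULA",
    0x000A: "EOF",
}


def iter_records(stream: bytes):
    """Yield (name, payload) for each record in a raw BIFF byte stream."""
    offset = 0
    while offset + 4 <= len(stream):
        # Header: 2-byte record type, then 2-byte payload length (little-endian).
        rec_type, length = struct.unpack_from("<HH", stream, offset)
        payload = stream[offset + 4 : offset + 4 + length]
        yield RECORD_NAMES.get(rec_type, f"0x{rec_type:04X}"), payload
        offset += 4 + length


def make_record(rec_type: int, payload: bytes = b"") -> bytes:
    """Build one record for the demo stream (hypothetical helper)."""
    return struct.pack("<HH", rec_type, len(payload)) + payload


# NUMBER payload: row, column, format index (2 bytes each), then an IEEE double.
number_cell = struct.pack("<HHHd", 0, 0, 0, 42.5)
demo_stream = (
    make_record(0x0809, bytes(16))      # BOF with a zeroed placeholder body
    + make_record(0x0203, number_cell)  # NUMBER cell at row 0, column 0
    + make_record(0x000A)               # EOF has no payload
)

records = list(iter_records(demo_stream))
names = [name for name, _ in records]
row, col, xf_index, value = struct.unpack("<HHHd", records[1][1])
```

A production reader also has to handle continuation records, string encodings, and malformed lengths, which is where much of the parser complexity comes from.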
XLSX uses Office Open XML (OOXML), which is a ZIP archive containing XML files. You can rename .xlsx to .zip, extract it, and read the XML with any text editor. Cell data is in xl/worksheets/sheet1.xml, styles in xl/styles.xml, shared strings in xl/sharedStrings.xml.
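As a rough illustration of how little tooling this takes, the sketch below reads cells using only Python's standard library. To stay self-contained it builds a tiny in-memory package holding just the two parts named above; that package is an assumption made for the demo and is not a complete workbook Excel would open. `read_cells` is a hypothetical helper that only handles plain shared strings, not rich text.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

SHARED_STRINGS = (
    '<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<si><t>Country</t></si><si><t>France</t></si></sst>"
)
SHEET = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<sheetData>"
    '<row r="1"><c r="A1" t="s"><v>0</v></c><c r="B1"><v>2024</v></c></row>'
    '<row r="2"><c r="A2" t="s"><v>1</v></c><c r="B2"><v>68.4</v></c></row>'
    "</sheetData></worksheet>"
)


def build_demo_package() -> bytes:
    """Zip the two XML parts into an in-memory archive (demo only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("xl/sharedStrings.xml", SHARED_STRINGS)
        zf.writestr("xl/worksheets/sheet1.xml", SHEET)
    return buf.getvalue()


def read_cells(package: bytes) -> dict:
    """Map cell references such as "A1" to text, resolving shared strings."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        sst_root = ET.fromstring(zf.read("xl/sharedStrings.xml"))
        sheet_root = ET.fromstring(zf.read("xl/worksheets/sheet1.xml"))
    strings = [
        si.findtext("m:t", default="", namespaces=NS)
        for si in sst_root.findall("m:si", NS)
    ]
    cells = {}
    for cell in sheet_root.iterfind(".//m:c", NS):
        raw = cell.findtext("m:v", default="", namespaces=NS)
        # t="s" means the value is an index into the shared strings table.
        cells[cell.get("r")] = strings[int(raw)] if cell.get("t") == "s" else raw
    return cells


cells = read_cells(build_demo_package())
```

Numeric cells come back as their raw text here; converting them, and applying number formats from xl/styles.xml, is left out to keep the sketch short.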
This difference has real consequences. Binary XLS files are opaque — you need a BIFF parser to read them, and bugs in those parsers have been a source of security vulnerabilities for decades. XML-based XLSX files are transparent — you can grep for a cell value, validate the XML against a schema, and write custom tools with basic XML parsing.
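One practical consequence is that the two containers can be told apart from their first few bytes, whatever the file extension claims. The sketch below assumes the standard CFBF and ZIP signatures; `sniff_container` is an illustrative name. It identifies the container only: a CFBF file may be a .doc, a .ppt, or a password-protected OOXML file, and a ZIP file may be any archive.

```python
CFBF_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound file header
ZIP_MAGIC = b"PK\x03\x04"                       # ZIP local file header


def sniff_container(head: bytes) -> str:
    """Classify a file by its leading bytes: 'cfbf', 'zip', or 'unknown'."""
    if head.startswith(CFBF_MAGIC):
        return "cfbf"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read just enough of a file on disk to classify its container."""
    with open(path, "rb") as fh:
        return sniff_container(fh.read(8))


legacy = sniff_container(CFBF_MAGIC + bytes(8))
modern = sniff_container(ZIP_MAGIC + bytes(8))
other = sniff_container(b"name,value\n")
```

This is useful as a first pass when files arrive from mixed sources with unreliable extensions.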
Row Limits, Column Limits, and Other Boundaries
The limits are the most commonly cited difference, and they're dramatic:
| Limit | XLS (BIFF8) | XLSX (Open XML) |
|---|---|---|
| Rows per sheet | 65,536 | 1,048,576 |
| Columns per sheet | 256 (A-IV) | 16,384 (A-XFD) |
| Characters per cell | 32,767 | 32,767 |
| Unique colors | 56 (color palette) | 16 million (full RGB) |
| Conditional formats per cell | 3 | Unlimited |
| Sort levels | 3 | 64 |
| Undo levels | 16 | 100 |
| Unique cell formats | 4,000 | 64,000 |
| Hyperlinks per sheet | 65,530 | 65,530 (as enforced by Excel) |
The 65,536 row limit is the one that actually hurts. Any dataset with more rows than that cannot be stored on a single XLS sheet. This is why data teams that still receive XLS files often convert them to XLSX as a first step: not for cosmetic reasons, but because the row limit prevents working with modern datasets.
The 256-column limit matters less often but is equally hard when you hit it. Wide survey data, genomics data, or denormalized exports from databases regularly exceed 256 columns.
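A quick pre-flight check against the row and column limits in the table can be sketched as follows. The constants come from the table above; `column_letters` and `fits` are hypothetical helper names, and the check covers only rows and columns, not the other limits.

```python
# Sheet limits taken from the comparison table: (max rows, max columns).
LIMITS = {
    "xls": (65_536, 256),
    "xlsx": (1_048_576, 16_384),
}


def column_letters(index: int) -> str:
    """Convert a 1-based column index to letters (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def fits(rows: int, cols: int, fmt: str) -> bool:
    """Return True if a rows x cols dataset fits on one sheet of the format."""
    max_rows, max_cols = LIMITS[fmt]
    return rows <= max_rows and cols <= max_cols


last_xls_column = column_letters(256)
last_xlsx_column = column_letters(16_384)
survey_fits_xls = fits(rows=70_000, cols=300, fmt="xls")
survey_fits_xlsx = fits(rows=70_000, cols=300, fmt="xlsx")
```

Running a check like this before export avoids discovering the limit through truncated data later.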
File Size: Binary vs Compressed XML
You might expect a binary format to be smaller than XML, and record for record it is: BIFF is more compact than the equivalent uncompressed XML. ZIP compression changes the picture on disk. A simple 100-row spreadsheet might be 30KB as XLS and 15KB as XLSX, even though the uncompressed XML inside the XLSX is 80KB or more.
The gap widens at scale. For files with thousands of rows and repetitive text data, the shared strings table plus ZIP compression makes XLSX significantly smaller. A 50,000-row dataset with country names and categories might be 8MB as XLS but only 3MB as XLSX. How the two compare for a given file depends on how repetitive the data is, so treat any fixed crossover point (such as a row count) as a rough guide and measure on representative files.
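The effect of the shared strings table and compression on repetitive text can be illustrated with a small experiment. This sketch uses made-up data (four country names repeated over 50,000 cells) and simplified cell markup, and it applies DEFLATE through `zlib` as a stand-in for the ZIP container. It compares XML sizes only and says nothing about the size of a real XLS file.

```python
import zlib

# Made-up, highly repetitive column of text values.
countries = ["Germany", "France", "Japan", "Brazil"]
values = [countries[i % len(countries)] for i in range(50_000)]

# Inline strings: every cell carries its own copy of the text.
inline_xml = "".join(
    f'<c t="inlineStr"><is><t>{name}</t></is></c>' for name in values
)

# Shared strings: each distinct string is stored once, cells hold an index.
unique = sorted(set(values))
index_of = {name: i for i, name in enumerate(unique)}
shared_xml = "".join(f"<si><t>{name}</t></si>" for name in unique) + "".join(
    f'<c t="s"><v>{index_of[name]}</v></c>' for name in values
)

sizes = {
    "inline_raw": len(inline_xml.encode("utf-8")),
    "shared_raw": len(shared_xml.encode("utf-8")),
    "inline_deflated": len(zlib.compress(inline_xml.encode("utf-8"))),
    "shared_deflated": len(zlib.compress(shared_xml.encode("utf-8"))),
}

for label, size in sizes.items():
    print(f"{label}: {size} bytes")
```

On data this repetitive, both layouts shrink dramatically once compressed. Real workbooks are less regular, so the savings will be smaller.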
The XLSB format (Binary Workbook) combines the best of both: binary storage inside a ZIP container. It's typically 30-50% smaller than XLSX and loads significantly faster. But XLSB sacrifices the XML readability advantage, and support outside Microsoft's own tools is much patchier than for XLSX.
Macro Security: The Split That Changed Everything
In XLS, macros live inside the same file as the data. There's no way to tell from the file extension whether an XLS file contains VBA code. You have to open it (or scan it) to find out. This is the primary reason macro viruses were so effective in the 1990s and 2000s — every email attachment ending in .xls was potentially dangerous.
XLSX addressed this by splitting macro-enabled workbooks into a separate format. A file saved as .xlsx cannot carry a VBA project; workbooks that contain macros use .xlsm. This simple naming convention lets email filters, IT policies, and users themselves make informed decisions about trust before opening a file.
When you convert XLS to XLSX, any macros in the original file are dropped. The data and formulas survive, but the VBA code does not. Excel warns you about this when you save, but many batch conversion tools remove the code silently. If you need to preserve macros, you must save as .xlsm instead. This is frequently a surprise for organizations migrating legacy XLS files that contain critical VBA automation.
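After a conversion, it is worth confirming that each output file has the extension its contents call for. The sketch below checks an OOXML package for a VBA project by looking for a part named `vbaProject.bin`, which is the conventional part name Excel uses (treated here as an assumption, since the package format identifies parts by content type rather than by name). `has_vba_project` and `expected_suffix` are hypothetical helpers, and the packages are minimal in-memory stand-ins, not real workbooks.

```python
import io
import zipfile


def has_vba_project(package: bytes) -> bool:
    """Return True if the OOXML package contains a vbaProject.bin part."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return any(
            name.lower().endswith("vbaproject.bin") for name in zf.namelist()
        )


def expected_suffix(package: bytes) -> str:
    """Pick the extension that matches the package contents."""
    return ".xlsm" if has_vba_project(package) else ".xlsx"


def make_package(part_names) -> bytes:
    """Build an in-memory ZIP with empty parts (demo only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in part_names:
            zf.writestr(name, b"")
    return buf.getvalue()


plain = make_package(["xl/workbook.xml", "xl/worksheets/sheet1.xml"])
with_macros = make_package(["xl/workbook.xml", "xl/vbaProject.bin"])

plain_suffix = expected_suffix(plain)
macro_suffix = expected_suffix(with_macros)
```

Detecting VBA inside a legacy .xls file is a different problem, because it requires reading the CFBF container rather than a ZIP archive.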
Why Organizations Still Use XLS
Despite being technically obsolete since 2007, XLS remains entrenched for specific reasons:
- Legacy VBA macros. Enterprise spreadsheets with thousands of lines of VBA code can't be casually migrated. The code itself can be carried over to XLSM, but macros that depend on Windows API calls, ActiveX controls, or COM automation were often written for the Excel 97-2003 environment and may behave differently (or break) when the workbook is re-saved and run in a current version.
- Regulatory compliance. Some industries have compliance processes that reference specific file formats. Changing from XLS to XLSX requires updating compliance documentation, revalidating processes, and sometimes getting regulatory approval. The cost of changing exceeds the cost of maintaining the status quo.
- Third-party integrations. Legacy ERP systems, mainframe exports, and industrial control systems may only output XLS. The vendor is gone, the code is unmaintained, and the output format is what it is.
- Excel for Mac 2011. The last version of Excel for Mac before the unified codebase had significant XLSX compatibility issues. Organizations that standardized on that version (common in education) may have settled on XLS for reliability.
Migrating from XLS to XLSX: A Practical Strategy
If you're managing a bulk migration, here's what works:
- Inventory. Find all XLS files and categorize them: data-only (no macros), macro-enabled (VBA code), and template (used to generate new files). Each category has a different migration path.
- Data-only files: batch convert. Files with no macros can be safely converted from XLS to XLSX in bulk. The data, formulas, and formatting all transfer. Spot-check a sample for formula accuracy, conditional formatting preservation, and chart rendering.
- Macro-enabled files: audit first. Before converting, inventory the VBA code. Simple macros (formatting, printing, data validation) usually work in XLSM without changes. Complex macros (API calls, ActiveX controls, UserForms) may need rewriting. Convert to XLSM, test every macro, and fix what breaks.
- Templates: convert and re-deploy. Templates need testing by the people who use them. Convert the template to XLTX or XLTM, have the original users create a test file from it, and verify the workflow still works.
- Update integrations. Any system that imports or exports XLS needs updating. Most modern libraries support both formats already, but the switch may need a configuration change or code update.
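The inventory and routing steps above can be sketched as a small planning function. It assumes the macro audit has already produced a set of file names known to contain VBA (`macro_files`, a hypothetical input) and only decides the target extension for each legacy file; it does not perform any conversion. `target_suffix` and `plan_migration` are illustrative names.

```python
from pathlib import Path

LEGACY_SUFFIXES = {".xls", ".xlt"}


def target_suffix(path: Path, has_macros: bool) -> str:
    """Choose the OOXML extension for a legacy workbook or template."""
    if path.suffix.lower() == ".xlt":
        return ".xltm" if has_macros else ".xltx"
    return ".xlsm" if has_macros else ".xlsx"


def plan_migration(paths, macro_files):
    """Return (source name, target name) pairs for every legacy file.

    macro_files is the set of file names flagged by a separate macro audit.
    """
    plan = []
    for src in sorted(Path(p) for p in paths):
        if src.suffix.lower() not in LEGACY_SUFFIXES:
            continue  # already migrated or not a spreadsheet
        has_macros = src.name in macro_files
        target = src.with_suffix(target_suffix(src, has_macros))
        plan.append((src.name, target.name))
    return plan


demo_plan = plan_migration(
    ["reports/q1.xls", "reports/Payroll.XLS", "forms/invoice.xlt", "notes.txt"],
    macro_files={"Payroll.XLS"},
)
```

In practice the path list might come from something like `Path(root).rglob("*")`, and the resulting plan can be reviewed by the file owners before any batch conversion runs.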
The migration is almost always worth it. XLSX files are more secure (no hidden VBA), more compact (for large data), more compatible (virtually every modern spreadsheet tool and library reads OOXML), and far easier to process programmatically. The 65,536-row limit alone justifies the effort for any data-heavy workflow.
The XLS-to-XLSX transition mirrors the broader shift from proprietary binary formats to open, inspectable standards. XLS served its purpose for a decade, but its binary opacity, row limits, and security model belong to a different era. For almost every workflow, XLSX is the better choice on capacity, transparency, security, compression, and interoperability.
If you're still working with XLS files, the path forward is clear: convert to XLSX for data-only files, audit and migrate macros to XLSM, and update any integrations that depend on the binary format. The longer you wait, the more technical debt accumulates around a format that Microsoft itself stopped improving in 2007.