XLSX and CSV both store tabular data, but they represent fundamentally different philosophies about what a data file should contain. CSV is brutally simple: rows of values, columns separated by commas, nothing else. XLSX is a full spreadsheet application format — a ZIP archive containing XML files that encode not just data values but the entire computational and visual state of a spreadsheet.
This difference is not a matter of features — it's a matter of purpose. CSV is designed to move data between systems. XLSX is designed to be a complete working environment for a human spreadsheet user. Choosing the wrong format for your situation means either losing critical information (CSV when you needed formulas) or creating unnecessary complexity (XLSX when a database just needs the values).
What Each Format Stores
Understanding what XLSX contains — versus what it loses when you export to CSV — is the starting point for the right format choice.
What XLSX Stores
XLSX (Office Open XML Spreadsheet) is a ZIP archive. Unzip an .xlsx file and you'll find XML files inside. It stores:
- Cell values. The actual data: text, numbers, dates, booleans.
- Formulas. `=SUM(B2:B100)`, `=VLOOKUP(A2, Sheet2!A:B, 2, FALSE)`, and other cell formulas. These are stored as formula strings, usually with the last computed value cached alongside, and recalculated by the spreadsheet application.
- Cell formatting. Font, size, bold/italic, color, borders, fill color, number formats (`$#,##0.00`, `yyyy-mm-dd`, percentage).
- Multiple sheets. A single .xlsx file can contain dozens of worksheets, each with its own data and formatting.
- Charts and graphs. Bar charts, line charts, pivot charts — stored as embedded chart objects.
- Named ranges. A name such as `TaxRate` that refers to `$B$2`, which makes formula references human-readable.
- Data validation. Dropdown lists, numeric constraints, date ranges: the rules that prevent invalid data entry.
- Conditional formatting. Highlight rules — cells that turn red when below a threshold, color scales, data bars.
- Pivot tables. Summary aggregations with filters, groupings, and calculated fields.
- Merged cells. Cells spanning multiple rows or columns — common in reports and headers.
- Comments and notes. Reviewer annotations attached to specific cells.
- Shared strings. XLSX deduplicates repeated string values in a shared string table for file size efficiency.
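To see the ZIP structure for yourself, a minimal sketch using only Python's standard library is enough. The function name `list_xlsx_parts` and the file name used afterwards are illustrative, not part of any tool mentioned here.

```python
import zipfile

def list_xlsx_parts(path):
    """Return the names of the files stored inside an .xlsx archive."""
    # An .xlsx file is an ordinary ZIP archive, so zipfile can open it directly.
    with zipfile.ZipFile(path) as archive:
        return sorted(archive.namelist())
```

Calling `list_xlsx_parts("report.xlsx")` on a typical workbook (the file name is hypothetical) returns entries such as `[Content_Types].xml`, `xl/workbook.xml`, `xl/worksheets/sheet1.xml`, `xl/sharedStrings.xml`, and `xl/styles.xml`, which map onto several of the items in the list above.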
What CSV Stores (and What It Loses)
CSV stores exactly one thing: the computed values of cells. When you export an XLSX to CSV:
- Formulas → computed values. `=SUM(B2:B100)` becomes the number `4827.50`. The formula is gone. If the source data changes, the CSV won't update.
- Formatting → lost. Bold headers, currency formatting, color coding: none of it survives. Every cell is a plain string in CSV.
- Multiple sheets → one sheet. CSV can only represent a single table. When you export a multi-sheet XLSX to CSV, you either export one sheet or create multiple CSV files.
- Charts → lost. No chart format exists in CSV. Charts are visual representations of the data; CSV keeps the data only.
- Number formats → string representation. A date stored as a serial number with a date format in XLSX becomes either a date string (`2026-04-13`) or an Excel serial number (`46125`) in CSV, depending on how the export is performed. This is a frequent source of data corruption.
- Data types → ambiguous. CSV has no type system. The value `007` may be interpreted as the integer `7` when opened in Excel. The value `3/5` might be parsed as March 5, May 3, or the fraction 0.6 depending on locale and software.
The rule of thumb: CSV preserves the what but not the how. It's the data without the context, computation, or presentation.
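The type ambiguity is easy to reproduce. The sketch below uses Python's built-in `csv` module on made-up sample data: every field comes back as a string, and any typing is a decision the consumer makes. The helper `excel_serial_to_date` is an illustrative name, and it assumes Excel's default 1900 date system with the conventional 1899-12-30 epoch.

```python
import csv
import io
from datetime import date, timedelta

# Made-up sample: a zero-padded code and an ambiguous date-like value.
SAMPLE = "id,code,when\n1,007,3/5\n"

def read_rows(text):
    """Parse CSV text; every field is returned as a str."""
    return list(csv.DictReader(io.StringIO(text)))

def excel_serial_to_date(serial):
    """Convert an Excel serial day number to a date.

    Assumes the 1900 date system; the 1899-12-30 epoch is only
    correct for serials from 61 (1900-03-01) onward.
    """
    return date(1899, 12, 30) + timedelta(days=serial)

rows = read_rows(SAMPLE)
code = rows[0]["code"]   # still the string "007", leading zeros intact
as_int = int(code)       # 7: the zeros vanish once a tool decides it is a number
```

Nothing in the file says whether `3/5` is a date, a fraction, or text, so a pipeline that needs dates is better served by an unambiguous ISO string at export time than by a guess at import time.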
File Size Comparison
For small, simple tables, CSV is smaller than XLSX. For large datasets, XLSX's built-in ZIP compression often makes it the smaller file, and for complex spreadsheets with charts and formatting there is no CSV equivalent to compare against.
| Data Type | CSV Size | XLSX Size | Notes |
|---|---|---|---|
| 10,000 rows of numbers (5 columns) | ~450 KB | ~180 KB | XLSX ZIP compression wins on repetitive numeric data |
| 10,000 rows with text (mixed) | ~800 KB | ~350 KB | XLSX shared strings table deduplicates repeated values |
| Simple 100-row table, no formatting | ~5 KB | ~8 KB | XLSX overhead from XML structure and required files |
| Complex spreadsheet with charts, formatting | N/A (can't represent) | ~2-20 MB | Charts and embedded objects dominate size |
Counterintuitively, XLSX is often smaller than CSV for large datasets because XLSX uses ZIP compression internally. A 1 million-row CSV file might be 150MB; the equivalent XLSX could be 30-50MB. However, XLSX has baseline overhead from its XML structure — for small files (under ~200 rows), a plain CSV is usually smaller due to the XML scaffolding required even for empty XLSX files.
For sharing large datasets via email or messaging apps with file size limits, this matters. Check both sizes and send the smaller file: CSV usually wins for small, simple tables, while XLSX often wins for large datasets because its compression outweighs the format overhead.
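The compression effect can be illustrated without building a real workbook. The sketch below generates synthetic, repetitive numeric CSV data and runs it through DEFLATE (the algorithm ZIP archives normally use) via Python's `zlib`. It is only a rough proxy: a real XLSX wraps values in XML before compressing, so actual ratios will differ from what this prints.

```python
import zlib

def csv_bytes(n_rows):
    """Build a repetitive numeric CSV in memory (synthetic data)."""
    lines = ["a,b,c,d,e"]
    for i in range(n_rows):
        lines.append(f"{i},{i % 10},{i % 100},{i * 2},{1000 + i % 7}")
    return ("\n".join(lines) + "\n").encode("utf-8")

def deflated_size(data):
    """Size in bytes after DEFLATE compression at a mid-range level."""
    return len(zlib.compress(data, 6))

raw = csv_bytes(10_000)
print(len(raw), deflated_size(raw))  # raw size vs. compressed size
```

The same reasoning explains the small-file case: compression cannot save much on a few kilobytes of data, while the fixed set of XML files inside every XLSX still has to be stored.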
Compatibility Across Tools and Systems
Compatibility is the primary reason to choose CSV over XLSX in many data workflows:
CSV: Near-Universal Compatibility
Every mainstream programming language has a CSV parser. Every major database can import CSV. Nearly every BI tool, data warehouse, and analytics platform accepts CSV. It's plain text: any text editor can open it, and most scripting languages can process it with nothing more than the standard library.
- Database imports: PostgreSQL `COPY FROM`, MySQL `LOAD DATA INFILE`, BigQuery load jobs, and Snowflake `COPY INTO` all accept CSV.
- Programming languages: Python's built-in `csv` module, R's `read.csv`, and Go's `encoding/csv` ship with the language; pandas `read_csv` and Node's `csv-parse` are one package install away.
- Data tools: Jupyter notebooks, dbt, Spark, DuckDB, and Hadoop all treat CSV as a first-class input format.
- Legacy systems: Systems from the 1980s and 1990s that predate Excel's XLSX format can still read CSV. XLSX requires modern software.
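One caveat to "it's plain text": splitting lines on commas is not the same as parsing CSV, because quoted fields may themselves contain commas. The sketch below, using made-up sample data, contrasts a naive split with Python's built-in `csv` module.

```python
import csv
import io

# Made-up record whose address field contains a comma inside quotes.
LINE = 'Acme Corp,"123 Main St, Suite 4",Springfield\n'

# Naive approach: breaks the quoted address into two pieces.
naive = LINE.strip().split(",")

# Proper CSV parsing: the quoted field stays in one piece.
parsed = next(csv.reader(io.StringIO(LINE)))

print(len(naive), len(parsed))
```

Here the naive split yields four fields and the parser yields three, which is why even simple scripts are better off using a real CSV reader.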
XLSX: Spreadsheet-Ecosystem Compatibility
XLSX compatibility is excellent within the spreadsheet ecosystem but less universal for programmatic processing:
- Excel, Google Sheets, LibreOffice Calc: Native format support, full fidelity (with some compatibility notes between vendors for advanced features).
- Programming languages: Requires a library, such as openpyxl in Python, SheetJS (xlsx) in JavaScript, or Apache POI in Java. These libraries are mature and widely used but add a dependency.
- Cloud services: Google Drive, Dropbox, OneDrive all preview XLSX natively. Many SaaS platforms can import XLSX (HubSpot, Salesforce, Shopify).
- Databases: Databases don't natively import XLSX. You need an intermediate step: open in a spreadsheet app, export to CSV, then import. Some ETL tools handle XLSX directly but it's never a native database format.
The practical rule: if the data is going into a programming pipeline, database, or analytics tool, convert to CSV first. If the data is going to a human who will use it in a spreadsheet application, keep XLSX.
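As a small illustration of the CSV-to-database path, the sketch below loads CSV text into an in-memory SQLite table using only Python's standard library. SQLite is a stand-in chosen here so the example runs anywhere; production systems would use their own bulk loaders. The table name, sample data, and `load_csv` helper are all hypothetical.

```python
import csv
import io
import sqlite3

# Made-up inventory export; the second name contains a quoted comma.
CSV_TEXT = 'sku,name,qty\nA1,Widget,10\nB2,"Gadget, large",3\n'

def load_csv(conn, table, text):
    """Create a table from the CSV header and insert every data row."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    cols = ", ".join(f'"{c}"' for c in header)
    marks = ", ".join("?" for _ in header)
    # Columns are untyped, so values arrive as text, just as CSV stores them.
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', reader)
    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]

conn = sqlite3.connect(":memory:")
count = load_csv(conn, "inventory", CSV_TEXT)
```

The identifiers are interpolated into the SQL only because they come from trusted sample data; with untrusted headers you would validate the names first.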
When to Convert Between XLSX and CSV
The most common conversion scenarios:
- XLSX → CSV for database import. You have a spreadsheet of customer records, product inventory, or financial data that you need to import into a database or data warehouse. Export each sheet as CSV and use the database's bulk import tool. This is the most common XLSX-to-CSV use case. Use ChangeThisFile's XLSX to CSV converter.
- XLSX → CSV for Python/pandas analysis. Pandas can read XLSX directly (`pd.read_excel`), but `pd.read_csv` is often several times faster for large files (3-10x is a common ballpark) because CSV parsing is simpler. For one-time analysis, read XLSX directly. For repeated processing of large files, convert to CSV first.
- CSV → XLSX for sharing reports. You have a CSV data export that you want to format as a professional report: add headers, apply number formatting, create summary rows. Import the CSV into Excel or Google Sheets, apply formatting, and save as XLSX. Use CSV to XLSX conversion to start.
- XLSX → TSV for data pipelines that need tab-delimited output. If your data contains commas (common in XLSX — addresses, descriptions, formatted numbers), exporting to TSV instead of CSV avoids quoting issues. Use XLSX to TSV conversion.
- XLSX → CSV for ML training data. Machine learning pipelines typically expect flat CSV or TSV. Export the relevant sheet, ensure numeric values don't have thousands separators or currency symbols, and check that dates are in ISO format (YYYY-MM-DD) rather than locale-specific formats.
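The cleanup described in the last item can be sketched in a few lines. The helpers below are illustrative names, and they assume US-style source values: a dollar sign, comma thousands separators, a period as the decimal mark, and `MM/DD/YYYY` dates. Other locales would need different rules.

```python
from datetime import datetime

def clean_number(text):
    """Strip a dollar sign and thousands separators, then parse as float."""
    return float(text.replace("$", "").replace(",", "").strip())

def to_iso_date(text, fmt="%m/%d/%Y"):
    """Reformat a date string in the given format as ISO YYYY-MM-DD."""
    return datetime.strptime(text, fmt).date().isoformat()

print(clean_number("$1,234.50"))   # 1234.5
print(to_iso_date("04/13/2026"))   # 2026-04-13
```

Passing the format explicitly is deliberate: it forces a decision about whether `04/05/2026` means April 5 or May 4 instead of leaving it to a parser's guess.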
How to Convert Without Excel
You don't need Excel or Google Sheets to convert between XLSX and CSV. Several options work without any spreadsheet application:
Browser-Based Conversion (No Software Required)
ChangeThisFile's converters run entirely in your browser using JavaScript, so your file never leaves your device. Use the XLSX to CSV converter, or CSV to XLSX for the reverse. This is the fastest option for one-off conversions: no software installation, no file upload.
Python (pandas + openpyxl)
For programmatic or batch conversion:
```python
import pandas as pd

# XLSX to CSV: write one CSV file per worksheet
xl = pd.ExcelFile('input.xlsx')
for sheet_name in xl.sheet_names:
    df = xl.parse(sheet_name)
    # Sheet names become file names, so unusual characters may need sanitizing
    df.to_csv(f'{sheet_name}.csv', index=False)

# CSV to XLSX
df = pd.read_csv('input.csv')
df.to_excel('output.xlsx', index=False, engine='openpyxl')
```

Requires: `pip install pandas openpyxl`. Note that xlrd is not an alternative here: xlrd 2.0+ dropped support for XLSX and now reads only legacy XLS files, so use openpyxl for XLSX.
LibreOffice Command Line
LibreOffice Calc can convert headlessly:
```bash
# XLSX to CSV (converts the first sheet)
libreoffice --headless --convert-to csv input.xlsx
# Multiple files
libreoffice --headless --convert-to csv *.xlsx
```

Limitations: by default, LibreOffice's headless CSV export only handles the first sheet and uses the locale's default CSV settings (delimiter, encoding). For multi-sheet XLSX, you'll need pandas or a dedicated tool.
The XLSX vs CSV choice is usually obvious once you know who or what is consuming the data. Human spreadsheet users need XLSX — they need formulas, formatting, and the ability to see and edit all sheets. Databases, data pipelines, programming scripts, and ML tools need CSV — they need plain values they can parse without a spreadsheet engine.
When you need to switch: convert XLSX to CSV for data pipelines and database imports, or convert CSV to XLSX to bring data into a spreadsheet for formatting and analysis. For data containing commas, XLSX to TSV avoids the quoting complexity of CSV. All conversions run in the browser — no Excel, no uploads.