TSV and CSV are both plain-text tabular formats. Both store rows of data, with each row on its own line and columns separated by a delimiter character. The only structural difference is that character: a comma for CSV, a tab for TSV. But that single-character difference has cascading implications for escaping, compatibility, readability, and where each format fits in a data pipeline.

The choice between them is usually driven by what's in your data, not by personal preference. If your data contains commas — and most real-world data does (addresses, prices with thousands separators, free-text fields) — CSV requires quoting rules that are poorly standardized and frequently broken. TSV sidesteps this by using a delimiter that almost never appears in real data. But TSV has its own tradeoffs: tabs are invisible in most editors, and Excel doesn't open TSV files directly with a double-click.

Key Differences Between TSV and CSV

The structural difference is the delimiter. Everything else follows from that.

Feature                 | CSV                                                          | TSV
------------------------|--------------------------------------------------------------|---------------------------------------------------
Delimiter character     | Comma (,)                                                    | Tab (\t, ASCII 9)
Quoting required?       | Yes, when a field contains a comma, double-quote, or newline | Rarely (only if a field contains a tab or newline)
Escaping mechanism      | Double-quote wrapping, with "" for embedded quotes           | Backslash escape or no standard (format-dependent)
Excel double-click open | Yes (auto-detected)                                          | No (opens as a single column or via the import wizard)
Common file extension   | .csv                                                         | .tsv or .txt
MIME type               | text/csv                                                     | text/tab-separated-values
RFC standard            | RFC 4180 (2005)                                              | No formal RFC

The lack of a formal RFC for TSV is notable. CSV at least has RFC 4180 as a baseline, even if real-world files frequently violate it. TSV has no equivalent — there's no authoritative specification for how to handle tabs or newlines within field values. In practice, most TSV producers simply forbid tabs and newlines in field data, which is why the format works: the delimiter is rare enough that the edge case rarely arises.

CSV Quoting Rules (and Why They Break)

RFC 4180 defines CSV quoting precisely: any field containing a comma, double-quote, or newline must be enclosed in double-quotes. A double-quote inside a quoted field is escaped by doubling it (""). This sounds simple but creates problems in practice:

  • Inconsistent quoting. Different tools apply quoting rules differently. Some quote every field. Some quote only when necessary. Some quote strings but never numbers. When you import a CSV that was produced by a different tool, the parser may choke on inconsistent quoting.
  • The "always quote" vs "quote when needed" split. Excel always quotes fields that need it, and often quotes string fields even when they don't contain commas. Python's csv module with default settings only quotes when needed. Mixing these producers and consumers creates files that technically conform to RFC 4180 but parse differently across tools.
  • Multiline fields. RFC 4180 allows newlines inside quoted fields. But many CSV parsers that read line-by-line will break on multiline fields, treating each physical line as a separate row. This is the most common CSV corruption bug.
  • The quote-within-quote problem. A field containing the text He said "hello" becomes "He said ""hello""" in RFC 4180 CSV. Producers that use backslash escaping instead ("He said \"hello\"") create files that some parsers misread entirely.
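The doubling rule is easy to see directly with Python's standard-library csv module — a minimal demonstration of how a field containing both a comma and quotes round-trips:

```python
import csv
import io

# Write a row whose second field contains both a comma and double-quotes.
buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(["id", 'He said "hello", then left'])
print(buf.getvalue())  # id,"He said ""hello"", then left"

# Reading it back restores the original field exactly.
row = next(csv.reader(io.StringIO(buf.getvalue())))
print(row[1])  # He said "hello", then left
```

Note that the writer quoted only the field that needed it — Python's default is QUOTE_MINIMAL, which is exactly the "quote when needed" camp described above.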

TSV avoids all of this. If your data doesn't contain tabs (and it usually doesn't), you can write TSV without any quoting logic at all — just join fields with \t and join rows with \n.
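That serialization really is just two joins. A sketch, assuming fields are already free of tabs and newlines (the helper name is illustrative):

```python
def to_tsv(rows):
    """Serialize rows to TSV. Assumes no field contains a tab or newline."""
    return "\n".join("\t".join(str(field) for field in row) for row in rows)

rows = [
    ["name", "address", "price"],
    ["Acme, Inc.", "123 Main St, Suite 400", "1,000,000"],
]
print(to_tsv(rows))
# Commas pass through untouched -- no quoting logic needed.
```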

Character Encoding: The Problem Both Formats Share

Neither CSV nor TSV has a mechanism to declare character encoding. A file is bytes on disk; the consumer decides how to decode those bytes. This creates endemic problems:

  • UTF-8 vs. system encoding. On Windows, many tools default to the system locale encoding (Windows-1252 in the US), not UTF-8. A CSV file written by Python's csv module in UTF-8 will display mojibake (Ã© instead of é) when opened in Excel without specifying the encoding.
  • UTF-8 BOM. Microsoft's workaround: prepend a UTF-8 BOM (\xef\xbb\xbf) to the file. Excel detects the BOM and treats the file as UTF-8. But many non-Microsoft tools treat the BOM as data, prepending an invisible character to the first field of the first row. This breaks header-based parsing.
  • The tab character in data. If your data contains literal tabs (copied from a spreadsheet, embedded in text), TSV will silently corrupt. The tab will be interpreted as a delimiter, splitting one field into two. CSV with proper quoting handles embedded commas cleanly; TSV has no equivalent protection for embedded tabs.

The practical rule: if you control both ends of the pipeline, agree on UTF-8 and enforce it. If you don't control the consumer (especially if it might be Excel), use UTF-8 with BOM for CSV, or test TSV carefully.
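In Python, the BOM workaround is one codec name away: the 'utf-8-sig' encoding writes the BOM on output and strips it on input. A minimal sketch:

```python
import csv
import io

# 'utf-8-sig' prepends the BOM on write and strips it on read.
data = [["name", "city"], ["José", "Zürich"]]

buf = io.BytesIO()
text = io.TextIOWrapper(buf, encoding="utf-8-sig", newline="")
csv.writer(text).writerows(data)
text.flush()

raw = buf.getvalue()
print(raw[:3])  # b'\xef\xbb\xbf' -- the UTF-8 BOM Excel looks for

# Reading back with 'utf-8-sig' removes the BOM transparently,
# so the first header field is 'name', not '\ufeffname'.
rows = list(csv.reader(io.StringIO(raw.decode("utf-8-sig"))))
print(rows[1])  # ['José', 'Zürich']
```

With a file on disk, the same effect comes from open('out.csv', 'w', encoding='utf-8-sig', newline='').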

When to Use CSV

CSV is the right choice when:

  • End users will open the file in Excel or Google Sheets. Double-clicking a .csv file opens it correctly in Excel. A .tsv file requires the import wizard and encoding selection. If your users aren't technical, CSV reduces friction.
  • You're calling an API that returns or accepts CSV. Most APIs and SaaS export endpoints that output tabular data use CSV. The text/csv MIME type is well-supported in HTTP libraries. TSV has spotty API-level support.
  • Your data is safe from commas. Numeric data, identifier columns, ISO dates, and boolean values don't contain commas. A database export of a user table (ID, email, username, created_at) is perfectly safe as CSV without any quoting.
  • You need the broadest possible tool compatibility. CSV parsers are in every language, every database, every BI tool. TSV parsers are also common but slightly less universal — some tools accept TSV only as a special case of CSV with delimiter configuration.

Convert to CSV: convert TSV to CSV for Excel compatibility or API compatibility.

When to Use TSV

TSV is the right choice when:

  • Your data regularly contains commas. Addresses (123 Main St, Suite 400), descriptions, formatted numbers (1,000,000), or prose text all contain commas. Every one of those fields needs quoting in CSV. In TSV they write cleanly with no special handling.
  • You're piping data between Unix tools. cut treats tabs as its default delimiter, and awk and sort can be pointed at tabs (awk -F'\t', sort -t$'\t'). TSV fits naturally into shell pipelines: cut -f3 data.tsv extracts the third column cleanly.
  • You're doing bioinformatics or genomics work. BLAST output, BED format, VCF, and most bioinformatics tools use tab-delimited text as their standard. The convention is so established that TSV is effectively the default format in that domain.
  • You're generating data programmatically and don't need spreadsheet compatibility. Machine-to-machine pipelines that never touch Excel can use TSV without the Excel-compatibility overhead. The simpler serialization logic reduces the chance of quoting bugs.
  • You're working with NLP or ML training data. Text corpora, annotation files, and feature vectors often use TSV because the data contains commas. Libraries like Hugging Face's datasets and scikit-learn utilities both handle TSV natively.

Convert to TSV: convert CSV to TSV for Unix pipelines or bioinformatics tools.

How to Convert Between TSV and CSV

Converting between TSV and CSV is structurally simple — you're changing the delimiter. But several edge cases make it less trivial than a find-and-replace:

TSV to CSV

To convert TSV to CSV, you need to: (1) split each row on tab characters, (2) check if any field contains a comma, and (3) wrap fields that do in double-quotes (escaping any embedded double-quotes by doubling them). In Python:

import csv
import sys

reader = csv.reader(sys.stdin, delimiter='\t')  # split each row on tabs
writer = csv.writer(sys.stdout)                 # quotes fields as needed (RFC 4180)
for row in reader:
    writer.writerow(row)

The csv.writer handles quoting automatically — it will quote any field that contains a comma, double-quote, or newline. The fastest path: use ChangeThisFile's TSV to CSV converter, which handles the conversion in the browser with no upload.

CSV to TSV

Converting CSV to TSV requires: (1) parse CSV correctly (respecting quoted fields and escaped quotes), (2) strip the quoting, and (3) join fields with tabs. The dangerous shortcut — replacing commas with tabs — will corrupt any field that contained quoted commas. In Python:

import csv
import sys

reader = csv.reader(sys.stdin)                   # parses RFC 4180 quoting
writer = csv.writer(sys.stdout, delimiter='\t')  # joins fields with tabs
for row in reader:
    writer.writerow(row)

Watch for fields that contain tab characters — they'll produce extra columns in the TSV output. If your data might have embedded tabs, escape or strip them before converting. Convert CSV to TSV with ChangeThisFile for a no-code solution.
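One way to guard against those embedded tabs is to sanitize each field before writing. A sketch that replaces tabs and newlines with a space (the sanitize helper is illustrative; pick a replacement character that suits your data):

```python
import csv
import io

def sanitize(field):
    # Replace characters that would split one field into two
    # (tab) or one row into two (newline) in the TSV output.
    return field.replace("\t", " ").replace("\r\n", " ").replace("\n", " ")

src = io.StringIO('name,notes\n"Acme","line one\nhas\ttab"\n')
out = io.StringIO()
writer = csv.writer(out, delimiter="\t", lineterminator="\n")
for row in csv.reader(src):
    writer.writerow([sanitize(f) for f in row])
print(out.getvalue())  # name\tnotes\nAcme\tline one has tab\n
```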

Common Issues and How to Fix Them

Here are the problems that actually appear when working with TSV and CSV files:

Excel Opens TSV as Single Column

Excel associates the .csv extension with auto-detection logic that recognizes comma and semicolon delimiters. Files with the .tsv or .txt extension trigger the Text Import Wizard instead. Fix: rename the file to .csv, or use Excel's import flow (Data → Get External Data → From Text) and specify Tab as the delimiter. Alternatively, convert to CSV first: TSV to CSV.

CSV Has Garbled Characters (Mojibake)

Garbled characters like â€™ instead of ' indicate an encoding mismatch — the file is UTF-8 but the consumer interpreted it as Latin-1 or Windows-1252. Fix options: (1) open in a text editor and re-save as UTF-8 with BOM (for Excel), (2) specify the encoding explicitly in your parser (pd.read_csv('file.csv', encoding='utf-8')), or (3) use a tool that auto-detects encoding (such as chardet in Python).
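The mismatch is easy to reproduce — UTF-8 bytes decoded with the Windows-1252 codec yield exactly these artifacts:

```python
# UTF-8 bytes read with the wrong codec produce the classic artifacts:
# the two bytes of é become two Windows-1252 characters.
print("é".encode("utf-8").decode("windows-1252"))       # Ã©
# The three bytes of a curly apostrophe become three characters.
print("\u2019".encode("utf-8").decode("windows-1252"))  # â€™

# The fix is simply to decode with the encoding the file was written in.
raw = "café".encode("utf-8")
print(raw.decode("utf-8"))  # café
```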

CSV Rows Are Split Incorrectly

If your CSV has multiline fields (fields containing newlines, allowed in RFC 4180 when quoted), line-by-line parsers will split them incorrectly. The field "First line\nSecond line" will be read as two separate rows. Fix: use a proper RFC 4180 parser (csv.reader in Python, read.csv in R with default settings) rather than splitting on newlines manually. If you're debugging a file, check whether quoted fields span multiple lines.
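The difference between naive line splitting and a real parser is easy to demonstrate:

```python
import csv
import io

data = 'id,comment\n1,"First line\nSecond line"\n2,ok\n'

# Naive line splitting miscounts: the newline inside the quoted
# field looks like a row break.
print(len(data.strip().split("\n")))  # 4 physical lines

# An RFC 4180 parser keeps the multiline field together.
rows = list(csv.reader(io.StringIO(data)))
print(len(rows))   # 3 rows (header + 2 records)
print(rows[1][1])  # the quoted field, newline and all
```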

TSV Has Extra Columns / Row Counts Don't Match

If you convert CSV to TSV and some rows have more columns than expected, a field in the original CSV contained a tab character. Tabs in text pasted from spreadsheets, code editors, or certain web forms can silently embed as \t. Fix: sanitize input — strip or replace tab characters in string fields before writing TSV. If the data is already corrupted, you'll need to identify the affected rows (they'll have more fields than the header) and decide whether to strip or drop them.
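Finding the affected rows is a matter of comparing each row's field count against the header. A sketch, using made-up sample data with one embedded tab:

```python
import csv
import io

# Row 3 has an embedded tab that split "Bob\tjunior" into two fields.
tsv = "id\tname\tscore\n1\tAlice\t90\n2\tBob\tjunior\t85\n3\tCarol\t88\n"

rows = list(csv.reader(io.StringIO(tsv), delimiter="\t"))
header = rows[0]
# Collect (line number, row) pairs whose width doesn't match the header.
bad = [(i, row) for i, row in enumerate(rows[1:], start=2)
       if len(row) != len(header)]
print(bad)  # [(3, ['2', 'Bob', 'junior', '85'])]
```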

TSV and CSV are functionally equivalent for data that doesn't contain the delimiter character. The format choice is really about what's in your data and who consumes it. For data with commas — most natural language text, addresses, anything with decimal numbers formatted for humans — TSV produces cleaner files with fewer escaping bugs. For maximum tool compatibility, especially with spreadsheet software and APIs, CSV is the safe default.

When you need to switch formats: convert TSV to CSV for Excel and API compatibility, or convert CSV to TSV for Unix pipelines and NLP/ML tools. Both conversions happen client-side — your data never leaves the browser.