TSV: Tab-Separated Values and When Tabs Beat Commas

Published Mar 19, 2026 6 min read By ChangeThisFile Team

Quick Answer

TSV (Tab-Separated Values) uses tab characters as delimiters instead of commas. This eliminates the quoting problems that plague CSV: commas in addresses, semicolons in European locales, and nested quoted strings. TSV is the standard in bioinformatics (BLAST, BED, VCF formats), the native clipboard format when copying from spreadsheets, and the most Unix-pipeline-friendly tabular format. For data containing free-form text, TSV is reliably cleaner than CSV.

TSV is CSV with one character changed — tab instead of comma — and that single change eliminates most of CSV's parsing headaches. Commas appear everywhere in data: addresses ("123 Main St, Apt 4"), descriptions ("fast, reliable, affordable"), European numbers ("3,14"). Every comma inside a value requires quoting and escaping. Tabs almost never appear in natural text, so TSV data rarely needs quoting at all.

This simplicity makes TSV the preferred format in specific domains where CSV's quirks cause real problems: bioinformatics (where gene annotations contain commas and semicolons), data exchange via clipboard (copy cells from Excel, paste into a text editor — that's TSV), and Unix pipelines (where cut -f2 extracts the second column cleanly without a CSV parser).

TSV vs CSV: The Practical Differences

Property	CSV	TSV
Delimiter	Comma (`,`)	Tab (`\t`)
Quoting needed	Frequently (commas in data, newlines)	Rarely (tabs are uncommon in data)
Locale issues	European locales use semicolons as delimiter	Tabs are locale-independent
Standard	RFC 4180 (loosely followed)	IANA text/tab-separated-values
File extension	.csv	.tsv or .tab
MIME type	text/csv	text/tab-separated-values
Clipboard format	Not standard	Default when copying from spreadsheets
Excel behavior	Auto-opens, auto-detects types (dangerous)	Opens with Text Import Wizard (slightly safer)
Unix tools	Requires CSV parser for quoted fields	`cut`, `awk`, `sort` work natively

The key insight: CSV's quoting mechanism (double quotes around fields containing delimiters) is a source of endless bugs. Unmatched quotes corrupt entire rows. Escaped quotes inside quoted fields ("He said ""hello""") confuse parsers. Different producers use different quoting conventions. TSV sidesteps all of this by using a delimiter that almost never appears in data.
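To make the quoting difference concrete, here is a small sketch using Python's standard csv module. The sample row is made up for illustration, and the TSV line is produced with a plain tab join on the assumption that no field contains a tab or newline.

```python
import csv
import io

def to_csv_line(row):
    """Serialize one row as CSV using the csv module's default (minimal) quoting."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(row)
    return buf.getvalue().rstrip("\n")

# Hypothetical record: an address, a quotation, and a European-style decimal.
row = ["123 Main St, Apt 4", 'He said "hello"', "3,14"]

csv_line = to_csv_line(row)
tsv_line = "\t".join(row)  # assumes no tabs or newlines inside the fields

print(csv_line)  # "123 Main St, Apt 4","He said ""hello""","3,14"
print(tsv_line)  # same three values, separated by tabs, with no added quoting
```

Every field in the CSV line ends up wrapped in quotes, while the TSV line carries the values unchanged and splits back into the original row with a single split on tabs.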

TSV in Bioinformatics: The Domain Standard

Bioinformatics standardized on tab-delimited formats decades ago, and the reasons illuminate why TSV is technically superior for data interchange:

BED format (Browser Extensible Data): Genome region annotations. Tab-separated, no header, minimal: chr1\t1000\t5000\tgene_name.
VCF format (Variant Call Format): Genetic variants. Tab-separated with a ## header section. The INFO field contains semicolon-separated key-value pairs — commas inside this field would be catastrophic in CSV.
GFF/GTF format (General Feature Format): Gene annotations. Tab-separated with semicolon-separated attributes in the 9th column.
BLAST output: Sequence alignment results. Output format 6 (selected with -outfmt 6; the default is the pairwise text report) is tab-separated: query_id\tsubject_id\tpct_identity\t...
SAM/BAM format: Sequence alignment. The SAM text format is tab-separated with complex fields containing colons and commas.

Notice the pattern: bioinformatics data frequently contains commas and semicolons within field values. Using commas as both delimiters and data would require complex quoting that breaks simple Unix tools. Tabs avoid the collision entirely.
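As a rough illustration of that layering, the sketch below splits a VCF-style data line on tabs first, then splits the INFO column on semicolons and commas. The record is invented, and the function is a simplified assumption about the column layout (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO), not a full VCF parser: it ignores header lines, missing values, and sample columns.

```python
def parse_vcf_line(line):
    """Split one VCF-style data line into its fixed columns and an INFO dict."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, vid, ref, alt, qual, flt, info = fields[:8]
    info_dict = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        # Flag-style keys have no "=value" part; store them as True.
        info_dict[key] = value.split(",") if sep else True
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": vid,
        "ref": ref,
        "alt": alt.split(","),  # several alternate alleles are comma-separated
        "qual": qual,
        "filter": flt,
        "info": info_dict,
    }

# Made-up record for illustration only.
line = "chr1\t1000\trs123\tA\tG,T\t50\tPASS\tDP=100;AF=0.5,0.25;DB\n"
record = parse_vcf_line(line)
print(record["alt"], record["info"])
```

Because the outer delimiter is a tab, the commas and semicolons inside ALT and INFO never need quoting, and each level can be split with a plain string operation.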

When working with bioinformatics data in spreadsheet tools, convert TSV to XLSX for viewing in Excel or convert TSV to CSV for tools that specifically require comma delimiting.

Clipboard Copy: Spreadsheets Speak TSV

When you select cells in Excel, Google Sheets, or LibreOffice and press Ctrl+C, the plain-text representation placed on the clipboard is tab-separated, not comma-separated. Paste into a text editor and you'll see tab-separated values. Paste into another spreadsheet and the tabs align data into columns automatically.

This is why the copy-paste workflow between spreadsheets and text editors is so smooth: TSV is the native interchange format. When you paste tabular data from a web page or email into Excel, Excel splits on tabs to populate columns. When you paste cells from Excel into a web form's text field, you get tab-separated text.

This has practical implications:

Quick data entry: Type tab-separated data in a text editor, select all, paste into Excel. Columns align automatically.
Quick data extraction: Select cells in Excel, Ctrl+C, paste into a Python script as a multi-line string. Parse with line.split('\t').
Cross-application transfer: Copy from Excel, paste into Google Sheets (or vice versa). The tabs ensure column alignment regardless of the application.
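The quick data extraction workflow might look like the sketch below. The pasted text is a hypothetical selection with a header row, and the helper assumes that no cell contains a tab or a line break (spreadsheets typically wrap such cells in quotes, which this simple split does not handle).

```python
# Hypothetical text pasted from a spreadsheet selection; \t marks the tabs.
pasted = """name\tcity\tscore
Ada\tLondon, UK\t91
Grace\tNew York\t88
"""

def parse_pasted(text):
    """Turn pasted tab-separated text into a list of dicts keyed by the header row."""
    lines = text.splitlines()
    header = lines[0].split("\t")
    return [dict(zip(header, line.split("\t"))) for line in lines[1:] if line]

rows = parse_pasted(pasted)
for r in rows:
    print(r["name"], "|", r["city"], "|", r["score"])
```

Note that the comma in "London, UK" passes through untouched. All values arrive as strings, so numeric columns still need an explicit conversion such as int(r["score"]).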

Unix Pipeline Friendliness

TSV is the most Unix-friendly tabular format because standard Unix tools handle tab delimiters natively:

# Extract the second column
cut -f2 data.tsv

# Sort by third column only (numeric)
sort -t$'\t' -k3,3n data.tsv

# Filter rows where column 2 equals "active"
awk -F'\t' '$2 == "active"' data.tsv

# Join two TSV files on first column (inputs must be sorted on that column)
join -t$'\t' <(sort -t$'\t' -k1,1 file1.tsv) <(sort -t$'\t' -k1,1 file2.tsv)

# Count unique values in column 3
cut -f3 data.tsv | sort | uniq -c | sort -rn

Try doing any of these with a CSV that has quoted fields containing commas. cut -d, -f2 will break on every row where a field contains a comma. You'd need a CSV-aware tool like csvtool or miller. With TSV, the standard tools work out of the box.

This matters for data engineering pipelines that process files with shell scripts, GNU coreutils, and awk. TSV files flow through these tools without any special handling. CSV files require dedicated parsers, adding dependency and complexity.
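The same failure mode can be reproduced in a few lines of Python. In this sketch, a plain split stands in for what cut does, and the sample lines are invented for illustration.

```python
import csv

# The same hypothetical record in both formats.
csv_line = 'widget,"fast, reliable, affordable",9.99'
tsv_line = "widget\tfast, reliable, affordable\t9.99"

naive_csv = csv_line.split(",")            # roughly what cut -d, sees
parsed_csv = next(csv.reader([csv_line]))  # quote-aware parse
naive_tsv = tsv_line.split("\t")           # roughly what cut -f sees

print(len(naive_csv), naive_csv)   # 5 pieces: the quoted field was torn apart
print(len(parsed_csv), parsed_csv) # 3 fields
print(len(naive_tsv), naive_tsv)   # 3 fields, no parser needed
```

The naive split on commas yields five fragments with stray quote characters, while the naive split on tabs matches what the dedicated CSV parser returns.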

When TSV Falls Short

TSV isn't perfect. Its limitations are the mirror of its strengths:

Tabs in data. If your data contains literal tab characters (rare but possible in free-form text, code snippets, or imported data), TSV breaks just like CSV breaks on commas. The fix: escape or remove tabs before writing. In practice, this is far less common than commas in data.
No standard quoting mechanism. CSV has a well-defined (if imperfect) quoting standard: double quotes around fields, doubled double quotes for literal quotes. TSV has no equivalent standard. Some tools support backslash escaping (\t for literal tab, \n for literal newline). Others don't. This ambiguity is TSV's biggest weakness compared to CSV.
Invisible delimiter. Tab characters are invisible in most text editors and terminals. You can't tell by looking whether a file is TSV (tabs) or space-aligned (spaces). In CSV, the commas are visible. With GNU coreutils, cat -A data.tsv shows tabs as ^I characters; on macOS and BSD, cat -t data.tsv does the same.
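Less universal recognition. Double-clicking a .tsv file may not open it in your spreadsheet application, whereas .csv files are almost always associated with spreadsheet apps. You may need to rename .tsv to .txt or use File > Open and specify the tab delimiter. Renaming to .csv usually does not help, because the application will then split on commas or semicolons instead of tabs.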
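For the first two limitations, one possible workaround is backslash escaping. The sketch below is only one convention among several, not a standard: the function names and the choice of escape sequences are assumptions, and both the writer and the reader of the file have to agree on them.

```python
# Assumed convention: backslash, tab, newline, and carriage return are escaped.
_ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}
_UNESCAPES = {"\\": "\\", "t": "\t", "n": "\n", "r": "\r"}

def escape_field(value):
    """Replace characters that would break a TSV row with backslash sequences."""
    return "".join(_ESCAPES.get(ch, ch) for ch in value)

def unescape_field(value):
    """Reverse escape_field; unknown sequences are kept as written."""
    out = []
    chars = iter(value)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            out.append(_UNESCAPES.get(nxt, "\\" + nxt))
        else:
            out.append(ch)
    return "".join(out)

# Hypothetical field: a code snippet with a newline, a tab, and a literal backslash.
field = "if x:\n\treturn 'a\\tb'"
escaped = escape_field(field)
print(escaped)
```

Escaping the backslash itself is what keeps the round trip unambiguous: without it, a literal backslash followed by the letter t in the data would come back as a tab.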

Converting Between TSV and Other Formats

TSV conversion is straightforward because the only difference from CSV is the delimiter character:

TSV to CSV: Replace tabs with commas and add quoting for fields that contain commas, double quotes, or newlines. A TSV-to-CSV converter handles this automatically. In Python: read with csv.reader(inp, delimiter='\t') and write with csv.writer(out, delimiter=',').
CSV to TSV: Replace commas with tabs and remove the quoting that is no longer needed (since tabs in data are rare). Use a CSV-to-TSV converter, or from a Unix shell: python3 -c "import csv,sys; w=csv.writer(sys.stdout, delimiter='\t'); [w.writerow(r) for r in csv.reader(sys.stdin)]" < data.csv > data.tsv
TSV to XLSX: Convert TSV to XLSX for sharing with spreadsheet users. Data types are preserved as well as they would be from CSV (which is to say: not well, because TSV also has no type information).
TSV to JSON: Convert TSV to JSON for programmatic consumption. Each row becomes a JSON object with column headers as keys.
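A minimal sketch of the TSV-to-JSON step, assuming the first row holds the column headers and the file uses no quoting or escaping. The function name tsv_to_json and the sample data are made up for illustration.

```python
import csv
import io
import json

def tsv_to_json(tsv_text):
    """Convert TSV text with a header row into a JSON array of objects."""
    reader = csv.DictReader(
        io.StringIO(tsv_text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # treat double quotes as ordinary characters
    )
    return json.dumps(list(reader), indent=2)

# Hypothetical input.
tsv_text = 'gene\tnote\nBRCA1\ttumor suppressor, chr17\nTP53\tsays "guardian"\n'
result = tsv_to_json(tsv_text)
print(result)
```

Every value comes out as a JSON string, since TSV carries no type information. Converting numeric columns is left to the consumer.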

For one-off conversions, the Unix command tr '\t' ',' < data.tsv > data.csv works for simple data but doesn't handle quoting. For production use, always use a proper CSV library that handles edge cases.
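The difference between the two approaches shows up as soon as a field contains a comma. In this sketch, a plain string replace stands in for tr, and tsv_to_csv is a hypothetical helper built on Python's csv module. It assumes the TSV input uses no quoting or escaping.

```python
import csv
import io

def tsv_to_csv(tsv_text):
    """Convert TSV text to CSV text, quoting fields where CSV requires it."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(reader)
    return out.getvalue()

# Hypothetical input with a comma inside the address field.
tsv_text = "name\taddress\nAda\t123 Main St, Apt 4\n"

naive = tsv_text.replace("\t", ",")  # equivalent to tr '\t' ','
proper = tsv_to_csv(tsv_text)

print(naive)   # second row now looks like three fields
print(proper)  # address is quoted, so the row still has two fields
```

The naive output silently gains an extra column on the second row, which is the kind of corruption that tends to surface much later in a pipeline.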

When to Choose TSV Over CSV

Use TSV when:

Your data contains commas in field values. Addresses, descriptions, names with suffixes ("Smith, Jr."), product lists. TSV avoids the quoting circus.
You're working in bioinformatics. Most genomics tools expect TSV input and produce TSV output. Using CSV introduces unnecessary conversion steps.
Your pipeline uses Unix tools. cut, sort, awk, join work natively with TSV. CSV requires specialized tools.
You're exchanging data between spreadsheet applications. Copy-paste uses TSV natively. Saving as TSV preserves the clipboard format.
You need locale independence. TSV uses the same delimiter worldwide. CSV's delimiter varies by locale (comma in US/UK, semicolon in France/Germany).

Use CSV when:

The recipient expects CSV specifically. Many import tools, SaaS products, and databases have CSV import features but no TSV option.
Maximum compatibility is the goal. .csv is more universally recognized than .tsv.
You're publishing data for broad consumption. CSV is the default expectation for open data portals, Kaggle datasets, and API exports.

TSV is the engineer's choice for tabular data. It solves CSV's most annoying problem (commas in data requiring quoting) by using a delimiter that rarely appears in natural text. The trade-off — less universal recognition and no standard quoting mechanism — is minor for technical workflows where the data is processed by scripts and tools rather than opened by double-clicking in Excel.

The recommendation is simple: if your data contains free-form text (addresses, descriptions, annotations), use TSV. If your data is purely numeric or contains only simple strings without commas, CSV and TSV are equivalent — use whichever your downstream tools expect. And if you're not sure, convert CSV to TSV and see how much simpler your parsing becomes.

Key Takeaways

TSV uses tabs as delimiters, eliminating the quoting problems caused by commas in CSV data (addresses, descriptions, European numbers).
Bioinformatics standardized on TSV because genomics data routinely contains commas and semicolons within field values.
Copying cells from any spreadsheet to clipboard produces TSV, not CSV. This makes TSV the native spreadsheet interchange format.
Unix tools (cut, sort, awk, join) handle TSV natively without special parsers. CSV with quoted fields requires dedicated tools.
TSV's main weakness: no standard quoting or escaping mechanism for the rare case when data contains literal tab characters.
Choose TSV for data containing commas, bioinformatics workflows, and Unix pipelines. Choose CSV for maximum compatibility and broad data publishing.

Frequently Asked Questions

What's the difference between TSV and CSV?

The only structural difference is the delimiter character: TSV uses tab (\t), CSV uses comma (,). This single difference has major practical implications: TSV rarely needs quoting because tabs are uncommon in data, while CSV frequently needs quoting because commas appear in addresses, descriptions, and European numbers. TSV is also locale-independent (tabs mean the same thing worldwide), while CSV's delimiter varies by locale.

Can Excel open TSV files?

Yes. Rename the .tsv file to .txt or use File > Open and select the file. Excel's Text Import Wizard will detect the tab delimiter automatically. Alternatively, some systems associate .tsv directly with Excel. You can also convert TSV to XLSX for a native Excel file. Note that double-clicking a .tsv file may not open Excel on all systems, since .csv has better file association support.

Why is TSV the standard in bioinformatics?

Bioinformatics data frequently contains commas and semicolons inside field values — gene annotations, variant descriptions, and alignment metadata all use these characters in their syntax. Using commas as delimiters would require complex quoting that breaks the simple Unix tools (cut, awk, sort) that bioinformatics pipelines rely on. Tab-separated data flows through these tools without any special handling.

Why does copying cells from Excel produce TSV and not CSV?

Spreadsheet applications put tab-separated text on the clipboard, and tabs are the safer delimiter for arbitrary cell data. Any cell could contain commas (addresses, numbers with locale-specific formatting), but tabs in cell values are extremely rare. Using tabs means that paste operations into text editors, other spreadsheets, and web forms preserve column structure.

How do I convert TSV to CSV in the command line?

For simple data without special characters: tr '\t' ',' < data.tsv > data.csv. For data that might contain commas (requiring quoting in the CSV output), use Python: python3 -c "import csv,sys; w=csv.writer(sys.stdout); [w.writerow(r) for r in csv.reader(sys.stdin, delimiter='\t')]" < data.tsv > data.csv. The Python approach properly quotes fields that contain commas.

Does TSV have an official standard?

TSV has an IANA-registered MIME type (text/tab-separated-values) defined in a 1993 specification. It's simpler than CSV's RFC 4180: fields are separated by tabs, records by newlines, and there's no quoting mechanism defined; tabs are simply not allowed inside field values. This simplicity is both a strength (no quoting ambiguity) and a weakness (no standard way to handle tabs within data). In practice, most TSV files follow the same conventions: first row is headers, no quoting, no escaping.

Is TSV smaller or larger than CSV?

Almost identical. A tab character and a comma are both 1 byte. TSV files tend to be slightly smaller because they don't need double-quote characters around fields that contain commas. For a file where 10% of fields would need quoting in CSV, the TSV version is about 1-2% smaller. The size difference is negligible — choose based on parsing reliability, not file size.
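A back-of-the-envelope check of that estimate, using synthetic data: 1,000 rows of 10 fields, each 10 characters long, with one field per row (10%) containing a comma. The field contents and sizes are assumptions chosen to make the arithmetic easy to follow. Real files will vary with field length.

```python
import csv
import io

def encoded_size(rows, delimiter):
    """Return the UTF-8 byte size of rows written with the given delimiter."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=delimiter, lineterminator="\n").writerows(rows)
    return len(buf.getvalue().encode("utf-8"))

# Synthetic row: nine plain fields plus one field that contains a comma.
row = ["abcdefghij"] * 9 + ["abcd,fghij"]
rows = [row] * 1000

csv_bytes = encoded_size(rows, ",")   # 112 bytes per row: 100 data + 9 commas + 2 quotes + newline
tsv_bytes = encoded_size(rows, "\t")  # 110 bytes per row: 100 data + 9 tabs + newline
saving = (csv_bytes - tsv_bytes) / csv_bytes

print(csv_bytes, tsv_bytes, f"{saving:.1%}")
```

With these assumptions the TSV file is a little under 2% smaller. Longer fields shrink the gap further, because the two quote characters are a smaller share of each quoted field.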

How do I handle data that contains tab characters?

This is rare but not impossible (code snippets, imported text). Options: replace tabs with spaces before writing TSV (lossy but simple), use backslash escaping (\t for literal tab, \n for literal newline — but not all parsers support this), or switch to CSV with quoting for that specific dataset. If your data routinely contains tabs, CSV with proper quoting or a structured format like JSON is a better choice.
