CSV-to-JSON is a data transformation, not a binary conversion — the interesting decisions are about type coercion (should "42" become the integer 42 or stay a string?) and memory management (loading a 2GB CSV into a list will crash your process). NimbleCSV handles the parsing efficiently; Jason handles the JSON encoding. Together they're the idiomatic Elixir stack for this job.
Method 1: NimbleCSV + Jason (idiomatic, streaming-capable)
NimbleCSV is the standard Elixir CSV parser — fast, streaming, configurable separators. Jason is the standard JSON encoder.
```elixir
# mix.exs
defp deps do
  [
    {:nimble_csv, "~> 1.2"},
    {:jason, "~> 1.4"}
  ]
end
```
```elixir
NimbleCSV.define(MyCSV, separator: ",", escape: "\"")

defmodule CsvToJson do
  @moduledoc "Convert CSV to JSON using NimbleCSV and Jason"

  @doc """
  Convert a CSV file to a JSON array of objects.
  The first row is treated as headers.
  Returns {:ok, json_string} or {:error, reason}.
  """
  def file_to_json(input_path, output_path \\ nil) do
    rows =
      input_path
      |> File.stream!()
      |> MyCSV.parse_stream(skip_headers: false)
      |> Enum.to_list()

    [header_row | data_rows] = rows
    headers = header_row |> Enum.map(&String.trim/1) |> Enum.map(&String.downcase/1)

    records =
      Enum.map(data_rows, fn row ->
        headers
        |> Enum.zip(row)
        |> Enum.map(fn {k, v} -> {k, coerce_value(v)} end)
        |> Map.new()
      end)

    json = Jason.encode!(records, pretty: true)

    if output_path do
      File.write!(output_path, json)
    end

    {:ok, json}
  rescue
    e -> {:error, Exception.message(e)}
  end

  # Coerce common types
  defp coerce_value(""), do: nil

  defp coerce_value(v) do
    cond do
      v =~ ~r/^-?\d+$/ -> String.to_integer(v)
      v =~ ~r/^-?\d+\.\d+$/ -> String.to_float(v)
      String.downcase(v) == "true" -> true
      String.downcase(v) == "false" -> false
      true -> v
    end
  end
end
```
```elixir
# Usage
{:ok, json} = CsvToJson.file_to_json("/tmp/data.csv", "/tmp/data.json")
IO.puts(json)
```
The coerce_value/1 function converts numeric strings to integers/floats and "true"/"false" to booleans. Remove it if you want all values as strings (safer for data pipelines where you control the schema downstream).
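For example, with a small CSV the mixed types come through in the output. A minimal sketch with hypothetical data (Jason's key order for maps is not guaranteed):

```elixir
# /tmp/data.csv (hypothetical):
#   name,age,score,active
#   Ada,36,99.5,true
{:ok, json} = CsvToJson.file_to_json("/tmp/data.csv")
IO.puts(json)
# Prints something like (key order may vary):
# [
#   {
#     "active": true,
#     "age": 36,
#     "name": "Ada",
#     "score": 99.5
#   }
# ]
```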
Method 2: Streaming large CSVs with NimbleCSV (memory-efficient)
For large CSV files (100MB+), building a full in-memory list will exhaust the process heap. Stream the rows and write JSON incrementally instead. One pitfall: eager Enum functions such as Enum.split/2 realize the entire stream, so the header row is read in its own short pass below rather than split off the data stream.
```elixir
defmodule CsvToJsonStream do
  @doc """
  Stream a large CSV to a JSON Lines file (.jsonl), one object per line.
  Memory usage stays constant regardless of CSV size.
  """
  def stream_to_jsonl(input_path, output_path) do
    headers = read_headers(input_path)

    File.open!(output_path, [:write, :utf8], fn file ->
      input_path
      |> File.stream!()
      # skip_headers: true is the default, so the header row is dropped here
      |> MyCSV.parse_stream()
      |> Enum.each(fn row ->
        record = headers |> Enum.zip(row) |> Map.new()
        IO.write(file, Jason.encode!(record) <> "\n")
      end)
    end)

    :ok
  end

  @doc """
  Stream to a standard JSON array file.
  Writes each record as it is produced to avoid loading them all into memory.
  """
  def stream_to_json_array(input_path, output_path) do
    headers = read_headers(input_path)

    File.open!(output_path, [:write, :utf8], fn file ->
      IO.write(file, "[\n")

      input_path
      |> File.stream!()
      |> MyCSV.parse_stream()
      |> Stream.with_index()
      |> Enum.each(fn {row, index} ->
        record = headers |> Enum.zip(row) |> Map.new()
        prefix = if index == 0, do: "", else: ",\n"
        IO.write(file, prefix <> Jason.encode!(record))
      end)

      IO.write(file, "\n]\n")
    end)

    :ok
  end

  # Read only the header row. Enum.take/2 halts the stream after one
  # element, so this pass never reads the rest of the file.
  defp read_headers(input_path) do
    [header_row] =
      input_path
      |> File.stream!()
      |> MyCSV.parse_stream(skip_headers: false)
      |> Enum.take(1)

    Enum.map(header_row, &String.trim/1)
  end
end
```
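Usage mirrors Method 1 (paths are illustrative):

```elixir
:ok = CsvToJsonStream.stream_to_jsonl("/tmp/big.csv", "/tmp/big.jsonl")
:ok = CsvToJsonStream.stream_to_json_array("/tmp/big.csv", "/tmp/big.json")
```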
JSON Lines (.jsonl) is preferred for large datasets: one JSON object per line, easily parsed line-by-line downstream. A standard JSON array, by contrast, usually has to be parsed as one whole document by the consumer. For truly huge files (1GB+), prefer jsonl.
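Reading the .jsonl back downstream is just as streaming-friendly. A minimal sketch (Jason tolerates the trailing newline on each line):

```elixir
"/tmp/big.jsonl"
|> File.stream!()
|> Stream.map(&Jason.decode!/1)
|> Enum.each(&IO.inspect/1)
```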
Method 3: ChangeThisFile API via Req (no parsing deps)
No NimbleCSV, no Jason needed for the conversion itself. POST the CSV file to /v1/convert and receive JSON. Free tier: 1,000 conversions/month.
```bash
# curl reference
curl -X POST https://changethisfile.com/v1/convert \
  -H "Authorization: Bearer ctf_sk_your_key_here" \
  -F "file=@data.csv" \
  -F "target=json" \
  --output data.json
```
```elixir
# mix.exs
defp deps do
  [
    {:req, "~> 0.5"}
  ]
end
```
```elixir
defmodule CTFConverter do
  @api_url "https://changethisfile.com/v1/convert"
  @api_key "ctf_sk_your_key_here"

  def csv_to_json(input_path, output_path) do
    file_data = File.read!(input_path)
    filename = Path.basename(input_path)

    response =
      Req.post!(
        @api_url,
        headers: [{"Authorization", "Bearer #{@api_key}"}],
        form_multipart: [
          file: {file_data, filename: filename, content_type: "text/csv"},
          target: "json"
        ],
        # Req decodes JSON bodies into Elixir terms by default; keep the
        # raw binary so it can be written to disk as-is.
        decode_body: false,
        receive_timeout: 60_000
      )

    case response.status do
      200 ->
        File.write!(output_path, response.body)
        {:ok, output_path}

      status ->
        {:error, "API returned #{status}: #{inspect(response.body)}"}
    end
  end
end
```
```elixir
# Usage
{:ok, path} = CTFConverter.csv_to_json("/tmp/data.csv", "/tmp/data.json")
IO.puts("Saved: #{path}")
```
The API returns a JSON array with the first row used as object keys. All values are returned as strings — apply your own type coercion if needed after parsing with Jason.
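If you need typed values downstream, decode the API output and coerce locally. A minimal sketch reusing the same regex rules as Method 1, assuming /tmp/data.json was written by CTFConverter above:

```elixir
coerce = fn
  "" ->
    nil

  v when is_binary(v) ->
    cond do
      v =~ ~r/^-?\d+$/ -> String.to_integer(v)
      v =~ ~r/^-?\d+\.\d+$/ -> String.to_float(v)
      true -> v
    end

  v ->
    v
end

typed =
  "/tmp/data.json"
  |> File.read!()
  |> Jason.decode!()
  |> Enum.map(fn record -> Map.new(record, fn {k, v} -> {k, coerce.(v)} end) end)
```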
When to use each
| Approach | Best for | Tradeoff |
|---|---|---|
| NimbleCSV + Jason (in-memory) | Files under ~50MB, type coercion needed | Full file loaded into memory; process may crash on huge CSVs |
| NimbleCSV streaming | Large files (100MB+), constant memory usage | More complex code; JSON Lines output preferred for largest files |
| ChangeThisFile API (Req) | No parsing deps, quick scripts, cross-language consistency | Network latency; all values as strings; 25MB free-tier limit |
Production tips
- Define NimbleCSV modules at compile time. NimbleCSV.define/2 generates a module — call it in your application module, not in a function body. Defining it inside a function re-generates the module on every call.
- Handle BOM-prefixed CSV files. Some Windows-generated CSVs start with a UTF-8 BOM (byte order mark: EF BB BF). Strip it from the first header with String.replace(header, "\uFEFF", "").
- Don't assume all rows have the same column count. Malformed CSVs can have rows with more or fewer columns than the header, and Enum.zip/2 silently stops at the shorter list. Pad or truncate each row to the header length before zipping with headers; see the sketch after this list.
- Use Oban for large background conversions. For CSVs uploaded by users in a Phoenix app, run the conversion in an Oban job. Return a job ID immediately and push completion via PubSub/LiveView. A minimal worker sketch follows below.
- Jason.encode! raises on unencodable terms. Atoms (other than nil, true, false) are not valid JSON. If your coerce_value function can return atoms, convert them to strings first.
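A minimal sketch covering the first three tips; MyApp.CSV and MyApp.CSVHelpers are hypothetical names, and the helpers would run between parsing and zipping:

```elixir
# Compile-time parser definition: top level of a source file, not a
# function body, so the module is generated once at compile time.
NimbleCSV.define(MyApp.CSV, separator: ",", escape: "\"")

defmodule MyApp.CSVHelpers do
  # Strip the UTF-8 BOM from the first header cell, if present, then trim all.
  def clean_headers([first | rest]) do
    Enum.map([String.replace(first, "\uFEFF", "") | rest], &String.trim/1)
  end

  # Pad short rows with nil and truncate long ones so Enum.zip/2 never
  # silently drops columns.
  def normalize_row(row, header_count) do
    row
    |> Stream.concat(Stream.repeatedly(fn -> nil end))
    |> Enum.take(header_count)
  end
end
```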
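And for the Oban tip, a minimal worker sketch; the queue name, MyApp.PubSub, and the topic are assumptions for illustration, not part of Oban itself:

```elixir
defmodule MyApp.Workers.CsvConversion do
  use Oban.Worker, queue: :conversions, max_attempts: 3

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"input" => input, "output" => output}}) do
    case CsvToJson.file_to_json(input, output) do
      {:ok, _json} ->
        # Notify subscribers (e.g. a LiveView) that the output file is ready
        Phoenix.PubSub.broadcast(MyApp.PubSub, "conversions", {:converted, output})
        :ok

      {:error, reason} ->
        {:error, reason}
    end
  end
end

# Enqueue from a controller or LiveView:
# %{"input" => in_path, "output" => out_path}
# |> MyApp.Workers.CsvConversion.new()
# |> Oban.insert()
```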
For Elixir services, NimbleCSV + Jason gives you full type control and streaming support. For quick scripts or cross-language pipelines, the API. Free tier covers 1,000 conversions/month.