CSV-to-JSON is a data transformation, not a binary conversion — the interesting decisions are about type coercion (should "42" become the integer 42 or stay a string?) and memory management (loading a 2GB CSV into a list will crash your process). NimbleCSV handles the parsing efficiently; Jason handles the JSON encoding. Together they're the idiomatic Elixir stack for this job.

Method 1: NimbleCSV + Jason (idiomatic, streaming-capable)

NimbleCSV is the standard Elixir CSV parser — fast, streaming, configurable separators. Jason is the standard JSON encoder.

# mix.exs
defp deps do
  [
    {:nimble_csv, "~> 1.2"},
    {:jason, "~> 1.4"}
  ]
end

# Define the parser module once, at the top level of a source file
# (not inside a function), so it is generated at compile time.
NimbleCSV.define(MyCSV, separator: ",", escape: "\"")

defmodule CsvToJson do
  @moduledoc "Convert CSV to JSON using NimbleCSV and Jason"

  @doc """
  Convert a CSV file to a JSON array of objects.
  The first row is treated as headers.
  Returns {:ok, json_string} or {:error, reason}.
  """
  def file_to_json(input_path, output_path \\ nil) do
    rows =
      input_path
      |> File.stream!()
      |> MyCSV.parse_stream(skip_headers: false)
      |> Enum.to_list()

    [header_row | data_rows] = rows
    headers = header_row |> Enum.map(&String.trim/1) |> Enum.map(&String.downcase/1)

    records =
      Enum.map(data_rows, fn row ->
        headers
        |> Enum.zip(row)
        |> Enum.map(fn {k, v} -> {k, coerce_value(v)} end)
        |> Map.new()
      end)

    json = Jason.encode!(records, pretty: true)

    if output_path do
      File.write!(output_path, json)
    end

    {:ok, json}
  rescue
    e -> {:error, Exception.message(e)}
  end

  # Coerce common types
  defp coerce_value(""), do: nil
  defp coerce_value(v) do
    cond do
      v =~ ~r/^-?\d+$/ -> String.to_integer(v)
      v =~ ~r/^-?\d+\.\d+$/ -> String.to_float(v)
      String.downcase(v) == "true" -> true
      String.downcase(v) == "false" -> false
      true -> v
    end
  end
end

# Usage
{:ok, json} = CsvToJson.file_to_json("/tmp/data.csv", "/tmp/data.json")
IO.puts(json)

The coerce_value/1 function converts numeric strings to integers/floats and "true"/"false" to booleans. Remove it if you want all values as strings (safer for data pipelines where you control the schema downstream).
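
A quick sanity check (paths illustrative; JSON key order may vary since records are maps):

File.write!("/tmp/sample.csv", "id,name,active,score\n1,Ada,true,9.5\n2,Bo,false,\n")

{:ok, json} = CsvToJson.file_to_json("/tmp/sample.csv")
IO.puts(json)
# "1" is coerced to the integer 1, "9.5" to a float, "true" to a
# boolean, and the empty score field in row 2 becomes null.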

Method 2: Streaming large CSVs with NimbleCSV (memory-efficient)

For large CSV files (100MB+), building a full in-memory list will exhaust the process heap. Stream the rows and write JSON incrementally instead.

defmodule CsvToJsonStream do
  @doc """
  Stream a large CSV to a JSON Lines file (.jsonl), one object per line.
  Memory usage stays constant regardless of CSV size.
  """
  def stream_to_jsonl(input_path, output_path) do
    rows =
      input_path
      |> File.stream!()
      |> MyCSV.parse_stream(skip_headers: false)

    # Enum.split/2 is eager and would realize the whole stream in memory.
    # Take the header row lazily, then build a second lazy stream past it;
    # File.stream!/1 is re-enumerable, so the file is simply re-read.
    [header_row] = rows |> Stream.take(1) |> Enum.to_list()
    headers = Enum.map(header_row, &String.trim/1)
    rest = Stream.drop(rows, 1)

    File.open!(output_path, [:write, :utf8], fn file ->
      Enum.each(rest, fn row ->
        record =
          headers
          |> Enum.zip(row)
          |> Map.new()

        IO.write(file, Jason.encode!(record) <> "\n")
      end)
    end)

    :ok
  end

  @doc """
  Stream to a standard JSON array file.
  Records are written incrementally, so the full result never sits in memory.
  """
  def stream_to_json_array(input_path, output_path) do
    rows =
      input_path
      |> File.stream!()
      |> MyCSV.parse_stream(skip_headers: false)

    # Same approach as above: avoid the eager Enum.split/2.
    [header_row] = rows |> Stream.take(1) |> Enum.to_list()
    headers = Enum.map(header_row, &String.trim/1)
    data = Stream.drop(rows, 1)

    File.open!(output_path, [:write, :utf8], fn file ->
      IO.write(file, "[\n")

      data
      |> Stream.with_index()
      |> Enum.each(fn {row, index} ->
        record = headers |> Enum.zip(row) |> Map.new()
        suffix = if index == 0, do: "", else: ",\n"
        IO.write(file, suffix <> Jason.encode!(record))
      end)

      IO.write(file, "\n]\n")
    end)

    :ok
  end
end
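
# Usage (paths illustrative)
:ok = CsvToJsonStream.stream_to_jsonl("/tmp/big.csv", "/tmp/big.jsonl")
:ok = CsvToJsonStream.stream_to_json_array("/tmp/big.csv", "/tmp/big.json")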

JSON Lines (.jsonl) is preferred for large datasets — one JSON object per line, easily parsed line-by-line downstream. A standard JSON array, by contrast, usually forces the consumer to parse the whole document at once. For truly huge files (1GB+), prefer .jsonl.
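
Consuming the .jsonl output downstream is itself a constant-memory stream (path illustrative):

# Decode one record per line without loading the whole file
"/tmp/big.jsonl"
|> File.stream!()
|> Stream.map(&Jason.decode!/1)
|> Enum.take(2)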

Method 3: ChangeThisFile API via Req (no parsing deps)

No NimbleCSV, no Jason needed for the conversion itself. POST the CSV file to /v1/convert and receive JSON. Free tier: 1,000 conversions/month.

# curl reference
curl -X POST https://changethisfile.com/v1/convert \
  -H "Authorization: Bearer ctf_sk_your_key_here" \
  -F "file=@data.csv" \
  -F "target=json" \
  --output data.json

# mix.exs
defp deps do
  [
    {:req, "~> 0.5"}
  ]
end

defmodule CTFConverter do
  @api_url "https://changethisfile.com/v1/convert"
  @api_key "ctf_sk_your_key_here"

  def csv_to_json(input_path, output_path) do
    file_data = File.read!(input_path)
    filename  = Path.basename(input_path)

    response =
      Req.post!(
        @api_url,
        headers: [{"Authorization", "Bearer #{@api_key}"}],
        form_multipart: [
          file: {file_data, filename: filename, content_type: "text/csv"},
          target: "json"
        ],
        # keep the response body as a raw binary; Req would otherwise
        # decode the JSON, and we want to write it to disk verbatim
        decode_body: false,
        receive_timeout: 60_000
      )

    case response.status do
      200 ->
        File.write!(output_path, response.body)
        {:ok, output_path}

      status ->
        {:error, "API returned #{status}: #{inspect(response.body)}"}
    end
  end
end

# Usage
{:ok, path} = CTFConverter.csv_to_json("/tmp/data.csv", "/tmp/data.json")
IO.puts("Saved: #{path}")

The API returns a JSON array with the first row used as object keys. All values are returned as strings — apply your own type coercion if needed after parsing with Jason.
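
If you need typed values back, a minimal post-processing sketch (coerce/1 stands in for Method 1's coerce_value/1 made public; it is not part of the API):

typed =
  "/tmp/data.json"
  |> File.read!()
  |> Jason.decode!()
  |> Enum.map(fn record ->
    Map.new(record, fn {k, v} -> {k, coerce(v)} end)
  end)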

When to use each

  • NimbleCSV + Jason (in-memory). Best for: files under ~50MB where type coercion is needed. Tradeoff: the full file is loaded into memory, so the process may crash on huge CSVs.
  • NimbleCSV streaming. Best for: large files (100MB+) needing constant memory usage. Tradeoff: more complex code; JSON Lines output is preferred for the largest files.
  • ChangeThisFile API (Req). Best for: zero parsing deps, quick scripts, cross-language consistency. Tradeoff: network latency; all values come back as strings; 25MB free-tier file limit.

Production tips

  • Define NimbleCSV modules at compile time. NimbleCSV.define/2 generates a module — call it once at the top level of a source file (as in Method 1), not inside a function body. Defining it inside a function re-generates the module on every call.
  • Handle BOM-prefixed CSV files. Some Windows-generated CSVs start with a UTF-8 BOM (byte order mark: EF BB BF). Strip it from the first header with String.replace(header, "\uFEFF", ""), or open the file with File.stream!(path, [:trim_bom]) so it never reaches the parser.
  • Don't assume all rows have the same column count. Malformed CSVs can have rows with more or fewer columns than the header, and Enum.zip/2 silently truncates to the shorter list. Pad or truncate each row to the header length before zipping (see the sketch after this list).
  • Use Oban for large background conversions. For CSVs uploaded by users in a Phoenix app, run the conversion in an Oban job. Return a job ID immediately and push completion via PubSub/LiveView.
  • Jason.encode! raises on unencodable terms. Tuples, PIDs, references, and functions have no JSON representation and raise Protocol.UndefinedError. Atoms are encoded as strings (nil, true, and false become null and booleans), so atom values won't raise but may not round-trip the way you expect.
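
A minimal sketch of the row-padding idea above, as a helper you could add to either module (normalize_row/2 is illustrative, not part of NimbleCSV):

# Pad short rows with "" and truncate long ones to the header width
defp normalize_row(row, width) do
  case length(row) do
    n when n >= width -> Enum.take(row, width)
    n -> row ++ List.duplicate("", width - n)
  end
end

# Then zip safely:
# headers |> Enum.zip(normalize_row(row, length(headers))) |> Map.new()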

For Elixir services, NimbleCSV + Jason gives you full type control and streaming support. For quick scripts or cross-language pipelines, use the API; the free tier covers 1,000 conversions/month.