Ruby's standard CSV library is one of the best in any language: it handles BOM, custom delimiters, quoted fields, and multi-line values correctly. For most CSV-to-JSON jobs, you don't need an external gem. smarter_csv adds streaming for very large files, key remapping, and automatic numeric coercion (turning '42' into 42).

Method 1: Ruby CSV stdlib (no gems needed)

Ruby's built-in CSV library handles most real-world CSVs correctly without any gems.

require 'csv'
require 'json'

# Simple: array of hashes
rows = CSV.read('data.csv', headers: true, header_converters: :symbol)
json = JSON.pretty_generate(rows.map(&:to_h))
File.write('output.json', json)
puts "Done: #{rows.size} rows"
require 'csv'
require 'json'

def csv_to_json(csv_path, out_path, options = {})
  default_opts = {
    headers: true,
    header_converters: :symbol,  # 'First Name' -> :first_name
    converters: :all,            # auto-convert numbers and dates
    encoding: 'UTF-8',
    liberal_parsing: true        # more tolerant of malformed CSV
  }
  opts = default_opts.merge(options)

  rows = []
  CSV.foreach(csv_path, **opts) do |row|
    rows << row.to_h
  end

  File.write(out_path, JSON.pretty_generate(rows))
  rows.size
end

count = csv_to_json('users.csv', 'users.json')
puts "Converted #{count} rows"

Key options:

  • header_converters: :symbol — converts 'First Name' to :first_name (snake_cased symbol). Useful for consistent JSON keys.
  • converters: :all — automatically converts '42' to 42 and '3.14' to 3.14. It does not touch booleans; 'true' stays a string. Watch out: it also converts date-like strings ('2024-01-15') to Date objects — JSON serialization renders those as strings anyway, but the behavior may surprise you.
  • liberal_parsing: true — more tolerant of real-world CSVs with unbalanced quotes or inconsistent line endings.
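The converter behavior is easy to check in isolation. A minimal sketch with inline data, no file needed (exact return classes may vary slightly by csv gem version):

```ruby
require 'csv'

# :all applies the :numeric and :date_time converters -- nothing else.
data = "id,price,signed_up,active\n7,3.14,2024-01-15,true\n"
row = CSV.parse(data, headers: true, converters: :all).first

row['id']        # Integer 7
row['price']     # Float 3.14
row['signed_up'] # a Date/DateTime, not a String
row['active']    # still the String "true" -- there is no boolean converter
```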

Method 2: smarter_csv (streaming, type coercion, large files)

smarter_csv is built for large CSVs and adds chunked processing, custom key mappings, and explicit type conversion.

gem install smarter_csv
# Or Gemfile: gem 'smarter_csv'
require 'smarter_csv'
require 'json'

# Simple: load all rows
rows = SmarterCSV.process('data.csv', {
  key_mapping: {
    first_name: :firstName,     # remap header to camelCase key
    last_name: :lastName,
  },
  remove_empty_values: true,    # drop nil/empty fields
  convert_values_to_numeric: true,  # '42' -> 42
  strings_as_keys: false        # keep symbol keys
})

File.write('output.json', JSON.pretty_generate(rows))
puts "Done: #{rows.size} rows"
require 'smarter_csv'
require 'json'

# Chunked processing for large files (memory-efficient)
def large_csv_to_json_lines(csv_path, out_path)
  File.open(out_path, 'w') do |f|
    SmarterCSV.process(csv_path, chunk_size: 1000) do |chunk|
      chunk.each do |row|
        f.puts JSON.generate(row)  # JSON Lines format: one JSON object per line
      end
    end
  end
end

large_csv_to_json_lines('million_rows.csv', 'output.jsonl')

For CSV files over 100MB, chunked processing with smarter_csv keeps memory constant regardless of file size. The JSON Lines output format (one JSON object per line) is also more streaming-friendly than a single JSON array.
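The consumer side can stay streaming too. A small sketch that parses a .jsonl file one line at a time (each_json_line is our name, not part of any gem):

```ruby
require 'json'

# Yield one parsed object per line; only one line is in memory at a time.
def each_json_line(path)
  return enum_for(:each_json_line, path) unless block_given?
  File.foreach(path) do |line|
    line = line.strip
    next if line.empty?   # tolerate trailing newlines and blank lines
    yield JSON.parse(line)
  end
end
```

each_json_line('output.jsonl') { |row| ... } processes a million-row file with constant memory, mirroring the chunked writer.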

Method 3: ChangeThisFile API (Net::HTTP, no parsing code)

The API converts CSV to JSON server-side. The source format is auto-detected from the filename; pass target=json to select the output format. Free tier: 1,000 conversions/month, no card needed.

require 'net/http'
require 'uri'
require 'securerandom'

API_KEY = 'ctf_sk_your_key_here'

def csv_to_json_api(csv_path, out_path)
  uri = URI('https://changethisfile.com/v1/convert')
  boundary = "CTF#{SecureRandom.hex(8)}"

  file_data = File.binread(csv_path)
  body = [
    "--#{boundary}\r\n",
    "Content-Disposition: form-data; name=\"file\"; filename=\"#{File.basename(csv_path)}\"\r\n",
    "Content-Type: text/csv\r\n\r\n",
    file_data, "\r\n",
    "--#{boundary}\r\n",
    "Content-Disposition: form-data; name=\"target\"\r\n\r\n",
    "json\r\n",
    "--#{boundary}--\r\n"
  ].join

  req = Net::HTTP::Post.new(uri)
  req['Authorization'] = "Bearer #{API_KEY}"
  req['Content-Type'] = "multipart/form-data; boundary=#{boundary}"
  req.body = body

  resp = Net::HTTP.start(uri.host, uri.port, use_ssl: true, read_timeout: 60) { |h| h.request(req) }
  raise "API error: #{resp.code}" unless resp.code == '200'

  File.write(out_path, resp.body)
end

csv_to_json_api('data.csv', 'output.json')
puts 'Done'
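If hand-rolling the multipart body feels fragile, Net::HTTP can assemble it via set_form (stdlib, Ruby 2.0+). A sketch that builds the equivalent request; the endpoint and field names mirror the example above:

```ruby
require 'net/http'
require 'uri'

# Build the same multipart request with set_form instead of a manual boundary.
def build_convert_request(csv_path, api_key)
  uri = URI('https://changethisfile.com/v1/convert')
  req = Net::HTTP::Post.new(uri)
  req['Authorization'] = "Bearer #{api_key}"
  req.set_form(
    [
      ['file', File.open(csv_path),
       { filename: File.basename(csv_path), content_type: 'text/csv' }],
      ['target', 'json']
    ],
    'multipart/form-data'
  )
  req
end
```

Send it the same way: Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |h| h.request(req) }. With an IO value, set_form streams the file at request time instead of loading it all with binread.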

When to use each

Approach           | Best for                                                        | Tradeoff
CSV stdlib         | Most use cases: no gem dependency, handles real-world CSVs well | Loads whole file into memory; manual type conversion if needed
smarter_csv        | Large files, chunked processing, custom key mapping             | External gem; slightly more configuration
ChangeThisFile API | Quick one-offs, no parsing code, shared hosting                 | Network call; free tier 25MB limit

Production tips

  • Specify encoding explicitly for user uploads. Default to 'UTF-8', but rescue encoding failures (recent csv versions raise CSV::MalformedCSVError for invalid bytes; transcoding raises Encoding::InvalidByteSequenceError) and retry with 'Windows-1252'. Many Excel-exported CSVs are in Windows-1252, a close superset of ISO-8859-1.
  • Use JSON.generate (not pretty_generate) for large outputs. Pretty-printing adds significant whitespace; for files over 1MB, JSON.generate typically reduces output size by 20-30%.
  • Watch converters: :all with date-like strings. '2024-01-15' gets converted to a Date object by :all. JSON serialization renders it as "2024-01-15" again, so the output usually looks identical, but any code inspecting the parsed rows sees Date objects, not strings. Test with your actual data.
  • nil vs empty string. CSV stdlib converts empty fields to nil by default. If downstream systems expect empty strings, convert: row.to_h.transform_values { |v| v.nil? ? '' : v }.
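The encoding tip is the one most worth codifying. A hedged sketch (read_csv_with_fallback is our name; adjust the fallback encoding to your user base):

```ruby
require 'csv'

# Try UTF-8 first; on invalid bytes, re-read as Windows-1252 and transcode.
def read_csv_with_fallback(path)
  CSV.read(path, headers: true, encoding: 'UTF-8')
rescue CSV::MalformedCSVError, ArgumentError, Encoding::InvalidByteSequenceError
  text = File.read(path, encoding: 'Windows-1252').encode('UTF-8')
  CSV.parse(text, headers: true)
end
```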

Ruby's CSV stdlib is one of the best in any language — use it for most cases. smarter_csv is worth adding for files over 100MB or when you need key remapping. Free API tier: 1,000 conversions/month.