Ruby's DOCX-to-PDF story is shorter than Python's or PHP's: there's no widely trusted pure-Ruby DOCX renderer. The practical options are LibreOffice headless (high fidelity, needs the binary) and the ChangeThisFile API (zero dependencies). The docx gem can read and modify DOCX content but can't render to PDF directly. For anything customer-facing, LibreOffice and the API are the only sensible choices.
Method 1: LibreOffice headless (highest fidelity)
LibreOffice converts DOCX to PDF with near-Word fidelity. It is available on all major Linux distros and macOS.
# Ubuntu/Debian
apt install libreoffice

# macOS
brew install --cask libreoffice

require 'open3'
require 'tmpdir'
require 'fileutils'

def docx_to_pdf(docx_path, out_dir: '.')
  FileUtils.mkdir_p(out_dir)

  # Use a unique HOME to avoid lock file conflicts under concurrency
  tmp_home = Dir.mktmpdir('lo_')

  cmd = [
    'soffice',
    '--headless',
    '--convert-to', 'pdf',
    '--outdir', File.expand_path(out_dir),
    File.expand_path(docx_path)
  ]

  # Pass HOME through the env hash rather than shelling out to `env`
  _stdout, stderr, status = Open3.capture3({ 'HOME' => tmp_home }, *cmd)

  unless status.success?
    raise "LibreOffice conversion failed: #{stderr.strip}"
  end

  basename = File.basename(docx_path, '.*')
  pdf_path = File.join(out_dir, "#{basename}.pdf")
  # File.size? returns nil for a missing or empty file
  raise "PDF not created: #{stderr.strip}" unless File.size?(pdf_path)

  pdf_path
ensure
  # Remove the throwaway HOME even when the conversion raises
  FileUtils.rm_rf(tmp_home) if tmp_home
end

pdf = docx_to_pdf('report.docx', out_dir: './output')
puts "PDF: #{pdf}"
The unique HOME trick keeps concurrent conversions from colliding. LibreOffice keeps its user profile, including a lock file, under HOME (~/.config/libreoffice on Linux), so two simultaneous conversions that share a HOME contend for the same profile and one of them can block or fail. Using Dir.mktmpdir gives each call its own HOME directory.
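An alternative to overriding HOME is to point LibreOffice at a private profile directory with its -env:UserInstallation bootstrap option. The following is a minimal sketch of that variant, not a drop-in replacement for the function above: soffice_cmd and convert_isolated are hypothetical helper names, and the code assumes the temp directory path contains no characters that would need URL-escaping.

```ruby
require 'open3'
require 'tmpdir'

# Hypothetical helper: builds the soffice argv with a private profile dir.
def soffice_cmd(docx_path, out_dir, profile_dir)
  [
    'soffice',
    # Assumes the path needs no URL-escaping (true for typical mktmpdir paths)
    "-env:UserInstallation=file://#{File.expand_path(profile_dir)}",
    '--headless',
    '--convert-to', 'pdf',
    '--outdir', File.expand_path(out_dir),
    File.expand_path(docx_path)
  ]
end

# Hypothetical helper: one throwaway profile per conversion.
def convert_isolated(docx_path, out_dir: '.')
  # The block form of Dir.mktmpdir removes the directory when the block exits
  Dir.mktmpdir('lo_profile_') do |profile_dir|
    cmd = soffice_cmd(docx_path, out_dir, profile_dir)
    _out, err, status = Open3.capture3(*cmd)
    raise "LibreOffice conversion failed: #{err.strip}" unless status.success?
  end
  File.join(out_dir, "#{File.basename(docx_path, '.*')}.pdf")
end
```

Keeping the argv construction in its own method makes it easy to check the command without LibreOffice installed.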
Method 2: docx gem + Prawn (pure Ruby, simple docs only)
The docx gem reads DOCX content; Prawn generates PDFs. This combination works for text-heavy documents but has poor fidelity for complex layouts.
gem install docx prawn prawn-table

# Or in Gemfile:
# gem 'docx'
# gem 'prawn'
# gem 'prawn-table'
bundle install

require 'docx'
require 'prawn'

def docx_to_pdf_ruby(docx_path, out_path)
  doc = Docx::Document.open(docx_path)

  Prawn::Document.generate(out_path, page_size: 'A4', margin: [50, 50, 50, 50]) do |pdf|
    doc.paragraphs.each do |para|
      text = para.to_s.strip
      next if text.empty?

      # Very basic style detection
      style = para.respond_to?(:style) ? para.style.to_s : ''
      if style.start_with?('Heading')
        pdf.text text, size: 16, style: :bold
      else
        pdf.text text, size: 11
      end
      pdf.move_down 6
    end
  end
end

docx_to_pdf_ruby('simple.docx', 'output.pdf')
puts 'Done'
This approach keeps only paragraph text and a bold treatment for headings. It loses tables, images, headers/footers, footnotes, text boxes, and inline formatting such as bold and italic runs. Use it only for simple text documents. For anything else, use LibreOffice or the API.
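If heading levels matter, the style check can be pushed a little further. The sketch below maps a style name to a font size; font_size_for and HEADING_SIZES are hypothetical names, and the assumption that style names look like "Heading 1" or "Heading1" should be verified against what your version of the docx gem actually returns.

```ruby
# Hypothetical size table: heading level => font size in points.
HEADING_SIZES = { 1 => 20, 2 => 16, 3 => 13 }.freeze

# Returns a font size for a paragraph style name.
# Assumes names such as "Heading 1" or "Heading1"; anything else is body text.
def font_size_for(style_name, body_size: 11)
  level = style_name.to_s[/\AHeading\s*(\d)/i, 1]
  return body_size unless level

  # Levels without an entry in the table fall back to slightly larger body text
  HEADING_SIZES.fetch(level.to_i, body_size + 1)
end
```

Inside the Prawn block this would replace the if/else with a single call such as pdf.text text, size: font_size_for(style).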
Method 3: ChangeThisFile API (Net::HTTP, no installs)
The API runs LibreOffice server-side. The source format is auto-detected from the filename, so you only pass target=pdf. Free tier: 1,000 conversions/month, no card required.
require 'net/http'
require 'uri'
require 'securerandom'
require 'fileutils'

API_KEY = 'ctf_sk_your_key_here'

def docx_to_pdf_api(docx_path, out_path)
  uri = URI('https://changethisfile.com/v1/convert')
  boundary = "CTF#{SecureRandom.hex(8)}"
  file_data = File.binread(docx_path)

  # Hand-built multipart body: one file part, one "target" field
  body = [
    "--#{boundary}\r\n",
    'Content-Disposition: form-data; name="file"; filename="' + File.basename(docx_path) + "\"\r\n",
    "Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document\r\n\r\n",
    file_data, "\r\n",
    "--#{boundary}\r\n",
    "Content-Disposition: form-data; name=\"target\"\r\n\r\n",
    "pdf\r\n",
    "--#{boundary}--\r\n"
  ].join

  req = Net::HTTP::Post.new(uri)
  req['Authorization'] = "Bearer #{API_KEY}"
  req['Content-Type'] = "multipart/form-data; boundary=#{boundary}"
  req.body = body

  resp = Net::HTTP.start(uri.host, uri.port, use_ssl: true, read_timeout: 120) { |h| h.request(req) }
  raise "API error: HTTP #{resp.code}" unless resp.code == '200'

  # Create the output directory first so binwrite does not fail on a missing path
  FileUtils.mkdir_p(File.dirname(out_path))
  File.binwrite(out_path, resp.body)
end

docx_to_pdf_api('report.docx', './output/report.pdf')
puts 'Done'
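The multipart body does not have to be assembled by hand: Net::HTTP's set_form can encode it and choose the boundary itself. The following is a sketch under the same assumptions as the example above (same endpoint, a "file" part and a "target" field); build_convert_request and docx_to_pdf_set_form are hypothetical names, and the API key is passed in as an argument rather than read from a constant.

```ruby
require 'net/http'
require 'uri'

DOCX_MIME = 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'

# Hypothetical helper: builds the POST request.
# `io` is any readable object (an open File, a StringIO).
def build_convert_request(uri, io, filename, api_key)
  req = Net::HTTP::Post.new(uri)
  req['Authorization'] = "Bearer #{api_key}"
  form = [
    ['file', io, { filename: filename, content_type: DOCX_MIME }],
    ['target', 'pdf']
  ]
  # Net::HTTP generates the boundary and encodes the parts at send time
  req.set_form(form, 'multipart/form-data')
  req
end

# Hypothetical helper: sends the file and writes the returned PDF.
def docx_to_pdf_set_form(docx_path, out_path, api_key)
  uri = URI('https://changethisfile.com/v1/convert')
  # The file must stay open until the request has been sent
  File.open(docx_path, 'rb') do |io|
    req = build_convert_request(uri, io, File.basename(docx_path), api_key)
    resp = Net::HTTP.start(uri.host, uri.port, use_ssl: true, read_timeout: 120) { |h| h.request(req) }
    raise "API error: HTTP #{resp.code}" unless resp.code == '200'

    File.binwrite(out_path, resp.body)
  end
  out_path
end
```

Beyond being shorter, this avoids joining the binary file contents with text strings, which is one less place for an encoding error when a filename contains non-ASCII characters.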
When to use each
| Approach | Best for | Tradeoff |
|---|---|---|
| LibreOffice headless | Complex DOCX with tables, images, styles | System binary required; one conversion per HOME |
| docx + Prawn | Simple text-only documents, no binary deps | Very poor fidelity; loses tables, images, most formatting |
| ChangeThisFile API | Heroku/PaaS, no LibreOffice available | Network call; 25MB file limit on free tier |
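
The choice in the table can also be made at runtime: use LibreOffice when the soffice binary is on PATH and fall back to the API otherwise. This is a minimal sketch; executable_on_path? and pick_converter are hypothetical helpers, and the returned symbols are placeholders for whichever conversion functions your application defines.

```ruby
# Hypothetical helper: true if an executable called `name` exists on PATH.
def executable_on_path?(name, path_env = ENV.fetch('PATH', ''))
  path_env.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Hypothetical helper: prefers local LibreOffice, otherwise the API.
def pick_converter(path_env = ENV.fetch('PATH', ''))
  executable_on_path?('soffice', path_env) ? :libreoffice : :api
end
```

Passing the PATH string as an argument keeps the check easy to exercise in tests without touching the real environment.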
Production tips
- Always use a unique HOME per LibreOffice call. Dir.mktmpdir + cleanup is the pattern. Without it, concurrent conversions deadlock on the LO lock file.
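- Add a conversion timeout. Large DOCX files can take 30+ seconds. Timeout.timeout(60) { ... } from the Timeout module (require 'timeout') is the basic tool, but on its own it does not terminate the soffice child process, so kill the process yourself when the timeout fires.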
- Validate output file exists and is non-empty. LibreOffice sometimes exits 0 even on partial failures. Check File.exist? and File.size? (returns nil if empty).
- On Heroku, LibreOffice needs a buildpack. Search for a community buildpack that installs LibreOffice headless. Alternatively, use the API to avoid the buildpack entirely.
- Use background jobs for user-uploaded documents. LibreOffice conversion blocks a thread. For web apps, enqueue conversions with Sidekiq and return the PDF URL asynchronously.
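
The timeout and output-validation tips can be sketched as two small helpers. This is one possible shape, not the only one: run_with_timeout and usable_pdf? are hypothetical names, the code assumes a Unix-like system (process groups), and the "%PDF-" check only confirms the file header, not that the whole document is intact.

```ruby
require 'timeout'

# Hypothetical helper: runs cmd (an argv array) in its own process group
# and kills the whole group if it exceeds `seconds`.
def run_with_timeout(cmd, seconds:)
  pid = Process.spawn(*cmd, pgroup: true)
  begin
    _pid, status = Timeout.timeout(seconds) { Process.wait2(pid) }
    status
  rescue Timeout::Error
    # A signal name starting with "-" targets the process group, which also
    # covers children started by a wrapper script
    Process.kill('-KILL', pid)
    Process.wait(pid) # reap the child so it does not linger as a zombie
    raise
  end
end

# Hypothetical helper: true when the file exists, is non-empty,
# and starts with the PDF magic bytes.
def usable_pdf?(path)
  return false unless File.size?(path)

  File.binread(path, 5) == '%PDF-'
end
```

A conversion would then call run_with_timeout with the soffice argv, check the returned status with success?, and reject the result unless usable_pdf? passes. In a web app, both calls belong inside the background job rather than the request thread.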
LibreOffice headless is the right default for any server that can install binaries. For Heroku or PaaS environments where installing LibreOffice is painful, the API skips it entirely. Free tier: 1,000 conversions/month.