DOCX-to-PDF in Java has a painful truth: no pure-Java library produces Word-perfect output for complex documents. Apache POI can parse DOCX files, but rendering fonts and layout to PDF requires a full document engine. LibreOffice headless is the highest-fidelity free option; it runs the same layout engine as the desktop application. For environments where you can't install LibreOffice, the ChangeThisFile API handles the conversion server-side and can be called with Java 11's built-in HttpClient.

Method 1: docx4j (pure Java)

docx4j can export DOCX to PDF through its FO (XSL-FO) export module. Output quality is acceptable for simple documents; tables, images, and complex fonts may drift.

<dependency>
  <groupId>org.docx4j</groupId>
  <artifactId>docx4j-JAXB-ReferenceImpl</artifactId>
  <version>11.4.9</version>
</dependency>
<dependency>
  <groupId>org.docx4j</groupId>
  <artifactId>docx4j-export-fo</artifactId>
  <version>11.4.9</version>
</dependency>

import org.docx4j.Docx4J;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.nio.file.Path;

public class DocxToPdfDocx4j {

    public static void convert(Path docxPath, Path pdfPath) throws Exception {
        WordprocessingMLPackage pkg = WordprocessingMLPackage
            .load(docxPath.toFile());

        try (OutputStream out = new FileOutputStream(pdfPath.toFile())) {
            Docx4J.toPDF(pkg, out);
        }
    }

    public static void main(String[] args) throws Exception {
        convert(Path.of("document.docx"), Path.of("output.pdf"));
        System.out.println("Converted successfully");
    }
}

docx4j uses Apache FOP for PDF rendering. It handles basic formatting but struggles with custom fonts (you need to configure font substitution in docx4j.properties) and complex table layouts. For reports and forms with predictable layouts, quality is acceptable.

Method 2: LibreOffice headless (best fidelity, system dependency)

LibreOffice's headless mode produces near-perfect PDFs — the same output you'd get from File → Export as PDF in the GUI. It requires LibreOffice installed on the server.

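# Ubuntu/Debian
sudo apt install libreoffice-writer

# macOS (the Homebrew cask exposes the launcher as soffice, so adjust the command name used in the code)
brew install --cask libreoffice

import java.io.IOException;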
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.TimeUnit;

public class DocxToPdfLibreOffice {

    public static Path convert(Path docxPath, Path outDir) throws IOException, InterruptedException {
        Files.createDirectories(outDir);

        ProcessBuilder pb = new ProcessBuilder(
            "libreoffice",
            "--headless",
            "--convert-to", "pdf",
            "--outdir", outDir.toAbsolutePath().toString(),
            docxPath.toAbsolutePath().toString()
        );
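        pb.environment().put("HOME", "/tmp");  // Writable location for the LibreOffice user profile
        pb.redirectErrorStream(true);
        // Send output to a file: reading the pipe to EOF here would block until the
        // process exits, so the timeout below could never fire.
        Path log = Files.createTempFile("libreoffice-", ".log");
        pb.redirectOutput(log.toFile());

        Process process = pb.start();
        boolean finished = process.waitFor(120, TimeUnit.SECONDS);

        if (!finished) {
            process.destroyForcibly();
            throw new IOException("LibreOffice timed out after 120s");
        }
        String output = Files.readString(log);
        Files.deleteIfExists(log);
        if (process.exitValue() != 0) {
            throw new IOException("LibreOffice failed: " + output);
        }

        // LibreOffice names the output file the same as the input with .pdf extension
        String baseName = docxPath.getFileName().toString()
            .replaceAll("\\.[^.]+$", "");
        Path pdf = outDir.resolve(baseName + ".pdf");
        // LibreOffice can exit with 0 even when no PDF was written, so check for the file
        if (!Files.exists(pdf)) {
            throw new IOException("LibreOffice produced no PDF: " + output);
        }
        return pdf;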
    }

    public static void main(String[] args) throws Exception {
        Path pdf = convert(Path.of("document.docx"), Path.of("./output"));
        System.out.println("PDF at: " + pdf);
    }
}

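Critical: HOME must point to a writable directory. LibreOffice creates a user profile under HOME on first run (on Linux, $HOME/.config/libreoffice), and service accounts often have no usable home directory. Every process started with the same HOME shares that profile, so two conversions running at the same time can still fail with a profile lock error. For concurrent conversions, give each process its own HOME or its own profile directory, for example with the bootstrap option -env:UserInstallation=file:///tmp/lo-profile-1.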
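To make the profile isolation concrete, here is a minimal sketch that builds the LibreOffice argument list with a dedicated profile directory per invocation. The class and method names (LibreOfficeCommand, build) are made up for this illustration, and the launcher name "libreoffice" is assumed to be on the PATH as in Method 2.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class LibreOfficeCommand {

    /** Builds the argument list for one conversion that uses its own profile directory. */
    static List<String> build(Path docx, Path outDir, Path profileDir) {
        return List.of(
            "libreoffice",  // assumed launcher name; may be soffice on some systems
            // Bootstrap variable telling LibreOffice where to keep its user profile
            "-env:UserInstallation=" + profileDir.toUri(),
            "--headless",
            "--convert-to", "pdf",
            "--outdir", outDir.toAbsolutePath().toString(),
            docx.toAbsolutePath().toString()
        );
    }

    public static void main(String[] args) throws IOException {
        // A fresh directory per run means no two processes share a profile lock
        Path profile = Files.createTempDirectory("lo-profile-");
        List<String> cmd = build(Path.of("document.docx"), Path.of("output"), profile);
        System.out.println(String.join(" ", cmd));
    }
}
```

The resulting list can be passed straight to new ProcessBuilder(cmd). Deleting the temporary profile directory after the process exits keeps /tmp from filling up.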

Method 3: ChangeThisFile API (Java 11 HttpClient, no SDK)

Send the DOCX as a multipart POST using Java 11's HttpClient. The API runs LibreOffice server-side — same fidelity as Method 2, zero local installation. Free tier covers 1,000 conversions/month.

import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class DocxToPdfApi {

    private static final String API_KEY = "ctf_sk_your_key_here";
    private static final String API_URL = "https://changethisfile.com/v1/convert";
    private static final HttpClient HTTP = HttpClient.newBuilder()
        .connectTimeout(Duration.ofSeconds(10))
        .build();

    public static byte[] convert(Path docxPath) throws IOException, InterruptedException {
        String boundary = "----CTFBoundary" + UUID.randomUUID().toString().replace("-", "");
        byte[] fileBytes = Files.readAllBytes(docxPath);
        String fileName = docxPath.getFileName().toString();

        List<byte[]> parts = new ArrayList<>();
        parts.add(("--" + boundary + "\r\n" +
            "Content-Disposition: form-data; name=\"target\"\r\n\r\npdf\r\n").getBytes(StandardCharsets.UTF_8));
        parts.add(("--" + boundary + "\r\n" +
            "Content-Disposition: form-data; name=\"file\"; filename=\"" + fileName + "\"\r\n" +
            "Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document\r\n\r\n"
        ).getBytes(StandardCharsets.UTF_8));
        parts.add(fileBytes);
        parts.add(("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8));

        int totalLen = parts.stream().mapToInt(b -> b.length).sum();
        byte[] body = new byte[totalLen];
        int offset = 0;
        for (byte[] part : parts) {
            System.arraycopy(part, 0, body, offset, part.length);
            offset += part.length;
        }

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(API_URL))
            .header("Authorization", "Bearer " + API_KEY)
            .header("Content-Type", "multipart/form-data; boundary=" + boundary)
            .timeout(Duration.ofSeconds(120))
            .POST(HttpRequest.BodyPublishers.ofByteArray(body))
            .build();

        HttpResponse<byte[]> response = HTTP.send(request,
            HttpResponse.BodyHandlers.ofByteArray());

        if (response.statusCode() != 200) {
            throw new IOException("API error " + response.statusCode() +
                ": " + new String(response.body(), StandardCharsets.UTF_8));
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        byte[] pdf = convert(Path.of("document.docx"));
        Files.write(Path.of("output.pdf"), pdf);
        System.out.println("Saved output.pdf (" + pdf.length + " bytes)");
    }
}
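The multipart layout is easier to verify when it is assembled in isolation. The following sketch builds the same two-part body (a "target" field followed by the file) with a ByteArrayOutputStream. The MultipartBody class is hypothetical, and percent-encoding quotes in the filename is an assumption borrowed from how browsers submit forms, not something the API is known to require.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class MultipartBody {

    private static final String DOCX_MIME =
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

    /** Assembles a two-part multipart/form-data body: the target format, then the file. */
    static byte[] build(String boundary, String fileName, byte[] fileBytes) {
        // Keep the quoted filename parameter intact if the name contains quotes or line breaks
        String safeName = fileName.replace("\"", "%22").replace("\r", "").replace("\n", "");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        write(out, "--" + boundary + "\r\n"
            + "Content-Disposition: form-data; name=\"target\"\r\n\r\n"
            + "pdf\r\n");
        write(out, "--" + boundary + "\r\n"
            + "Content-Disposition: form-data; name=\"file\"; filename=\"" + safeName + "\"\r\n"
            + "Content-Type: " + DOCX_MIME + "\r\n\r\n");
        out.write(fileBytes, 0, fileBytes.length);   // raw file bytes, no re-encoding
        write(out, "\r\n--" + boundary + "--\r\n");  // closing delimiter
        return out.toByteArray();
    }

    private static void write(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
    }
}
```

The returned array can be handed to HttpRequest.BodyPublishers.ofByteArray(...) with the same boundary string in the Content-Type header.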

When to use each

Approach | Best for | Tradeoff
docx4j (pure Java) | Simple documents, no system deps, fully in-process | Layout fidelity suffers on complex DOCX (tables, custom fonts)
LibreOffice headless | Best fidelity, free, on-prem servers | System dependency (~300MB), single-threaded, HOME=/tmp required
ChangeThisFile API | Cloud/serverless, no LibreOffice install, consistent output | Network round-trip, 25MB limit on free tier

Production tips

  • LibreOffice is single-threaded per profile. Use a process pool (e.g., 4 LibreOffice workers with separate HOME dirs: HOME=/tmp/lo-1, HOME=/tmp/lo-2) and queue jobs to them via a BlockingQueue. This gives you parallelism without lock conflicts.
  • Wrap LibreOffice calls in CompletableFuture.supplyAsync(). Document conversion is I/O and CPU heavy. Keep your main thread free and use a dedicated thread pool for conversion work.
  • Reuse the HttpClient instance. Create HttpClient once (static field or Spring bean) — it maintains a connection pool to the API endpoint and amortizes TLS handshake cost across requests.
  • Set a 120-second timeout on API requests. Large DOCX files with embedded images can take 30–60 seconds. The default HttpClient timeout is indefinite.
  • Font substitution matters for docx4j output. If output PDFs have wrong fonts, configure docx4j.properties with docx4j.fonts.pkg.dir=/path/to/fonts and map missing fonts explicitly.
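The first two tips can be combined in one small class. This is a sketch under stated assumptions: ConversionPool and submit are hypothetical names, the profile directories follow the lo-1, lo-2 naming from the tip, and the job passed in stands for whatever runs the Method 2 conversion with the given profile directory.

```java
import java.nio.file.Path;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

class ConversionPool implements AutoCloseable {

    private final BlockingQueue<Path> freeProfiles;
    private final ExecutorService executor;

    ConversionPool(int workers, Path profileRoot) {
        freeProfiles = new ArrayBlockingQueue<>(workers);
        for (int i = 1; i <= workers; i++) {
            freeProfiles.add(profileRoot.resolve("lo-" + i));  // one profile dir per worker
        }
        executor = Executors.newFixedThreadPool(workers);      // dedicated conversion threads
    }

    /** Runs the job off the caller's thread with a profile no other running job holds. */
    <T> CompletableFuture<T> submit(Function<Path, T> job) {
        return CompletableFuture.supplyAsync(() -> {
            Path profile;
            try {
                profile = freeProfiles.take();   // waits until a profile is free
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new CompletionException(e);
            }
            try {
                return job.apply(profile);
            } finally {
                freeProfiles.add(profile);       // hand the profile back for the next job
            }
        }, executor);
    }

    @Override
    public void close() {
        executor.shutdown();
    }
}
```

A caller would write something like pool.submit(profile -> runLibreOffice(docx, outDir, profile)), where runLibreOffice is a placeholder for the Method 2 code with HOME (or the profile option) pointed at the supplied directory. Because the thread count equals the number of profiles, at most that many LibreOffice processes run at once and none of them share a profile.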

For servers where you control the environment, LibreOffice headless is the highest-fidelity free option. For simpler documents or Lambda/Cloud Run, docx4j avoids any system install. For maximum consistency without local infrastructure, the ChangeThisFile API works with Java 11's HttpClient and needs no additional JARs. Free tier covers 1,000 conversions/month.