
Japanese PDFs in Go: the 2026 definitive guide

How to generate Japanese PDFs in Go in 2026 — fonts, TrueType subsets, mixed kanji/kana/ASCII, and why CGO and Chromium are unnecessary.

by gpdf team

TL;DR

If your Go PDF renders こんにちは as five tofu boxes, the fix is two lines of setup, not a rewrite. Load a Japanese TTF, pass gpdf.WithFont to NewDocument, write Japanese. gpdf subsets the glyph table automatically, so the output carries only the characters you actually used — around 30 KB, not the 5 MB full font. This guide is the map: why Japanese PDF generation is weirdly hard in Go, the four real options in 2026, a complete working example, font subsetting internals, mixed-script edge cases, and what still doesn't work.

Why this guide exists

A Japanese-text PDF in Go should be five minutes of work. For a lot of teams it's a day and a half.

The usual story: someone swaps in AddUTF8Font, the PDF renders blank rectangles — the infamous 豆腐 — and a senior engineer spends an afternoon figuring out whether the problem is the font path, the subset flag, the CMap, the UTF-8 flag, or the PDF reader. By the end of the day there's a Slack thread titled "WHY IS 漢字 STILL BROKEN" and a pull request that adds three helper functions everyone already regrets.

The root cause isn't any one of those things. It's that Go's longest-lived PDF library was designed in 2002 for PHP and Latin-1, and almost every Japanese tutorial written since has been fighting that legacy. This guide is the 2026 version: what actually works when you start clean, and what's still genuinely hard.

All code in this post runs against gpdf v1.x as of 2026-04. The benchmark numbers are from an Apple M1 with Go 1.25.

The tofu problem in 90 seconds

PDF doesn't care about Unicode. It cares about glyph IDs — integer indices into a font's embedded glyph table. When you write "こんにちは" to a PDF, somebody has to:

  1. Parse the TTF and find the glyph ID for each code point (via the font's cmap subtable).
  2. Write a ToUnicode CMap so the PDF reader can map glyphs back to text when the user copies or searches.
  3. Subset the font so the PDF doesn't carry all of Noto Sans JP's roughly 17,500 glyphs.
  4. Embed the result with correctly stitched name, OS/2, head, and encoding objects.

If any of those steps is missing or wrong, the reader can't find a glyph for the code point and paints a tofu box. The archived jung-kurt/gofpdf and go-pdf/fpdf lineages retrofitted all of this onto a single-byte-font internal model — the original FPDF from 2002 only knew about Latin-1. That's why setup is fragile, why the output often embeds the full font instead of a subset, and why the failure modes vary by OS and PDF reader.
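
The lookup in step 1 is where tofu comes from. As a minimal sketch (the `cmap` type and the glyph IDs below are hand-written stand-ins for what a real parser would read from a TTF, not gpdf's internals), a font with no entry for a code point falls back to glyph 0, the .notdef glyph, which most fonts draw as an empty box:

```go
package main

import "fmt"

// cmap is a hypothetical stand-in for a font's cmap subtable: it maps
// Unicode code points to glyph IDs. Real tables are parsed from the TTF.
type cmap map[rune]uint16

// notdef is glyph 0, which TrueType reserves for "no glyph for this code point".
const notdef uint16 = 0

// glyphIDs looks up every code point in text and also reports the ones that
// fell back to .notdef, i.e. the characters a reader would paint as tofu.
func glyphIDs(text string, font cmap) (ids []uint16, missing []rune) {
	for _, r := range text {
		gid, ok := font[r]
		if !ok {
			gid = notdef
			missing = append(missing, r)
		}
		ids = append(ids, gid)
	}
	return ids, missing
}

func main() {
	// Illustrative glyph IDs only; they do not come from a real font file.
	latinOnly := cmap{'A': 36, 'P': 51, 'I': 44}
	japanese := cmap{'こ': 1200, 'ん': 1201, 'に': 1202, 'ち': 1203, 'は': 1204}

	_, missing := glyphIDs("こんにちは", latinOnly)
	fmt.Println("missing with Latin-only table:", len(missing))

	_, missing = glyphIDs("こんにちは", japanese)
	fmt.Println("missing with Japanese table:", len(missing))
}
```

Running the same check over real input before rendering is a cheap way to catch a wrong font file early, since every entry in `missing` is a box in the final PDF.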

gpdf treats CJK as a first-class case. The TTF subsetter is in the core package. The ToUnicode CMap is written automatically. There is no AddUTF8Font dance because there is no single-byte-font legacy to retrofit around.

The four real options in 2026

Before writing any code: the honest field. "Japanese-capable" means "will render arbitrary Japanese text without crashes or tofu, given a correct TTF."

| Option | License | Deps | CJK path | PDF size for 300-char doc | Notes |
|---|---|---|---|---|---|
| go-pdf/fpdf (archived 2025) | MIT | stdlib | AddUTF8Font retrofit | ~5 MB (full font) | Retrofitted onto Latin-1 core. Subsetting is opt-in and imperfect. |
| signintech/gopdf | MIT | stdlib | AddTTFFont + manual | ~3 MB typical | Low-level. You write coordinates. Subsetting exists but you drive it. |
| chromedp + Chromium | MIT + Chrome | Chromium binary | Native via browser | varies | HTML/CSS. Needs fonts installed in the container. 500 MB+ image. |
| gpdf | MIT | stdlib only | Native, automatic subset | ~30 KB | Pure Go. Builder API. ToUnicode CMap written for you. |

Two things worth underlining:

The 160× size difference between "full font embedded" and "automatic subset" is not a rounding error. A customer-facing e-commerce invoice with ten line items needs maybe 120 unique Japanese glyphs. Embedding the full Noto Sans JP (5.1 MB) on every invoice means your object storage bill includes the same 5 MB of glyph data 10 million times by the end of the year. Subset embedding carries only the glyphs you used.

"Chromedp works" is true and also the most expensive answer. If your team already runs a headless Chrome fleet for screenshotting, piggybacking on it for PDFs is fine. If you don't, standing one up just to print 日本語 is a lot of infrastructure for a problem that's 40 lines of Go.

The shortest path that works

Start with this. It's complete — copy, save as main.go, drop two TTFs next to it, go run main.go.

package main

import (
    "log"
    "os"

    "github.com/gpdf-dev/gpdf"
    "github.com/gpdf-dev/gpdf/document"
    "github.com/gpdf-dev/gpdf/template"
)

func main() {
    regular, err := os.ReadFile("NotoSansJP-Regular.ttf")
    if err != nil {
        log.Fatal(err)
    }
    bold, err := os.ReadFile("NotoSansJP-Bold.ttf")
    if err != nil {
        log.Fatal(err)
    }

    doc := gpdf.NewDocument(
        gpdf.WithPageSize(document.A4),
        gpdf.WithMargins(document.UniformEdges(document.Mm(20))),
        gpdf.WithFont("NotoSansJP", regular),
        gpdf.WithFont("NotoSansJP-Bold", bold),
        gpdf.WithDefaultFont("NotoSansJP", 11),
    )

    page := doc.AddPage()
    page.AutoRow(func(r *template.RowBuilder) {
        r.Col(12, func(c *template.ColBuilder) {
            c.Text("請求書", template.FontFamily("NotoSansJP-Bold"), template.FontSize(22))
            c.Text("2026 年 4 月 16 日")
        })
    })
    page.AutoRow(func(r *template.RowBuilder) {
        r.Col(7, func(c *template.ColBuilder) {
            c.Text("株式会社 ABC 御中", template.FontSize(13))
            c.Text("〒 100-0001 東京都千代田区千代田 1-1")
        })
        r.Col(5, func(c *template.ColBuilder) {
            c.Text("合計 ¥ 128,000", template.FontFamily("NotoSansJP-Bold"), template.AlignRight())
            c.Text("支払期限: 2026-05-31", template.AlignRight())
        })
    })

    data, err := doc.Generate()
    if err != nil {
        log.Fatal(err)
    }
    if err := os.WriteFile("invoice-ja.pdf", data, 0o644); err != nil {
        log.Fatal(err)
    }
}

Things to notice without me narrating them to death:

  • No AddUTF8Font, no UTF-8 flag, no font path argument to Text. gpdf.WithFont registers a family; c.Text just writes Unicode. The plumbing is internal.
  • Bold is a separate family, not a flag. This matches how TTFs ship (Noto Sans JP Regular and Noto Sans JP Bold are distinct files with different name tables). Gothic and Mincho variants, or Source Han Sans JP Normal/Heavy, follow the same pattern.
  • Layout is grid, not cursor. r.Col(7, ...) and r.Col(5, ...) add to 12. Widths are declarative; you don't compute x-coordinates. More on this in "How does the 12-column grid work in gpdf?"
  • AlignRight() is locale-agnostic. The Japanese "¥ 128,000" right-aligns the same way as "$1,280.00" would. The text content doesn't change the layout code.

Open the resulting invoice-ja.pdf in any reader. Select "株式会社 ABC 御中". Paste into a text editor. You get 株式会社 ABC 御中, not a jumble. That's the ToUnicode CMap working; gpdf writes one by default.
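
To make the copy-and-paste behaviour concrete: at its core, a ToUnicode CMap is a list of glyph-ID-to-UTF-16BE mappings. The following is a sketch of how such entries could be produced, not gpdf's actual code. The `bfchar` helper and the sequential glyph numbering are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf16"
)

// bfchar builds the body of a ToUnicode "bfchar" block for text.
// Subset glyph IDs are handed out in first-use order starting at 1, because
// glyph 0 is reserved for .notdef. That numbering is an assumption of this
// sketch; a real subsetter chooses its own IDs.
func bfchar(text string) string {
	ids := make(map[rune]int)
	var order []rune
	for _, r := range text {
		if _, ok := ids[r]; !ok {
			ids[r] = len(order) + 1
			order = append(order, r)
		}
	}

	var b strings.Builder
	// A production writer would split this into blocks of at most 100 entries.
	fmt.Fprintf(&b, "%d beginbfchar\n", len(order))
	for _, r := range order {
		fmt.Fprintf(&b, "<%04X> <", ids[r])
		// Destination values are UTF-16BE, so code points outside the BMP
		// come out as surrogate pairs.
		for _, u := range utf16.Encode([]rune{r}) {
			fmt.Fprintf(&b, "%04X", u)
		}
		b.WriteString(">\n")
	}
	b.WriteString("endbfchar\n")
	return b.String()
}

func main() {
	fmt.Print(bfchar("こんにちは"))
}
```

When a reader copies text, it walks the glyph IDs in the content stream and looks each one up in a table like this. A missing or wrong entry is what turns a correctly drawn page into garbage on paste.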

Font subsetting: the hidden size bomb

Here is the single most important property of CJK-in-PDF that tutorials skip: subset embedding.

A TTF font is a collection of glyph outlines plus metadata tables. Noto Sans JP Regular ships about 17,500 glyphs and weighs 5.1 MB. A typical invoice uses 60 to 200 unique Japanese characters, roughly 1% of the font or less. Embedding the full font on every document wastes about two orders of magnitude in file size.

Subset embedding keeps only the glyphs you used. gpdf does this automatically. You can see it by running the example above and inspecting the output:

$ ls -l invoice-ja.pdf
-rw-r--r--  1 dev  staff  34892 Apr 16 10:12 invoice-ja.pdf

34 KB. For comparison, the same document generated with go-pdf/fpdf and AddUTF8Font("NotoSansJP", "", "NotoSansJP-Regular.ttf"), where the second argument is the style string and the third is the font file, is 4.9 MB. Same input, same output text, 143× larger file. The reason is that the fpdf code path embeds the entire font rather than subsetting it at emit time.

A few consequences worth naming:

  • At 10 invoices per second (a normal SaaS scale), the subsetting difference is the difference between roughly 0.35 MB/s and 49 MB/s of outbound PDF bytes. Your load balancer has an opinion on that.
  • Cold storage bills scale linearly with PDF size. Five million archived invoices at 5 MB each is 25 TB. At 30 KB each, it's 150 GB. With per-GB object storage pricing, the monthly line item differs by the same factor of roughly 167.
  • Email delivery has 10–25 MB attachment limits depending on provider. A 5 MB Japanese invoice plus any other attachment plus MIME encoding starts bumping into that ceiling.
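
The bandwidth and storage figures are plain multiplication, and it is worth rerunning them with your own volumes. A small sketch, using the two file sizes measured above and decimal units (1 MB = 10^6 bytes). The helper names are illustrative.

```go
package main

import "fmt"

const (
	subsetBytes   = 34_892    // size of invoice-ja.pdf from the ls output above
	fullFontBytes = 4_900_000 // the 4.9 MB full-font file, in decimal megabytes
)

// perSecond returns outbound bytes per second at a given document rate.
func perSecond(docsPerSecond, docBytes int64) int64 {
	return docsPerSecond * docBytes
}

// archived returns the total bytes needed to store count documents.
func archived(count, docBytes int64) int64 {
	return count * docBytes
}

func main() {
	const mb, gb, tb = 1e6, 1e9, 1e12

	fmt.Printf("subset:    %.2f MB/s\n", float64(perSecond(10, subsetBytes))/mb)
	fmt.Printf("full font: %.2f MB/s\n", float64(perSecond(10, fullFontBytes))/mb)

	fmt.Printf("5M invoices at 30 KB: %.0f GB\n", float64(archived(5_000_000, 30_000))/gb)
	fmt.Printf("5M invoices at 5 MB:  %.0f TB\n", float64(archived(5_000_000, 5_000_000))/tb)
}
```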

gpdf subsets at render time. There's no flag to turn it on. You can see which glyphs ended up in the output by running gpdf's verification tool locally, but the short version is: if your document used only four distinct characters, those four glyphs are in the output and the other 17,496 are not.
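
A rough way to predict how small the subset will be is to count the distinct code points in the text you plan to render. This sketch assumes one glyph per code point, which ignores ligatures and variation sequences, and it reuses the approximate 17,500-glyph figure quoted above. `distinctRunes` is an illustrative helper, not part of gpdf.

```go
package main

import (
	"fmt"
	"unicode"
)

// distinctRunes counts the unique non-space code points across all lines.
func distinctRunes(lines ...string) int {
	seen := make(map[rune]struct{})
	for _, line := range lines {
		for _, r := range line {
			if unicode.IsSpace(r) {
				continue // spaces carry no outline worth counting
			}
			seen[r] = struct{}{}
		}
	}
	return len(seen)
}

func main() {
	const fontGlyphs = 17500 // approximate glyph count of Noto Sans JP Regular

	n := distinctRunes("請求書", "株式会社 ABC 御中", "合計 ¥ 128,000")
	fmt.Printf("%d distinct code points, %.2f%% of the font\n",
		n, 100*float64(n)/fontGlyphs)
}
```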

Mixed scripts: kanji + kana + ASCII on one line

Japanese text is rarely Japanese-only. A real-world line in a Japanese document looks like this:

API の P95 レイテンシは 50 ms 未満です。

That's five character classes: romaji (ASCII Latin), katakana, hiragana, kanji (Han), and numerals. A naive implementation picks a different font for the ASCII parts, and you end up with a monospaced "API" next to proportional Japanese, which looks terrible.

gpdf's default behavior is to render every code point in the registered family. If Noto Sans JP is your default, API and 50 ms are drawn with Noto Sans JP's Latin glyphs, which Noto provides (most Japanese superfamilies do). The result looks like a single typeface, because it is.

If you want to mix families deliberately — say, use a condensed sans for ASCII and Noto Sans JP for Japanese — register both and override per-text-call:

c.Text("API の P95 レイテンシは 50 ms 未満です。",
    template.FontFamily("NotoSansJP"))
c.Text("API latency (P95) is under 50 ms.",
    template.FontFamily("InterVariable"))

Two c.Text calls, two families, no script-detection logic in your code. If you need intra-line mixing (ASCII Inter + Japanese Noto in the same sentence), that's coming in gpdf v1.2; today the workaround is to split at script boundaries manually and lay out with a horizontal row of columns.
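
The manual split can be done with the standard library's Unicode tables. Below is a minimal sketch of one possible approach. The `run` type, the `splitRuns` helper, and the rule that spaces and punctuation stay with the preceding run are all assumptions of this example, and the gpdf layout side (one column per run) is not shown.

```go
package main

import (
	"fmt"
	"unicode"
)

// run is a maximal stretch of text meant to be drawn with one font family.
type run struct {
	Text     string
	Japanese bool
}

// isJapanese reports whether r is kanji, hiragana, or katakana. The long-vowel
// mark 'ー' (U+30FC) is listed separately because Unicode files it under the
// Common script rather than Katakana.
func isJapanese(r rune) bool {
	return r == 'ー' || unicode.In(r, unicode.Han, unicode.Hiragana, unicode.Katakana)
}

// isNeutral marks characters that never trigger a split on their own.
func isNeutral(r rune) bool {
	return unicode.IsSpace(r) || unicode.IsPunct(r)
}

// splitRuns cuts text wherever the script flips between Japanese and
// everything else. Neutral characters attach to the run that precedes them.
func splitRuns(text string) []run {
	var runs []run
	var cur []rune
	curJP, started := false, false
	for _, r := range text {
		if !isNeutral(r) {
			jp := isJapanese(r)
			if started && jp != curJP {
				runs = append(runs, run{Text: string(cur), Japanese: curJP})
				cur = cur[:0] // string(cur) copied the data, so reuse is safe
			}
			curJP, started = jp, true
		}
		cur = append(cur, r)
	}
	if len(cur) > 0 {
		runs = append(runs, run{Text: string(cur), Japanese: curJP})
	}
	return runs
}

func main() {
	for _, r := range splitRuns("API の P95 レイテンシは 50 ms 未満です。") {
		fmt.Printf("japanese=%t %q\n", r.Japanese, r.Text)
	}
}
```

Each run can then be written with its own family. Keeping trailing spaces inside the preceding run preserves the original spacing when the pieces are laid out side by side.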

What still hurts

The Japanese PDF story in Go is 95% solved. Here's the 5%.

Vertical text (縦書き) is not there yet. gpdf renders horizontal text only in v1.x. Traditional Japanese layout — right-to-left columns of top-to-bottom characters with the appropriate glyph rotation and punctuation repositioning — is a deep layout engine change, not a rendering tweak. The open issue has a proposed design; it'll land when it lands. For now, if you need 縦書き for books or formal correspondence, generate with a tool that supports it (Word, InDesign, or a pandoc + LuaLaTeX pipeline) and embed the output PDF with gpdf.Merge.

Ruby annotations (振り仮名) are workaround-only. There's no c.Ruby("漢字", "かんじ") primitive. If you need ruby for children's content or language textbooks, the workaround is a two-row column: small kana text on top, regular kanji below, aligned. It works, but it's manual, and fine kerning across furigana boundaries takes care.

Complex fallbacks across multiple CJK fonts. If a user submits text that mixes Japanese kanji with Chinese-only characters (the character forms differ: some code points shared between the two render subtly differently in CN vs JP fonts), you need to manually split and use two families. gpdf doesn't auto-fall-back across families within a single c.Text call. In practice, very few documents need this; if yours does, see Multi-language PDFs: mixing JP/CN/KR/EN. (Article pending: B-070.)

PDF/A-2b strict compliance with Japanese. gpdf produces PDF/A output via gpdf.WithPDFA, but the tight compliance requirements around embedded glyph metadata, the ActualText span on every CJK run, and tagged structure trees are still being ironed out for the CJK case. If you're exporting for long-term archival under 電子帳簿保存法, validate with a third-party tool (veraPDF is free) before committing.

None of these are blockers for the common cases: invoices, reports, statements, receipts, certificates. They're worth naming because somebody reading this is about to hit one of them in production, and "it's on the roadmap" is less useful than "here's the workaround."

A word on compliance

One piece of ecosystem context that usually goes unsaid: Japanese PDF generation in 2026 is not just a typography problem. Two regulatory shifts push it into the compliance conversation.

The 適格請求書 (qualified invoice) regime under the consumption-tax reform requires invoices to include specific fields (registered business number, applicable tax rate, breakdown) and to be retained in a tamper-evident way. PDFs are the default format for this, and "tamper-evident" maps to PDF digital signatures — PAdES-B-LT in the strict case.

The 電子帳簿保存法 (Electronic Books Preservation Act, which covers electronic bookkeeping records), revised in 2024, extended retention mandates to include invoices stored in electronic form. Archived PDFs must meet certain integrity requirements. PDF/A-2b or PDF/A-3b is the de facto target format.

Both requirements lean on native PDF features — signatures, long-term validation, PDF/A embedded metadata. HTML-to-PDF via headless browser does not meet either cleanly; Chromium's PDF output isn't PDF/A-compliant and can't embed digital signatures in a single step. A native Go stack (gpdf + gpdf/signature for PAdES + gpdf.WithPDFA) does the whole chain in one pipeline without leaving the process.

This is a future-topic flag rather than a deep dive — signature and PDF/A each deserve their own hero article (they're B-067 and B-068 on the backlog). But if you're choosing a Japanese PDF stack today and compliance is anywhere on your radar, pick a stack that can do signatures and PDF/A natively. The migration tax from "works today" to "passes audit" is real.

FAQ

Do I need to install fonts on the server or in the container? No. gpdf reads TTF bytes; it doesn't go through the system font cache. os.ReadFile("NotoSansJP-Regular.ttf") or //go:embed NotoSansJP-Regular.ttf works identically on macOS, Linux, and Windows, inside a distroless container, and on AWS Lambda. No fontconfig, no fc-cache -fv. This is one of the reasons gpdf works in FROM scratch images.

Noto Sans JP vs Source Han Sans JP — does it matter? They're the same font family under two names. Adobe publishes Source Han Sans JP; Google repackages it as Noto Sans JP. Glyph coverage is identical. Pick whichever license distribution fits your legal review; both are SIL Open Font License. For brand-neutral documentation we default to Noto Sans JP because the file names are easier to remember.

What about 游ゴシック (Yu Gothic) or Hiragino? Those are OS-bundled proprietary fonts. You can use them if your deployment target has licensed them (Windows Server bundles Yu Gothic; macOS bundles Hiragino), but you'll need to source the TTF file and confirm redistribution terms for your container build. For open deployments, stick with Noto Sans JP or IPAex Gothic (both free redistribution).

The PDF renders but Ctrl+F search finds nothing. Why? Almost always a ToUnicode CMap issue. gpdf writes one automatically, so if you're seeing this with gpdf, open an issue with the reader name. If you're seeing it with gofpdf, the fix is to register the font through AddUTF8Font and ensure the reader supports CID fonts; old Preview.app versions on macOS have known issues. Test with Adobe Reader or Chrome as a control.

How do I add a JIS X 0213 character that's not in the font? You don't: there's no glyph to draw. The practical answer is "use a font that covers JIS X 0213." Noto Sans JP covers the Japanese repertoire of the BMP, including JIS X 0213 Level 1. For rare historical variants, Hanazono Mincho (花園明朝) is the last-mile fallback. If a code point isn't in any font, gpdf emits the Unicode replacement character (U+FFFD) rather than a silent tofu, so you'll see � in the output and know to check.

Is there a performance cost to CJK vs ASCII? Small. gpdf's benchmark for a "complex CJK invoice" is 133 µs per document on an Apple M1, vs 108 µs for a 4×10 ASCII line-items table. That's a ~23% overhead, almost entirely from the larger glyph-lookup and subsetting work. For reference, go-pdf/fpdf on the same CJK benchmark is 254 µs, and Maroto v2 is 10.4 ms. Japanese rendering isn't the bottleneck in your service.

Try gpdf

gpdf is a Go library for generating PDFs. MIT, zero external dependencies, native CJK.

go get github.com/gpdf-dev/gpdf
