Private. In-browser. No upload.
Japanese OCR — extract Japanese text from images
Drop in an image with Japanese text (a screenshot, a photo of a document, a scan, a sign, a receipt) and get back text you can copy and edit. Recognizes hiragana, katakana, and kanji. Runs entirely in your browser using a local OCR model trained on Japanese. The image never leaves your device.
日本語 · Japanese
What is OCR?
OCR (Optical Character Recognition) is the technology that reads text from inside a picture and turns it into text you can copy, paste, edit, or search. Without it, the words inside an image are just colored pixels — your computer has no idea what they say. With OCR, those pixels become actual letters and words again.
- Receipts — snap a photo, get the line items as text you can total up.
- Screenshots — grab text out of an image when you can't select it normally.
- Scanned documents — turn a scan of a paper letter, contract, or form into editable text.
- Photos of book or magazine pages — extract quotes or paragraphs without retyping.
- Foreign-language signs and menus — pull the text out so you can paste it into a translator.
- Optimized for Japanese. Uses a Japanese-trained recognition model so hiragana, katakana, and kanji come through correctly.
- No upload, ever. The image is processed locally in your browser, and the language model runs there too.
- Switch language anytime. The dropdown above lets you switch to any other supported language without reloading.
Common uses for Japanese OCR
- Japanese business documents, receipts, and forms
- Scanned manga panels, novels, or magazine articles
- Photos of menus, signs, and product packaging
- Letters, postcards, and personal correspondence
How it works
- Drop your image. JPG, PNG, WebP, BMP, GIF, or TIFF, up to 20 MB.
- Japanese is already selected in the language dropdown. (You can switch to any other supported language too.)
- Click "Extract text". The Japanese OCR model downloads on first use (~10–15 MB, cached after that) and processes your image. Takes a few seconds per image.
- Copy or save the result. The extracted text appears in a textbox below. Copy to clipboard, or download as a .txt file.
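The first step above implies a check on the dropped file before any OCR runs. A minimal sketch of such a check, assuming the six listed formats map to their usual MIME types and that "20MB" means 20 MiB; the names `validateImage`, `ImageInfo`, `ALLOWED_TYPES`, and `MAX_BYTES` are hypothetical and not taken from this tool's code:

```typescript
// Hypothetical limits derived from the page text: six formats, 20 MB cap.
const ALLOWED_TYPES: ReadonlySet<string> = new Set([
  "image/jpeg",
  "image/png",
  "image/webp",
  "image/bmp",
  "image/gif",
  "image/tiff",
]);
const MAX_BYTES = 20 * 1024 * 1024; // assumption: "20MB" is read as 20 MiB

// Only the two fields we need; a browser File object has both.
interface ImageInfo {
  type: string; // MIME type reported by the browser
  size: number; // size in bytes
}

type Validation = { ok: true } | { ok: false; reason: string };

// Reject unsupported formats and oversized files before starting OCR.
function validateImage(file: ImageInfo): Validation {
  if (!ALLOWED_TYPES.has(file.type)) {
    return { ok: false, reason: `Unsupported format: ${file.type || "unknown"}` };
  }
  if (file.size > MAX_BYTES) {
    return { ok: false, reason: "Image is larger than 20 MB" };
  }
  return { ok: true };
}
```

Because `ImageInfo` only asks for `type` and `size`, a `File` from a drop event could be passed in directly, and the OCR step would run only when the result is `ok`.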
Common questions
- Does it really recognize Japanese characters?
- Yes. The model is trained specifically on Japanese text, so 日本語 is recognized as written: hiragana, katakana, and kanji, including kana with dakuten and handakuten marks. You get the actual Japanese characters, not a romanized approximation.
- Why is the first recognition slow?
- On your first Japanese OCR run, the browser downloads the Japanese language model (typically 10–15 MB). This happens once per language. After that, the model is cached and loads without another download, unless you clear your browser's stored data. Switching back to English uses the English model, which is downloaded and cached the same way.
- What if the image has both Japanese and English in it?
- Pick the dominant language. Tesseract.js can recognize either, but it'll do best on whichever language matches the model. For heavily mixed documents (like academic papers with English citations in a Japanese body), running OCR twice — once in each language — and combining the results gives the best coverage.
- How accurate is it on handwriting?
- Tesseract is designed for printed text. Clean printed Japanese (a typed page, a screenshot, a clear photo of a document) is recognized with high accuracy. Handwriting works only for very neat, blocky writing — cursive and rushed notes generally produce garbage output. For handwriting recognition you'd need a different kind of model.
- Sample text in Japanese:
- "素早い茶色のキツネが怠惰な犬を飛び越える。"
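For the mixed Japanese and English case discussed above, one way to combine two OCR passes is to keep, line by line, whichever pass reported the higher confidence. This is only a sketch under stated assumptions: the `OcrLine` shape and the `mergePasses` function are hypothetical, and it assumes both passes split the image into the same lines in the same order, which real output may not guarantee.

```typescript
// Hypothetical shape for one recognized line; not a library type.
interface OcrLine {
  text: string;
  confidence: number; // assumed scale: 0 to 100, higher is better
}

// Merge a Japanese pass and an English pass of the same image.
// Assumption: line i in one pass corresponds to line i in the other.
function mergePasses(jpn: OcrLine[], eng: OcrLine[]): string {
  const count = Math.max(jpn.length, eng.length);
  const merged: string[] = [];
  for (let i = 0; i < count; i++) {
    const a: OcrLine | undefined = jpn[i];
    const b: OcrLine | undefined = eng[i];
    let best: OcrLine | undefined;
    if (a && b) {
      // Both passes produced this line: keep the more confident one.
      // Ties go to the Japanese pass.
      best = a.confidence >= b.confidence ? a : b;
    } else {
      // Only one pass produced this line: keep whatever exists.
      best = a ?? b;
    }
    if (best) merged.push(best.text);
  }
  return merged.join("\n");
}

// Usage with made-up results for a two-line image:
const combined = mergePasses(
  [
    { text: "日本語の本文", confidence: 92 },
    { text: "Smith et aI. 2020", confidence: 40 },
  ],
  [
    { text: "BAAEO", confidence: 20 },
    { text: "Smith et al. 2020", confidence: 88 },
  ],
);
```

Matching lines by index keeps the sketch short. If the two passes segment the page differently, matching lines by their position in the image would be a more robust choice.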