Sunbelt Computer Software

TuneOCR

TuneOCR is a stable, production-ready, open-source framework for fine-tuning OCR models. It helps researchers, developers, and hobbyists train and adapt small OCR models more effectively across diverse document types and languages while keeping compute and cost practical.

Overview

TuneOCR provides reproducible training pipelines, dataset utilities, and helpers for positioning special tokens in training targets, so you can experiment with language tags, transcription prompts, and label-formatting strategies without modifying model internals.
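
To make the idea of special-token positioning concrete, here is a minimal sketch of what such a helper could look like. The function name `place_special_tokens` and its parameters are hypothetical and are not taken from the TuneOCR codebase; the sketch only illustrates moving tags to the start, middle, or end of a target string.

```python
from typing import Sequence


def place_special_tokens(text: str, tokens: Sequence[str], position: str = "start") -> str:
    """Return `text` with `tokens` placed at the start, middle, or end.

    Hypothetical helper for illustration only; not part of TuneOCR's API.
    """
    tag = "".join(tokens)
    if not tag:
        return text
    if position == "start":
        return f"{tag}{text}"
    if position == "end":
        return f"{text}{tag}"
    if position == "middle":
        # Split on whitespace and insert the tag between the two halves.
        # Note: this collapses runs of whitespace into single spaces.
        words = text.split()
        mid = len(words) // 2
        return " ".join(words[:mid] + [tag] + words[mid:])
    raise ValueError(f"position must be 'start', 'middle', or 'end', got {position!r}")


tokens = ["<|en|>", "<|transcribe|>"]
for position in ("start", "middle", "end"):
    print(position, "->", place_special_tokens("Invoice total due today", tokens, position))
```

Because the formatting happens on the target string before tokenization, a helper like this can be swapped in a data collator without touching the model itself. The special tokens would still need to be known to the tokenizer of whichever backend is in use.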

Key capabilities:

Fine-tune a variety of compact OCR and document-understanding models.
Reformat or inject special tokens (e.g., language tags) in training targets.
Simple data collators and training scripts that run on CPU, GPU, or small cloud instances.
Extensible plugin/back-end system for new model integrations.
Built-in deterministic runs and logging to reproduce experiments.
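
As an illustration of what a deterministic run involves, the sketch below seeds the common sources of randomness. It is an assumption about how seeding might be done, not TuneOCR's actual implementation; `seed_everything` is a hypothetical name, and NumPy and PyTorch are seeded only if they happen to be installed.

```python
import random


def seed_everything(seed: int) -> None:
    """Seed Python's RNG, plus NumPy and PyTorch when available.

    Hypothetical helper for illustration only.
    """
    random.seed(seed)
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is optional in this sketch
    try:
        import torch

        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch is optional in this sketch


seed_everything(1234)
first = [random.random() for _ in range(3)]
seed_everything(1234)
second = [random.random() for _ in range(3)]
print(first == second)
```

Seeding alone does not guarantee bit-identical GPU results, since some GPU kernels are nondeterministic, so logged configs and seeds are best treated as the minimum needed to compare runs fairly.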

Supported Architectures

TuneOCR ships with integrations or examples for the following architectures:

TrOCR — Single-line handwriting and printed text.
Donut — Document-to-JSON extraction (receipts, invoices, structured forms).
VLMs — Visual-linguistic models for complex document QA and understanding.
QWEN OCR — Multi-language, high-capacity OCR models.
Nanonets OCR — Lightweight, fast OCR suitable for scanned forms and field capture.
OlmOCR — Flexible research-oriented OCR framework for experimentation.

If a backend you need is missing, TuneOCR is designed so you can add one quickly (processor, model wrapper, data prep).
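
The three pieces named above (processor, model wrapper, data prep) could be bundled and registered roughly as follows. This is a sketch under assumptions: `Backend`, `register_backend`, and `get_backend` are hypothetical names chosen for illustration, and the real TuneOCR plugin interface may look different.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Backend:
    """Hypothetical bundle of the three pieces a backend supplies."""

    name: str
    load_processor: Callable[[], Any]  # returns the image/text processor
    load_model: Callable[[], Any]  # returns the wrapped model
    prepare_example: Callable[[Dict[str, Any]], Dict[str, Any]]  # data prep


_BACKENDS: Dict[str, Backend] = {}


def register_backend(backend: Backend) -> None:
    """Add a backend to the registry, refusing duplicate names."""
    if backend.name in _BACKENDS:
        raise ValueError(f"backend {backend.name!r} is already registered")
    _BACKENDS[backend.name] = backend


def get_backend(name: str) -> Backend:
    """Look up a backend by name with a readable error on a miss."""
    if name not in _BACKENDS:
        raise KeyError(f"unknown backend {name!r}; known: {sorted(_BACKENDS)}")
    return _BACKENDS[name]


# A toy backend with stand-in loaders, only to show the flow end to end.
register_backend(
    Backend(
        name="toy",
        load_processor=lambda: str.strip,
        load_model=lambda: None,
        prepare_example=lambda ex: {"image": ex["image_path"], "target": ex["text"].strip()},
    )
)

backend = get_backend("toy")
print(backend.prepare_example({"image_path": "page_001.png", "text": "  Total: 42.00 "}))
```

Keeping the loaders as zero-argument callables means registering a backend stays cheap; the model is only loaded when a training run actually asks for it.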

Why TuneOCR?

Open source & community-first — Inspect, reuse, and extend the training recipes.
Small-model focused — Emphasizes parameter-efficient approaches (LoRA/adapters, selective fine-tuning) and practical defaults so you can iterate on modest hardware.
Flexible label formatting — Move language/task tokens (e.g., <|en|>, <|transcribe|>) to start, middle, or end of targets to test what works for your data.
Reproducible experiments — Deterministic seeds, config-driven runs, and logging for fair comparisons.
Practical evaluations — WER / CER and structured extraction checks (for Donut-like workflows).
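
For readers unfamiliar with the metrics, WER and CER are both a Levenshtein edit distance divided by the reference length, counted over words and characters respectively. The sketch below is a plain-Python reference version written for this explanation; it is not TuneOCR's evaluation code, and the function names are illustrative.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character edits / reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word edits / reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


print(cer("abcd", "abed"))                          # one substitution over 4 characters
print(wer("the quick brown fox", "the quick fox"))  # one deletion over 4 words
```

Both rates can exceed 1.0 when the hypothesis is much longer than the reference, which is worth keeping in mind when averaging over a dataset.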

Installation

Python 3.11 is recommended.

Install with pip

pip install -r requirements.txt

Contribution & Community

Contributions are very welcome! Help the project grow by opening issues and submitting PRs.

How to contribute:

Open an issue for bugs, feature requests, or new backend proposals. Include:

Problem statement / feature description
Minimal repro or sample dataset (if applicable)
Expected behavior or desired API

Create a Pull Request (PR) for fixes, features, docs, or new backends. A good PR includes:

Descriptive title and summary of the change
Tests where appropriate (unit or small integration)
Updated CHANGELOG.md and documentation for visible changes

Use feature branches named like feature/qwen-backend or fix/token-positioning-bug.

Contributors

All contributors are welcome. To be acknowledged:

Add yourself to CONTRIBUTORS.md via PR, or open an issue to request inclusion.
Major contributions will be noted in CHANGELOG.md and release notes.

Contact / Maintainer

Author / Maintainer: Emkay Nguyen
Email: minhkhoinguyendo1210@gmail.com

For coordination, partnership, or academic collaboration, open an issue or email the maintainer.

