Releases: opendatalab/MinerU
mineru-3.4.0-released
What's Changed
2026/06/18 3.4 Released
This release focuses on OCR capability upgrades for the `pipeline` backend, OCR processing pipeline optimization, and model download experience improvements. The main updates include:
- OCR model upgrade and processing acceleration
  - The OCR model for the `pipeline` backend has been upgraded to PP-OCRv6, improving OCR accuracy by about 11% on OmniDocBench v1.6.
  - Removed the Japanese, Traditional Chinese, English, and Latin options from OCR language selection. These scenarios are now routed to the `ch` OCR model, simplifying model configuration and language selection.
  - Optimized the OCR inference and processing pipeline, increasing OCR processing speed by about 100% and significantly improving parsing efficiency for batch documents and OCR-intensive documents.
- Model download logic optimization
  - Added automatic model source selection, allowing first-time installations to choose a better model source based on the current network environment.
  - Before downloading models, MinerU now checks the local model cache first. Cache hits are reused directly, reducing repeated downloads and unnecessary remote requests.
  - For more details about model source configuration, automatic source selection, and local model usage, see the Model Source Documentation.
With the 3.4 release, MinerU further improves the parsing accuracy and processing efficiency of the `pipeline` backend in OCR scenarios. It also optimizes model downloads, cache reuse, and local configuration write-back, making first-time installation, model updates, and multi-environment deployment more stable and automated.
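The cache-first download behavior described for 3.4 can be pictured with a minimal sketch. This is not MinerU's implementation: `resolve_model`, `is_reachable`, `download`, and the cache layout are hypothetical names chosen for illustration, and the real source-selection logic may weigh sources differently.

```python
from pathlib import Path
from typing import Callable, Iterable


def resolve_model(
    name: str,
    cache_dir: Path,
    sources: Iterable[str],
    is_reachable: Callable[[str], bool],
    download: Callable[[str, str, Path], Path],
) -> Path:
    """Return a local path for `name`, downloading only on a cache miss.

    All parameters are hypothetical stand-ins, not MinerU APIs.
    """
    cached = cache_dir / name
    if cached.exists():
        # Cache hit: reuse the local files and make no remote request.
        return cached
    # Cache miss: use the first source that responds in this network environment.
    for source in sources:
        if is_reachable(source):
            return download(source, name, cached)
    raise RuntimeError(f"no reachable model source for {name!r}")
```

Passing the reachability check and the downloader in as callables keeps the sketch independent of any particular model hub, and makes the cache-hit path easy to verify without network access.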
Full Changelog: mineru-3.3.1-released...mineru-3.4.0-released
mineru-3.3.1-released
What's Changed
2026/06/11 3.3 Released
This release focuses on Hybrid parsing performance optimization and VLM model capability upgrades. The main updates include:
- New `effort` parsing-strength parameter for the Hybrid backend
  - Added two parsing-strength levels, `medium` and `high`, allowing users to balance parsing speed, parsing accuracy, and feature requirements.
  - On OmniDocBench v1.6, `medium` reduces overall accuracy by only 0.13 points compared with `high`, while delivering 35%~220% parsing speed improvements across different devices and scenarios:
    - Linux: about 80% faster for text PDF scenarios and about 35% faster for OCR scenarios
    - Windows: about 90% faster for text PDF scenarios and about 45% faster for OCR scenarios
    - macOS: about 220% faster for text PDF scenarios and about 50% faster for OCR scenarios
  - The default Hybrid backend now uses `effort=medium`, significantly improving overall parsing efficiency while maintaining high parsing accuracy.
  - The `medium` level does not support image analysis (image and chart analysis). For maximum parsing accuracy or image analysis support, switch to the high-strength parsing mode with `effort=high`, which may reduce parsing speed.
- VLM model upgraded to MinerU2.5-Pro-2605-1.2B
  - Fixed multiple model issues found in the 2604 version, further improving parsing stability on complex documents.
  - Added native multilingual OCR support, reducing the need for extra language-parameter configuration and improving out-of-the-box usability for multilingual documents.
With the 3.3 release, MinerU further improves Hybrid backend efficiency across platforms and scenarios while maintaining high-accuracy parsing. The default `medium` effort level is better suited for most day-to-day document processing tasks, while `high` is designed for scenarios that require maximum parsing accuracy or image analysis capabilities.
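To make the 3.3 speed figures and the `medium`/`high` trade-off concrete, here is a small sketch. It assumes "X% faster" means a throughput gain of X% (so the same job takes 1 / (1 + X/100) of the original time); the notes do not state the exact definition. `time_fraction` and `choose_effort` are hypothetical helpers, not part of MinerU.

```python
# Speed gains of `medium` over `high`, in percent, as quoted in the 3.3 notes.
SPEEDUP_PERCENT = {
    "Linux": {"text_pdf": 80, "ocr": 35},
    "Windows": {"text_pdf": 90, "ocr": 45},
    "macOS": {"text_pdf": 220, "ocr": 50},
}


def time_fraction(speedup_percent: float) -> float:
    """Fraction of the original wall-clock time left after an 'X% faster' gain.

    Assumes the percentage is a throughput gain (an assumption, see lead-in).
    """
    return 1.0 / (1.0 + speedup_percent / 100.0)


def choose_effort(needs_image_analysis: bool, needs_max_accuracy: bool = False) -> str:
    """Pick an effort level: `medium` lacks image analysis, `high` provides it."""
    return "high" if (needs_image_analysis or needs_max_accuracy) else "medium"


if __name__ == "__main__":
    for platform, gains in SPEEDUP_PERCENT.items():
        for scenario, pct in gains.items():
            print(f"{platform:8s} {scenario:9s} +{pct}% -> {time_fraction(pct):.2f}x the time")
    print(choose_effort(needs_image_analysis=False))
```

Under this reading, the 220% figure for macOS text PDFs corresponds to roughly a third of the original processing time, while the 35% figure for Linux OCR corresponds to about three quarters of it.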
Full Changelog: mineru-3.2.3-released...mineru-3.3.1-released
mineru-3.3.0-released
Merge pull request #5110 from opendatalab/dev 3.3.0
mineru-3.2.3-released
What's Changed
- feat: added support for superscript and subscript detection/output.
- feat: implement post-OCR fallback mechanism for private use text handling
Full Changelog: mineru-3.2.2-released...mineru-3.2.3-released
mineru-3.2.2-released
mineru-3.2.1-released
What's Changed
Adapt to vLLM 0.21.0
- Relaxed the vLLM version upper bound to 0.21.0, enabling support for newer vLLM environments.
- Updated the default NVIDIA Docker base image to vLLM 0.21.0 with CUDA 13, providing native support for sm121 devices such as Spark DGX.
- If the device driver version is too low to support CUDA 13, change `vllm/vllm-openai:v0.21.0` to `vllm/vllm-openai:v0.21.0-cu129` in the Dockerfile.
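For users who script their builds, the base-image swap described above could be automated along these lines. This is only a sketch: the two image tags come from the notes, but `use_cuda129_image` is a hypothetical helper and the layout of the project's Dockerfile is assumed, not checked.

```python
import re

DEFAULT_IMAGE = "vllm/vllm-openai:v0.21.0"        # CUDA 13 based default
CUDA129_IMAGE = "vllm/vllm-openai:v0.21.0-cu129"  # fallback for older drivers

# Match the default tag only when it is not already followed by a suffix
# such as "-cu129", so running the rewrite twice changes nothing.
_DEFAULT_TAG = re.compile(re.escape(DEFAULT_IMAGE) + r"(?![\w.-])")


def use_cuda129_image(dockerfile_text: str) -> str:
    """Return the Dockerfile text with the default base image swapped."""
    return _DEFAULT_TAG.sub(CUDA129_IMAGE, dockerfile_text)
```

A caller would read the Dockerfile, pass its text through `use_cuda129_image`, and write the result back; editing the `FROM` line by hand achieves the same thing.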
Full Changelog: mineru-3.2.0-released...mineru-3.2.1-released
mineru-3.2.0-released
What's Changed
MinerU 3.2.0 is now available. This release focuses on UI improvements, dependency optimization, VLM model updates, and general stability fixes.
- Improved the Gradio interface for a smoother upload, preview, and result-viewing experience.
- Optimized dependency management by reducing unnecessary dependencies and improving runtime maintainability.
- Updated the VLM model to the 2605 version, improving VLM-based parsing capability and stability.
- Fixed various known issues to improve overall stability and compatibility.
Full Changelog: mineru-3.1.15-released...mineru-3.2.0-released
mineru-3.1.15-released
What's Changed
- Improved Gradio preview and upload experience, including Office source-file preview links, clipboard file upload, clearer processing status, better i18n rendering, and extracted Gradio CSS/JS/header resources.
- Fixed Gradio Markdown/HTML image previews to use served file URLs instead of embedded base64, improving preview compatibility without changing exported artifacts.
- Improved Office parsing robustness, including DOCX table alignment, safer XML tag-name handling, embedded Office member normalization, and better DOCX table matching.
- Enhanced XLSX package normalization for shared strings, styles, worksheets, and row-only auto filters to improve compatibility with non-standard files.
- Optimized OCR/formula processing and image handling, including async OCR/formula execution, updated OCR defaults, larger image width limits, cached vector placeholders, and single-write image reuse.
- Added Router API docs for POST request parameters in `/file_parse` and `/tasks`.
Full Changelog: mineru-3.1.14-released...mineru-3.1.15-released
mineru-3.1.14-released
What's Changed
- Accuracy improvements:
  - Optimized the `pdf_classify` classification pipeline.
  - Tuned the contrast threshold boundary for span OCR.
Full Changelog: mineru-3.1.13-released...mineru-3.1.14-released
mineru-3.1.13-released
What's Changed
- Made Office parsing and rendering more stable and consistent, especially for edge cases.
- Improved Office parsing reliability for DOCX/PPTX/XLSX: fixed ordered-list restarts, list numbering recovery, and merged/hidden-cell handling in spreadsheets.
- Improved formatting fidelity in DOCX output: richer style handling (sup/sub support, visible-space improvements, safer style mapping), plus better strikethrough and inline rendering compatibility.
- Enhanced the parsing/rendering pipeline and docs: inline vector/base64 image handling, small robustness fixes (TOC handling, edge trimming, VLM text normalization), and updated examples and output docs.
Full Changelog: mineru-3.1.12-released...mineru-3.1.13-released
