Releases · opendatalab/MinerU · GitHub

Releases: opendatalab/MinerU

mineru-3.4.0-released

18 Jun 07:23
562ba58

What's Changed

  • 2026/06/18 3.4 Released

    This release focuses on OCR capability upgrades for the pipeline backend, OCR processing pipeline optimization, and model download experience improvements. The main updates include:

    • OCR model upgrade and processing acceleration

      • The OCR model for the pipeline backend has been upgraded to PP-OCRv6, improving OCR accuracy by about 11% on OmniDocBench v1.6.
      • Removed Japanese, Traditional Chinese, English, and Latin options from OCR language selection. These scenarios are now routed to the ch OCR model, simplifying model configuration and language selection.
      • Optimized the OCR inference and processing pipeline, roughly doubling OCR processing speed (about 100% faster) and significantly improving parsing efficiency for batch and OCR-intensive documents.
    • Model download logic optimization

      • Added automatic model source selection, allowing first-time installations to choose a better model source based on the current network environment.
      • Before downloading models, MinerU now prioritizes checking locally downloaded model cache files. Cache hits can be reused directly, reducing repeated downloads and unnecessary remote requests.
      • For more details about model source configuration, automatic source selection, and local model usage, see the Model Source Documentation.
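The language-option change above amounts to a small lookup. A minimal sketch, assuming hypothetical option codes for the four removed choices and a hypothetical helper named `resolve_ocr_lang`; neither is taken from MinerU's code, and only the `ch` target comes from the release notes:

```python
# Hypothetical option codes for the four removed language choices
# (Japanese, Traditional Chinese, English, Latin); the identifiers
# MinerU actually uses may differ.
REMOVED_LANG_OPTIONS = {"japan", "chinese_cht", "en", "latin"}


def resolve_ocr_lang(requested: str) -> str:
    """Return the OCR model key to load for a requested language option."""
    key = requested.strip().lower()
    # Removed options are now served by the unified "ch" model.
    if key in REMOVED_LANG_OPTIONS:
        return "ch"
    # Every other option is passed through unchanged.
    return key


if __name__ == "__main__":
    for lang in ("en", "japan", "korean"):
        print(lang, "->", resolve_ocr_lang(lang))
```

Keeping the routing in one table means callers that still pass a removed option keep working without a configuration change.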

    With the 3.4 release, MinerU further improves the parsing accuracy and processing efficiency of the pipeline backend in OCR scenarios. It also optimizes model downloads, cache reuse, and local configuration write-back, making first-time installation, model updates, and multi-environment deployment more stable and automated.
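The cache-first download behaviour described in this release can be sketched as follows. This is an illustration under stated assumptions, not MinerU's implementation: `resolve_model`, `make_fake_downloader`, and the `ocr/det.bin` path are hypothetical names, and the downloader is a stand-in that never touches the network.

```python
import tempfile
from pathlib import Path
from typing import Callable


def resolve_model(relative_path: str, cache_dir: Path,
                  download: Callable[[str, Path], None]) -> Path:
    """Return a local path for a model file, downloading only on a cache miss."""
    target = cache_dir / relative_path
    if target.is_file():
        # Cache hit: reuse the local file and skip the remote request.
        return target
    target.parent.mkdir(parents=True, exist_ok=True)
    download(relative_path, target)
    return target


def make_fake_downloader(calls: list) -> Callable[[str, Path], None]:
    """Build a stand-in downloader that records calls instead of using the network."""
    def fake_download(relative_path: str, target: Path) -> None:
        calls.append(relative_path)
        target.write_bytes(b"fake-weights")
    return fake_download


if __name__ == "__main__":
    calls = []
    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp)
        fetch = make_fake_downloader(calls)
        resolve_model("ocr/det.bin", cache, fetch)  # miss: triggers a download
        resolve_model("ocr/det.bin", cache, fetch)  # hit: reuses the cached file
    print("downloads performed:", len(calls))
```

Passing the downloader in as a callable keeps the cache check independent of whichever model source is selected.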

Full Changelog: mineru-3.3.1-released...mineru-3.4.0-released

mineru-3.3.1-released

11 Jun 14:27
d3cedcb

What's Changed

  • 2026/06/11 3.3 Released

    This release focuses on Hybrid parsing performance optimization and VLM model capability upgrades. The main updates include:

    • New effort parsing-strength parameter for the Hybrid backend

      • Added two parsing-strength levels, medium and high, allowing users to balance parsing speed, parsing accuracy, and feature requirements.
      • On OmniDocBench v1.6, medium reduces overall accuracy by only 0.13 points compared with high, while parsing 35% to 220% faster depending on device and scenario:
        • Linux: about 80% faster for text PDF scenarios and about 35% faster for OCR scenarios
        • Windows: about 90% faster for text PDF scenarios and about 45% faster for OCR scenarios
        • macOS: about 220% faster for text PDF scenarios and about 50% faster for OCR scenarios
      • The default Hybrid backend now uses effort=medium, significantly improving overall parsing efficiency while maintaining high parsing accuracy.
      • The medium level does not support image analysis. For maximum parsing accuracy or image analysis support, switch to high-strength parsing with effort=high, at the cost of slower parsing.
    • VLM model upgraded to MinerU2.5-Pro-2605-1.2B

      • Fixed multiple model issues found in the 2604 version, further improving parsing stability on complex documents.
      • Added native multilingual OCR support, reducing the need for extra language-parameter configuration and improving out-of-the-box usability for multilingual documents.
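The choice between the two effort levels described above reduces to a simple rule. A minimal sketch, assuming a hypothetical helper `choose_effort`; only the values `medium` and `high` and the `effort` parameter name come from the release notes, and how the value is passed to MinerU is not shown here:

```python
def choose_effort(needs_image_analysis: bool, maximize_accuracy: bool = False) -> str:
    """Pick a Hybrid effort level from the two requirements named in the notes."""
    # Image analysis is only available at the high level, and high is also
    # the level recommended when maximum accuracy matters more than speed.
    if needs_image_analysis or maximize_accuracy:
        return "high"
    # Otherwise keep the default, which trades 0.13 accuracy points for speed.
    return "medium"


if __name__ == "__main__":
    print("effort=" + choose_effort(needs_image_analysis=False))
    print("effort=" + choose_effort(needs_image_analysis=True))
```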

    With the 3.3 release, MinerU further improves Hybrid backend efficiency across platforms and scenarios while maintaining high-accuracy parsing. The default medium effort level is better suited for most day-to-day document processing tasks, while high is designed for scenarios that require maximum parsing accuracy or image analysis capabilities.
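To put the speed figures in concrete terms, the sketch below converts each "about N% faster" number into a throughput multiplier and a relative processing time. It assumes that "N% faster" means throughput rises by N%, which the release notes do not state explicitly; the function names are illustrative.

```python
# Percent speedups of effort=medium over effort=high, as listed in the notes.
SPEEDUPS_PERCENT = {
    ("Linux", "text PDF"): 80,
    ("Linux", "OCR"): 35,
    ("Windows", "text PDF"): 90,
    ("Windows", "OCR"): 45,
    ("macOS", "text PDF"): 220,
    ("macOS", "OCR"): 50,
}


def throughput_multiplier(percent_faster: float) -> float:
    """Pages per second at medium, relative to high (assumed interpretation)."""
    return 1.0 + percent_faster / 100.0


def relative_time(percent_faster: float) -> float:
    """Fraction of the original processing time needed for the same workload."""
    return 1.0 / throughput_multiplier(percent_faster)


if __name__ == "__main__":
    for (platform, scenario), pct in SPEEDUPS_PERCENT.items():
        print(f"{platform:8s} {scenario:9s} "
              f"x{throughput_multiplier(pct):.2f} throughput, "
              f"{relative_time(pct):.0%} of the time")
```

Under this reading, the macOS text-PDF case (220% faster) corresponds to 3.2 times the throughput, or about 31% of the original processing time.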

Full Changelog: mineru-3.2.3-released...mineru-3.3.1-released

mineru-3.3.0-released

11 Jun 11:05
1a59613

Merge pull request #5110 from opendatalab/dev

3.3.0

mineru-3.2.3-released

04 Jun 07:06
0a1b03d

What's Changed

  • feat: add support for superscript and subscript detection and output
  • feat: implement post-OCR fallback mechanism for private use text handling
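One way to picture a fallback for private-use text is sketched below. It assumes "private use text" refers to Unicode Private Use Area code points (general category `Co`), which PDFs with custom font encodings can emit in place of readable characters. The release note does not describe the mechanism, so the helper names and the decision rule are hypothetical.

```python
import unicodedata
from typing import Callable


def has_private_use_chars(text: str) -> bool:
    """True if any character belongs to the Unicode private-use category (Co)."""
    return any(unicodedata.category(ch) == "Co" for ch in text)


def text_with_ocr_fallback(extracted: str, run_ocr: Callable[[], str]) -> str:
    """Keep the extracted text unless it contains private-use characters."""
    if has_private_use_chars(extracted):
        # The embedded text cannot be trusted, so recognize the region instead.
        return run_ocr()
    return extracted


if __name__ == "__main__":
    # The lambda stands in for a real OCR call on the same region.
    print(text_with_ocr_fallback("plain text", lambda: "ocr result"))
    print(text_with_ocr_fallback("bad \ue001 glyph", lambda: "ocr result"))
```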

Full Changelog: mineru-3.2.2-released...mineru-3.2.3-released

mineru-3.2.2-released

02 Jun 15:16
c574a9a

What's Changed

  • #5033 fix: Enhance PDF processing and improve concurrency management by @myhloli in #5062
  • #5061 fix: add functionality to skip broken PDF pages during rewrite process by @myhloli in #5064

Full Changelog: mineru-3.2.1-released...mineru-3.2.2-released

mineru-3.2.1-released

28 May 16:01
9697823

What's Changed

Adapt to vLLM 0.21.0

  • Relaxed the vLLM version upper bound to 0.21.0, enabling support for newer vLLM environments.
  • Updated the default NVIDIA Docker base image to vLLM 0.21.0 with CUDA 13, providing native support for sm121 devices such as Spark DGX.
  • If the device driver version is too low to support CUDA 13, please change vllm/vllm-openai:v0.21.0 to vllm/vllm-openai:v0.21.0-cu129 in the Dockerfile.
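The base-image change for older drivers can be scripted. A minimal sketch, assuming the Dockerfile contains a plain `FROM vllm/vllm-openai:v0.21.0` line without extra flags such as `--platform`; the helper name `use_cu129_base` is hypothetical, and only the two image tags come from the release notes:

```python
import re

DEFAULT_IMAGE = "vllm/vllm-openai:v0.21.0"
CU129_IMAGE = "vllm/vllm-openai:v0.21.0-cu129"

# Match a FROM line that uses exactly the default tag. The lookahead stops
# the pattern from matching a line that already carries the -cu129 suffix.
_FROM_DEFAULT = re.compile(
    r"^(FROM\s+)" + re.escape(DEFAULT_IMAGE) + r"(?=\s|$)",
    re.MULTILINE,
)


def use_cu129_base(dockerfile_text: str) -> str:
    """Return the Dockerfile text with the base image switched to the cu129 tag."""
    return _FROM_DEFAULT.sub(lambda m: m.group(1) + CU129_IMAGE, dockerfile_text)


if __name__ == "__main__":
    original = "FROM vllm/vllm-openai:v0.21.0\nRUN echo ready\n"
    print(use_cu129_base(original))
```

Applying the function twice leaves the text unchanged, so it is safe to run in a repeated build step.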

Full Changelog: mineru-3.2.0-released...mineru-3.2.1-released

mineru-3.2.0-released

26 May 13:06
0ecc067

What's Changed

MinerU 3.2.0 is now available. This release focuses on UI improvements, dependency optimization, VLM model updates, and general stability fixes.

  • Improved the Gradio interface for a smoother upload, preview, and result-viewing experience.
  • Optimized dependency management by reducing unnecessary dependencies and improving runtime maintainability.
  • Updated the VLM model to the 2605 version, improving VLM-based parsing capability and stability.
  • Fixed various known issues to improve overall stability and compatibility.

Full Changelog: mineru-3.1.15-released...mineru-3.2.0-released

mineru-3.1.15-released

19 May 10:56
3053436

What's Changed

  • Improved Gradio preview and upload experience, including Office source-file preview links, clipboard file upload, clearer processing status, better i18n rendering, and extracted Gradio CSS/JS/header resources.
  • Fixed Gradio Markdown/HTML image previews to use served file URLs instead of embedded base64, improving preview compatibility without changing exported artifacts.
  • Improved Office parsing robustness, including DOCX table alignment, safer XML tag-name handling, embedded Office member normalization, and better DOCX table matching.
  • Enhanced XLSX package normalization for shared strings, styles, worksheets, and row-only auto filters to improve compatibility with non-standard files.
  • Optimized OCR/formula processing and image handling, including async OCR/formula execution, updated OCR defaults, larger image width limits, cached vector placeholders, and single-write image reuse.
  • Added Router API docs for POST request parameters in /file_parse and /tasks.

Full Changelog: mineru-3.1.14-released...mineru-3.1.15-released

mineru-3.1.14-released

15 May 15:02
d60304f

What's Changed

  • Accuracy improvements:
    • Optimized the pdf_classify classification pipeline.
    • Tuned the contrast threshold boundary for span OCR.

Full Changelog: mineru-3.1.13-released...mineru-3.1.14-released

mineru-3.1.13-released

14 May 13:16
2528385

What's Changed

  • Made Office parsing and rendering more stable and consistent, especially for edge cases.
    • Improved Office parsing reliability for DOCX/PPTX/XLSX: fixed ordered-list restarts, list numbering recovery, and merged/hidden-cell handling in spreadsheets.
    • Improved formatting fidelity in DOCX output: richer style handling (sup/sub support, visible-space improvements, safer style mapping) and better strikethrough/inline rendering compatibility.
    • Enhanced the parsing/rendering pipeline and docs: inline vector/base64 image handling, small robustness fixes (TOC, edge trimming, VLM text normalization), and updated examples and output docs.

Full Changelog: mineru-3.1.12-released...mineru-3.1.13-released