feat: mirror source directory tree in batch translation output by rudi193-cmd · Pull Request #1148 · PDFMathTranslate/PDFMathTranslate · GitHub

feat: mirror source directory tree in batch translation output #1148

Open
rudi193-cmd wants to merge 1 commit into PDFMathTranslate:main from rudi193-cmd:feat/dir-mirror-output
Conversation

@rudi193-cmd

Summary

Closes #793.

When --dir is used, translated files now land in a subdirectory structure that mirrors the source tree instead of being flattened into the output root.

Example:

docs/
  papers/intro.pdf
  supplemental/appendix.pdf

With pdf2zh --dir docs/ -o out/ you now get:

out/
  papers/intro-mono.pdf
  papers/intro-dual.pdf
  supplemental/appendix-mono.pdf
  supplemental/appendix-dual.pdf

What changed

  • TranslateRequest (kernel/protocol.py): added source_dir: Optional[str] = None field
  • LegacyKernel.translate() (kernel/legacy.py): passes source_dir through to high_level.translate() when set
  • high_level.translate() (high_level.py): accepts source_dir kwarg; computes a relative output subdir per file using os.path.relpath, creates it with mkdir(parents=True), skips the logic for URL inputs
  • main() (pdf2zh.py): captures source_dir = os.path.abspath(files[0]) before expanding the file list, passes it into TranslateRequest
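
The per-file subdirectory step described above can be sketched roughly as follows. This is a minimal illustration, not the code from the diff: `resolve_output_dir` is a hypothetical helper name, and the URL check, the fallback for files outside `source_dir`, and the use of `exist_ok=True` are assumptions made so the sketch is safe to run repeatedly.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_output_dir(file: str, output: str, source_dir: Optional[str] = None) -> Path:
    """Hypothetical helper: choose the output directory for one input file."""
    out_root = Path(output)

    # URL inputs and runs without --dir keep the flat layout.
    if source_dir is None or file.startswith(("http://", "https://")):
        return out_root

    # Directory of the input file, relative to the source root.
    file_dir = os.path.dirname(os.path.abspath(file))
    rel_parent = os.path.relpath(file_dir, os.path.abspath(source_dir))

    # Assumption: a file outside source_dir falls back to the flat layout
    # rather than escaping the output root through "..".
    if rel_parent == os.pardir or rel_parent.startswith(os.pardir + os.sep):
        return out_root

    # "." (file directly in source_dir) collapses to out_root itself.
    target = out_root / rel_parent
    target.mkdir(parents=True, exist_ok=True)
    return target
```

With the example tree from the summary, `resolve_output_dir("docs/papers/intro.pdf", "out", "docs")` yields `out/papers` and creates it if it does not exist yet.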

URL inputs and invocations without --dir are unaffected: they continue writing directly to Path(output).

Test plan

  • pdf2zh --dir /path/to/nested/ -o /tmp/out/ — verify output mirrors source hierarchy
  • Single-file invocation pdf2zh paper.pdf -o /tmp/out/ — verify flat output unchanged
  • URL input (pdf2zh https://…/paper.pdf -o /tmp/out/) — verify flat output unchanged
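
For the first check in the test plan, one way to verify the mirrored hierarchy is to compute the paths that should exist after a run and compare them with what is on disk. The helper below is a hypothetical sketch: the name `expected_outputs` is illustrative, and it assumes every `*.pdf` under the source directory produces a `-mono` and a `-dual` file, as in the example in the summary.

```python
from pathlib import Path
from typing import List


def expected_outputs(source_dir: str, output: str) -> List[Path]:
    """Hypothetical helper: list the output paths a mirrored run should produce."""
    src = Path(source_dir)
    out = Path(output)
    expected: List[Path] = []
    # Walk the source tree recursively, in a stable order.
    for pdf in sorted(src.rglob("*.pdf")):
        rel = pdf.relative_to(src)
        # Assumption: each input yields a -mono and a -dual PDF.
        for suffix in ("-mono", "-dual"):
            expected.append(out / rel.parent / f"{rel.stem}{suffix}.pdf")
    return expected
```

After running pdf2zh --dir /path/to/nested/ -o /tmp/out/, something like `[p for p in expected_outputs("/path/to/nested/", "/tmp/out/") if not p.exists()]` should come back empty if the output mirrors the source tree.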

When translating a directory with --dir, output files now reflect the
source folder hierarchy instead of flattening everything into one level.
Adds source_dir field to TranslateRequest and threads it through
LegacyKernel → high_level.translate() → per-file subdir creation.

Closes PDFMathTranslate#793

Development

Successfully merging this pull request may close these issues.

feat: batch recursive translation
