doc: generate man pages from markdown #7540
Open
petrasovaa wants to merge 3 commits into
echoix
reviewed
Jun 14, 2026
Member
Since this file is a script and even has a shebang, does it have the executable bit set? I’m not sure by looking at the GitHub interface

This PR adds a Markdown-to-groff converter for man pages. It reuses the existing groff Formatter and adjusts the makefiles to build the man pages from Markdown.
Context: We still have HTML documentation files, which we ultimately need to remove. Right now they are still used to build the man pages.
AI disclosure: This was generated by Fable. I tried to write such a converter myself a while ago, but this one works much better. I tested it myself by inspecting the results for several pages. Fable itself verified that all the syntax is covered, including by testing the markdown in addons.
More details on the architecture of the md→man conversion (AI summary):
The pipeline reuses the existing groff backend and only replaces the front end. Three stages:
Stage 1: parse Markdown to a tag tree (gmd.py). gmd.parse(text) returns (metadata, nodes). It strips YAML front matter and HTML comments, then splits the body into block-level nodes. Crucially, the output is a tree of (tag, attrs, body) tuples using HTML tag names (h2, p, dl/dt/dd, ul/li, table/tr/td, pre, b, i, code, …), deliberately the same shape that ghtml.HTMLParser produces. This is the design pivot: by emitting HTML-flavored nodes, the markdown converter can feed the unchanged groff formatter that g.html2man already uses. Block parsing handles the constructs that actually occur in the corpus (headings, fenced code, mkdocs === "tabs", admonitions, the -indented parameter lists from --md-description, lists, tables, blockquotes), and raw HTML fragments are delegated to ghtml.HTMLParser and spliced in. Inline parsing (parse_inline) handles emphasis, code spans, and links/images (text kept, target dropped, as man pages can't link), and protects backslash escapes via private-use placeholders so an escaped * can't pair with real emphasis.
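The escape-protection idea in the inline step can be illustrated with a small sketch. This is not the code from gmd.py: the helper names (`protect_escapes`, `restore`, `INLINE`, `PUA_BASE`) are hypothetical, the grammar is heavily simplified (no nesting, and escapes inside code spans are treated like any other), and only the `(tag, attrs, body)` node shape and the "keep link text, drop target" rule are taken from the description above.

```python
import re

# Hypothetical sketch; names and grammar are assumptions, not gmd.py itself.
PUA_BASE = 0xE000  # start of the Unicode private-use area


def protect_escapes(text):
    """Swap backslash-escaped punctuation for private-use placeholders."""
    saved = []

    def stash(match):
        saved.append(match.group(1))
        return chr(PUA_BASE + len(saved) - 1)

    return re.sub(r"\\([\\`*_\[\]()])", stash, text), saved


def restore(text, saved):
    """Put the literal characters back in place of the placeholders."""
    return "".join(
        saved[ord(ch) - PUA_BASE]
        if PUA_BASE <= ord(ch) < PUA_BASE + len(saved)
        else ch
        for ch in text
    )


INLINE = re.compile(
    r"`(?P<code>[^`]+)`"
    r"|\*\*(?P<b>.+?)\*\*"
    r"|\*(?P<i>.+?)\*"
    r"|!?\[(?P<link>[^\]]*)\]\([^)]*\)"
)


def parse_inline(text):
    """Return a list of plain strings and (tag, attrs, body) tuples."""
    protected, saved = protect_escapes(text)
    nodes, pos = [], 0
    for m in INLINE.finditer(protected):
        if m.start() > pos:
            nodes.append(restore(protected[pos:m.start()], saved))
        if m.group("link") is not None:
            # Links and images: keep the text, drop the target.
            nodes.append(restore(m.group("link"), saved))
        else:
            tag = m.lastgroup  # "code", "b" or "i"
            nodes.append((tag, {}, [restore(m.group(tag), saved)]))
        pos = m.end()
    if pos < len(protected):
        nodes.append(restore(protected[pos:], saved))
    return nodes


print(parse_inline(r"2 \* 3 is *six*"))
# → ['2 * 3 is ', ('i', {}, ['six'])]
```

Without the placeholder pass, the escaped `*` in this example would pair with the opening `*` of the real emphasis and italicize the wrong span.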
Stage 2: shape the document for a man page (g.md2man.py). build_document() takes the parsed tree and applies man-page-specific policy: it pulls name/description/keywords from front matter into .SH NAME / .SH KEYWORDS, turns the "Command line" mkdocs tab into the SYNOPSIS (dropping the Python API tabs, as decided earlier, via transform_tabs/expand_tabs), drops the lead paragraphs that merely repeat the description, and adds the conventional bare-name and --help synopsis lines. Pages without front matter (index/topic pages and html-only addon docs) keep their heading as a section instead.
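As a rough illustration of the front-matter policy in this stage, the following sketch prepends NAME and KEYWORDS sections to a node list. It is an assumption about how such a step could look, not the actual build_document() in g.md2man.py: the tab handling and synopsis lines are left out, and the tool name `x.demo` is made up.

```python
# Hypothetical sketch of the NAME/KEYWORDS part only; not g.md2man.py itself.
def build_document(metadata, nodes):
    """Prepend NAME and KEYWORDS sections built from front matter."""
    name = metadata.get("name")
    if not name:
        # No front matter: the page keeps its own headings as sections.
        return list(nodes)

    description = metadata.get("description", "")
    head = [
        ("h2", {}, ["NAME"]),
        ("p", {}, [("b", {}, [name]), " - " + description]),
    ]
    keywords = metadata.get("keywords") or []
    if keywords:
        head += [("h2", {}, ["KEYWORDS"]), ("p", {}, [", ".join(keywords)])]

    # Drop lead paragraphs that merely repeat the description.
    body = list(nodes)
    while body and body[0] == ("p", {}, [description]):
        body.pop(0)
    return head + body


meta = {
    "name": "x.demo",  # made-up tool name
    "description": "Demonstrates the front matter.",
    "keywords": ["demo", "docs"],
}
page = [
    ("p", {}, ["Demonstrates the front matter."]),
    ("h2", {}, ["DESCRIPTION"]),
]
for node in build_document(meta, page):
    print(node)
```

Because the result is still a list of HTML-flavored tuples, it can be handed to the formatter stage without any further conversion.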
Stage 3: format to groff (ggroff.Formatter, borrowed from g.html2man). The shaped tree is walked by the existing formatter, which emits the groff macros (.SH, .SS, .IP, .TS/.TE, .nf, font escapes). g.md2man.py then prepends the .TH title line and applies the same whitespace cleanup as g.html2man.py.
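To make the tree walk concrete, here is a toy formatter covering only a few tags. It is a sketch under stated assumptions and not ggroff.Formatter: the function names (`render`, `render_inline`, `guard`) are hypothetical, lists and tables (.IP, .TS/.TE) are omitted, and the choice of bold for code spans is arbitrary.

```python
# Hypothetical sketch; ggroff.Formatter handles far more than this.
FONTS = {"b": "B", "i": "I", "code": "B"}  # font choice is an assumption


def render_inline(body):
    """Render strings and inline (tag, attrs, body) nodes with font escapes."""
    parts = []
    for node in body:
        if isinstance(node, str):
            # A literal backslash must be written as \e in groff input.
            parts.append(node.replace("\\", "\\e"))
        else:
            tag, _attrs, children = node
            inner = render_inline(children)
            font = FONTS.get(tag)
            parts.append(f"\\f{font}{inner}\\fR" if font else inner)
    return "".join(parts)


def guard(text):
    """Prefix lines starting with . or ' so groff does not read them as requests."""
    return "\n".join(
        "\\&" + line if line[:1] in (".", "'") else line
        for line in text.split("\n")
    )


def render(nodes, title, section="1"):
    """Walk block-level nodes and emit man macros, led by the .TH line."""
    lines = [f".TH {title} {section}"]
    for tag, _attrs, body in nodes:
        text = guard(render_inline(body))
        if tag == "h2":
            lines.append(".SH " + text)
        elif tag == "h3":
            lines.append(".SS " + text)
        elif tag == "pre":
            lines += [".nf", text, ".fi"]
        else:
            lines += [".PP", text]
    return "\n".join(lines) + "\n"


doc = [
    ("h2", {}, ["NAME"]),
    ("p", {}, [("b", {}, ["x.demo"]), " - demo"]),  # made-up tool name
]
print(render(doc, "x.demo"), end="")
```

The point of the shared node shape is visible here: the walker only needs to know HTML tag names, so it does not matter whether the tree came from an HTML parser or a Markdown parser.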
Build wiring. include/Make/Html.make builds each man page from $(MDDIR)/source/%.md (the already-generated markdown) via $(MD2MAN); man/Makefile derives the page list from the markdown wildcard and builds the md indices before the man pages. So the data flow is: tool --md-description → mkmarkdown.py (assembles the generated .md) → g.md2man.py → gmd.py → ggroff.py → .1. The net result is that man generation no longer depends on the HTML build at all, while reusing its proven groff emitter.
For context, next steps after this PR (AI summary):
First add markdown generation to CMake. cmake/modules/generate_docs.cmake needs to run each tool's --md-description and mkmarkdown.py (the way Autotools' Html.make does), producing the $(MDDIR)/source/*.md files. This is the prerequisite — g.md2man has nothing to consume until these exist.
Then switch the man rule in generate_docs.cmake from mkhtml.py+g.html2man.py to mkmarkdown.py+g.md2man.py, mirroring the Autotools change. Also wire g.md2man/gmd.py into the CMake install (the equivalent of the utils/Makefile SUBDIRS entry).
After that: remove the HTML build (delete mkhtml.py, the html Make/CMake rules, the committed .html files, and the html index builders). Two cleanups fold in here: