Show HN: Doci.py, quick and dirty literate programming / yet another Docco clone
Doci is a quick and dirty tool for turning your script into documentation
that shows explanations and code side by side, a.k.a. "semi-literate programming".
It is simple and language agnostic.
This is very much a remix of Pycco,
which itself is a Python port of Docco,
the original lightweight literate programming tool.
Compared to Pycco and Docco, Doci:
- is simpler and better documented (in my opinion)
- processes most languages, by virtue of being dumb
- has a customizable template that works on mobile out of the box
Doci is written literately and generates its own documentation,
so it should be quite easy to understand, modify and extend.
Rationale
"Programs must be written for people to read, and only incidentally for machines to execute."
-- Harold Abelson and Gerald Jay Sussman, Structure and Interpretation of Computer Programs
Literate programming is underappreciated and underused,
in part because existing tools are quite heavyweight.
Most major languages support forward definitions, so you don't need
an external tool to reorder ("tangle") your code. Your compiler can do it just fine.
Debugging tangled code can be a pain, because you don't know where your code is going to end up.
LLMs write lots of comments, so a tool that turns comments into documentation
would make LLMs doubly useful!
So, like Pycco & Docco,
Doci just extracts your comments into prose, stuffs your code into code blocks
and calls it a day.
Usage
Clone this repository. Doci's own documentation is generated by running
# note the horrendous quoting
uv run doci.py -m README.md -H index.html -c '#' -b '""""""""' '""""""""' doci.py
This tells Doci to read its own source doci.py, treat # as the comment start
and """ as the block comment delimiters, and generate README.md (-m) and index.html (-H)
from it, provided that the default HTML template template.html exists in the same folder.
For other languages, use -l (--language) to specify the language name for syntax highlighting,
and change the comment (-c) and block comment (-b) symbols accordingly.
For example, processing script.js this way will create script.js.md and script.js.html.
By itself Doci is quite dumb and needs to rely on other tools. For example, scripts/build.py
watches doci.py for changes and reruns it automatically, which
is a very simple way to add file watching abilities to Doci.
(Maybe this is a good thing: Doci is not trying to do everything by itself ;)
extract_chunks extracts doc and code chunks from a program. We return a list of chunks,
a list of strings indicating each chunk's type (doc or code),
and a list of locations of the top of each chunk.
lc = len(lines)  # line count
Contents are lines stripped of leading and trailing whitespace and line breaks.
As a result, chunks always alternate: between any two doc chunks
there is a code chunk, and vice versa.
So a change in chunk type marks the start of a new chunk.
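The chunking rule above can be sketched as follows. This is a simplified illustration, not Doci's actual implementation: it assumes line comments only (no block comments), uses 0-based line indices as locations, lets blank lines join whichever chunk is currently open, and keeps the indentation of code lines. All of these choices are assumptions made for the example.

```python
def extract_chunks(lines, comment="#"):
    """Split lines into alternating doc/code chunks (illustrative sketch)."""
    chunks, types, locations = [], [], []
    for index, line in enumerate(lines):
        content = line.strip()
        if not content and types:
            # Assumption: a blank line stays in the chunk that is already open.
            kind, text = types[-1], ""
        elif content.startswith(comment):
            kind, text = "doc", content[len(comment):].strip()
        else:
            # Assumption: code keeps its indentation, only the line break goes.
            kind, text = "code", line.rstrip()
        # A change in chunk type marks the start of a new chunk.
        if not types or types[-1] != kind:
            chunks.append([])
            types.append(kind)
            locations.append(index)
        chunks[-1].append(text)
    return chunks, types, locations


source = [
    "# Add two numbers.",
    "# Returns the sum.",
    "def add(a, b):",
    "    return a + b",
    "",
    "# Done.",
    "print(add(1, 2))",
]
chunks, types, locations = extract_chunks(source)
```

Because a new chunk only opens when the type flips, the `types` list can never contain the same value twice in a row.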
parse_block_comment takes in a list of contents and parses out block comments.
is_block_comment is 0 for non-block-comment lines,
1 for a comment start line, and -1 for a comment end line.
TODO: handle single-line block comments.
We don't handle nested block comments, because most languages don't support them.
If we did, we would need to increment seen_quote here.
Using max, the running count never drops below zero, so this code works
even if there are more end quotes than start quotes.
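One way the 1 / 0 / -1 markers and the max trick could fit together is sketched below. The function names `mark_block_comments` and `block_depth` are hypothetical, and the sketch assumes each delimiter sits alone on its own line (single-line block comments are still a TODO above). It is a guess at the idea, not a copy of Doci's code.

```python
from itertools import accumulate


def mark_block_comments(contents, start, end):
    """Return 1 for a comment start line, -1 for an end line, 0 otherwise."""
    markers = []
    seen_quote = False  # a flag, not a counter: nesting is not supported
    for content in contents:
        if not seen_quote and content == start:
            markers.append(1)
            seen_quote = True
        elif content == end:
            markers.append(-1)
            seen_quote = False
        else:
            markers.append(0)
    return markers


def block_depth(markers):
    """Running sum of the markers, clamped at zero with max."""
    depths = accumulate(markers, lambda depth, m: max(depth + m, 0), initial=0)
    return list(depths)[1:]  # drop the initial 0


# A stray end delimiter on the first line does not push the depth negative.
contents = ["*/", "x = 1", "/*", "some prose", "*/", "y = 2"]
markers = mark_block_comments(contents, "/*", "*/")
depths = block_depth(markers)
```

A line belongs to a block comment when its depth is positive or its marker is non-zero, so the closing delimiter line is still counted as part of the comment.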
HTML is trickier, though, because it needs styling and formatting.
We use Pygments for code highlighting and markdown-it for Markdown formatting.
For Markdown, we use the CommonMark spec with tables and footnotes enabled.
highlight turns a list of code chunks into HTML using Pygments.
The code chunks are merged and passed to Pygments in one go,
because Pygments can't highlight partial code.
Magic divider text and HTML adapted from Pycco.
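The merge-then-split idea can be illustrated without Pygments. In the sketch below, `toy_highlight` is a stand-in highlighter invented for this example (it only escapes HTML and wraps `#` comment lines in a span), and `DIVIDER_TEXT`, `DIVIDER_HTML` and `highlight_chunks` are hypothetical names. The real divider strings and the real Pygments output will differ.

```python
import html
import re

# Assumption: the divider is a comment, so the highlighter renders it
# as one recognisable comment span that we can split on afterwards.
DIVIDER_TEXT = "\n\n# DIVIDER\n\n"
DIVIDER_HTML = re.compile(r'\n*<span class="c">#\s*DIVIDER</span>\n*')


def toy_highlight(code):
    """Stand-in for a real highlighter: escape HTML, mark comment lines."""
    rendered = []
    for line in code.split("\n"):
        escaped = html.escape(line)
        if line.lstrip().startswith("#"):
            escaped = f'<span class="c">{escaped}</span>'
        rendered.append(escaped)
    return "\n".join(rendered)


def highlight_chunks(code_chunks, highlighter=toy_highlight):
    """Highlight all chunks in one pass, then cut the HTML back apart."""
    merged = DIVIDER_TEXT.join(code_chunks)
    return DIVIDER_HTML.split(highlighter(merged))


parts = highlight_chunks(["def f(x):", "    return x < 1"])
```

Highlighting the merged text lets the highlighter see the second chunk in the context of the first, and the divider pattern then recovers one HTML fragment per original chunk.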
parser.add_argument(
    "-m", "--markdown",
    type=argparse.FileType("w", encoding="utf8"),
    nargs="?", const=False,
    help="generate markdown output",
)
parser.add_argument(
    "-H", "--html",
    type=argparse.FileType("w", encoding="utf8"),
    nargs="?", const=False,
    help="generate HTML output",
)
parser.add_argument(
    "-t", "--template",
    type=argparse.FileType("r", encoding="utf8"),
    nargs="?", const=False,
    help="HTML template file",
)
parser.add_argument(
    "-l", "--language",
    type=str, nargs=1,
    help="language name (for syntax highlighting)",
)
parser.add_argument(
    "-c", "--comment",
    type=str, nargs=1, default=["#"],
    help="comment start symbol (default: '#')",
)
parser.add_argument(
    "-b", "--block",
    type=str, nargs=2,
    # Some languages don't have block comments, so we default to None.
    help="block comment start and end symbols (default: None)",
    default=None,
)
parser.add_argument(
    "file",
    type=argparse.FileType("r", encoding="utf8"),
    nargs=1,
    help="source file to process",
)
return parser.parse_args()
def running(f, iterable, init=None):
running calculates a running value from an iterable using a binary function f.
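A possible body for such a function is sketched below. The signature matches the one above, but the behaviour is an assumption: here the first item seeds the accumulator when `init` is None, and `init` itself is not included in the output, so the result always has as many entries as the input.

```python
from operator import add


def running(f, iterable, init=None):
    """Return the list of running values of f over iterable (sketch)."""
    results = []
    items = iter(iterable)
    if init is None:
        # Assumption: without init, the first item seeds the accumulator.
        try:
            acc = next(items)
        except StopIteration:
            return results
        results.append(acc)
    else:
        acc = init
    for item in items:
        acc = f(acc, item)
        results.append(acc)
    return results


totals = running(add, [1, 2, 3])
clamped = running(lambda depth, m: max(depth + m, 0), [-1, 1, 0, -1], init=0)
```

The second call shows the kind of clamped running count that a block comment parser could build on: the sum is never allowed to fall below zero.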
dedent dedents a list of strings, removing the minimum leading whitespace
from all visible lines. Invisible lines (empty or whitespace-only)
are left unchanged.
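The behaviour described above might look like the following sketch. It is an illustration rather than Doci's code, and it assumes indentation is measured in characters, so a tab counts as one character just like a space.

```python
def dedent(lines):
    """Remove the common leading whitespace from all visible lines."""
    visible = [line for line in lines if line.strip()]
    if not visible:
        return list(lines)
    # Smallest indentation among lines that contain visible characters.
    margin = min(len(line) - len(line.lstrip()) for line in visible)
    # Invisible lines (empty or whitespace-only) pass through untouched.
    return [line[margin:] if line.strip() else line for line in lines]


block = ["    if x:", "        y = 1", "", "  ", "    z = 2"]
flat = dedent(block)
```

Skipping invisible lines when computing the margin matters: otherwise a single empty line would force the margin to zero and nothing would be dedented.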