Show HN: CommonForms – open models to auto-detect PDF form fields


🪄 Automatically convert a PDF into a fillable form.

This repo contains three things:

the pip-installable commonforms package, which has a CLI and API for converting PDFs into fillable forms
the FFDNet-S and FFDNet-L models from the paper CommonForms: A Large, Diverse Dataset for Form Field Detection
the preprocessing code for the CommonForms dataset, which is hosted on HuggingFace: https://huggingface.co/datasets/jbarrow/CommonForms

CommonForms can be installed with either uv or pip; use whichever package manager you prefer:

uv pip install commonforms

Once it's installed, you should be able to run the CLI command on almost any PDF.

The simplest usage will run inference on your CPU using the default suggested settings:

commonforms <input.pdf> <output.pdf>

Argument	Type	Default	Description
input	Path	Required	Path to the input PDF file
output	Path	Required	Path to save the output PDF file
--model	str	FFDNet-L	Model name (FFDNet-L/FFDNet-S) or path to custom .pt file
--keep-existing-fields	flag	False	Keep existing form fields in the PDF
--use-signature-fields	flag	False	Use signature fields instead of text fields for detected signatures
--device	str	cpu	Device for inference (e.g., cpu, cuda, 0)
--image-size	int	1600	Image size for inference
--confidence	float	0.3	Confidence threshold for detection
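As a sketch of how the arguments above combine, the following Python helper assembles a `commonforms` command line that can be handed to `subprocess.run`. The names `build_command` and `run_commonforms` are hypothetical and not part of the package; only the flags listed in the table are used, and placing the options after the two paths is an assumption about what the CLI accepts.

```python
import subprocess
from typing import List


def build_command(
    input_pdf: str,
    output_pdf: str,
    model: str = "FFDNet-L",
    device: str = "cpu",
    image_size: int = 1600,
    confidence: float = 0.3,
    keep_existing_fields: bool = False,
    use_signature_fields: bool = False,
) -> List[str]:
    """Assemble the argument list for the `commonforms` CLI (hypothetical helper)."""
    cmd = [
        "commonforms",
        input_pdf,
        output_pdf,
        "--model", model,
        "--device", device,
        "--image-size", str(image_size),
        "--confidence", str(confidence),
    ]
    # Boolean options are plain flags: they are only added when enabled.
    if keep_existing_fields:
        cmd.append("--keep-existing-fields")
    if use_signature_fields:
        cmd.append("--use-signature-fields")
    return cmd


def run_commonforms(*args, **kwargs) -> None:
    """Run the CLI; raises CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_command(*args, **kwargs), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so this works without the package.
    print(" ".join(build_command("input.pdf", "output.pdf", model="FFDNet-S", confidence=0.5)))
```

Keeping command construction separate from execution makes it easy to inspect or log the exact command before running it on a batch of files.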

In addition to the CLI, you can use the API from Python:

from commonforms import prepare_form

prepare_form(
    "path/to/input.pdf",
    "path/to/output.pdf"
)

All of the above arguments are keyword arguments to the prepare_form function.
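A minimal sketch of passing those options from Python. It assumes each CLI flag maps to a keyword argument with the leading dashes dropped and hyphens replaced by underscores (for example, `--image-size` becoming `image_size`); the exact keyword names are not spelled out here, so check the `prepare_form` signature before relying on this. `flags_to_kwargs` and `convert` are hypothetical helpers, not part of the package.

```python
from typing import Any, Dict


def flags_to_kwargs(flags: Dict[str, Any]) -> Dict[str, Any]:
    """Turn CLI-style names ('--image-size') into keyword names ('image_size').

    The underscore naming is an assumption about prepare_form's signature.
    """
    return {name.lstrip("-").replace("-", "_"): value for name, value in flags.items()}


def convert(input_pdf: str, output_pdf: str, flags: Dict[str, Any]) -> None:
    """Call prepare_form with options given under their CLI names."""
    # Imported lazily so flags_to_kwargs can be used without the package installed.
    from commonforms import prepare_form

    prepare_form(input_pdf, output_pdf, **flags_to_kwargs(flags))


if __name__ == "__main__":
    options = {"--model": "FFDNet-S", "--confidence": 0.5, "--use-signature-fields": True}
    # Show the keyword arguments that would be passed to prepare_form.
    print(flags_to_kwargs(options))
```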

🚧 Code for dataset prep exists in the dataset folder.

If you use the tool, models, or code in an academic paper, please cite the CommonForms paper:

@misc{barrow2025commonforms,
  title         = {CommonForms: A Large, Diverse Dataset for Form Field Detection},
  author        = {Barrow, Joe},
  year          = {2025},
  eprint        = {2509.16506},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  doi           = {10.48550/arXiv.2509.16506},
  url           = {https://arxiv.org/abs/2509.16506}
}

If you use it in a non-academic setting, please reach out to the author (joseph.d.barrow [at] gmail.com)! I love to hear when people are using my work!
