Verifiable RL rewards for LLMs that actually make sense.
- Provides verifiable, interpretable reward functions for finetuning LLMs, focused on code for now, with other domains coming soon.
- Keeps all rewards transparent and hackable.
- No secret sauce. No LLM-as-a-judge. No magical "alignment". Just actual auditable rules.
If you care about trusting your rewards and being able to debug them, start with Zeno.
```bash
git clone https://github.com/think-a-tron/zeno.git
cd zeno
uv add ./zeno
```
For now, Zeno ships with a set of plug-and-play reward functions for Python code completions. All rewards are stateless and verifiable, and require no extra config or setup.
| Reward | What it measures | Range |
| --- | --- | --- |
| reward_docstrings | Proportion of functions/classes with docstrings | 0.0 to 1.0 |
| reward_lint | Fewer lint errors (via ruff), normalized per line | 0.0 to 1.0 |
| reward_direct_recursion | Presence/absence of direct recursion (configurable) | 1.0 / 0.0 / -1.0 |
| reward_list_comprehension | Use (or avoidance) of list comprehensions | 1.0 / 0.0 / -1.0 |
| reward_type_hints | Fraction of args/returns with type hints | 0.0 to 1.0 |
| reward_exception_handling | Has at least one try/except (reward) or none (penalize) | 1.0 / -1.0 |
| reward_functional | Prefers functional (no class) over OOP style | 1.0 / -1.0 |
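Because every reward is a plain Python function, you can sanity-check it on a handful of strings before starting a training run. The exact call signature isn't documented here, so the sketch below assumes the usual TRL reward-function convention (a `completions` keyword argument in, one float per completion out); check the function source if it differs.

```python
from zeno.code.python import reward_lint, reward_type_hints

# Two toy completions: the first is clean, the second should score worse
# on both lint quality and type-hint coverage.
completions = [
    "def add(a: int, b: int) -> int:\n    return a + b\n",
    "def add(a,b):\n  x=a+b\n  return x\n",
]

# Assumes the TRL-style signature: a list of completions in, a list of floats out.
print(reward_lint(completions=completions))
print(reward_type_hints(completions=completions))
```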
You can plug them directly into trl as reward functions.
```python
from trl import GRPOTrainer
from zeno.code.python import reward_lint

trainer = GRPOTrainer(
    model="Qwen/Qwen3-0.6B",
    reward_funcs=[reward_lint],
)
```
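Since `reward_funcs` takes a list, you can stack several Zeno rewards in a single run. The sketch below is a variant of the example above with assumptions: the output directory is a placeholder, and `reward_weights` is only available in recent TRL versions (drop it and all rewards count equally).

```python
from trl import GRPOConfig, GRPOTrainer
from zeno.code.python import reward_docstrings, reward_lint, reward_type_hints

# Weight lint quality above the two style rewards; omit reward_weights
# if your TRL version does not support it.
config = GRPOConfig(output_dir="qwen3-zeno-grpo", reward_weights=[1.0, 0.5, 0.5])

trainer = GRPOTrainer(
    model="Qwen/Qwen3-0.6B",
    reward_funcs=[reward_lint, reward_docstrings, reward_type_hints],
    args=config,
)
```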
Want to add more rewards? PRs welcome, but keep them verifiable and explainable. No black-box models. (A sketch of what a new reward could look like follows the list below.)
- Stepwise rewards for math reasoning.
- Multi-turn rewards for tool use.
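If you want to contribute a reward, the shape to aim for is a small, auditable rule. The function below is purely illustrative (it is not part of Zeno) and assumes the TRL convention of taking `completions` and returning one float per completion.

```python
import ast


def reward_no_bare_except(completions, **kwargs):
    """Illustrative only: 1.0 if a completion has no bare `except:` clauses,
    -1.0 if it has any, 0.0 if the code does not parse."""
    rewards = []
    for completion in completions:
        try:
            tree = ast.parse(completion)
        except SyntaxError:
            rewards.append(0.0)
            continue
        has_bare_except = any(
            isinstance(node, ast.ExceptHandler) and node.type is None
            for node in ast.walk(tree)
        )
        rewards.append(-1.0 if has_bare_except else 1.0)
    return rewards
```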