llm_poker: A minimal Hold'em environment that manages multiple LLM-based players


A minimal Texas Hold’em environment that seats multiple LLM-based players (via the llm library) and manages everything from dealing hole cards to forced blinds, betting rounds, and a straightforward showdown.

Core features:

  • Blinds: Each hand forces a small blind and a big blind, ensuring there’s money in the pot.
  • Betting: We query each LLM once per betting round, requesting an action in strict JSON form (fold, call, or raise).
  • Local showdown logic: The environment determines the best 5-card hand from each player’s 7 cards and awards the pot.
  • Pydantic-based JSON validation: The LLM responses are parsed and validated. If invalid, we retry.
  • Optional CLI: The llm_poker command can run multiple rounds using the specified LLMs.
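
The showdown step described above (best 5-card hand out of each player's 7 cards) can be sketched by brute force over all 21 five-card combinations. This is only an illustration under assumptions, not the package's actual code: the names `rank_five` and `best_hand` and the two-character card strings such as "As" or "Td" are hypothetical.

```python
from collections import Counter
from itertools import combinations

RANKS = "23456789TJQKA"  # assumed card encoding: rank character, then suit character


def rank_five(cards):
    """Return a tuple that compares higher for a stronger 5-card hand."""
    values = [RANKS.index(card[0]) + 2 for card in cards]
    counts = Counter(values)
    # Ranks ordered by multiplicity first, then by value, so pairs and trips lead.
    ordered = sorted(counts, key=lambda v: (counts[v], v), reverse=True)
    shape = sorted(counts.values(), reverse=True)
    flush = len({card[1] for card in cards}) == 1
    unique = sorted(set(values), reverse=True)
    straight = len(unique) == 5 and unique[0] - unique[4] == 4
    if unique == [14, 5, 4, 3, 2]:  # wheel: A-2-3-4-5 plays as a 5-high straight
        straight, ordered = True, [5, 4, 3, 2, 1]
    if straight and flush:
        category = 8
    elif shape == [4, 1]:
        category = 7
    elif shape == [3, 2]:
        category = 6
    elif flush:
        category = 5
    elif straight:
        category = 4
    elif shape == [3, 1, 1]:
        category = 3
    elif shape == [2, 2, 1]:
        category = 2
    elif shape == [2, 1, 1, 1]:
        category = 1
    else:
        category = 0
    return (category, tuple(ordered))


def best_hand(seven_cards):
    """Pick the strongest 5-card hand among the 21 possible combinations."""
    return max(rank_five(combo) for combo in combinations(seven_cards, 5))


board = ["Kc", "Kd", "7s", "2h", "3c"]
pocket_aces = best_hand(["Ah", "Ad"] + board)   # two pair, aces and kings
trip_kings = best_hand(["Ks", "Qd"] + board)    # three kings
print(trip_kings > pocket_aces)
```

Because the result is a plain tuple, awarding the pot reduces to comparing tuples; equal tuples would mean a split pot.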

  1. Install the package:

You must also configure the llm library with API keys for whichever models you plan to use (e.g., OpenAI's gpt-4o or Anthropic's Claude models). For example:


  1. Just run run.py

Deals up to 5 rounds between multiple players: gpt-4o, claude-3-5-haiku-latest, claude-3-5-sonnet-latest, deepseek-reasoner. Uses elimination_count=0 so the game does not stop early (unless someone busts). The minimum raise is 500 chips. Logs each hand’s actions, culminating in a final standings table.

  1. Using the CLI: if you installed with the included console script, you can run:
llm_poker --models "chatgpt-4o-latest gemini-2.0-flash-thinking-exp-01-21 claude-3-5-sonnet-latest claude-3-5-haiku-latest deepseek-reasoner" --elimination-count 0 --stack 10000

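This seats the five listed models with 10,000 chips each. Because --rounds is not given, it deals the default of 3 hands, and --elimination-count 0 keeps the game from stopping early as players are eliminated.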

Once installed, you have access to:

  • --models/-m: Multiple model names or aliases recognized by llm (defaults to ["gpt-4o"]).
  • --rounds/-r: How many hands to deal (default 3).
  • --elimination-count/-e: Stop once only this many players remain (default 1).
  • --stack/-s: Starting chip stack (default 10000).
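
To make the flags and defaults above concrete, here is a minimal argparse sketch. It is an assumption about how such a parser could look, not the package's real implementation (which may use a different CLI library); `parse_cli` is a hypothetical name. It accepts the models either as separate arguments or as one quoted, space-separated string.

```python
import argparse


def parse_cli(argv=None):
    """Parse the documented flags; a sketch, not the package's own parser."""
    parser = argparse.ArgumentParser(prog="llm_poker")
    parser.add_argument("-m", "--models", nargs="+", default=["gpt-4o"],
                        help="model names or aliases recognized by llm")
    parser.add_argument("-r", "--rounds", type=int, default=3,
                        help="how many hands to deal")
    parser.add_argument("-e", "--elimination-count", type=int, default=1,
                        help="stop once only this many players remain")
    parser.add_argument("-s", "--stack", type=int, default=10000,
                        help="starting chip stack")
    args = parser.parse_args(argv)
    # Allow both `-m a b c` and `-m "a b c"` by splitting every chunk on whitespace.
    args.models = [name for chunk in args.models for name in chunk.split()]
    return args


args = parse_cli(["--models", "gpt-4o claude-3-5-haiku-latest", "-r", "5"])
print(args.models, args.rounds, args.elimination_count, args.stack)
```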


  • No side pots: Currently, if a player goes all-in, the environment doesn’t handle side pots.
  • Manual environment checks: If the LLM returns “check” while facing a bet, the code interprets it as invalid and re-prompts.
  • Fictitious ‘expert-level poker AI’: The LLM’s strategic brilliance is not guaranteed. This is more a demonstration environment than a truly advanced solver.
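
The validate-and-re-prompt behaviour mentioned above can be sketched as follows. This version uses only the standard library as a stand-in for the Pydantic validation the package actually uses. The names `Action`, `parse_action`, and `ask_with_retry`, the `amount` field, the rule that "check" is acceptable only when nothing is owed, and the fallback to folding after repeated failures are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    action: str
    amount: int = 0


def parse_action(raw: str, to_call: int, min_raise: int) -> Action:
    """Validate one JSON reply; raise ValueError describing the first problem."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    name = data.get("action")
    if not isinstance(name, str) or name not in {"fold", "check", "call", "raise"}:
        raise ValueError(f"unknown action: {name!r}")
    if name == "check" and to_call > 0:
        raise ValueError("cannot check while facing a bet")
    if name != "raise":
        return Action(name)
    amount = data.get("amount")
    if isinstance(amount, bool) or not isinstance(amount, int) or amount < min_raise:
        raise ValueError(f"raise amount must be an integer >= {min_raise}")
    return Action(name, amount)


def ask_with_retry(ask: Callable[[str], str], prompt: str, to_call: int,
                   min_raise: int, max_attempts: int = 3) -> Action:
    """Query the model, feeding each validation error back into the next prompt."""
    for _ in range(max_attempts):
        try:
            return parse_action(ask(prompt), to_call, min_raise)
        except ValueError as err:
            prompt = f"{prompt}\nYour last reply was invalid ({err}). Reply with JSON only."
    return Action("fold")  # assumed fallback once the attempts are used up


# A scripted stand-in for an LLM: two bad replies, then a valid one.
replies = iter(["check", '{"action": "check"}', '{"action": "call"}'])
result = ask_with_retry(lambda prompt: next(replies), "Your move.", to_call=200, min_raise=500)
print(result)
```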