An experimental platform for AI models to conduct live trading and competition in real markets.
"Let intelligent agents survive in uncertainty and ultimately learn to profit."
Alpha Arena is an experimental platform that uses real markets as a testing ground for AI trading agents.
Each model (such as GPT-5, Claude, DeepSeek, Gemini, etc.) receives the same real-time market data and initial capital, then makes independent decisions and executes trades. The platform compares returns, drawdowns, and risk control across models in real time.
Core Goals (functional, comparable, reproducible):
- At the same moment, with the same data and the same rules, have 2-6 LLMs return trading decisions in one unified structured format
- Maintain an independent capital account for each model, match its orders, and compute net value curves and core indicators in real time
- Full traceability: each decision can be traced back to its Prompt, the market snapshot used as context, and the execution result
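As a rough illustration of what a "unified structured trading decision" could look like, here is a minimal sketch. The field names (`action`, `symbol`, `size_pct`), the `Decision` class, and the choice to fall back to HOLD on any violation are all assumptions made for this example; they are hypothetical and not the project's actual output schema.

```python
import json
from dataclasses import dataclass
from typing import Optional

ALLOWED_ACTIONS = {"BUY", "SELL", "HOLD"}
ALLOWED_SYMBOLS = {"BTCUSDT", "ETHUSDT"}  # the two MVP spot assets


@dataclass(frozen=True)
class Decision:
    """Hypothetical decision record; field names are assumptions."""
    model: str
    action: str             # BUY, SELL, or HOLD
    symbol: Optional[str]   # None when holding
    size_pct: float         # fraction of net value, 0.0 to 1.0


def parse_decision(model: str, raw: str) -> Decision:
    """Parse a raw LLM reply; any violation falls back to HOLD (assumed policy)."""
    hold = Decision(model, "HOLD", None, 0.0)
    try:
        data = json.loads(raw)
        action = str(data["action"]).upper()
        symbol = str(data["symbol"]).upper()
        size_pct = float(data["size_pct"])
    except (ValueError, KeyError, TypeError):
        # Not JSON, not an object, or a required field is missing/mistyped.
        return hold
    if action not in ALLOWED_ACTIONS or symbol not in ALLOWED_SYMBOLS:
        return hold
    if not 0.0 <= size_pct <= 1.0:
        return hold
    return Decision(model, action, symbol, size_pct)
```

Because every model's reply passes through the same parser, a malformed answer from one model cannot affect the others, and each rejection can be counted as a JSON violation.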
MVP Boundaries (not implemented yet):
- No leverage/futures (start with spot BTCUSDT/ETHUSDT two assets)
- No short selling (long only or flat)
- No complex order types (start with market orders + fixed slippage assumptions)
- Fixed decision cycle only (e.g., every 5 minutes), driven by a single shared clock
This project aims to explore:
- Whether large language models can form sustainable trading logic in real financial markets
- How models differ in risk-taking, reaction speed, and decision stability
- How to continuously evolve AI agents through reinforcement learning, strategy distillation, and other means
| Module | Responsibility |
|---|---|
| Orchestrator | Trigger every 5 minutes → Pull recent 60 minutes K-line + current order book snapshot for two coins → Generate unified Prompt → Parallel request each LLM → Validate response schema → Send to risk control/execution layer |
| LLM Gateway | Adapt unified request/retry/rate limiting/timeout for different LLM providers (e.g., 8s timeout, timeout = default Hold) |
| Exchange Adapter | Start with paper-trading (simulated matching + fixed slippage), can switch to Bitget spot live trading with one click |
| Portfolio | Separate asset ledger for each model (cash, positions, floating P&L), unified fees (e.g., 0.05%), unified slippage (e.g., 5-10 bp) |
| Dashboard | Net value curve, daily PnL, position table, execution table, model latency, error rate |
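The Orchestrator and LLM Gateway rows describe parallel requests with an 8s timeout that defaults to HOLD. A minimal sketch of that pattern with `asyncio` might look like the following. `ask_model` is a hypothetical stand-in for a real provider call (it only sleeps), and the dictionary shape of a decision is an assumption.

```python
import asyncio

LLM_TIMEOUT_S = 8.0  # 8s timeout, then default HOLD


async def ask_model(name: str, prompt: str, delay: float) -> dict:
    """Hypothetical stand-in for a provider call; sleeps, then answers."""
    await asyncio.sleep(delay)
    return {"model": name, "action": "BUY"}


async def ask_with_fallback(name: str, prompt: str, delay: float,
                            timeout: float = LLM_TIMEOUT_S) -> dict:
    """Return the model's decision, or HOLD if it misses the deadline."""
    try:
        return await asyncio.wait_for(ask_model(name, prompt, delay), timeout)
    except asyncio.TimeoutError:
        return {"model": name, "action": "HOLD", "reason": "timeout"}


async def run_cycle(prompt: str, delays: dict,
                    timeout: float = LLM_TIMEOUT_S) -> list:
    """Query every model concurrently; results keep the input order."""
    tasks = [ask_with_fallback(n, prompt, d, timeout) for n, d in delays.items()]
    return await asyncio.gather(*tasks)
```

Wrapping each call separately means a slow model only costs itself a HOLD; the other models' decisions for that cycle are unaffected.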
- Backend: Python 3.11 / FastAPI / pandas / asyncio
- Database: PostgreSQL + Redis
- Frontend: Streamlit (or Next.js visualization panel)
- LLM Interface: OpenAI / DeepSeek / Anthropic / Google / Qwen
- Exchange: Bitget / OKX / CCXT (paper-trading priority)
System Prompt (summary version):
Output Schema (strict validation):
- Initial Capital: USDT 10,000 per model
- Single Order: Not exceeding 20% of net value
- Position Limit: Maximum 1 asset held simultaneously
- Stop Loss/Take Profit: Provided by the model; as a backstop, risk control forces liquidation at -5%
- Trading Fees: 0.05%
- Slippage: 10bp (paper-trading)
- Deduplication: At most one new decision per model within each 5-minute cycle
- Timeout Handling: LLM timeout 8s → default HOLD
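The rules above can be sketched as a few small functions. This is an illustration under stated assumptions, not the project's implementation: the function names are hypothetical, the fee is assumed to be deducted from the order notional, and the -5% threshold is assumed to be measured as the position's return from its entry price.

```python
from typing import Tuple

FEE_RATE = 0.0005           # 0.05% trading fee
SLIPPAGE_BP = 10            # 10 bp paper-trading slippage
MAX_ORDER_FRACTION = 0.20   # single order capped at 20% of net value
FORCED_EXIT_RETURN = -0.05  # backstop liquidation threshold


def cap_order_notional(requested: float, net_value: float) -> float:
    """Clamp a requested order size (in USDT) to 20% of net value."""
    return max(0.0, min(requested, MAX_ORDER_FRACTION * net_value))


def paper_buy(notional: float, mid_price: float) -> Tuple[float, float]:
    """Simulate a market buy; returns (quantity, fee).

    The buyer pays a price worsened by the fixed slippage, and the fee
    is taken out of the notional before buying (an assumption).
    """
    fill_price = mid_price * (1 + SLIPPAGE_BP / 10_000)
    fee = notional * FEE_RATE
    quantity = (notional - fee) / fill_price
    return quantity, fee


def must_force_exit(entry_price: float, last_price: float) -> bool:
    """True when an open long position has lost 5% or more from entry."""
    return (last_price / entry_price - 1) <= FORCED_EXIT_RETURN
```

For example, a model with 10,000 USDT of net value that asks for a 5,000 USDT order would be capped to 2,000 USDT before the simulated fill.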
Real-time Metrics:
- Net value, daily PnL, positions, exposure ratio
- Last inference latency/timeout rate
Period Statistics:
- Cumulative returns, maximum drawdown (MDD)
- Calmar/Sharpe ratio
- Win rate, average profit/loss ratio, number of trades
- Average holding period, slippage/fee ratio
Compliance Metrics:
- Counts of over-limit orders (exceeding authorized size), JSON violations, timeouts, and refusals
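Some of the period statistics listed above can be computed directly from a net value series. The sketch below covers cumulative return, maximum drawdown, and Sharpe ratio. It assumes a zero risk-free rate and that the caller supplies the number of periods per year for annualization; the function names are illustrative.

```python
import math
from typing import Sequence


def cumulative_return(nav: Sequence[float]) -> float:
    """Total return from the first to the last net value."""
    return nav[-1] / nav[0] - 1


def max_drawdown(nav: Sequence[float]) -> float:
    """Largest peak-to-trough decline, as a positive fraction."""
    peak = nav[0]
    mdd = 0.0
    for value in nav:
        peak = max(peak, value)
        mdd = max(mdd, (peak - value) / peak)
    return mdd


def sharpe_ratio(nav: Sequence[float], periods_per_year: float) -> float:
    """Annualized Sharpe ratio with a zero risk-free rate (assumption)."""
    rets = [b / a - 1 for a, b in zip(nav, nav[1:])]
    if len(rets) < 2:
        return 0.0
    mean = sum(rets) / len(rets)
    var = sum((r - mean) ** 2 for r in rets) / (len(rets) - 1)
    std = math.sqrt(var)
    return mean / std * math.sqrt(periods_per_year) if std > 0 else 0.0
```

With a 5-minute decision cycle on a market that trades around the clock, `periods_per_year` would be 12 * 24 * 365 if net value is sampled once per cycle.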
| Day | Tasks |
|---|---|
| Day 1 | Initialize repository and Docker; Create tables (trades, positions, nav, prompts, decisions, metrics); Connect exchange (paper) + market data fetching |
| Day 2 | Complete Orchestrator basic loop (5m timer, market data→Prompt→LLM→schema→execution→accounting); Connect 1 LLM for end-to-end |
| Day 3 | Connect 2-3 LLMs; Parallel inference, timeout fallback, JSON validation; Complete risk control backup (position limits, stop loss/take profit, forced liquidation) |
| Day 4 | Dashboard (Streamlit) + metric calculation (real-time + intraday); Audit traceability view (Prompt/JSON/market snapshot) |
| Day 5-7 | Stability and backtesting; Optional switch to Bitget spot live trading (minimal capital verification of execution path) |
| Parameter | Value |
|---|---|
| Decision Cycle | 5m |
| Assets | BTCUSDT, ETHUSDT (spot) |
| Initial Capital | $10,000 / model |
| Max Single Order | 20% NAV |
| Trading Fees | 0.05% (paper) |
| Slippage | 10 bp (paper) |
| Forced Liquidation Threshold | -5% |
| LLM Timeout | 8s; timeout→HOLD |
| Concurrency | Parallel by model, serial database writes |
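One way to get "parallel by model, serial database writes" is a single writer task fed by a queue. The sketch below assumes this design for illustration only; `model_worker`, `db_writer`, and `decide_and_persist` are hypothetical names, and appending to a list stands in for a database insert.

```python
import asyncio


async def model_worker(name: str, queue: asyncio.Queue) -> None:
    """Hypothetical per-model task: decide, then enqueue the record."""
    await asyncio.sleep(0)  # stands in for the LLM round trip
    await queue.put({"model": name, "action": "HOLD"})


async def db_writer(queue: asyncio.Queue, sink: list) -> None:
    """Single consumer: records are written one at a time, in arrival order."""
    while True:
        record = await queue.get()
        if record is None:   # sentinel: no more records this cycle
            break
        sink.append(record)  # stands in for a database INSERT


async def decide_and_persist(models: list, sink: list) -> None:
    """Run all models concurrently while one task performs every write."""
    queue: asyncio.Queue = asyncio.Queue()
    writer = asyncio.create_task(db_writer(queue, sink))
    await asyncio.gather(*(model_worker(m, queue) for m in models))
    await queue.put(None)    # all workers are done; stop the writer
    await writer
```

Funnelling writes through one consumer avoids interleaved transactions across models without making the slower LLM calls wait on each other.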
- Read-only API key, spot only, long only (no shorting)
- Isolated Capital: Independent sub-account / sub-ledger per model
- Kill-Switch: Immediate shutdown when net value drawdown exceeds 10% (close all + disable new orders)
- Rate Limiting: Rate limits and circuit breakers on both LLM and exchange calls
- Logging: Audit logs to database + local rolling file backup
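The kill-switch rule can be sketched as a small latch. This example assumes that drawdown is measured from the running peak of net value, that one instance exists per ledger, and that a tripped switch stays tripped until an operator resets it; none of these details are specified above, and the `KillSwitch` class is hypothetical.

```python
class KillSwitch:
    """Latches once drawdown from the NAV peak exceeds the limit."""

    def __init__(self, initial_nav: float, max_drawdown: float = 0.10):
        self.peak = initial_nav
        self.max_drawdown = max_drawdown
        self.tripped = False

    def update(self, nav: float) -> bool:
        """Feed the latest net value; returns True while trading is halted."""
        self.peak = max(self.peak, nav)
        drawdown = (self.peak - nav) / self.peak
        if drawdown > self.max_drawdown:
            # Caller should close all positions and reject new orders.
            self.tripped = True
        return self.tripped
```

The latch matters: if net value recovers after the switch trips, trading stays disabled rather than silently resuming.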
Simplified MVP: Real price fetching + AI decision comparison
- Detailed Version Description: VERSION.md
- Change Log: CHANGELOG.md
- Current Features: Price fetching for 5 tokens + OpenAI vs Claude decision comparison
- Introduce short selling/leverage, more order types (limit + iceberg)
- Multi-timeframe (1m+5m+1h) + inductive multi-round reasoning
- Strategy Distillation: Extract rules/features from LLM decisions, feed to lightweight Policy
- Live Risk Control: Validation of exchange responses, tiered risk controls, automatic downgrade on OMS anomalies
- Fairness Tools: Latency alignment, cost alignment, data drift alerts