Never run out of tokens again! Monitor your Cerebras AI usage in real time with rate limit tracking, usage predictions, and warnings before you hit your limits.
1. Install (choose one):
2. Get your session token:
- Go to cloud.cerebras.ai and sign in
- Press F12 → Application → Cookies → copy the authjs.session-token value
- Set it: export CEREBRAS_SESSION_TOKEN="your-token-here"
- Or save it permanently in ~/.config/cerebras-monitor/settings.yaml (Windows: %APPDATA%\cerebras-monitor\settings.yaml)
3. Start monitoring:
That's it!
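The token lookup described in step 2 can be sketched in Go, the language the tool is written in. This is a minimal illustration under stated assumptions, not the tool's actual code: `settingsPath` and `tokenSource` are hypothetical helper names, and the rule that the environment variable wins over the settings file is an assumption. Only the variable name and the two file locations come from the steps above.

```go
package main

import (
	"fmt"
	"os"
	"path"
	"runtime"
)

// settingsPath returns the documented settings.yaml location for the given OS.
// Hypothetical helper; the tool's real lookup logic may differ.
func settingsPath(goos, home, appData string) string {
	if goos == "windows" {
		return appData + `\cerebras-monitor\settings.yaml`
	}
	return path.Join(home, ".config", "cerebras-monitor", "settings.yaml")
}

// tokenSource returns the token from CEREBRAS_SESSION_TOKEN when it is set.
// Otherwise it returns the settings file that would have to be read instead.
// Assumption: the environment variable takes precedence over the file.
func tokenSource(getenv func(string) string, goos string) (token, fallback string) {
	if t := getenv("CEREBRAS_SESSION_TOKEN"); t != "" {
		return t, ""
	}
	return "", settingsPath(goos, getenv("HOME"), getenv("APPDATA"))
}

func main() {
	token, fallback := tokenSource(os.Getenv, runtime.GOOS)
	if token != "" {
		fmt.Println("using session token from environment")
		return
	}
	fmt.Println("no token in environment; expected settings file:", fallback)
}
```

Passing `getenv` and the OS name as parameters keeps the lookup easy to exercise without touching the real environment.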
- Real-time dashboard - See your usage update live
- Rate limit tracking - Never hit unexpected limits
- Multi-organization support - Switch between orgs easily
- Usage predictions - Know when you'll hit your limits
- Token consumption monitoring - Track every request
- Clean terminal interface - Beautiful, responsive display
- Automatic request interception
- Smart alerts and warnings
- Historical usage trends
- Export capabilities
Download from the releases page.
Session token authentication provides the most accurate data and full organization access.
- Log in to Cerebras Cloud
- Extract the session token from your browser cookies:
  - Open Developer Tools (F12)
  - Go to Application → Cookies → https://cloud.cerebras.ai
  - Copy the authjs.session-token value
- Set it as an environment variable or save it in the config file
Note: The session token is HTTP-only and must be manually copied. This tool only uses it to fetch your usage data - source code is available for inspection.
API key authentication offers limited functionality compared to a session token:
- Shows only data for that specific key
- Cannot switch organizations
- Less accurate predictions
- Each request consumes ~5 tokens for metadata
To use:
| Flag | Type | Default | Description |
|------|------|---------|-------------|
| --session-token | string | "" | Cerebras session token |
| --org-id | string | "" | Organization ID to monitor |
| --model | string | "qwen-3-coder-480b" | Model to monitor |
| --refresh-rate | int | 10 | Data refresh rate in seconds (1-60) |
| --refresh-per-second | float | 0.75 | Display refresh rate in Hz (0.1-20.0) |
| --timezone | string | auto | Timezone (auto-detected) |
| --time-format | string | auto | Time format: 12h, 24h, or auto |
| --theme | string | auto | Display theme: light, dark, or auto |
| --log-level | string | INFO | Logging level |
| --icons | string | emoji | Icon set: emoji or nerdfont |
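The defaults and ranges in the table can be illustrated with a small Go sketch. It uses the standard library `flag` package rather than spf13/cobra so that it runs without dependencies, and it covers only four of the flags. `Options`, `oneOf`, and `parseOptions` are hypothetical names, the program name passed to `flag.NewFlagSet` is a placeholder, and rejecting out-of-range values (rather than clamping them) is an assumption about behavior the table does not specify.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// Options mirrors a subset of the documented flags (illustrative only).
type Options struct {
	RefreshRate      int
	RefreshPerSecond float64
	Theme            string
	Icons            string
}

// oneOf reports whether v equals one of the allowed values.
func oneOf(v string, allowed ...string) bool {
	for _, a := range allowed {
		if v == a {
			return true
		}
	}
	return false
}

// parseOptions applies the documented defaults, then checks the documented ranges.
// Assumption: out-of-range values are rejected with an error.
func parseOptions(args []string) (Options, error) {
	var o Options
	fs := flag.NewFlagSet("cerebras-monitor", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors out of stderr; the caller reports them
	fs.IntVar(&o.RefreshRate, "refresh-rate", 10, "data refresh rate in seconds (1-60)")
	fs.Float64Var(&o.RefreshPerSecond, "refresh-per-second", 0.75, "display refresh rate in Hz (0.1-20.0)")
	fs.StringVar(&o.Theme, "theme", "auto", "display theme: light, dark, or auto")
	fs.StringVar(&o.Icons, "icons", "emoji", "icon set: emoji or nerdfont")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	switch {
	case o.RefreshRate < 1 || o.RefreshRate > 60:
		return o, fmt.Errorf("--refresh-rate must be in 1-60, got %d", o.RefreshRate)
	case o.RefreshPerSecond < 0.1 || o.RefreshPerSecond > 20.0:
		return o, fmt.Errorf("--refresh-per-second must be in 0.1-20.0, got %g", o.RefreshPerSecond)
	case !oneOf(o.Theme, "light", "dark", "auto"):
		return o, fmt.Errorf("--theme must be light, dark, or auto, got %q", o.Theme)
	case !oneOf(o.Icons, "emoji", "nerdfont"):
		return o, fmt.Errorf("--icons must be emoji or nerdfont, got %q", o.Icons)
	}
	return o, nil
}

func main() {
	opts, err := parseOptions(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(2)
	}
	fmt.Printf("%+v\n", opts)
}
```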
Cerebras enforces rate limits per API key with these response headers:
| Header | Description |
|--------|-------------|
| x-ratelimit-limit-requests-day | Maximum requests per day |
| x-ratelimit-limit-tokens-minute | Maximum tokens per minute |
| x-ratelimit-remaining-requests-day | Requests remaining today |
| x-ratelimit-remaining-tokens-minute | Tokens remaining this minute |
| x-ratelimit-reset-requests-day | Daily limit reset time (seconds) |
| x-ratelimit-reset-tokens-minute | Minute limit reset time (seconds) |
- Go with spf13/cobra for CLI
- spf13/viper for configuration
- sqlc for database queries
- Go 1.24.5 or higher
- sqlc (for database code generation)
Makes requests to: https://cloud.cerebras.ai/api/graphql
Rate limit data extracted from response headers.
MIT License
Contributions welcome! Fork the repository and submit a pull request.

