I just built my own GPU dashboard, hope you like it


  • 30+ GPU Metrics - Utilization, temperature, memory, power, clocks, encoder/decoder stats, and more per GPU
  • Multi-GPU Support - Automatic detection and independent monitoring of all NVIDIA GPUs
  • Live Historical Charts - Real-time graphs with statistics (min/max/avg), threshold indicators, and contextual tooltips
  • Process Monitoring - Track active GPU processes with memory usage and PIDs
  • Clean UI - Responsive interface with glassmorphism design and smooth animations
  • WebSocket Updates - Live push updates on a 2-second interval for near real-time monitoring
  • Docker Deployment - One-command setup with NVIDIA Container Toolkit support
  • Zero Configuration - Works out of the box with any NVIDIA GPU

Monitored Metrics

  • GPU & Memory Utilization (%)
  • Core & Memory Temperature (°C)
  • Memory Usage (Used/Free/Total MB)
  • Power Draw & Limits (W)
  • Fan Speed (%)
  • Clock Speeds (Graphics, SM, Memory, Video MHz)
  • PCIe Generation & Lane Width (Current/Max)
  • Performance State (P-State)
  • Compute Mode
  • Encoder/Decoder Sessions & Stats
  • Driver & VBIOS Version
  • Throttle Status Detection
  • Host CPU & RAM Usage
  • Active GPU Processes with Memory Tracking
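Metrics like these map directly onto nvidia-smi's CSV query interface, which is presumably how the dashboard collects them. A minimal sketch of building and parsing such a query (the field subset and helper names here are illustrative, not the project's exact code):

```python
# Subset of the fields the dashboard tracks; nvidia-smi accepts them
# via --query-gpu (see `nvidia-smi --help-query-gpu` for the full list).
FIELDS = [
    "index", "utilization.gpu", "utilization.memory",
    "temperature.gpu", "memory.used", "memory.total",
    "power.draw", "fan.speed", "clocks.gr", "clocks.mem",
]

def build_query_cmd(fields=FIELDS):
    """Build the nvidia-smi command that emits one CSV row per GPU."""
    return [
        "nvidia-smi",
        f"--query-gpu={','.join(fields)}",
        "--format=csv,noheader,nounits",
    ]

def parse_row(row, fields=FIELDS):
    """Turn one CSV row into a {field: value} dict."""
    return dict(zip(fields, (v.strip() for v in row.split(","))))

# Example row in the shape nvidia-smi prints (sample values):
sample = "0, 42, 12, 65, 4096, 24576, 180.5, 55, 1850, 10501"
print(parse_row(sample)["temperature.gpu"])  # -> 65
```

Running `build_query_cmd()` under `subprocess.run` once per tick is enough to feed the whole metrics list above.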

Docker Deployment (Recommended)

git clone https://github.com/psalias2006/gpu-hot
cd gpu-hot
docker-compose up --build

Access the dashboard at http://localhost:1312

Local Installation

pip install -r requirements.txt
python app.py

Access the dashboard at http://localhost:1312

Requirements

  • NVIDIA GPU with drivers installed (verify with nvidia-smi)
  • Docker & Docker Compose (for containerized deployment)
  • NVIDIA Container Toolkit (for Docker GPU access)
  • Python 3.8+ (for local development)

NVIDIA Container Toolkit Setup

Required for Docker deployment to access GPUs.

Installation:

Follow the official installation guide for your distribution:
📖 NVIDIA Container Toolkit Installation Guide

The guide includes instructions for Ubuntu, Debian, RHEL, CentOS, Fedora, and other distributions.

Verify Installation:

docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi

Project Structure

gpu-hot/
├── app.py              # Flask application with WebSocket server
├── templates/
│   └── index.html      # Web dashboard with live charts
├── requirements.txt    # Python dependencies
├── Dockerfile          # Container configuration
├── docker-compose.yml  # Docker Compose setup
└── README.md           # Documentation

HTTP:

  • GET / - Dashboard interface
  • GET /api/gpu-data - Current GPU metrics (JSON)

WebSocket:

  • gpu_data - Real-time metrics broadcast (2s interval)
  • connect / disconnect - Connection events
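The gpu_data broadcast can be pictured as a simple poll-and-emit loop. Here is a hedged sketch with a fake Socket.IO object standing in for the real server; the FakeSocketIO class, read_metrics callable, and max_ticks bound are illustrative and not taken from app.py:

```python
class FakeSocketIO:
    """Minimal stand-in for a Flask-SocketIO server, used only for the demo."""
    def __init__(self):
        self.emitted = []

    def emit(self, event, payload):
        self.emitted.append((event, payload))

    def sleep(self, seconds):
        pass  # the real server yields to eventlet here

def broadcast_loop(socketio, read_metrics, interval=2.0, max_ticks=None):
    """Poll metrics and emit them as 'gpu_data' every `interval` seconds.

    `max_ticks` bounds the loop for demonstration; the real loop runs forever.
    """
    ticks = 0
    while max_ticks is None or ticks < max_ticks:
        socketio.emit("gpu_data", read_metrics())  # event name from the list above
        socketio.sleep(interval)
        ticks += 1

fake = FakeSocketIO()
broadcast_loop(fake, lambda: {"gpu0": {"util": 42}}, max_ticks=3)
print(len(fake.emitted))  # -> 3
```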

Environment Variables:

  • NVIDIA_VISIBLE_DEVICES - GPU visibility (default: all)
  • NVIDIA_DRIVER_CAPABILITIES - GPU capabilities (default: all)
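In docker-compose.yml these would typically appear under the service's environment block; a sketch with the stated defaults (the service name is an assumption):

```yaml
services:
  gpu-hot:
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=all
```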

Customization in app.py:

  • Update interval: Modify eventlet.sleep(2) for refresh rate
  • Port: Change in socketio.run() (default: 1312)
  • Chart history: Adjust data retention (default: 30 points)

Docker Image:
  • Base: nvidia/cuda:12.1-devel-ubuntu22.04
  • Self-contained nvidia-smi (no host mounting required)
  • Health checks every 30s
  • Automatic restart on failure
  • Exposes port 1312

Adding New Metrics:

  1. Modify nvidia-smi query in parse_nvidia_smi()
  2. Update frontend to display new metrics
  3. Add chart configuration if needed
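Step 1 above amounts to appending a field to the --query-gpu list. A small hypothetical helper (extend_query is not part of the project) shows the idea:

```python
def extend_query(query: str, new_field: str) -> str:
    """Append a field to a comma-separated --query-gpu field list,
    skipping it if it is already present (hypothetical helper)."""
    fields = [f.strip() for f in query.split(",")]
    if new_field not in fields:
        fields.append(new_field)
    return ",".join(fields)

# e.g. adding the current PCIe link generation:
base = "index,utilization.gpu,temperature.gpu"
print(extend_query(base, "pcie.link.gen.current"))
# -> index,utilization.gpu,temperature.gpu,pcie.link.gen.current
```

After the query is extended, the parsing code and the frontend chart configuration need matching entries for the new field.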

Local Development:

pip install -r requirements.txt
python app.py

Access at http://localhost:1312 (Dashboard) or http://localhost:1312/api/gpu-data (API)

  • Verify NVIDIA drivers: nvidia-smi
  • Install NVIDIA Container Toolkit (see Installation section)
  • Restart Docker daemon: sudo systemctl restart docker
  • Test GPU access: docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
  • Check host GPU access: nvidia-smi
  • Verify Container Toolkit: nvidia-ctk --version
  • Review logs: docker-compose logs
  • Configure Docker runtime: sudo nvidia-ctk runtime configure --runtime=docker
  • Check port 1312 accessibility
  • Review browser console for errors
  • Verify firewall settings
# In app.py
socketio.run(app, host='0.0.0.0', port=1312, debug=True)

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/NewFeature)
  3. Commit your changes (git commit -m 'Add NewFeature')
  4. Push to the branch (git push origin feature/NewFeature)
  5. Open a Pull Request

See the LICENSE file for full details.

  • NVIDIA for nvidia-smi
  • Flask & Socket.IO teams
  • Chart.js for visualization
  • Open-source community
