TL;DR: A CLI tool for developing AMD GPU kernels on DigitalOcean. Spins up GPU droplets, syncs code, runs/profiles HIP kernels with rocprof, and pulls results back locally. Zero GPU hardware required—just write, test, and profile AMD GPU code from your laptop.
From PyPI (recommended):
Development setup:
With uv:
With venv:
-
Configuration
- chisel configure - Set up your DigitalOcean API credentials
Usage:
# Interactive configuration (recommended for first-time setup) chisel configure # Non-interactive with token chisel configure --token YOUR_DIGITALOCEAN_TOKENGetting your API token:
- Visit DigitalOcean API Tokens
- Click "Generate New Token"
- Give it a name (e.g., "chisel-cli") and ensure it has read and write access
- Copy the token immediately (you won't be able to see it again)
For detailed instructions, see the official guide.
-
Spin up GPU droplet
- chisel up - Create or reuse a GPU-accelerated droplet
Usage:
What it does:
- Checks for existing 'chisel-dev' droplet
- If none exists, creates a new droplet with:
- Size: gpu-mi300x1-192gb (AMD MI300X GPU)
- Image: AMD AI/ML Ready (ROCm pre-installed)
- Region: ATL1 (where AMD GPUs are available)
- SSH keys: Automatically injects all keys from your DO account
- Waits for droplet to be ready and SSH accessible
- Displays connection information
-
List droplets
- chisel list - Show all chisel droplets
Usage:
Shows:
- All active chisel droplets with their status
- IP addresses for SSH access
- Region, size, and creation time
- Current active droplet from local state
-
Sync code
- chisel sync - Push local files to droplet (only changed files)
Usage:
# Sync a file to /root/chisel/ (default) chisel sync simple-mm.cpp # Sync directory contents chisel sync ./src/ # Sync to custom destination chisel sync myfile.cpp --dest /tmp/What it does:
- Uses rsync for efficient file transfer
- Only transfers changed files
- Shows progress during transfer
- Creates destination directory if needed
-
Run commands
- chisel run - Execute commands remotely with live output streaming
Usage:
# Compile and run HIP kernel chisel run "hipcc /root/chisel/simple-mm.cpp -o /tmp/test && /tmp/test" # Run multiple commands chisel run "make && ./bench.sh" # Check GPU status chisel run "rocm-smi"What it does:
- SSH exec with real-time output streaming
- Returns actual exit codes
- Handles both stdout and stderr
- Works with interactive commands
-
Profile kernels
- chisel profile - Profile commands or source files with rocprof and pull results locally
Usage:
# Profile source file (auto-compiles with hipcc) chisel profile simple-mm.cpp # Profile with custom compiler flags chisel profile kernel.cpp --args "-O3 -DNDEBUG" # Profile existing binary chisel profile "/tmp/my-binary" # Profile any command chisel profile "ls -la" # Custom trace options and output directory chisel profile simple-mm.cpp --trace hip,hsa,roctx --out ./results # Auto-open results in Perfetto chisel profile kernel.cpp --openWhat it does:
- Auto-syncs source files to droplet if needed
- Compiles source files with hipcc
- Runs rocprof with specified trace options
- Downloads profile results to local directory
- Displays summary of top kernel hotspots
- Optionally opens Chrome trace in Perfetto
-
Pull artifacts
- chisel pull - Pull files or directories from the droplet to local machine
Usage:
# Pull a specific file to current directory chisel pull /tmp/results.txt # Pull to a specific local path chisel pull /tmp/logs/ --local ./local_logs/ # Pull compiled binary chisel pull /root/chisel/my-binary --local ./my-binaryWhat it does:
- Pulls any file or directory from the remote droplet
- Automatically handles files vs directories
- Creates local destination directories as needed
- Shows progress and confirmation
-
Clean up old droplets
- chisel sweep - Clean up droplets that have been running too long
Usage:
# Clean up droplets older than 6 hours (default) chisel sweep # Custom time threshold chisel sweep --hours 12 # Auto-confirm without prompting chisel sweep --yesWhat it does:
- Shows all droplets with uptime and estimated costs
- Identifies droplets running longer than specified threshold
- Prompts for confirmation before destroying old droplets
- Clears local state if active droplet was destroyed
-
Stop billing
- chisel down - Destroy the droplet to stop charges
Usage:
What it does:
- Prompts for confirmation before destroying
- Completely removes the droplet (not just powered off)
- Clears local state cache
- Stops all billing immediately
Cost Management:
- Cost warnings appear when droplets run longer than 12 hours
- Estimated at $1.99/hour for AMD MI300X GPU droplets
- Use chisel sweep to clean up old droplets automatically
- Droplets self-destruct after 15 minutes of inactivity
- AMD's droplets ship with ROCm pre-installed, so driver stack is available instantly
- Use DigitalOcean's pydo to create / destroy nodes
-
Python CLI skeleton – typer or argparse; single main.py.
-
DigitalOcean wrapper
import pydo client = pydo.Client(token=token) client.droplets.create(size='gpu-mi300x1-192gb', image='gpu-amd-base', ...) -
SSH/rsync layer – Use paramiko for exec + rsync/scp shell out (simplest); later swap to async libraries if perf matters.
-
Cloud-init script – idempotent bash that:
- apt update && apt install -y build-essential rocblas-dev …
- Adds a /etc/profile.d/chisel.sh that exports ROCm paths.
-
State cache – tiny JSON in ~/.cache/chisel/state.json mapping project → droplet ID & IP so repeated chisel run skips the spin-up step.
-
Credential handling – ENV override CHISEL_DO_TOKEN > config file, because CI.
-
Cost guardrails – warn if droplet has been alive >N hours; chisel sweep to nuke zombies.
[x] | chisel configure - DO token validation, config storage |
[x] | chisel up / down / list, cloud-init basics, state cache. |
[x] | sync + run (blocking), colored log streaming. |
[x] | profile - rocprof integration, result parsing, Perfetto. |
[x] | Artifact pull, graceful ^C handling, rudimentary tests. |
[x] | Cost warnings, README with install script, publish to PyPI. |
- concurrent runs (non-blocking sync and run)