# m(ctf)p
m(ctf)p is a semi-automated system for solving capture the flag (CTF) challenges. It has two main parts:
- mctfp-server - The MCP server used for integrating with the CTFd environment
  - It's written in Go and uses the mark3labs/mcp-go library
- A Kali Linux-based environment - Basically a Kali Linux Docker image with Claude Code, as well as some scripts for spinning up new ones quickly.
Here's an example of downloading, solving, and submitting a simple challenge in a recent CTF:

# How well does it work?
Decently well! It can zero-shot simpler challenges: run `/attempt_challenge <id>`, wait a bit and confirm the prompts, and the challenge turns green on CTFd. Even when it goes totally haywire, it generally figures out at least a few useful pieces of information along the way. You can pick up from there yourself, feed what you find back in later, and so on.
Generally, I think the instructions/Docker image can be improved quite a bit, to do things like:
- Know that PyCryptodome is imported with `import Cryptodome`
- Tell it to not flail helplessly and attempt to bruteforce things with generic passwords
  - `hashcat` and `john` can be used when cracking is actually the solution, but it's usually pretty clear when something is supposed to be cracked vs not
  - It tried this on basically every challenge when it ran out of ideas, and probably spent $10 in tokens on these silly dead-ends
- Tune how it uses the `notes.txt` files
  - It wasn't very effective with these; the instructions need to be more specific about what types of things it should record
- Actually have some headless decompiler installed
  - It was totally unable to use Ghidra headless
  - It did a pretty decent job using radare2 non-interactively, but usually got the first few invocations wrong
# The MCP Server
# Current Tools
- CTFd (via their API)
  - Full Swagger definitions, but we only want
    - Loading challenges: `GET /challenges` and `GET /challenges/{challenge_id}`
      - And probably `GET /challenges/{challenge_id}/files` for downloading files
    - Submitting answers: `POST /challenges/attempt`
      - Body should be `{"submission": "<flag value>", "challenge_id": <the challenge id>}`
- Notes
  - Mostly just a persisted scratchpad for the model to think in
- VirusTotal (via their API)
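
As an illustration of the flag-submission call listed above, here is a minimal sketch using only the Go standard library. It is not the server's actual code: `newAttemptRequest` and the `attempt` type are hypothetical names, the base URL and token are placeholders, and the `/api/v1` prefix and `Authorization: Token ...` header are assumptions about how the CTFd instance is configured.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// attempt mirrors the request body for POST /challenges/attempt.
type attempt struct {
	Submission  string `json:"submission"`
	ChallengeID int    `json:"challenge_id"`
}

// newAttemptRequest builds (but does not send) a flag-submission request.
// Assumptions: the API lives under "/api/v1" and accepts a token via the
// "Authorization: Token <token>" header.
func newAttemptRequest(baseURL, token string, challengeID int, flag string) (*http.Request, []byte, error) {
	body, err := json.Marshal(attempt{Submission: flag, ChallengeID: challengeID})
	if err != nil {
		return nil, nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/v1/challenges/attempt", bytes.NewReader(body))
	if err != nil {
		return nil, nil, err
	}
	req.Header.Set("Authorization", "Token "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, body, nil
}

func main() {
	// Placeholder URL, token, challenge ID, and flag.
	req, body, err := newAttemptRequest("https://ctf.example.com", "example-token", 42, "flag{example}")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
	fmt.Println(string(body))
	// Sending would be: resp, err := http.DefaultClient.Do(req)
}
```

Keeping request construction separate from sending makes the body easy to inspect before anything is submitted to a live CTF.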
# Usage
Build the Docker image/environment with `docker build -t ctf .`. If you change the tag, make sure to update it in `scripts/new_env.sh` as well.

Copy `scripts/env.sh.example` to `scripts/env.sh` and replace all the TODOs with the relevant secrets, using whatever secret mechanism you prefer. I usually use `pass`.

Then, run `./scripts/new_env.sh <env name> [opus]` and you'll be dropped into a Dockerized Kali environment with Claude installed. Run `claude` to open Claude Code, hit Enter three times to get through the prompts, then run `/attempt_challenge <challenge_id>` to have Claude download and work on the challenge.
# TODO
- [ ] Figure out what other tools should be added
  - Both via API to the MCP server, and in the Docker image
- [ ] Test out the VirusTotal API integration
  - Make sure the API integration looks good first
- [x] Add the ability to choose the strong/weak models when starting a new env
- [x] Test it with Claude
  - And CTFd
- [x] Make a Docker container
  - Should use Kali as a base
  - Add in Claude Code (npm) + the MCP server (build the binary?)
- [x] Add scripts and tooling to create new parallel invocations
- [x] Figure out how/where human + Claude notes should go
  - But maybe write to a file, they'll get lost as is
- [x] Make sure Claude has access to a Python environment
  - And give it common libraries that make sense
  - Basically make it CyberChef
  - And give instructions for how to use it
- [x] Add a custom slash command for starting a challenge
  - Tell it to download from CTFd
  - Tell it to go as far as it can without human intervention
- [x] Read this Claude Code: Best practices for agentic coding
  - Make sure we understand how hierarchical `CLAUDE.md` files work
  - Make sure we understand slash commands
  - Look for other tips and tricks (e.g. running in Docker, MCP configs, etc)

