This repository contains the code and environments for my thesis on reinforcement learning for drone swarms using Unreal Engine 4 and Microsoft AirSim, conducted at AIST, Tsukuba. It includes two separate implementations:
- Single-Agent: Training a single drone with PPO in a custom UE4 environment, using Stable-Baselines3 and Gymnasium.
- Multi-Agent: Training a swarm of drones with PPO, using modified versions of the PettingZoo and SuperSuit libraries to support multi-agent stacked observations from RGB cameras.
All code is placed under the airsim folder and assumes AirSim and UE4 are installed following the official AirSim documentation.
⚠️ Warning: this client was developed and tested on Windows only. It may not install or run correctly on macOS or Linux.
- Unreal Engine 4: install via the Epic Games Launcher. Ensure you have a UE4 project compatible with AirSim.
- Microsoft AirSim: clone and build following the AirSim installation guide. AirSim creates a settings.json file in your user Documents directory: Documents/AirSim on Windows, ~/Documents/AirSim on Linux (microsoft.github.io). A minimal example is shown after this list.
- Conda: for managing Python environments.
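For reference, here is a minimal settings.json for a single SimpleFlight drone, based on the AirSim documentation (your project may need additional fields):

```json
{
  "SettingsVersion": 1.2,
  "SimMode": "Multirotor"
}
```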
- Navigate to the single_agent folder.
- Create and activate the Conda environment:

  ```bash
  conda env create -f environment.yml   # Python 3.9.16
  conda activate deeprl_single
  ```

- Copy the airsim folder into your AirSim PythonClient directory:

  ```bash
  cp -r airsim /path/to/AirSim/PythonClient/
  ```

- Launch UE4, open your project, and hit Play to run the environment. At this point the drone will idle, waiting for the training/evaluation script to connect; the quick check after this list lets you verify the connection.
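Once UE4 is playing, you can verify that AirSim is reachable from Python. This is a minimal sketch using the standard AirSim client API:

```python
import airsim

# Quick connectivity check: run this while the UE4 environment is playing.
client = airsim.MultirotorClient()
client.confirmConnection()            # prints the connected AirSim version
print(client.getMultirotorState())    # dumps the current drone state
```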
- Navigate to the multi_agent folder.
- Install the environment using:

  ```bash
  conda env create -f environment.yml   # Python 3.11.3
  conda activate deeprl_multi
  ```

- Copy the modified libs into your Python path, or install them in editable mode:

  ```bash
  pip install -e modified_libs/pettingzoo
  pip install -e modified_libs/supersuit
  ...
  ```

  Unfortunately, I can't remember exactly which packages I modified, so I advise installing all of them from modified_libs to avoid bugs.

- Place your custom settings.json (with your desired number of drones) into the folder created by AirSim at Documents/AirSim on Windows or ~/Documents/AirSim on Linux, replacing the default settings file (microsoft.github.io). An example multi-drone configuration is shown after this list.
- Copy the airsim folder into the AirSim PythonClient folder:

  ```bash
  cp -r airsim /path/to/AirSim/PythonClient
  ```

- Open UE4 and run your project; the multi-drone configuration in settings.json determines the size of the spawned swarm.
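For reference, a multi-drone settings.json follows AirSim's multi-vehicle format. The vehicle names and spawn offsets below are illustrative; they must match what the code expects:

```json
{
  "SettingsVersion": 1.2,
  "SimMode": "Multirotor",
  "Vehicles": {
    "Drone1": { "VehicleType": "SimpleFlight", "X": 0, "Y": 0, "Z": 0 },
    "Drone2": { "VehicleType": "SimpleFlight", "X": 2, "Y": 0, "Z": 0 },
    "Drone3": { "VehicleType": "SimpleFlight", "X": 4, "Y": 0, "Z": 0 }
  }
}
```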
Important: All coordinates, spawn positions, and environment parameters in the code must be adapted to your specific UE4 map. Since maps are not provided, ensure you update any hardcoded positions and settings to match the map you create.
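For example, the environment scripts contain constants along these lines; the names and values here are purely illustrative, not the actual code:

```python
# Purely illustrative placeholders -- the real constants live in the env scripts
# and must be changed to match your own UE4 map. AirSim uses NED coordinates,
# so z is negative above the ground.
SPAWN_POSITION = (0.0, 0.0, -2.0)   # x, y, z in metres, relative to PlayerStart
GOAL_POSITION = (50.0, 10.0, -5.0)  # target the drone(s) should reach
```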
- Launch training via:

  ```bash
  python airsim/train.py   # single- and multi-agent
  ```

- Adjust hyperparameters or environment settings directly in the Python scripts or in settings.json; a typical setup is sketched after this list.
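For orientation, a typical Stable-Baselines3 PPO setup looks like the sketch below. This is not the repo's actual configuration: the environment ID, hyperparameter values, and paths are assumptions.

```python
# Sketch of a typical Stable-Baselines3 PPO setup; names, values, and paths
# are illustrative, not this repo's actual configuration.
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("YourAirSimEnv-v0")  # hypothetical registered environment ID

model = PPO(
    "CnnPolicy",                    # image observations from the drone camera
    env,
    learning_rate=3e-4,
    n_steps=2048,
    batch_size=64,
    tensorboard_log="./tb_logs/",   # assumed log directory (see TensorBoard below)
    verbose=1,
)
model.learn(total_timesteps=1_000_000)
model.save("saved_policy/ppo_drone")  # evaluation later loads weights from saved_policy
```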
You can monitor training progress, rewards, and hyperparameters using TensorBoard.
After launching training, open a second terminal and run the command below (the log directory is an assumption; point --logdir at wherever your training run writes its TensorBoard logs):
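```bash
tensorboard --logdir ./tb_logs   # assumed log directory; match your training script
```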
This allows you to inspect:
- Rewards over time
- Episode lengths
- Custom metrics
- Learning rates and other hyperparameters
You can evaluate the trajectories of the single- and multi-agent training by running the evaluation script. The file name below is an assumption; check the repo for the actual entry point:
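```bash
python airsim/eval.py   # hypothetical script name; use the repo's actual evaluation script
```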
It will evaluate the weights saved inside saved_policy.
Always make sure the UE4 environment is running before launching training or evaluation, and that your settings file is configured correctly (and named settings.json). If you want to change the number of drones, you must also update the code accordingly.
Note: The training maps used in this project are ~200 GB.
They’re available on request; just let me know if you’d like a copy!
- Unreal Engine 4 (Epic Games)
- Microsoft AirSim
- PettingZoo & SuperSuit
- Prof. Akiya Kamimura & Prof. Andrea Roli
If you find this work useful in your research, you are welcome to cite it. An official publication is currently in preparation and will be linked here once available. In the meantime, feel free to reference this repository and acknowledge the work.
This project is licensed under the MIT License. See LICENSE for details.




