Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides efficient GPU-accelerated operations with lazy evaluation semantics, allowing for chaining of operations without unnecessary intermediate materializations.
- Fused Operations - Operations are fused when possible
- GPU Acceleration - Built on CUDA/Thrust for high performance
- Chainable API - Clean, expressive syntax for complex operations
- Header-Only - Simple integration with #include "parrot.hpp"
¹ Lazy-ish means that any operation that can be lazily evaluated is lazily evaluated.
- CMake (3.10+)
- C++ compiler with C++20 support
- NVIDIA CUDA Toolkit 13.0 or later
- NVIDIA GPU with compute capability 7.0+
For detailed build instructions, see BUILDING.md.
Parrot provides significant code reduction compared to other CUDA libraries:
| Thrust | ~10x less code |
See detailed comparisons in our documentation.
- Full Documentation - Complete API reference and examples
- Examples - Standalone Parrot examples
- vs Thrust - Side-by-side comparisons with Thrust
The examples/ directory contains:
- getting_started/ - Simple examples to get started
- thrust/ - Parrot implementations of Thrust examples
We welcome contributions! Please see our CONTRIBUTING.md guide for details on:
- Developer Certificate of Origin (DCO) requirements
- Setting up the development environment
- Code standards and review process
- Running tests and code quality checks
- Building documentation
All contributions must be signed off under the DCO and comply with the Apache License 2.0.
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
This project includes third-party software components. See THIRD_PARTY_LICENSES for complete license information and attributions.
Built on top of NVIDIA Thrust and CUDA.
.png)


