
A high-performance inference engine for AI models on Apple Silicon. Key features:
- Simple, high-level API
- Hybrid architecture, where layers can be computed as GPU kernels or via MPSGraph (a low-level API beneath CoreML with ANE access)
- Unified model configurations, making it easy to add support for new models
- Traceable computations to ensure correctness against the source-of-truth implementation
- Utilizes unified memory on Apple devices
First, add the `uzu` dependency to your `Cargo.toml`:

```toml
[dependencies]
uzu = { git = "https://github.com/trymirai/uzu", branch = "main", package = "uzu" }
```
Then, create an inference Session with a specific model and configuration:
```rust
use std::path::PathBuf;

use uzu::{
    backends::metal::sampling_config::SamplingConfig,
    session::{
        session::Session, session_config::SessionConfig,
        session_input::SessionInput, session_output::SessionOutput,
        session_run_config::SessionRunConfig,
    },
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let model_path = PathBuf::from("MODEL_PATH");
    let mut session = Session::new(model_path.clone())?;
    session.load_with_session_config(SessionConfig::default())?;

    let input = SessionInput::Text("Tell about London".to_string());
    let tokens_limit = 128;
    let run_config = SessionRunConfig::new_with_sampling_config(
        tokens_limit,
        Some(SamplingConfig::default()),
    );

    let output = session.run(input, run_config, Some(|_: SessionOutput| true));
    println!("{}", output.text);

    Ok(())
}
```
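The last argument to `session.run` is a callback that receives a `SessionOutput` and returns a `bool`; the example above always returns `true`. As a sketch (hypothetical usage, assuming the callback is invoked with intermediate `SessionOutput` values during generation and that returning `false` stops it early), you could stream text as it is produced:

```rust
// Hypothetical variant of the run call above: print partial output as it
// arrives and stop generation early once enough text has accumulated.
// Assumes the callback semantics described in the lead-in; check the uzu
// documentation for the exact contract.
let output = session.run(
    input,
    run_config,
    Some(|partial: SessionOutput| {
        println!("so far: {}", partial.text);
        partial.text.len() < 2048 // return false to stop early
    }),
);
```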
For a detailed explanation of the architecture, please refer to the documentation.
uzu uses its own model format. To export a specific model, use lalamo. First, get the list of supported models:

```bash
uv run lalamo list-models
```
Then, export the specific one:

```bash
uv run lalamo convert meta-llama/Llama-3.2-1B-Instruct --precision float16
```
Alternatively, you can download a pre-exported model using the sample script:

```bash
./scripts/download_test_model.sh $MODEL_PATH
```
You can run uzu in CLI mode:

```bash
cargo run --release -p cli -- help
```

```
Usage: uzu_cli [COMMAND]

Commands:
  run    Run a model with the specified path
  serve  Start a server with the specified model path
  help   Print this message or the help of the given subcommand(s)
```
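For example (assuming the `run` and `serve` subcommands take the model path as a positional argument, as the usage text above suggests; the exact invocation may differ):

```shell
# Run interactive generation with an exported model
# (MODEL_PATH is a placeholder for your exported model directory)
cargo run --release -p cli -- run MODEL_PATH

# Or start a server backed by the same model
cargo run --release -p cli -- serve MODEL_PATH
```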
Also available:
- uzu-swift - a prebuilt Swift framework, ready to use with SPM
This project is licensed under the MIT License. See the LICENSE file for details.