EvalKit is an open-source dashboard that lets your domain experts improve your AI agents.
Install the core library:

```bash
uv add aievalkit
# or
pip install aievalkit
```
Instrument your agent code:
```python
from evalkit import EvalKit

eval_kit = EvalKit()

# Decorate an agent entry point so each call is traced as an interaction.
@eval_kit.trace_interaction(
    agent_name="agentX",
    prompt_template_arg_name="prompt_template",
)
def agent_function(prompt_template):
    return "output"

# Group traced interactions into a named task.
with eval_kit.trace_task(task_name="My Task") as task_ctx:
    agent_function(prompt_template="prompt")
```
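Reading the API above, `trace_interaction` records each decorated call and `trace_task` groups the calls made inside the `with` block into one task. Here is a slightly fuller sketch built on those assumptions; the agent name, prompt strings, and fake LLM output are illustrative only:

```python
from evalkit import EvalKit

eval_kit = EvalKit()

# Illustrative agent: the decorator arguments mirror the snippet above;
# "summarizer" and the prompt strings are made-up examples.
@eval_kit.trace_interaction(
    agent_name="summarizer",
    prompt_template_arg_name="prompt_template",
)
def summarize(prompt_template):
    # A real agent would render the template and call your LLM here.
    return f"(fake summary for: {prompt_template})"

# Several traced interactions grouped under one named task.
with eval_kit.trace_task(task_name="Summarize support tickets"):
    summarize(prompt_template="Summarize this ticket: {ticket}")
    summarize(prompt_template="List action items for: {ticket}")
```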
To use the dashboard, you currently need to set up the application yourself; we plan to publish a Docker container soon for a seamless build.
All LLM calls currently run on Vertex AI, so you need to configure it with the steps below.
- Set the environment variables shown in `server/.env.example` (a sanity-check sketch follows this list).
- Run `source .env_variable_file`, substituting the name of your env file, to load the variables into your shell.
- Create a `serviceaccount.json` file (your Google Cloud service-account key) in the `server` directory.
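Before starting the server, you can sanity-check that the variables are loaded. The variable names below are assumptions (`server/.env.example` is the authoritative list); `GOOGLE_APPLICATION_CREDENTIALS` is the standard Google Cloud variable for pointing at a service-account key:

```python
import os

# Hypothetical variable names for illustration; server/.env.example is
# the authoritative list. GOOGLE_APPLICATION_CREDENTIALS is the standard
# Google Cloud variable for a service-account key file.
required = [
    "GOOGLE_APPLICATION_CREDENTIALS",  # e.g. server/serviceaccount.json
    "GOOGLE_CLOUD_PROJECT",            # assumed: your GCP project id
    "GOOGLE_CLOUD_LOCATION",           # assumed: Vertex AI region, e.g. us-central1
]
missing = [name for name in required if not os.environ.get(name)]
if missing:
    raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
print("Environment looks configured.")
```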
```bash
cd server && uv sync
uv run main.py
# backend now runs on http://localhost:8000
```
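To confirm the backend is actually listening, you can hit it from Python. The root path is an assumption (the exposed routes aren't documented here), but any HTTP response proves the server is up:

```python
import urllib.error
import urllib.request

# Hypothetical smoke test: we only check that something answers on port 8000.
try:
    with urllib.request.urlopen("http://localhost:8000") as resp:
        print("backend responded with HTTP", resp.status)
except urllib.error.HTTPError as err:
    # Even a 404 means the server is listening.
    print("backend responded with HTTP", err.code)
```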
```bash
cd frontend && npm install
npm run dev  # frontend runs on http://localhost:5173/
```
To run the basic example demonstrating core library features:
```bash
uv run examples/basic_labeling.py
```