Show HN: Open-source dashboard for your domain experts to improve your AI Agents

EvalKit is an open-source dashboard for your domain experts to improve your AI Agents.

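  1. Install the core library:

    uv add aievalkit    # or: pip install aievalkit

  2. Instrument your agent code:

    from evalkit import EvalKit

    eval_kit = EvalKit()

    # Trace every call to the agent; prompt_template_arg_name names the
    # argument that carries the prompt template.
    @eval_kit.trace_interaction(
        agent_name="agentX",
        prompt_template_arg_name="prompt_template"
    )
    def agent_function(prompt_template):
        return "output"

    # Group the traced calls under a named task.
    with eval_kit.trace_task(task_name="My Task") as task_ctx:
        # agent_function requires prompt_template; the string is a placeholder.
        agent_function(prompt_template="Your prompt template")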
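
For readers unfamiliar with this style of instrumentation, here is a minimal sketch of how a decorator of this shape could capture the argument named by prompt_template_arg_name together with the function's output. It is an illustrative stand-in, not EvalKit's implementation: trace_interaction_sketch and RECORDS are hypothetical names, and the real library may record different fields or store them elsewhere.

```python
import functools
import inspect

# Hypothetical in-memory store; how EvalKit persists traces is not shown here.
RECORDS = []


def trace_interaction_sketch(agent_name, prompt_template_arg_name):
    """Illustrative stand-in for a tracing decorator (not EvalKit's code)."""
    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Bind the call's arguments to parameter names so the prompt
            # template is found whether passed positionally or by keyword.
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            output = func(*args, **kwargs)
            RECORDS.append({
                "agent_name": agent_name,
                "prompt_template": bound.arguments.get(prompt_template_arg_name),
                "output": output,
            })
            return output

        return wrapper

    return decorator


@trace_interaction_sketch(agent_name="agentX",
                          prompt_template_arg_name="prompt_template")
def agent_function(prompt_template):
    return "output"


agent_function("Summarize: {text}")  # placeholder template for illustration
print(RECORDS)
```

Resolving the argument by name through the function signature is one reasonable way to make such a decorator independent of how the caller passes the template.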

You need to set up the application before you can use it. We plan to publish a Docker container soon to make the build seamless.

Currently all LLM code runs on Vertex AI, so you need to configure it using the steps below.

  1. Set the environment variables as shown in the server/.env.example file.
  2. Run "source .env_variable_file".
  3. Create a serviceaccount.json file in the server directory.
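
Before starting the server, the three steps above can be sanity-checked with a small script. The sketch below is not part of EvalKit; the helper names env_keys and preflight are hypothetical. It assumes server/.env.example uses plain KEY=VALUE lines (optionally prefixed with export) and that the script is run from the repository root.

```python
import os
from pathlib import Path


def env_keys(example_path):
    """Return the variable names declared in a KEY=VALUE style env file."""
    keys = []
    for line in Path(example_path).read_text().splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        keys.append(line.split("=", 1)[0].strip())
    return keys


def preflight(server_dir, environ=os.environ):
    """List setup problems found under server_dir; empty means none found."""
    server = Path(server_dir)
    problems = []
    example = server / ".env.example"
    if example.is_file():
        for key in env_keys(example):
            if not environ.get(key):
                problems.append(f"environment variable {key} is not set")
    else:
        problems.append(f"{example} not found")
    if not (server / "serviceaccount.json").is_file():
        problems.append(f"{server / 'serviceaccount.json'} not found")
    return problems


for problem in preflight("server"):
    print("setup issue:", problem)
```

If a variable is reported as unset even after sourcing your env file, check that the file exports its variables: a plain assignment in a sourced file stays local to the shell and is not passed on to programs started from it.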

Then start the backend and the frontend:

    cd server && uv sync
    uv run main.py    # backend now runs on http://localhost:8000

    cd frontend && npm install
    npm run dev       # frontend runs on http://localhost:5173/

Examples (for Core Library Usage)

To run the basic example demonstrating core library features:

    uv run examples/basic_labeling.py