Holo1: Cost-Efficient Web Agent Powered by Open Weights


[Submitted on 3 Jun 2025]


Abstract: We present Surfer-H, a cost-efficient web agent that integrates Vision-Language Models (VLMs) to perform user-defined tasks on the web. We pair it with Holo1, a new open-weight collection of VLMs specialized in web navigation and information extraction. Holo1 was trained on carefully curated data sources, including open-access web content, synthetic examples, and self-produced agentic data. Holo1 tops generalist User Interface (UI) benchmarks as well as our new web UI localization benchmark, WebClick. When powered by Holo1, Surfer-H achieves state-of-the-art performance of 92.2% on WebVoyager, striking a Pareto-optimal balance between accuracy and cost-efficiency. To accelerate research in agentic systems, we are open-sourcing both our WebClick evaluation dataset and the Holo1 model weights.
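
The abstract's claim of a Pareto-optimal balance can be read as follows: no other configuration is both cheaper and at least as accurate. The sketch below is only an illustration of that definition, not the authors' evaluation code. The `AgentResult` type, the `pareto_front` helper, and every number in `EXAMPLE` are hypothetical placeholders, not results from the paper.

```python
from typing import NamedTuple


class AgentResult(NamedTuple):
    name: str
    cost: float      # hypothetical cost per task, arbitrary units
    accuracy: float  # hypothetical success rate in [0, 1]


def pareto_front(results: list[AgentResult]) -> list[AgentResult]:
    """Return the configurations that no other configuration dominates.

    A dominates B if A costs no more than B, is at least as accurate,
    and is strictly better on at least one of the two axes.
    """
    front = []
    for r in results:
        dominated = any(
            o.cost <= r.cost
            and o.accuracy >= r.accuracy
            and (o.cost < r.cost or o.accuracy > r.accuracy)
            for o in results
        )
        if not dominated:
            front.append(r)
    # Sort by cost so the front reads left to right on a cost/accuracy plot.
    return sorted(front, key=lambda r: r.cost)


# Placeholder numbers for illustration only; NOT results from the paper.
EXAMPLE = [
    AgentResult("agent_a", cost=1.0, accuracy=0.70),
    AgentResult("agent_b", cost=2.0, accuracy=0.65),  # dominated by agent_a
    AgentResult("agent_c", cost=3.0, accuracy=0.90),
    AgentResult("agent_d", cost=3.0, accuracy=0.85),  # dominated by agent_c
]

FRONT_NAMES = [r.name for r in pareto_front(EXAMPLE)]
```

With these placeholder values, `agent_a` and `agent_c` remain on the front: each is either cheaper or more accurate than every alternative, which is the sense of "Pareto-optimal" used in the abstract.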

Submission history

From: Maxime Theillard
[v1] Tue, 3 Jun 2025 13:29:03 UTC (358 KB)
