AgentDesk - Desktops for AI agents



AgentDesk provides full-featured desktop environments that AI agents can control programmatically. Spin them up locally or in the cloud.

▶ Built on agentd, a runtime daemon that exposes a REST API for interacting with the desktop.

▶ Implements the DeviceBay Protocol.

from agentdesk import Desktop

# Create a desktop container
desktop = Desktop.docker()

# Launch the UI for it
desktop.view(background=True)

# Open a browser to Google
desktop.open_url("https://google.com")

# Take actions on the desktop
desktop.move_mouse(500, 500)
desktop.click()
img = desktop.take_screenshots()

from agentdesk import Desktop

desktop = Desktop.docker()

$ agentdesk create --provider docker

*requires docker

Create a remote desktop on Kubernetes

$ agentdesk create --provider kube

View the desktop in the UI

$ agentdesk view old_mckinny

*requires docker

Desktop.delete("old_mckinny")
$ agentdesk delete old_mckinny

desktop.open_url("https://google.com")
coords = desktop.mouse_coordinates()
desktop.move_mouse(500, 500)
desktop.click()
desktop.type_text("What kind of ducks are in Canada?")
desktop.press_key('Enter')
desktop.scroll()
img = desktop.take_screenshots()
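
The individual calls above can be grouped into a reusable helper. What follows is a minimal sketch and not part of AgentDesk: `search_from_desktop` and `RecordingDesktop` are hypothetical names introduced for illustration, and the stub only records calls so the flow can be tried without Docker. It assumes nothing beyond the method names shown above.

```python
from typing import Any, List, Tuple


class RecordingDesktop:
    """Hypothetical stand-in for a Desktop: records calls instead of acting."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, tuple]] = []

    def __getattr__(self, name: str):
        # Only reached for attributes that are not defined on the instance,
        # so every desktop method name is accepted and logged with its args.
        def record(*args: Any) -> None:
            self.calls.append((name, args))

        return record


def search_from_desktop(desktop: Any, query: str, x: int = 500, y: int = 500) -> Any:
    """Open Google, click at (x, y), type the query, then take a screenshot."""
    desktop.open_url("https://google.com")
    desktop.move_mouse(x, y)
    desktop.click()
    desktop.type_text(query)
    desktop.press_key("Enter")
    desktop.scroll()
    return desktop.take_screenshots()
```

To run this against a real desktop, pass the object returned by `Desktop.docker()` in place of the stub.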

Process images to make them more accessible to LMMs.

Add a coordinate grid on top of the image

from agentdesk.processors import GridProcessor

img = desktop.take_screenshots()

processor = GridProcessor()
grid_img = processor.process_b64(img)
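
To illustrate why a coordinate grid helps, the sketch below computes where grid lines would fall on a screenshot of a given size and snaps a point to the nearest intersection. It is not how `GridProcessor` is implemented: `grid_lines`, `nearest_intersection`, and the 100-pixel spacing are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def grid_lines(width: int, height: int, spacing: int = 100) -> Dict[str, List[int]]:
    """Return pixel offsets of vertical (x) and horizontal (y) grid lines."""
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    return {
        "x": list(range(spacing, width, spacing)),
        "y": list(range(spacing, height, spacing)),
    }


def nearest_intersection(x: int, y: int, spacing: int = 100) -> Tuple[int, int]:
    """Snap a point to the closest grid intersection using integer rounding."""
    half = spacing // 2
    return (((x + half) // spacing) * spacing, ((y + half) // spacing) * spacing)
```

A model that reads labeled grid lines off the image can report a target as a grid position, which can then be converted back to pixel coordinates for `desktop.move_mouse`.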

Come join us on Discord.

Please open an issue before creating a PR.

Changes to the VM happen in agentd.
