A Practical Experiment in Building an AI Agent Swarm


Using SwarmSDK to Organize Years of Productions

Obie Fernandez

I’ve been making electronic music for decades (professionally since 2018), and like many producers, I have a massive archive of WAV files related to my work. The last few years’ worth is currently scattered across over 200 gigabytes of Dropbox folders that contain my project files and audio exports. Some of those WAV files are final masters of my music, and I must never lose track of them. But they are mixed in amongst stems, premasters, test bounces, and sample files from Ableton projects. The organizational debt is real, and as part of a bigger project I finally decided to tackle it — not by hiring an intern, but by building an AI swarm.

This post walks through how I used SwarmSDK (the v2 rewrite of Claude Swarm) to create a specialized agent team that can intelligently search through my Dropbox and distinguish between actual master recordings and the noise. Along the way, we’ll introduce concepts related to multi-agent coordination, custom tool development, virtual filesystems, and the surprising challenges of working with lower-cost LLMs.

Finding Needles in a Haystack of Stems

My Dropbox has tens of thousands of WAV files. Here’s what a real folder looks like (from my upcoming “Foreplay” project with Solarstone):

/Foreplay Project/
  /Foreplay OLD STEMS/
    Solarstone & Obie Fernandez - Foreplay VLIND MASTER.wav ← THIS! (87 MB)
    Solarstone & Obie Fernandez - Foreplay VLIND MASTER 27 FX.wav ← NOT this (130 MB stem)
    Solarstone & Obie Fernandez - Foreplay VLIND MASTER 35 PADS.wav ← NOT this (130 MB stem)
    Solarstone & Obie Fernandez - Foreplay VLIND MASTER 21 AH AH AH AH AH.wav ← NOT this (130 MB stem)
    Solarstone & Obie Fernandez - Foreplay VLIND MASTER 23 BASS GROUP.wav ← NOT this (130 MB stem)
    Solarstone & Obie Fernandez - Foreplay VLIND MASTER 2 KICK.wav ← NOT this (130 MB stem)
    Solarstone & Obie Fernandez - Foreplay VLIND MASTER 26 OFFBEAT BASS.wav ← NOT this (130 MB stem)
    Solarstone & Obie Fernandez - Foreplay VLIND MASTER 3 UNDERWORLD BASS.wav ← NOT this (130 MB stem)
    Solarstone & Obie Fernandez - Foreplay VLIND MASTER 20 SNARE ROLL.wav ← NOT this (130 MB stem)
    Solarstone & Obie Fernandez - Foreplay VLIND MASTER 22 LIL LOUIS HI STRING.wav ← NOT this (stem)
    Solarstone & Obie Fernandez - Foreplay VLIND MASTER 31 BALEARIC PLUCKS.wav ← NOT this (stem)
    Solarstone & Obie Fernandez - Foreplay VLIND MASTER SUB PAD.wav ← NOT this (stem)
    Solarstone & Obie Fernandez - Foreplay VLIND MASTER 8 PERC GROUP.wav ← NOT this (stem)
  /Foreplay/
    /Samples/
      /Imported/
        vocals.wav ← NOT this (21 MB vocal sample)
        DPA - Snare - 012.wav ← NOT this (71 KB sample)
        DPA - Kick - 005.wav ← NOT this (111 KB sample)
        DPA - Open Hat - 007.wav ← NOT this (63 KB sample)
      /Processed/
        /Freeze/ ← Ableton freeze files (NOT masters)
          Freeze 17 PIKE BASS.wav ← NOT this (30 MB freeze)
          Freeze 28-theRiser.wav ← NOT this (50 MB freeze)
          Freeze 35 ARP.wav ← NOT this (61 MB freeze)
        /Reverse/ ← Reversed samples (NOT masters)
          DPTE Crash Cymbal - 005 R.wav ← NOT this (2.6 MB)
          DPTE White FX - 001 R.wav ← NOT this (7.1 MB)
  /Cubase/
    /Mixdown/
      /Foreplay Premaster Stems 44.1 24bit 132bpm/
        08. Pikes.wav ← NOT this (109 MB premaster stem)
        09. Filtery Eighths Synth.wav ← NOT this (109 MB premaster stem)
        12. Bass Higher Octave.wav ← NOT this (109 MB premaster stem)

Out of hundreds of WAV files on any given big production, only ONE is the actual final master. In the example above, the rest are:

  • 13+ individual stems with “MASTER” in the filename but instrument names after it
  • 3+ premaster stems in a different folder
  • Dozens of Ableton freeze files, imported samples, and processed effects
  • Hundreds of tiny drum hits and sample files (63 KB–4 MB)
  • .asd files (Ableton’s waveform analysis files)

And this is just ONE project. I have hundreds of these.

The Challenge

  1. Search recursively through Dropbox folders
  2. Find only final stereo masters (not stems, premasters, or samples)
  3. Handle pagination (thousands of files)
  4. Filter intelligently without false positives
  5. Store results for analysis

And critically: I want to do this with lower-cost models. State-of-the-art models get expensive when you’re processing thousands of files. I was ultimately able to get this to work with MiniMax M2, a very affordable model.

SwarmSDK: Multi-Agent Orchestration in Pure Ruby

SwarmSDK is my colleague Paulo Arruda’s complete rewrite of Claude Swarm that runs everything in a single process using RubyLLM for LLM interactions. Instead of spawning multiple Claude Code instances, it orchestrates lightweight agents that share a runtime and communicate through a clean internal API.

Here’s the relevant part of my swarm configuration. (This work is part of a bigger proprietary system, and I’ve redacted the other agents in my swarm.)

version: 2
swarm:
  name: "Studio Crew"
  lead: manager
  all_agents:
    provider: openrouter
    model: "minimax/minimax-m2"

  agents:
    file-manager:
      description: "Dropbox monitoring"
      tools:
        - DropboxGlob
        - Read:
            allowed_paths: ["**/*", "/swarm_workspace/**/*"]
        - Write:
            allowed_paths: ["**/*", "/swarm_workspace/**/*"]
        - Edit:
            allowed_paths: ["**/*", "/swarm_workspace/**/*"]
        - Bash
      system_prompt: |
        You are responsible for keeping Kr8d's structured
        data aligned with the Dropbox file system.

The key insight here: instead of trying to make one massive agent do everything, I created a specialized tool (DropboxGlob) and a file manager agent that knows how to use it effectively.

DropboxGlob

The first version of DropboxGlob was straightforward — just search for files matching a glob pattern. But real-world usage quickly revealed limitations:

Problem 1: Sample Files Everywhere

When I let the swarm run with pattern: "**/*.wav", it found thousands of files. Most were tiny samples from Ableton's freeze files and sample libraries. The agent was drowning in noise.

Solution: Add min_size parameter

# app/swarm_tools/dropbox_glob.rb
module SwarmTools
  # DropboxGlob tool for searching files in an organization's configured Dropbox folders
  #
  # This tool provides glob pattern matching across an organization's Dropbox account,
  # specifically searching within the folders configured in DropboxFolderPath records.
  #
  # Examples:
  #   DropboxGlob(pattern: "*.wav")          # Find all WAV files
  #   DropboxGlob(pattern: "**/*.mp3")       # Find MP3 files recursively
  #   DropboxGlob(pattern: "masters/*.wav")  # Find WAVs in masters folder
  class DropboxGlob < RubyLLM::Tool
    # ...lots of other code, including the tool description...

    param :min_size,
      type: "integer",
      desc: "Minimum file size in bytes. Files smaller than this will be excluded.",
      required: false

    # ...lots of other code...

    # Check whether a Dropbox entry meets the minimum size requirement
    def meets_size_requirement?(entry, min_size)
      return true if min_size.nil?
      return true if entry[".tag"] == "folder"

      file_size = entry["size"] || 0
      file_size >= min_size
    end

Now my agents could filter at the tool level: min_size: 10485760 (10MB) cuts out nearly all sample files since final masters are at least 2-3 minutes of stereo audio.
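The size check itself is trivial, which is the point: it runs in plain Ruby before the model ever sees a file list. Here is the same helper extracted as a standalone method, exercised against hashes shaped like Dropbox API entries (the `".tag"` and `"size"` keys):

```ruby
# Standalone version of the size filter, using hashes shaped like
# Dropbox API entries (".tag" and "size" keys).
def meets_size_requirement?(entry, min_size)
  return true if min_size.nil?                 # no filter requested
  return true if entry[".tag"] == "folder"     # folders always pass
  (entry["size"] || 0) >= min_size
end

TEN_MB = 10 * 1024 * 1024 # 10485760

master = { ".tag" => "file", "size" => 87_000_000 } # 87 MB final master
sample = { ".tag" => "file", "size" => 63_000 }     # 63 KB drum hit

meets_size_requirement?(master, TEN_MB) # => true
meets_size_requirement?(sample, TEN_MB) # => false
```

Note that folders always pass the check; excluding them here would break recursive traversal.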

Problem 2: Ableton Project Cruft

Ableton projects have folders like /Samples/, /Stems/, and /Processed/ that contain hundreds of non-master files. Even with size filtering, these were cluttering the results.

Solution: Add exclude_paths parameter

param :exclude_paths,
  type: "array",
  desc: "Array of path strings to exclude. Matching is case-insensitive.",
  required: false

# ...

# Check whether a Dropbox entry meets the path exclusion requirements
def meets_exclusion_requirement?(entry, exclude_paths)
  # No exclusions specified, accept all files
  return true if exclude_paths.nil? || exclude_paths.empty?

  # Get the file path (case-insensitive comparison)
  path = (entry["path_display"] || entry["path_lower"] || "").downcase

  # Return false if any exclusion matches (file should be excluded)
  !exclude_paths.any? { |excluded| path.include?(excluded.downcase) }
end

Case-insensitive matching is crucial here — /Samples/, /samples/, and /SAMPLES/ should all be filtered out.
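A quick check of that behavior, again with hashes standing in for Dropbox entries:

```ruby
# Standalone version of the exclusion filter; entries are hashes shaped
# like Dropbox API results ("path_display" key).
def meets_exclusion_requirement?(entry, exclude_paths)
  return true if exclude_paths.nil? || exclude_paths.empty?

  path = (entry["path_display"] || entry["path_lower"] || "").downcase
  !exclude_paths.any? { |excluded| path.include?(excluded.downcase) }
end

excludes = ["/Samples/", "/Processed/", "/Stems/"]

master = { "path_display" => "/Foreplay Project/Foreplay MASTER.wav" }
sample = { "path_display" => "/Foreplay/SAMPLES/Imported/vocals.wav" }

meets_exclusion_requirement?(master, excludes) # => true  (kept)
meets_exclusion_requirement?(sample, excludes) # => false (filtered out)
```

The `/SAMPLES/` folder is caught even though the exclusion list says `/Samples/`, because both sides are downcased before comparison.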

Problem 3: Massive Result Sets

Dropbox’s API returns results in batches. I needed cursor-based pagination that the agent could control.

Solution: Stateful cursor system

def execute(pattern:, limit: 100, path_type: nil, cursor: nil, min_size: nil, exclude_paths: nil)
  # Validate inputs
  return validation_error("pattern is required") if pattern.nil? || pattern.strip.empty?

  effective_limit = [[limit || 100, 1].max, 500].min
  effective_min_size = min_size&.positive? ? min_size : nil
  effective_exclude_paths = exclude_paths.is_a?(Array) ? exclude_paths : nil

  # Ensure organization has Dropbox integration
  integration = ensure_integration!

  # Get configured folder paths
  folder_paths = get_folder_paths(path_type)

  if folder_paths.empty?
    return info_message("No Dropbox folders configured for this organization. " \
      "Configure folders in Settings → Integrations → Dropbox.")
  end

  # Decode cursor if provided
  if cursor.present?
    begin
      search_state = decode_cursor(cursor)

      # Validate cursor matches current search
      if search_state[:pattern] != pattern || search_state[:path_type] != path_type
        return validation_error("Cursor does not match current search parameters " \
          "(pattern: #{search_state[:pattern]} vs #{pattern}, " \
          "path_type: #{search_state[:path_type]} vs #{path_type})")
      end
    rescue ArgumentError => e
      return validation_error(e.message)
    end
  else
    search_state = initialize_search_state(folder_paths, pattern, path_type)
  end

  # Search across configured folders, resuming from cursor position
  results = []
  folder_index = search_state[:folder_index]
  dropbox_cursor = search_state[:dropbox_cursor]

  while folder_index < folder_paths.size && results.size < effective_limit
    folder_path = folder_paths[folder_index]
    matches, next_cursor, completed = search_folder_paginated(
      integration,
      folder_path,
      pattern,
      effective_limit - results.size,
      dropbox_cursor,
      effective_min_size,
      effective_exclude_paths
    )

    results.concat(matches)

    if completed
      # Move to the next folder
      folder_index += 1
      dropbox_cursor = nil
    else
      # More results available in the current folder
      dropbox_cursor = next_cursor
      break
    end
  end

  # Create a continuation cursor if more results are available
  has_more = folder_index < folder_paths.size
  next_cursor = has_more ? encode_cursor(folder_index, dropbox_cursor, pattern, path_type) : nil

  # Format results
  format_results(results, pattern, effective_limit, next_cursor, has_more)
rescue DropboxServiceHelpers::IntegrationMissingError => e
  error("Dropbox integration not configured. #{e.message}")
rescue DropboxServiceHelpers::IntegrationDisconnectedError => e
  error("Dropbox integration disconnected. #{e.message}")
rescue Dropbox::Client::Error => e
  error("Dropbox API error: #{e.message}")
rescue StandardError => e
  error("Unexpected error: #{e.class.name} - #{e.message}")
end

The cursor is Base64-encoded JSON containing both Dropbox’s internal cursor and our wrapper state. This lets agents pause, do other work, and resume pagination without losing their place.
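The encode/decode helpers aren't shown above, but a minimal version looks something like this. Treat the details as my assumptions rather than the redacted implementation; the field names do match the state that `execute` reads back, and the cursor in the run log later in this post (`eyJmb2xkZXJfaW5kZXgiOjAs…`) decodes to exactly this kind of JSON payload:

```ruby
require "base64"
require "json"

# Sketch of the cursor helpers referenced in execute. The real
# implementation is redacted, so the specifics here are assumptions.
def encode_cursor(folder_index, dropbox_cursor, pattern, path_type)
  Base64.strict_encode64(JSON.generate(
    folder_index: folder_index,     # which configured folder we're in
    dropbox_cursor: dropbox_cursor, # Dropbox's own continuation cursor
    pattern: pattern,
    path_type: path_type
  ))
end

def decode_cursor(cursor)
  JSON.parse(Base64.strict_decode64(cursor), symbolize_names: true)
rescue JSON::ParserError, ArgumentError
  # ArgumentError is what execute's rescue clause turns into a validation_error
  raise ArgumentError, "Cursor is malformed or does not decode to valid JSON"
end

cursor = encode_cursor(0, "AAHb3...", "**/*.wav", nil)
decode_cursor(cursor)[:folder_index] # => 0
```

Embedding the pattern and path_type in the cursor is what makes the "cursor does not match current search parameters" validation in `execute` possible.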

The Virtual Filesystem: /swarm_workspace/

Here’s a subtle but critical design decision: agents need persistent storage for intermediate results, but I don’t want them cluttering my actual filesystem with temporary JSON files.

SwarmSDK implements a virtual filesystem backed by the database:

# SwarmWorkspace provides a virtual filesystem for agents to use for recordkeeping
#
# Agents can use familiar Read/Write/Edit operations on paths like "/swarm_workspace/file.json"
# and the data is transparently stored in organization-scoped database records instead of files.
#
# This enables:
# - Recordkeeping across paginated tool calls (e.g., storing masters found via DropboxGlob)
# - Persistent data that survives across swarm runs
# - Organization-scoped multi-tenant data isolation
# - No file management overhead
#
# Example workflow:
#   1. Agent: Write("/swarm_workspace/masters.json", initial_data)
#   2. Agent: Read("/swarm_workspace/masters.json")           # get current data
#   3. Agent: Edit("/swarm_workspace/masters.json", old, new) # append more results
#   4. Agent: Read("/swarm_workspace/masters.json")           # generate final report
#
# == Schema Information
#
# Table name: swarm_workspaces
#
#  id              :uuid     not null, primary key
#  agent_name      :string   not null
#  content         :text
#  file_path       :string   not null
#  file_type       :string   default("json")
#  created_at      :datetime not null
#  updated_at      :datetime not null
#  organization_id :uuid     not null
#
# Indexes
#
#  idx_workspace_files                        (organization_id,agent_name,file_path) UNIQUE
#  index_swarm_workspaces_on_organization_id  (organization_id)
#
# Foreign Keys
#
#  fk_rails_...  (organization_id => organizations.id)
#
class SwarmWorkspace < ApplicationRecord
  belongs_to :organization

  validates :agent_name, presence: true
  validates :file_path, presence: true, uniqueness: { scope: [:organization_id, :agent_name] }
  validates :file_type, inclusion: { in: %w[json csv md txt] }

  # Normalize file path to ensure consistency
  before_validation :normalize_file_path

  # Find or create a workspace file for an organization/agent
  #
  # @param organization [Organization] The organization
  # @param agent_name [String] The agent name
  # @param file_path [String] Virtual file path (e.g., "/swarm_workspace/masters.json")
  # @return [SwarmWorkspace] The workspace record
  def self.find_or_initialize_for(organization:, agent_name:, file_path:)
    normalized_path = normalize_path(file_path)
    file_type = File.extname(normalized_path)[1..] || "txt"

    find_or_initialize_by(
      organization: organization,
      agent_name: agent_name,
      file_path: normalized_path
    ) do |workspace|
      workspace.file_type = file_type
      workspace.content = ""
    end
  end

  # Get content with line numbers (cat -n format for compatibility with the Read tool)
  #
  # @return [String] Content with line numbers
  def content_with_line_numbers
    return "" if content.blank?

    lines = content.split("\n")
    lines.map.with_index(1) do |line, idx|
      "#{idx.to_s.rjust(6)}→#{line}"
    end.join("\n")
  end

  # Update content using an edit-style operation (find and replace)
  #
  # @param old_string [String] String to find
  # @param new_string [String] String to replace with
  # @param replace_all [Boolean] Replace all occurrences (default: false)
  # @return [Boolean] Success
  def edit_content(old_string:, new_string:, replace_all: false)
    return false if content.blank? && old_string.present?

    if replace_all
      self.content = content.gsub(old_string, new_string)
    else
      # Single replacement
      index = content.index(old_string)
      return false unless index

      self.content = content[0...index] + new_string + content[(index + old_string.length)..]
    end

    save
  end

  private

  def normalize_file_path
    self.file_path = self.class.normalize_path(file_path) if file_path.present?
  end

  # Remove leading /swarm_workspace/ if present, then ensure it's there
  def self.normalize_path(path)
    path = path.to_s.strip
    path = path.delete_prefix("/swarm_workspace/")
    path = path.delete_prefix("swarm_workspace/")
    "/swarm_workspace/#{path}"
  end
end

The agent can store paginated results, build up filtered lists, and maintain state across tool calls — all without leaving database records or temporary files to clean up later.
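One detail worth calling out: normalize_path makes the `/swarm_workspace/` prefix optional, so an agent can write `masters.json`, `swarm_workspace/masters.json`, or the full path and land on the same record. Its logic is pure string manipulation, extracted here as a plain method for a quick check:

```ruby
# Path normalization logic from SwarmWorkspace, extracted as a plain
# method so its behavior is easy to verify in isolation.
def normalize_path(path)
  path = path.to_s.strip
  path = path.delete_prefix("/swarm_workspace/")
  path = path.delete_prefix("swarm_workspace/")
  "/swarm_workspace/#{path}"
end

normalize_path("masters.json")                  # => "/swarm_workspace/masters.json"
normalize_path("swarm_workspace/masters.json")  # => "/swarm_workspace/masters.json"
normalize_path("/swarm_workspace/masters.json") # => "/swarm_workspace/masters.json"
```

Combined with the unique index on (organization_id, agent_name, file_path), this guarantees one record per logical file no matter how sloppily the model spells the path.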

The Permission System Challenge

Early in testing, I hit a mysterious error:

Permission denied: Cannot write to '/swarm_workspace/all_wavs_page1.json'

The Write tool wasn’t even being called. Turns out SwarmSDK has a permission validator that intercepts all tool calls:

# lib/swarm_sdk/permissions/validator.rb
class Validator < SimpleDelegator
  def execute(**params)
    # Validate paths BEFORE calling the actual tool
    validated_params = validate_params(params)
    super(**validated_params)
  end
end

The fix was adding /swarm_workspace/**/* to allowed paths:

tools:
  - Write:
      allowed_paths: ["**/*", "/swarm_workspace/**/*"]

This pattern — permission checking at the framework level — is exactly right for production AI systems. You want defense in depth, not just trusting the LLM to follow instructions.
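SwarmSDK's validator internals aren't shown here, but glob-style allowed_paths checks like these can be expressed with Ruby's built-in File.fnmatch. This is my illustration of the pattern, not SwarmSDK's actual code (note that with FNM_PATHNAME, `*` does not cross `/`, which is why both a flat and a recursive pattern appear in the allow list):

```ruby
# Minimal illustration of glob-based path permission checking, in the
# spirit of SwarmSDK's validator (not its actual implementation).
def path_allowed?(path, allowed_patterns)
  allowed_patterns.any? do |pattern|
    # FNM_PATHNAME: '*' stops at '/', so '**/*' is needed for nesting
    File.fnmatch(pattern, path, File::FNM_PATHNAME)
  end
end

allowed = ["/swarm_workspace/*", "/swarm_workspace/**/*"]

path_allowed?("/swarm_workspace/all_wavs_page1.json", allowed) # => true
path_allowed?("/etc/passwd", allowed)                          # => false
```

Running the check in code before dispatching the tool call means a confused (or adversarial) model simply cannot write outside the sandbox, whatever its prompt says.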

The Prompt: Teaching an Agent to Filter Stems

Even with tool-level filtering, the agent still needed to understand what a “final master” actually is. Here’s the actual prompt:

Use the DropboxGlob tool to search our Dropbox and find _all_ of the
WAV files that are FINAL MASTERS ONLY.

CRITICAL: What is a "final master"?
A final master is a complete stereo mixdown of a finished song. It is NOT:
- Individual track stems or instrument tracks
- Premasters or pre-masters (intermediate versions)
- Test bounces, samples, or project files
- DJ mixes, podcasts, or non-song content

STEP 1: GATHER ALL CANDIDATE FILES
- Use pattern "**/*.wav" with min_size: 10485760 (10MB) and
exclude_paths: ["/Samples/", "/Processed/", "/Stems/"] to
find all WAV files recursively with DropboxGlob
- Use pagination to get all results (DropboxGlob returns 100 at
a time with a cursor)
- Store all results in /swarm_workspace/all_wavs.json as you paginate
- The min_size parameter filters out small sample files automatically
(song masters are at least 2 minutes long)
- The exclude_paths parameter filters out Ableton project sample
directories, processed files, and stem folders (case-insensitive)

STEP 2: FILTER TO FINAL MASTERS ONLY
Read /swarm_workspace/all_wavs.json and apply these EXCLUSION rules in order:

EXCLUDE if filename contains "premaster" or "pre-master" anywhere
Examples to EXCLUDE:
- "Song (premaster 24-bit 44.1).wav" ✗
- "Song [Summed 24bit 44.1 Premaster].wav" ✗
- "Song Pre-Master.wav" ✗

EXCLUDE if filename matches pattern "MASTER [NUMBER]" (these are stems)
Examples to EXCLUDE:
- "Song MASTER 27 FX.wav" ✗ (numbered stem)
- "Song MASTER 35 PADS.wav" ✗ (numbered stem)
- "Song MASTER 2 KICK.wav" ✗ (numbered stem)

EXCLUDE if filename contains instrument/track identifiers after "MASTER"
Common identifiers: KICK, BASS, SNARE, PERC, PADS, FX, DRUMS, SYNTH
Examples to EXCLUDE:
- "Song MASTER KICK.wav" ✗ (kick stem)
- "Song MASTER BASS GROUP.wav" ✗ (bass stem)
- "Song MASTER PERC GROUP.wav" ✗ (percussion stem)
- "Song MASTER SUB PAD.wav" ✗ (pad stem)

INCLUDE if filename is simple and clean:
Examples to INCLUDE:
- "Song Title MASTER.wav" ✓ (complete master)
- "Song Title (Original Mix) MASTER.wav" ✓ (complete master)
- "Song Title [24 Bit Wired Master].wav" ✓ (complete master)
- "Artist – Song Title (Master).wav" ✓ (complete master)

STEP 3: HANDLE DUPLICATES
If multiple versions of the same song exist, keep only the one
with the latest timestamp.

STEP 4: GENERATE FINAL REPORT
Provide:
- Total number of FINAL MASTER files found (after filtering)
- Total number of files excluded as stems/premasters
- Range of file sizes and dates
- Observations about folder organization
- A comprehensive markdown table with columns:
File Name, File Path, File Size, File Date

IMPORTANT REMINDERS:
- Use DropboxGlob to search Dropbox, not the local filesystem Glob tool
- Be CONSERVATIVE: When in doubt about whether a file is a stem, EXCLUDE it
- Store intermediate results in /swarm_workspace/ to avoid losing
context during pagination
- Apply ALL exclusion rules carefully - a file must pass all checks
to be included

This prompt does several things:

  1. Defines the domain concept clearly (what IS a final master?)
  2. Delegates heavy lifting to the tool (filtering by size and path)
  3. Provides concrete examples (both positive and negative)
  4. Structures the work (step-by-step with intermediate storage)
  5. Conservative by default (when in doubt, exclude)

The key insight: by handling size/path filtering at the tool level, we reduce the cognitive load on the model. It can focus on semantic understanding (is this a stem?) rather than processing thousands of file paths.
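If you wanted to enforce the prompt's exclusion rules deterministically — say, to audit what the model kept — they translate naturally to Ruby. This is my own sketch of those rules (the instrument list comes straight from the prompt), not code from the swarm:

```ruby
# Deterministic restatement of the prompt's exclusion rules, useful for
# auditing the model's output. My own sketch, not part of the swarm.
STEM_WORDS = /\b(KICK|BASS|SNARE|PERC|PADS?|FX|DRUMS|SYNTH)\b/i

def final_master?(filename)
  return false if filename.match?(/pre-?master/i)  # premasters / pre-masters
  return false if filename.match?(/MASTER\s+\d+/i) # "MASTER 27 FX" numbered stems

  # Only the text AFTER "MASTER" is checked for instrument names, so a
  # song called "Bass Odyssey MASTER.wav" is not a false positive.
  after_master = filename[/MASTER\b(.*)\z/i, 1]
  return false if after_master&.match?(STEM_WORDS) # "MASTER SUB PAD" stems

  true
end

final_master?("Foreplay VLIND MASTER.wav")          # => true
final_master?("Foreplay VLIND MASTER 27 FX.wav")    # => false
final_master?("Foreplay VLIND MASTER SUB PAD.wav")  # => false
final_master?("Song (premaster 24-bit 44.1).wav")   # => false
```

In practice the LLM handles cases a regex can't (misspellings, unusual naming schemes), which is exactly why the semantic pass is left to the model while size and path filtering stay in code.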

Running the Swarm: What Actually Happens

Here’s an excerpt from the test run with full logging enabled:

[manager] Starting...
[manager] Using tool: DropboxGlob
Parameters: pattern: **/*.wav, min_size: 10485760, exclude_paths: ["/Samples/", "/Processed/", "/Stems/"], limit: 100

[manager] Tool result: Found 100+ file(s) matching '**/*.wav':
📄 /Music Production/2023 Releases/Euphoria MASTER.wav
Size: 45.2 MB, Modified: 2023-08-15T14:23:11Z
📄 /Music Production/2023 Releases/Midnight Drive (Original Mix) MASTER.wav
Size: 52.1 MB, Modified: 2023-09-02T18:45:33Z
...
MORE RESULTS AVAILABLE
To get the next page, call DropboxGlob with this exact cursor:
CURSOR: eyJmb2xkZXJfaW5kZXgiOjAsImRyb3Bib3hfY3Vyc29yIjoiQUFIYjN...
[manager] Using tool: Write
Parameters: path: /swarm_workspace/all_wavs_page1.json, content: [{"name":"Euphoria MASTER.wav",...}]
[manager] Using tool: DropboxGlob
Parameters: pattern: **/*.wav, min_size: 10485760, exclude_paths: [...], cursor: eyJmb2xkZXJfaW5kZXgiOjAsImRyb3Bib3hfY3Vyc29yIjoiQUFIYjN...
[manager] Tool result: Found 100+ file(s) matching '**/*.wav':
...

The agent is:

  1. Making paginated DropboxGlob calls with filtering parameters
  2. Storing results to virtual filesystem
  3. Continuing pagination with cursors
  4. Eventually reading back all results and filtering by semantic rules
  5. Generating the final report

All of this happens autonomously. I just kick it off, and it generates the report.
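Stripped of the LLM, the control flow the agent is reproducing is an ordinary paginate-and-store loop. Here's a self-contained Ruby sketch of it; `FakeGlobTool` and the hash-backed `store` are stand-ins for the real tool (which returns an opaque Base64 cursor, not an index) and the virtual workspace:

```ruby
# A fake paginated tool so the loop below can run standalone; the real
# DropboxGlob returns an opaque Base64 cursor rather than a page index.
class FakeGlobTool
  def initialize(pages)
    @pages = pages
  end

  def execute(cursor: nil, **_params)
    i = cursor || 0
    { files: @pages[i], next_cursor: (i + 1 < @pages.size ? i + 1 : nil) }
  end
end

# Sketch of the paginate-and-store loop the agent performs via tool
# calls; `store` stands in for the /swarm_workspace/ virtual filesystem.
def collect_all_wavs(tool, store)
  cursor = nil
  page = 0
  loop do
    result = tool.execute(
      pattern: "**/*.wav",
      min_size: 10_485_760,
      exclude_paths: ["/Samples/", "/Processed/", "/Stems/"],
      cursor: cursor
    )
    page += 1
    # persist each page so context-window limits can't lose data
    store["/swarm_workspace/all_wavs_page#{page}.json"] = result[:files]
    cursor = result[:next_cursor]
    break unless cursor
  end
  store
end

tool = FakeGlobTool.new([["a.wav", "b.wav"], ["c.wav"]])
collect_all_wavs(tool, {}).keys
# => ["/swarm_workspace/all_wavs_page1.json", "/swarm_workspace/all_wavs_page2.json"]
```

The difference, of course, is that the model decides when and how to run this loop from the prompt alone, and can recover when a step fails.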


Cost

Developing this system from scratch using Claude Code, from my first questions about how to use SwarmSDK to the final report, took about 3 hours of my time. The final run that generated the report shown above cost less than 50 cents in tokens. That's cheap for experimentation or a one-off job, but before dropping something like this into a production system you'd want further optimization, and an honest answer to whether the approach is viable at all.

[Screenshot: the last nine calls logged when running this prompt]

What I’m doing here is part of a larger project, so it makes sense to me. Your mileage will definitely vary.

Results and Lessons Learned

After iterating on the tools and prompts, the swarm successfully:

  • ✅ Found all final masters from several years of production work
  • ✅ Filtered out thousands of stems and sample files
  • ✅ Handled pagination across multiple Dropbox folders
  • ✅ Generated a clean markdown table for review

Key Lessons

1. Tool-level filtering beats prompt engineering
Don’t ask the LLM to filter thousands of files by size when you can do it in normal Ruby code. Save model intelligence for semantic decisions.

2. Virtual filesystems enable complex workflows
Giving agents persistent storage (without filesystem pollution) lets them handle multi-step tasks that exceed context windows.

3. Cursor-based pagination is essential for real-world data
You can’t process thousands of files in one shot. Design tools with pagination from day one.

4. Multi-model support requires flexibility
Different models serialize parameters differently. Make your tools robust to these variations.

5. Permissions should be enforced at the framework level
Don’t rely on the LLM to respect path restrictions. Validate before execution.

6. Conservative filtering prevents false positives
When organizing creative work, it’s better to miss a few files than pollute results with noise.
