I bet you receive a ton of unwanted spam email. While each email includes an "unsubscribe" link, most people don't have time to unsubscribe from every mailing list one-by-one.
Sadly, it's nontrivial to automate the unsubscribing process, even though email clients try to make it easier. Every email is formatted differently, so it's not obvious how to find unsubscribe links automatically. On top of this, every unsubscribe page is its own fresh hell to navigate.
To address this problem, I built an AI agent to unsubscribe from spam. The agent works in two steps: first, it goes through your email inbox and finds unsubscribe links; next, it drives a web browser to interact with the corresponding unsubscribe pages. While the agent isn't perfect, it did significantly reduce the amount of spam I receive. It's also pretty cool to watch in action! The code can be found on GitHub.
Finding Unsubscribe Links
The first step of unsubscribing from spam is finding the unsubscribe links in emails. For this part of the agent, I used the Gmail API to list emails in the user's inbox. For each email, we get a subject, a body "snippet", and a raw HTML (or plaintext) dump of the email.
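For reference, here's roughly what that listing step looks like with the google-api-python-client (a minimal sketch; credential setup is omitted, and the real code may structure this differently):

```python
# Minimal sketch of listing inbox emails via the Gmail API.
from googleapiclient.discovery import build

def iter_inbox(creds, max_results=50):
    service = build("gmail", "v1", credentials=creds)
    resp = service.users().messages().list(
        userId="me", labelIds=["INBOX"], maxResults=max_results).execute()
    for meta in resp.get("messages", []):
        msg = service.users().messages().get(
            userId="me", id=meta["id"], format="full").execute()
        headers = {h["name"]: h["value"] for h in msg["payload"]["headers"]}
        # The HTML (or plaintext) body lives in the possibly nested payload
        # parts, base64url-encoded; decoding is left out for brevity.
        yield headers.get("Subject", ""), msg.get("snippet", ""), msg["payload"]
```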
To find the unsubscribe link, my program first makes an API call where the HTML links in the email are presented in a numbered list:
1. Great deals https://sometime.com/deals?q=aoenutsohaut
2. Unsubscribe https://somesite.com/unsubscribe
...

We prompt the model to produce the index that looks the most like an unsubscribe link, or -1 if none is found. The exact developer prompt is:
Developer Prompt 💻📝
Out of these links, choose the one that looks like an unsubscribe link. End your response with "Answer: N" where N is a link number, or -1 if none of the links look like an unsubscribe link. If more than one looks like an unsubscribe link, simply pick one of them arbitrarily.

This makes the model's job easy: it just has to pick a link out of a list. The call is also relatively cheap, since it doesn't include the entire text of the email.
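Concretely, this first pass might look something like the following (BeautifulSoup and the model name are my assumptions, not necessarily what the project uses):

```python
# Sketch: list an email's links, ask the model for an index, parse "Answer: N".
import re
from bs4 import BeautifulSoup
from openai import OpenAI

client = OpenAI()

LINK_PROMPT = (
    "Out of these links, choose the one that looks like an unsubscribe link. "
    'End your response with "Answer: N" where N is a link number, '
    "or -1 if none of the links look like an unsubscribe link."
)

def pick_unsubscribe_link(email_html: str) -> str | None:
    links = [(a.get_text(" ", strip=True), a["href"])
             for a in BeautifulSoup(email_html, "html.parser").find_all("a", href=True)]
    listing = "\n".join(f"{i + 1}. {text} {href}"
                        for i, (text, href) in enumerate(links))
    reply = client.chat.completions.create(
        model="gpt-4o",  # model choice is an assumption
        messages=[{"role": "system", "content": LINK_PROMPT},
                  {"role": "user", "content": listing}],
    )
    m = re.search(r"Answer:\s*(-?\d+)", reply.choices[0].message.content)
    n = int(m.group(1)) if m else -1
    return links[n - 1][1] if 1 <= n <= len(links) else None
```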
However, this technique is not very reliable. For example, some spam emails are formatted like "If you want to unsubscribe from future emails, click here. If you want MORE spam, click here". So there are multiple links, all with identical text.
In the above example, the HTML <a> tags themselves don't contain enough information to identify the unsubscribe link, and the above API call usually (correctly) returns -1.
To handle unusual unsubscribe link situations, I fall back on a brute-force approach: feeding the entire HTML source code to a model. In practice, I don't want to exhaust context limits, so I feed the email source code chunk-by-chunk. I slightly modify the source code to add an extra data-index attribute to each <a> tag, so that the model still just has to pick a link index (rather than regurgitating a huge, messy URL). The developer prompt here is:
Developer Prompt 💻📝
Each link on this page has a data-index attribute with an integer value. Find the link that looks like an Unsubscribe link (this is an email's source code), and end your response with a line like Answer: N where N is the index. There may be no unsubscribe link, and the code may be truncated in such a way that the link is missing or hard to determine. In that case, output Answer: -1. You may think out loud before giving your answer, but give the answer on a new line in the above format.

I call the API for every chunk and take the first chunk that yields an answer other than -1. This works surprisingly well. To speed things up by skipping chunks full of dumb stuff like CSS, I sort the chunks so that any chunk containing the word "unsubscribe" comes first.
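Here's a sketch of how that tagging and chunk ordering could work (the chunk size and function names are my own, not from the project):

```python
import re
from itertools import count

def tag_links_with_index(html: str) -> str:
    # Rewrite each <a ...> opening tag to carry a data-index="N" attribute so
    # the model can answer with a small integer instead of a messy URL.
    counter = count()
    return re.sub(r"<a\b", lambda m: f'<a data-index="{next(counter)}"', html)

def chunks_unsubscribe_first(html: str, size: int = 16000) -> list[str]:
    # Split the source into fixed-size chunks, then order chunks mentioning
    # "unsubscribe" first so we can usually skip the ones full of CSS.
    parts = [html[i:i + size] for i in range(0, len(html), size)]
    return sorted(parts, key=lambda c: "unsubscribe" not in c.lower())
```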
What's at an unsubscribe link?
Finding the unsubscribe link is the easy part. Once you follow the link, you may be met with any number of strange websites that ask you to do extra work to actually unsubscribe. For example, consider this page from PECO:
[Screenshot: PECO's first unsubscribe page]
On this page, it does say we've been unsubscribed from "PECO Marketing Promotions emails". However, we are still subscribed to other kinds of spam, and we need to click the big blue button to fix this. Once we click this button, we see yet another page:
[Screenshot: PECO's second unsubscribe page, with a list of checkboxes]
On this page, we either have to uncheck all of the boxes, or check the small "Unsubscribe from all.." box at the bottom, and then hit "Accept >>".
Every company has a different flow for something like this, and the actual structure of the webpage varies greatly as well. Clearly, humans can navigate these things (mostly) fine, but it would be difficult to hardcode rules that can handle every company's unsubscribe flow robustly. By the way, this is what they are counting on!
Navigating Unsubscribe Pages
To automate the unsubscribing process, I use AI to directly control a web browser. This way, the AI can see what a webpage requires it to do, and it has a lot of flexibility without any human intervention.
The AI interacts with the browser in a loop. First, the program navigates to the unsubscribe page and sends a screenshot to the agent. The agent can "think out loud" as much as it wants, but then it outputs a block of JavaScript code. This code is then run on the webpage. If the appearance of the page changes at all, a new screenshot is sent to the agent. Additionally, the output of the previous JavaScript code is sent to the agent. The loop runs like this until the agent chooses to call a success() or failure() function to indicate that the task is done or cannot be completed.
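Here's a minimal sketch of that loop, assuming Playwright drives the browser and the OpenAI chat API is the model; the model choice, the naive screenshot-diff check, and the omitted helper injection are placeholders rather than the project's actual code:

````python
import base64
import re
from openai import OpenAI
from playwright.sync_api import sync_playwright

client = OpenAI()

def run_agent(url: str, developer_prompt: str, max_steps: int = 10) -> bool:
    # Drive the page until the model emits success()/failure() or we hit the
    # step limit. Injecting print()/clickText()/scrollDown() is omitted here.
    with sync_playwright() as p:
        page = p.chromium.launch(headless=True).new_page()
        page.goto(url)
        messages = [{"role": "system", "content": developer_prompt}]
        prev_shot, js_output = None, ""
        for _ in range(max_steps):
            parts = [{"type": "text", "text": f"Code output:\n{js_output}"}]
            shot = page.screenshot()
            if shot != prev_shot:  # only attach a screenshot when the page changed
                b64 = base64.b64encode(shot).decode()
                parts.append({"type": "image_url",
                              "image_url": {"url": f"data:image/png;base64,{b64}"}})
                prev_shot = shot
            messages.append({"role": "user", "content": parts})
            reply = client.chat.completions.create(model="gpt-4o", messages=messages)
            text = reply.choices[0].message.content
            messages.append({"role": "assistant", "content": text})
            # The agent is prompted to end with a code block; extract it.
            m = re.search(r"`{3}(?:javascript|js)?\s*(.*?)`{3}", text, re.S)
            code = m.group(1) if m else ""
            if "success()" in code:
                return True
            if "failure()" in code:
                return False
            js_output = str(page.evaluate(f"() => {{ {code} }}"))
        return False
````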
I do various things to make the agent's job easier. For example, I provide it with a scrollDown() and clickText() function. I also augment the first message with information about HTML tags on the page, as well as a model-generated summary of the HTML content of the page. The latter requires an extra (chunked) API call, and in my tests it actually didn't seem to help that much.
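To give a flavor of these helpers, here's a rough guess at how they might be injected (not the actual implementation):

```python
# Hypothetical versions of the page helpers, registered so they exist before
# any page script runs. The real helpers may match elements more cleverly.
HELPERS_JS = """
window._prints = [];
window.print = (x) => window._prints.push(String(x));  // shadow the print dialog
window.scrollDown = () => window.scrollBy(0, window.innerHeight);
window.clickText = (text) => {
  const nodes = [...document.querySelectorAll('a, button, input, label, span')];
  const hits = nodes.filter(n => (n.innerText || n.value || '').includes(text));
  hits.forEach(n => n.click());
  return hits.length > 0;
};
"""
page.add_init_script(HELPERS_JS)
```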
The developer prompt for the unsubscribing agent is as follows:
Developer Prompt 💻📝
Your goal is to figure out how to run JavaScript on the page to make sure the user is unsubscribed from this source of spam and any other sources that this vendor might send.
* After every message you send, I will give you a new screenshot of the page (if it has changed), and any output from print() calls.
* You may think out loud in your response, but end the response with a code block to execute on the page.
* If the page already says that the user has been unsubscribed from ALL emails, then output the code success().
* If the page does not seem to have anything to do with unsubscribing, or you do not know what to do, then output the code failure().
* Do not output success() prematurely; make sure you see the latest page first.
* DO NOT attempt to submit forms (e.g. by pressing buttons) until you have VISUALLY CONFIRMED that you have checked the correct boxes or typed the correct text.
* To get more information from the page, you can use the provided print() function, which will call toString() on its argument. I will send the outputs of all prints in the next message to allow iteration.
* The output of print() will be truncated, so the page might have too much code to print directly in one call.
* I have provided an extra clickText() function which finds elements that contain the given text and clicks them all. It returns true if a click was performed, false otherwise.
* I have provided an extra scrollDown() function which scrolls down the page further (if it's a long page) so that you can see more content.
* The user's email address is: {user_email}

One neat thing is that we can cleanly visualize what the agent is doing by looking at a chat transcript. It's amazing to watch the agent interact with webpages like this.
Chat transcript 🧵

Sometimes, it seems very smart. Other times, it seems wildly stupid. Given this, how can we measure how successful it actually is?
Testing with simulations
I created a suite of "simulations" that test the unsubscribe agent on various realistic webpages. At first, I used GPT-5 to code a few unsubscribe websites from scratch. Then, after trying the agent on some of my actual spam emails, I added some more realistic (and difficult) simulations based on these companies' websites.
To test my agent, I run four trials on each of the simulations. I then report the success rate, as well as true/false positives/negatives:
simple_1: success_rate=4/4 (tn=0 fp=0 fn=0)
click_to_unsub: success_rate=4/4 (tn=0 fp=0 fn=0)
enter_email: success_rate=4/4 (tn=0 fp=0 fn=0)
bryant_park: success_rate=3/4 (tn=0 fp=1 fn=0)
goldbelly: success_rate=4/4 (tn=0 fp=0 fn=0)
honeywell: success_rate=2/4 (tn=0 fp=2 fn=0)
peco: success_rate=2/4 (tn=0 fp=2 fn=0)
fandango: success_rate=4/4 (tn=0 fp=0 fn=0)

The first three of these are my own simple tests. The rest are based on real spam emails I received.
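For context, here's a rough sketch of how such a harness could tally these outcomes (the simulation interface shown here is hypothetical, and run_agent is the loop sketched earlier):

```python
def evaluate(simulations: dict, trials: int = 4) -> None:
    for name, sim in simulations.items():
        tp = tn = fp = fn = 0
        for _ in range(trials):
            sim.reset()  # assumed: restore the fake site's subscribed state
            claimed = run_agent(sim.url, DEVELOPER_PROMPT)
            actual = sim.actually_unsubscribed()  # assumed ground-truth check
            if claimed and actual:
                tp += 1  # true success
            elif claimed and not actual:
                fp += 1  # claimed success but the user is still subscribed
            elif not claimed and actual:
                fn += 1  # gave up even though it actually unsubscribed
            else:
                tn += 1  # correctly reported failure
        print(f"{name}: success_rate={tp}/{trials} (tn={tn} fp={fp} fn={fn})")
```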
One nice thing about these tests is that we can easily measure the effect of our prompting strategy. For example, if I remove any mention of the clickText() helper from the prompt, success rate drops for a few of the simulations:
simple_1: success_rate=4/4 (tn=0 fp=0 fn=0)
click_to_unsub: success_rate=4/4 (tn=0 fp=0 fn=0)
enter_email: success_rate=4/4 (tn=0 fp=0 fn=0)
bryant_park: success_rate=4/4 (tn=0 fp=0 fn=0)
goldbelly: success_rate=0/4 (tn=1 fp=3 fn=0)
honeywell: success_rate=0/4 (tn=0 fp=4 fn=0)
peco: success_rate=1/4 (tn=0 fp=3 fn=0)
fandango: success_rate=4/4 (tn=0 fp=0 fn=0)

Testing on my actual email
On my own email account, the model found unsubscribe links for 38 unique domains. Of those, 31 yielded a "success" status from the agent, 6 yielded a "failure" status, and 1 ran for more than 10 interaction steps (a limit I hardcoded). Even though the success rate looks high, the agent sometimes ran into false positives, where it claimed to have succeeded but actually left me subscribed to some spam.
The PECO simulation provides one example of a false positive. This page is particularly difficult for the model because its labels sit next to checkboxes without referencing them via a for attribute. As a result, you can't just click the text; you have to click the checkbox itself. Thus the clickText() function I gave the model doesn't help at all.
Chat transcript 🧵

In this example, the model explicitly checks a checkbox by setting its value, which prevents the webpage from running its own onclick handler to uncheck all the other checkboxes. The HTML summary given to the agent includes some information about the page's JavaScript handling of this checkbox, but the agent seemingly ignores it some of the time.
Sometimes, the model does do better on this particular website (it has about a 50% success rate). Here's a case where it calls click() on the checkbox and therefore actually triggers the JavaScript.
Chat transcript 🧵

Conclusion
My main takeaway from this project is that you can do really cool and creative things by stitching together LLM API calls. However, it feels more like alchemy than engineering, and it's pretty hard to get the resulting system to be completely robust.
There are various "computer-using agent" implementations out there now, many of which support more advanced controls than just running JavaScript on webpages. In theory, it should be possible to test these agents on my small suite of unsubscribe flows to see how they fare. This could be an interesting thing for somebody to try.
One thing I didn't try for this project is the official function-calling interface provided by the OpenAI API. Perhaps it would be more in-distribution for the model to run JavaScript via a tool call than via a code block at the end of each message. This is another potential area for improvement.
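For the curious, a tool-call version might look roughly like this (the schema and tool name are illustrative, reusing the client and messages from the earlier sketch):

```python
import json

# Expose a run_js tool instead of parsing trailing code blocks.
tools = [{
    "type": "function",
    "function": {
        "name": "run_js",
        "description": "Run JavaScript on the current page and return its output.",
        "parameters": {
            "type": "object",
            "properties": {"code": {"type": "string"}},
            "required": ["code"],
        },
    },
}]
reply = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
for call in reply.choices[0].message.tool_calls or []:
    if call.function.name == "run_js":
        code = json.loads(call.function.arguments)["code"]
```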
Finally, I'll point out that if something like this starts becoming popular, then unsubscribe webpages will start embedding prompt injections. I can't wait to witness this dystopian future.