Agents, APIs, and Advertising: Lessons from Engineering Our MCP Server


Photo by Steve Johnson on Unsplash

MCP, the Model Context Protocol, has made quite a bit of noise in the past year, as it promises a simpler, user-driven integration of tools into Large Language Models (LLMs). At Criteo, we believe that MCP will be key to empowering our clients and giving them the control to create and manage advertising campaigns in a new way:

If you have ever wished you could simply ask your LLM “Please create a Criteo full-funnel campaign based on the latest brief I received from the marketing team”, that may soon be the case.

ℹ️ Criteo and advertising context: a full-funnel campaign is an advertising strategy that targets users with relevant products at all stages of their shopping journey, from discovery to point of purchase.

While making MCP core to our internal services layer, we are creating a public MCP server to empower our clients and partners to access and manage their Criteo data through their LLM of choice. To build this server, we are leveraging our existing public API. In this article, we will share how MCP works, our approach to building a server on top of our API, the challenges we face, and how we are testing it.

Interested in testing it? You’ll find a survey link to share your thoughts and join our closed beta waitlist at the end of this article.

MCP 101

But what is MCP? The Model Context Protocol is an open protocol created by Anthropic that standardizes how applications provide context to LLMs. Put very simply, it’s a generic way to expose external programs to an LLM.

ℹ️ There is more to MCP than just tools, but they are the most widely supported primitive at the moment (cf. the feature support matrix), so we will focus on them in this article.

Tools and function calling

Tools, or function calling, have been supported by LLMs for a long time. For example, OpenAI’s Responses API can take an optional tools field to provide the LLM with the definition of the tools it could call.


Declaring a function (tool) in OpenAI’s API

LLMs have been trained to understand those tools, using a JSON Schema as the description of their inputs, and to return specially formatted responses when they think calling a tool would help answer the current user’s prompt.
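As a rough illustration, here is what such a declaration can look like. This is a sketch only: the campaign-lookup tool is hypothetical, and the field names approximate the Responses API’s function-tool format.

const tools = [
  {
    type: "function",
    name: "get_campaign_status",
    description: "Returns the current status of an advertising campaign, given its id.",
    parameters: {
      type: "object",
      properties: {
        campaignId: { type: "string", description: "Unique identifier of the campaign" },
      },
      required: ["campaignId"],
    },
  },
];

The parameters field is plain JSON Schema, which is exactly what MCP reuses to describe tool inputs.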

A generic way to expose tools

But until this point, implementing a tool was a custom job. What MCP promises is the ability to plug and play tools into LLMs in a generic way.

Those tools could run locally on your machine (ex: accessing local files, running commands, …) or call remote services (ex: checking your next flight details from your airline’s server). To address those very different use cases, MCP supports different transports: local (STDIO) or remote (streamable HTTP).

In MCP, we talk about servers and clients: the server exposes a set of tools to the client, and the client can ask the server to execute those tools, usually when the LLM asks for it. In most cases the server is a dedicated piece of software, while the client is often a pre-existing agentic framework or chat application that has simply been updated to support MCP. For example, Anthropic’s Claude chat app and Mistral’s Le Chat are both LLM chat applications and MCP clients.
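To make the server side concrete, here is a minimal sketch using the official TypeScript SDK (our own server is built on the C# SDK, and the banking tool below is made up for illustration):

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Declare the server and one (hypothetical) tool.
const server = new McpServer({ name: "banking-demo", version: "0.1.0" });

server.tool(
  "get_account_balance",
  "Returns the current balance of the user's account.",
  { accountId: z.string().describe("Identifier of the account") },
  async ({ accountId }) => ({
    content: [{ type: "text", text: `Balance for ${accountId}: 1,234.56 EUR` }],
  })
);

// STDIO transport: the client launches this process locally.
// A remote server would use the streamable HTTP transport instead.
await server.connect(new StdioServerTransport());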

An example

To clarify things a bit more, here is an example of a (mock) “Banking MCP server” plugged into Claude chat:


MCP in action

And this is what happens under the hood:


Sequence diagram of an interaction with a banking MCP server
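Boiled down, the exchange is a series of JSON-RPC messages between client and server. A simplified sketch (the banking tool and values are made up, and the initialization handshake is omitted):

// The client first asks the server which tools it offers.
const listToolsRequest = { jsonrpc: "2.0", id: 1, method: "tools/list" };

// When the LLM decides a tool is needed, the client calls it...
const callToolRequest = {
  jsonrpc: "2.0",
  id: 2,
  method: "tools/call",
  params: { name: "get_account_balance", arguments: { accountId: "FR-123" } },
};

// ...and the server's result is fed back to the LLM as context for its answer.
const callToolResult = {
  jsonrpc: "2.0",
  id: 2,
  result: { content: [{ type: "text", text: "Balance for FR-123: 1,234.56 EUR" }] },
};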

In summary, MCP allows a user to interact directly with external services through the LLM of their choice. Even better, it lets them mix and match services from different providers. If you already use LLMs in your day-to-day, MCP enables an uninterrupted flow.

Use cases in the AdTech Industry

Now that we’ve discussed how MCP works, let’s consider why MCP can change the way we interact with advertising platforms today.

Let’s take ad campaign creation as an example. This operation, while common, requires following a complex, domain-specific process that depends on each client’s specifications, and it often requires dedicated staff on both the client’s and Criteo’s sides. LLMs, however, shine at understanding general domain knowledge and adjusting to processes dynamically. So, given the proper tools, an LLM could help abstract this complexity, giving more control back to the client.

Campaign creation is just one use case. A properly equipped LLM could help with many more use cases. For example:

  • What was my best performing campaign in Germany last quarter?
  • What’s the forecast for my display placements for December, and what campaign settings are recommended?
  • What products were best performing on my listing page and still in stock in my catalog to activate?
  • How should I optimize my lower funnel ad set to spend my budget in full for my FR advertiser?
  • I want to use my demand aggregator’s agent to help me set up and launch my first Criteo campaign.

Some of those use cases are very specific, and we might think that each one would need its own dedicated tool. But LLMs are good at breaking a rich question down into simpler steps, allowing them to reuse and compose simpler tools. So a good set of “simple” tools can empower an LLM to cover a vast number of use cases.

MCP and REST APIs

If you have ever worked with a REST API (and used a Swagger or an OpenAPI specification), you may have noticed that an HTTP-based MCP server looks quite similar to a REST API:

  • Both use HTTP
  • Both expose a set of “tools” (or “operations”, or “endpoints”)
  • Both leverage JSON Schemas to describe the tools’ parameters
  • And since MCP version 2025-06-18, both use JSON Schema to describe the tools’ responses

So for services that already have a REST API, it seems logical to try to somehow “repackage” it as a remote MCP server.

And, indeed, building some form of an “MCP OpenAPI proxy” is doable. To start small, we have selected only a few endpoints and built a small translation server (in part reusing an early open-source prototype, itself built on top of the official MCP C# SDK). It takes our OpenAPI specification and extracts the endpoints’ descriptions alongside their input and output schemas. The task was made easy by reusing our API versioning system, segregating the selected endpoints into a dedicated “MCP” version (you can learn more about how we built Criteo’s API in this article by Scott McCord).
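In spirit, the translation looks like the sketch below, written with the TypeScript SDK for illustration rather than our actual C# implementation; loadOpenApiSpec, buildInputSchema, and callRestEndpoint are hypothetical helpers:

import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { ListToolsRequestSchema, CallToolRequestSchema } from "@modelcontextprotocol/sdk/types.js";

// Parse the OpenAPI document for the dedicated "MCP" API version (hypothetical helper).
const spec = await loadOpenApiSpec();

// Flatten every selected operation into an MCP tool definition.
const tools = Object.entries(spec.paths).flatMap(([path, methods]) =>
  Object.entries(methods).map(([method, op]) => ({
    name: op.operationId,                       // becomes the tool name
    description: op.description ?? op.summary,  // what the LLM will read
    inputSchema: buildInputSchema(op),          // hypothetical: merges parameters and request body schemas
    _route: { method, path },
  }))
);

const server = new Server(
  { name: "openapi-mcp-proxy", version: "0.1.0" },
  { capabilities: { tools: {} } }
);

// Advertise the tools, without the internal routing information.
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: tools.map(({ _route, ...tool }) => tool),
}));

// Execute a tool by forwarding the call to the underlying REST endpoint (hypothetical helper).
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const tool = tools.find((t) => t.name === request.params.name)!;
  const body = await callRestEndpoint(tool._route, request.params.arguments);
  return { content: [{ type: "text", text: JSON.stringify(body) }] };
});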

While exposing tools is straightforward, doing so in a way that an LLM can use them reliably is the real challenge.

Programmatic consumer vs LLM consumer

Ideally, a REST API endpoint should be self-descriptive and self-documented. But in real life, few are.

  • We have a dedicated API documentation website to smooth things over, often written by a different person than the developer who implemented the endpoint
  • The developer implementing the connection with our API will deduce which endpoint does what anyway, through a (sometimes painful) trial-and-error process

So, until now, there has been no strong pressure to have a rich, self-contained OpenAPI specification. But you don’t have this luxury for an MCP tool: the LLM should be able to understand it unambiguously the first time if you want to provide a good end-user experience.

Similarly, we are used to endpoints returning (or expecting) large collections of data. We are also tempted to return more data than actually needed, especially for a public API: “better for the client to have a little too much data to filter through than for them to have to make additional network calls”. This large amount of data is easy to process for a program.

But that logic falls apart for an LLM. LLMs are notoriously bad at handling large amounts of data, especially nondescript numbers and ids. And our system is full of numbers and ids. Even though model context sizes have continuously increased (currently reaching 1M tokens), context rot is still a thing. So, for LLMs, less is more: you want to return just the data needed for the prompt.
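For instance, instead of returning the raw API payload, a tool can trim it down before handing it to the LLM. A sketch, with a hypothetical reporting row shape and field names:

// Hypothetical raw row from a reporting endpoint: dozens of fields, most irrelevant to the prompt.
type CampaignStatsRow = { campaignId: string; campaignName: string; roas: number; [extra: string]: unknown };

function toToolResult(rows: CampaignStatsRow[]) {
  // Keep only what is needed to answer "what was my best performing campaign?".
  const trimmed = rows
    .map(({ campaignId, campaignName, roas }) => ({ campaignId, campaignName, roas }))
    .sort((a, b) => b.roas - a.roas)
    .slice(0, 10); // a handful of rows is usually plenty for a conversational answer
  return { content: [{ type: "text" as const, text: JSON.stringify(trimmed) }] };
}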

The same logic applies to inputs. For a public endpoint, giving the consumer many levers to control is useful. For example, our Commerce Growth reporting endpoints allow us to request a report in JSON, CSV, XML, or Excel format. Perfect for our clients who want one of those formats, but a choice of little value for an LLM. JSON should be enough, right?

Still, exposing fine-grained filters in your tools can be useful. It allows the LLM to focus on translating the user requirements into the proper filters, getting as little data as needed in return. For example, Atlassian’s MCP server has tools that expose a JQL filter, allowing the LLM to make complex requests to find only the relevant tickets.

Ideally, tools should be designed from the ground-up with LLMs in mind.

Mitigations

But still, what can we do to improve things and reuse existing endpoints?

First, documentation: through a manual process, the description of each endpoint has been updated, in part reusing content from our documentation website, to provide a more complete explanation of what each tool does. Similarly, the tool names (i.e. the OpenAPI operationIds) have been changed to better reflect what the tools do.

Then, a few of our endpoint designs have been updated (while maintaining backward compatibility). For example, we added more sensible defaults for some of our inputs, and even hid some input fields entirely for the MCP version. Going back to our Commerce Growth reporting endpoint: the LLM no longer knows about the format input; it will simply get a JSON report.
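Concretely, hiding a field for the MCP version can be as simple as pruning it from the generated input schema, with the server filling in the default when it calls the real endpoint. A sketch (the schema shape and field list below are illustrative, not how our API is actually described):

// A JSON-Schema-like description of the reporting endpoint's input.
const reportRequestSchema = {
  type: "object",
  properties: {
    startDate: { type: "string", format: "date" },
    endDate: { type: "string", format: "date" },
    format: { type: "string", enum: ["JSON", "CSV", "XML", "EXCEL"], default: "JSON" },
  },
  required: ["startDate", "endDate"],
};

// Fields the LLM should never see; the proxy injects their defaults when forwarding the call.
const hiddenInputs = ["format"];

function toMcpInputSchema(schema: typeof reportRequestSchema) {
  const properties = Object.fromEntries(
    Object.entries(schema.properties).filter(([name]) => !hiddenInputs.includes(name))
  );
  return { ...schema, properties };
}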

Authorization

If we go back to our Banking MCP example, a question may have come to mind: how does authorization work with MCP?

Technically, a remote MCP server is built on top of HTTP, so you can implement any form of HTTP-compatible authentication (Basic, an API key in a header or query param, …). That can be a good way to start, but if you want to be compatible with as many clients as possible, you’ll want to follow the specification.

And the MCP specification provides only two options for remote servers: either unauthenticated, or a public-client-oriented OAuth authorization flow.

That’s a big shift for a lot of APIs: MCP is not meant for server-to-server communication. It’s mainly a client-to-server protocol, so simple patterns like “API key” are not supported.

The specified flow relies on other standards to allow any MCP clients to securely obtain an end-user authorization from an authorization server:

  • It uses Dynamic Client Registration (RFC7591) so that each MCP client (i.e. each chat app) can create a client id on your authorization server. This is technically optional in the specification, but required by several MCP clients in practice.
  • The authorization flow is done through OAuth2’s PKCE (RFC7636), with the MCP client storing the refresh token
  • The authorization resources are self-documented via OAuth 2.0 Protected Resource Metadata (RFC9728) and OAuth 2.0 Authorization Server Metadata (RFC8414)

That’s a lot of pieces, but luckily, Criteo’s API already supports some of them. Our main problem is Dynamic Client Registration, as we currently only allow registered users to create a connector with our API. While we work on that, we rely on a cruder, header-based authentication scheme.
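To give an idea of the discovery part, here is roughly what the protected resource metadata (RFC9728) served under /.well-known/oauth-protected-resource looks like. The hostnames and scopes are placeholders, not Criteo’s actual configuration:

// Served by the MCP server so that any client can discover how to obtain a token.
const protectedResourceMetadata = {
  resource: "https://mcp.example-adtech.com/mcp",
  authorization_servers: ["https://auth.example-adtech.com"],
  scopes_supported: ["reporting:read", "campaigns:manage"],
  bearer_methods_supported: ["header"],
};

// From there, the client fetches the authorization server metadata (RFC8414),
// registers itself dynamically (RFC7591), and runs the PKCE flow (RFC7636).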

Fragmentation

The final difficulty when building an MCP server is the ecosystem’s fragmentation, along three main axes:

  • Features: only tools are widely supported, even if some clients support other features.
  • Transports: not all clients support remote servers, and some only support the (deprecated) SSE mode.
  • Authorization: not all clients support the “official” authorization flow and when they do you should expect some variation in the implementation details. And this variation is often poorly documented.


https://modelcontextprotocol.io/clients#feature-support-matrix

Fragmented audience

One source of fragmentation is the audience. MCP is (currently) an advanced feature of LLM usage, and one group of users that has quickly adopted it is developers: when using an agent to code or maintain an application, being able to let it act on your codebase or on related services is enticing and helpful.

So a lot of coding-related applications have integrated support for MCP (VSCode, Cursor, Claude Code, …). But many have diverged from the specification, allowing simpler authorization flows so that developers and developer-focused companies can integrate more easily.

For example, two different applications from Anthropic, the creator of MCP, implement authentication differently because they target different audiences:

  • The consumer-oriented Claude Chat only supports the OAuth standard
  • While the programmer-oriented Claude Code supports both the standard and a custom header-based approach
claude mcp add --transport sse private-api https://api.company.com/mcp \
--header "X-API-Key: your-key-here"

From Connect Claude Code to tools via MCP — Claude Docs

How do we even test that?

Once the technical hurdles have been cleared, you have a working MCP server. But to ensure it answers your use cases, you want actual domain experts using it.

Getting the proper domain expertise during testing is a problem we already faced when building our API. After all, our public API is built on top of dozens of internal applications, themselves maintained by 20+ teams, each with its own domain specificities.

To help overcome this fragmentation, we are testing in two different ways:

  • Internally: we have enlisted a wide range of internal users from teams covering different client segments and countries. We performed monkey testing, focusing on each subdomain (reporting, campaigns, audiences, …). This ongoing effort will continue each time we introduce new tools.
  • Externally: once everything is validated internally, we are beginning to test with a few clients on reporting APIs. This way, they can play with the same data available through our API endpoints, help pressure test our MCP integration, and help us learn the types of guardrails that need to be implemented. As with internal testing, we are releasing tools incrementally, focusing first on reporting tools, as their “read-only” nature lowers the risk.

With these two streams, internal and external, we are able to keep growing our external offering in a rigorously tested and safe way, while also getting internal Criteo teams familiar with MCP servers, which in turn generates new feedback and use cases.

Testing feedback

Through our testing, we found several recurring problems. For example:

  • LLMs love to extrapolate or make choices without informing the user: in one of our tests, I first asked for the performance and ROAS (Return on Ad Spend, a common performance metric in advertising) of some campaigns, knowing that “performance” is a subjective term and that ROAS has multiple variants in our API. The problem is that the LLM decided on its own definition of “performance” and made other choices (which ROAS type) on my behalf without asking. It didn’t do anything wrong so much as make the best guesses it could with the data it had. Injecting the proper guardrails into the tool descriptions could help (see the sketch after this list).


  • The LLM version used really matters: there is a big difference in the quality of response we get when using an older model vs a newer one. Earlier versions more frequently get confused with basic requests and are not as reliably able to self-diagnose problems.
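One way to inject such guardrails is directly into the description the LLM reads for the tool. A hypothetical example, not our production wording:

// Hypothetical reporting tool description with explicit guardrails for ambiguous requests.
const getCampaignStatistics = {
  name: "get_campaign_statistics",
  description: [
    "Returns daily statistics (spend, sales, ROAS) for the given campaigns.",
    "Guardrails:",
    "- 'Performance' is ambiguous: ask the user which metric they mean before choosing one.",
    "- Several ROAS definitions exist; never pick one silently, confirm it with the user.",
    "- Always state in your answer which metric, ROAS definition, and date range were used.",
  ].join("\n"),
};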

Call to action

Now that we’ve discussed what an MCP server is, what we are learning along the way, and our testing strategy, what can you do?

You can join our closed beta waitlist by clicking here and replying to the survey.
