Standard Completions


Many LLM providers and open-source projects now offer OpenAI-compatible Completions and Chat Completions APIs, including DeepSeek, xAI, OpenRouter, Nous Research, vLLM, and more.

However, OpenAI considers Completions a legacy API, and while Chat Completions will not be deprecated, it has been de-emphasized in favor of the OpenAI Responses API.

While OpenAI has committed to supporting the current Chat Completions API indefinitely, it lacks features such as multimodal outputs, stored input objects, explicit caching, and assistant prefixes (prefill). Providers offering OpenAI-compatible APIs have added these features in non-standard ways, complicating usage for developers.

For example, a request with an assistant prefix is expressed three different ways across three OpenAI-compatible providers:

```bash
# openrouter: trailing assistant message
curl -X POST "https://openrouter.ai/api/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -d '{
    "model": "anthropic/claude-3.5-sonnet",
    "messages": [
      { "role": "user", "content": "Tell me a joke." },
      { "role": "assistant", "content": "Sure! How about a knock-knock joke:" }
    ]
  }'

# deepseek: "prefix": true
curl -X POST "https://api.deepseek.com/beta/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
  -d '{
    "model": "deepseek-chat",
    "messages": [
      { "role": "user", "content": "Tell me a joke." },
      { "role": "assistant", "content": "Sure! How about a knock-knock joke:", "prefix": true }
    ]
  }'

# vLLM: "continue_final_message": true
curl -X POST "https://inference-api.nousresearch.com/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NOUS_API_KEY" \
  -d '{
    "model": "DeepHermes-3-Mistral-24B-Preview",
    "messages": [
      { "role": "user", "content": "Tell me a joke." },
      { "role": "assistant", "content": "Sure! How about a knock-knock joke:" }
    ],
    "add_generation_prompt": false,
    "continue_final_message": true
  }'
```
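In practice this pushes per-provider branching into every client. Below is a minimal sketch of the adapter code developers end up writing; the function name and provider identifiers are our own illustration, but each branch mirrors one of the curl examples above.

```python
def apply_assistant_prefix(body: dict, provider: str, prefix: str) -> dict:
    """Attach an assistant prefix to a Chat Completions request body,
    using whichever non-standard convention the provider expects.
    Illustrative sketch; the `provider` labels are our own."""
    msg = {"role": "assistant", "content": prefix}
    if provider == "openrouter":
        # OpenRouter: a plain trailing assistant message is the prefix.
        body["messages"].append(msg)
    elif provider == "deepseek":
        # DeepSeek (beta endpoint): mark the message with "prefix": true.
        body["messages"].append({**msg, "prefix": True})
    elif provider == "vllm":
        # vLLM-style servers: trailing assistant message plus two
        # request-level flags to continue it instead of starting fresh.
        body["messages"].append(msg)
        body["add_generation_prompt"] = False
        body["continue_final_message"] = True
    else:
        raise ValueError(f"unknown prefix convention for provider {provider!r}")
    return body
```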

Even worse, because there is no standard way for a provider to signal whether it supports features like logprobs or assistant prefixes, the only way for an API consumer to detect support is to try a feature and see what happens, or to attempt to identify the upstream provider even when requests are proxied.
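To make that concrete, here is a minimal sketch of "try it and see" probing for logprobs; the endpoint path and response shape follow the OpenAI Chat Completions schema, and the helper itself is a hypothetical illustration, not part of any provider's SDK.

```python
import requests

def supports_logprobs(base_url: str, model: str, api_key: str) -> bool:
    """Heuristically probe an OpenAI-compatible endpoint for logprobs support."""
    resp = requests.post(
        f"{base_url}/chat/completions",
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "model": model,
            "messages": [{"role": "user", "content": "Hi"}],
            "max_tokens": 1,
            "logprobs": True,
        },
        timeout=30,
    )
    if resp.status_code != 200:
        # Some providers reject the unknown parameter outright.
        return False
    choice = resp.json()["choices"][0]
    # Others accept it silently and ignore it, so a 200 alone proves
    # nothing: check that logprobs actually came back in the choice.
    return choice.get("logprobs") is not None
```

Every consumer has to write, and spend requests running, probes like this per feature and per provider.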

We hope that by standardizing a superset of the OpenAI completions APIs, we can make this experience easier for developers, produce a standard SDK that providers can recommend to their users in place of the OpenAI SDK (while remaining backwards compatible with it), and make the LLM ecosystem more interoperable.

Join us! If you are a provider or developer interested in joining the Standard Completions working group, please get in touch via email or Twitter/X, or open an issue on the rfcs repository.
