Frameworks like LangChain and LlamaIndex have exploded in popularity, promising to simplify the development of applications powered by Large Language Models (LLMs). They offer pre-built components for everything from prompt management to document retrieval. But what if the best approach for your project is to skip the framework entirely?
While frameworks can be excellent for rapid prototyping, building directly with an LLM provider's API—like OpenAI's or Google's—can offer significant advantages in performance, control, and understanding. Let's explore the compelling reasons why a framework-less approach might be the smarter choice for your next AI application.
When you use a framework, you are working with its abstractions. These layers, while convenient, can obscure what's actually happening under the hood. You might know that it works, but you don't necessarily know how it works.
Building directly with a model's API forces you to understand the fundamental mechanics of LLM interactions. You'll learn the nuances of prompt engineering, context window management, and API parameter tuning (like temperature or top_p) firsthand. This deep understanding is invaluable for debugging, optimizing, and building truly robust and innovative applications. It separates the developers who can simply use a tool from those who can architect powerful AI systems.
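To make this concrete, here is a minimal sketch of a direct call with the openai Python SDK, with every generation parameter stated explicitly. The model name and parameter values are illustrative, not recommendations:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Every parameter is explicit: no hidden prompt templates, retries,
# or default chains sitting between you and the model.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Summarize RAG in two sentences."},
    ],
    temperature=0.2,  # lower = more deterministic output
    top_p=1.0,        # nucleus-sampling cutoff
    max_tokens=150,   # hard cap on completion length (and cost)
)

print(response.choices[0].message.content)
```

Because nothing is abstracted away, tweaking temperature or top_p and observing the effect teaches you exactly how these knobs shape the output.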
Every layer of abstraction adds overhead. Frameworks like LangChain often introduce additional function calls, data processing steps, and dependencies that can increase the latency of your application. In a user-facing application, even a few hundred milliseconds of delay can significantly impact the user experience.
By interacting directly with the model's API, you have a direct line of communication. You send a request and get a response, with no intermediate steps you didn't explicitly code yourself. This allows you to fine-tune your application for maximum speed. For example, you can implement custom logic for caching API calls or design a more efficient RAG (Retrieval-Augmented Generation) pipeline than the generic one a framework provides. One analysis showed that a simple, direct RAG implementation could be up to 60% faster than a comparable one built with a popular framework due to reduced overhead.
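As one example, a custom cache around the API can be a few lines of code. This is a sketch only: the cached_completion helper is hypothetical, and the in-memory dict stands in for whatever store (Redis, SQLite) a production system would use:

```python
import hashlib
import json

from openai import OpenAI

client = OpenAI()
_cache: dict[str, str] = {}  # illustrative; swap for Redis/SQLite in production

def cached_completion(messages: list[dict], model: str = "gpt-4o-mini") -> str:
    """Return a stored answer for an identical request, else call the API once."""
    # Hash the full request so identical prompts hit the cache, not the API.
    key = hashlib.sha256(
        json.dumps([model, messages], sort_keys=True).encode()
    ).hexdigest()
    if key not in _cache:
        response = client.chat.completions.create(model=model, messages=messages)
        _cache[key] = response.choices[0].message.content
    return _cache[key]
```

Repeated identical requests now cost one API call instead of many, with cache policy entirely under your control.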
When you install a framework, you aren't just adding one dependency to your project; you're often adding dozens. The langchain package, for instance, has a vast dependency tree: a report from early 2024 noted that a standard installation could pull in over 80 different packages, and updates frequently introduce breaking changes.
This dependency bloat can lead to several problems:
Larger application size: This is critical in serverless environments or containers where deployment package size affects cold starts and cost.
Security vulnerabilities: Every dependency is a potential attack vector. A larger dependency tree increases your exposure to security risks.
Version conflicts: The more packages you have, the higher the chance of encountering breaking changes or conflicts when you need to update one part of your system.
Building without a framework keeps your environment lean. You only include the libraries you absolutely need, like an HTTP client (requests or httpx) and the model provider's client library (e.g., openai).
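In fact, you can go all the way down to a raw HTTP request. This sketch calls OpenAI's chat completions endpoint with httpx and nothing else beyond the standard library (the model name is illustrative):

```python
import os

import httpx

# One dependency (httpx), one request, one response.
resp = httpx.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=30.0,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```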
Frameworks are inherently opinionated. They are designed to solve common problems in a specific way. While this is great for standard use cases, it can become a straitjacket when you need to implement a novel feature or a highly custom workflow.
What if you want to use a vector database that isn't officially supported by the framework? Or implement a unique, multi-step agentic workflow that doesn't fit the framework's predefined agent types? You might find yourself fighting the framework, writing complex workarounds that negate the initial benefit of using it.
Building from scratch gives you complete freedom. Your only constraints are the capabilities of the LLM itself. This flexibility is essential for businesses aiming to create a differentiated product with a unique user experience.
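As a rough illustration, a custom multi-step pipeline can be plain Python control flow. The helpers here (search_my_vector_db, rerank) are hypothetical stand-ins for whatever retrieval stack you choose, and cached_completion is the caching helper sketched earlier:

```python
def run_pipeline(question: str) -> str:
    """A custom multi-step workflow: ordinary functions, no predefined agent types."""
    # Step 1: retrieval against any vector DB you like, supported or not.
    documents = search_my_vector_db(question, top_k=5)  # hypothetical helper

    # Step 2: branch however your product needs, not how a framework allows.
    if len(documents) > 3:
        documents = rerank(documents, question)  # hypothetical helper

    # Step 3: a single, explicit LLM call with the assembled context.
    context = "\n\n".join(doc.text for doc in documents)
    return cached_completion([
        {"role": "system", "content": f"Answer using this context:\n{context}"},
        {"role": "user", "content": question},
    ])
```

Every step is visible and swappable, which is precisely the freedom a predefined agent type takes away.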
While not always immediately obvious, using a framework can sometimes lead to higher operational costs. The abstractions might make it easy to chain multiple LLM calls together, but this can inadvertently lead to excessive token usage. For instance, some of LangChain's more complex agents can make several "thought" calls to the LLM to decide on the next step, each one incurring a cost.
When you code the logic yourself, you have precise control over every single API call. You can design your system to be more token-efficient, implementing rules to reuse results, batch requests, or use smaller, cheaper models for simpler tasks. This granular control is key to managing costs at scale.
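A simple routing rule makes the idea concrete. In this sketch, the length threshold and model names are assumptions chosen purely for illustration:

```python
from openai import OpenAI

client = OpenAI()

def route_model(prompt: str) -> str:
    # Illustrative rule: cheap model for short, simple prompts;
    # reserve the larger model for complex requests.
    return "gpt-4o-mini" if len(prompt) < 500 else "gpt-4o"

def answer(prompt: str) -> str:
    response = client.chat.completions.create(
        model=route_model(prompt),
        messages=[{"role": "user", "content": prompt}],
        max_tokens=300,  # explicit cap bounds worst-case spend per call
    )
    return response.choices[0].message.content
```

A framework's agent may make this decision for you; here, the cost policy is a function you own and can audit.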
Choosing to bypass frameworks like LangChain is not about rejecting them entirely. They are powerful tools that have their place, especially for hackathons, proofs of concept, quick MVP prototyping, and projects with standard requirements.
However, for serious, production-grade applications where performance, control, security, and customization are paramount, building directly on top of an LLM provider's API is often the superior long-term strategy. It requires a greater initial investment in learning, but it pays dividends by giving you a more efficient, flexible, and robust application. The next time you start an AI project, consider what you might gain by building it from the ground up.