⭐ 900+ Stars in 5 Days. Thanks to the community!
Learn to build AI agents locally without frameworks. Understand what happens under the hood before using production frameworks.
This repository teaches you to build AI agents from first principles using local LLMs and node-llama-cpp. By working through these examples, you'll understand:
- How LLMs work at a fundamental level
- What agents really are (LLM + tools + patterns)
- How different agent architectures function
- Why frameworks make certain design choices
Philosophy: Learn by building. Understand deeply, then use frameworks wisely.
- Node.js 18+
- At least 8GB RAM (16GB recommended)
- Download models and place them in the ./models/ folder; see DOWNLOAD.md for details
Follow these examples in order to build understanding progressively:
intro/ | Code Explanation | Concepts
What you'll learn:
- Loading and running a local LLM
- Basic prompt/response cycle
Key concepts: Model loading, context, inference pipeline, token generation
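A minimal sketch of the full load-and-prompt cycle, using node-llama-cpp's v3-style API (the model filename below is a placeholder; use whichever GGUF file you downloaded per DOWNLOAD.md, and check the library docs for your installed version):

```javascript
import path from "path";
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();                  // bind to the llama.cpp backend
const model = await llama.loadModel({
    modelPath: path.join("models", "model.gguf") // placeholder filename
});
const context = await model.createContext();     // allocate a context window
const session = new LlamaChatSession({
    contextSequence: context.getSequence()       // one sequence = one conversation
});

// The basic prompt/response cycle: tokens in, tokens out.
const answer = await session.prompt("Explain what a token is, in one sentence.");
console.log(answer);
```

Everything later in this repo builds on this loop; agents are this cycle plus tools and control flow.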
openai-intro/ | Code Explanation | Concepts
What you'll learn:
- How to call hosted LLMs (like GPT-4)
- Controlling output randomness with temperature
- Tracking token usage and cost
Key concepts: Inference endpoints, network latency, cost vs control, data privacy, vendor dependence
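For comparison, a hosted call with the official openai package looks like this (the model name is illustrative, and OPENAI_API_KEY is assumed to be set in your environment):

```javascript
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const response = await client.chat.completions.create({
    model: "gpt-4o-mini",     // illustrative model name
    temperature: 0.2,         // lower = more deterministic output
    messages: [{role: "user", content: "Name three uses of a mutex."}]
});

console.log(response.choices[0].message.content);
// usage is how you track cost: you pay per token, in and out
console.log("tokens used:", response.usage.total_tokens);
```

Note the trade-off this makes concrete: no model download or RAM requirements, but every request crosses the network, costs money, and sends your data to a vendor.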
translation/ | Code Explanation | Concepts
What you'll learn:
- Using system prompts to specialize agents
- Output format control
- Role-based behavior
- Chat wrappers for different models
Key concepts: System prompts, agent specialization, behavioral constraints, prompt engineering
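Specialization is mostly one parameter. A sketch assuming node-llama-cpp's `systemPrompt` option on `LlamaChatSession` (model filename is a placeholder):

```javascript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "./models/model.gguf"});
const context = await model.createContext();

// The system prompt constrains every reply in this session:
// same model, completely different behavior.
const session = new LlamaChatSession({
    contextSequence: context.getSequence(),
    systemPrompt: "You are a translator. Reply only with the French " +
                  "translation of the user's message. No explanations."
});

console.log(await session.prompt("Good morning, everyone."));
```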
think/ | Code Explanation | Concepts
What you'll learn:
- Configuring LLMs for logical reasoning
- Solving complex quantitative problems
- Limitations of pure LLM reasoning
- When to use external tools
Key concepts: Reasoning agents, problem decomposition, cognitive tasks, reasoning limitations
batch/ | Code Explanation | Concepts
What you'll learn:
- Processing multiple requests concurrently
- Context sequences for parallelism
- GPU batch processing
- Performance optimization
Key concepts: Parallel execution, sequences, batch size, throughput optimization
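The core trick is requesting multiple sequences from one context, so independent conversations can be evaluated in the same batch. A sketch, assuming `createContext` accepts a `sequences` option as in node-llama-cpp v3 (model filename is a placeholder):

```javascript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "./models/model.gguf"});

// Two sequences share one context, so their prompts can be
// batched together on the GPU instead of running serially.
const context = await model.createContext({sequences: 2});

const prompts = ["Summarize TCP in one line.", "Summarize UDP in one line."];
const answers = await Promise.all(prompts.map((p) => {
    const session = new LlamaChatSession({
        contextSequence: context.getSequence() // each call claims a free sequence
    });
    return session.prompt(p);
}));
console.log(answers);
```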
coding/ | Code Explanation | Concepts
What you'll learn:
- Real-time streaming responses
- Token limits and budget management
- Progressive output display
- User experience optimization
Key concepts: Streaming, token-by-token generation, response control, real-time feedback
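Streaming and budgeting are both options on the prompt call. A sketch assuming node-llama-cpp's `maxTokens` and `onTextChunk` prompt options (model filename is a placeholder):

```javascript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "./models/model.gguf"});
const context = await model.createContext();
const session = new LlamaChatSession({contextSequence: context.getSequence()});

// maxTokens caps the reply length; onTextChunk fires as text is generated,
// so the user sees output immediately instead of waiting for the full reply.
await session.prompt("Write a haiku about compilers.", {
    maxTokens: 64,
    onTextChunk(chunk) {
        process.stdout.write(chunk);
    }
});
```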
simple-agent/ | Code Explanation | Concepts
What you'll learn:
- Function calling / tool use fundamentals
- Defining tools the LLM can use
- JSON Schema for parameters
- How LLMs decide when to use tools
Key concepts: Function calling, tool definitions, agent decision making, action-taking
This is where text generation becomes agency!
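In node-llama-cpp, a tool is a described, schema-typed function handed to the prompt call; the model reads the description and parameter schema and decides on its own when to call it. A sketch using `defineChatSessionFunction` (the weather tool and its stubbed data are illustrative; check the library docs for exact schema support):

```javascript
import {getLlama, LlamaChatSession, defineChatSessionFunction} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "./models/model.gguf"});
const context = await model.createContext();
const session = new LlamaChatSession({contextSequence: context.getSequence()});

const functions = {
    getWeather: defineChatSessionFunction({
        description: "Get the current weather for a city",
        params: {
            type: "object",
            properties: {city: {type: "string"}}
        },
        handler({city}) {
            // A real tool would call an API; this returns stubbed data.
            return {city, tempC: 21, condition: "sunny"};
        }
    })
};

// The model sees the tool definitions alongside the prompt and may
// call getWeather before composing its final answer.
console.log(await session.prompt("What's the weather in Paris?", {functions}));
```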
simple-agent-with-memory/ | Code Explanation | Concepts
What you'll learn:
- Persisting information across sessions
- Long-term memory management
- Facts and preferences storage
- Memory retrieval strategies
Key concepts: Persistent memory, state management, memory systems, context augmentation
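Independent of any particular library, the core of a long-term memory is small: persist facts, retrieve the relevant ones, inject them into context. A deliberately tiny plain-Node sketch (the file name and keyword-overlap retrieval are illustrative stand-ins; real systems typically use embedding-based retrieval):

```javascript
import fs from "fs";

class MemoryStore {
    constructor(file = "memory.json") {
        this.file = file;
        // Facts survive across sessions because they live on disk, not in the LLM.
        this.facts = fs.existsSync(file)
            ? JSON.parse(fs.readFileSync(file, "utf8"))
            : [];
    }
    remember(fact) {
        this.facts.push(fact);
        fs.writeFileSync(this.file, JSON.stringify(this.facts, null, 2));
    }
    // Rank facts by word overlap with the query: a crude stand-in
    // for embedding similarity, but the shape is the same.
    recall(query) {
        const words = new Set(query.toLowerCase().split(/\W+/).filter(Boolean));
        return this.facts
            .map((fact) => ({
                fact,
                score: fact.toLowerCase().split(/\W+/)
                    .filter((w) => words.has(w)).length
            }))
            .filter((entry) => entry.score > 0)
            .sort((a, b) => b.score - a.score)
            .map((entry) => entry.fact);
    }
}

const memory = new MemoryStore();
memory.remember("The user's name is Dana.");
memory.remember("The user prefers metric units.");
console.log(memory.recall("what units does the user prefer?")[0]);
// → "The user prefers metric units."
```

Whatever `recall` returns would then be prepended to the system prompt or conversation, which is the "context augmentation" step.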
react-agent/ | Code Explanation | Concepts
What you'll learn:
- ReAct pattern (Reason → Act → Observe)
- Iterative problem solving
- Step-by-step tool use
- Self-correction loops
Key concepts: ReAct pattern, iterative reasoning, observation-action cycles, multi-step agents
This is the foundation of modern agent frameworks!
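The loop itself is small. Here it is with a scripted stand-in for the LLM so the control flow is visible in isolation (`fakeLLM` and the `calculator` tool are illustrative, not the example's actual code; a real agent would prompt the model with the scratchpad each turn):

```javascript
const tools = {
    // Demo only: never eval untrusted input in real code.
    calculator: (expr) => String(eval(expr))
};

// Scripted model responses: first a tool call, then a final answer.
const script = [
    {thought: "I need to multiply.", action: "calculator", input: "6 * 7"},
    {thought: "I have the result.", answer: "The answer is 42."}
];
const fakeLLM = () => script.shift();

function reactAgent(question, llm, maxSteps = 5) {
    const scratchpad = [`Question: ${question}`];
    for (let step = 0; step < maxSteps; step++) {
        const out = llm(scratchpad.join("\n"));            // Reason
        scratchpad.push(`Thought: ${out.thought}`);
        if (out.answer) return out.answer;                 // done
        const observation = tools[out.action](out.input);  // Act
        scratchpad.push(`Action: ${out.action}(${out.input})`,
                        `Observation: ${observation}`);    // Observe
    }
    return "Gave up.";
}

console.log(reactAgent("What is 6 times 7?", fakeLLM)); // → "The answer is 42."
```

Every turn, the growing scratchpad goes back to the model, so it can observe tool results and correct course; the step cap is what keeps a confused agent from looping forever.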
Each example folder contains:
- `<name>.js` - The working code example
- CODE.md - Step-by-step code explanation
  - Line-by-line breakdowns
  - What each part does
  - How it works
- CONCEPT.md - High-level concepts
  - Why it matters for agents
  - Architectural patterns
  - Real-world applications
  - Simple diagrams
Simple Agent (Steps 1-5) → Tool-Using Agent (Step 6) → Memory Agent (Step 7) → ReAct Agent (Step 8)
helper/prompt-debugger.js
Utility for debugging prompts sent to the LLM. Shows exactly what the model sees, including:
- System prompts
- Function definitions
- Conversation history
- Context state
See a usage example in simple-agent/simple-agent.js.
- LLMs are stateless: Context must be managed explicitly
- System prompts shape behavior: Same model, different roles
- Function calling enables agency: Tools transform text generators into agents
- Memory is essential: Agents need to remember across sessions
- Reasoning patterns matter: ReAct > simple prompting for complex tasks
- Performance matters: Parallel processing, streaming, token limits
- Debugging is crucial: PromptDebugger shows what the model actually sees
Now that you understand the fundamentals, frameworks like LangChain, CrewAI, or AutoGPT provide:
- Pre-built reasoning patterns
- Tool libraries
- Memory management
- Multi-agent orchestration
- Production-ready error handling
- Observability and monitoring
You'll use them better because you know what they're doing under the hood.
- node-llama-cpp: GitHub
- Model Hub: Hugging Face
- GGUF Format: Quantized models for local inference
This is a learning resource. Feel free to:
- Suggest improvements to documentation
- Add more example patterns
- Fix bugs or unclear explanations
- Share what you built!
Educational resource - use and modify as needed for learning.
Built with ❤️ for people who want to truly understand AI agents
Start with intro/ and work your way through. Each example builds on the previous one. Read both CODE.md and CONCEPT.md for full understanding.
Happy learning!