In today's AI-driven world, creating intelligent automation workflows requires more than just language models. You need a complete ecosystem that combines workflow orchestration, local AI inference, and secure tool integration. This comprehensive tutorial introduces the powerful three-component stack: n8n, Docker Model Runner, and MCP Toolkit – a combination that delivers enterprise-grade AI automation while maintaining complete data privacy and control.
This tutorial will guide you through building a complete local AI automation platform that can analyze code, process documents, interact with GitHub repositories, and orchestrate complex multi-step workflows – all running entirely on your machine.
What is n8n?

n8n (pronounced "n-eight-n") is a fair-code workflow automation platform that combines the flexibility of code with the speed of no-code solutions. With over 400 integrations and native AI capabilities, n8n allows you to:
- Build visual workflows with drag-and-drop interface
- Write custom JavaScript/Python when needed
- Self-host with complete control over your data
- Integrate with virtually any service or API
- Create AI-powered automations
What is Docker Model Runner?

Docker Model Runner is an experimental feature in Docker Desktop 4.40+ that lets you run large language models (LLMs) locally on your host system rather than in containers. It runs models directly on your Mac through a host-installed inference server (llama.cpp) for optimal GPU acceleration. Key features include:
- Host-native execution: Models run directly on your host machine, not in containers
- Apple Silicon optimization: Direct access to Metal API for GPU acceleration
- OCI artifact storage: Models stored as standardized artifacts in Docker Hub
- OpenAI-compatible API: Familiar endpoints for easy integration
- No containerization overhead: Faster inference without container layers
- macOS-focused: Optimized for Apple Silicon (M1/M2/M3/M4); Windows support with NVIDIA GPUs arrived in early April 2025
What is the Docker MCP Toolkit?
The Model Context Protocol (MCP) Toolkit is a powerful addition to Docker Desktop that enables AI agents to securely interact with external systems and data sources. MCP provides:
- Standardized protocols for AI-tool interactions
- Secure sandboxed execution of external tools
- GitHub integration for repository operations
- File system access for document processing
- API connectivity to various services
- Tool chaining for complex automation workflows
Why Combine n8n + Docker Model Runner + MCP Toolkit?
This three-component stack creates a complete AI automation ecosystem:
- Complete Data Privacy: All processing happens locally on your host machine
- Native Performance: Direct GPU access without containerization overhead
- Advanced Tool Integration: MCP enables AI agents to interact with GitHub, file systems, and APIs
- Cost Efficiency: No per-token charges from cloud providers
- Offline Capability: Work without internet connectivity after initial setup
- Standardized Distribution: Models and tools stored as OCI artifacts
- Powerful Workflow Engine: n8n orchestrates complex multi-step automations
- Apple Silicon Optimization: Leverages Metal API for maximum performance
- Secure Tool Execution: MCP provides sandboxed access to external systems
Prerequisites
Before starting, ensure you have:
- Mac with Apple Silicon (M1/M2/M3/M4) - this tutorial targets macOS; Windows support with NVIDIA GPUs is also available (see the note below)
- Docker Desktop 4.40+ with Model Runner support
- 8GB+ RAM (16GB+ recommended for larger models)
- 10GB+ free disk space (for models and containers)
- Basic Docker and workflow automation knowledge
Note: Docker Model Runner is currently optimized for macOS with Apple Silicon. Windows support with NVIDIA GPUs arrived in early April 2025.
How Docker Model Runner Works
Unlike traditional containerized AI solutions, Docker Model Runner takes a unique approach:
Host-Native Execution
- No containers for models: AI models run directly on your host machine using llama.cpp
- Direct GPU access: Leverages Apple's Metal API without containerization overhead
- Host-level process: Docker Desktop runs the inference server natively on your Mac
Model Storage & Distribution
- OCI artifacts: Models stored as standardized artifacts in Docker Hub
- No compression layers: Faster downloads, since model weights are essentially incompressible
- Efficient storage: No need for both compressed and uncompressed versions
- Registry compatibility: Works with any Docker-compatible registry
Connection Methods
Docker Model Runner provides multiple access patterns:
- From containers: http://model-runner.docker.internal/
- Host via Docker socket: /var/run/docker.sock
- Host via TCP: When enabled, direct port access (default: 12434)
This architecture provides the performance benefits of native execution while maintaining Docker's standardized distribution and management capabilities.
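As a quick sanity check of these access patterns, here is a minimal curl sketch (assuming you enabled host-side TCP support on the default port 12434 and have already pulled ai/llama3.2:3B as shown in Step 2):
```bash
# From the host, via the TCP endpoint (default port 12434)
curl http://localhost:12434/engines/llama.cpp/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "ai/llama3.2:3B",
        "messages": [{"role": "user", "content": "Reply with one short sentence."}]
      }'

# From inside a container (e.g. the n8n container), use the internal hostname instead:
# http://model-runner.docker.internal/engines/llama.cpp/v1/chat/completions
```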
Step 1: Enable Docker Model Runner
First, enable the Model Runner feature in Docker Desktop:
Via Docker Desktop UI
- Open Docker Desktop Settings
- Navigate to Features in development → Beta
- Enable "Enable Docker Model Runner"
- Optionally enable "Enable host-side TCP support" (port 12434)
- Click Apply & restart

Via Command Line
```bash
# Enable Model Runner
docker desktop enable model-runner

# Enable with TCP support (optional)
docker desktop enable model-runner --tcp 12434
```
Step 2: Download AI Models
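Before pulling anything, it's worth confirming that the runner is actually active. A quick check (the docker model CLI ships with Docker Desktop 4.40+; exact output may vary by version):
```bash
# Verify that Docker Model Runner is running
docker model status

# The model CLI itself should also respond
docker model --help
```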

Pull your preferred models using the new docker model CLI (these run natively on your host, not in containers):
```bash
# Lightweight model (1.2GB) - fast, good for testing
docker model pull ai/llama3.2:1B-Q8_0

# Balanced model (3GB) - recommended for most use cases
docker model pull ai/llama3.2:3B

# More capable model (2GB) - good balance of size and capability
docker model pull ai/gemma2:2B

# List downloaded models
docker model ls

# Test a model directly
docker model run ai/llama3.2:1B-Q8_0 "Hello, how are you?"
```
Important: These models run directly on your host using llama.cpp for optimal Apple Silicon GPU acceleration, not in Docker containers.
Model Comparison
| Model | Size | Best For | Speed |
|---|---|---|---|
| ai/llama3.2:1B-Q8_0 | 1.2GB | Testing, simple tasks | Very Fast |
| ai/llama3.2:3B | 3GB | General purpose | Fast |
| ai/gemma2:2B | 2GB | Balanced performance | Fast |
| ai/qwen2.5:7B | 7GB | Complex reasoning | Slower |
| ai/mistral:7B | 7GB | Code & analysis | Slower |
Install the Docker MCP Toolkit
The MCP Toolkit enables AI agents to interact with external systems securely. Install it through Docker Desktop:
Via Docker Desktop Extensions
- Open Docker Desktop
- Go to Extensions in the left sidebar
- Search for "MCP Toolkit" in the marketplace
- Click Install on the Docker MCP Toolkit extension
- Wait for installation to complete
Verify MCP Installation
```bash
# Check if MCP CLI is available
docker mcp --help

# List available MCP servers
docker mcp server ls

# Install GitHub MCP server (for repository operations)
docker mcp server install github-official

# Verify GitHub MCP server
docker mcp server ls | grep github
```
Configure GitHub MCP
- Select Docker MCP Toolkit and explore MCP Catalog

- Search for GitHub Official

Add your Personal Access Token (PAT). If you haven't created one yet, follow the steps below.
- Create a GitHub Personal Access Token (PAT)
- Go to GitHub.com and sign in to your account
- Click your profile picture in the top-right corner
- Select "Settings"
- Scroll down to "Developer settings" in the left sidebar
- Click on "Personal access tokens" → "Tokens (classic)"
- Click "Generate new token" → "Generate new token (classic)"
- Give your token a descriptive name like "Docker MCP GitHub Access"
- Select the following scopes (permissions):
- repo (Full control of private repositories)
- workflow (if you need workflow actions)
- read:org (if you need organization access)
- Click "Generate token"
- Configure the GitHub MCP Server in Docker
- Open Docker Desktop
- Navigate to the MCP Server
- Find the GitHub tool (official) card and click on it to expand details. Add PAT details.

If you prefer the terminal, you can use the docker mcp secret command to store the GitHub token as a secret:
```bash
docker mcp secret set GITHUB.PERSONAL_ACCESS_TOKEN=github_pat_YOUR_TOKEN_HERE
```
For example:
```bash
docker mcp secret set GITHUB.PERSONAL_ACCESS_TOKEN=github_pat_11AACMRCAXXXXXXxEp_QRZW43Wo1k6KYWwDXXXXXXXXGPXLZ7EGEnse82YM
Info: No policy specified, using default policy
```
For GitHub repository operations, configure authentication:
```bash
# Set GitHub token (get from https://github.com/settings/tokens)
docker mcp server configure github-official --github-token YOUR_GITHUB_TOKEN

# Test GitHub MCP tools
docker mcp tools list github-official

# Example tools available:
# - get_repository_info
# - list_repositories
# - create_issue
# - list_pull_requests
# - search_repositories
```
Step 3: Project Setup
Create your project directory and required files:
```bash
mkdir n8n-ai-setup
cd n8n-ai-setup
mkdir shared   # This will be mounted to /data/shared in n8n
```
Step 4: Docker Compose Configuration
Create a docker-compose.yml file:
```yaml
services:
  postgres:
    image: postgres:13
    restart: unless-stopped
    environment:
      POSTGRES_DB: ${POSTGRES_DB}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
      interval: 30s
      timeout: 10s
      retries: 5

  redis:
    image: redis:7-alpine
    restart: unless-stopped
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 30s
      timeout: 10s
      retries: 5

  n8n:
    image: docker.n8n.io/n8nio/n8n:latest
    restart: unless-stopped
    environment:
      # Database
      DB_TYPE: postgresdb
      DB_POSTGRESDB_HOST: postgres
      DB_POSTGRESDB_PORT: 5432
      DB_POSTGRESDB_DATABASE: ${POSTGRES_DB}
      DB_POSTGRESDB_USER: ${POSTGRES_USER}
      DB_POSTGRESDB_PASSWORD: ${POSTGRES_PASSWORD}
      # Redis
      QUEUE_BULL_REDIS_HOST: redis
      QUEUE_BULL_REDIS_PORT: 6379
      # n8n Configuration
      N8N_ENCRYPTION_KEY: ${N8N_ENCRYPTION_KEY}
      N8N_HOST: ${N8N_HOST}
      N8N_PORT: 5678
      N8N_PROTOCOL: http
      WEBHOOK_URL: http://localhost:5678/
      GENERIC_TIMEZONE: ${GENERIC_TIMEZONE}
      # AI Configuration (quoted so Compose treats the value as a string)
      N8N_AI_ENABLED: "true"
      N8N_AI_OPENAI_DEFAULT_BASE_URL: http://model-runner.docker.internal/engines/llama.cpp/v1
      N8N_AI_DEFAULT_MODEL: ${N8N_AI_DEFAULT_MODEL}
      # Execution Mode
      EXECUTIONS_MODE: main
      # File system
      N8N_DEFAULT_BINARY_DATA_MODE: filesystem
    ports:
      - "5678:5678"
    volumes:
      - n8n_data:/home/node/.n8n
      - ./shared:/data/shared
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy

  # Uncomment for queue mode (production)
  # n8n-worker:
  #   image: docker.n8n.io/n8nio/n8n:latest
  #   restart: unless-stopped
  #   environment:
  #     # Same environment as n8n main
  #     DB_TYPE: postgresdb
  #     DB_POSTGRESDB_HOST: postgres
  #     DB_POSTGRESDB_PORT: 5432
  #     DB_POSTGRESDB_DATABASE: ${POSTGRES_DB}
  #     DB_POSTGRESDB_USER: ${POSTGRES_USER}
  #     DB_POSTGRESDB_PASSWORD: ${POSTGRES_PASSWORD}
  #     QUEUE_BULL_REDIS_HOST: redis
  #     QUEUE_BULL_REDIS_PORT: 6379
  #     N8N_ENCRYPTION_KEY: ${N8N_ENCRYPTION_KEY}
  #     GENERIC_TIMEZONE: ${GENERIC_TIMEZONE}
  #     N8N_AI_ENABLED: "true"
  #     N8N_AI_OPENAI_DEFAULT_BASE_URL: http://model-runner.docker.internal/engines/llama.cpp/v1
  #     N8N_AI_DEFAULT_MODEL: ${N8N_AI_DEFAULT_MODEL}
  #     EXECUTIONS_MODE: queue
  #   volumes:
  #     - n8n_data:/home/node/.n8n
  #     - ./shared:/data/shared
  #   depends_on:
  #     postgres:
  #       condition: service_healthy
  #     redis:
  #       condition: service_healthy
  #   command: n8n worker

volumes:
  postgres_data:
  redis_data:
  n8n_data:
```
Step 5: Environment Configuration
Create a .env file with your configuration:
First, generate a secure encryption key:
```bash
openssl rand -hex 32
```
Then add your configuration to .env:
```bash
# Database Configuration
POSTGRES_DB=n8n
POSTGRES_USER=n8n
POSTGRES_PASSWORD=n8n_password_change_me

# n8n Configuration
N8N_ENCRYPTION_KEY=your_generated_32_char_hex_key_here
N8N_HOST=localhost
GENERIC_TIMEZONE=America/New_York

# AI Model Configuration
N8N_AI_DEFAULT_MODEL=ai/llama3.2:3B

# Security Note: Change these default passwords!
# For production: Use strong, unique passwords
```
Important: Always change the default passwords and generate a unique encryption key!
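If you'd rather drop a freshly generated key straight into the file than paste it by hand, here is a small sketch (assuming the .env above already exists and you are on macOS, where sed uses the BSD -i '' form):
```bash
# Replace the placeholder with a freshly generated 32-byte hex key
sed -i '' "s/^N8N_ENCRYPTION_KEY=.*/N8N_ENCRYPTION_KEY=$(openssl rand -hex 32)/" .env

# Confirm the value that was written
grep N8N_ENCRYPTION_KEY .env
```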
Step 6: Launch the Stack
Start all services:
```bash
# Start all services in detached mode
docker compose up -d

# Check service status
docker compose ps

# View logs
docker compose logs -f n8n

# Check specific service logs
docker compose logs postgres
docker compose logs redis
```
Step 7: Initial Setup
- Open your browser and navigate to http://localhost:5678
- Create your admin account when prompted
- Complete the initial setup wizard
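You can also confirm from the command line that n8n itself reports healthy; n8n exposes a simple /healthz endpoint (a quick check, assuming the stack from Step 6 is running):
```bash
# Returns {"status":"ok"} once n8n is up and connected to its database
curl http://localhost:5678/healthz
```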
Step 8: Test Your AI Setup
Let's create a simple workflow to test the AI integration:
Basic AI Test Workflow
- Create a new workflow in n8n
- Add a Manual Trigger node
- Add an HTTP Request node with these settings:
  - Method: POST
  - URL: http://model-runner.docker.internal/engines/llama.cpp/v1/chat/completions
  - Headers: Content-Type: application/json
  - Body:

```json
{
  "model": "ai/llama3.2:3B",
  "messages": [
    {
      "role": "user",
      "content": "Hello! Please introduce yourself and explain what you can do."
    }
  ],
  "max_tokens": 150,
  "temperature": 0.7
}
```

- Execute the workflow and verify you receive an AI response
Using n8n AI Nodes
For more advanced scenarios, you can use n8n's built-in AI nodes:
- OpenAI Chat Model node (configured with the local endpoint; see the model-listing check below)
- AI Agent node for complex reasoning
- Text Classifier node for categorization
- Information Extractor node for data extraction
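When pointing the OpenAI Chat Model credential at the local endpoint, the model name you enter must match one that Model Runner has pulled. Since the API is OpenAI-compatible, you can list the available model IDs with a quick request (a sketch, assuming host-side TCP support on port 12434; from inside the n8n container use the model-runner.docker.internal hostname instead):
```bash
# List the model IDs exposed by the local OpenAI-compatible API
curl http://localhost:12434/engines/llama.cpp/v1/models
```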
Step 9: Complete Stack Testing - All Three Components
Now let's verify that all three components (n8n + Model Runner + Docker MCP Toolkit) work together. Download and run the comprehensive testing script:
```bash
# Download and run the complete stack test
curl -O https://raw.githubusercontent.com/ajeetraina/n8n-model-runner/main/performance-test-mcp.sh
chmod +x performance-test-mcp.sh
./performance-test-mcp.sh
```
This script will:
- ✅ Test n8n health and API connectivity
- ✅ Verify Model Runner API and AI inference
- ✅ Check MCP Toolkit and GitHub integration
- ✅ Create integration workflow using all three components
- ✅ Performance test the complete stack
- ✅ Generate comprehensive report of capabilities
Expected Output
When the script completes successfully, you should see each of the checks above reported as passing.

GitHub AI Workflow Testing
The script creates a complete integration workflow (complete-integration-workflow.json) that demonstrates:
Workflow Architecture:
Webhook Trigger → AI Analysis → GitHub MCP → Process Results
What it does:
- Receives repository name via webhook
- Analyzes repository using local AI model
- Fetches GitHub data via MCP toolkit
- Processes and combines results
- Returns intelligent insights about the repository
Import the workflow:
- Open n8n at http://localhost:5678
- Click Import from File
- Select complete-integration-workflow.json
- Activate the workflow
Test the workflow:
```bash
# Test the complete GitHub AI workflow
curl -X POST http://localhost:5678/webhook/complete-demo \
  -H "Content-Type: application/json" \
  -d '{"repo_name": "ajeetraina/n8n-model-runner"}'
```
This will trigger the workflow and return AI-powered analysis of the repository!
Useful Resources
- n8n Documentation
- n8n Community Forum
- Docker Model Runner Documentation
- Docker MCP Toolkit Guide
- Model Context Protocol Specification
- GitHub MCP Server Documentation
Community and Support
- Join the n8n Community Forum for discussions and support
- Follow @n8n_io on Twitter for updates
- Check out the n8n GitHub repository for source code and issues
Conclusion
You now have a complete local AI automation platform: n8n's workflow engine running in containers, AI models running natively on your host for optimal performance, and the MCP Toolkit giving your agents secure access to external tools!
The combination of n8n's containerized workflow engine and Docker Model Runner's host-native AI inference creates a powerful platform for building intelligent automation workflows while maintaining complete control over your data and infrastructure. Whether you're processing customer emails, analyzing documents, or creating content, this local AI stack provides the power and flexibility you need.
Start building your first AI-powered workflow today, and discover the potential of local AI automation with the performance benefits of native execution!