
I first discovered the llm tool watching Simon Willison’s talk “Catching up on the weird world of LLMs” at North Bay Python 2023. Since then, it’s become an essential part of my development workflow. This guide covers everything you need to know to get started and use it effectively.
What is llm?
llm is a command-line tool that provides a unified interface to over 100 language models. Instead of switching between different web interfaces, you can chat with GPT-4, Claude, Gemini, or local models directly from your terminal.
Key features:
- Universal interface: One command for all models
- Automatic logging: Every conversation saved to SQLite
- Plugin ecosystem: Extend functionality with 70+ plugins
- Pipe friendly: Works naturally with Unix pipes and command chains
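That pipe-friendliness means anything on stdin becomes part of the prompt, so one-liners like this illustrative example just work:

```bash
# Summarise recent commits without leaving the shell
git log --oneline -10 | llm -s "Summarise this week's work in one sentence"
```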
Installation and Setup
There are several ways to install llm. Choose the method that works best for your setup:
```bash
# Recommended: isolated environment with uv
uv tool install llm        # Python 3.9+, installs into its own managed venv

# Quick one-off try-out (temporary env)
OPENAI_API_KEY=sk-... uvx llm "fun facts about skunks"

# Traditional options
pipx install llm
# or
brew install llm
```

Tip: uvx spins up a throw-away virtualenv each run; switch to uv tool install when you're ready for a permanent install.
Once installed, you’ll need to configure at least one AI provider. OpenAI is the most straightforward to start with:
```bash
llm keys set openai        # paste your key when prompted
llm "Ten names for a pet pelican"
```

Add more providers at any time:
```bash
llm install llm-anthropic
llm keys set anthropic
```
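After installing a provider plugin, you can list every model llm now knows about:

```bash
llm models   # includes models added by plugins
```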
Basic Usage Patterns

Here are the fundamental ways to interact with llm:
Prompting & system messages:
```bash
llm -m gpt-4o "Explain quantum computing in one tweet"
llm -s "You are an SRE" -f server.log "Find errors in this log"
```

Place flags before the prompt so shell tab completion stays happy.
Working with files: Instead of copy-pasting content, you can directly reference files or pipe content:
```bash
llm -f myscript.py "Summarise this code"
cat diff.patch | llm -s "Generate a conventional commit message"
```

Interactive chat: For longer conversations, use chat mode to maintain context across multiple exchanges:
```bash
llm chat                    # new conversation
llm chat -c                 # continue the last one
llm chat -m claude-4-opus   # pick a specific model
```
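Inside a chat session, type exit or quit to leave, and use !multi to compose a multi-line message (finish it with !end).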
Managing Conversations

All conversations are automatically saved to SQLite. Here's how to search and manage your conversation history:
```bash
llm logs -n 10                  # show the most recent entries
llm logs -q "vector search"     # search logged prompts and responses
llm -c "follow up question"     # continue the previous context
llm logs -u                     # include token usage
llm logs --json > backup.json   # export
```
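Because the log is an ordinary SQLite file, other tools can read it too. For example, with Datasette installed (pip install datasette) you can browse your full history in the browser:

```bash
datasette "$(llm logs path)"   # llm logs path prints the database location
```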
Essential Plugins

The plugin ecosystem extends llm to work with different AI providers and adds specialized functionality:
| Category | Plugins |
| --- | --- |
| Local models | llm-mlx, llm-gguf, llm-ollama |
| Remote APIs | llm-anthropic, llm-gemini, llm-mistral, llm-openrouter |
| Tools | llm-tools-quickjs, llm-tools-sqlite |
| Fragments | llm-fragments-github, llm-fragments-pdf |
| Embeddings | llm-sentence-transformers, llm-clip |
| Extras | llm-cmd, llm-jq, llm-markov |
Install any of these with llm install <plugin>; no restart is needed.
```bash
# Major AI providers
llm install llm-anthropic   # Claude models
llm install llm-gemini      # Google Gemini
llm install llm-ollama      # local models via Ollama

# Add API keys
llm keys set anthropic
llm keys set gemini

# Now use different models
llm -m claude-4-opus "Write a technical explanation"
llm -m gemini-2.0-flash "Quick calculation"
```

Running Local Models on Apple Silicon
For privacy or offline work, you can run models locally on Apple Silicon Macs using the llm-mlx plugin:
```bash
uv tool install llm --python 3.12   # sentencepiece wheels exist for 3.12
llm install llm-mlx                 # macOS 14.4+ only

# Download a 3B model (≈1.8 GB)
llm mlx download-model mlx-community/Llama-3.2-3B-Instruct-4bit
```

Model recommendations by RAM:
| RAM | Recommended model |
| --- | --- |
| 8 GB | Llama 3.2 3B |
| 16 GB | Mistral 7B |
| 32 GB | Mistral Small 24B |
Assign an alias for convenience:
```bash
llm aliases set local3b mlx-community/Llama-3.2-3B-Instruct-4bit
```
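The alias then works anywhere a model name does:

```bash
llm -m local3b "Explain the GIL in two sentences"
```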
Fragments for Long Context

Fragments let you feed huge inputs without copy-pasting:
```bash
# Install the GitHub fragments plugin first
llm install llm-fragments-github

# Summarise an entire GitHub repo
llm -f github:simonw/files-to-prompt "Key design decisions?"

# Or extract a single Python symbol with symbex and pipe it in
llm install llm-fragments-symbex
symbex some_func | llm -s "Write pytest tests"
```

Tool use: you can also let models run code as part of answering:
```bash
llm --functions 'def sq(x: int) -> int: return x * x' \
  --td "What is 431²?"   # --td shows the tool calls as they happen

# Use a plugin tool
llm install llm-tools-quickjs
llm --tool quickjs "JSON.parse('[1,2,3]').reduce((a,b)=>a+b,0)"
```

Templates and Aliases
Create shortcuts and reusable prompts to speed up common tasks:
```bash
llm aliases set fast gpt-4o-mini
llm aliases set smart gpt-4o

llm -s "Explain like I'm five" --save eLI5   # save as a template
llm -t eLI5 "Why is the sky blue?"
```
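You can review what you've saved so far:

```bash
llm templates        # list saved templates
llm templates show eLI5
```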
Set a default model so you can skip -m:

```bash
llm models default gpt-4o-mini
```

Embeddings & Semantic Search
Build searchable knowledge bases with embeddings:
```bash
# Embed a single string into a SQLite collection
# ("notes" and "greeting" are arbitrary collection/item names)
llm embed notes greeting -m clip -c "hello world" -d embeds.db

# Find the most similar stored items
llm similar notes -c "vector databases" -d embeds.db -n 5
```
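For anything bigger than a single string, llm embed-multi embeds many items at once. A minimal sketch, assuming a notes/ directory of Markdown files and OpenAI's 3-small embedding model (swap in any embedding model you have):

```bash
llm embed-multi notes --files notes '**/*.md' -m 3-small -d embeds.db --store
llm similar notes -c "vector databases" -d embeds.db
```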
Development Workflow Integration

Code analysis with symbex: The symbex tool pairs perfectly with llm for code analysis and documentation:
```bash
# Install companion tools
pip install symbex files-to-prompt

# Analyze specific functions
symbex my_function | llm -s "Explain this code"

# Generate tests
symbex my_function | llm -s "Write pytest tests"

# Document an entire codebase
files-to-prompt . -e py | llm -s "Generate API documentation"
```

Git integration: Integrate AI into your Git workflow for better commit messages and code reviews:
```bash
# Generate commit messages
git diff --cached | llm -s "Generate a conventional commit message"

# Code reviews
git diff HEAD~1 | llm -s "Review these changes for potential issues"
```
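To make the commit-message trick permanent, one option is a git alias (aic is just a made-up name):

```bash
git config --global alias.aic \
  '!git diff --cached | llm -s "Generate a conventional commit message"'
# Then: git add -p && git aic
```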
Cost Optimization

Monitor and control costs by using the right models for different tasks:
```bash
llm -u "Analyse this file" -f big.txt   # -u prints token usage
llm logs -u -n 20                       # usage for recent prompts

# Cheap model for simple tasks
llm -m gpt-4o-mini "Quick draft tweet"

# Continue conversations to reuse context instead of resending it
files-to-prompt project/ | llm "Analyze this architecture"
llm -c "Now focus on security"          # reuses the previous context
```

Automation Examples
Create scripts to automate repetitive AI tasks:
Batch summariser:
```bash
#!/usr/bin/env bash
# Summarise each URL passed as an argument
# (requires strip-tags: pip install strip-tags)
for url in "$@"; do
  curl -s "$url" | strip-tags article | \
    llm -s "3 bullet summary" > "summary_$(basename "$url").md"
done
```
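Save it as, say, summarise.sh and pass it URLs:

```bash
chmod +x summarise.sh
./summarise.sh https://example.com/post-1 https://example.com/post-2
```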
Documentation generator:

```bash
files-to-prompt src/ -e py | \
  llm -s "Generate API docs" > docs/api.md
```

Advanced Features
Structured output: Get responses in specific JSON formats for easier programmatic processing:
```bash
# Get JSON matching a specific schema
llm "Analyze sentiment" --schema '{
  "type": "object",
  "properties": {
    "sentiment": {"type": "string"},
    "confidence": {"type": "number"}
  }
}'
```
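llm also accepts a concise schema shorthand, and because the result is plain JSON on stdout it pipes straight into jq:

```bash
llm "Analyze sentiment of: the demo went well" \
  --schema 'sentiment, confidence float' | jq -r .sentiment
```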
Multi-modal capabilities: Work with images and other media types using compatible models:

```bash
# Image analysis
llm "Describe this image" -a screenshot.png -m gpt-4o

# Extract text from images
llm "Extract all text" -a document.jpg -m gpt-4o
```

Best Practices
- Create aliases for favourite models
- Save reusable templates and system prompts
- Use fragments to feed large context instead of copy-pasting
- Pick the cheapest model that solves the task
- Combine with Unix pipes for powerful automation
- Turn logging off with llm logs off if working with sensitive data
Conclusion
It’s been transformative integrating AI directly into my command-line workflow. Instead of context-switching between web interfaces, I can analyze code, generate documentation, or ask quick questions without leaving the terminal. The combination of universal model access, automatic conversation logging, and pipe-friendly design makes it an essential tool for any developer working with AI.
For more detailed information and advanced features, check out the official documentation at https://llm.datasette.io/