2025 feels like a breakout year for local models. Open‑weight releases are getting genuinely useful: from Google's Gemma to the recent *gpt‑oss* drops, the gap with frontier commercial models keeps narrowing for many day‑to‑day tasks.
Yet outside of this community, local LLMs still don’t seem mainstream. My hunch: *great UX and durable apps are still thin on the ground.*
If you are using local models, I’d love to learn from your setup and workflows. Please be specific so others can calibrate:
- **Model(s) & size:** exact name/version and quantization (e.g., Q4_K_M).
- **Runtime/tooling:** e.g., Ollama, LM Studio, etc.
- **Hardware:** CPU/GPU details (VRAM/RAM), OS. If it's a laptop, edge device, or home server, mention that.
- **Workflows where local wins:** privacy/offline use, data security, coding, bulk data extraction, RAG over your files, agents/tools, screen‑capture processing. What's actually sticking for you?
- **Pain points:** quality on complex reasoning, context management, tool reliability, long‑form coherence, energy/thermals, memory, Windows/Mac/Linux quirks.
- **Favorite app today:** the one you actually open daily (and why).
- **Wishlist:** the app you wish existed.
- **Gotchas/tips:** config flags, quant choices, prompt patterns, or evaluation snippets that made a real difference (a minimal example follows this list).
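
For concreteness, here's the kind of evaluation snippet I mean: a minimal smoke-test sketch, assuming an Ollama server on its default port (11434) and using an example model tag — swap in whatever you actually run:

```python
# Crude sanity check against a local Ollama server via its /api/generate endpoint.
# Assumes Ollama is running on the default port; the model tag below is just an example.
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "llama3.1:8b-instruct-q4_K_M"  # substitute your own model tag

# A few prompt/expected-substring pairs; enough to catch a badly broken quant or config.
CASES = [
    ("What is 12 * 9? Answer with just the number.", "108"),
    ("Name the capital of France in one word.", "Paris"),
]

for prompt, expected in CASES:
    resp = requests.post(
        OLLAMA_URL,
        json={"model": MODEL, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    answer = resp.json()["response"]
    status = "PASS" if expected.lower() in answer.lower() else "FAIL"
    print(f"[{status}] {prompt!r} -> {answer.strip()[:80]!r}")
```

Nothing fancy, but rerunning a handful of cases like this after every quant or flag change is what I'm curious whether others do too.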
If you’re not using local models yet, what’s the blocker—setup friction, quality, missing integrations, battery/thermals, or just “cloud is easier”? Links are welcome, but what helps most is concrete numbers and anecdotes from real use.
A simple reply template (optional):
```
Model(s):
Runtime/tooling:
Hardware:
Use cases that stick:
Pain points:
Favorite app:
Wishlist:
```
Also curious how people think about privacy and security in practice. Thanks!
