Show HN: Local LLM Notepad – run a GPT-style model from a USB stick


Plug in a USB drive and run a modern LLM locally on any PC with a double-click.

No installation, no internet, no API, no cloud computing, no GPU, no admin rights required.

Local LLM Notepad is an open-source, offline plug-and-play app for running local large-language models. Drop the single bundled .exe onto a USB stick, walk up to any computer, and start chatting, brainstorming, or drafting documents.

Portable One‑File Build


🔌 Portable

Drop the one‑file EXE and your .gguf model onto a flash drive; run on any Windows PC without admin rights.

🪶 Clean UI

Two‑pane layout: type prompts below, watch token‑streamed answers above—no extra chrome.

🔍 Source-word underlining

Every word or number you wrote in your prompt is automatically bold-underlined in the model's reply. Ctrl+click an underlined word to see every prompt that contained it in a separate window. Handy for fact-checking summaries, tables, or data extractions.
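Conceptually the matching can be as simple as collecting the words and numbers from the prompt and flagging any chunk of the reply that reappears in that set. A minimal sketch in Python (illustrative only, not the app's actual code):

import re

def prompt_terms(prompt):
    """Words and numbers the user typed, lower-cased for matching."""
    return set(re.findall(r"\w+", prompt.lower()))

def mark_source_words(reply, terms):
    """Split the reply into word/non-word chunks; a chunk is flagged when it
    also appears in the prompt, which the UI would render bold-underlined."""
    chunks = re.findall(r"\w+|\W+", reply)
    return [(chunk, chunk.lower() in terms) for chunk in chunks]

prompt = "Summarize: revenue grew 14 percent to 3.2 million in 2024"
reply = "Revenue rose 14 percent in 2024, reaching 3.2 million."
marked = mark_source_words(reply, prompt_terms(prompt))
print("".join(f"__{c}__" if hit else c for c, hit in marked))

Figures such as 14 and 2024 get flagged precisely because they round-trip from the prompt, which is what makes the feature useful for spot-checking extracted numbers.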

💾 Save/Load chats

One-click JSON export keeps your conversations portable alongside the EXE.
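The exact schema isn't documented here, but a plain-JSON export is easy to picture; the field names below are illustrative guesses, not necessarily the app's format:

import json
from pathlib import Path

def save_chat(path, messages):
    """Write the conversation as plain JSON next to the EXE."""
    Path(path).write_text(json.dumps({"messages": messages}, indent=2), encoding="utf-8")

def load_chat(path):
    """Read a previously exported conversation back in."""
    return json.loads(Path(path).read_text(encoding="utf-8"))["messages"]

chat = [
    {"role": "user", "content": "Draft a short meeting summary."},
    {"role": "assistant", "content": "Here is a draft..."},
]
save_chat("chat_history.json", chat)
assert load_chat("chat_history.json") == chat

Because the export is plain JSON, any text editor or script can read the history back, on or off the USB stick.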

⚡ Llama.cpp inside

CPU‑only by default for max compatibility.
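If you run from source rather than the EXE, the equivalent CPU-only setup with llama-cpp-python (the usual Python bindings for llama.cpp; an assumption here) looks roughly like this, with example parameter values rather than the app's defaults:

from llama_cpp import Llama  # pip install llama-cpp-python

# n_gpu_layers=0 keeps every layer on the CPU for maximum compatibility.
llm = Llama(
    model_path="gemma-3-1b-it-Q4_K_M.gguf",
    n_ctx=4096,        # context window; example value
    n_gpu_layers=0,    # CPU-only, no GPU offload
    verbose=False,
)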

🎹 Hot‑keys

Ctrl + S to send, Ctrl + Z to stop, Ctrl + F to find, Ctrl + X to clear chat history, Ctrl + Mouse‑Wheel zoom, etc.
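The README does not say which GUI toolkit the app uses; purely as an illustration, with a Tkinter-style window the bindings could be wired like this (the app controller and its handler names are hypothetical):

import tkinter as tk

def bind_hotkeys(root, app):
    """Wire the hot-keys to a hypothetical `app` controller object."""
    root.bind("<Control-s>", lambda e: app.send_prompt())            # Ctrl+S: send
    root.bind("<Control-z>", lambda e: app.stop_generation())        # Ctrl+Z: stop
    root.bind("<Control-f>", lambda e: app.open_find_dialog())       # Ctrl+F: find
    root.bind("<Control-x>", lambda e: app.clear_history())          # Ctrl+X: clear chat
    root.bind("<Control-p>", lambda e: app.edit_system_prompt())     # Ctrl+P: system prompt
    root.bind("<Control-MouseWheel>", lambda e: app.zoom(e.delta))   # Ctrl+wheel: zoom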

Download Local_LLM_Notepad-portable.exe from the Releases page.

Copy the EXE and a compatible GGUF model (e.g. gemma-3-1b-it-Q4_K_M.gguf) onto your USB.

Double-click the EXE on any Windows computer. The first launch caches the model into RAM; subsequent prompts stream instantly (a streaming sketch follows these steps).

Need another model? Use File ▸ Select Model… and point to a different GGUF.
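For reference, here is roughly what that token streaming looks like with llama-cpp-python; this is a sketch under the assumption that the app uses those bindings, not its exact code:

from llama_cpp import Llama

llm = Llama(model_path="gemma-3-1b-it-Q4_K_M.gguf", n_gpu_layers=0, verbose=False)  # CPU-only load, as in the earlier sketch

def stream_reply(history):
    """Stream a chat reply token by token, printing each piece as it arrives."""
    reply = ""
    for chunk in llm.create_chat_completion(messages=history, stream=True):
        delta = chunk["choices"][0]["delta"].get("content", "")
        print(delta, end="", flush=True)
        reply += delta
    print()
    return reply

stream_reply([{"role": "user", "content": "List three agenda items for a kickoff meeting."}])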

Downloads:

Local_LLM_Notepad-portable.exe (direct download, v1.0.1): ~45 MB; contains everything needed to run an LLM on a Windows computer.

gemma-3-1b-it-Q4_K_M.gguf (Hugging Face): fast CPU model (~0.8 GB) recommended for first-time users; achieves ~20 tokens/second on an i7-10750H CPU.

Icon (optional, Notepad icon PNG): save as Icon.png next to the EXE and it will be used automatically.


Automated Source Highlighting (Ctrl + click)

Every word or number you used in the prompt is bold-underlined in the LLM's answer.

Ctrl + click any underlined word to open a side window listing every prompt that contained it, which is handy for tracing sources.
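Behind that side window is just a search over the saved conversation; a sketch of the lookup, reusing the illustrative message format from the JSON example above:

def prompts_containing(word, history):
    """Return every user prompt in the chat history that mentions `word`."""
    word = word.lower()
    return [
        m["content"]
        for m in history
        if m["role"] == "user" and word in m["content"].lower()
    ]

history = [
    {"role": "user", "content": "Revenue grew 14 percent in 2024."},
    {"role": "assistant", "content": "Noted: 14 percent growth."},
    {"role": "user", "content": "Now compare 2024 with 2023."},
]
print(prompts_containing("2024", history))  # both user prompts match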


Ctrl + S to send text to the LLM


Ctrl + Z to stop LLM generation


Ctrl + F to find in chat history


Ctrl + X to clear chat history


Ctrl + P to edit system prompt anytime


File ▸ Save/Load chat history


Build from source

1. Clone the repo

$ git clone https://github.com/runzhouye/Local_LLM_Notepad.git

$ cd Local_LLM_Notepad

2. Create env & install deps

$ python -m venv .venv && .venv\Scripts\activate

$ pip install -r requirements.txt

3. Build the one-file EXE

$ pyinstaller --onefile --noconsole --additional-hooks-dir=. main.py

4. Grab dist/Local_LLM_Notepad.exe (≈45 MB)
