Turn your Readwise library into a blazing-fast, self-hosted semantic search engine – complete with nightly syncs, vector search API, Prometheus metrics, and a streaming MCP server for LLM clients.
```bash
# ❶ Clone & install
git clone https://github.com/leonardsellem/readwise-vector-db.git
cd readwise-vector-db
poetry install --sync

# ❷ Boot DB & run the API (localhost:8000)
docker compose up -d db
poetry run uvicorn readwise_vector_db.api:app --reload

# ❸ Verify
curl http://127.0.0.1:8000/health   # → {"status":"ok"}
open http://127.0.0.1:8000/docs     # interactive Swagger UI
```
Tip: Codespaces user? Click "Run → Open in Browser" after step ❷.
- Python 3.12 | Poetry ≥ 1.8 | Docker + Compose
Create `.env` (see `.env.example`) – minimal:

```dotenv
READWISE_TOKEN=xxxx   # get from readwise.io/api_token
OPENAI_API_KEY=sk-...
DATABASE_URL=postgresql+asyncpg://rw_user:rw_pass@localhost:5432/readwise
```
All variables are documented in `docs/env.md`.
```bash
docker compose up -d db          # Postgres 16 + pgvector
poetry run alembic upgrade head

# first-time full sync
poetry run rwv sync --backfill

# daily incremental (fetch since yesterday)
poetry run rwv sync --since $(date -Idate -d 'yesterday')
```
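Note that `date -d 'yesterday'` is GNU coreutils syntax; BSD/macOS `date` does not support `-d`. A portable alternative is to compute the same ISO-8601 date with a short Python snippet (a generic sketch, nothing project-specific):

```python
from datetime import date, timedelta

# ISO-8601 date for "yesterday", equivalent to `date -Idate -d 'yesterday'`
since = (date.today() - timedelta(days=1)).isoformat()
print(since)  # e.g. 2024-06-01
```

Pass the printed value to `rwv sync --since`.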
```bash
curl -X POST http://127.0.0.1:8000/search \
  -H 'Content-Type: application/json' \
  -d '{
        "q": "Large Language Models",
        "k": 10,
        "filters": {
          "source": "kindle",
          "tags": ["ai", "research"],
          "highlighted_at": ["2024-01-01", "2024-12-31"]
        }
      }'
```
```bash
poetry run python -m readwise_vector_db.mcp --host 0.0.0.0 --port 8375 &

# then from another shell
printf '{"jsonrpc":"2.0","id":1,"method":"search","params":{"q":"neural networks"}}\n' | \
  nc 127.0.0.1 8375
```
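For programmatic access, the same exchange can be scripted. The sketch below assumes newline-delimited JSON-RPC (one JSON object per line in each direction), as the `printf | nc` example suggests; the helper names are hypothetical:

```python
import json
import socket


def build_request(q: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 search request for the MCP server."""
    return {"jsonrpc": "2.0", "id": request_id,
            "method": "search", "params": {"q": q}}


def mcp_search(q: str, host: str = "127.0.0.1", port: int = 8375) -> dict:
    """Send one request and read one newline-terminated JSON reply."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall((json.dumps(build_request(q)) + "\n").encode())
        buf = b""
        while not buf.endswith(b"\n"):
            chunk = sock.recv(4096)
            if not chunk:  # server closed the connection
                break
            buf += chunk
    return json.loads(buf)


# Example: mcp_search("neural networks")
```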
```mermaid
flowchart LR
  subgraph Ingestion
    A[Readwise API] -- highlights --> B[Back-fill Job]
    C[Nightly Cron] -- since-cursor --> D[Incremental Job]
  end
  B & D --> E[Embedding Service OpenAI] --> F[Postgres and pgvector]
  F --> G[FastAPI Search API]
  G --> H[MCP Server]
  G --> I[Prometheus Metrics]
```
Full SVG available at assets/architecture.svg.
- Environment
  ```bash
  poetry install --with dev
  poetry run pre-commit install  # black, isort, ruff, mypy, markdownlint
  ```
- Run tests & coverage
  ```bash
  poetry run coverage run -m pytest && poetry run coverage report
  ```
- Performance check (`make perf`) – fails if `/search` P95 > 500 ms.
- Branching model: feature/xyz → PR → squash-merge. Use Conventional Commits (feat:, fix: …).
- Coding style: see .editorconfig and enforced linters.
See CONTRIBUTING.md for full guidelines.
- CI/CD – .github/workflows/ci.yml runs lint, type-check, tests (Py 3.11 + 3.12) and publishes images to GHCR.
- Back-ups – pg_dump weekly cron uploads compressed dump as artifact (Goal G4).
- Releasing – bump version in pyproject.toml, run make release.
Code licensed under the MIT License. Made with ❤️ by the community, powered by FastAPI, SQLModel, pgvector, OpenAI and Taskmaster-AI.