Zero config, actually
Run the binary with nothing set and you get a complete memory system:- The Supermemory graph engine, embedded — created automatically on first boot. No database to stand up, no connection strings.
- Built-in local embeddings — vectors are computed on your machine. Nothing is sent anywhere to be embedded.
- An API key, generated for you — printed on first boot, ready to paste into any SDK.
- The full Memory API —
/v3/documents,/v4/search,/v4/profile, spaces, the works.
Runs fully offline
Supermemory works with any OpenAI-compatible endpoint, which means it runs end-to-end on your machine with a local model — Ollama, LM Studio, vLLM, llama.cpp.gpt-oss-20b is a great fit:
Drop-in with your existing code
The self-hosted server speaks the same API as the hosted platform. Point any Supermemory SDK at it with a one-line change:SUPERMEMORY_API_URL=http://localhost:6767.
Self-hosted vs. the platform
Self-hosted is free, open source, and great for local development, air-gapped environments, and privacy-sensitive workloads. The hosted platform is where the full product lives:| Self-hosted | Platform | |
|---|---|---|
| Full Memory API | ✅ | ✅ |
| Hybrid semantic search | ✅ | ✅ |
| Local embeddings | ✅ | Managed |
| File ingestion (PDFs, images) | ✅ | ✅ |
| Connectors (Google Drive, Notion, Gmail, OneDrive) | — | ✅ |
| Supermemory MCP | — | ✅ |
| Memory extraction | Your model, your key | Proprietary long-horizon models — higher quality, cheaper at scale |
| Infrastructure | Your machine | Globally distributed, scales with you |
baseURL change away. Running this for a team or organization? See Local vs. Enterprise.
Next steps
Quickstart
Install, run, and store your first memory in under two minutes
Configuration
Every environment variable: LLM providers, storage, auth, tuning