Naman Bansal
Writer and editor at Supermemory. Covering AI memory, context engineering, and the future of intelligent systems for developers and builders.
10 posts
Matryoshka Representation Learning: The Ultimate Guide & How We Use It
Embeddings are the cornerstone of any retrieval system. And the larger the embeddings, the more information they can store. But large embeddings require a lot of memory, which leads to high computational costs and latency. To reduce this high cost, we can use models that produce embeddings with sm
Never Record Again: How Montra Uses Supermemory to Rethink Video Creation
Campbell Baron, the founder of Montra, has been making videos since he was twelve. By thirteen, he was already doing brand work. Today, he’s betting on a very different future for creators: a world where recording is the exception, and most videos are generated from scratch. Montra’s vision is bold
Build Your Own Perplexity in 15 Minutes With Supermemory
Supermemory has a fascinating open-source tool called OpenSearchAI. It's essentially a search assistant similar to Perplexity, but it remembers everything you've searched for and enriches future responses with that memory. I thought to myself, “This seems cool. But how complicated is it to build so
Building an AI Compliance Chatbot With Supermemory and Google Drive
Contract compliance reviews are a serious drain on time and focus. It’s a repetitive process that takes away from actual legal thinking, and the workflow is absolutely broken. Files live in different places. You’re never sure if you’re reading the latest version. And no one has time to manually tra
Knowledge Graph For RAG: Step-by-Step Tutorial
If you’ve ever built a retrieval-augmented generation (RAG) system using embeddings and vector databases, you already know the drill: you turn your data into vectors, stuff them into a store like FAISS, and let your model retrieve similar chunks during inference. And it works, until it doesn’t. W
2 Approaches For Extending Context Windows in LLMs
Transformer-based large language models have become the poster boys of modern AI, yet they still share one stark limitation: a finite context window. Once that window overflows, performance drops like a rock or the model forgets key details. This guide walks through two complementary strategies tha
LLM Costs Skyrocketing? Real Experts Weigh In
In this post, we're gonna walk through a fictional story while learning how to optimize LLMs for cost, along with the associated tradeoffs. Tuesday, 10 June, 2:14 PM PST: The billing alert hit. I was halfway through a product demo, nodding along to myself on Zoom, saying something vaguely confident about
Best Open-Source Embedding Models Benchmarked and Ranked
If your AI agent is returning the wrong context, it’s probably not your LLM, but your embedding model. Embeddings are the hidden engine behind retrieval-augmented generation (RAG) and memory systems. The better they are, the more relevant your results, and the smarter your app feels. But here’s the
3 Ways To Build LLMs With Long-Term Memory
You’ve already met our guide on implementing short-term conversational memory using LangChain, which is great for managing context inside a single chat window. But life, therapy, and enterprise apps sprawl across days, weeks, and years. If our agents are doomed to goldfish-brain amnesia, users end
How To Add Conversational Memory To LLMs Using LangChain
Chatbots that don’t remember conversations are very frustrating to work with. Users treat AI like a human and expect it to remember. LangChain recently migrated its memory features to LangGraph, a new stateful framework for building multi-step, memory-aware LLM apps. So while the docs might still say “LangChain memory