
How Chatarmin ditched RAG and went memory-only

Dhravya Shah June 10, 2026

Industry

WhatsApp Marketing

Use cases

Conversational memory, Real-time web search

Built with

Memory API, Web search

chatarmin.com ↗

Chatarmin is a WhatsApp marketing platform for ecommerce brands. Its AI replies were bottlenecked by a heavy retrieval-augmented generation (RAG) pipeline that made responses slow and sent token costs running away. Switching to Supermemory's memory layer let the team drop RAG entirely.

40s → 12s

average AI response time

40–50%

fewer tokens used

Zero

RAG infrastructure left to maintain

“We just ditched RAG completely and went memory only through Supermemory. Reduced avg response time from 40s → 12s, using about 40–50% fewer tokens.”

— Founder, Chatarmin

Chatarmin is a WhatsApp marketing platform for ecommerce brands — flows, broadcasts, and AI-assisted conversations that turn chats into revenue. As the team leaned harder on AI replies, a familiar bottleneck showed up: a heavy RAG pipeline that was both slow and expensive.

The problem: RAG was the bottleneck

Every AI response meant embedding the query, hitting a vector store, stitching context, and only then generating. Response times crept toward 40 seconds, and token usage ballooned as the same context got reprocessed on every turn. For a product where conversations need to feel instant, that latency was a non-starter.
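To make that cost pattern concrete, here is a minimal Python sketch of a per-turn retrieval pipeline of the kind described above. It is not Chatarmin's code. The toy bag-of-words `embed`, the in-memory `DOCS` list, the re-sent chat history, and the word-count token proxy are all assumptions for illustration, standing in for an embedding model, a vector store, and a tokenizer.

```python
import math
from collections import Counter

# Hypothetical knowledge base; a real deployment would use a vector store.
DOCS = [
    "Orders ship within two business days.",
    "Returns are accepted within 30 days of delivery.",
    "Discount codes cannot be combined.",
]

def embed(text: str) -> Counter:
    """Toy bag-of-words embedding (stand-in for an embedding model call)."""
    return Counter(text.lower().replace(".", "").replace("?", "").split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Step 1 and 2: embed the query, then rank every document against it."""
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(history: list[str], query: str) -> str:
    """Step 3: stitch retrieved context and history into one prompt."""
    context = "\n".join(retrieve(query))
    past = "\n".join(history)
    return f"Context:\n{context}\n\nHistory:\n{past}\n\nUser: {query}"

def rag_turn_costs(queries: list[str]) -> list[int]:
    """Run the pipeline once per turn and record a crude prompt size."""
    history: list[str] = []
    costs = []
    for q in queries:
        prompt = build_prompt(history, q)   # retrieval is redone every turn
        costs.append(len(prompt.split()))   # word count as a token proxy
        history.append(f"User: {q}")        # generation step omitted here
    return costs
```

In this sketch the retrieved context is re-sent on every turn while the history keeps accumulating, so prompt size tends to grow as the conversation goes on. Each turn also pays for the embedding and ranking work before any generation can start.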

Going memory-only

Chatarmin replaced the entire RAG stack with Supermemory's memory layer. Instead of rebuilding context on every request, each conversation carries a persistent memory that Supermemory recalls in milliseconds. Near-real-time web search covers volatile information the model shouldn't try to memorize.
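The idea of a conversation carrying its own memory can be sketched in a few lines of Python. This is a hypothetical in-process stand-in, not Supermemory's API: the `ConversationMemory` class, its `remember` and `recall` methods, and `build_memory_prompt` are names invented here for illustration, and a hosted memory layer would handle storage and recall as a service.

```python
from dataclasses import dataclass, field

@dataclass
class ConversationMemory:
    """Hypothetical stand-in for a hosted memory layer (not a real API)."""
    facts: dict[str, list[str]] = field(default_factory=dict)

    def remember(self, conversation_id: str, fact: str) -> None:
        """Store a fact once per conversation so memory stays compact."""
        bucket = self.facts.setdefault(conversation_id, [])
        if fact not in bucket:
            bucket.append(fact)

    def recall(self, conversation_id: str, limit: int = 5) -> list[str]:
        """Return the most recent facts; no embedding or ranking per turn."""
        return self.facts.get(conversation_id, [])[-limit:]

def build_memory_prompt(memory: ConversationMemory,
                        conversation_id: str, query: str) -> str:
    """Build a prompt from recalled facts instead of re-retrieved documents."""
    recalled = memory.recall(conversation_id)
    lines = ["Known about this customer:"] + [f"- {f}" for f in recalled]
    return "\n".join(lines) + f"\n\nUser: {query}"
```

A short usage example under the same assumptions:

```python
memory = ConversationMemory()
memory.remember("conv-1", "Prefers size M")
memory.remember("conv-1", "Ordered sneakers last week")
print(build_memory_prompt(memory, "conv-1", "Any update on my order?"))
```

The design point is that the prompt is assembled from a small, persistent set of facts keyed by conversation, so nothing has to be re-embedded or re-ranked when a new message arrives.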

We just ditched RAG completely and went memory only through Supermemory.

https://x.com/saasjesus/status/2024410743135694991

The results

Average AI response time dropped from 40s to 12s.
Token usage fell by 40–50%.
Zero RAG infrastructure left to maintain.

By treating memory as the primary context source instead of bolting retrieval onto every request, Chatarmin made its AI both faster and cheaper — without losing the context that makes conversations feel personal.
