How Chatarmin ditched RAG and went memory-only

Dhravya Shah
Industry: WhatsApp Marketing
Use cases: Conversational memory, Real-time web search
Built with: Memory API, Web search

Chatarmin is a WhatsApp marketing platform for ecommerce brands. Its AI replies were bottlenecked by a heavy RAG pipeline that made responses slow and drove token costs up. Switching to Supermemory's memory layer let the team drop RAG entirely.

40s → 12s: average AI response time
40–50%: fewer tokens used
0: RAG infrastructure left to maintain
“We just ditched RAG completely and went memory only through Supermemory. Reduced avg response time from 40s → 12s, using about 40–50% fewer tokens.”
— Founder, Chatarmin

Chatarmin's platform covers flows, broadcasts, and AI-assisted conversations that turn chats into revenue. As the team leaned harder on AI replies, a familiar bottleneck showed up: a heavy RAG pipeline that was both slow and expensive.

The problem: RAG was the bottleneck

Every AI response meant embedding the query, hitting a vector store, stitching context, and only then generating. Response times crept toward 40 seconds, and token usage ballooned as the same context got reprocessed on every turn. For a product where conversations need to feel instant, that latency was a non-starter.
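The per-turn work described above can be sketched in a few lines. This is a toy illustration, not Chatarmin's actual pipeline: the documents, the bag-of-words `embed` function, and the whitespace token count are all stand-ins chosen so the example runs without any external service. It only aims to show why the prompt keeps growing when context is rebuilt on every request.

```python
import math
from collections import Counter

# Hypothetical knowledge base; a real pipeline would hold product and policy docs.
DOCS = [
    "Orders ship within two business days.",
    "Returns are accepted within thirty days of delivery.",
    "Discount codes cannot be combined.",
]

def embed(text):
    """Stand-in embedding: a bag-of-words count vector (not a real model)."""
    return Counter(text.lower().replace(".", "").replace("?", "").split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, k=2):
    """Embed the query, then rank every stored document against it."""
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(history, query):
    """Stitch retrieved chunks and the full chat history into one prompt."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nHistory:\n" + "\n".join(history) + f"\n\nUser: {query}"

def prompt_tokens(prompt):
    """Crude token estimate: whitespace-separated words."""
    return len(prompt.split())

# Simulate a short conversation: every turn repeats retrieval and resends history.
history = []
for question in ["When do orders ship?", "Can I return an item?", "Can I combine discount codes?"]:
    prompt = build_prompt(history, question)
    print(prompt_tokens(prompt), "tokens sent for:", question)
    history.append(f"User: {question}")
```

In this sketch the retrieval step runs again for every message and the stitched prompt gets longer each turn, which mirrors the latency and token growth described above.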

Going memory-only

Chatarmin replaced the entire RAG stack with Supermemory's memory layer. Instead of rebuilding context on every request, conversations carry a persistent memory that Supermemory recalls in milliseconds — plus near-realtime web search for volatile information the model shouldn't try to memorize.
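To make the contrast concrete, here is a minimal sketch of the memory-first pattern. It does not use Supermemory's API: `ConversationMemory`, `remember`, `recall`, and `build_memory_prompt` are hypothetical names for an in-process stand-in, and the keyword-overlap ranking is an assumption made only to keep the example self-contained. Web search for volatile information is left out.

```python
from collections import defaultdict

class ConversationMemory:
    """Hypothetical in-process stand-in for a hosted memory layer."""

    def __init__(self):
        self._facts = defaultdict(list)  # conversation id -> stored facts

    def remember(self, conversation_id, fact):
        """Persist a fact once so later turns do not have to rebuild it."""
        if fact not in self._facts[conversation_id]:
            self._facts[conversation_id].append(fact)

    def recall(self, conversation_id, query, limit=3):
        """Return the stored facts that share the most words with the query."""
        words = set(query.lower().split())
        scored = [
            (len(words & set(fact.lower().split())), fact)
            for fact in self._facts[conversation_id]
        ]
        # Drop facts with no overlap, then keep the best matches first.
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [fact for _, fact in scored[:limit]]

def build_memory_prompt(memory, conversation_id, query):
    """Build a prompt from a few recalled facts instead of a re-stitched context."""
    recalled = memory.recall(conversation_id, query)
    return "Memory:\n" + "\n".join(recalled) + f"\n\nUser: {query}"

memory = ConversationMemory()
memory.remember("c1", "customer prefers size M shirts")
memory.remember("c1", "customer lives in Berlin")
print(build_memory_prompt(memory, "c1", "which size shirts should I order"))
```

The design point is that facts are written once and only the relevant ones are recalled per turn, so the prompt stays small regardless of how long the conversation has run.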

The results

  • Average AI response time dropped from 40s to 12s.
  • Token usage fell by 40–50%.
  • Zero RAG infrastructure left to maintain.
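As a quick check on what these figures imply, the snippet below converts the reported response times into a relative reduction and a speedup factor. The latency numbers come from the case study; the 10,000-token baseline is a hypothetical figure used only to illustrate the 40–50% range.

```python
def relative_reduction(before, after):
    """Fraction by which a metric shrank, for example 0.25 for a 25% drop."""
    return (before - after) / before

latency_cut = relative_reduction(40, 12)  # seconds, as reported above
speedup = 40 / 12

print(f"latency reduction: {latency_cut:.0%}")
print(f"speedup: {speedup:.1f}x")

# Hypothetical baseline, chosen only to illustrate the reported 40-50% range.
baseline_tokens = 10_000
remaining = [baseline_tokens * (100 - pct) // 100 for pct in (40, 50)]
print("tokens per response after the switch:", remaining)
```

In other words, going from 40s to 12s is a 70% cut in response time, or roughly a 3.3x speedup.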

By treating memory as the primary context source instead of bolting retrieval onto every request, Chatarmin made its AI both faster and cheaper — without losing the context that makes conversations feel personal.
