What is MemScore?

MemScore is a composite metric that captures three dimensions of memory provider performance in a single line:
accuracy% / latencyMs / contextTok
For example:
85% / 120ms / 1500tok
This tells you the provider achieved 85% accuracy with an average search latency of 120ms, while sending an average of 1,500 tokens of context to the answering model per question.
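The display string maps directly onto the three components. A minimal sketch of assembling it (`formatMemScore` and the interface name are illustrative, not part of MemoryBench):

```typescript
interface MemScoreComponents {
  quality: number;       // accuracy percentage
  latencyMs: number;     // average search latency in ms
  contextTokens: number; // average context tokens per question
}

// Hypothetical helper: renders the one-line display string
// from the structured components.
function formatMemScore(c: MemScoreComponents): string {
  return `${Math.round(c.quality)}% / ${Math.round(c.latencyMs)}ms / ${Math.round(c.contextTokens)}tok`;
}

console.log(formatMemScore({ quality: 85, latencyMs: 120, contextTokens: 1500 }));
// → 85% / 120ms / 1500tok
```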

Components

| Component | What it measures | Source |
|---|---|---|
| Quality | Answer accuracy as a percentage | `(correct / total) * 100` from judge evaluations |
| Latency | Average search response time in milliseconds | Mean of all search-phase durations |
| Tokens | Average context tokens sent to the answering model | Client-side token count of retrieved context per question |
MemScore is not a single number — it’s a triple. This is intentional. Collapsing quality, latency, and cost into one score hides important tradeoffs. A provider with 90% accuracy at 5,000 tokens is very different from one with 90% accuracy at 500 tokens.
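The aggregation implied by the component table can be sketched as follows. This is a hypothetical reconstruction, assuming per-question results with a judge verdict, a search-phase duration, and a context token count (the type and function names are illustrative):

```typescript
interface QuestionResult {
  correct: boolean;        // judge verdict for this question
  searchLatencyMs: number; // duration of the search phase
  contextTokens: number;   // tokens in the retrieved context
}

// Hypothetical aggregation mirroring the component table:
// quality is (correct / total) * 100; latency and tokens are plain means.
function computeMemScore(results: QuestionResult[]) {
  const n = results.length;
  const mean = (xs: number[]) => xs.reduce((a, b) => a + b, 0) / n;
  return {
    quality: (results.filter(r => r.correct).length / n) * 100,
    latencyMs: mean(results.map(r => r.searchLatencyMs)),
    contextTokens: mean(results.map(r => r.contextTokens)),
  };
}
```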

How token counting works

MemoryBench counts tokens client-side using provider-specific tokenizers:
| Model provider | Tokenizer | Method |
|---|---|---|
| OpenAI | js-tiktoken | Exact count using `o200k_base` or `cl100k_base` encoding |
| Anthropic | @anthropic-ai/tokenizer | Exact count using Anthropic's tokenizer |
| Google | Approximation | `Math.ceil(text.length / 4)` |
Three token values are tracked per question:
  • promptTokens — Total tokens in the full prompt (instructions + context + question)
  • basePromptTokens — Tokens in the prompt without any retrieved context
  • contextTokens — Tokens in just the retrieved context string
MemScore uses contextTokens because it isolates what the memory provider actually contributed, rather than counting fixed prompt overhead.
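The relationship between the three values can be sketched using the documented character-based approximation (the Google fallback); the exact paths would swap in js-tiktoken or @anthropic-ai/tokenizer for the counting function. `countQuestionTokens` is a hypothetical name for illustration:

```typescript
// Character-based approximation, as documented for the Google path.
const approxTokens = (text: string): number => Math.ceil(text.length / 4);

// Hypothetical helper showing how the three per-question values relate.
function countQuestionTokens(basePrompt: string, context: string) {
  return {
    promptTokens: approxTokens(basePrompt + context), // full prompt
    basePromptTokens: approxTokens(basePrompt),       // prompt without retrieved context
    contextTokens: approxTokens(context),             // what MemScore reports
  };
}
```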

Where MemScore appears

CLI output

After a benchmark run completes, MemScore is printed in the summary:
```text
SUMMARY:
  Total Questions: 50
  Correct: 43
  Accuracy: 86.00%

  Quality:  86%
  Latency:  145ms (avg)
  Tokens:   1,823 (avg context sent to answering model)

  MemScore: 86% / 145ms / 1823tok
```

Web UI

The MemScore card appears at the top of the run overview page. Per-question token counts are shown next to each model answer in both the question list and detail views.

Report JSON

The report.json file includes both a display string and structured components:
```json
{
  "memscore": "86% / 145ms / 1823tok",
  "memscoreComponents": {
    "quality": 86,
    "latencyMs": 145,
    "contextTokens": 1823
  },
  "tokens": {
    "totalTokens": 142500,
    "basePromptTokens": 21000,
    "contextTokens": 91150,
    "avgTokensPerQuestion": 2850,
    "avgBasePromptTokens": 420,
    "avgContextTokens": 1823
  }
}
```
Use memscoreComponents for programmatic comparisons — it avoids parsing the display string.
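For example, a programmatic comparison over `memscoreComponents` might look like this. The tie-break priority (accuracy first, then token efficiency, then latency) is an assumption for illustration, and `pickBetter` is a hypothetical helper:

```typescript
type Components = { quality: number; latencyMs: number; contextTokens: number };

// Hypothetical tie-break rule: prefer higher accuracy, then fewer
// context tokens, then lower latency. The priority order is an assumption.
function pickBetter(a: Components, b: Components): Components {
  if (a.quality !== b.quality) return a.quality > b.quality ? a : b;
  if (a.contextTokens !== b.contextTokens) return a.contextTokens < b.contextTokens ? a : b;
  return a.latencyMs <= b.latencyMs ? a : b;
}
```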

Comparing providers

MemScore is most useful when comparing providers on the same benchmark:
```bash
bun run src/index.ts compare -p supermemory,mem0,zep -b locomo -j gpt-4o
```
Each provider’s report will include its own MemScore, making it easy to see tradeoffs at a glance:
| Provider | MemScore |
|---|---|
| Provider A | 88% / 145ms / 1200tok |
| Provider B | 82% / 80ms / 2400tok |
| Provider C | 85% / 110ms / 1800tok |
In this example, Provider A has the highest accuracy but the slowest search. Provider B is the fastest but sends the most context without achieving the best accuracy — suggesting its retrieval may be less precise. Provider C lands in the middle on all three axes. There’s no single “winner” — the right choice depends on whether you prioritize quality, speed, or token efficiency.
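The "no single winner" observation can be made precise: with the numbers above, no provider beats another on all three axes at once. A minimal Pareto-front sketch (the helper names are illustrative, not part of MemoryBench):

```typescript
type Entry = { name: string; quality: number; latencyMs: number; contextTokens: number };

// A provider is dominated only if another matches or beats it on every
// axis (higher quality, lower latency, fewer tokens are all better)
// and strictly beats it on at least one.
const dominates = (a: Entry, b: Entry): boolean =>
  a.quality >= b.quality &&
  a.latencyMs <= b.latencyMs &&
  a.contextTokens <= b.contextTokens &&
  (a.quality > b.quality || a.latencyMs < b.latencyMs || a.contextTokens < b.contextTokens);

const paretoFront = (entries: Entry[]): Entry[] =>
  entries.filter(e => !entries.some(other => dominates(other, e)));
```

Running this on the three example providers keeps all of them, which is exactly why the choice comes down to your own priorities.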

Backward compatibility

Runs from before MemScore was added will still work. If token data is not present in the checkpoint, the memscore, memscoreComponents, and tokens fields will be undefined in the report. The CLI and web UI gracefully skip the MemScore display when data is unavailable.
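The described skip-when-missing behavior amounts to a simple guard on the optional field. A hypothetical sketch (`summaryLines` is an illustrative name, not MemoryBench's actual renderer):

```typescript
type Report = { memscore?: string }; // undefined for pre-MemScore runs

// Hypothetical rendering guard mirroring the described behavior:
// append the MemScore line only when the field is present.
function summaryLines(report: Report): string[] {
  const lines = ["SUMMARY:"];
  if (report.memscore !== undefined) {
    lines.push(`  MemScore: ${report.memscore}`);
  }
  return lines;
}
```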