Agent Memory Benchmark

Hindsight is #1

Leading every dataset on the Agent Memory Benchmark — the industry standard for evaluating memory and retrieval systems.

Agent Memory Benchmark

Hindsight scores across all Agent Memory Benchmark (AMB) datasets, leading every benchmark with verified results.

Choose your models

Find the best LLM, reranker, and embedding model for your Hindsight setup — ranked by quality, speed, and cost.

Ranked models across all Hindsight operations — find the best LLM, reranker, and embedding model for your setup.

Ranked LLMs for retain() — fact extraction quality, speed, cost, and reliability.

Ranked LLMs for the reflect() operation.

Ranked rerankers for recall() — which reranker surfaces the most relevant facts first.

Ranked embedding models. The embedding model affects both retain() storage and recall() retrieval quality.