Agent Memory Benchmark
Hindsight is #1
Leading every dataset on the Agent Memory Benchmark — the industry standard for evaluating memory and retrieval systems.
Hindsight scores #1 across all AMB datasets, leading every benchmark with verified results.
Choose your models
Model Leaderboard
Find the best LLM, reranker, and embedding model for your Hindsight setup — ranked by quality, speed, and cost.
Ranked models across all Hindsight operations — find the best LLM, reranker, and embedding model for your setup.
Retain
Ranked LLMs for retain() — fact extraction quality, speed, cost, and reliability.
Reflect
Ranked LLMs for the reflect() operation.
Reranker
Ranked rerankers for recall() — which reranker surfaces the most relevant facts first.
Embeddings
Ranked embedding models. The choice of embedding model affects both retain() storage and recall() retrieval quality.