Preseason

© 2026 Preseason. All rights reserved.

@betocmn
LLM Evals

Braintrust vs LangSmith

Braintrust 47% vs LangSmith 53%

Leading: LangSmith (52.8%)

Statistics

| Metric | Value |
|---|---|
| Braintrust wins | 806 |
| LangSmith wins | 902 |
| Abstains (no tool) | 105 |
| Other tool chosen | 862 |
| Decisive cases | 1708 |
| Braintrust win rate (unweighted) | 47.2% |
| 95% CI | 44.8% - 49.6% |
| Braintrust win rate (weighted) | 47.2% |
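
The unweighted win rate and its interval can be reproduced from the counts above. The following is a minimal sketch, assuming that decisive cases are Braintrust wins plus LangSmith wins and that the 95% CI is a normal-approximation (Wald) interval. The page does not say which interval method it uses; this one happens to reproduce the published bounds. The function name `win_rate_with_ci` is hypothetical, and the weighted rate is not covered because the weighting scheme is not described here.

```python
import math


def win_rate_with_ci(wins_a: int, wins_b: int, z: float = 1.96):
    """Return (rate, low, high) for tool A among decisive cases.

    Assumption: a Wald (normal-approximation) interval with z = 1.96.
    """
    decisive = wins_a + wins_b
    if decisive == 0:
        raise ValueError("no decisive cases")
    rate = wins_a / decisive
    half_width = z * math.sqrt(rate * (1 - rate) / decisive)
    return rate, rate - half_width, rate + half_width


# Counts from the Statistics table: 806 Braintrust wins, 902 LangSmith wins.
rate, low, high = win_rate_with_ci(806, 902)
print(f"{rate:.1%} (95% CI {low:.1%} - {high:.1%})")
# prints: 47.2% (95% CI 44.8% - 49.6%)
```

Abstains and "other tool" outcomes are excluded from the denominator, which is why the rate is computed over 1708 cases and not over all recorded outcomes.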

Per-model breakdown

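| Model | Tier | Braintrust | LangSmith | None | Other | A rate (Braintrust) |
|---|---|---|---|---|---|---|
| GPT 5.3 Codex | Frontier | 136 | 8 | 0 | 0 | 94% |
| Claude Haiku 4.5 | Small | 113 | 14 | 1 | 9 | 89% |
| Claude Sonnet 4.6 | Frontier | 65 | 58 | 1 | 20 | 53% |
| DeepSeek R1 0528 | Frontier | 0 | 119 | 7 | 18 | 0% |
| Gemini 2.5 Pro | Frontier | 1 | 116 | 9 | 18 | 1% |
| Claude Opus 4.6 | Frontier | 113 | 1 | 0 | 18 | 99% |
| GPT 5.4 | Frontier | 92 | 18 | 0 | 22 | 84% |
| Kimi K2.5 | Frontier | 109 | 0 | 3 | 7 | 100% |
| GPT 5.4 Mini | Mid | 4 | 102 | 3 | 34 | 4% |
| GLM 5 Turbo | Frontier | 84 | 18 | 19 | 11 | 82% |
| Mistral Small 4 | Mid | 0 | 101 | 2 | 33 | 0% |
| Qwen3 Coder Next | Mid | 0 | 95 | 3 | 43 | 0% |
| MiniMax M2.7 | Frontier | 31 | 62 | 5 | 31 | 33% |
| DeepSeek V3.2 | Mid | 0 | 80 | 22 | 26 | 0% |
| MiMo V2 Pro | Frontier | 1 | 62 | 8 | 61 | 2% |
| Llama 4 Maverick | Frontier | 0 | 22 | 2 | 113 | 0% |
| GPT 5.5 | Frontier | 12 | 0 | 0 | 0 | 100% |
| MiniMax M3 | Frontier | 10 | 0 | 1 | 1 | 100% |
| Gemini 3.5 Flash | Small | 9 | 0 | 1 | 2 | 100% |
| GLM 5.2 | Frontier | 9 | 0 | 0 | 3 | 100% |
| DeepSeek V4 Pro | Frontier | 4 | 5 | 1 | 2 | 44% |
| MiMo V2.5 Pro | Frontier | 5 | 3 | 0 | 4 | 63% |
| Kimi K2.7 Code | Frontier | 6 | 0 | 1 | 5 | 100% |
| Claude Opus 4.8 | Frontier | 1 | 5 | 1 | 5 | 17% |
| Devstral 2 2512 | Mid | 0 | 6 | 6 | 123 | 0% |
| DeepSeek V4 Flash | Mid | 1 | 4 | 1 | 6 | 20% |
| Gemini 2.5 Flash | Small | 0 | 3 | 1 | 123 | 0% |
| Llama 4 Scout | Small | 0 | 0 | 7 | 124 | n/a |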
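
The "A rate" column appears to be Braintrust's share of decisive cases for each row, that is Braintrust / (Braintrust + LangSmith), shown as n/a when a model never picked either tool. The sketch below assumes that definition and assumes halves round up (5 of 8 is listed as 63%); both are inferred from the rows above, not stated by the page. The helper name `a_rate` is hypothetical.

```python
def a_rate(braintrust: int, langsmith: int) -> str:
    """Braintrust's share of decisive cases as a whole percent.

    Assumptions: "None" and "Other" outcomes are ignored, and halves
    round up. Integer arithmetic avoids floating-point rounding surprises.
    """
    decisive = braintrust + langsmith
    if decisive == 0:
        return "n/a"  # no decisive cases, so the rate is undefined
    percent = (200 * braintrust + decisive) // (2 * decisive)
    return f"{percent}%"


# (Braintrust, LangSmith) counts taken from four rows of the table.
rows = {
    "GPT 5.3 Codex": (136, 8),
    "Claude Sonnet 4.6": (65, 58),
    "MiMo V2.5 Pro": (5, 3),
    "Llama 4 Scout": (0, 0),
}
for model, (bt, ls) in rows.items():
    print(f"{model}: {a_rate(bt, ls)}")
```

Rows with very few decisive cases, such as the models with about a dozen runs, produce rates that swing widely on a single outcome, so a 100% there carries much less weight than a 94% over 144 cases.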

Per-prompt breakdown

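| Prompt | Tier | Braintrust | LangSmith | None | Other | A rate (Braintrust) |
|---|---|---|---|---|---|---|
| ai-revenue-ops-copilot | Intermediate | 130 | 189 | 4 | 101 | 41% |
| ai-support-agent-platform | Intermediate | 92 | 207 | 5 | 124 | 31% |
| ai-revenue-ops-copilot | Advanced | 164 | 128 | 2 | 125 | 56% |
| ai-revenue-ops-copilot | Beginner | 151 | 126 | 10 | 143 | 55% |
| ai-support-agent-platform | Advanced | 130 | 142 | 5 | 153 | 48% |
| ai-support-agent-platform | Beginner | 100 | 87 | 66 | 178 | 53% |
| ai-engineering-workflow | Advanced | 12 | 2 | 0 | 4 | 86% |
| ai-agent-application | Intermediate | 6 | 5 | 0 | 9 | 55% |
| ai-agent-application | Beginner | 3 | 8 | 5 | 4 | 27% |
| ai-engineering-workflow | Intermediate | 8 | 2 | 0 | 7 | 80% |
| ai-agent-application | Advanced | 7 | 3 | 0 | 8 | 70% |
| ai-engineering-workflow | Beginner | 3 | 3 | 8 | 6 | 50% |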
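
As a consistency check, the per-prompt counts can be summed and compared with the headline statistics (806 Braintrust wins, 902 LangSmith wins, 105 abstains, 862 other). This is a small sketch with the counts copied from the per-prompt rows; the variable names are illustrative only.

```python
# (Braintrust, LangSmith, None, Other) for each per-prompt row, in table order.
rows = [
    (130, 189, 4, 101),
    (92, 207, 5, 124),
    (164, 128, 2, 125),
    (151, 126, 10, 143),
    (130, 142, 5, 153),
    (100, 87, 66, 178),
    (12, 2, 0, 4),
    (6, 5, 0, 9),
    (3, 8, 5, 4),
    (8, 2, 0, 7),
    (7, 3, 0, 8),
    (3, 3, 8, 6),
]

# Column-wise sums across all prompts.
totals = tuple(sum(column) for column in zip(*rows))
print(totals)  # (806, 902, 105, 862)
```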