Preseason

© 2026 Preseason. All rights reserved.

@betocmn
LLM Evals

LangSmith vs Braintrust

LangSmith 53% vs Braintrust 47%

Leading: LangSmith (52.8%)

Statistics

| Metric | Value |
| --- | --- |
| LangSmith wins | 902 |
| Braintrust wins | 806 |
| Abstains (no tool) | 105 |
| Other tool chosen | 862 |
| Decisive cases | 1708 |
| LangSmith win rate (unweighted) | 52.8% |
| 95% CI | 50.4% - 55.2% |
| LangSmith win rate (weighted) | 52.8% |
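
The unweighted win rate and its interval can be reproduced from the two win counts. The sketch below assumes the rate is LangSmith wins divided by decisive cases and that the 95% CI is a normal-approximation (Wald) interval; the page does not state which interval method it uses, and `win_rate_with_ci` is a hypothetical helper name. The weighted rate is not reproduced because the weights are not given here.

```python
import math


def win_rate_with_ci(wins_a: int, wins_b: int, z: float = 1.96):
    """Return (rate, low, high) for A's share of decisive cases.

    Assumption: normal-approximation (Wald) interval; the source page
    does not say which interval method it uses.
    """
    decisive = wins_a + wins_b
    if decisive == 0:
        raise ValueError("no decisive cases")
    rate = wins_a / decisive
    half_width = z * math.sqrt(rate * (1 - rate) / decisive)
    return rate, rate - half_width, rate + half_width


# Counts taken from the Statistics table: 902 LangSmith wins, 806 Braintrust wins.
rate, low, high = win_rate_with_ci(902, 806)
print(f"{rate:.1%} ({low:.1%} - {high:.1%})")  # 52.8% (50.4% - 55.2%)
```

Abstains and "other tool" picks are left out of the denominator, which matches the 1708 decisive cases reported above (902 + 806).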

Per-model breakdown

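| Model | Tier | LangSmith | Braintrust | None | Other | A rate (LangSmith) |
| --- | --- | --- | --- | --- | --- | --- |
| GPT 5.3 Codex | Frontier | 8 | 136 | 0 | 0 | 6% |
| Claude Haiku 4.5 | Small | 14 | 113 | 1 | 9 | 11% |
| Claude Sonnet 4.6 | Frontier | 58 | 65 | 1 | 20 | 47% |
| DeepSeek R1 0528 | Frontier | 119 | 0 | 7 | 18 | 100% |
| Gemini 2.5 Pro | Frontier | 116 | 1 | 9 | 18 | 99% |
| Claude Opus 4.6 | Frontier | 1 | 113 | 0 | 18 | 1% |
| GPT 5.4 | Frontier | 18 | 92 | 0 | 22 | 16% |
| Kimi K2.5 | Frontier | 0 | 109 | 3 | 7 | 0% |
| GPT 5.4 Mini | Mid | 102 | 4 | 3 | 34 | 96% |
| GLM 5 Turbo | Frontier | 18 | 84 | 19 | 11 | 18% |
| Mistral Small 4 | Mid | 101 | 0 | 2 | 33 | 100% |
| Qwen3 Coder Next | Mid | 95 | 0 | 3 | 43 | 100% |
| MiniMax M2.7 | Frontier | 62 | 31 | 5 | 31 | 67% |
| DeepSeek V3.2 | Mid | 80 | 0 | 22 | 26 | 100% |
| MiMo V2 Pro | Frontier | 62 | 1 | 8 | 61 | 98% |
| Llama 4 Maverick | Frontier | 22 | 0 | 2 | 113 | 100% |
| GPT 5.5 | Frontier | 0 | 12 | 0 | 0 | 0% |
| MiniMax M3 | Frontier | 0 | 10 | 1 | 1 | 0% |
| DeepSeek V4 Pro | Frontier | 5 | 4 | 1 | 2 | 56% |
| Gemini 3.5 Flash | Small | 0 | 9 | 1 | 2 | 0% |
| GLM 5.2 | Frontier | 0 | 9 | 0 | 3 | 0% |
| MiMo V2.5 Pro | Frontier | 3 | 5 | 0 | 4 | 38% |
| Devstral 2 2512 | Mid | 6 | 0 | 6 | 123 | 100% |
| Claude Opus 4.8 | Frontier | 5 | 1 | 1 | 5 | 83% |
| Kimi K2.7 Code | Frontier | 0 | 6 | 1 | 5 | 0% |
| DeepSeek V4 Flash | Mid | 4 | 1 | 1 | 6 | 80% |
| Gemini 2.5 Flash | Small | 3 | 0 | 1 | 123 | 100% |
| Llama 4 Scout | Small | 0 | 0 | 7 | 124 | n/a |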
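
The "A rate" column appears to be LangSmith's share of the decisive picks in each row, with "None" and "Other" excluded and "n/a" shown when a model never picked either tool. A minimal sketch under that assumption follows; `a_rate` and `format_rate` are hypothetical helper names, not part of the site.

```python
from typing import Optional


def a_rate(langsmith: int, braintrust: int) -> Optional[float]:
    """LangSmith's share of decisive picks, or None if there are none.

    Assumption: "None" and "Other" picks are excluded from the denominator.
    """
    decisive = langsmith + braintrust
    if decisive == 0:
        return None
    return langsmith / decisive


def format_rate(rate: Optional[float]) -> str:
    """Render a rate as a whole percentage, or "n/a" when undefined."""
    return "n/a" if rate is None else f"{rate:.0%}"


print(format_rate(a_rate(58, 65)))  # Claude Sonnet 4.6 -> 47%
print(format_rate(a_rate(5, 4)))    # DeepSeek V4 Pro   -> 56%
print(format_rate(a_rate(0, 0)))    # Llama 4 Scout     -> n/a
```

Rows with very few decisive picks (for example 5 vs 1) produce extreme percentages from little data, so the raw counts are worth reading alongside the rate.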

Per-prompt breakdown

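| Prompt | Tier | LangSmith | Braintrust | None | Other | A rate (LangSmith) |
| --- | --- | --- | --- | --- | --- | --- |
| ai-revenue-ops-copilot | Intermediate | 189 | 130 | 4 | 101 | 59% |
| ai-support-agent-platform | Intermediate | 207 | 92 | 5 | 124 | 69% |
| ai-revenue-ops-copilot | Advanced | 128 | 164 | 2 | 125 | 44% |
| ai-revenue-ops-copilot | Beginner | 126 | 151 | 10 | 143 | 45% |
| ai-support-agent-platform | Advanced | 142 | 130 | 5 | 153 | 52% |
| ai-support-agent-platform | Beginner | 87 | 100 | 66 | 178 | 47% |
| ai-engineering-workflow | Advanced | 2 | 12 | 0 | 4 | 14% |
| ai-agent-application | Beginner | 8 | 3 | 5 | 4 | 73% |
| ai-agent-application | Intermediate | 5 | 6 | 0 | 9 | 45% |
| ai-agent-application | Advanced | 3 | 7 | 0 | 8 | 30% |
| ai-engineering-workflow | Intermediate | 2 | 8 | 0 | 7 | 20% |
| ai-engineering-workflow | Beginner | 3 | 3 | 8 | 6 | 50% |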
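
As a consistency check, the per-prompt rows above can be summed column by column and compared with the headline statistics. The sketch below copies the counts from the per-prompt breakdown; the tuple layout is an illustrative choice, not the site's data format.

```python
# (prompt, tier, langsmith, braintrust, none, other), copied from the
# per-prompt breakdown. The tuple layout is an assumption for illustration.
rows = [
    ("ai-revenue-ops-copilot", "Intermediate", 189, 130, 4, 101),
    ("ai-support-agent-platform", "Intermediate", 207, 92, 5, 124),
    ("ai-revenue-ops-copilot", "Advanced", 128, 164, 2, 125),
    ("ai-revenue-ops-copilot", "Beginner", 126, 151, 10, 143),
    ("ai-support-agent-platform", "Advanced", 142, 130, 5, 153),
    ("ai-support-agent-platform", "Beginner", 87, 100, 66, 178),
    ("ai-engineering-workflow", "Advanced", 2, 12, 0, 4),
    ("ai-agent-application", "Beginner", 8, 3, 5, 4),
    ("ai-agent-application", "Intermediate", 5, 6, 0, 9),
    ("ai-agent-application", "Advanced", 3, 7, 0, 8),
    ("ai-engineering-workflow", "Intermediate", 2, 8, 0, 7),
    ("ai-engineering-workflow", "Beginner", 3, 3, 8, 6),
]

# Sum each count column: LangSmith, Braintrust, None, Other.
totals = tuple(sum(row[i] for row in rows) for i in range(2, 6))
decisive = totals[0] + totals[1]

print(totals)    # (902, 806, 105, 862)
print(decisive)  # 1708
```

The column sums match the Statistics section (902 LangSmith wins, 806 Braintrust wins, 105 abstains, 862 other-tool picks, 1708 decisive cases).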