Preseason

© 2026 Preseason. All rights reserved.

@betocmn
LLM Evals

HumanFirst vs Humanloop

HumanFirst 48% vs Humanloop 52%

Leading: Humanloop (51.6%)

Statistics

| Metric | Value |
| --- | --- |
| HumanFirst wins | 44 |
| Humanloop wins | 47 |
| Abstains (no tool) | 105 |
| Other tool chosen | 2479 |
| Decisive cases | 91 |
| HumanFirst win rate (unweighted) | 48.4% |
| 95% CI | 38.4% - 58.5% |
| HumanFirst win rate (weighted) | 48.4% |
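
The unweighted win rate and its interval can be recomputed from the win counts alone. The page does not say which interval method it uses; as a sketch under that assumption, a Wilson score interval with z = 1.96 reproduces the published bounds, whereas a plain normal approximation gives slightly different ones. The function name `wilson_interval` is illustrative, not part of the site.

```python
import math
from typing import Tuple


def wilson_interval(wins: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion wins / n."""
    if n <= 0:
        raise ValueError("need at least one decisive case")
    p = wins / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Counts taken from the Statistics table.
humanfirst_wins = 44
humanloop_wins = 47

# Decisive cases exclude abstains and "other tool" picks.
decisive = humanfirst_wins + humanloop_wins

win_rate = humanfirst_wins / decisive
low, high = wilson_interval(humanfirst_wins, decisive)

print(f"decisive cases: {decisive}")              # decisive cases: 91
print(f"HumanFirst win rate: {win_rate:.1%}")     # HumanFirst win rate: 48.4%
print(f"95% CI: {low:.1%} - {high:.1%}")          # 95% CI: 38.4% - 58.5%
```

The interval spans 50%, so the 47 to 44 lead for Humanloop is not distinguishable from an even split at this sample size.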

Per-model breakdown

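| Model | Tier | HumanFirst | Humanloop | None | Other | A rate (HumanFirst) |
| --- | --- | --- | --- | --- | --- | --- |
| Devstral 2 2512 | Mid | 44 | 19 | 6 | 66 | 70% |
| Gemini 2.5 Flash | Small | 0 | 14 | 1 | 112 | 0% |
| Claude Haiku 4.5 | Small | 0 | 4 | 1 | 132 | 0% |
| DeepSeek R1 0528 | Frontier | 0 | 4 | 7 | 133 | 0% |
| DeepSeek V3.2 | Mid | 0 | 4 | 22 | 102 | 0% |
| MiMo V2 Pro | Frontier | 0 | 2 | 8 | 122 | 0% |
| Claude Opus 4.6 | Frontier | 0 | 0 | 0 | 132 | n/a |
| Claude Opus 4.8 | Frontier | 0 | 0 | 1 | 11 | n/a |
| Claude Sonnet 4.6 | Frontier | 0 | 0 | 1 | 143 | n/a |
| DeepSeek V4 Flash | Mid | 0 | 0 | 1 | 11 | n/a |
| DeepSeek V4 Pro | Frontier | 0 | 0 | 1 | 11 | n/a |
| Gemini 2.5 Pro | Frontier | 0 | 0 | 9 | 135 | n/a |
| Gemini 3.5 Flash | Small | 0 | 0 | 1 | 11 | n/a |
| GLM 5 Turbo | Frontier | 0 | 0 | 19 | 113 | n/a |
| GLM 5.2 | Frontier | 0 | 0 | 0 | 12 | n/a |
| GPT 5.3 Codex | Frontier | 0 | 0 | 0 | 144 | n/a |
| GPT 5.4 | Frontier | 0 | 0 | 0 | 132 | n/a |
| GPT 5.4 Mini | Mid | 0 | 0 | 3 | 140 | n/a |
| GPT 5.5 | Frontier | 0 | 0 | 0 | 12 | n/a |
| Kimi K2.5 | Frontier | 0 | 0 | 3 | 116 | n/a |
| Kimi K2.7 Code | Frontier | 0 | 0 | 1 | 11 | n/a |
| Llama 4 Maverick | Frontier | 0 | 0 | 2 | 135 | n/a |
| Llama 4 Scout | Small | 0 | 0 | 7 | 124 | n/a |
| MiMo V2.5 Pro | Frontier | 0 | 0 | 0 | 12 | n/a |
| MiniMax M2.7 | Frontier | 0 | 0 | 5 | 124 | n/a |
| MiniMax M3 | Frontier | 0 | 0 | 1 | 11 | n/a |
| Mistral Small 4 | Mid | 0 | 0 | 2 | 134 | n/a |
| Qwen3 Coder Next | Mid | 0 | 0 | 3 | 138 | n/a |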
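
The A rate column is not defined on this page. Judging from the rows, it appears to be HumanFirst wins divided by decisive cases (HumanFirst plus Humanloop), shown as n/a when a model never picked either tool. A minimal sketch under that assumption, with hypothetical helper names `a_rate` and `format_rate`:

```python
from typing import Optional


def a_rate(a_wins: int, b_wins: int) -> Optional[float]:
    """Share of decisive cases won by tool A, or None if there were none."""
    decisive = a_wins + b_wins
    if decisive == 0:
        return None  # the page renders this case as "n/a"
    return a_wins / decisive


def format_rate(rate: Optional[float]) -> str:
    """Render a rate the way the table does: whole percent or n/a."""
    return "n/a" if rate is None else f"{rate:.0%}"


# (HumanFirst, Humanloop) counts for three rows of the per-model table.
rows = {
    "Devstral 2 2512": (44, 19),
    "Gemini 2.5 Flash": (0, 14),
    "Claude Opus 4.6": (0, 0),
}

for model, (humanfirst, humanloop) in rows.items():
    print(f"{model}: {format_rate(a_rate(humanfirst, humanloop))}")
# Devstral 2 2512: 70%
# Gemini 2.5 Flash: 0%
# Claude Opus 4.6: n/a
```

Read this way, every HumanFirst win in the table comes from a single model, Devstral 2 2512, so the headline win rate reflects that one model's preference rather than a broad split across models.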

Per-prompt breakdown

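| Prompt | Tier | HumanFirst | Humanloop | None | Other | A rate (HumanFirst) |
| --- | --- | --- | --- | --- | --- | --- |
| ai-revenue-ops-copilot | Beginner | 17 | 11 | 10 | 392 | 61% |
| ai-revenue-ops-copilot | Intermediate | 1 | 15 | 4 | 404 | 6% |
| ai-support-agent-platform | Beginner | 11 | 3 | 66 | 351 | 79% |
| ai-revenue-ops-copilot | Advanced | 6 | 8 | 2 | 403 | 43% |
| ai-support-agent-platform | Intermediate | 4 | 6 | 5 | 413 | 40% |
| ai-support-agent-platform | Advanced | 4 | 3 | 5 | 418 | 57% |
| ai-engineering-workflow | Intermediate | 1 | 0 | 0 | 16 | 100% |
| ai-agent-application | Intermediate | 0 | 1 | 0 | 19 | 0% |
| ai-agent-application | Advanced | 0 | 0 | 0 | 18 | n/a |
| ai-agent-application | Beginner | 0 | 0 | 5 | 15 | n/a |
| ai-engineering-workflow | Advanced | 0 | 0 | 0 | 18 | n/a |
| ai-engineering-workflow | Beginner | 0 | 0 | 8 | 12 | n/a |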
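
As a consistency check, the per-prompt rows can be summed and compared with the Statistics totals. The sketch below assumes each row reads as (HumanFirst, Humanloop, None, Other); the variable names are illustrative.

```python
# Per-prompt rows: (prompt, tier, HumanFirst, Humanloop, None, Other).
prompt_rows = [
    ("ai-revenue-ops-copilot", "Beginner", 17, 11, 10, 392),
    ("ai-revenue-ops-copilot", "Intermediate", 1, 15, 4, 404),
    ("ai-support-agent-platform", "Beginner", 11, 3, 66, 351),
    ("ai-revenue-ops-copilot", "Advanced", 6, 8, 2, 403),
    ("ai-support-agent-platform", "Intermediate", 4, 6, 5, 413),
    ("ai-support-agent-platform", "Advanced", 4, 3, 5, 418),
    ("ai-engineering-workflow", "Intermediate", 1, 0, 0, 16),
    ("ai-agent-application", "Intermediate", 0, 1, 0, 19),
    ("ai-agent-application", "Advanced", 0, 0, 0, 18),
    ("ai-agent-application", "Beginner", 0, 0, 5, 15),
    ("ai-engineering-workflow", "Advanced", 0, 0, 0, 18),
    ("ai-engineering-workflow", "Beginner", 0, 0, 8, 12),
]

# Sum each count column across all prompt/tier rows.
totals = [sum(column) for column in zip(*(row[2:] for row in prompt_rows))]
labels = ["HumanFirst wins", "Humanloop wins", "Abstains (no tool)", "Other tool chosen"]

for label, total in zip(labels, totals):
    print(f"{label}: {total}")
# HumanFirst wins: 44
# Humanloop wins: 47
# Abstains (no tool): 105
# Other tool chosen: 2479
```

The sums match the Statistics table, and nearly all decisive cases come from the two prompts ai-revenue-ops-copilot and ai-support-agent-platform.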