Preseason

© 2026 Preseason. All rights reserved.

LMSYS Chatbot Arena vs Helicone

LMSYS Chatbot Arena 50%, Helicone 50%
Insufficient data
This matchup has 14 decisive cases (minimum 30 required for publication).
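The publication gate described above can be sketched as follows. This is a minimal illustration, not the site's actual code: the function names and the constant `MIN_DECISIVE_CASES` are hypothetical, and it assumes a "decisive case" is one where either of the two tools in the matchup was picked, which is consistent with the 7 + 7 = 14 figures on this page.

```python
# Hypothetical names; the threshold value 30 is the one quoted on this page.
MIN_DECISIVE_CASES = 30


def decisive_cases(wins_a: int, wins_b: int) -> int:
    """Count head-to-head outcomes only (assumed definition).

    Abstains and cases where another tool was chosen are not counted.
    """
    return wins_a + wins_b


def is_publishable(wins_a: int, wins_b: int,
                   minimum: int = MIN_DECISIVE_CASES) -> bool:
    """Return True when the matchup has enough decisive cases to publish."""
    return decisive_cases(wins_a, wins_b) >= minimum


n = decisive_cases(7, 7)
print(n, is_publishable(7, 7))  # 14 False
```

With 7 wins on each side the matchup falls 16 decisive cases short of the threshold, hence the "Insufficient data" label.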

Statistics

| Metric | Value |
| --- | --- |
| LMSYS Chatbot Arena wins | 7 |
| Helicone wins | 7 |
| Abstains (no tool) | 105 |
| Other tool chosen | 2556 |
| Decisive cases | 14 |
| LMSYS Chatbot Arena win rate (unweighted) | 50.0% |
| 95% CI | 26.8% - 73.2% |
| LMSYS Chatbot Arena win rate (weighted) | 50.0% |
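The page does not say how the 95% CI is computed. As a hedged sketch, a Wilson score interval on 7 wins out of 14 decisive cases reproduces the published 26.8% - 73.2% range, whereas a plain normal (Wald) approximation would give roughly 23.8% - 76.2%. The function name `wilson_interval` is illustrative, and the choice of Wilson is an inference from the numbers, not a documented fact about the site.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(wins: int, n: int, confidence: float = 0.95):
    """Wilson score interval for a binomial proportion (assumed method)."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = wins / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


low, high = wilson_interval(7, 14)
print(f"{low:.1%} - {high:.1%}")  # 26.8% - 73.2%
```

The interval spans nearly half the probability range, which illustrates why 14 decisive cases are treated as too few to rank one tool above the other.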

Per-model breakdown

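| Model | Tier | LMSYS Chatbot Arena | Helicone | None | Other | A rate (LMSYS Chatbot Arena) |
| --- | --- | --- | --- | --- | --- | --- |
| Llama 4 Scout | Small | 7 | 0 | 7 | 117 | 100% |
| Llama 4 Maverick | Frontier | 0 | 5 | 2 | 130 | 0% |
| DeepSeek V3.2 | Mid | 0 | 1 | 22 | 105 | 0% |
| Devstral 2 2512 | Mid | 0 | 1 | 6 | 128 | 0% |
| Claude Haiku 4.5 | Small | 0 | 0 | 1 | 136 | n/a |
| Claude Opus 4.6 | Frontier | 0 | 0 | 0 | 132 | n/a |
| Claude Opus 4.8 | Frontier | 0 | 0 | 1 | 11 | n/a |
| Claude Sonnet 4.6 | Frontier | 0 | 0 | 1 | 143 | n/a |
| DeepSeek R1 0528 | Frontier | 0 | 0 | 7 | 137 | n/a |
| DeepSeek V4 Flash | Mid | 0 | 0 | 1 | 11 | n/a |
| DeepSeek V4 Pro | Frontier | 0 | 0 | 1 | 11 | n/a |
| Gemini 2.5 Flash | Small | 0 | 0 | 1 | 126 | n/a |
| Gemini 2.5 Pro | Frontier | 0 | 0 | 9 | 135 | n/a |
| Gemini 3.5 Flash | Small | 0 | 0 | 1 | 11 | n/a |
| GLM 5 Turbo | Frontier | 0 | 0 | 19 | 113 | n/a |
| GLM 5.2 | Frontier | 0 | 0 | 0 | 12 | n/a |
| GPT 5.3 Codex | Frontier | 0 | 0 | 0 | 144 | n/a |
| GPT 5.4 | Frontier | 0 | 0 | 0 | 132 | n/a |
| GPT 5.4 Mini | Mid | 0 | 0 | 3 | 140 | n/a |
| GPT 5.5 | Frontier | 0 | 0 | 0 | 12 | n/a |
| Kimi K2.5 | Frontier | 0 | 0 | 3 | 116 | n/a |
| Kimi K2.7 Code | Frontier | 0 | 0 | 1 | 11 | n/a |
| MiMo V2 Pro | Frontier | 0 | 0 | 8 | 124 | n/a |
| MiMo V2.5 Pro | Frontier | 0 | 0 | 0 | 12 | n/a |
| MiniMax M2.7 | Frontier | 0 | 0 | 5 | 124 | n/a |
| MiniMax M3 | Frontier | 0 | 0 | 1 | 11 | n/a |
| Mistral Small 4 | Mid | 0 | 0 | 2 | 134 | n/a |
| Qwen3 Coder Next | Mid | 0 | 0 | 3 | 138 | n/a |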
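The "A rate" column appears to be the share of decisive cases won by tool A (LMSYS Chatbot Arena here), shown as n/a when a model produced no decisive cases. A minimal sketch under that assumption, with hypothetical helper names `a_rate` and `format_rate`:

```python
from typing import Optional


def a_rate(wins_a: int, wins_b: int) -> Optional[float]:
    """Share of decisive cases won by tool A; None when there are none."""
    decisive = wins_a + wins_b
    if decisive == 0:
        return None  # nothing to divide by, rendered as "n/a"
    return wins_a / decisive


def format_rate(rate: Optional[float]) -> str:
    """Render a rate the way the table does: whole percent or 'n/a'."""
    return "n/a" if rate is None else f"{rate:.0%}"


# (tool A wins, tool B wins) taken from three rows of the table above.
rows = {
    "Llama 4 Scout": (7, 0),
    "Llama 4 Maverick": (0, 5),
    "Claude Haiku 4.5": (0, 0),
}
for model, (wins_a, wins_b) in rows.items():
    print(model, format_rate(a_rate(wins_a, wins_b)))
```

Note that abstains and other-tool picks do not enter the rate at all, so a 100% or 0% entry can rest on only a handful of decisive cases.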

Per-prompt breakdown

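| Prompt | Tier | LMSYS Chatbot Arena | Helicone | None | Other | A rate (LMSYS Chatbot Arena) |
| --- | --- | --- | --- | --- | --- | --- |
| ai-revenue-ops-copilot | Beginner | 2 | 2 | 10 | 416 | 50% |
| ai-support-agent-platform | Advanced | 3 | 0 | 5 | 422 | 100% |
| ai-revenue-ops-copilot | Intermediate | 0 | 3 | 4 | 417 | 0% |
| ai-support-agent-platform | Intermediate | 2 | 0 | 5 | 421 | 100% |
| ai-revenue-ops-copilot | Advanced | 0 | 2 | 2 | 415 | 0% |
| ai-agent-application | Intermediate | 0 | 0 | 0 | 20 | n/a |
| ai-agent-application | Advanced | 0 | 0 | 0 | 18 | n/a |
| ai-agent-application | Beginner | 0 | 0 | 5 | 15 | n/a |
| ai-engineering-workflow | Advanced | 0 | 0 | 0 | 18 | n/a |
| ai-engineering-workflow | Beginner | 0 | 0 | 8 | 12 | n/a |
| ai-engineering-workflow | Intermediate | 0 | 0 | 0 | 17 | n/a |
| ai-support-agent-platform | Beginner | 0 | 0 | 66 | 365 | n/a |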