Preseason

© 2026 Preseason. All rights reserved.

@betocmn
LLM Evals

TruLens vs UpTrain

TruLens 44% vs UpTrain 56%

Leading: UpTrain (55.6%)

Insufficient data
This matchup has 18 decisive cases (minimum 30 required for publication).

Statistics

| Metric | Value |
| --- | --- |
| TruLens wins | 8 |
| UpTrain wins | 10 |
| Abstains (no tool) | 105 |
| Other tool chosen | 2552 |
| Decisive cases | 18 |
| TruLens win rate (unweighted) | 44.4% |
| 95% CI | 24.6% - 66.3% |
| TruLens win rate (weighted) | 44.4% |
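
The unweighted win rate and its interval can be reproduced from the two win counts. The sketch below is a minimal illustration, not the site's implementation: it assumes the 95% CI is a Wilson score interval with z = 1.96 (the reported 24.6% - 66.3% is numerically consistent with that choice, but the page does not name the method), and the names `wilson_interval` and `MIN_DECISIVE` are hypothetical. The weighted rate is not reproduced because the weighting scheme is not described here.

```python
import math


def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (assumed method)."""
    if n == 0:
        raise ValueError("no decisive cases")
    p = wins / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Counts taken from the Statistics table above.
trulens_wins, uptrain_wins = 8, 10
decisive = trulens_wins + uptrain_wins  # abstains and other-tool picks are excluded

MIN_DECISIVE = 30  # publication threshold stated on the page

win_rate = trulens_wins / decisive
low, high = wilson_interval(trulens_wins, decisive)
publishable = decisive >= MIN_DECISIVE

print(f"TruLens win rate: {win_rate:.1%}")
print(f"95% CI: {low:.1%} - {high:.1%}")
print(f"Publishable: {publishable}")
```

With only 18 decisive cases the interval spans 50%, which is why the page flags the matchup as having insufficient data rather than declaring a winner.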


Per-model breakdown

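| Model | Tier | TruLens | UpTrain | None | Other | A rate (TruLens) |
| --- | --- | --- | --- | --- | --- | --- |
| DeepSeek V3.2 | Mid | 0 | 7 | 22 | 99 | 0% |
| Mistral Small 4 | Mid | 4 | 0 | 2 | 130 | 100% |
| Gemini 2.5 Pro | Frontier | 1 | 3 | 9 | 131 | 25% |
| DeepSeek R1 0528 | Frontier | 3 | 0 | 7 | 134 | 100% |
| Claude Haiku 4.5 | Small | 0 | 0 | 1 | 136 | n/a |
| Claude Opus 4.6 | Frontier | 0 | 0 | 0 | 132 | n/a |
| Claude Opus 4.8 | Frontier | 0 | 0 | 1 | 11 | n/a |
| Claude Sonnet 4.6 | Frontier | 0 | 0 | 1 | 143 | n/a |
| DeepSeek V4 Flash | Mid | 0 | 0 | 1 | 11 | n/a |
| DeepSeek V4 Pro | Frontier | 0 | 0 | 1 | 11 | n/a |
| Devstral 2 2512 | Mid | 0 | 0 | 6 | 129 | n/a |
| Gemini 2.5 Flash | Small | 0 | 0 | 1 | 126 | n/a |
| Gemini 3.5 Flash | Small | 0 | 0 | 1 | 11 | n/a |
| GLM 5 Turbo | Frontier | 0 | 0 | 19 | 113 | n/a |
| GLM 5.2 | Frontier | 0 | 0 | 0 | 12 | n/a |
| GPT 5.3 Codex | Frontier | 0 | 0 | 0 | 144 | n/a |
| GPT 5.4 | Frontier | 0 | 0 | 0 | 132 | n/a |
| GPT 5.4 Mini | Mid | 0 | 0 | 3 | 140 | n/a |
| GPT 5.5 | Frontier | 0 | 0 | 0 | 12 | n/a |
| Kimi K2.5 | Frontier | 0 | 0 | 3 | 116 | n/a |
| Kimi K2.7 Code | Frontier | 0 | 0 | 1 | 11 | n/a |
| Llama 4 Maverick | Frontier | 0 | 0 | 2 | 135 | n/a |
| Llama 4 Scout | Small | 0 | 0 | 7 | 124 | n/a |
| MiMo V2 Pro | Frontier | 0 | 0 | 8 | 124 | n/a |
| MiMo V2.5 Pro | Frontier | 0 | 0 | 0 | 12 | n/a |
| MiniMax M2.7 | Frontier | 0 | 0 | 5 | 124 | n/a |
| MiniMax M3 | Frontier | 0 | 0 | 1 | 11 | n/a |
| Qwen3 Coder Next | Mid | 0 | 0 | 3 | 138 | n/a |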
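
The "A rate" column appears to be the share of decisive cases won by the first-listed tool (TruLens), shown as n/a when a model never picked either tool. The sketch below illustrates that reading on four rows from the table; the helper names `a_rate` and `format_rate` are hypothetical, and the formula is an inference from the published numbers rather than a documented definition.

```python
from typing import Optional


def a_rate(a_wins: int, b_wins: int) -> Optional[float]:
    """Share of decisive cases won by tool A; None when there are no decisive cases."""
    decisive = a_wins + b_wins
    return a_wins / decisive if decisive else None


def format_rate(rate: Optional[float]) -> str:
    """Render a rate the way the table does: whole percent, or n/a."""
    return "n/a" if rate is None else f"{rate:.0%}"


# (TruLens wins, UpTrain wins) for a few rows of the per-model table.
rows = {
    "DeepSeek V3.2": (0, 7),
    "Mistral Small 4": (4, 0),
    "Gemini 2.5 Pro": (1, 3),
    "Claude Haiku 4.5": (0, 0),
}

rates = {model: format_rate(a_rate(a, b)) for model, (a, b) in rows.items()}
for model, rate in rates.items():
    print(f"{model}: {rate}")
```

Note that a 100% or 0% rate here can rest on as few as three or four decisive cases, so per-row rates are far noisier than the overall figure.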

Per-prompt breakdown

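| Prompt | Tier | TruLens | UpTrain | None | Other | A rate (TruLens) |
| --- | --- | --- | --- | --- | --- | --- |
| ai-support-agent-platform | Advanced | 1 | 4 | 5 | 420 | 20% |
| ai-revenue-ops-copilot | Beginner | 1 | 3 | 10 | 416 | 25% |
| ai-support-agent-platform | Intermediate | 3 | 0 | 5 | 420 | 100% |
| ai-support-agent-platform | Beginner | 0 | 3 | 66 | 362 | 0% |
| ai-agent-application | Advanced | 1 | 0 | 0 | 17 | 100% |
| ai-revenue-ops-copilot | Intermediate | 1 | 0 | 4 | 419 | 100% |
| ai-revenue-ops-copilot | Advanced | 1 | 0 | 2 | 416 | 100% |
| ai-agent-application | Intermediate | 0 | 0 | 0 | 20 | n/a |
| ai-agent-application | Beginner | 0 | 0 | 5 | 15 | n/a |
| ai-engineering-workflow | Advanced | 0 | 0 | 0 | 18 | n/a |
| ai-engineering-workflow | Beginner | 0 | 0 | 8 | 12 | n/a |
| ai-engineering-workflow | Intermediate | 0 | 0 | 0 | 17 | n/a |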