Preseason

© 2026 Preseason. All rights reserved.

@betocmn

Promptfoo vs LlamaIndex

Promptfoo 49% vs LlamaIndex 51%

Leading: LlamaIndex (50.8%)

Statistics

| Metric | Value |
| --- | --- |
| Promptfoo wins | 31 |
| LlamaIndex wins | 32 |
| Abstains (no tool) | 105 |
| Other tool chosen | 2507 |
| Decisive cases | 63 |
| Promptfoo win rate (unweighted) | 49.2% |
| 95% CI | 37.3% - 61.2% |
| Promptfoo win rate (weighted) | 49.2% |
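
The page does not say how the 95% CI is computed. As a rough sketch, the reported bounds are consistent with a Wilson score interval on the 31 Promptfoo wins out of 63 decisive cases; that choice of method is an assumption here, and the function name `wilson_interval` is purely illustrative.

```python
import math


def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion wins / n.

    Assumption: the page may use a different method; Wilson happens to
    match the published bounds for these counts.
    """
    if n == 0:
        raise ValueError("no decisive cases, the rate is undefined")
    p = wins / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Counts taken from the Statistics table.
promptfoo_wins, llamaindex_wins = 31, 32
decisive = promptfoo_wins + llamaindex_wins  # abstains and "other" are excluded

rate = promptfoo_wins / decisive
low, high = wilson_interval(promptfoo_wins, decisive)
print(f"{rate:.1%} ({low:.1%} - {high:.1%})")  # 49.2% (37.3% - 61.2%)
```

The interval comfortably spans 50%, so on 63 decisive cases the 49/51 split does not separate the two tools.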


Per-model breakdown

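| Model | Tier | Promptfoo | LlamaIndex | None | Other | A rate (Promptfoo) |
| --- | --- | --- | --- | --- | --- | --- |
| Llama 4 Scout | Small | 0 | 32 | 7 | 92 | 0% |
| Mistral Small 4 | Mid | 7 | 0 | 2 | 127 | 100% |
| MiMo V2 Pro | Frontier | 6 | 0 | 8 | 118 | 100% |
| GLM 5 Turbo | Frontier | 5 | 0 | 19 | 108 | 100% |
| Gemini 3.5 Flash | Small | 2 | 0 | 1 | 9 | 100% |
| GPT 5.4 Mini | Mid | 2 | 0 | 3 | 138 | 100% |
| Kimi K2.5 | Frontier | 2 | 0 | 3 | 114 | 100% |
| Kimi K2.7 Code | Frontier | 2 | 0 | 1 | 9 | 100% |
| MiMo V2.5 Pro | Frontier | 2 | 0 | 0 | 10 | 100% |
| Claude Opus 4.8 | Frontier | 1 | 0 | 1 | 10 | 100% |
| DeepSeek V4 Pro | Frontier | 1 | 0 | 1 | 10 | 100% |
| GLM 5.2 | Frontier | 1 | 0 | 0 | 11 | 100% |
| Claude Haiku 4.5 | Small | 0 | 0 | 1 | 136 | n/a |
| Claude Opus 4.6 | Frontier | 0 | 0 | 0 | 132 | n/a |
| Claude Sonnet 4.6 | Frontier | 0 | 0 | 1 | 143 | n/a |
| DeepSeek R1 0528 | Frontier | 0 | 0 | 7 | 137 | n/a |
| DeepSeek V3.2 | Mid | 0 | 0 | 22 | 106 | n/a |
| DeepSeek V4 Flash | Mid | 0 | 0 | 1 | 11 | n/a |
| Devstral 2 2512 | Mid | 0 | 0 | 6 | 129 | n/a |
| Gemini 2.5 Flash | Small | 0 | 0 | 1 | 126 | n/a |
| Gemini 2.5 Pro | Frontier | 0 | 0 | 9 | 135 | n/a |
| GPT 5.3 Codex | Frontier | 0 | 0 | 0 | 144 | n/a |
| GPT 5.4 | Frontier | 0 | 0 | 0 | 132 | n/a |
| GPT 5.5 | Frontier | 0 | 0 | 0 | 12 | n/a |
| Llama 4 Maverick | Frontier | 0 | 0 | 2 | 135 | n/a |
| MiniMax M2.7 | Frontier | 0 | 0 | 5 | 124 | n/a |
| MiniMax M3 | Frontier | 0 | 0 | 1 | 11 | n/a |
| Qwen3 Coder Next | Mid | 0 | 0 | 3 | 138 | n/a |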

Per-prompt breakdown

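| Prompt | Tier | Promptfoo | LlamaIndex | None | Other | A rate (Promptfoo) |
| --- | --- | --- | --- | --- | --- | --- |
| ai-revenue-ops-copilot | Intermediate | 6 | 6 | 4 | 408 | 50% |
| ai-support-agent-platform | Intermediate | 5 | 7 | 5 | 411 | 42% |
| ai-revenue-ops-copilot | Advanced | 6 | 3 | 2 | 408 | 67% |
| ai-revenue-ops-copilot | Beginner | 4 | 4 | 10 | 412 | 50% |
| ai-support-agent-platform | Beginner | 2 | 6 | 66 | 357 | 25% |
| ai-support-agent-platform | Advanced | 1 | 5 | 5 | 419 | 17% |
| ai-engineering-workflow | Beginner | 3 | 0 | 8 | 9 | 100% |
| ai-engineering-workflow | Intermediate | 2 | 0 | 0 | 15 | 100% |
| ai-agent-application | Intermediate | 1 | 1 | 0 | 18 | 50% |
| ai-engineering-workflow | Advanced | 1 | 0 | 0 | 17 | 100% |
| ai-agent-application | Advanced | 0 | 0 | 0 | 18 | n/a |
| ai-agent-application | Beginner | 0 | 0 | 5 | 15 | n/a |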
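
The "A rate" column is not defined on the page. A minimal sketch, assuming it means Promptfoo wins divided by decisive cases (Promptfoo plus LlamaIndex wins) with n/a when a row has no decisive cases, reproduces the per-prompt values and reconciles the column totals with the Statistics section. The counts are transcribed from the per-prompt rows above; the helper name `a_rate` is hypothetical.

```python
from typing import Optional

# (prompt, tier, promptfoo, llamaindex, none, other), from the per-prompt rows.
rows = [
    ("ai-revenue-ops-copilot", "Intermediate", 6, 6, 4, 408),
    ("ai-support-agent-platform", "Intermediate", 5, 7, 5, 411),
    ("ai-revenue-ops-copilot", "Advanced", 6, 3, 2, 408),
    ("ai-revenue-ops-copilot", "Beginner", 4, 4, 10, 412),
    ("ai-support-agent-platform", "Beginner", 2, 6, 66, 357),
    ("ai-support-agent-platform", "Advanced", 1, 5, 5, 419),
    ("ai-engineering-workflow", "Beginner", 3, 0, 8, 9),
    ("ai-engineering-workflow", "Intermediate", 2, 0, 0, 15),
    ("ai-agent-application", "Intermediate", 1, 1, 0, 18),
    ("ai-engineering-workflow", "Advanced", 1, 0, 0, 17),
    ("ai-agent-application", "Advanced", 0, 0, 0, 18),
    ("ai-agent-application", "Beginner", 0, 0, 5, 15),
]


def a_rate(a_wins: int, b_wins: int) -> Optional[float]:
    """Share of decisive cases won by tool A; None when nothing was decisive."""
    decisive = a_wins + b_wins
    return None if decisive == 0 else a_wins / decisive


for prompt, tier, pf, li, none, other in rows:
    rate = a_rate(pf, li)
    label = "n/a" if rate is None else f"{rate:.0%}"
    print(f"{prompt} ({tier}): {label}")

# Column totals, to compare against the Statistics section.
totals = [sum(r[i] for r in rows) for i in range(2, 6)]
print(totals)  # [31, 32, 105, 2507]
```

Because abstains and "other tool" picks are left out of the denominator, a row such as ai-engineering-workflow (Advanced) shows 100% on a single decisive case, so rates on rows with few decisive cases carry little weight.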