LLM Observability

Langfuse vs LangSmith

Langfuse 33%, LangSmith 67%

Leading: LangSmith (66.9%)

Metric	Value
Langfuse wins	723
LangSmith wins	1461
Abstains (no tool)	57
Other tool chosen	469
Decisive cases	2184
Langfuse win rate (unweighted)	33.1%
95% CI	31.2% - 35.1%
Langfuse win rate (weighted)	33.1%
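
The unweighted win rate and its interval can be reproduced from the counts above. The sketch below is a minimal check, assuming the interval is a Wilson score interval at z = 1.96; the page does not name its method, and `wilson_interval` is a hypothetical helper written for this illustration. The weighted rate is not reproduced because the weighting scheme is not stated.

```python
import math


def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for the binomial proportion wins / n.

    Assumption: the page's 95% CI uses this method; it is not stated there.
    """
    p = wins / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


langfuse_wins = 723
langsmith_wins = 1461

# Decisive cases exclude abstains (57) and picks of another tool (469).
decisive = langfuse_wins + langsmith_wins

win_rate = langfuse_wins / decisive
low, high = wilson_interval(langfuse_wins, decisive)

print(f"decisive cases: {decisive}")
print(f"Langfuse win rate: {win_rate:.1%}")
print(f"95% CI: {low:.1%} - {high:.1%}")
```

Under that assumption the result rounds to the reported 33.1% and 31.2% - 35.1%. A plain normal approximation gives nearly the same bounds at this sample size.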

Model	Tier	Langfuse	LangSmith	None	Other	Langfuse rate
GPT 5.3 Codex	Frontier	129	15	0	0	90%
Claude Sonnet 4.6	Frontier	34	109	1	0	24%
GPT 5.4 Mini	Mid	26	114	1	3	19%
Gemini 2.5 Pro	Frontier	1	133	6	4	1%
GPT 5.4	Frontier	119	13	0	0	90%
Claude Opus 4.6	Frontier	110	22	0	0	83%
GLM 5 Turbo	Frontier	12	120	0	0	9%
DeepSeek R1 0528	Frontier	0	129	1	14	0%
Claude Haiku 4.5	Small	89	39	0	13	70%
DeepSeek V3.2	Mid	7	120	0	5	6%
Qwen3 Coder Next	Mid	59	67	1	16	47%
MiMo V2 Pro	Frontier	1	125	2	4	1%
Mistral Small 4	Mid	5	120	1	15	4%
Kimi K2.5	Frontier	50	65	4	0	43%
MiniMax M2.7	Frontier	1	105	3	21	1%
Llama 4 Maverick	Frontier	12	79	2	50	13%
Llama 4 Scout	Small	25	0	14	97	100%
Devstral 2 2512	Mid	0	22	17	99	0%
GPT 5.5	Frontier	7	5	0	0	58%
MiniMax M3	Frontier	7	5	0	0	58%
Kimi K2.7 Code	Frontier	5	7	0	0	42%
GLM 5.2	Frontier	4	8	0	0	33%
Gemini 3.5 Flash	Small	2	10	0	0	17%
Claude Opus 4.8	Frontier	5	6	1	0	45%
MiMo V2.5 Pro	Frontier	3	8	0	1	27%
DeepSeek V4 Flash	Mid	6	4	1	0	60%
DeepSeek V4 Pro	Frontier	3	7	1	1	30%
Gemini 2.5 Flash	Small	1	4	1	126	20%
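
The rate in the final column appears to be Langfuse picks divided by decisive picks (Langfuse plus LangSmith), with the None and Other columns left out. The sketch below checks that reading against four rows copied from the table; `langfuse_rate` is a hypothetical helper, and the formula is inferred from the numbers, not stated on the page.

```python
from typing import Optional


def langfuse_rate(langfuse: int, langsmith: int) -> Optional[float]:
    """Share of decisive picks that went to Langfuse, or None if there were none."""
    decisive = langfuse + langsmith
    return langfuse / decisive if decisive else None


# (Langfuse, LangSmith, None, Other) copied from four rows of the table.
rows = {
    "GPT 5.3 Codex": (129, 15, 0, 0),
    "Claude Haiku 4.5": (89, 39, 0, 13),
    "Llama 4 Scout": (25, 0, 14, 97),
    "Gemini 2.5 Flash": (1, 4, 1, 126),
}

summary = {}
for model, (lf, ls, none, other) in rows.items():
    rate = langfuse_rate(lf, ls)
    total = lf + ls + none + other
    summary[model] = (rate, lf + ls, total)
    shown = "n/a" if rate is None else f"{rate:.0%}"
    print(f"{model}: {shown} of {lf + ls} decisive picks ({total} responses)")
```

The decisive count matters when reading the column: the 100% for Llama 4 Scout rests on 25 decisive picks out of 136 responses, and the 20% for Gemini 2.5 Flash on 5 out of 132.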

Prompt	Tier	Langfuse	LangSmith	None	Other	Langfuse rate
ai-support-agent-platform	Intermediate	83	280	1	72	23%
ai-revenue-ops-copilot	Intermediate	91	264	1	67	26%
ai-support-agent-platform	Beginner	167	183	13	70	48%
ai-revenue-ops-copilot	Advanced	102	243	2	82	30%
ai-support-agent-platform	Advanced	126	216	1	92	37%
ai-revenue-ops-copilot	Beginner	109	228	29	71	32%
ai-agent-application	Beginner	4	13	3	0	24%
ai-engineering-workflow	Advanced	11	5	0	4	69%
ai-agent-application	Advanced	6	10	0	3	38%
ai-agent-application	Intermediate	3	13	0	3	19%
ai-engineering-workflow	Intermediate	13	1	0	5	93%
ai-engineering-workflow	Beginner	8	5	7	0	62%