LLM · Multimodal

#31 · MiniMax

MiniMax M2.7

MiniMax M2.7 by MiniMax appears in 1 source with Reasoning at 53.06. Best read for Low cost, Open weights, Reasoning.

Top metric: 53.06 · Reasoning

Context: 205K tokens

Low cost · Open weights · Reasoning · Tool use

#32 · Zhipu AI

GLM-4.6

GLM-4.6 by Zhipu AI appears in 2 sources with Reasoning at 37.73. Best read for Code quality, Coding, High intelligence.

Top metric: 37.73 · Reasoning

Code quality · Coding · High intelligence · Math

License: mit · text + vision

#33 · OpenAI

GPT-5.1 High

GPT-5.1 High by OpenAI appears in 2 sources with Reasoning at 53.33. Best read for Coding, High intelligence, LLM.

Top metric: 53.33 · Reasoning

Coding · High intelligence · LLM · Math

#34 · OpenAI

GPT-5 mini

Input: $0.25/M · public snapshot

GPT-5 mini by OpenAI appears in 1 source with Reasoning at 36.61. Best read for LLM, Low cost, Math.

Top metric: 36.61 · Reasoning

Context: 400K tokens

LLM · Low cost · Math · Multimodal

#35 · Google

Gemma 4 31B

Gemma 4 31B by Google appears in 2 sources with Reasoning at 44.8. Best read for Coding, High intelligence, Low cost.

Top metric: 44.8 · Reasoning

Input: $0.14/M · public snapshot

Coding · High intelligence · Low cost · Multimodal

#36 · DeepSeek

DeepSeek-V4-Flash-Max

DeepSeek-V4-Flash-Max by DeepSeek appears in 1 source with Reasoning at 51.86. Best read for Code quality, Long context, Low cost.

Top metric: 51.86 · Reasoning

Input: $0.14/M · public snapshot

#37 · xAI

Grok-3

Grok-3 by xAI appears in 1 source with Reasoning at 39.69. Best read for LLM, Low cost, Math.

Top metric: 39.69 · Reasoning

Context: 128K tokens

Input: $3/M · public snapshot

LLM · Low cost · Math · Multimodal

#38 · OpenAI

GPT-5 Medium

GPT-5 Medium by OpenAI appears in 1 source with Reasoning at 43.51. Best read for LLM, Math, Multimodal.

Top metric: 43.51 · Reasoning

LLM · Math · Multimodal · Reasoning

#39 · OpenAI

GPT-5.1 Thinking

GPT-5.1 Thinking by OpenAI appears in 1 source with Reasoning at 47.47. Best read for Code quality, LLM, Multimodal.

Top metric: 47.47 · Reasoning

#40 · Zhipu AI

GLM-4.7

GLM-4.7 by Zhipu AI appears in 2 sources with Reasoning at 43.81. Best read for Code quality, Coding, High intelligence.

Top metric: 43.81 · Reasoning

Code quality · Coding · High intelligence · Multimodal

License: mit · text + vision

#41 · Alibaba Cloud / Qwen Team

Qwen3.6-27B

Qwen3.6-27B by Alibaba Cloud / Qwen Team appears in 1 source with Reasoning at 45.74. Best read for Code quality, Low cost, Multimodal.

Top metric: 45.74 · Reasoning

Input: $0.6/M · public snapshot

Code quality · Low cost · Multimodal · Open weights

#42 · MiniMax

MiniMax M2.5

MiniMax M2.5 by MiniMax appears in 1 source with Reasoning at 52.51. Best read for Code quality, Long context, Low cost.

Top metric: 52.51 · Reasoning

#43 · OpenAI

GPT-4.1 mini

GPT-4.1 mini by OpenAI appears in 1 source with Reasoning at 15.71. Best read for LLM, Long context, Low cost.

Top metric: 15.71 · Reasoning

Input: $0.4/M · public snapshot

#44 · Meituan

LongCat-Flash-Chat

LongCat-Flash-Chat by Meituan appears in 2 sources with Reasoning at 23.41. Best read for Code quality, Coding, High intelligence.

Top metric: 23.41 · Reasoning

Context: 128K tokens

Code quality · Coding · High intelligence · Low cost

#45 · Moonshot AI

Kimi K2 0905

Kimi K2 0905 by Moonshot AI appears in 1 source with Reasoning at 25.5. Best read for LLM, Math, Reasoning.

Top metric: 25.5 · Reasoning

LLM · Math · Reasoning

#46 · Google

Gemini 2.5 Pro

Gemini 2.5 Pro by Google appears in 2 sources with Reasoning at 35.05. Best read for Coding, High intelligence, LLM.

Top metric: 35.05 · Reasoning

Input: $1.25/M · public snapshot

Coding · High intelligence · LLM · Long context

#47 · Google

Gemini 3.1 Flash-Lite

Gemini 3.1 Flash-Lite by Google appears in 1 source with Reasoning at 41.75. Best read for LLM, Long context, Low cost.

Top metric: 41.75 · Reasoning

Input: $0.25/M · public snapshot

#48 · DeepSeek

DeepSeek-V3.2 (Non-thinking)

Top metric: 22.77 score · LLM Stats Code Index (estimated from arena)

DeepSeek-V3.2 (Non-thinking) by DeepSeek appears in 1 source with an LLM Stats Code Index (estimated from arena) score of 22.77. Best read for Low cost, Open weights.

Context: 131K tokens

Input: $0.28/M · public snapshot

Low cost · Open weights

#49 · MiniMax

MiniMax M2

MiniMax M2 by MiniMax appears in 1 source with Reasoning at 33.6. Best read for Code quality, Long context, Low cost.

Top metric: 33.6 · Reasoning

#50 · Anthropic

Claude Haiku 4.5

Claude Haiku 4.5 by Anthropic appears in 1 source with Reasoning at 35.4. Best read for Code quality, LLM, Low cost.

Top metric: 35.4 · Reasoning

Context: 200K tokens

Input: $1/M · public snapshot

Code quality · LLM · Low cost · Multimodal

#51 · Moonshot AI

Kimi K2-Thinking-0905

Kimi K2-Thinking-0905 by Moonshot AI appears in 1 source with Reasoning at 45.23. Best read for Code quality, Math, Open weights.

Top metric: 45.23 · Reasoning

Code quality · Math · Open weights · Reasoning

#52 · OpenAI

GPT-5.1 Instant

Input: $1.25/M · public snapshot

GPT-5.1 Instant by OpenAI appears in 1 source with Reasoning at 48.57. Best read for Code quality, LLM, Low cost.

Top metric: 48.57 · Reasoning

Context: 400K tokens

Code quality · LLM · Low cost · Multimodal

#53 · Anthropic

Claude Opus 4

Claude Opus 4 by Anthropic appears in 1 source with Reasoning at 35.16. Best read for Code quality, LLM, Multimodal.

Top metric: 35.16 · Reasoning

#54 · xAI

Grok 4 Fast

Grok 4 Fast by xAI appears in 1 source with Reasoning at 40.93. Best read for LLM, Long context, Low cost.

Top metric: 40.93 · Reasoning

#55 · Google

Gemini 2.5 Flash

Gemini 2.5 Flash by Google appears in 2 sources with Reasoning at 28.49. Best read for Coding, High intelligence, LLM.

Top metric: 28.49 · Reasoning

Coding · High intelligence · LLM · Long context

#56 · OpenAI

GPT-5

GPT-5 by OpenAI appears in 1 source with Reasoning at 44.66. Best read for Code quality, LLM, Multimodal.

Top metric: 44.66 · Reasoning

#57 · Anthropic

Claude Sonnet 4

Claude Sonnet 4 by Anthropic appears in 1 source with Reasoning at 30.27. Best read for Code quality, LLM, Multimodal.

Top metric: 30.27 · Reasoning

#58 · Mistral AI

Mistral Large 3 (675B Instruct 2512)

Mistral Large 3 (675B Instruct 2512) by Mistral AI appears in 1 source with Reasoning at 10.23. Best read for Low cost, Math, Multimodal.

Top metric: 10.23 · Reasoning

Input: $0.5/M · public snapshot

Low cost · Math · Multimodal · Open weights

#59 · MiniMax

MiniMax M2.1

MiniMax M2.1 by MiniMax appears in 1 source with Reasoning at 41.25. Best read for Code quality, Long context, Low cost.

Top metric: 41.25 · Reasoning

#60 · xAI

Grok 4.3

Top metric: 25.31 score · LLM Stats Code Index (estimated from arena)

Grok 4.3 by xAI appears in 2 sources with an LLM Stats Code Index (estimated from arena) score of 25.31. Best read for Coding, High intelligence, LLM.

Input: $1.25/M · public snapshot

Coding · High intelligence · LLM · Long context

#61 · OpenAI

GPT-4.1

GPT-4.1 by OpenAI appears in 1 source with Reasoning at 21.09. Best read for LLM, Long context, Low cost.

Top metric: 21.09 · Reasoning

Input: $2/M · public snapshot

#62 · DeepSeek

DeepSeek-V3.2-Speciale

DeepSeek-V3.2-Speciale by DeepSeek appears in 1 source with Reasoning at 43.22. Best read for Code quality, Math, Open weights.

Top metric: 43.22 · Reasoning

#63 · Anthropic

Claude Opus 4.8

Claude Opus 4.8 by Anthropic appears in 1 source with Reasoning at 65.69. Best read for LLM, Long context, Low cost.

Top metric: 65.69 · Reasoning

Input: $5/M · public snapshot

#64 · Xiaomi

MiMo-V2-Flash

MiMo-V2-Flash by Xiaomi appears in 1 source with Reasoning at 38.43. Best read for Code quality, Math, Open weights.

Top metric: 38.43 · Reasoning

#65 · xAI

Grok-4.20 Beta Reasoning

Top metric: 20.12 score · LLM Stats Code Index (estimated from arena)

Grok-4.20 Beta Reasoning by xAI appears in 1 source with an LLM Stats Code Index (estimated from arena) score of 20.12. Best read for LLM, Multimodal.

LLM · Multimodal

#66 · Meituan

LongCat-Flash-Thinking

LongCat-Flash-Thinking by Meituan appears in 1 source with Reasoning at 35.89. Best read for Code quality, Math, Open weights.

Top metric: 35.89 · Reasoning

#67 · OpenAI

GPT-5.4 nano

GPT-5.4 nano by OpenAI appears in 1 source with Reasoning at 39.89. Best read for LLM, Low cost, Multimodal.

Top metric: 39.89 · Reasoning

Context: 400K tokens

LLM · Low cost · Multimodal · Tool use

#68 · Alibaba Cloud / Qwen Team

Qwen3.5-27B

Qwen3.5-27B by Alibaba Cloud / Qwen Team appears in 1 source with Reasoning at 41.95. Best read for Code quality, Low cost, Multimodal.

Top metric: 41.95 · Reasoning

Code quality · Low cost · Multimodal · Open weights

#69 · Alibaba Cloud / Qwen Team

Qwen3 Max

Qwen3 Max by Alibaba Cloud / Qwen Team appears in 1 source with Reasoning at 28.55. Best read for Code quality, LLM, Math.

Top metric: 28.55 · Reasoning

Code quality · LLM · Math · Tool use

#70 · Zhipu AI

GLM-4.7-Flash

GLM-4.7-Flash by Zhipu AI appears in 1 source with Reasoning at 31.38. Best read for Code quality, Math, Open weights.

Top metric: 31.38 · Reasoning

#71 · Meituan

LongCat-Flash-Lite

LongCat-Flash-Lite by Meituan appears in 1 source with Reasoning at 23. Best read for Code quality, Low cost, Open weights.

Top metric: 23 · Reasoning

Context: 256K tokens

Code quality · Low cost · Open weights · Tool use

#72 · DeepSeek

DeepSeek-V3.2-Exp

DeepSeek-V3.2-Exp by DeepSeek appears in 2 sources with Reasoning at 35.69. Best read for Code quality, Coding, High intelligence.

Top metric: 35.69 · Reasoning

Code quality · Coding · High intelligence · Math

#73 · Alibaba Cloud / Qwen Team

Qwen3.5-122B-A10B

Qwen3.5-122B-A10B by Alibaba Cloud / Qwen Team appears in 2 sources with Reasoning at 43.05. Best read for Code quality, Coding, High intelligence.

Top metric: 43.05 · Reasoning

Input: $0.4/M · public snapshot

Code quality · Coding · High intelligence · Low cost

#74 · Zhipu AI

GLM-4.5

GLM-4.5 by Zhipu AI appears in 2 sources with Reasoning at 33.95. Best read for Code quality, Coding, High intelligence.

Top metric: 33.95 · Reasoning

Code quality · Coding · High intelligence · Math

#75 · OpenAI

GPT-5.1 Codex

GPT-5.1 Codex by OpenAI appears in 1 source with Reasoning at 42.49. Best read for Code quality, LLM, Multimodal.

Top metric: 42.49 · Reasoning

#76 · StepFun

Step-3.5-Flash

Step-3.5-Flash by StepFun appears in 1 source with Reasoning at 49.2. Best read for Code quality, Low cost, Open weights.

Top metric: 49.2 · Reasoning

Context: 66K tokens

Code quality · Low cost · Open weights · Tool use

License: apache_2_0 · text

#77 · OpenAI

GPT-5.3 Chat

Top metric: 27.85 score · LLM Stats Code Index (estimated from arena)

GPT-5.3 Chat by OpenAI appears in 1 source with an LLM Stats Code Index (estimated from arena) score of 27.85. Best read for LLM, Low cost, Multimodal.

Context: 128K tokens

Input: $1.75/M · public snapshot

LLM · Low cost · Multimodal

#78 · OpenAI

GPT-5.1 Codex High

GPT-5.1 Codex High by OpenAI appears in 1 source with Reasoning at 44.32. Best read for LLM, Math, Multimodal.

Top metric: 44.32 · Reasoning

LLM · Math · Multimodal · Reasoning

#79 · OpenAI

GPT OSS 120B High

GPT OSS 120B High by OpenAI appears in 1 source with Reasoning at 31.76. Best read for Low cost, Math, Open weights.

Top metric: 31.76 · Reasoning

Context: 131K tokens

Low cost · Math · Open weights · Tool use

License: apache_2_0 · text

#80 · Alibaba Cloud / Qwen Team

Qwen3 VL 235B A22B Instruct

Qwen3 VL 235B A22B Instruct by Alibaba Cloud / Qwen Team appears in 2 sources with Reasoning at 26.03. Best read for Coding, High intelligence, Low cost.

Top metric: 26.03 · Reasoning

Coding · High intelligence · Low cost · Multimodal

#81 · xAI

Grok-4 Fast Reasoning

Top metric: 26.07 score · LLM Stats Code Index (estimated from arena)

Grok-4 Fast Reasoning by xAI appears in 1 source with an LLM Stats Code Index (estimated from arena) score of 26.07. Best read for LLM, Long context, Low cost.

#82 · xAI

Grok-4 Fast Non-Reasoning

Top metric: 23.23 score · LLM Stats Code Index (estimated from arena)

Grok-4 Fast Non-Reasoning by xAI appears in 1 source with an LLM Stats Code Index (estimated from arena) score of 23.23. Best read for LLM, Long context, Low cost.

#83 · Alibaba Cloud / Qwen Team

Qwen3 VL 4B Thinking

Qwen3 VL 4B Thinking by Alibaba Cloud / Qwen Team appears in 1 source with Reasoning at 15.02. Best read for Low cost, Multimodal, Open weights.

Top metric: 15.02 · Reasoning

Low cost · Multimodal · Open weights · Tool use

#84 · xAI

Grok-4.1 Fast Non-Reasoning

Top metric: 23.08 score · LLM Stats Code Index (estimated from arena)

Grok-4.1 Fast Non-Reasoning by xAI appears in 1 source with an LLM Stats Code Index (estimated from arena) score of 23.08. Best read for LLM, Long context, Low cost.

#85 · OpenAI

GPT-5 nano

GPT-5 nano by OpenAI appears in 1 source with Reasoning at 24.61. Best read for LLM, Math, Multimodal.

Top metric: 24.61 · Reasoning

LLM · Math · Multimodal · Reasoning

#86 · xAI

Grok-4.20 Multi-Agent Beta

Top metric: 21.9 score · LLM Stats Code Index (estimated from arena)

Grok-4.20 Multi-Agent Beta by xAI appears in 1 source with an LLM Stats Code Index (estimated from arena) score of 21.9. Best read for LLM, Multimodal.

LLM · Multimodal

#87 · Anthropic

Claude 3.7 Sonnet

Claude 3.7 Sonnet by Anthropic appears in 1 source with Reasoning at 28.92. Best read for Code quality, LLM, Multimodal.

Top metric: 28.92 · Reasoning

#88 · xAI

Grok-4.1 Fast Reasoning

Top metric: 19.21 score · LLM Stats Code Index (estimated from arena)

Grok-4.1 Fast Reasoning by xAI appears in 1 source with an LLM Stats Code Index (estimated from arena) score of 19.21. Best read for LLM, Long context, Low cost.

#89 · xAI

Grok Code Fast 1

Grok Code Fast 1 by xAI appears in 1 source with Reasoning at 31.93. Best read for Code quality, LLM, Low cost.

Top metric: 31.93 · Reasoning

Context: 256K tokens

Code quality · LLM · Low cost · Reasoning

#90 · Alibaba Cloud / Qwen Team

Qwen3-Coder

Top metric: 17.3 score · LLM Stats Code Index (estimated from arena)

Qwen3-Coder by Alibaba Cloud / Qwen Team appears in 1 source with an LLM Stats Code Index (estimated from arena) score of 17.3. Best read for Open weights.