MiniMax M3 Costs 8-12% of GPT-5.5 While Matching It and Gemini 3.1 Pro on Coding Benchmarks
- MiniMax M3, released May 31, 2026, costs $0.60/$2.40 per million input/output tokens, roughly 8-12% of what GPT-5.5 ($5/$30) and Claude Opus 4.8 ($6.25/$25) charge.
- According to VentureBeat's June 2026 analysis, M3 scores 59.0% on SWE-Bench Pro, ahead of GPT-5.5's coding score of 58.6%, while matching Gemini 3.1 Pro on several agentic benchmarks.
- At standard rates, the price gap between the cheapest and most expensive frontier API models now exceeds 10x on input tokens and 12x on output tokens.
- M3's sparse MSA attention architecture powers a 1-million-token context window at a cost that undercuts full-attention rivals.
The price gap between top AI models widened sharply in late May 2026. MiniMax released its M3 model on May 31, and according to VentureBeat, it matches or beats GPT-5.5 and Gemini 3.1 Pro on several coding benchmarks. At standard rates it costs roughly 8-12% as much per token as GPT-5.5 and 16-24% as much as Gemini 3.1 Pro. For anyone paying API bills, MiniMax M3's pricing deserves a close look.
What MiniMax M3 actually costs
MiniMax M3's standard API price is $0.60 per million input tokens and $2.40 per million output tokens. A limited launch promotion on OpenRouter halved that to roughly $0.30/$1.20, though standard rates are the benchmark worth planning around.
Compare that to the current US frontier tier:
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| GPT-5.5 | $5.00 | $30.00 |
| Claude Opus 4.8 | $6.25 | $25.00 |
| Gemini 3.1 Pro | $2.50 | $15.00 |
| Gemini 3.5 Flash | $1.50 | $9.00 |
| MiniMax M3 | $0.60 | $2.40 |
Even at full rate, M3 input tokens cost about 8x less than GPT-5.5's and 2.5x less than Gemini 3.5 Flash's. Gemini 3.5 Flash itself launched at triple the price of Gemini 3 Flash when Google introduced it at I/O in May 2026.
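To check multiples like these against the table, a minimal Python sketch can divide each model's list price by M3's. The prices are copied from the table above; the `PRICES` dictionary and the `price_ratios` helper are illustrative names, not part of any provider's SDK.

```python
# Standard list prices in USD per 1M tokens (input, output), from the table above.
PRICES = {
    "GPT-5.5": (5.00, 30.00),
    "Claude Opus 4.8": (6.25, 25.00),
    "Gemini 3.1 Pro": (2.50, 15.00),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "MiniMax M3": (0.60, 2.40),
}


def price_ratios(baseline="MiniMax M3"):
    """Return {model: (input_ratio, output_ratio)} relative to the baseline model."""
    base_in, base_out = PRICES[baseline]
    return {
        name: (round(p_in / base_in, 2), round(p_out / base_out, 2))
        for name, (p_in, p_out) in PRICES.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, (r_in, r_out) in price_ratios().items():
        print(f"{name}: {r_in}x input, {r_out}x output")
```

Because the ratios are computed separately for input and output, the same model can look very different depending on whether a workload is prompt-heavy or generation-heavy.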
How M3 benchmarks compare to GPT-5.5 and Gemini 3.1 Pro
The cost advantage is only useful if the model holds up. Per VentureBeat's June 2026 analysis, MiniMax M3:
- Scores 59.0% on SWE-Bench Pro -- ahead of GPT-5.5's coding score of 58.6%
- Hits 66.0% on Terminal-Bench 2.1 and 83.5 on BrowseComp, above Claude Opus 4.7's autonomous-browsing benchmark of 79.3
- Achieves 74.2% on MCP Atlas, matching Gemini 3.1 Pro on several agentic tasks
GPT-5.5 holds a lead on overall aggregate scoring (91 vs 76) and on multi-step agentic reasoning, averaging 81.5 against M3's 71.9. So M3 is not a universal replacement -- but for coding and long-context tasks specifically, it is competitive at a fraction of the cost.
The model ships with a 1-million-token context window, using a sparse MSA attention architecture designed to keep long-context inference cheaper to run than full-attention alternatives.
Why US frontier API prices keep rising
MiniMax M3's pricing arrives as US labs are moving in the opposite direction. According to XDA Developers, Gemini 3.5 Flash launched at $1.50/$9 per million tokens -- three times what Gemini 3 Flash cost at $0.50/$3. The Decoder noted that Google is now following Anthropic and OpenAI in pricing each new generation meaningfully higher than the last, even when performance gains are incremental.
The pattern is consistent across the major US providers: capability improves, but the cost per token rises with it. The gap between M3 and GPT-5.5 input pricing, $0.60 versus $5.00, is now more than 8x, even though M3 edges out GPT-5.5 on the SWE-Bench Pro coding benchmark.
What this cost gap means in practice
The arithmetic gets real quickly at scale. Generating 10 million output tokens with GPT-5.5 costs $300; the same volume through MiniMax M3 costs $24. For coding tasks where M3 is competitive, that 12.5x difference is difficult to justify on performance grounds alone.
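The same arithmetic can be sketched as a small Python helper for estimating a bill from token counts. The function name `workload_cost` and its parameters are assumptions made for illustration; the per-million prices are the standard rates quoted in this article.

```python
def workload_cost(input_tokens, output_tokens, input_price, output_price):
    """Return the cost in USD for a workload.

    Prices are in USD per 1M tokens, matching how providers quote them.
    """
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # The article's example: 10 million output tokens, no input tokens counted.
    gpt = workload_cost(0, 10_000_000, input_price=5.00, output_price=30.00)
    m3 = workload_cost(0, 10_000_000, input_price=0.60, output_price=2.40)
    print(f"GPT-5.5:    ${gpt:,.2f}")
    print(f"MiniMax M3: ${m3:,.2f}")
    print(f"Ratio:      {gpt / m3:.1f}x")
```

Real workloads also pay for input tokens, so passing a non-zero `input_tokens` gives a fuller estimate; the overall ratio then lands somewhere between the input and output multiples.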
The real constraints with M3 are availability and trust. It is a model from a Chinese company, the API is newer, and enterprise compliance requirements may not be satisfied. For personal use or cost-sensitive development work, those concerns are smaller.
This widening spread between frontier model prices is the same dynamic that drives bring-your-own-key tools like ByteChat -- when the cheapest and most expensive options differ by more than an order of magnitude, having per-token cost visibility and the ability to switch models mid-conversation becomes a practical advantage.
Frequently asked questions
Is MiniMax M3 really cheaper than GPT-5.5?
Yes. At standard pricing, MiniMax M3 costs $0.60 per million input tokens versus GPT-5.5's $5.00 -- roughly 8x cheaper on input. On output, M3 is $2.40 per million versus GPT-5.5's $30.00, a 12x difference. At the promotional OpenRouter rate, those gaps are roughly doubled.
Does MiniMax M3 match GPT-5.5 on performance?
On coding benchmarks, largely yes: M3 scores 59.0% on SWE-Bench Pro versus GPT-5.5's 58.6% coding score. GPT-5.5 leads on overall aggregate scoring (91 against M3's 76) and on multi-step agentic reasoning (81.5 against 71.9), so the right choice depends on your workload.
When did MiniMax M3 launch?
MiniMax M3 was released May 31, 2026. A 50% promotional pricing discount was available at launch on OpenRouter; standard rates are $0.60/$2.40 per million input/output tokens.
The distance between the cheapest frontier model and the most expensive is now roughly an order of magnitude.