June 20, 2026 · ByteChatTrend

Qwen 3.7 Max Pricing: 50% Discount Ends June 22 — What API Costs Look Like After

Key takeaways

Alibaba's Qwen 3.7 Max API is 50% off at $1.25/$3.75 per 1M tokens through June 22, 2026, then reverts to list price of $2.50/$7.50.
Qwen 3.7 Max offers a 1M-token context window; cached input drops to $0.25/1M during the promotion, which is 80% below the promotional input rate and 90% below list.
At list price, Qwen 3.7 Max still undercuts Claude Fable 5 ($10/$50 per 1M) by a wide margin and Grok 5 ($3/$15 per 1M) mainly on output.
Developers paying directly via API key see the full rate change; users inside subscription products typically do not.

Alibaba's flagship reasoning model Qwen 3.7 Max has been running on a 50% launch promotion since its release on May 20, 2026 — but that window closes on June 22, in two days. For developers actively evaluating AI model pricing and choosing where to route their workloads, the timing is worth a close look.

What Qwen 3.7 Max Costs Right Now

During the promotional period, Qwen 3.7 Max API pricing sits at $1.25 per million input tokens and $3.75 per million output tokens, according to Alibaba's Qwen API documentation and third-party trackers including OpenRouter and PricePerToken. Cached input drops further still, to $0.25 per million tokens. That is 80% below the promotional input rate (90% below list), which matters most for workflows that reuse context heavily, such as document-processing pipelines or long-running agents.

The model's 1 million-token context window makes that cache discount meaningful. Filling the full context window costs roughly $1.25 at promotional input rates, and the cache brings a repeated call on the same context down to roughly $0.25 per full-context reload.
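As a rough illustration of that arithmetic, here is a minimal Python sketch of a per-call cost estimate. The rates are the promotional figures quoted in this article. The function name and the 2,000-token output size are hypothetical, and the sketch assumes cached tokens are simply billed at the cached rate with no separate cache-write fee, which may not match how the provider actually meters cache hits.

```python
# Rates in USD per 1M tokens, as quoted in this article for the promotion.
PROMO_INPUT = 1.25
PROMO_CACHED_INPUT = 0.25
PROMO_OUTPUT = 3.75


def call_cost(fresh_input_tokens, cached_input_tokens, output_tokens,
              input_rate=PROMO_INPUT,
              cached_rate=PROMO_CACHED_INPUT,
              output_rate=PROMO_OUTPUT):
    """Estimate the USD cost of one API call.

    Assumption: cached tokens are billed at `cached_rate` and nothing else
    is charged for caching (no write or storage fee).
    """
    total = (fresh_input_tokens * input_rate
             + cached_input_tokens * cached_rate
             + output_tokens * output_rate)
    return total / 1_000_000


# Hypothetical workload: a full 1M-token context and a 2,000-token answer.
first_call = call_cost(1_000_000, 0, 2_000)    # context sent fresh
repeat_call = call_cost(0, 1_000_000, 2_000)   # same context served from cache

print(f"first call:  ${first_call:.4f}")
print(f"repeat call: ${repeat_call:.4f}")
```

Under these assumptions the first call lands near $1.26 and each cached repeat near $0.26, so the input side dominates the bill until the context is cached.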

After June 22: The Standard Price Floor

Once the promotion ends, Qwen 3.7 Max reverts to list price: $2.50 per million input tokens and $7.50 per million output tokens. That's a clean 2× jump on both sides.

The output price doubling matters most for generation-heavy applications — detailed reports, code generation, multi-step reasoning chains. If you're budgeting for an app that ships into production after this weekend, plan around list pricing rather than the promotional rate.
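For budgeting, a short sketch like the following can make the difference concrete. The rates come from this article; the `Rates` class, the `monthly_cost` helper, and the monthly token volumes are hypothetical placeholders to swap for your own traffic estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per 1M tokens."""
    input_per_m: float
    output_per_m: float


# Figures quoted in this article.
PROMO = Rates(input_per_m=1.25, output_per_m=3.75)
LIST = Rates(input_per_m=2.50, output_per_m=7.50)


def monthly_cost(rates, input_tokens, output_tokens):
    """Estimated monthly bill in USD for the given token volumes."""
    return (input_tokens * rates.input_per_m
            + output_tokens * rates.output_per_m) / 1_000_000


# Hypothetical workload: 400M input and 100M output tokens per month.
workload = {"input_tokens": 400_000_000, "output_tokens": 100_000_000}

promo_bill = monthly_cost(PROMO, **workload)
list_bill = monthly_cost(LIST, **workload)

print(f"promotional: ${promo_bill:,.2f}")
print(f"list:        ${list_bill:,.2f}")
print(f"increase:    {list_bill / promo_bill:.1f}x")
```

Because both rates double, the bill doubles regardless of the input/output mix; the mix only changes how much of the increase comes from output.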

How Qwen 3.7 Max API Cost Compares to Other Frontier Models

At list price, Qwen 3.7 Max is still one of the cheaper frontier-class models, particularly on output cost. For comparison:

Claude Fable 5: $10.00 / $50.00 per million tokens (4× Qwen's input cost, nearly 7× on output)
Grok 5: $3.00 / $15.00 per million tokens (1.2× Qwen's input cost, 2× on output)
Qwen 3.7 Max (list): $2.50 / $7.50 per million tokens
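The multiples in this comparison can be checked with a few lines of Python. The prices are the list rates quoted above; the dictionary layout and the `ratios_vs` helper are illustrative names, not part of any provider's API.

```python
# List prices in USD per 1M tokens (input, output), as quoted in this article.
LIST_PRICES = {
    "Qwen 3.7 Max": (2.50, 7.50),
    "Claude Fable 5": (10.00, 50.00),
    "Grok 5": (3.00, 15.00),
}


def ratios_vs(baseline, prices):
    """Return each other model's (input, output) price as a multiple of the baseline."""
    base_in, base_out = prices[baseline]
    return {
        name: (inp / base_in, out / base_out)
        for name, (inp, out) in prices.items()
        if name != baseline
    }


ratios = ratios_vs("Qwen 3.7 Max", LIST_PRICES)
for name, (in_ratio, out_ratio) in ratios.items():
    print(f"{name}: {in_ratio:.1f}x input, {out_ratio:.1f}x output")
```

Computing the ratios separately for input and output shows that the gap is wider on output for both competitors.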

GPT-5.5 Instant is reportedly positioned in the budget-to-mid tier as well, though firm list prices vary by access channel. The key differentiator for Qwen 3.7 Max relative to comparably priced alternatives is the 1M-token context window, which most peers in its price band don't match.

One honest caveat: raw token cost isn't the whole picture. A model that requires more back-and-forth or produces less accurate outputs on your specific task can end up costing more in practice, even if its per-token price is lower.
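One way to put a number on that caveat is cost per successful task rather than cost per token. The sketch below is a simplification: it assumes each attempt succeeds independently with a fixed probability, so the expected number of attempts is 1 divided by the success rate. Both models, their rates, and their success rates are invented placeholders, not measurements of any real model.

```python
def cost_per_success(input_tokens, output_tokens,
                     input_rate, output_rate, success_rate):
    """Expected USD cost to get one successful result.

    Assumption: attempts are independent and retried until one succeeds,
    so expected attempts = 1 / success_rate. Rates are USD per 1M tokens.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    per_attempt = (input_tokens * input_rate
                   + output_tokens * output_rate) / 1_000_000
    return per_attempt / success_rate


# Hypothetical task size: 20,000 input tokens and 2,000 output tokens.
TASK = {"input_tokens": 20_000, "output_tokens": 2_000}

# Hypothetical models: A is cheaper per token but fails more often.
cost_a = cost_per_success(**TASK, input_rate=2.00, output_rate=6.00,
                          success_rate=0.50)
cost_b = cost_per_success(**TASK, input_rate=3.00, output_rate=12.00,
                          success_rate=0.90)

print(f"model A: ${cost_a:.4f} per successful task")
print(f"model B: ${cost_b:.4f} per successful task")
```

With these made-up figures the pricier model B comes out cheaper per successful task, which is why measuring success rates on your own workload matters before comparing list prices.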

What the 1M-Token Context Window Changes in Practice

A million-token context window isn't a novelty — it changes which tasks become practical at all. Feeding an entire codebase, processing a lengthy legal document end-to-end, or holding very long conversation threads without chunking all benefit from extended context. At list pricing of $2.50 per million input tokens, filling that context window costs $2.50 per call before any caching. That's cheap enough to make the architecture viable for applications where it previously wasn't.

Most models with comparable reasoning benchmarks still cap at 128K–200K tokens. If long-context processing is core to your workload, Qwen 3.7 Max and a small group of peers are the realistic options — and API cost per token is a real differentiator when those calls run at scale.

Who Actually Benefits From the Price Change — And Who Doesn't

The promotional-to-list price jump affects developers using the Qwen API directly with their own API keys. If you access Qwen through an AI subscription product that doesn't pass through underlying provider rates, the price change likely won't hit your bill — the platform absorbs it.

This is the fundamental split in how AI costs flow. Developers with direct API access have real exposure to pricing changes in both directions (discounts and hikes), while subscription users pay a fixed rate regardless of what the underlying model costs. Bring-your-own-key tools like ByteChat sit explicitly on the API-direct side — users see the actual provider rate, including windows like this one, with no markup added on top.

If Qwen 3.7 Max fits your use case, the next 48 hours are the cheapest API access to it you'll see for a while.

Frequently asked questions

What does Qwen 3.7 Max cost after the June 22, 2026 promotion ends?

After the promotion, Qwen 3.7 Max reverts to $2.50 per million input tokens and $7.50 per million output tokens. The promotional cached-input rate of $0.25 per million also ends; standard cache pricing applies from June 23 onward.

Is Qwen 3.7 Max cheaper than Claude Fable 5?

At list price, yes — significantly. Qwen 3.7 Max is $2.50/$7.50 per million tokens versus Claude Fable 5 at $10/$50 per million. The trade-off is that Fable 5 is Anthropic's top-tier model targeting the most demanding reasoning tasks, while Qwen 3.7 Max is positioned as a high-context-efficiency frontier model at a lower price point.

How does the Qwen 3.7 Max 1M-token context window affect API cost?

Filling the full 1M-token context costs approximately $2.50 at list input pricing. With caching (during the promotion, $0.25/1M), repeated calls on the same long context become very cheap — around $0.25 per full-context reload. This makes Qwen 3.7 Max particularly cost-effective for document-heavy or long-session applications.