Microsoft Is Cutting Most Internal Claude Code Licenses by June 30 Over Runaway AI Costs
- Microsoft's Experiences & Devices division -- responsible for Windows, Microsoft 365, and Teams -- is canceling most internal Claude Code licenses by June 30, 2026 after per-engineer API costs reached $500 to $2,000 per month.
- Uber's engineering org saw Claude Code adoption surge from 32% to 84% of roughly 5,000 engineers by March 2026, exhausting the company's entire planned 2026 AI coding budget in four months.
- Microsoft is redirecting engineers to GitHub Copilot CLI, a cheaper, less capable tool it controls, rather than absorbing open-ended token billing from a third-party provider.
- The incidents reveal a wider enterprise reckoning with opaque AI token billing -- pay-per-token models unlock powerful tools but require spending visibility and controls that many early deployments lacked.
Microsoft's Experiences & Devices division -- the team behind Windows, Microsoft 365, Outlook, Teams, and Surface -- will cancel most of its internal Claude Code licenses by June 30, 2026, according to multiple reports. The reason: token billing that reached an estimated $500 to $2,000 per engineer per month had consumed the division's AI budget well ahead of schedule.
What happened at Microsoft and Uber
Microsoft was not the first large company to hit the enterprise AI coding cost wall. Uber's chief technology officer disclosed to The Information in April that Claude Code adoption in the company's roughly 5,000-engineer organization had jumped from 32% to 84% of staff by March 2026, burning through the company's entire planned 2026 AI coding budget in just four months.
The pattern at both companies is the same: developers found the tool genuinely useful and used it constantly. The budget did not break because the tool underdelivered -- it broke because engineers used it so heavily that per-token costs compounded into numbers no one had planned for.
Why enterprise AI coding costs are so hard to predict
Flat subscription pricing is predictable: you know the ceiling. API token billing works differently. A developer running an AI coding assistant through a quiet afternoon might spend very little. But reading files, writing patches, running tests, and iterating through errors each generate token exchanges, and high-capability models charge a premium on output tokens. At scale -- thousands of engineers, dozens of interactions per hour -- the total grows fast, and without real-time visibility or hard spending caps, the first warning often comes when the monthly invoice arrives.
That is what caught both Uber and Microsoft. The tools were working; the accounting was invisible.
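The compounding described above can be sketched with a few lines of arithmetic. This is a minimal illustration, not any provider's billing formula: the per-million-token prices, the token counts per interaction, and the working pattern are all placeholder assumptions chosen only to show how a small per-call cost scales.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """US dollars per million tokens. Values used below are placeholders."""
    input_per_mtok: float
    output_per_mtok: float


def interaction_cost(price: ModelPrice, input_tokens: int, output_tokens: int) -> float:
    """Cost of a single request/response exchange."""
    return (input_tokens * price.input_per_mtok
            + output_tokens * price.output_per_mtok) / 1_000_000


def monthly_cost_per_engineer(price: ModelPrice, input_tokens: int, output_tokens: int,
                              interactions_per_hour: int, hours_per_day: int,
                              workdays: int) -> float:
    """Scale one interaction up to a month of steady use."""
    per_call = interaction_cost(price, input_tokens, output_tokens)
    return per_call * interactions_per_hour * hours_per_day * workdays


# Hypothetical inputs: output tokens priced at 5x input tokens,
# 24 interactions per hour, 6 active hours a day, 21 workdays a month.
price = ModelPrice(input_per_mtok=2.0, output_per_mtok=10.0)
per_engineer = monthly_cost_per_engineer(
    price, input_tokens=20_000, output_tokens=1_000,
    interactions_per_hour=24, hours_per_day=6, workdays=21,
)
print(f"one interaction: ${interaction_cost(price, 20_000, 1_000):.3f}")
print(f"one engineer, one month: ${per_engineer:,.2f}")
```

Under these assumptions a single exchange costs about five cents, yet a month of steady use lands around $150 per engineer. Doubling the context read per call or the number of calls per hour doubles the bill, which is why the range between light and heavy users can be so wide.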
What Microsoft is replacing Claude Code with
Microsoft is steering Experiences & Devices engineers toward GitHub Copilot CLI -- a tool Microsoft already owns and controls, priced at a flat $10 per month (Pro) or $19 per seat per month (Business), with usage caps built in. The capability ceiling is lower than Claude Code's, but the cost structure is predictable.
The decision illustrates the central enterprise AI trade-off: open-ended capability with open-ended billing versus constrained capability with constrained billing. For a large division with thousands of engineers and a fixed annual budget, predictability currently wins.
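To put rough numbers on that trade-off, the sketch below compares annual spend under the two billing shapes. The $19 seat price and the $500 to $2,000 monthly range are the figures cited in this article; the headcount of 1,000 is a hypothetical chosen for illustration, not a reported number for the division.

```python
FLAT_SEAT_PRICE = 19.0               # $/seat/month, Business tier cited above
TOKEN_COST_RANGE = (500.0, 2000.0)   # $/engineer/month, reported range cited above
HEADCOUNT = 1_000                    # hypothetical division size (assumption)


def annual_cost(per_engineer_monthly: float, headcount: int) -> float:
    """Twelve months of a fixed per-engineer monthly cost."""
    return per_engineer_monthly * headcount * 12


flat = annual_cost(FLAT_SEAT_PRICE, HEADCOUNT)
low, high = (annual_cost(cost, HEADCOUNT) for cost in TOKEN_COST_RANGE)

print(f"flat seats:    ${flat:,.0f} per year")
print(f"token billing: ${low:,.0f} to ${high:,.0f} per year")
print(f"ratio:         {low / flat:.0f}x to {high / flat:.0f}x")
```

The comparison ignores the difference in capability, so it says nothing about value per dollar. It only shows why a fixed annual budget is far easier to defend with a known per-seat ceiling.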
What individual developers and small teams should take from this
At smaller scale -- individuals and teams under a few dozen engineers -- the calculus is different. Monthly API costs in the tens or low hundreds of dollars are manageable and, for heavy users, often cheaper than a comparable subscription tier. The critical variable is the same one that tripped up enterprise teams: cost visibility.
If you can see what you are spending per session, per model, and per task in real time, you can route lighter work to cheaper models and reserve the most capable ones for tasks that actually need them. That transparency is the core premise behind bring-your-own-key tools like ByteChat -- you pay the API provider's listed price with zero markup, and per-model spend is visible as you go.
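A minimal sketch of that routing-plus-visibility idea follows. The model names, prices, and task categories are all hypothetical placeholders, and the routing rule is deliberately crude; it is not a description of how ByteChat or any other tool works internally.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Placeholder model names and prices: ($ per million input, $ per million output).
PRICES = {
    "small-model": (0.5, 2.0),
    "large-model": (5.0, 25.0),
}

# Hypothetical task kinds that justify the more capable model.
COMPLEX_TASKS = {"refactor", "debug", "design"}


def pick_model(task_kind: str) -> str:
    """Send complex work to the large model and everything else to the small one."""
    return "large-model" if task_kind in COMPLEX_TASKS else "small-model"


@dataclass
class SpendLedger:
    """Running totals, broken down by model and by task kind."""
    by_model: dict = field(default_factory=lambda: defaultdict(float))
    by_task: dict = field(default_factory=lambda: defaultdict(float))

    def record(self, model: str, task_kind: str,
               input_tokens: int, output_tokens: int) -> float:
        in_price, out_price = PRICES[model]
        cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
        self.by_model[model] += cost
        self.by_task[task_kind] += cost
        return cost

    @property
    def total(self) -> float:
        return sum(self.by_model.values())


ledger = SpendLedger()
session = [("rename", 4_000, 500), ("debug", 40_000, 4_000), ("docstring", 6_000, 1_000)]
for task_kind, in_tok, out_tok in session:
    model = pick_model(task_kind)
    cost = ledger.record(model, task_kind, in_tok, out_tok)
    print(f"{task_kind:<10} {model:<12} ${cost:.4f}  (session total ${ledger.total:.4f})")
```

In this toy session the single debugging task accounts for nearly all of the spend, which is the kind of signal a per-task breakdown is meant to surface. In a real tool the token counts would come from the provider's usage data for each response rather than from hard-coded values.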
Frequently asked questions
Why is Microsoft cutting Claude Code licenses in June 2026?
Microsoft's Experiences & Devices division is canceling most internal Claude Code licenses because token billing reportedly reached $500 to $2,000 per engineer per month, exhausting the division's AI budget ahead of schedule. Engineers are being moved to GitHub Copilot CLI by June 30, 2026.
How much can Claude Code cost for enterprise teams?
Claude Code bills at Anthropic's standard API rates. In heavy daily use across large engineering organizations -- with thousands of file reads, edits, and test cycles per engineer -- costs can reach hundreds to thousands of dollars per person per month, particularly when high-capability models handle output-intensive tasks at premium per-token rates.
How do developers avoid surprise AI API bills?
Set hard spending limits or alerts through your API provider's billing dashboard before rolling out AI coding tools broadly. Track per-model and per-session spend in real time so you catch runaway costs early. Routing lighter tasks to cheaper models and reserving high-capability models for complex reasoning can cut costs significantly without a noticeable quality drop.
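Provider-side limits are the first line of defense, but the same idea can be mirrored on the client. The sketch below is a hypothetical in-process guard, not part of any provider's SDK: the class names, the 80% alert threshold, and the dollar amounts are assumptions for illustration. It refuses any call whose estimated cost would push spend past a hard cap and prints a one-time alert once a soft threshold is crossed.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a call would push spend past the hard cap."""


class BudgetGuard:
    """Hypothetical client-side guard: soft alert plus hard monthly cap."""

    def __init__(self, monthly_limit: float, alert_fraction: float = 0.8):
        if monthly_limit <= 0:
            raise ValueError("monthly_limit must be positive")
        self.monthly_limit = monthly_limit
        self.alert_at = monthly_limit * alert_fraction
        self.spent = 0.0
        self.alerted = False

    def charge(self, estimated_cost: float) -> None:
        """Record a cost, or raise before it is incurred if it breaks the cap."""
        if self.spent + estimated_cost > self.monthly_limit:
            raise BudgetExceeded(
                f"${estimated_cost:.2f} would exceed the ${self.monthly_limit:.2f} cap "
                f"(${self.spent:.2f} already spent)"
            )
        self.spent += estimated_cost
        if not self.alerted and self.spent >= self.alert_at:
            self.alerted = True
            print(f"ALERT: ${self.spent:.2f} of ${self.monthly_limit:.2f} used")


guard = BudgetGuard(monthly_limit=100.0)
blocked = False
for cost in (50.0, 35.0, 20.0):
    try:
        guard.charge(cost)
    except BudgetExceeded as exc:
        blocked = True
        print(f"blocked: {exc}")
```

Here the second charge triggers the alert at $85 and the third is rejected outright. Checking before the call rather than after matters: a cap that only reports overspend once the invoice arrives repeats the failure mode described above.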
The Microsoft and Uber situations are not a verdict on AI coding tools -- they are a signal that cost visibility needs to catch up with capability.