Logo
Back to Blog
AI & LLMsJune 3, 202615 min read

Microsoft MAI Models Developer Guide: 7 In-House AI Models

At Build 2026, Microsoft launched seven in-house MAI models across reasoning, coding, image, voice, and speech, all trained from scratch with zero distillation. Full developer breakdown: MAI-Thinking-1, MAI-Code-1-Flash, MAI-Image-2.5, MAI-Transcribe-1.5, MAI-Voice-2, benchmarks, Foundry pricing, how to access them, and where each fits in your stack.

Lushbinary Team

Lushbinary Team

AI & Cloud Solutions

Microsoft MAI Models Developer Guide: 7 In-House AI Models

On June 2, 2026, at its Build developer conference, Microsoft did something it had never done before: it shipped a full family of frontier AI models built entirely in-house. Seven of them, spanning reasoning, coding, image generation, transcription, and voice. The headline is not just the models themselves, but how they were made. Every one was trained from scratch, with what Microsoft AI chief Mustafa Suleyman called "zero distillation," on clean and appropriately licensed data.

This matters because Microsoft has spent years building its AI business on OpenAI, and more recently Anthropic. The MAI family is a deliberate step toward what Microsoft calls "long-term self-sufficiency." For developers and enterprises, that translates into models Microsoft owns end-to-end, running on Microsoft's own cloud and silicon, which the company says lets it drive down token costs and tune models to your exact workflows.

This guide breaks down all seven models: what each one does, the benchmarks Microsoft published, the Foundry pricing where available, how to access them, and where each fits in a real production stack. We have separate deep dives for MAI-Thinking-1, MAI-Code-1-Flash, the speech models, and MAI-Image-2.5.

1Why Microsoft Built Its Own Models

Microsoft's relationship with OpenAI made it the most prominent beneficiary of the generative AI boom. Copilot, Azure OpenAI Service, and a long list of product integrations all leaned on OpenAI models. But dependence on a single partner carries strategic risk: pricing, roadmap, and availability are not fully in your control. In April 2026, Microsoft renegotiated its OpenAI deal, ending exclusivity and revenue-sharing arrangements. That created the contractual room to build from scratch.

The MAI family is the result. Suleyman framed it around a single idea: "All these models are built on a shared foundation, hill-climbing from the bottom with zero distillation. They share the same data discipline, the same infrastructure and the same evaluation framework." In plain terms, Microsoft trained the entire family on the same in-house pipeline rather than copying behavior from existing frontier models.

Why this matters for developers

Because Microsoft owns the models and the cloud compute that powers them, it controls the full cost structure. That is the lever it is pulling to offer competitive token pricing on Foundry, and it is why the models are deeply wired into products you already use such as GitHub Copilot, PowerPoint, OneDrive, Teams, and Dynamics 365.

Microsoft also co-designs the models with its own Maia 200 accelerators and reports a 1.4x efficiency boost from that work. The company says it does not distill from other labs and does not rely on unlicensed or opaque data, positioning provenance and trust as first-class features rather than afterthoughts.

2The Seven Models at a Glance

The family covers five modalities. Two of the seven are Flash variants, tuned for lower cost and higher throughput. Here is the full lineup with the key facts Microsoft published.

ModelModalityHeadline Spec
MAI-Thinking-1Reasoning35B-active / ~1T-total sparse MoE, 256K context
MAI-Code-1-FlashCoding5B params, +16 pts over Haiku 4.5 on SWE-Bench Pro
MAI-Image-2.5ImageNo. 2 image editing on Arena, No. 3 text-to-image
MAI-Image-2.5-FlashImageLower-cost, scalable generation and editing
MAI-Transcribe-1.5Speech-to-text43 languages, 2.4% WER, 1 hr audio in under 15s
MAI-Voice-2Text-to-speech15 languages, voice cloning, emotion control
MAI-Voice-2-FlashText-to-speechUltra-efficient, lower-cost (coming soon)

A few of these are point upgrades of models Microsoft shipped earlier in the spring (MAI-Image-2, MAI-Voice-1, MAI-Transcribe-1), while MAI-Thinking-1 and MAI-Code-1-Flash are brand new entries into reasoning and coding.

Shared Training FoundationZero distillation, clean licensed data, Maia 200Thinking-1ReasoningCode-1-FlashCodingImage-2.5ImageTranscribe-1.5SpeechVoice-2TTSMicrosoft Foundry + 1P ProductsCopilot, VS Code, PowerPoint, OneDrive, Teams, Dynamics 365

3Reasoning & Coding: MAI-Thinking-1 and MAI-Code-1-Flash

MAI-Thinking-1

MAI-Thinking-1 is the flagship and Microsoft's first in-house reasoning model. It is a 35B-active, ~1T-total parameter sparse Mixture of Experts model with a 256K token context window, which Microsoft says is enough to process a 600-page document in a single pass. Despite the smaller active footprint, Microsoft reports it is toe-to-toe with Claude Opus 4.6 on SWE-Bench Pro, reaches 97.0% on AIME 2025 and 94.5% on AIME 2026, and was preferred over Claude Sonnet 4.6 in a blind human evaluation across 1,276 tasks run by its rating partner Surge.

It supports function calling, developer instructions, and the widely used Chat Completions API, and ships with enterprise security and compliance through Microsoft Foundry. For the full breakdown of architecture, benchmarks, and access, read our MAI-Thinking-1 developer guide.

MAI-Code-1-Flash

MAI-Code-1-Flash is a 5 billion parameter agentic coding model built end-to-end for the GitHub Copilot and VS Code harness. Microsoft trained it directly against the production Copilot harness, so it learns to interact with surrounding tools rather than just pass offline benchmarks. It outperforms Claude Haiku 4.5 across SWE-Bench Verified, SWE-Bench Pro, SWE-Bench Multilingual, and Terminal Bench 2, including a +16-point lead on SWE-Bench Pro (51.2% vs 35.2%), while solving harder problems with up to 60% fewer tokens.

It is rolling out to GitHub Copilot individual users in VS Code, both in the model picker and the default Auto picker. See our MAI-Code-1-Flash guide for the full benchmark tables and setup.

4Image, Voice & Speech Models

The multimodal side of the family is where Microsoft has the longest track record, and the new versions push hard on quality and cost.

  • MAI-Image-2.5 ranks No. 2 for image editing on Arena, ahead of Nano Banana 2.1, and No. 3 for text-to-image. It supports precise localized edits and preserves facial identity across pose and expression changes. The standard model is $5 per 1M text input tokens, $8 per 1M image input tokens, and $47 per 1M image output tokens on Foundry; the Flash variant is $1.75 / $1.75 / $19.50 respectively.
  • MAI-Transcribe-1.5 covers 43 languages with a 2.4% Word Error Rate on the Artificial Analysis leaderboard, can transcribe an hour of audio in under 15 seconds, and adds keyword biasing that cuts WER by up to 30% on domain-specific terms.
  • MAI-Voice-2 expands to 15 languages with emotion tags, zero-shot voice cloning from 5 to 60 seconds of reference audio, and code-switching for pairs like Hindi-English. It was preferred over MAI-Voice-1 72% of the time.

We cover the speech stack in the MAI-Voice-2 and MAI-Transcribe-1.5 guide and the image model in the MAI-Image-2.5 guide.

5The Hill-Climbing Machine & Frontier Tuning

Microsoft frames the MAI program around what it calls a "hill-climbing machine": a co-designed pipeline where every component of model development can be improved continually, cycle after cycle, as the team adds better data, stronger rewards, more capable environments, and more compute. Three principles guide it: capabilities should be learned rather than inherited, training data must be clean and licensed, and the entire stack from silicon to reinforcement learning framework is built in-house.

The more practical piece for enterprises is Frontier Tuning. Instead of a static model, Microsoft lets you adapt MAI models to your own workflows using reinforcement learning environments (RLEs) trained on the traces of real work your teams complete. Microsoft describes these as training gyms accessible only to you, where your institutional knowledge becomes part of a model that stays yours.

The efficiency claim to watch

Microsoft says a MAI model tuned for Excel matches GPT-5.4 while being up to 10x more efficient, and a model tuned to McKinsey's enterprise standards achieved the highest win rate of any model tested at roughly 10x lower cost. These are vendor-reported figures from controlled internal evaluations, so treat them as directional until independent benchmarks land.

Microsoft also announced a healthcare collaboration with Mayo Clinic to co-create a frontier clinical model owned by Mayo Clinic, first deployed in its own environment and later made available through Azure Foundry. For teams thinking about workflow-specific models, our guide on eval-driven development for LLM agents pairs well with the Frontier Tuning approach.

6How to Access the MAI Models

Availability differs by model, since some shipped to products before they reached the API. Here is where things stand as of the June 2, 2026 launch.

ModelWhere to Access
MAI-Thinking-1Microsoft Foundry private preview; MAI Playground public preview soon; Baseten for weight tuning
MAI-Code-1-FlashGitHub Copilot in VS Code (model picker + Auto picker)
MAI-Image-2.5 / FlashMicrosoft Foundry, MAI Playground, OpenRouter; live in PowerPoint and OneDrive
MAI-Transcribe-1.5Microsoft Foundry, MAI Playground; integrated into Copilot, Teams, GitHub, Dynamics 365
MAI-Voice-2Azure Foundry, MAI Playground; integrating into VS Code and Dynamics 365 Contact Center

Notably, Microsoft says that for the first time developers will be able to tune the weights of the models themselves through partners like Fireworks and Baseten, alongside the managed Foundry experience.

7Which MAI Model Should You Use?

The family is designed to compose, so most real applications will use more than one. A simple decision guide:

  • Complex reasoning, long documents, agents: MAI-Thinking-1 for its 256K context and strong SWE-Bench Pro and AIME results.
  • Everyday coding inside Copilot: MAI-Code-1-Flash for fast, token-efficient assistance that is tuned for the VS Code harness.
  • Product imagery and editing: MAI-Image-2.5 for maximum fidelity, MAI-Image-2.5-Flash for high-volume, cost-sensitive generation.
  • Transcription and meeting intelligence: MAI-Transcribe-1.5 for multilingual accuracy, speed, and keyword biasing.
  • Branded voice and audio experiences: MAI-Voice-2 for expressive multilingual TTS with consent-gated voice cloning.

If you are weighing MAI against other frontier options, our open-source LLM comparison and model routing guide are useful companions for building a multi-model stack.

8Why Lushbinary for MAI Integrations

A new model family is exciting, but production value comes from integration: routing the right request to the right model, controlling cost, and wiring models into real workflows. Lushbinary builds production AI integrations across Azure, AWS, and multi-cloud environments, and we help teams adopt new model families without betting the whole stack on a single vendor.

  • Model evaluation - we benchmark MAI models against your real tasks before you commit, so the choice is data-driven
  • Microsoft Foundry & Azure integration - secure, compliant deployment with the observability and guardrails enterprise teams need
  • Multi-model routing - blend MAI, OpenAI, Anthropic, and open-weights models behind a single gateway to optimize cost and quality
  • Frontier Tuning & custom models - design the data pipelines and reinforcement learning environments that make workflow-specific models pay off

🚀 Free Consultation

Evaluating Microsoft's MAI models for your product? Lushbinary will scope your use case, recommend the right model mix, and give you a realistic integration plan with cost projections, no obligation.

9Frequently Asked Questions

What are the seven Microsoft MAI models?

Microsoft launched seven in-house MAI models at Build 2026 on June 2, 2026: MAI-Thinking-1 (reasoning), MAI-Code-1-Flash (agentic coding), MAI-Image-2.5 and MAI-Image-2.5-Flash (image generation and editing), MAI-Transcribe-1.5 (speech-to-text), and MAI-Voice-2 with MAI-Voice-2-Flash coming soon (text-to-speech). All are trained from scratch with zero distillation on clean, licensed data.

Are the MAI models built on OpenAI technology?

No. Microsoft states all MAI models were trained end-to-end in-house with zero distillation from third-party models, using clean and appropriately licensed data with AI-generated content excluded from pre-training. The launch is explicitly framed as a move toward long-term self-sufficiency and reduced reliance on OpenAI and Anthropic.

Where can developers access the MAI models?

MAI models are distributed through Microsoft Foundry (Azure), with MAI-Thinking-1 in private preview. MAI-Code-1-Flash is rolling out to GitHub Copilot in VS Code. MAI-Image-2.5, MAI-Voice-2, and MAI-Transcribe-1.5 are in Foundry, and several models are also available on OpenRouter, Fireworks, and Baseten, where developers can tune the weights.

How does MAI-Thinking-1 compare to Claude and GPT models?

MAI-Thinking-1 is a 35B-active, ~1T-total sparse MoE model that Microsoft says is toe-to-toe with Claude Opus 4.6 on SWE-Bench Pro and was preferred over Claude Sonnet 4.6 in blind human side-by-side evaluations run by Surge across 1,276 tasks. It reaches 97.0% on AIME 2025 and 94.5% on AIME 2026.

What is Microsoft Frontier Tuning?

Frontier Tuning is Microsoft's approach to adapting MAI models to your own workflows using reinforcement learning environments (RLEs) trained on your data inside your environment. Microsoft reports its MAI tuned model for Excel matches GPT-5.4 while being up to 10x more efficient, and a model tuned for McKinsey's standards achieved the highest win rate at roughly 10x lower cost.

📚 Sources

Content was rephrased for compliance with licensing restrictions. Model specifications, benchmarks, and pricing sourced from official Microsoft AI and Microsoft Foundry announcements as of June 2, 2026. Benchmark and pricing figures are vendor-reported and may change - always verify on Microsoft's website.

Building With Microsoft MAI Models?

From model evaluation to Foundry integration and multi-model routing, Lushbinary helps teams ship AI features that are fast, cost-efficient, and production-ready. Let's talk about your project.

Ready to Build Something Great?

Get a free 30-minute strategy call. We'll map out your project, timeline, and tech stack - no strings attached.

Let's Talk About Your Project

Prefer email? Reach us directly:

Contact Us

Subscribe · Newsletter

Build With Microsoft MAI Models

Get practical guides on frontier models, agents, and cost control.

  • New deep-dives on AI agents and cloud architecture
  • Engineering teardowns of shipped products
  • No spam, unsubscribe in one click

We respect your inbox. Read our privacy policy.

Exclusive Offer for Lushbinary Readers
WidelAI

One Subscription. Every Flagship AI Model.

Stop juggling multiple AI subscriptions. WidelAI gives you access to Claude, GPT, Gemini, and more - all under a single plan.

Claude Opus & SonnetGPT-5.5 & o3Gemini ProSingle DashboardAPI Access

Use code at checkout for 10% off your subscription:

Microsoft MAIMAI ModelsMAI-Thinking-1MAI-Code-1-FlashMAI-Image-2.5MAI-Voice-2MAI-Transcribe-1.5Microsoft FoundryMustafa SuleymanLLM BenchmarksMultimodal AIBuild 2026

ContactUs