For the past few years, progress in AI has mostly meant one thing: bigger models trained on more data. Sakana AI, the Tokyo research lab, just shipped a product built on a different bet. On June 22, 2026, it launched Sakana Fugu, a multi-agent orchestration system that you talk to as if it were a single foundation model. Send a request to one OpenAI-compatible endpoint, and Fugu decides whether to answer directly or to assemble a team of specialist models behind the scenes.

The twist that makes Fugu interesting is that the orchestrator is itself a language model. Fugu was trained to call other LLMs in an agent pool, including instances of itself recursively, and to handle model selection, delegation, verification, and synthesis on its own. Sakana frames this as the next frontier: not owning the single smartest model, but learning how to route across many strong ones.

This guide breaks down what actually shipped: the orchestration idea, how the architecture works, the two tiers, the pricing, the vendor-reported benchmarks and their caveats, and where an orchestration model is the right tool versus where a single model still wins.

What This Guide Covers

What Is Sakana Fugu?
Orchestration Models: The Idea Behind Fugu
How Fugu Works: Route, Delegate, Verify, Synthesize
Fugu vs Fugu Ultra: The Two Tiers
Pricing and Access
Benchmarks and the Caveats That Come With Them
Where Fugu Fits (and Where It Does Not)
Why Lushbinary for Orchestration-Model Builds

1What Is Sakana Fugu?

Sakana Fugu is a multi-agent orchestration system delivered as a single, OpenAI-compatible model API. From the caller's side it looks like any other chat model: one base URL, one API key, one chat/completions request. What happens inside is the difference. Fugu treats a set of high-performing models as an agent pool and combines them according to your input, managing the whole exchange so you do not have to.

Sakana's own framing on the release page is that Fugu is "a language model trained to call various LLMs in an agent pool, including instances of itself recursively." In plain terms: the thing routing your request is not a rules engine or a classifier bolted onto a gateway, it is a model that learned coordination as a skill.

The launch also drew attention for its timing. It arrived days after a US export-control directive cut global access to Anthropic's frontier Fable 5 and Mythos 5 models, and Sakana pitched Fugu as a way to reach frontier-level capability without depending on any single vendor. We cover that angle in depth in our Fugu Ultra vs Fable 5 and Mythos comparison.

2Orchestration Models: The Idea Behind Fugu

The argument Sakana makes is straightforward. Hard, real-world tasks need a mix of skills: planning, coding, math, retrieval, careful checking. No single benchmark, and arguably no single model, is best at all of them at once. So the highest performance comes from collective intelligence: knowing which model to use for each part, delegating the work, and combining domain-specific strengths while routing around individual weaknesses.

Teams have done this manually for a while with multi-agent frameworks and LLM gateways. The catch is that those systems are complex to build and tune: you write the routing rules, the verification steps, the fallback logic, and you maintain all of it. Fugu's pitch is that the orchestration itself is learned and packaged behind one API, so you get the benefit of a coordinated team without standing up the coordination yourself.

The core shift

Monolithic models compete on size and training. Orchestration models compete on coordination: which specialist to call, when to verify, and how to combine answers. The unit of competition moves from one model to the system that conducts many.

3How Fugu Works: Route, Delegate, Verify, Synthesize

When a request arrives, Fugu runs it through a learned coordination process rather than a fixed pipeline. For a simple prompt it can answer directly to keep latency and cost low. For a complex, multi-step task it assembles a coordinated group of models and assigns roles, commonly described as a Thinker that plans, a Worker that executes, and a Verifier that checks the work, before synthesizing a single answer.

Two design choices stand out. First, the orchestrator can call instances of itself recursively, so a hard subtask can be decomposed again rather than forced onto one model. Second, the coordination is learned, which means Fugu can pick up collaboration patterns that a human writing routing rules might not think to encode.

Reporting around the launch describes a compact orchestrator, on the order of a 7B-parameter conductor trained with reinforcement learning, steering much larger frontier models in the pool. Sakana has not published a full technical paper at the time of writing, so treat the internal sizing details as reported rather than confirmed, and weight the behavior you can observe through the API over architecture claims.

4Fugu vs Fugu Ultra: The Two Tiers

Sakana ships two tiers behind the same API, so you choose per request which behavior you want.

Tier	Built for	Tradeoff
Fugu	Everyday, latency-sensitive work: chat, coding, code review	Prioritizes speed and lower cost
Fugu Ultra	Hard, multi-step problems where quality matters most	Coordinates a larger pool, so more latency and tokens

The flagship variant carries the model id fugu-ultra-20260615. A practical pattern is to default to Fugu for interactive and high-volume traffic, and reserve Fugu Ultra for the requests where a wrong answer is expensive: a complex refactor, a multi-file change, a hard reasoning or research task.

5Pricing and Access

As of June 2026, Sakana lists pay-as-you-go pricing for Fugu Ultra at roughly $5 per million input tokens and $30 per million output tokens, alongside subscription plans in the range of $20, $100, and $200 per month. Access is through one OpenAI-compatible API, so existing clients and coding harnesses can point at Fugu by changing the base URL and key.

Watch the internal token use

Because Fugu can fan a request out across several models and verify intermediate work, a single Ultra call can consume more tokens than one call to a single model. Budget against real traffic and confirm current rates and any regional availability limits on Sakana's pricing page before committing.

6Benchmarks and the Caveats That Come With Them

At launch, Sakana reported that Fugu Ultra stands with the frontier on several headline tests. The numbers below are vendor-reported as of June 2026 and not yet independently reproduced.

Benchmark	Fugu Ultra (reported)	What it measures
SWE-Bench Pro	73.7	Real-world software engineering tasks
TerminalBench 2.1	82.1	Agentic command-line and terminal tasks
GPQA-Diamond	95.5	Graduate-level science reasoning

There is a subtlety that makes these easy to misread: Fugu Ultra is a system score, not a single-model score. A high number can reflect excellent routing to the right specialist as much as raw model capability, and that is the point of the product. For a benchmark-by- benchmark walkthrough and how to validate the claims yourself, see our Fugu Ultra benchmarks guide.

7Where Fugu Fits (and Where It Does Not)

Fugu is a strong fit when your workload is varied and you would otherwise be building your own router:

Mixed task types in one product (chat, coding, research) where no single model is best across the board.
Teams that want frontier-level quality without committing to one vendor or maintaining bespoke multi-agent plumbing.
Hard, multi-step jobs where the extra latency and tokens of Ultra are worth a more reliable answer.

A single model can still be the better call when:

Latency and cost predictability dominate, and you want one model with a known token profile per request.
You need data residency or deployment control that a hosted orchestration API over a third-party pool does not give you, in which case an open-weight model you run yourself may fit better. See our open-weight model comparison.
The task is narrow and well understood, where a tuned single model plus a small amount of your own routing is cheaper to reason about.

8Why Lushbinary for Orchestration-Model Builds

Adopting an orchestration model is less about the API call and more about the system around it: how you route between Fugu and Fugu Ultra, how you control token spend when a request fans out, how you evaluate quality against your real workload, and how you keep a fallback if a provider or policy shifts under you. That is the engineering we do.

Lushbinary builds AI products and the infrastructure that makes them reliable: model routing, evaluation harnesses, cost controls, and provider-agnostic architectures that survive a vendor change. Whether you want to pilot Fugu against your current stack or design a routing layer you control, we can help you scope it and ship it.

🚀 Free Consultation

Curious whether an orchestration model like Fugu belongs in your stack? Lushbinary will review your workload, recommend a routing and evaluation approach, and give you a realistic plan with no obligation.

❓ Frequently Asked Questions

What is Sakana Fugu?

Sakana Fugu is a multi-agent orchestration system from Sakana AI, launched June 22, 2026, delivered as one OpenAI-compatible model API. Fugu is itself a language model trained to call a pool of other LLMs, and instances of itself recursively, then route, delegate, verify, and synthesize behind one endpoint.

How is Fugu different from a normal LLM?

A normal LLM answers from its own weights. Fugu decides whether to answer directly or to assemble a team of specialist models, assign roles, verify intermediate work, and combine outputs. From the outside it still behaves like a single model.

What are the Fugu and Fugu Ultra tiers?

Fugu is the standard tier, tuned for low latency and everyday work. Fugu Ultra (model id fugu-ultra-20260615) coordinates a larger pool of agents for hard, multi-step tasks where quality matters more than speed.

How much does Sakana Fugu cost?

Per Sakana's June 2026 pricing, Fugu Ultra pay-as-you-go is about $5 per million input tokens and $30 per million output tokens, with subscription plans around $20, $100, and $200 per month. Orchestration can use more tokens internally, so budget against real traffic.

Are Fugu's benchmark results independently verified?

No. The headline scores (Fugu Ultra at SWE-Bench Pro 73.7, TerminalBench 2.1 82.1, GPQA-Diamond 95.5) are vendor-reported as of launch. Validate them with your own evaluation.

Sources

Content was rephrased for compliance with licensing restrictions. Specifications, pricing, and benchmark data sourced from official Sakana AI materials as of June 2026. Benchmark figures are vendor-reported and may change. Always verify on Sakana's website.

Thinking About Orchestration Models?

Tell us about your workload and we will help you decide whether Sakana Fugu, a single model, or your own routing layer is the right fit, then build it with you.

Ready to Build Something Great?

Q: How is Fugu different from a normal LLM?

A normal LLM answers from its own weights. Fugu is an orchestration model: it decides whether to answer directly or to assemble a team of specialist models, assign roles, verify intermediate work, and combine the outputs. You still call one endpoint, so it behaves like a single model from the outside.

Q: How much does Sakana Fugu cost?

Per Sakana's pricing as of June 2026, Fugu Ultra pay-as-you-go is about $5 per million input tokens and $30 per million output tokens, with subscription plans around $20, $100, and $200 per month. Always confirm current pricing on Sakana's site, since orchestration can consume more tokens internally than a single model.

Get a free 30-minute strategy call. We'll map out your project, timeline, and tech stack - no strings attached.

Let's Talk About Your Project

Prefer email? Reach us directly:

connect@lushbinary.com

Sakana Fugu: The Multi-Agent Orchestration Model

What This Guide Covers

1What Is Sakana Fugu?

2Orchestration Models: The Idea Behind Fugu

3How Fugu Works: Route, Delegate, Verify, Synthesize

4Fugu vs Fugu Ultra: The Two Tiers

5Pricing and Access

6Benchmarks and the Caveats That Come With Them

7Where Fugu Fits (and Where It Does Not)

8Why Lushbinary for Orchestration-Model Builds

❓ Frequently Asked Questions

What is Sakana Fugu?

How is Fugu different from a normal LLM?

What are the Fugu and Fugu Ultra tiers?

How much does Sakana Fugu cost?

Are Fugu's benchmark results independently verified?

Sources

Thinking About Orchestration Models?

Ready to Build Something Great?

Contact Us

Build on Orchestration Models

One Subscription. Every Flagship AI Model.

More from the Blog

Claude Tag: Anthropic's Always-On AI Teammate in Slack

Seedance 2.5: ByteDance's 30-Second AI Video Model Guide

ContactUs

Our Address

Phone

Email