The Agent Fabric Playbook: Governing AI Sprawl Before It Governs You
Your enterprise will run agents from Agentforce, Bedrock, Foundry, OpenAI, and three vendors you have not heard of yet. Without a control plane, that becomes shadow AI at machine speed. Here is the architectural playbook for MuleSoft Agent Fabric, the four pillars that govern it, and the design patterns that keep multi-agent systems from imploding under their own weight.
Agent sprawl is the new shadow IT. Within 12 months, most enterprises will run agents from five or more vendors, with no unified way to discover, govern, or observe them. MuleSoft Agent Fabric is Salesforce’s answer: a four-pillar control plane (Agent Registry, Agent Broker, Flex Gateway, Agent Visualizer) that treats agents as first-class enterprise assets. Skip the flat orchestration architecture. Build hierarchical agent networks from day one.
Walk into any enterprise IT meeting in 2026 and ask a simple question: how many AI agents are running in your environment right now? You will get one of two answers. Either a confident “around 30, give or take,” which is almost always wrong, or an honest shrug. Both answers point to the same problem.
Marketing has an Agentforce SDR agent. Service has a triage bot built on Amazon Bedrock. Finance is piloting something on Microsoft Foundry. The SaaS apps everyone uses are quietly shipping their own agents inside their products. And somewhere in engineering, a team built a custom agent on OpenAI last quarter that nobody outside that team knows about. Each one looked like a productivity win in isolation. Together, they look like 2014’s SaaS sprawl, except now the rogue tools can write to your database.
Andrew Comstock, SVP and GM of MuleSoft, called this out plainly. The reality is that most enterprises live in a multi-vendor world, and AI will not change that. The strategic challenge is no longer building a single agent. It is making all of them work together without burning down the house.
Agent sprawl is not a future problem. It is happening right now in every org with more than two AI use cases. The org that ignores it for six months will spend the next 18 months trying to inventory what is already running, who built it, what data it touches, and which agents are quietly calling each other in production.
Salesforce launched MuleSoft Agent Fabric in September 2025 as a direct response. At TDX 2026 in mid-April, they expanded it significantly with new discovery, authoring, and governance capabilities. The pitch is simple: a single control plane for every agent in your enterprise, regardless of where it was built or which LLM powers it. The execution, as always, is more complicated than the pitch.
If you have spent any time with MuleSoft, the four-pillar structure of Agent Fabric will feel familiar. That is intentional. The team behind Agent Fabric took the API management playbook (Anypoint Exchange, API Manager, Flex Gateway, Anypoint Monitoring) and applied it to a new asset type: AI agents. The naming is different, the protocols are new, but the architectural logic is the same one MuleSoft has been refining for over a decade.
Here is what each pillar actually does, and why each one matters when you are running 50 agents instead of 5.
Agent Registry
Anypoint Exchange becomes the universal catalog for agents, MCP servers, and LLM providers. Every agent, internal or third-party, gets registered with metadata, ownership, lifecycle state, and version. No more “wait, we built that?” conversations.
Agent Broker
An intelligent routing service powered by an LLM of your choice. It decomposes a request into tasks, picks the best agent for each, and coordinates the handoffs. Connected through MCP and A2A, it handles the multi-step processes that no single agent can complete alone.
Flex Gateway
Every agent-to-agent and agent-to-tool call routes through Flex Gateway. PII detection, schema validation, attribute-based access control, prompt guardrails, and token rate limits are all enforced at the network layer. Policies attach to agents, not to code.
Agent Visualizer
A real-time interactive map of your agent network. See declared and runtime interactions, identify circular invocation patterns, track per-agent token spend, and replay sessions step by step. The “who called whom and why” view that was missing from every multi-agent system before this.
The pillars are designed to work together, but they do not have to. You can adopt Agent Fabric purely as a registry and governance layer while keeping your existing orchestration. You can use Agent Broker for orchestration while routing governance through a different platform. The modularity is the point. Most enterprises will start with one pillar and expand outward, which is exactly the right approach.
The Two Protocols You Need to Care About
Agent Fabric is built on two open standards that together define how agents talk to anything else: MCP (Model Context Protocol) and A2A (Agent-to-Agent). The distinction matters more than the marketing makes it sound.
MCP (Model Context Protocol)
- Use when an agent needs to call an API, query a database, or hit any non-agent system
- The agent acts as the MCP client; the system being called acts as the MCP server
- Salesforce wraps non-MCP APIs through MuleSoft so legacy systems become agent-ready without code changes
- Most LLMs lose accuracy past about 20-25 tools per context, so curate the tool list ruthlessly
- Think of it as: REST for agents
A2A (Agent-to-Agent)
- Use when one agent needs another agent to complete a sub-task within a larger workflow
- Each agent exposes an “Agent Card” describing its skills, capabilities, and contract
- Enables multi-agent collaboration across vendor boundaries (Agentforce calling a Bedrock agent, for example)
- Flex Gateway intercepts every A2A call to enforce policies, even if the target agent is otherwise unsecured
- Think of it as: how agents introduce themselves to each other
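To make the “Agent Card” idea concrete, here is a sketch of what a card might contain, mirroring the card block in the agent-network.yaml shown later in this article. The field names and values are illustrative; the A2A specification defines the authoritative schema.

```yaml
# Hypothetical Agent Card fragment for a payments agent.
# Field names mirror this article's simplified YAML; consult
# the A2A specification for the real, complete schema.
card:
  name: "Stripe Payment Agent"
  description: "Creates, captures, and refunds payments via Stripe"
  skills: ["create-charge", "capture-charge", "issue-refund"]
```

The point of the card is discoverability: a broker reads this contract to decide whether the agent is a fit for a sub-task, without needing to know anything about how the agent is implemented.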
Most teams skim past this part of the documentation, which is a mistake. Agent Fabric uses a specification-first model: every agent network is defined in a YAML file, and that YAML is what gets registered, deployed, governed, and versioned. The execution runtime is decoupled from the definition, which is the same pattern that made Kubernetes manifests and OpenAPI specs so durable.
Here is what that looks like in practice. You open Anypoint Code Builder, run the “Create an Agent Network Project” command, and get two files: agent-network.yaml and exchange.json. The YAML defines your brokers, the agents and MCP servers they can call, the LLM providers they use, and the policies that apply. That single file becomes the source of truth.
```yaml
# Simplified agent-network.yaml structure
brokers:
  - name: order-fulfillment-broker
    card:
      name: "Order Fulfillment Broker"
      skills: ["verify-customer", "allocate-inventory", "calculate-shipping"]
    spec:
      llm: claude-sonnet-4
      tools:
        - mcp: inventory-mcp-server
          allowed: ["check_stock", "reserve_units"]
      links:
        - agent: salesforce-customer-agent
        - agent: stripe-payment-agent
      policies:
        - pii-detector
        - prompt-guard
        - schema-validation
```
The reason this matters: portability. The same YAML can move between dev, staging, and production. Roles, regions, and subsidiaries can fork the definition without rebuilding flows. Versioning is built in. Audit trails are inherent. When the SOC team asks which agent had access to PII last Tuesday, you have an answer instead of a search party.
What Goes in a Broker Definition
A broker definition has two main sections: card and spec. The card follows the A2A specification and describes the broker’s contract: skills, capabilities, and how other agents can find it. The spec is where the actual logic lives.
llm
Reference one of the LLMs defined in the services section. Different brokers can use different models. Use Claude for complex reasoning, GPT for structured output, smaller models for routing decisions. Cost optimization happens here, not in code.
instructions
Free-text instructions for this broker. Skip the meta-instructions like “split the prompt into tasks” or “select the best tool.” The broker handles that on its own. Use this for actual business rules: “Always check incident severity before paging on-call. Never escalate unless the customer is enterprise tier.”
tools and allowed
Each MCP server you reference can expose dozens of tools. Use the allowed list to expose only what this broker needs. Modern LLMs degrade noticeably past 25 tools in context. Less surface area means better routing decisions and lower latency.
links
This is where you wire up inter-agent collaboration. Every linked agent here is something this broker can delegate to. The graph you draw with these links becomes your runtime topology. Get this wrong and you create cycles, hot spots, or single points of failure.
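Pulling those fields together, a broker spec might look like the sketch below. The `instructions` key name is an assumption on my part, and the server and agent names are invented; the rest mirrors the simplified agent-network.yaml shown earlier.

```yaml
# Hypothetical broker spec combining the fields described above.
# The "instructions" key name is assumed; check the Agent Fabric
# reference for the exact schema.
spec:
  llm: claude-sonnet-4              # one of the LLMs defined in services
  instructions: >
    Always check incident severity before paging on-call.
    Never escalate unless the customer is enterprise tier.
  tools:
    - mcp: incident-mcp-server
      allowed: ["get_incident", "page_oncall"]  # curated subset, not the full server
  links:
    - agent: enterprise-support-agent           # the only delegation target
```

Notice how small the surface area is: two tools and one link. That is deliberate, and it is the habit worth building before the network grows.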
Agent Brokers compile down to standard MuleSoft applications running on CloudHub 2.0. That is a meaningful detail. It means brokers inherit the same logging, metrics, scaling characteristics, and operational tooling as any other Mule workload. You are not running a separate AI infrastructure stack. You are running MuleSoft, with agents on top.
This is the part of Agent Fabric most teams will get wrong on the first try. The intuitive design, when you have a handful of specialist agents, is to put a single super-broker at the top with access to all of them. Every request hits the super-broker, the super-broker picks the right agent, and the agent does its thing. Clean. Simple. Wrong.
The Salesforce architecture team is direct about this in their MuleSoft Agent Fabric Deep Dive: a flat, unrestricted architecture proves detrimental to overall efficiency and reliability as soon as the system grows. The reason is the LLM context window. The broker has to load the description and capability metadata for every agent it might call. Past about 20-25 entries, decision quality drops noticeably. The broker suffers from option paralysis, routing slows down, and consistency breaks.
The Hierarchical Alternative
The recommended pattern is a multi-level hierarchy that mirrors how real organizations work. A top-level broker delegates to domain-level sub-brokers, which delegate to specialist brokers, which finally call leaf agents that do the actual work. Each layer only loads the agents directly beneath it into context, which keeps token usage under control and decision quality high.
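As a sketch of what that hierarchy looks like in the YAML, here is a hypothetical two-level network. All broker and agent names are illustrative; the structural point is that the top broker links only to domain sub-brokers, never directly to leaf agents.

```yaml
# Hypothetical two-level hierarchy. The top broker loads only its
# two domain sub-brokers into context; each sub-broker loads only
# its own leaf agents. Names are illustrative.
brokers:
  - name: enterprise-broker
    spec:
      links:
        - agent: sales-domain-broker
        - agent: service-domain-broker
  - name: sales-domain-broker
    spec:
      links:
        - agent: quote-agent
        - agent: crm-lookup-agent
  - name: service-domain-broker
    spec:
      links:
        - agent: triage-agent
        - agent: knowledge-base-agent
```

Each broker's routing decision now involves two or three candidates instead of dozens, which is exactly what keeps context size and decision quality under control.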
Salesforce architects describe two ways to slice the hierarchy. The first follows Conway’s Law: model the agent network on your real-world org chart. The C-suite, VPs, directors, and managers map to broker layers. Specialists and individual contributors map to leaf agents. Intuitive, easy to explain to business stakeholders, and matches the way humans already think about delegation.
The downside of the org chart approach is that companies reorganize. Every reorg becomes an agent network rebuild. The alternative is Domain-Driven Design: organize agents by business capability rather than by team. New employee onboarding crosses HR, IT, and Security, so the onboarding agent network spans those domains regardless of which VP owns which team this quarter.
Org chart (Conway’s Law)
- Easier to explain to business stakeholders
- Ownership and accountability map naturally
- Permissions inherit from existing org structures
- Brittle to reorganizations: every restructure means rework
- Best for: stable orgs with clear functional boundaries
Domain-Driven Design
- Stable across reorganizations
- Forces clear thinking about cross-functional processes
- Better fit for end-to-end customer journeys
- Harder to staff and assign ownership
- Best for: orgs that reorganize frequently or operate across business units
Guided Determinism: The Honest Acknowledgement
One of the more intellectually honest parts of the TDX 2026 announcement was the introduction of “guided determinism.” It is Salesforce admitting publicly what most experienced architects already suspected: fully autonomous multi-agent orchestration is not enterprise-ready. The architecture works as a hybrid. You define fixed handoff rules, escalation paths, and decision boundaries. The LLM handles reasoning within those guardrails, not around them.
This matters because it changes who needs to be in the room when you design your agent network. The architecture is sound. The institutional readiness is the harder part. Most organizations rushing to adopt Agent Fabric have not actually written down their escalation rules, defined what a good agent handoff looks like in their context, or identified who owns those decisions when they need to change.
Agent Fabric solves real problems, but it is not magic. Here are the issues you should expect to encounter and plan for, in roughly the order they tend to surface.
Tool Overload
- Modern LLMs lose accuracy past about 25 tools per context
- Default broker behavior exposes every tool from every linked MCP server
- By the time you notice, the broker is already making bad routing decisions silently
- Fix: Use allowed lists aggressively. Curate tools per broker, not per server
Circular Invocation Loops
- Easy to create accidentally when agents have overlapping capabilities
- Each cycle burns tokens, latency, and money before anyone notices
- Agent Visualizer can detect circular patterns, but only after you have run them
- Fix: Design your hierarchy with clear delegation rules. Leaf agents must not call back up
The Cost Conversation You Need to Have Early
Token costs in multi-agent systems compound in ways that single-agent systems do not. Every broker invocation consumes tokens for routing decisions, even before the underlying work happens. Multi-step workflows can easily traverse five or six agents, each making its own LLM calls. A flow that costs $0.02 in a single-agent system can cost $0.20 in a multi-agent one. Multiply by production traffic.
Per-agent token tracking is in the Agent Visualizer roadmap and is rolling out incrementally. Until it is fully there, set hard token rate limits at the Flex Gateway layer. The “AI Basic Token Rate Limiting” policy is one of the most important guardrails available, and most teams forget to enable it.
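As a sketch of what a gateway-level limit might look like in the network definition: the policy name below comes from this article, but the key names and values in the configuration block are assumptions, not the actual Flex Gateway schema.

```yaml
# Hypothetical policy attachment for token rate limiting.
# "ai-basic-token-rate-limiting" is named in the article; the
# configuration keys below are illustrative only — consult the
# Flex Gateway policy reference for the real schema.
policies:
  - name: ai-basic-token-rate-limiting
    config:
      tokensPerWindow: 50000   # hard cap per agent per window
      windowSeconds: 60
      onLimit: reject          # fail fast rather than queue
```

The exact numbers matter less than having a ceiling at all: a runaway loop that hits a hard limit is an incident ticket, while one that does not is a surprise invoice.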
The two Flex Gateways requirement (one ingress, one egress) catches teams off guard during rollout planning. You need both, deployed in your private space, before you can run production traffic. This is non-negotiable for governance to work end-to-end. Budget the infrastructure for this in your initial scoping, not as an afterthought.
Observability is Still Maturing
Agent Visualizer is good. It is also clearly an early-version product. Basic logs, traces, and the a2a_total_calls and mcp_total_calls metrics are available today. Detailed request tracing through the full prompt and reasoning chain, agent health monitoring, agent session playback, and DAG visualization are all on the roadmap. If your security team requires immutable cognitive traces for audit before you go live, plan around the gap.
The mistake teams make with Agent Fabric is treating it as a big-bang initiative. “We will roll out Agent Fabric across the entire org in Q3.” That framing fails because it requires aligning too many stakeholders before you have anything concrete to show. The teams getting this right are starting smaller and earning permission to expand.
Here is the sequence that works in practice.
The most valuable artifact you can produce in your first 90 days with Agent Fabric is not a working agent. It is a reference YAML that your team can fork and adapt. Spend time getting that right, document the design choices, and treat it as the equivalent of a project skeleton. The next five agent networks will be 10x faster to build because of it.
For architects evaluating Agent Fabric in your org right now, the question is not “should we adopt this?” The question is: “Do we know what agents are already running, and do we have a control plane for them, or are we trusting that nothing has gone wrong yet because we cannot see when it does?”
If the answer is the second one, you know where to start.
