Stop Prompting, Start Scripting: Why Agent Script Changes Everything for Agentforce

Agentforce · Agent Script · Enterprise Architecture


Salesforce just quietly admitted that prompt engineering alone does not scale for enterprise AI agents. Agent Script, the new TypeScript-based language inside Agentforce Builder, is a hybrid reasoning framework that replaces “hope the LLM interprets this correctly” with executable guardrails. Here is what architects need to know.

Reading time: ~11 minutes | Published: April 2026 | By Sandip Patel, Salesforce Architect
RELEASE — Spring ’26: Builder and Agent Script are now GA
LANGUAGE — TypeScript: built on TypeScript, not a proprietary DSL
RELIABILITY — Hybrid: deterministic logic plus LLM reasoning
RENAME — Subagents: topics renamed to subagents in April 2026
TL;DR

Agent Script is Salesforce’s answer to the reliability problem that plagued first-generation Agentforce builds. Instead of writing natural language topic instructions and praying the LLM interprets them the same way every time, you write declarative logic that compiles to an Agent Graph consumed by the Atlas Reasoning Engine. The architects who understand when to script and when to let the LLM think freely will build agents that actually survive production. The ones still writing paragraph-length prompts will not.

1. Why Prompt Engineering Hit a Wall
The reliability problem nobody wanted to name

Here is the conversation I have had with three different VPs of Engineering in the last two months. They built an Agentforce agent. The demo looked great. They moved it to UAT. It passed. They moved it to production. And then, for no reason anyone could explain, the agent started behaving differently on Tuesday than it did on Monday. Same user, same question, different answer.

They escalated to Salesforce. The response was polite and technically correct: “LLMs are non-deterministic by design. Your topic instructions were interpreted slightly differently this time. You can try adding more examples to the prompt. You can try constraining the temperature. You can try rewriting the instructions to be more explicit.”

That is not an engineering answer. That is a coping strategy.

The Core Problem

Every first-generation Agentforce build was a bet that a large language model would reliably interpret paragraphs of English instructions the same way, every time, across thousands of edge cases. For low-stakes use cases, it mostly worked. For anything touching money, compliance, or a customer’s actual record, “mostly” is not good enough.

Salesforce saw this too. They had thousands of Agentforce customers running pilots, and the ones that stalled all hit the same wall. Not because the model was bad. Because pure prompt-driven reasoning has no guardrails. An LLM told “if the order is over one hundred dollars, offer free shipping” will get it right most of the time. It will also, on occasion, decide that a ninety-nine dollar order is close enough. Or that shipping is already free. Or that the rule does not apply to wholesale customers even though you never mentioned wholesale customers.

In February 2026, Salesforce shipped their fix. Agent Script. And three days ago, they published a refreshed Agentforce Guide that flat out tells builders to move away from pure prompt-driven reasoning. That is a big deal. That is Salesforce, who spent an entire year marketing “just tell the agent what you want in plain English,” publicly pivoting.

2. What Agent Script Actually Is
Hybrid reasoning, not just a new syntax for old problems

Agent Script is a TypeScript-based scripting framework that sits inside Agentforce Builder. At compile time, it converts into an Agent Graph, which is the structured specification the Atlas Reasoning Engine actually consumes at runtime. You are not writing a prompt. You are defining a graph that tells the agent what it knows, what tools it can call, when it should follow rules, and when it should hand control back to the LLM for natural language reasoning.

The key phrase Salesforce is now using is “hybrid reasoning.” It is a genuinely useful framing. Pure prompt-driven agents are one extreme: everything is interpreted by the LLM. Pure rule engines are the other extreme: nothing is interpreted, every path is explicit. Agent Script sits in between, and lets you decide per step which mode you want.

🔐
Deterministic Logic

If/else conditions, variable comparisons, transitions, and loops. These run exactly as written, every time. No LLM interpretation, no drift.

🧠
LLM Reasoning Blocks

Hand off to the model for fuzzy tasks: intent classification, summarization, natural language responses. Use it where flexibility actually matters.

🔌
Tool Invocation

Call Apex methods, Flows, REST APIs, or subagents directly from the script. Tool selection can be deterministic or model-chosen depending on context.

📚
Subagents

What used to be called “topics” are now subagents as of April 2026. Each handles one domain, and the parent agent routes between them declaratively.

📝
Variables and State

Pass data between steps with typed variables. State survives across turns, so you can build multi-step workflows without losing context.

🔍
Testability

Scripts compile, which means they can be validated, version-controlled, and replayed. The Testing Center can run regression suites against them deterministically.

Here is what a simple Agent Script fragment looks like. If you have written any pseudocode or basic Flow logic, this will feel familiar:

# Shipping policy subagent
subagent shipping_policy:
  instructions: |
    Handle customer shipping questions for standard orders.
    Never promise delivery dates without calling the carrier lookup action.

  reasoning:
    if order.total >= 100:
      set shipping_fee = 0
      respond: "Free shipping applied to your order."
    else:
      set shipping_fee = 9.99
      -> let_llm_explain_fee

  actions:
    - get_carrier_eta(order.id)
    - update_order_shipping(shipping_fee)

Notice what is happening. The hundred-dollar threshold is not an instruction to the LLM. It is a condition the runtime evaluates. The agent cannot decide that ninety-nine dollars is close enough, because nothing is asking the agent’s opinion. The model only gets involved when the script explicitly hands over, like in the let_llm_explain_fee handoff above.

“Agent Script did not replace prompts. It put prompts in their place.”
3. Agentforce Builder: Three Views, One Source of Truth
Canvas, Script, and Chat modes, and when to use each

The new Agentforce Builder is the other half of this story. It is the workspace where Agent Script gets written, and Salesforce was smart about not forcing every admin to become a TypeScript developer overnight. The Builder gives you three views of the same underlying script, and you can switch between them at any time without losing state.

Chat View
Natural Language

Describe what you want in plain English. “If the customer asks about returns and the order is older than thirty days, escalate to a human.” The Builder generates Agent Script behind the scenes. Great for admins and first drafts. Not great when you need precision.

Canvas View
Low-Code Blocks

A visual block-based editor that reads the same underlying script. You can drag, drop, and expand individual steps. This is where most Salesforce admins will live. Think of it as Flow Builder for agents, except what you see is a rendering of real compiled logic, not hidden metadata.

Script View
Pro-Code Editor

The raw Agent Script editor with syntax highlighting, autocomplete, and real-time validation. Use this when you need precision, when you are version-controlling your agents in Git, or when the Canvas view cannot express the logic you need. This is where architects will spend their time.

Preview Panel
Simulate + Live Test

Two modes for testing. Simulate runs the script without executing actions or hitting real data, so you can validate logic in a sandbox. Live Test executes real actions against sandbox data. The Preview panel shows every step, every variable, and every LLM call, making debugging genuinely practical for the first time.

The critical architectural point: all three views read from and write to the same Agent Script source. An admin using Canvas view and a developer in Script view are working on the same object. No more “the agent looks fine in the builder but does something different in production” because the Builder is not hiding settings somewhere. What you see is what compiles.

Architect’s Tip

Treat Agent Script like real source code. Put it in version control. Review changes in pull requests. Build a regression suite in the Testing Center that runs on every deploy. The orgs that do this will be operating AI agents with the same rigor as microservices. The ones that keep editing in Canvas view with no change management will be back to non-deterministic production in a month.
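Because the deterministic spine compiles to plain logic, it can be pinned down with ordinary assertions. A minimal TypeScript sketch of what a regression case looks like, using the shipping rule from the fragment earlier (the function and case list are illustrative, not a real Testing Center API):

```typescript
// Illustrative: the scripted shipping rule expressed as a pure function,
// so the deterministic behavior can be asserted on every deploy.
function shippingFee(orderTotal: number): number {
  // Deterministic threshold: no LLM interpretation, no drift.
  return orderTotal >= 100 ? 0 : 9.99;
}

// Pin the boundary cases a prompt-interpreted rule would occasionally
// get wrong ("ninety-nine dollars is close enough").
const regressionCases: Array<[number, number]> = [
  [100, 0],      // exactly at the threshold: free shipping
  [99.99, 9.99], // one cent under: not "close enough"
  [250, 0],
  [0, 9.99],
];

const allPass = regressionCases.every(([total, fee]) => shippingFee(total) === fee);
```

If a future edit ever makes `allPass` false, you know before the deploy, not on a Tuesday in production.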

4. The Architect’s Playbook: When to Script, When to Prompt

The temptation, when you get a new tool, is to use it everywhere. Do not do that here. Scripting every decision in an agent defeats the purpose of having an LLM at all. If you wanted rigid rules for everything, you already had Flow. The whole point of hybrid reasoning is deciding, step by step, where determinism matters and where flexibility matters.

Here is the rule I use when reviewing agent designs. Start by asking: if the agent gets this step wrong, what happens?

📋
Script It
When determinism matters more than flexibility
  • Business rules tied to money, discounts, pricing tiers, or contract terms
  • Compliance-sensitive logic where the same input must always produce the same output
  • Tool selection that should not vary based on interpretation
  • State transitions, especially anything that writes to a record
  • Authentication, permission checks, and identity verification steps
  • Rule of thumb: if you would not want a junior admin guessing, do not let the LLM guess either
🧠
Prompt It
When flexibility is the actual value
  • Understanding what the user meant, especially when they phrase things oddly
  • Generating natural-sounding responses and follow-up questions
  • Summarizing long records or knowledge articles into conversational answers
  • Classifying intent when the categories are fuzzy or overlapping
  • Anything where a human-in-the-loop would have used judgment, not a checklist
  • Rule of thumb: if a good answer varies based on tone or context, let the model handle it

A Practical Split for a Typical Service Agent

Take a service agent that handles return requests. A reasonable split looks like this. Use the LLM to understand that the customer is asking about a return and to extract the order number from free text. Use a scripted condition to check if the return window has expired. Use a scripted action to pull the order record. Use the LLM again to write a friendly response explaining the decision. Use a scripted action to create the return case in Salesforce with the right owner and priority.

In that flow, the LLM is doing what it is good at: understanding ambiguous language and generating natural responses. The script is doing what it is good at: enforcing policy and making sure data gets written correctly. Neither one is asked to do the other’s job.
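Sketched in the same style as the shipping fragment earlier, the split looks like this. The syntax is illustrative, and the action names (`get_order`, `create_return_case`, `days_since`) are hypothetical, not documented Agent Script built-ins:

```
# Returns subagent: hybrid split for the flow described above
subagent returns:
  instructions: |
    Handle return requests. Extract the order number from free text.
    Never approve or deny a return yourself; the script decides.

  reasoning:
    # LLM: understand the request, pull the order number from free text
    extract order_number from user_message

    # Script: deterministic record read and policy check
    set order = get_order(order_number)
    if days_since(order.delivered_at) > 30:
      set decision = "expired"
    else:
      set decision = "approved"
      create_return_case(order.id)   # scripted write: right owner, right priority

    # LLM: explain the decision in a friendly, natural response
    -> let_llm_explain_decision

  actions:
    - get_order(order_number)
    - create_return_case(order.id)
```

Notice the shape: the model brackets the flow (understanding at the top, explanation at the bottom) while everything that touches policy or data sits in the deterministic middle.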

Design → Build → Operate
1
Map every decision to a determinism level
Before writing any script, list every decision the agent will make. For each one, ask: does the answer vary based on judgment, or is it always the same given the same inputs? Mark each as “deterministic,” “fuzzy,” or “policy-bound.” You will find that most real agents are about 60% deterministic and 40% fuzzy.
Do this before opening Builder
2
Split by subagent, not by feature
Each subagent should handle one coherent domain (shipping, returns, identity verification). Small, focused subagents are easier to test and easier to reuse across agents. A single giant subagent is where reliability goes to die.
The “topic bloat” trap
3
Write the deterministic spine first
Script the policy checks, state transitions, and tool calls before you write any natural language instructions. The LLM prompts are the last thing you add, because they sit in the gaps between deterministic steps. If you start with prompts, you will end up retrofitting the structure.
Highest-leverage habit
4
Build tests before you build confidence
Use the Testing Center to create a regression suite as you build each subagent. Include the happy path, the edge cases you thought of, and at least three adversarial inputs. Run the suite before every deploy. Scripted agents are testable in a way prompt-only agents never were. Take advantage.
This is the real unlock
5
Instrument conversations and review them weekly
Use Enhanced Event Logs to capture every production conversation. Review a random sample every week, looking for places where the LLM’s interpretation was surprising. Those are candidates for converting back to scripted logic. Production is the best teacher you have.
Close the loop
6
Version control from day one
Agent Script is code. Treat it like code. Export scripts to Git, review changes in pull requests, and deploy through your normal change management pipeline. Orgs that skip this will rediscover all the pain that made DevOps a discipline in the first place.
Non-negotiable for enterprise
  • Mostly deterministic agent (shipping, pricing, compliance): ~80% scripted, 20% LLM
  • Service agent (returns, case routing, FAQs): ~60% scripted, 40% LLM
  • Exploratory agent (research, writing, ideation): ~25% scripted, 75% LLM
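The step-1 inventory is simple enough to keep as a literal artifact. A TypeScript sketch of what one looks like for the returns agent above (the entries are examples, not a prescribed schema):

```typescript
// Illustrative decision inventory for a returns agent.
// "fuzzy" steps go to the LLM; everything else belongs in the scripted spine.
type Mode = "deterministic" | "fuzzy" | "policy-bound";

const decisions: Array<{ step: string; mode: Mode }> = [
  { step: "Extract order number from free text", mode: "fuzzy" },
  { step: "Check the 30-day return window", mode: "deterministic" },
  { step: "Pull the order record", mode: "deterministic" },
  { step: "Word the customer-facing explanation", mode: "fuzzy" },
  { step: "Create the return case with owner and priority", mode: "policy-bound" },
];

// Tally the split: this inventory comes out 60% scripted, 40% LLM,
// which matches the typical service-agent ratio above.
const scripted = decisions.filter(d => d.mode !== "fuzzy").length;
const ratio = scripted / decisions.length; // 0.6
```

Keeping the inventory in code (and in Git) means the determinism decisions get reviewed in pull requests like everything else.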
5. Gotchas Salesforce Did Not Advertise
Honest limitations and migration pain

Agent Script is a real step forward, but it is not magic. Here is the stuff that is not in the keynote demos.

Migrating Existing Agents Is Not One-Click

If you built an agent in the classic Agent Builder with topic instructions written as paragraphs of English, you cannot just press a “convert to Agent Script” button and walk away. The conversion tool exists, but what it generates is a script that preserves your original structure, not one that takes advantage of deterministic logic. You still have to do the architectural work of identifying which decisions should become scripted conditions. In practice, migration is a rebuild, not a refactor.

The Topic-to-Subagent Rename Is Everywhere

Starting in April 2026, Salesforce renamed “topics” to “subagents” across the platform. The documentation, the Builder UI, the error messages, and most of the community content are now a patchwork of old and new terminology. If you are learning this cold, expect confusion. The functionality is identical. Only the name changed. But if you are reading a tutorial from December 2025, mentally substitute “subagent” every time you see “topic.”

Whitespace Sensitivity
Python developers will feel at home. Everyone else will not.
  • Agent Script uses indentation to define structure, like Python or YAML
  • Mixing tabs and spaces in a single script causes parsing errors that are hard to spot
  • Copy-pasting from Slack or email can introduce invisible whitespace corruption
  • Fix: use the Builder’s script view with proper syntax validation, and pick one indentation style per org
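Mixed indentation is also easy to catch mechanically before a script ever reaches the Builder. A small TypeScript sketch of a pre-commit check (the function name and approach are my own, not a Salesforce tool):

```typescript
// Flag lines whose leading whitespace mixes tabs and spaces --
// the invisible-corruption failure mode described above.
function mixedIndentLines(script: string): number[] {
  const bad: number[] = [];
  script.split("\n").forEach((line, i) => {
    const indent = line.match(/^[ \t]*/)![0]; // leading whitespace only
    if (indent.includes("\t") && indent.includes(" ")) {
      bad.push(i + 1); // 1-based line numbers for readable error output
    }
  });
  return bad;
}
```

Run it over exported scripts in CI or a pre-commit hook; an empty result means the file is safe on the indentation front, whatever style the org picked.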
💡
The Determinism Tax
Scripting everything is its own failure mode
  • Teams see Agent Script and start scripting every decision “for reliability”
  • The result is a brittle rule engine that cannot handle the unusual inputs LLMs were good at
  • You end up with all the maintenance cost of custom code and none of the flexibility of AI
  • Fix: be explicit about which steps need determinism and leave the rest alone

Debugging Hybrid Logic Is Harder Than Either Extreme

When a pure prompt-based agent failed, you knew where to look: the prompt. When a pure rule engine fails, you know where to look: the rules. When a hybrid agent fails, the bug could be in your scripted logic, in the model’s interpretation of a handoff prompt, in the data the agent read before deciding, or in the interaction between any two of those. The Preview panel helps a lot, but expect your first few production incidents to take longer to root-cause than you planned for.

Real Talk

The Agentforce Testing Center is essential, not optional. With pure prompt agents, you could get away with spot-testing because the system was fuzzy by design. With hybrid agents, you have deterministic steps that must behave deterministically. If they drift, you need to know immediately. Regression testing was a “nice to have” before. It is now the difference between a reliable agent and an expensive mess.

The other thing no one mentions: Agent Script is evolving fast. Between the GA announcement in Spring ’26 and the refreshed guidance published just this month, Salesforce has already changed recommended patterns twice. If you are building something you need to run for years, expect to revisit your scripts every release. This is normal for a platform at this stage of maturity, but it is different from the stability most Salesforce architects are used to.

6. What This Means for the Next 12 Months
Context engineering is the new prompt engineering

The phrase Salesforce is pushing in the new guidance is “context engineering,” and I think it is the right framing. Prompt engineering treated the LLM as the whole system. Context engineering treats the LLM as one component inside a larger, partly deterministic system. Your job as an architect is to decide what context the model sees, when it gets to decide versus when the script decides, and what guardrails surround every step.

That is a different skill than writing clever prompts. It is closer to distributed systems design. Which pieces need strong consistency? Which pieces can tolerate eventual consistency? Where are the boundaries between trusted and untrusted reasoning? The Salesforce architects who will stand out in 2026 are the ones who can answer those questions for an agent the same way they already answer them for microservices.

Phase 1
Understand Intent
LLM extracts what the user actually wants
Phase 2
Apply Policy
Scripted rules enforce business logic
Phase 3
Call Tools
Deterministic actions touch real data
Phase 4
Generate Response
LLM writes the reply in natural language

Salesforce’s own acquisition strategy points the same direction. The Informatica deal was about data quality. The Momentum acquisition was about unstructured context. The direction is clear: agents succeed or fail based on the quality of the context they reason over and the rigor of the logic that surrounds their reasoning. Agent Script is one piece of that. Data 360 is another. The architects who treat these as a single system will build something that works. The ones who treat them as separate projects will keep writing longer prompts and wondering why nothing gets better.

If you are an architect evaluating Agentforce right now, the question has shifted. It is no longer “can we build an agent?” The answer to that is yes, and it has been yes for a year. The question is: “Can we build an agent that behaves the same way on Tuesday as it did on Monday, that survives an audit, and that we would be willing to let touch a customer’s actual record without a human in the loop?”

With Agent Script, for the first time, the answer can be yes. But only if you use it the way it was designed to be used.

7. Frequently Asked Questions
Agent Script, Builder, and migration
Do I need to know TypeScript to use Agent Script?
No, but it helps. Agent Script is built on TypeScript, and the Script view will feel natural to anyone with JavaScript or TypeScript experience. That said, most admins will work primarily in Canvas view, which is a low-code block editor that generates the script behind the scenes. The Chat view lets you describe what you want in plain English. You only need to touch raw script when you need precision the Canvas view cannot express, which is where architects tend to spend their time.
Do I have to migrate existing Agentforce agents to Agent Script?
Not immediately. Existing topic-based agents continue to work, and Salesforce has not announced a hard deprecation date. But the refreshed Agentforce Guide makes the direction clear: Agent Script is where the investment is going, and new features are landing there first. For any agent you expect to run in production for more than six months, migrating is worth scoping now rather than later. Treat it as a rebuild, not a find-and-replace, because the real value is rethinking which decisions should be deterministic.
What is the difference between a topic and a subagent?
Nothing functionally. As of April 2026, Salesforce renamed “topics” to “subagents” across the platform. Old documentation still uses “topic,” and the Builder UI is being updated. Mentally substitute subagent whenever you see topic in a tutorial written before this change. The rename is part of a broader positioning shift: Salesforce wants builders to think of agents as composable, with each subagent handling one coherent domain.
Can I version control Agent Script like regular code?
Yes, and you should. Agent Script is stored as metadata and can be retrieved through the standard Salesforce metadata API. Put it in Git, review changes in pull requests, and deploy through your normal release pipeline. The orgs treating agents with the same discipline as microservices are the ones that will scale this. The ones editing directly in Canvas view with no change management will have the same reliability problems they had with unmanaged Flows, just with higher stakes.
How does Agent Script interact with Data 360?
Agent Script handles the logic and orchestration. Data 360 provides the context that the script reasons over. A typical production agent will use Data 360 to pull a unified customer profile, then use Agent Script to apply policy to that data and decide what to do next. The two are complementary. Trying to run Agentforce without a solid data foundation will fail no matter how good your scripts are, because the scripts will be reasoning against incomplete or inconsistent data.
Is there a cost difference between scripted logic and LLM calls?
Yes, and this is underappreciated. Every time the agent hands control to the LLM, you consume Einstein Requests or their equivalent under whatever Agentforce billing model your org is on. Scripted conditions, variable updates, and tool calls do not consume LLM capacity. A well-designed hybrid agent can be significantly cheaper to run than a pure prompt-based equivalent, because you are only paying for the model when you actually need its reasoning. For high-volume use cases, modeling this cost difference is worth doing before you go live.
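A back-of-envelope model makes the point concrete. Every number here is a hypothetical placeholder, not Salesforce pricing, but the shape of the calculation holds under any per-request billing model:

```typescript
// Hypothetical cost model: only LLM handoffs consume metered requests;
// scripted conditions and tool calls are free at the metering layer.
interface AgentDesign {
  llmStepsPerConversation: number;      // handoffs to the model
  scriptedStepsPerConversation: number; // evaluated by the runtime, not metered
}

function monthlyLlmRequests(design: AgentDesign, conversations: number): number {
  return design.llmStepsPerConversation * conversations;
}

// A pure-prompt agent reasons through every step with the model;
// the hybrid version only calls it for intent and response generation.
const purePrompt: AgentDesign = { llmStepsPerConversation: 6, scriptedStepsPerConversation: 0 };
const hybrid: AgentDesign = { llmStepsPerConversation: 2, scriptedStepsPerConversation: 4 };

const savings =
  1 - monthlyLlmRequests(hybrid, 50_000) / monthlyLlmRequests(purePrompt, 50_000);
// roughly two thirds of the metered capacity disappears in this scenario
```

The exact ratio depends entirely on your own step counts and volumes; the point is that the hybrid split turns LLM spend into a design variable you control.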
