
Four Durable Agentic Patterns

An overview of four durable agentic patterns, from deterministic workflows to autonomous multi-agent orchestration, and the trade-offs each architecture introduces.

Bilgin Ibryam · March 6, 2026 · 10 min read

Many so-called agentic applications still begin as deterministic workflows. That is often the right starting point, but it does not take long before the workflow becomes rigid, brittle, and difficult to evolve. The image below summarizes four durable agentic patterns that gradually increase autonomy, from deterministic workflows to agent orchestration. In this post, we go through each pattern and the trade-offs it brings. If you want the walkthrough first, watch this recording.

Overview of four durable agentic patterns, from deterministic workflows to autonomous multi-agent orchestration

What is a durable AI agent?

A durable agent persists every step of its execution loop (each LLM call, tool invocation, and decision) to an immutable state store. If the process crashes, it resumes exactly where it left off, without re-executing completed steps. That gives you:

  • Reliability. The agent survives failures. A crash between step 3 and step 4 does not mean starting over from step 1.
  • Auditability. Every step is recorded with its inputs and outputs, so you can inspect exactly what the agent did and what the LLM returned.
  • Troubleshooting. The full execution can be traced end-to-end, making it possible to pinpoint exactly where things went wrong and why.
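The mechanics behind this can be illustrated with a minimal, framework-free sketch (the `EventLog` and `run_step` names are illustrative, not Dapr APIs): each completed step is appended to a log, and after a restart, completed steps are replayed from the log instead of being executed again.

```python
class EventLog:
    """Illustrative append-only store; Dapr persists this in a real state store."""
    def __init__(self):
        self.events = []

    def append(self, step, result):
        self.events.append({"step": step, "result": result})


def run_step(log, step, fn):
    # Replay: if this step already completed, return the recorded result
    for event in log.events:
        if event["step"] == step:
            return event["result"]
    # Otherwise execute the step and persist its result before moving on
    result = fn()
    log.append(step, result)
    return result


calls = {"classify": 0}

def classify():
    calls["classify"] += 1
    return {"priority": "high"}

log = EventLog()
first = run_step(log, "classify", classify)
# Simulating a restart: the same step is replayed from the log, not re-executed
second = run_step(log, "classify", classify)
```

After the second call, `classify` has still only run once; the result came from the log. This replay-from-history idea is what lets a durable workflow resume mid-execution.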

All four patterns are built with Dapr Agents, and you can run each one from the companion repo.

The use case

All four patterns use the same customer support scenario: a system that checks customer entitlement, assesses urgency, and generates a resolution. The use case stays the same, but the architecture changes, from a hardcoded two-step deterministic workflow to a fully autonomous multi-agent system.

Deterministic AI workflow

The simplest pattern is a Dapr Workflow that makes direct LLM calls. This is how many teams start building AI applications, not with a fully autonomous agent, but by implementing the loop themselves in a workflow. It is not a truly autonomous agent. It is a hand-built AI application that behaves like one.

A durable workflow that makes direct LLM calls
A durable workflow that makes direct LLM calls and uses coded branching to classify a ticket and generate a resolution.

In our example, the workflow performs two LLM calls and implements the loop manually. The first call classifies a support ticket by priority and category. Then a plain Python if statement decides what happens next. If the priority is "high", the workflow makes a second LLM call to generate a detailed resolution. Otherwise, it returns a standard acknowledgement.

@wfr.workflow(name="support_workflow")
def support_workflow(ctx: DaprWorkflowContext, input_data: dict):
    # Step 1: LLM classifies the ticket
    classification = yield ctx.call_activity(
        classify_ticket,
        input=input_data,
    )
    priority = classification.get("priority")

    # Step 2: Coded gate, deterministic branching
    if priority != "high":
        return {
            "status": "acknowledged",
            "priority": priority,
        }

    # Step 3: LLM generates resolution
    resolution = yield ctx.call_activity(
        generate_resolution,
        input={"ticket": input_data, "classification": classification},
    )
    return {"status": "resolved", "resolution": resolution}

What Dapr Workflow gives you here is durable execution. If the application crashes after classification but before resolution, it resumes from the last completed step without re-running previous work.

Diagrid Catalyst visualizing a workflow execution with inputs, outputs, timing, and branching between workflow steps.

This pattern works well when you know the exact steps in advance. The control flow is fully deterministic, and for a fixed two-step pipeline that means predictable behavior, exactly two LLM calls, and minimal cost. But it is also brittle. The workflow hardcodes the sequence: classify, gate, resolve. Any change, such as adding a step, handling an edge case, or adjusting the threshold, requires code changes and often new workflows and activities. The gate is binary: if priority != "high" leaves no room for nuance. A "normal" ticket mentioning data corruption may still deserve escalation, but the workflow cannot reason about that. It also cannot go back to the user for more information, and each activity has its own isolated prompt unless you explicitly pass context through.

So let's give this system more autonomy and turn it into an agent that can handle more varied inputs and respond more dynamically.

Autonomous AI agent

Now we convert the workflow into an agent. We give it access to an LLM and a set of tools, but this time the agent controls the interaction loop. Instead of us defining the steps, the LLM decides which tools to call and in what order.

A durable agent that uses an LLM and tools to decide how to handle a support request.

The agent has two tools: check_entitlement, which returns whether a customer has active support, and get_customer_environment, which returns infrastructure details such as Kubernetes version and region. You give the agent a role, a goal, and instructions. The LLM figures out the rest.
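As a rough sketch, the two tools might look like the plain Python functions below. The lookup data and return shapes are illustrative assumptions, not the companion repo's actual code; in practice these would call a CRM, billing, or inventory service, and the tool registration mechanism depends on the Dapr Agents version.

```python
def check_entitlement(customer_id: str) -> dict:
    """Return whether the customer has an active support contract."""
    # Illustrative in-memory data; a real tool would query a billing or CRM system
    entitled_customers = {"acme-corp", "globex"}
    return {"entitled": customer_id in entitled_customers}


def get_customer_environment(customer_id: str) -> dict:
    """Return infrastructure details for the customer's environment."""
    # Illustrative static data; a real tool would query an inventory service
    environments = {
        "acme-corp": {"kubernetes_version": "1.31", "region": "eu-west-1"},
    }
    return environments.get(
        customer_id, {"kubernetes_version": "unknown", "region": "unknown"}
    )
```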

agent = DurableAgent(
   name="support-agent",
   role="Customer Support Agent",
   goal="Handle customer support tickets by checking entitlement and providing resolutions.",
   instructions=[
       "Check entitlement first. If not entitled, reject the request.",
       "If entitled, get environment details and provide a resolution.",
   ],
   tools=[check_entitlement, get_customer_environment],
   llm=DaprChatClient(component_name="agent-llm-provider"),
)

The LLM reads the customer's issue, calls the appropriate tools, interprets the results, and generates a resolution. It also maintains a conversation thread, so every step has access to previous context without manual state passing. Under the hood, Dapr Agents wraps the reasoning loop in a durable workflow, so if the agent crashes mid-execution, it resumes from the right point.
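Conceptually, the loop the framework persists looks something like this simplified, framework-free sketch. The `agent_loop` function and the decision format are illustrative assumptions; the real Dapr Agents loop additionally handles state snapshots, retries, and structured tool schemas.

```python
def agent_loop(llm, tools, user_message, max_turns=5):
    """Minimal tool-calling loop: the LLM picks the next tool or a final answer."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        # The LLM returns either {"tool": name, "args": {...}} or {"final": text}
        decision = llm(messages)
        if "final" in decision:
            return decision["final"]
        result = tools[decision["tool"]](**decision["args"])
        # Tool results join the conversation thread, so later turns see full context
        messages.append({"role": "tool", "name": decision["tool"], "content": result})
    raise RuntimeError("agent exceeded max_turns without a final answer")


# Scripted stand-in for an LLM, for illustration only
script = iter([
    {"tool": "check_entitlement", "args": {"customer_id": "acme-corp"}},
    {"final": "Customer is entitled; proposing a resolution."},
])
answer = agent_loop(
    llm=lambda messages: next(script),
    tools={"check_entitlement": lambda customer_id: {"entitled": True}},
    user_message="My cluster is down",
)
```

The key inversion compared to the deterministic workflow: the loop body is fixed, but which tool runs on each turn, and how many turns there are, is decided by the model at runtime.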

The agent can call tools in different orders, skip a tool entirely, or take an extra step depending on the input. For example, if the customer already includes environment details, the agent may skip the lookup. The trade-off is determinism. The workflow guarantees exactly two LLM calls in a fixed order. The agent may take more turns and reason differently on the same input.

If you look at this agent closely, it is trying to do two things at once. It reasons about entitlement and urgency, and it also acts as a technical expert. As you add more tools and responsibilities, the agent becomes harder to steer and less effective. A natural next step is to split it into specialized agents. That is what we do next.

Deterministic multi-agent orchestration

In this pattern, we split the single agent into two specialized agents, while keeping the coordination between them deterministic.

A parent workflow coordinates specialized agents in a fixed sequence, with triage first and expert resolution second.

The triage agent knows only about entitlement and urgency. It has one tool, check_entitlement, and focused instructions. The expert agent knows only about environments and resolutions. It has one tool, get_customer_environment, and instructions to diagnose and resolve issues.

Every Dapr durable agent runs as a workflow. That means a parent workflow can invoke each specialized agent as a child workflow. The parent controls the sequence, and each agent handles reasoning within its own domain.

@workflow_runtime.workflow(name="support_multi_agent_workflow")
def support_multi_agent_workflow(ctx: DaprWorkflowContext, input_data: dict):
    name = input_data["customer_name"]
    issue = input_data["issue"]

    # Call triage agent as a child workflow
    triage_result = yield ctx.call_child_workflow(
        workflow="agent_workflow",
        input={"task": f"Customer: {name}. Issue: {issue}. Assess entitlement and urgency."},
        app_id="triage-agent",
    )

    if not triage_result.get("entitled"):
        return {"status": "rejected"}

    # Call expert agent as a child workflow
    expert_result = yield ctx.call_child_workflow(
        workflow="agent_workflow",
        input={"task": f"Customer: {name}. Issue: {issue}. Propose a resolution."},
        app_id="expert-agent",
    )
    return {"status": "completed", "result": expert_result}

This gives you deterministic coordination: triage always runs before resolution, and the expert agent is called only for entitled customers. At the same time, each agent still reasons autonomously within its own scope. Each one is a separate Dapr app with its own state, logs, and tools. In practice, each specialized agent is an independent microservice that can be reached over REST API, pub/sub, or as a child workflow. Different teams can own, test, and deploy these agents independently.

The parent workflow persists state after each child workflow call. If the system crashes between agents, the parent resumes and invokes the next agent without re-running the previous one.

This is a solid pattern, but the flow is still fixed. If you want to add a billing agent or a sentiment analysis agent, you still have to update the workflow code. So let's look at what happens when the orchestration itself becomes LLM-driven.

Autonomous multi-agent orchestration

These four patterns reflect a natural progression for AI applications. You start with a few LLM calls in a deterministic workflow, then move to a single autonomous agent, and then to multiple agents. In the final pattern, we take that progression one step further and replace the workflow coordinator with an orchestrator agent that uses an LLM to decide which agents to delegate to and in what order.

An orchestrator agent dynamically discovers specialist agents, delegates tasks, and combines their results into a final response.

The orchestrator is a DurableAgent with OrchestrationMode.AGENT. It does not have tools, and it does not try to solve the problem itself. Instead, it discovers specialist agents from a shared registry, creates a plan, delegates tasks, and returns the final response.

orchestrator = DurableAgent(
   name="support-agent-orchestrator",
   goal="Coordinate triage and expert agents to handle customer issues.",
   instructions=[
       "Delegate to triage-agent to check entitlement and assess urgency.",
       "If entitled, delegate to expert-agent to diagnose and resolve the issue.",
       "Synthesize results into a customer-friendly response.",
   ],
   execution=AgentExecutionConfig(
       orchestration_mode=OrchestrationMode.AGENT,
   ),
   registry=AgentRegistryConfig(store=StateStoreService(store_name="agent-registry")),
)

The triage and expert agents register themselves on startup, so the orchestrator can discover them at runtime. When a new task arrives, the orchestrator creates a plan, delegates work to the right agents, validates progress after each response, and synthesizes the final result.
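The discovery mechanism can be sketched without the framework. The `AgentRegistry` class below is an illustrative stand-in: Dapr Agents persists registrations in a shared state store component rather than in memory, but the shape of the interaction is the same.

```python
class AgentRegistry:
    """Illustrative in-memory registry; Dapr Agents backs this with a state store."""
    def __init__(self):
        self._agents = {}

    def register(self, name, description):
        # Each specialist announces itself on startup
        self._agents[name] = description

    def discover(self):
        # The orchestrator reads the current set of specialists at runtime
        return dict(self._agents)


registry = AgentRegistry()
registry.register("triage-agent", "Checks entitlement and assesses urgency")
registry.register("expert-agent", "Diagnoses environments and proposes resolutions")

available = registry.discover()

# Adding a new specialist requires no orchestrator code change:
registry.register("billing-agent", "Answers billing and invoicing questions")
```

Because the orchestrator plans against whatever `discover()` returns, the new billing agent becomes usable as soon as it registers.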

The key difference from the deterministic multi-agent pattern is dynamic delegation. Add a new specialist agent, and the orchestrator can discover and use it at runtime without code changes. If a triage result is ambiguous, the orchestrator can ask for clarification before continuing.

This pattern pays off when the logic is not deterministic and the flow may change depending on the input. The trade-off is predictability. It is harder to know exactly how the orchestrator will handle a given request, and the extra LLM calls add latency. But because every delegation is persisted as a workflow step, you can still trace the full execution path from the orchestrator to each sub-agent. That makes even complex interactions inspectable and debuggable.

Durable execution all the way down

These four patterns all build on the same foundation: durable execution with Dapr Workflows.

In the deterministic workflow, you define the activities and control flow explicitly. In the autonomous agent, Dapr Agents wraps the reasoning loop in a workflow automatically. In the deterministic multi-agent pattern, you have a deterministic business process with autonomy at the edges. The workflow decides when and which agent runs, while each agent reasons independently. In the autonomous orchestration pattern, you have a fully autonomous business process. You provide intent, and the orchestrator figures out how to satisfy it.

| Pattern | Control | Flexibility | Cost |
| --- | --- | --- | --- |
| Deterministic AI Workflow | Fully deterministic | Low, hardcoded steps | Lowest, exactly 2 LLM calls |
| Autonomous AI Agent | LLM-driven | Medium, adapts to input | Medium, variable LLM calls |
| Deterministic Multi-Agent System | Deterministic process, autonomous agents | High, modular agents | Medium, 2 agent invocations |
| Autonomous Multi-Agent System | Fully LLM-driven | Highest, dynamic discovery | Highest, orchestrator plus agent LLM calls |

As you move toward more autonomy, the need for visibility, control, and reliability only grows. That is where Diagrid Catalyst helps. The Catalyst dashboard lets you visualize workflow executions in real time, inspect each step's input and output, drill into agent reasoning loops, and debug failed workflows. When an agent produces an unexpected result, you can trace the exact sequence of LLM calls and tool invocations, from the orchestrator down to each sub-agent.

If you want to see these patterns demonstrated end to end, check out this recording. You can also try all four patterns from the companion repo. Sign up for Diagrid Catalyst and start building.

To watch the full Dapr 1.17 Celebration Event, including all sessions and demos, you can view the recording here.