Notes on agentic applications in business processes

Björn Roberg, GPT-5.1

Position: durable workflows pair naturally with agentic workflows.

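Put together, you get reliability over time from durable workflows, command and control from FSMs and Markov-style policies, and semantics and capabilities from agents and MCP tools.

This post walks through where the pairing fits, how FSMs and Markov chains constrain agents, how to combine them with durable workflows, and practical next steps.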


Where Durable + Agentic Workflows Shine

These are business areas where “smart but fragile” agents become “smart and production-safe” once wrapped in durable orchestration.

High-Fit Business Areas

| Area | Agent Does… | Durable Workflow Does… |
| --- | --- | --- |
| Customer onboarding & KYC | Understand docs, ask for missing info, choose checks | Track all steps, retry APIs, enforce SLAs and approvals |
| Loan / credit underwriting | Interpret financials, edge cases, draft rationale | Orchestrate bureaus, risk models, audit logs, notifications |
| Claims processing | Read narratives/photos, propose coverage decisions | Coordinate intake → adjuster → documents → payout |
| Order-to-cash / quote-to-order | Configure quotes, negotiate constraints | Route approvals, contract flow, provisioning, invoicing |
| IT / HR service desks | Triage, root-cause reasoning, run automated fixes | Manage SLAs, escalations, multi-team handoffs |
| Marketing & sales cadences | Personalize outreach, adapt per responses | Handle timing, throttling, channel sequencing, logging |
| Procurement & vendor mgmt | Compare bids, summarize contracts | Run RFx stages, approvals, onboarding, renewals |
| Compliance workflows | Interpret regs, draft policies, review exceptions | Enforce required steps, evidence retention, sign-offs |
| Non-diagnostic health admin | Coordinate benefits Q&A, reminders, education | Manage multi-visit journeys, pre-auth, scheduling, follow-ups |
| Account management / CS | Draft QBRs, suggest plays, interpret signals | Run multi-quarter plans, task orchestration, renewals |

Pattern: anywhere you have multi-step, cross-system processes with human nuance, this pairing is strong.


FSMs: Putting Guardrails Around Agents

A core challenge with agents is that they’re “too free-form.” FSMs are a simple, powerful way to constrain behavior without killing flexibility.

Basic Idea

Think of it as:

Toy FSM for an Agentic Workflow

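// States the agentic workflow can be in.
type State =
  | "CollectRequirements"
  | "Disambiguate"
  | "Plan"
  | "ExecuteTools"
  | "Summarize"
  | "Escalate";

// Facts the LLM is allowed to update; the FSM only reads them.
interface Context {
  userInputs: string[];
  plan?: string;
  toolsRun: number;
  errors: string[];
  readyToExecute: boolean;
  done: boolean;
}

// Pure transition function: current state + context in, next state out.
function transition(state: State, ctx: Context): State {
  switch (state) {
    case "CollectRequirements":
      // Stay here until the agent marks the requirements as complete.
      return ctx.readyToExecute ? "Plan" : "CollectRequirements";

    case "Plan":
      // No plan means the agent could not proceed; hand off to a human.
      return ctx.plan ? "ExecuteTools" : "Escalate";

    case "ExecuteTools":
      if (ctx.done) return "Summarize";
      // More than two recorded errors: stop looping and escalate.
      if (ctx.errors.length > 2) return "Escalate";
      return "ExecuteTools";

    // Terminal states stay put. Disambiguate is left unwired in this toy
    // version: nothing transitions into it and it never leaves.
    case "Summarize":
    case "Disambiguate":
    case "Escalate":
      return state;
  }
}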

The LLM’s job is to update Context (e.g., set readyToExecute, fill plan, mark done). The FSM decides what’s allowed next.
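One way to make "the FSM decides what's allowed next" checkable is to write the toy FSM's edges down as data and reject anything else. The following is a minimal sketch under that assumption: the edge list is read off the `transition` function above, while `ALLOWED_EDGES`, `isAllowed`, and `applyProposal` are illustrative names that do not appear in the original code.

```typescript
// State names mirror the toy FSM above.
type State =
  | "CollectRequirements"
  | "Disambiguate"
  | "Plan"
  | "ExecuteTools"
  | "Summarize"
  | "Escalate";

// Edges read off the toy transition function; adjust to your own process.
const ALLOWED_EDGES: Record<State, State[]> = {
  CollectRequirements: ["CollectRequirements", "Plan"],
  Disambiguate: ["Disambiguate"],
  Plan: ["ExecuteTools", "Escalate"],
  ExecuteTools: ["ExecuteTools", "Summarize", "Escalate"],
  Summarize: ["Summarize"],
  Escalate: ["Escalate"],
};

// True when the FSM has an edge from `from` to `to`.
function isAllowed(from: State, to: State): boolean {
  return ALLOWED_EDGES[from].indexOf(to) !== -1;
}

// Hypothetical guard: accept a proposed move only if it is a legal edge,
// otherwise stay in the current state.
function applyProposal(current: State, proposed: State): State {
  return isAllowed(current, proposed) ? proposed : current;
}
```

With this shape, a proposal such as jumping from `Plan` straight to `Summarize` is ignored, while `Plan` to `ExecuteTools` goes through.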


FSMs + MCP Servers: Structured Tool Use

MCP servers provide tooling backends (APIs, DB access, services). An FSM can:

Example:

| State | Allowed MCP Capabilities | Required Before Transition |
| --- | --- | --- |
| CollectRequirements | None (chat only) | Mandatory fields present (email, accountId, goal) |
| Plan | Read-only MCP tools (search, knowledge base, schemas) | Plan text + a list of tool calls with arguments |
| ExecuteTools | Full MCP access for this domain | Either done=true or maxSteps reached |
| Summarize | Read-only history + notification tools | At least one tool run, or explicit “no-op” explanation |
| Escalate | Ticketing / human handoff tools | Escalation reason + relevant context bundle |

This yields a more enforceable contract between your agent and your infrastructure.
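The example table can be expressed as a per-state contract: a tool allowlist plus an exit condition. This is a rough sketch, not a prescribed design. The tool names (`kb.search`, `crm.update`, and so on) are placeholders rather than real MCP tool names, the `Ctx` fields are assumptions, and the Escalate check is simplified to the escalation reason only.

```typescript
type State = "CollectRequirements" | "Plan" | "ExecuteTools" | "Summarize" | "Escalate";

// Assumed context shape; field names are illustrative.
interface Ctx {
  fields: { [name: string]: string };
  plan?: string;
  plannedCalls: string[];
  toolsRun: number;
  maxSteps: number;
  done: boolean;
  noOpExplanation?: string;
  escalationReason?: string;
}

interface Contract {
  allowedTools: string[];          // tools the agent may call in this state
  canLeave: (c: Ctx) => boolean;   // what must hold before transitioning out
}

const CONTRACTS: Record<State, Contract> = {
  CollectRequirements: {
    allowedTools: [], // chat only
    canLeave: (c) => ["email", "accountId", "goal"].every((f) => !!c.fields[f]),
  },
  Plan: {
    allowedTools: ["kb.search", "schema.read"], // read-only placeholders
    canLeave: (c) => !!c.plan && c.plannedCalls.length > 0,
  },
  ExecuteTools: {
    allowedTools: ["kb.search", "schema.read", "crm.update"],
    canLeave: (c) => c.done || c.toolsRun >= c.maxSteps,
  },
  Summarize: {
    allowedTools: ["history.read", "notify.send"],
    canLeave: (c) => c.toolsRun > 0 || !!c.noOpExplanation,
  },
  Escalate: {
    allowedTools: ["ticket.create"],
    canLeave: (c) => !!c.escalationReason,
  },
};

function toolsForState(s: State): string[] {
  return CONTRACTS[s].allowedTools;
}

// Reject any tool call the current state does not list.
function isToolCallAllowed(s: State, tool: string): boolean {
  return toolsForState(s).indexOf(tool) !== -1;
}
```

Keeping the contract as data means the same table can drive both the prompt ("these are your tools") and the enforcement check on the infrastructure side.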


Markov Chains: Probabilistic Control over Agent Strategies

FSMs are deterministic; sometimes you want probabilistic “what tends to work next?” behavior. That’s where a Markov-style model fits: treat “what the agent tries next” as a stochastic policy.

Framing

This is a small Markov Decision Process (MDP) over agent strategies.

Simple Markov Policy Table

Imagine we’re handling tool errors:

| State (simplified) | Next Action | Probability |
| --- | --- | --- |
| (errorClass=timeout, retries=0) | retry_last_tool | 0.7 |
| | switch_tool_family | 0.2 |
| | escalate | 0.1 |
| (errorClass=timeout, retries>=2) | switch_tool_family | 0.6 |
| | escalate | 0.4 |
| (errorClass=validation, retries=0) | ask_followup | 0.8 |
| | escalate | 0.2 |

You can learn or hand-tune these numbers based on what historically works.
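As one simple way to "learn" such numbers, you could count how often each action succeeded in each state and normalize. The sketch below is a heuristic under stated assumptions, not the post's method: the `Outcome` record shape, the `stateKey` encoding, and the add-one smoothing are all illustrative choices.

```typescript
// One logged attempt. Field names are assumptions for this sketch.
interface Outcome {
  stateKey: string; // e.g. "timeout|retries=0"
  action: string;   // e.g. "retry_last_tool"
  success: boolean;
}

type Policy = { [stateKey: string]: { [action: string]: number } };

// Weight each observed action by (successes + 1), then normalize per state.
// The +1 keeps actions that have only failed so far from dropping to zero.
function estimatePolicy(log: Outcome[]): Policy {
  const successes: Policy = {};
  for (const o of log) {
    const row = successes[o.stateKey] || (successes[o.stateKey] = {});
    row[o.action] = (row[o.action] || 0) + (o.success ? 1 : 0);
  }

  const policy: Policy = {};
  for (const key of Object.keys(successes)) {
    const row = successes[key] || {};
    const actions = Object.keys(row);
    let total = 0;
    for (const a of actions) total += (row[a] || 0) + 1;
    const probs: { [action: string]: number } = {};
    for (const a of actions) probs[a] = ((row[a] || 0) + 1) / total;
    policy[key] = probs;
  }
  return policy;
}
```

In practice you would probably blend estimates like these with hand-tuned priors and keep a floor on `escalate`, so the policy can always bail out to a human.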

Pseudocode for a Markov Policy Wrapper

// High-level moves the control layer can choose between.
type HighLevelAction =
  | "ask_followup"
  | "retry_last_tool"
  | "switch_tool_family"
  | "escalate"
  | "short_circuit_success";

// The (simplified) state the policy conditions on.
interface MarkovState {
  errorClass: "none" | "timeout" | "validation" | "unknown";
  retries: number;
  stepIndex: number;
}

// weightedSample is an assumed helper that draws one key with probability
// proportional to its weight.
function sampleNextAction(s: MarkovState): HighLevelAction {
  if (s.errorClass === "timeout" && s.retries === 0) {
    return weightedSample({
      retry_last_tool: 0.7,
      switch_tool_family: 0.2,
      escalate: 0.1,
    });
  }

  if (s.errorClass === "timeout" && s.retries >= 2) {
    return weightedSample({
      switch_tool_family: 0.6,
      escalate: 0.4,
    });
  }

  // …other cases…

  // Anything not covered above falls back to the safe option.
  return "escalate";
}

The LLM still decides content and tool arguments; the Markov layer decides which high-level move to try next.
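The wrapper above calls `weightedSample` without defining it. A minimal sketch of such a helper might look like the following; the injectable `rand` parameter is an assumption added here so the sampling can be made deterministic in tests.

```typescript
// Draw one key with probability proportional to its weight.
// `rand` defaults to Math.random and must return a value in [0, 1).
function weightedSample<K extends string>(
  weights: Record<K, number>,
  rand: () => number = Math.random,
): K {
  const keys = Object.keys(weights) as K[];
  let total = 0;
  for (const k of keys) total += weights[k];
  if (total <= 0) {
    throw new Error("weights must be non-empty with a positive sum");
  }

  // Walk the cumulative distribution until the random point is passed.
  let r = rand() * total;
  let last: K | undefined;
  for (const k of keys) {
    last = k;
    r -= weights[k];
    if (r < 0) return k;
  }
  // Floating-point drift can leave r at ~0; fall back to the last key.
  return last as K;
}
```

Because the weights are normalized by their sum, the table does not have to add up to exactly 1, which makes hand-tuning a little more forgiving.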


Combining It All with Durable Workflows

Durable workflows give you reliability over time; FSMs/Markov give you command and control; agents/MCP give you semantics and capabilities.

Execution Loop Sketch

  1. Load workflow instance

    • Current FSM/Markov state
    • Context (user inputs, history, plan, tool results)
  2. Decide control step

    • Use FSM or Markov policy to pick:
      • Next state (if FSM), and/or
      • Next high-level action (strategy).
  3. Invoke agent + MCP tools

    • Call the LLM with:
      • System prompt describing current state + allowed actions/tools
      • Conversation history and context
    • If needed, call MCP tools the LLM selected.
  4. Update state & persist

    • Update context from LLM + tool results (e.g., readyToExecute, done, errors[]).
    • Compute next FSM/Markov state.
    • Persist again; schedule next “tick” or finish.
  5. Repeat until terminal state (Summarize or Escalate).

Pseudo-workflow tick:

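async function workflowTick(instanceId: string) {
  // Load the persisted FSM state and context for this workflow instance.
  const { state, context } = await loadInstance(instanceId);

  const nextState = transition(state, context);           // FSM step

  // Markov step. This assumes a richer context than the toy Context above:
  // errors are objects with a `class` field, and the context also carries
  // `retries` and `stepIndex`.
  const highLevelAction = sampleNextAction({
    errorClass: context.errors.at(-1)?.class ?? "none",
    retries: context.retries,
    stepIndex: context.stepIndex,
  });

  const llmResult = await callAgent({
    state: nextState,
    action: highLevelAction,
    context,
    allowedTools: toolsForState(nextState),
  });

  const { updatedContext } = await runTools(llmResult, context);

  await saveInstance(instanceId, {
    state: nextState,
    context: updatedContext,
  });

  // Keep ticking until a terminal state is reached. Checking only the state
  // (not `done`) lets a finished ExecuteTools step still advance to Summarize.
  if (nextState !== "Summarize" && nextState !== "Escalate") {
    await scheduleNextTick(instanceId);
  }
}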
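To exercise a tick like this without a real workflow engine, the persistence helpers can be stubbed in memory. The class below is a sketch for local testing only: `InMemoryStore`, its `version` counter, and the `scheduled` list are hypothetical, and a real durable-workflow engine would back these calls with its own storage and timers. The methods are synchronous for brevity; `await` on their results still works.

```typescript
interface Instance<S, C> {
  state: S;
  context: C;
  version: number; // bumped on every save
}

class InMemoryStore<S, C> {
  private instances: { [id: string]: Instance<S, C> } = {};
  // Instance ids for which a next tick was requested, in order.
  readonly scheduled: string[] = [];

  create(id: string, state: S, context: C): void {
    this.instances[id] = { state, context, version: 0 };
  }

  loadInstance(id: string): Instance<S, C> {
    const inst = this.instances[id];
    if (!inst) throw new Error("unknown instance: " + id);
    return inst;
  }

  saveInstance(id: string, next: { state: S; context: C }): void {
    const prev = this.loadInstance(id);
    this.instances[id] = {
      state: next.state,
      context: next.context,
      version: prev.version + 1,
    };
  }

  scheduleNextTick(id: string): void {
    this.scheduled.push(id);
  }
}
```

A test can then create an instance, run one tick against the store, and assert on the saved state and on whether another tick was scheduled.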

Practical Ways to Go Further

Here are concrete next steps to turn these ideas into something real and robust.

1. Start with a Single, Narrow Use Case

Pick one process that is:

Examples:

Implement:

2. Make States and Actions Observable

Log:

This makes it easy to:

3. Explicitly Define Contracts per State

For each state, write down:

This can live as:

4. Close the Loop with Metrics

Track at least:

Then:

5. Gradually Increase Autonomy and Scope

Once the initial flow is stable:

Always keep:

