Authoring workflows

This is the deep reference for authoring Sapiom workflows — the step model and the patterns you reach for once you’re past the Quickstart. Your scaffolded project also ships an AGENTS.md with the same guidance inline, for your coding agent to read locally.

The step model

A workflow is a single defineOrchestration({ name, entry, steps }) call. Each step is a defineStep({ name, next, run }), where run(input, ctx) is ordinary async code that returns a directive telling the engine what to do next.

import { defineOrchestration, defineStep, goto, terminate } from "@sapiom/orchestration";

const start = defineStep({
  name: "start",
  next: ["finish"],          // the steps this one may hand off to
  async run(input, ctx) {
    return goto("finish", { greeting: "hello" });
  },
});

const finish = defineStep({
  name: "finish",
  next: [],
  terminal: true,            // a terminal step ends the workflow
  async run(input, ctx) {
    return terminate({ done: true });
  },
});

export const orchestration = defineOrchestration({
  name: "my-workflow",
  entry: "start",
  steps: { start, finish },
});
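To make the directive model concrete, here is a toy dispatcher. It is a sketch, not the Sapiom engine: `Directive`, `ToyStep`, `lookup`, and `runToy` are hypothetical local stand-ins, steps run synchronously for brevity, and the rule that a goto must follow a declared `next` edge is an assumption about how the graph is enforced.

```typescript
// Toy model only: these types are local stand-ins, not the @sapiom/orchestration API.
type Directive =
  | { kind: "goto"; target: string; input: unknown }
  | { kind: "terminate"; output: unknown };

interface ToyStep {
  name: string;
  next: string[];
  run(input: unknown): Directive;
}

function lookup(steps: Record<string, ToyStep>, name: string): ToyStep {
  const step = steps[name];
  if (!step) throw new Error(`unknown step: ${name}`);
  return step;
}

// Follow directives from the entry step until one of them terminates.
function runToy(steps: Record<string, ToyStep>, entry: string, input: unknown): unknown {
  let current = lookup(steps, entry);
  let payload: unknown = input;
  for (;;) {
    const directive = current.run(payload);
    if (directive.kind === "terminate") return directive.output;
    // Assumption: a goto is only valid along an edge declared in `next`.
    if (current.next.indexOf(directive.target) === -1) {
      throw new Error(`${current.name} may not goto ${directive.target}`);
    }
    current = lookup(steps, directive.target);
    payload = directive.input; // the goto payload becomes the next step's input
  }
}

const toySteps: Record<string, ToyStep> = {
  start: {
    name: "start",
    next: ["finish"],
    run: () => ({ kind: "goto", target: "finish", input: { greeting: "hello" } }),
  },
  finish: {
    name: "finish",
    next: [],
    run: (input) => ({ kind: "terminate", output: { done: true, received: input } }),
  },
};

// toyResult is { done: true, received: { greeting: "hello" } }
const toyResult = runToy(toySteps, "start", {});
```

The point of the sketch is the loop: each step only returns a value describing what should happen next, and the dispatcher owns the actual transition.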

defineStep accepts:

Field	Purpose
`name`	the step’s id
`next`	step names this step may `goto` (the graph edges)
`terminal`	`true` if this step ends the workflow
`run(input, ctx)`	your code; returns a directive
`inputSchema`	optional Zod schema validating this step’s input
`timeoutMs`	optional per-step timeout
`canFail`	set `true` to allow returning `fail()`
`pause`	declares a pause/resume signal (see below)

Inside run, ctx gives you ctx.input, ctx.shared (a typed cross-step store), ctx.logger, ctx.attempts (this step’s retry count), ctx.executionId, and ctx.sapiom (the capability client).

Control flow & passing data

Advance with goto(targetStep, input); end with terminate(output). The value passed to goto becomes the next step’s input. For data several steps need, use ctx.shared (a typed key/value store). The entry input reaches only the entry step’s input — to use any of it in later steps, write it into ctx.shared from the entry step (as compose does below). Validate a step’s input with inputSchema (Zod):

import {
  defineOrchestration,
  defineStep,
  goto,
  terminate,
  type OrchestrationExecutionContext,
} from "@sapiom/orchestration";
import { z } from "zod/v4";

interface Shared extends Record<string, unknown> {
  salutation: string;
}

const inputSchema = z.object({ name: z.string().min(1) });

const compose = defineStep({
  name: "compose",
  next: ["format"],
  inputSchema,
  async run(input, ctx: OrchestrationExecutionContext<Shared>) {
    ctx.shared.set("salutation", `Hello, ${input.name}`);
    return goto("format", {});
  },
});

const format = defineStep({
  name: "format",
  next: [],
  terminal: true,
  async run(_input, ctx: OrchestrationExecutionContext<Shared>) {
    return terminate({ greeting: `${ctx.shared.get("salutation")}!` });
  },
});

export const orchestration = defineOrchestration<z.infer<typeof inputSchema>, Shared>({
  name: "greeting",
  entry: "compose",
  steps: { compose, format },
});
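One way to picture a typed cross-step store is a key/value map whose keys and value types come from the Shared interface. The `TypedStore` class below is a hypothetical stand-in written for illustration. It mirrors the get/set calls used above but is not the SDK's implementation, and returning `undefined` for an unwritten key is an assumption.

```typescript
// Hypothetical stand-in for ctx.shared; not the SDK's implementation.
class TypedStore<S extends Record<string, unknown>> {
  private values: Record<string, unknown> = {};

  // The value type is tied to the key through the Shared interface.
  set<K extends keyof S & string>(key: K, value: S[K]): void {
    this.values[key] = value;
  }

  // Assumption: reading a key nobody has written yields undefined.
  get<K extends keyof S & string>(key: K): S[K] | undefined {
    return this.values[key] as S[K] | undefined;
  }
}

interface GreetingShared extends Record<string, unknown> {
  salutation: string;
}

const shared = new TypedStore<GreetingShared>();
const before = shared.get("salutation");  // nothing written yet
shared.set("salutation", "Hello, Ada");   // what an entry step would do
const greeting = `${shared.get("salutation")}!`; // what a later step would read
```

With this shape, `shared.set("salutation", 42)` fails to type-check, which is the benefit of declaring the Shared interface up front.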

Failure handling & retries

There’s no magic auto-retry — you express failure handling explicitly, which keeps it visible in the graph. The common shape is a bounded loop that escalates to a human: a worker step, an evaluate step that branches, and a counter in ctx.shared that caps the retries before pausing for review. The shape (steps abbreviated):

import { defineStep, goto, type OrchestrationExecutionContext } from "@sapiom/orchestration";

// evaluate → ship on success, else → reconsider
const evaluate = defineStep({
  name: "evaluate",
  next: ["ship", "reconsider"],
  async run(input: { passed: boolean }, ctx) {
    return input.passed ? goto("ship", {}) : goto("reconsider", {});
  },
});

// reconsider → loop back to the worker, or escalate once we hit the cap
const reconsider = defineStep({
  name: "reconsider",
  next: ["work", "escalate"],
  async run(_input, ctx: OrchestrationExecutionContext<{ attempt: number; maxAttempts: number }>) {
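    // "attempt" must be incremented once per failed pass somewhere in the loop
    // (for example in the worker step, not shown); otherwise the cap is never reached.
    const attempt = ctx.shared.get("attempt") ?? 0;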
    const max = ctx.shared.get("maxAttempts") ?? 3;
    return attempt < max ? goto("work", {}) : goto("escalate", {});
  },
});

When you goto back into a step that declares an inputSchema, the payload you pass must still satisfy that schema.
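To see how the cap plays out, the loop can be simulated without the SDK. This is a minimal sketch under one assumption: the counter is incremented once per failed pass before the cap is compared. `simulateLoop` and its parameters are hypothetical names.

```typescript
// Plain simulation of the work → evaluate → reconsider loop; no SDK involved.
type Outcome = { route: "ship" | "escalate"; workRuns: number };

// passesOnRun: the work run on which evaluation first passes, or null if it never does.
function simulateLoop(passesOnRun: number | null, maxAttempts: number): Outcome {
  let attempt = 0;  // plays the role of the counter kept in ctx.shared
  let workRuns = 0;
  for (;;) {
    workRuns += 1; // work
    const passed = passesOnRun !== null && workRuns >= passesOnRun; // evaluate
    if (passed) return { route: "ship", workRuns };
    attempt += 1; // count the failed pass
    if (attempt >= maxAttempts) return { route: "escalate", workRuns }; // reconsider
  }
}

const recovers = simulateLoop(2, 3);   // { route: "ship", workRuns: 2 }
const givesUp = simulateLoop(null, 3); // { route: "escalate", workRuns: 3 }
```

With a cap of 3, the worker runs at most three times before the workflow hands off to a human.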

The SDK also gives you finer-grained directives when a step should handle its own failure:

retry({ delayMs }) — re-run this step; bound it with ctx.attempts.
fail(reason, { output }) — end the workflow as failed (the step must set canFail: true).
timeoutMs on a step caps how long its run may take.
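Bounding a self-retry comes down to one decision: retry with a delay, or give up. The sketch below models that decision with plain objects. `RetryDirective`, `FailDirective`, and `onTransientError` are hypothetical stand-ins that only mimic the shape of retry() and fail(), and it assumes the attempt count is 0 on the first run.

```typescript
// Local stand-ins that mimic the shape of retry()/fail(); not the SDK's functions.
type RetryDirective = { kind: "retry"; delayMs: number };
type FailDirective = { kind: "fail"; reason: string };

// attempts: how many times this step has already been retried
// (the role ctx.attempts plays; assumed to start at 0).
function onTransientError(
  attempts: number,
  maxRetries: number = 3,
  baseDelayMs: number = 500,
): RetryDirective | FailDirective {
  if (attempts >= maxRetries) {
    return { kind: "fail", reason: `gave up after ${attempts} retries` };
  }
  // Exponential backoff: 500 ms, 1000 ms, 2000 ms, ...
  return { kind: "retry", delayMs: baseDelayMs * Math.pow(2, attempts) };
}

const firstTry = onTransientError(0);  // { kind: "retry", delayMs: 500 }
const thirdTry = onTransientError(2);  // { kind: "retry", delayMs: 2000 }
const exhausted = onTransientError(3); // { kind: "fail", reason: "gave up after 3 retries" }
```

In a real step, the fail branch would only be legal if the step sets canFail: true, as noted above.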

Long-running steps: pause & resume

A step’s run completes in a single dispatch and can’t block across processes. To use a long-running capability such as the coding agent, the step launches it and then pauses; when the capability finishes, a resume step receives its result as input.

import {
  defineStep,
  pauseUntilSignal,
  terminate,
  type OrchestrationExecutionContext,
} from "@sapiom/orchestration";
import { CODING_RESULT_SIGNAL } from "@sapiom/tools";

interface Shared extends Record<string, unknown> {
  codingRunId: string;
}

const launch = defineStep({
  name: "launch",
  next: ["collect"],
  pause: { signal: CODING_RESULT_SIGNAL, resumeStep: "collect" },
  async run(input: { task: string }, ctx: OrchestrationExecutionContext<Shared>) {
    const run = await ctx.sapiom.agent.coding.launch({ task: input.task });
    ctx.shared.set("codingRunId", run.runId);
    return pauseUntilSignal(run, { resumeStep: "collect" });
  },
});

const collect = defineStep({
  name: "collect",
  next: [],
  terminal: true,
  // The resumed step's input IS the coding run's result.
  async run(input: { status: string; summary: string | null }, ctx) {
    return terminate({ status: input.status, summary: input.summary });
  },
});

Key points:

Declare pause: { signal, resumeStep } on the launching step and return pauseUntilSignal(handle, { resumeStep }) — the handle carries the signal and correlation id, so you don’t wire them by hand.
The resumed step’s input is the run’s result (for a dispatch pause — a launched capability). It crossed a process boundary, so it carries no live handles — stash anything else the resumed step needs in ctx.shared before pausing, and re-attach a sandbox from the result’s executionEnvironment if you need one.
For a manual human-gate pause, the resume payload is whatever fires the signal. Note that under run_local a manual gate auto-resumes with an empty {} unless you stub it — so type the resumed step’s input defensively (optional fields).
For a human gate, use the manual form and fire the signal from your approval UI (or a capability callback) to resume:

return pauseUntilSignal({
  signal: "my.approval",
  resumeStep: "finalize",
  correlationId: ctx.executionId, // makes the awaited signal unique to this run
});
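Typing the resumed input defensively can look like the sketch below. The payload shape (`approved`, `reviewer`) and the helper `readApproval` are hypothetical; the only behavior taken from the text above is that a local run may resume a manual gate with an empty object.

```typescript
// Hypothetical payload for a "my.approval" signal. Every field is optional
// because a local run may resume the gate with an empty object.
interface ApprovalPayload {
  approved?: boolean;
  reviewer?: string;
}

type GateDecision = { proceed: boolean; note: string };

function readApproval(payload: ApprovalPayload): GateDecision {
  if (payload.approved === undefined) {
    // Empty resume (for example an unstubbed local run): pick an explicit default.
    return { proceed: false, note: "no decision supplied" };
  }
  const who = payload.reviewer ?? "unknown reviewer";
  return {
    proceed: payload.approved,
    note: `${payload.approved ? "approved" : "rejected"} by ${who}`,
  };
}

const emptyResume = readApproval({});
// { proceed: false, note: "no decision supplied" }
const realResume = readApproval({ approved: true, reviewer: "dana" });
// { proceed: true, note: "approved by dana" }
```

Whether an empty payload should stop or proceed is a design choice. The sketch stops, so a missing decision is never mistaken for an approval.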

Sub-workflows

A step can run another deployed orchestration and await its result — the same launch-and-pause pattern as the coding agent, via ctx.sapiom.orchestrations.

For a quick child run, run inline:

const result = await ctx.sapiom.orchestrations.run({
  definition: "enrich-lead",          // the child orchestration's deployed slug
  input: { domain: "acme.com" },
});

For a long-running child, launch it and pause until it signals — so the parent step doesn’t time out:

import { defineStep, pauseUntilSignal } from "@sapiom/orchestration";
import { ORCHESTRATIONS_RESULT_SIGNAL } from "@sapiom/tools";

const launchChild = defineStep({
  name: "launchChild",
  next: ["useResult"],
  pause: { signal: ORCHESTRATIONS_RESULT_SIGNAL, resumeStep: "useResult" },
  async run(input: { domain: string }, ctx) {
    const handle = await ctx.sapiom.orchestrations.launch({
      definition: "enrich-lead",
      input: { domain: input.domain },
    });
    return pauseUntilSignal(handle, { resumeStep: "useResult" });
  },
});

The resumed step’s input is the child’s OrchestrationRunResultPayload — discriminated on status, so branch on "completed" vs "failed" rather than catching an exception. Import the type from @sapiom/tools.
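Branching on a discriminated status might look like the following sketch. `ChildResult` is a local stand-in, not the real OrchestrationRunResultPayload: only the "completed" and "failed" status values come from the text above, and the `output` and `error` fields are assumed for illustration.

```typescript
// Local stand-in for the child's result. The output and error fields are assumptions.
type ChildResult =
  | { status: "completed"; output: { score: number } }
  | { status: "failed"; error: string };

function summarize(result: ChildResult): string {
  switch (result.status) {
    case "completed":
      // Narrowed: `output` is only available on the completed variant.
      return `enriched with score ${result.output.score}`;
    case "failed":
      // Narrowed: `error` is only available on the failed variant.
      return `child failed: ${result.error}`;
  }
}

const good = summarize({ status: "completed", output: { score: 7 } });
const bad = summarize({ status: "failed", error: "timeout" });
```

Because the compiler narrows on `status`, a failed child shows up as an ordinary branch in the resumed step rather than as a thrown exception.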

Determinism

A step body runs once on the happy path and re-runs only on retry (after a throw or a retry()). Don’t rely on a value being recomputed identically across a pause/resume or a retry — capture non-deterministic values (timestamps, ids) once and pass them forward via goto input or ctx.shared.
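A minimal sketch of "capture once": compute the value on the first run, store it, and reuse the stored copy on any re-run. A plain object stands in for ctx.shared here, and `captureOnce` is a hypothetical helper name.

```typescript
// A plain object stands in for ctx.shared; captureOnce is a hypothetical helper.
function captureOnce<T>(store: Record<string, unknown>, key: string, produce: () => T): T {
  if (Object.prototype.hasOwnProperty.call(store, key)) {
    return store[key] as T; // re-run: reuse the value captured earlier
  }
  const value = produce();  // first run: compute and remember
  store[key] = value;
  return value;
}

const runState: Record<string, unknown> = {};
let clockReads = 0;
// Fake clock that returns a different value on every read.
const readClock = (): number => {
  clockReads += 1;
  return 1000 + clockReads;
};

const firstRun = captureOnce(runState, "startedAt", readClock);
const afterRetry = captureOnce(runState, "startedAt", readClock); // simulated re-run
// firstRun and afterRetry are both 1001; the clock was read only once.
```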

Testing what you authored

The local loop is offline and free — no real capability calls, no spend:

npm run typecheck — confirms every ctx.sapiom.* call and directive you used actually exists.
check — bundles index.ts and validates the step graph.
run_local — runs your real step code with every capability stubbed; a paused step (a coding launch or a manual human-gate signal) auto-resumes locally — a dispatch with its stub result, a manual gate with {} unless you stub it — so the happy path runs end to end.
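As an illustration of what validating a step graph can mean, the sketch below checks three things: the entry exists, every `next` target exists, and no non-terminal step is a dead end. These rules are assumptions chosen for the example, and `validateGraph` is a hypothetical name; the rules the real check command enforces are not listed here.

```typescript
// Illustrative checks only; not the implementation of the real `check` command.
interface StepShape {
  next: string[];
  terminal?: boolean;
}

function hasStep(steps: Record<string, StepShape>, name: string): boolean {
  return Object.prototype.hasOwnProperty.call(steps, name);
}

function validateGraph(entry: string, steps: Record<string, StepShape>): string[] {
  const problems: string[] = [];
  if (!hasStep(steps, entry)) problems.push(`entry "${entry}" is not a defined step`);
  for (const name of Object.keys(steps)) {
    const step = steps[name];
    for (const target of step.next) {
      if (!hasStep(steps, target)) {
        problems.push(`"${name}" points at unknown step "${target}"`);
      }
    }
    if (!step.terminal && step.next.length === 0) {
      problems.push(`"${name}" is a dead end: not terminal and no next steps`);
    }
  }
  return problems;
}

const okGraph = validateGraph("start", {
  start: { next: ["finish"] },
  finish: { next: [], terminal: true },
});
const brokenGraph = validateGraph("start", {
  start: { next: ["finsh"] }, // typo in the edge name
  finish: { next: [], terminal: true },
});
// okGraph is []; brokenGraph is ['"start" points at unknown step "finsh"']
```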

run_local needs no stubs to start (capabilities return sensible defaults). Add overrides in .sapiom-dev/stubs.json only when a step branches on a specific result:

{ "version": 1, "steps": { "collect": { "agent.coding.run": { "status": "completed" } } } }

Stub the method your step actually calls — agent.coding.launch if you launched it, agent.coding.run if you awaited inline. Capability paths use the plural namespace for namespace calls (repositories.list) and the singular handle for handle methods (repository.pushFromSandbox). run_local reports unusedStubs (a key matched nothing — usually a typo or a singular/plural slip) and stubWarnings (a key matched but the value was the wrong shape) — a green run with either non-empty means a stub silently didn’t apply.
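The idea behind an unused-stub report can be sketched in a few lines: compare the stub keys you wrote against the capability paths the run actually called. This illustrates the concept only, not how run_local is implemented; `findUnusedStubs` and the misspelled key `repository.list` are hypothetical.

```typescript
// Concept sketch only; not run_local's implementation.
function findUnusedStubs(stubKeys: string[], calledPaths: string[]): string[] {
  // A stub is unused when no call during the run matched its key.
  return stubKeys.filter((key) => calledPaths.indexOf(key) === -1);
}

// The step called the plural namespace path, but one stub used the singular form.
const calledPaths = ["repositories.list", "agent.coding.launch"];
const stubKeys = ["repository.list", "agent.coding.launch"];

const unused = findUnusedStubs(stubKeys, calledPaths);
// unused is ["repository.list"]
```

A run like this would still pass on the default result, which is why a non-empty report deserves attention even when the run is green.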

Browse capabilities: the full ctx.sapiom.* catalog your steps can call.

Back to the Quickstart: scaffold, authenticate, and run your first workflow.

The SDK & public-API reference — the full lifecycle and the trigger/invocation surface — is coming next in this track.