Operating Live Instances

This page describes how operators inspect and act on running and completed process instances from the BPMN UI.

The two views you'll spend time in:

View	Purpose
BPMN Instances	List of instances across all deployed processes — filter, drill in, cancel
Instance detail	Everything about a single instance: incidents, history, jobs, user tasks, child instances, variables, and a live diagram view

Instances list

Open BPMN Instances from the BPMN navigation. You'll see one row per instance with workflow ID, status, started timestamp, and a link into the detail view.

Status	Meaning
RUNNING	The instance is in flight
COMPLETED	The instance reached an end event
FAILED	The instance terminated with an error
CANCELED	The instance was cancelled (by an operator or by a parent process)

Filtering

The status filter is deep-linkable — ?status=RUNNING, ?status=FAILED, etc. The operational dashboard's KPI cards link directly here.
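If you generate these links from your own tooling (a runbook or an alert template, say), a small helper keeps the status values consistent. This is a minimal sketch: `instances_link` and the `/bpmn/instances` path are hypothetical placeholders, and only the `status` query parameter comes from this page.

```python
from typing import Optional
from urllib.parse import urlencode

# Status values from the table above.
STATUSES = ("RUNNING", "COMPLETED", "FAILED", "CANCELED")


def instances_link(list_path: str, status: Optional[str] = None) -> str:
    """Build a link to the instances list, optionally pre-filtered by status."""
    if status is None:
        return list_path
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    return f"{list_path}?{urlencode({'status': status})}"


# "/bpmn/instances" is a placeholder path, not the product's real route.
failed_link = instances_link("/bpmn/instances", "FAILED")
```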

The list can also be scoped to a single deployed process, in which case only instances of that process are shown.

Auto-refresh

When any instance on the page is RUNNING, the list auto-refreshes every 8 seconds. Once every instance on the page has reached a terminal status (COMPLETED, FAILED, or CANCELED), polling stops.
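The polling rule is simple enough to state as code. The sketch below illustrates the behavior described above rather than the UI's actual implementation; `should_poll` and the constant name are made up for the example.

```python
# Interval stated on this page; the constant name is illustrative.
REFRESH_INTERVAL_SECONDS = 8


def should_poll(statuses_on_page):
    """Return True while at least one instance on the page is still RUNNING."""
    return any(status == "RUNNING" for status in statuses_on_page)
```

A client would call `should_poll` after each fetch and schedule the next refresh only when it returns `True`.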

Cancelling from the list

Each running instance has a cancel action. It asks for confirmation and then cancels the instance immediately. Completed, failed, and canceled instances cannot be cancelled.

Instance detail

Click into a row to open the detail page. The top of the page shows:

The instance's status (with a count of incidents if any are open).
The workflow ID, in monospace, for copy/paste.
A breadcrumb back to parent instances, when the instance was started by a call activity (see Drilling into child instances).
A View Diagram action that opens a full-screen diagram view.
A Cancel action (running instances only).

If the instance is FAILED, the failure reason is shown in a banner at the top.

The body is a multi-open accordion with the following sections.

Incidents

An incident is raised when an activity throws an error that no handler catches. The instance pauses at the failed node and waits for an operator to resolve it.

Each incident shows:

Field	Description
Node	The failed node's ID
Type	Error type as reported by the activity
Message	The error message
Time	When the incident was created

What causes an incident

An error becomes an incident when no matching handler can catch it. The common paths:

Source	When it produces an incident
External worker	A worker reports an error and the retry budget is exhausted with no error boundary catching the BPMN error
User task	A caller submits `ThrowError` with no matching error boundary on the user task or up the scope chain
Script task	A FEEL expression in the script body raises an evaluation error — division by zero, unresolved variable, type mismatch — with no error boundary attached
Business-rule task	The called DMN decision fails to evaluate or the result mapping fails, with no error boundary attached
Uncaught error throw	An error end event or thrown error from an activity that bubbles up to the root with no matching error boundary or error event sub-process anywhere on the path

If an error is caught — by an error boundary on the activity, a parent's boundary, or an error event sub-process — there's no incident. The token routes through the handler.
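One way to picture the rule is as a walk up the scope chain: the first scope whose handlers match the error code takes the token, and if the walk reaches the root without a match, an incident is the result. The sketch below is a simplified model, not the engine's code. The `Scope` class, its fields, and the node IDs are all hypothetical, and it matches on error code only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Scope:
    """Hypothetical scope-chain node: an activity or an enclosing (sub-)process."""
    node_id: str
    caught_codes: set = field(default_factory=set)  # codes its error handlers catch
    parent: Optional["Scope"] = None


def find_handler(origin: Scope, error_code: str) -> Optional[Scope]:
    """Walk from the failing activity up to the root; return the first scope that catches."""
    scope = origin
    while scope is not None:
        if error_code in scope.caught_codes:
            return scope
        scope = scope.parent
    return None  # nothing caught the error: this is the incident case


# Example chain: process -> sub-process with an OUT_OF_STOCK handler -> task.
root = Scope("process")
sub = Scope("fulfilment", caught_codes={"OUT_OF_STOCK"}, parent=root)
task = Scope("reserve-items", parent=sub)
```

Here `find_handler(task, "OUT_OF_STOCK")` returns the `fulfilment` scope, while an unhandled code such as `"PAYMENT_DECLINED"` returns `None`.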

To act on an incident, open the diagram view and use the run panel — see Resolving incidents.

Activity history

A chronological table of every activity that has fired in the instance:

Field	Description
Node	The activity's ID
Type	The element type (`serviceTask`, `userTask`, `exclusiveGateway`, etc.)
Status	`Started`, `Completed`, `Failed`, or `Canceled`
Started	Timestamp of entry
Duration	Elapsed time, or "Running" if still active

This is the same data the replay slider works from.

External jobs

Service-task and worker-mode send-task jobs that the engine has dispatched. Useful when investigating a stuck job or confirming that a worker actually picked one up.

Field	Description
Node	The activity that produced the job
Task type	The worker type string the job was dispatched with
Status	`PENDING`, `COMPLETED`, `FAILED`, or `CANCELED`
Cancel reason	The reason the job was cancelled, when applicable
Created / Completed	Timestamps

For the worker-author side of this, see External workers.

User tasks

The user-task lifecycle for the instance:

Field	Description
Node	The user task's ID
Assignee	The single user assigned (if any)
Candidate groups	Groups eligible to claim the task
Status	`CREATED`, `COMPLETED`, `FAILED`, or `CANCELED`
Detail	Error code (FAILED) or cancel reason (CANCELED)
Created	When the task was registered

Child instances

Instances spawned by call activities in this process. Each row links to the child's detail page; clicking Open preserves the breadcrumb so you can walk back up.

Variables

The instance's current variable scope as a JSON object. Empty when the instance has no variables set.

Diagram view

The View Diagram action opens a full-screen modal with the deployed BPMN diagram, overlaid with the instance's live state.

Overlay colors

State	Meaning
Active	The node is currently executing
Completed	The node finished normally
Failed	The node threw an error
Incident	The node has an unresolved incident

The overlay updates every 3 seconds while the instance is running.

Replay slider

Below the diagram, a replay slider scrubs through the recorded history one event at a time. The slider runs from the first event (step 1) to the latest (step N). Each tick reflects one history entry, labelled with the node ID and what happened (Started, Completed, Failed, or Canceled).

Mode	Behavior
Live (default)	The diagram shows the current state; the run panel is interactive
Replay	The diagram shows historical state at the chosen step; the run panel is hidden

Click Live to drop out of replay mode and return to the current state.
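Conceptually, replay folds the first N history entries into a per-node state. The sketch below assumes each history entry is a `(node ID, event)` pair in chronological order. That shape and the `state_at_step` name are assumptions for illustration, not the UI's data model.

```python
def state_at_step(history, step):
    """Replay the first `step` entries (1-based) and return each node's latest event."""
    if not 1 <= step <= len(history):
        raise ValueError("step must be between 1 and the number of history entries")
    state = {}
    for node_id, event in history[:step]:
        state[node_id] = event  # later entries overwrite earlier ones for the same node
    return state


# Hypothetical history for a two-node instance.
history = [
    ("start", "Started"),
    ("start", "Completed"),
    ("charge-card", "Started"),
    ("charge-card", "Failed"),
]
```

Moving the slider to step 3 would show `start` as Completed and `charge-card` as Started. Step 4 shows `charge-card` as Failed.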

Run panel

The run panel sits to the right of the diagram and lets you act on the instance directly. It is visible only when the instance is RUNNING, you have edit permission, and the diagram is not in replay mode.

The panel sections, top to bottom:

Active elements

A list of nodes that are currently active — the same nodes highlighted on the diagram.

Pending tasks

For each pending external job, a card lets you:

Inspect the job's input variables and headers.
Complete the job by submitting an output variables JSON — the engine treats this exactly like a worker completion.
Throw error by submitting an error code and optional variables — the engine treats this as a worker-reported BPMN error, which routes the token through any matching error boundary.

This is most useful when you have no worker yet and want to drive the instance forward by hand, or when you need to recover from a stuck worker.

Ad-hoc scopes

When the instance is inside an ad-hoc sub-process, the run panel offers per-scope controls:

Activate inner: pick one of the ad-hoc's child activities and start it. Useful when the scope's `activeElementsCollection` doesn't match what you want, or when activities should be triggered manually.
Update scope variables — submit a JSON object that's merged into the scope's variables. The completion condition is re-evaluated.

Incidents

Each open incident shows the node ID, error message, and error code (if any) with a Retry button. Retrying resolves the incident and re-executes the failed node from the beginning.

Publish signal or message

Send a signal or message directly from the panel.

For a signal (broadcast — wakes every subscriber):

Field	Notes
Name	Signal name to broadcast
Variables	JSON payload merged into the recipients' scope
TTL	Optional buffer expiry — ISO 8601 duration (`PT30M`), duration string (`30m`), or absolute timestamp
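To make the three TTL formats concrete, here is a rough sketch of how a client could normalize them into an absolute expiry time. It is not the engine's parser. It assumes the ISO 8601 durations in use are limited to days, hours, minutes, and seconds, and that a duration string is a single number followed by `s`, `m`, `h`, or `d`. The function name is hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Subset of ISO 8601 durations: PnDTnHnMnS (assumption: no years, months, or weeks).
ISO_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)
# Short form such as "30m" (assumption: one number plus one unit).
SHORT_DURATION = re.compile(r"^(?P<value>\d+)(?P<unit>[smhd])$")
UNIT_KEYS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def ttl_to_expiry(ttl: str, now: datetime) -> datetime:
    """Turn a TTL string in any of the three formats into an absolute expiry time."""
    text = ttl.strip()
    iso = ISO_DURATION.match(text)
    if iso and any(iso.groupdict().values()):
        parts = {k: int(v) for k, v in iso.groupdict().items() if v is not None}
        return now + timedelta(**parts)
    short = SHORT_DURATION.match(text)
    if short:
        return now + timedelta(**{UNIT_KEYS[short["unit"]]: int(short["value"])})
    # Otherwise treat the input as an absolute ISO 8601 timestamp.
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
expiry = ttl_to_expiry("PT30M", now)
```

With these assumptions, `PT30M` and `30m` both resolve to 30 minutes after `now`, and an absolute timestamp is returned unchanged.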

For a message (point-to-point — wakes one matching subscriber):

Field	Notes
Name	Message name
Correlation	Either a single value (number, boolean, or string) or a JSON object — leave empty for "no correlation"
Variables	JSON payload
TTL	Optional buffer expiry, same formats as signal

When the correlation is a single value, a number, boolean, or quoted string is parsed and sent with the matching type. Anything else is sent as a plain string.
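The typing rule for the correlation field can be approximated with a JSON parse and a fallback. This is a sketch of the described behavior, not the panel's code. `parse_correlation` is a hypothetical name, and the handling of `null` and arrays (falling back to the raw text) is an assumption.

```python
import json


def parse_correlation(raw: str):
    """Approximate the panel's typing of the correlation field."""
    text = raw.strip()
    if text == "":
        return None  # empty field: no correlation
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        return text  # not valid JSON: send as a plain string
    # Booleans, numbers, quoted strings, and JSON objects keep their parsed type.
    if isinstance(value, (bool, int, float, str, dict)):
        return value
    return text  # assumption: null and arrays fall back to the raw text
```

For example, `42` becomes the number 42, `"42"` (with quotes) becomes the string `42`, and `order-17` stays a plain string.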

Variables

The instance's current variables as a collapsible JSON block. Expands to a scrollable code view.

Activity history

A condensed view of the activity table from the detail page — node, status (with badge), start time, duration. Useful when you want to glance at recent activity without leaving the diagram view.

Cancelling an instance

A Cancel action is available in two places:

The cancel icon next to each running row in the instances list.
The cancel button in the instance detail header (running instances only).

Both ask for confirmation and terminate the workflow immediately. Cancellation propagates to child instances spawned by call activities.
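As a mental model, propagation is a traversal of the call-activity tree that skips anything already terminal. The sketch below works on plain dictionaries and is only an illustration. The `cancel` function, the `status` map, and the `children` map are hypothetical, not engine APIs.

```python
def cancel(instance_id, status, children):
    """Cancel a running instance and its running descendants; return the IDs cancelled."""
    cancelled = []
    stack = [instance_id]
    while stack:
        current = stack.pop()
        if status.get(current) != "RUNNING":
            continue  # terminal instances cannot be cancelled
        status[current] = "CANCELED"
        cancelled.append(current)
        stack.extend(children.get(current, []))
    return cancelled


# Hypothetical tree: parent -> child-a -> grandchild, plus a finished child-b.
status = {
    "parent": "RUNNING",
    "child-a": "RUNNING",
    "child-b": "COMPLETED",
    "grandchild": "RUNNING",
}
children = {"parent": ["child-a", "child-b"], "child-a": ["grandchild"]}
cancelled = cancel("parent", status, children)
```

In this example the parent, `child-a`, and `grandchild` end up CANCELED, while `child-b` keeps its COMPLETED status.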

Resolving incidents

When an instance has unresolved incidents, work from the diagram view:

Click View Diagram on the instance detail page.
In the run panel, find the incident in the Incidents section.
Click Retry. The engine clears the incident and re-runs the activity that produced it.

What "retry" actually does

A retry does not restart the whole instance. It re-runs only the activity that the incident is bound to:

For a service task, the engine creates a fresh job in the queue. A worker (or an operator using Pending tasks) picks it up and tries again.
For a user task, the task is re-registered as CREATED and waits for an assignee to act on it.
For a script task or business-rule task, the engine re-evaluates the script or the decision.

Variables and tokens elsewhere in the instance are unaffected. The activity's scope is the same as it was at the original entry — input mappings are not re-evaluated.

Adjusting input before retry

The resolve API accepts an optional variables map that's merged into the activity's scope before the retry runs. Use this when the input itself was the problem — for example, a FEEL expression failed because a value was missing or had the wrong shape, and you want to inject a correction without changing the model.

The run panel's Retry button submits an empty variables map by default. To pass corrections, call the API directly:

POST /projects/{projectID}/bpmn/instances/{workflowID}/incidents/{incidentID}/resolve

{ "variables": { "amount": 49.95 } }
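If you script this call, the request might be assembled along these lines. The path and body shape come from the example above. The base URL, the IDs, and the `build_resolve_request` helper are placeholders, and authentication is not covered here, so add whatever headers your deployment requires.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_resolve_request(base_url, project_id, workflow_id, incident_id, variables=None):
    """Build (but do not send) the POST request that resolves an incident."""
    path = (
        f"/projects/{quote(project_id, safe='')}"
        f"/bpmn/instances/{quote(workflow_id, safe='')}"
        f"/incidents/{quote(incident_id, safe='')}/resolve"
    )
    # An empty map mirrors what the run panel's Retry button submits by default.
    body = json.dumps({"variables": variables or {}}).encode("utf-8")
    return Request(
        base_url.rstrip("/") + path,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Placeholder host and IDs; substitute real values before sending.
request = build_resolve_request(
    "https://engine.example.com", "demo", "wf-123", "inc-1", {"amount": 49.95}
)
```

Sending is then a matter of passing `request` to `urllib.request.urlopen` (or your HTTP client of choice) once the host and credentials are in place.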

When the cause persists

Resolving doesn't fix the underlying problem; it only re-runs the activity. If the cause is still there (the worker is still buggy, the input is still bad, the decision is still misconfigured), the activity fails again and a new incident is created.

If you need to abandon the instance entirely, cancel it instead of retrying.

Drilling into child instances

Call activities spawn child workflows. The detail view exposes the parent/child relationship in two places:

The Child instances accordion section lists all children spawned by this instance.
The diagram view's call-activity nodes are clickable when an instance is running — clicking jumps to the child's detail page.

Both paths preserve a breadcrumb in the URL, so the navigation walks back up cleanly. If you deep-link directly to a child instance, the URL carries no breadcrumb, but the API still reports the parent, so an Up to parent button appears in the header for a single hop.

What's API-only today

Two operational features exist on the API but don't yet have a UI:

Feature	Effect
Modify token state	Insert a new token before a node (`START_BEFORE_NODE`) or cancel an active token (`CANCEL_TOKEN`)
Migrate to a new process version	Move running instances to a new deployed version with an explicit migration plan

Both are exposed through the REST API. UI surfaces are on the roadmap.
