Operating Live Instances
This page describes how operators inspect and act on running and completed process instances from the BPMN UI.
The two views you'll spend time in:
| View | Purpose |
|---|---|
| BPMN Instances | List of instances across all deployed processes — filter, drill in, cancel |
| Instance detail | Everything about a single instance: incidents, history, jobs, user tasks, child instances, variables, and a live diagram view |
Instances list
Open BPMN Instances from the BPMN navigation. You'll see one row per instance with workflow ID, status, started timestamp, and a link into the detail view.
| Status | Meaning |
|---|---|
| RUNNING | The instance is in flight |
| COMPLETED | The instance reached an end event |
| FAILED | The instance terminated with an error |
| CANCELED | The instance was cancelled (by an operator or by a parent process) |
Filtering
The status filter is deep-linkable — ?status=RUNNING, ?status=FAILED, etc. The operational dashboard's KPI cards link directly here.
The list can also be scoped to a single deployed process, in which case only instances of that process are shown.
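As an illustration of the deep link, the sketch below builds a filtered list URL. Only the `status` query parameter and the four status values come from this page; the function name and the list URL passed in are hypothetical placeholders for wherever your deployment serves the instances list.

```python
from urllib.parse import urlencode

# Status values listed in the table above.
STATUSES = {"RUNNING", "COMPLETED", "FAILED", "CANCELED"}


def instances_link(list_url: str, status: str) -> str:
    """Return a deep link to the instances list filtered by status.

    list_url is an assumption: pass the URL of the BPMN Instances page
    in your own deployment.
    """
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status}")
    query = urlencode({"status": status})
    return f"{list_url}?{query}"
```

For example, `instances_link("https://bpmn.example.test/instances", "FAILED")` yields a link that opens the list with only failed instances shown.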
Auto-refresh
When any instance on the page is RUNNING, the list auto-refreshes every 8 seconds. Once everything's terminal, the polling stops.
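The refresh rule can be sketched as a small decision function. This is a reading of the behavior described above, not the UI's actual code; the function name is hypothetical.

```python
from typing import Iterable, Optional

REFRESH_SECONDS = 8  # interval stated above


def next_refresh_delay(statuses: Iterable[str]) -> Optional[int]:
    """Return the delay before the next refresh, or None to stop polling.

    Polling continues while at least one listed instance is RUNNING and
    stops once every instance on the page is in a terminal status.
    """
    if any(status == "RUNNING" for status in statuses):
        return REFRESH_SECONDS
    return None
```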
Cancelling from the list
Each running instance has a cancel action. It asks for confirmation and then cancels the instance immediately. Completed, failed, and canceled instances cannot be cancelled.
Instance detail
Click into a row to open the detail page. The top of the page shows:
- The instance's status (with a count of incidents if any are open).
- The workflow ID, in monospace, for copy/paste.
- A breadcrumb back to parent instances, when the instance was started by a call activity (see Drilling into child instances).
- A View Diagram action that opens a full-screen diagram view.
- A Cancel action (running instances only).
If the instance is FAILED, the failure reason is shown in a banner at the top.
The body is a multi-open accordion with the following sections.
Incidents
An incident is raised when an activity throws an error that no handler catches. The instance pauses at the failed node and waits for an operator to resolve it.
Each incident shows:
| Field | Description |
|---|---|
| Node | The failed node's ID |
| Type | Error type as reported by the activity |
| Message | The error message |
| Time | When the incident was created |
What causes an incident
An error becomes an incident when no matching handler can catch it. The common paths:
| Source | When it produces an incident |
|---|---|
| External worker | A worker reports an error and the retry budget is exhausted with no error boundary catching the BPMN error |
| User task | A caller submits ThrowError with no matching error boundary on the user task or up the scope chain |
| Script task | A FEEL expression in the script body raises an evaluation error — division by zero, unresolved variable, type mismatch — with no error boundary attached |
| Business-rule task | The called DMN decision fails to evaluate or the result mapping fails, with no error boundary attached |
| Uncaught error throw | An error end event or thrown error from an activity that bubbles up to the root with no matching error boundary or error event sub-process anywhere on the path |
If an error is caught — by an error boundary on the activity, a parent's boundary, or an error event sub-process — there's no incident. The token routes through the handler.
To act on an incident, open the diagram view and use the run panel — see Resolving incidents.
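The catch-or-incident rule can be pictured as a walk up the scope chain. The sketch below is a simplified model under stated assumptions: `Scope`, `find_handler`, and the error codes are hypothetical names, each scope is assumed to list the error codes its boundaries or error event sub-processes catch, and the engine's real matching rules may be richer.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Scope:
    """A node in the scope chain: an activity, a sub-process, or the root."""
    name: str
    catches: Set[str] = field(default_factory=set)  # error codes handled here
    parent: Optional["Scope"] = None


def find_handler(scope: Scope, error_code: str) -> Optional[Scope]:
    """Walk from the failing activity up to the root.

    Returns the first scope that catches the error, or None when the
    error reaches the root uncaught, which is when an incident is created.
    """
    current: Optional[Scope] = scope
    while current is not None:
        if error_code in current.catches:
            return current
        current = current.parent
    return None


root = Scope("root")
checkout = Scope("checkout-subprocess", {"PAYMENT_DECLINED"}, root)
charge = Scope("charge-card", set(), checkout)

caught_by = find_handler(charge, "PAYMENT_DECLINED")  # handled by the parent scope
uncaught = find_handler(charge, "TIMEOUT")            # no handler: incident
```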
Activity history
A chronological table of every activity that has fired in the instance:
| Field | Description |
|---|---|
| Node | The activity's ID |
| Type | The element type (serviceTask, userTask, exclusiveGateway, etc.) |
| Status | Started, Completed, Failed, or Canceled |
| Started | Timestamp of entry |
| Duration | Elapsed time, or "Running" if still active |
This is the same data the replay slider works from.
External jobs
Service-task and worker-mode send-task jobs that the engine has dispatched. Useful when investigating a stuck job or confirming that a worker actually picked one up.
| Field | Description |
|---|---|
| Node | The activity that produced the job |
| Task type | The worker type string the job was dispatched with |
| Status | PENDING, COMPLETED, FAILED, or CANCELED |
| Cancel reason | The reason the job was cancelled, when applicable |
| Created / Completed | Timestamps |
For the worker-author side of this, see External workers.
User tasks
The user-task lifecycle for the instance:
| Field | Description |
|---|---|
| Node | The user task's ID |
| Assignee | The single user assigned (if any) |
| Candidate groups | Groups eligible to claim the task |
| Status | CREATED, COMPLETED, FAILED, or CANCELED |
| Detail | Error code (FAILED) or cancel reason (CANCELED) |
| Created | When the task was registered |
Child instances
Instances spawned by call activities in this process. Each row links to the child's detail page; clicking Open preserves the breadcrumb so you can walk back up.
Variables
The instance's current variable scope as a JSON object. Empty when the instance has no variables set.
Diagram view
The View Diagram action opens a full-screen modal with the deployed BPMN diagram, overlaid with the instance's live state.
Overlay colors
| Overlay | Meaning |
|---|---|
| Active | The node is currently executing |
| Completed | The node finished normally |
| Failed | The node threw an error |
| Incident | The node has an unresolved incident |
The overlay updates every 3 seconds while the instance is running.
Replay slider
Below the diagram, a replay slider scrubs through the recorded history one event at a time. The slider runs from the first event (step 1) to the latest (step N). Each tick reflects one history entry, labelled with the node ID and what happened (Started, Completed, Failed, or Canceled).
| Mode | Behavior |
|---|---|
| Live (default) | The diagram shows the current state; the run panel is interactive |
| Replay | The diagram shows historical state at the chosen step; the run panel is hidden |
Click Live to drop out of replay mode and return to the current state.
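To make the replay behavior concrete, the sketch below rebuilds the per-node state at a chosen step by folding over the history. It assumes each history entry is a `(node_id, event)` pair and that a node with a `Started` entry and nothing later is drawn as Active; both the data shape and the function name are illustrative, not the UI's implementation.

```python
from typing import Dict, List, Tuple

HistoryEntry = Tuple[str, str]  # (node ID, "Started" | "Completed" | "Failed" | "Canceled")


def state_at_step(history: List[HistoryEntry], step: int) -> Dict[str, str]:
    """Return the state of every node seen so far after `step` entries.

    Steps are 1-based, matching the slider: step 1 is the first event and
    step len(history) is the latest.
    """
    if not 1 <= step <= len(history):
        raise ValueError(f"step must be between 1 and {len(history)}")
    state: Dict[str, str] = {}
    for node_id, event in history[:step]:
        # A started node stays Active until a later entry closes it.
        state[node_id] = "Active" if event == "Started" else event
    return state


history = [
    ("start", "Started"),
    ("start", "Completed"),
    ("charge-card", "Started"),
    ("charge-card", "Failed"),
]
```

With this history, step 3 shows `charge-card` as Active, and step 4 shows it as Failed.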
Run panel
The run panel sits to the right of the diagram and lets you act on the instance directly. It is visible only when the instance is RUNNING, you have edit permission, and the diagram is not in replay mode.
The panel sections, top to bottom:
Active elements
A list of nodes that are currently active — the same nodes highlighted on the diagram.
Pending tasks
For each pending external job, a card lets you:
- Inspect the job's input variables and headers.
- Complete the job by submitting an output variables JSON — the engine treats this exactly like a worker completion.
- Throw error by submitting an error code and optional variables — the engine treats this as a worker-reported BPMN error, which routes the token through any matching error boundary.
This is most useful when you have no worker yet and want to drive the instance forward by hand, or to recover from a stuck worker.
Ad-hoc scopes
When the instance is inside an ad-hoc sub-process, the run panel offers per-scope controls:
- Activate inner: pick one of the ad-hoc's child activities and start it. Useful when the scope's activeElementsCollection doesn't match what you want, or when activities should be triggered manually.
- Update scope variables: submit a JSON object that's merged into the scope's variables. The completion condition is re-evaluated.
Incidents
Each open incident shows the node ID, error message, and error code (if any) with a Retry button. Retrying resolves the incident and re-executes the failed node from the beginning.
Publish signal or message
Send a signal or message directly from the panel.
For a signal (broadcast — wakes every subscriber):
| Field | Notes |
|---|---|
| Name | Signal name to broadcast |
| Variables | JSON payload merged into the recipients' scope |
| TTL | Optional buffer expiry — ISO 8601 duration (PT30M), duration string (30m), or absolute timestamp |
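The three TTL formats can be told apart as in the sketch below, which turns a TTL into an absolute expiry time. This is an approximation for illustration: the function name is hypothetical, the ISO 8601 branch only handles the hours, minutes, and seconds components, the short form assumes a single number followed by `s`, `m`, `h`, or `d`, and the engine's own parser may accept more.

```python
import re
from datetime import datetime, timedelta, timezone

_ISO_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")
_SHORT_DURATION = re.compile(r"^(\d+)([smhd])$")
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def ttl_to_expiry(ttl: str, now: datetime) -> datetime:
    """Convert a TTL string to the moment the buffered event expires."""
    text = ttl.strip()

    match = _ISO_DURATION.match(text)
    if match and any(match.groups()):  # reject a bare "PT"
        hours, minutes, seconds = (int(g or 0) for g in match.groups())
        return now + timedelta(hours=hours, minutes=minutes, seconds=seconds)

    match = _SHORT_DURATION.match(text)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return now + timedelta(**{_UNITS[unit]: amount})

    # Otherwise treat the value as an absolute timestamp.
    # A trailing "Z" is normalised for Python versions before 3.11.
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
```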
For a message (point-to-point — wakes one matching subscriber):
| Field | Notes |
|---|---|
| Name | Message name |
| Correlation | Either a single value (number, boolean, or string) or a JSON object — leave empty for "no correlation" |
| Variables | JSON payload |
| TTL | Optional buffer expiry, same formats as signal |
For a single-value correlation, a number, boolean, or quoted string is parsed and sent with that type. Anything else is sent as a plain string.
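One way to picture the correlation typing rule is to try the input as JSON and fall back to a plain string. The sketch below is an interpretation of the rule above, not the panel's actual code; the function name is hypothetical, and treating arrays and `null` as plain strings is an assumption.

```python
import json
from typing import Any, Optional


def parse_correlation(raw: str) -> Optional[Any]:
    """Interpret the Correlation field the way the rule above describes.

    Empty input means "no correlation". Numbers, booleans, quoted strings,
    and JSON objects keep their type; everything else is sent as typed.
    """
    text = raw.strip()
    if not text:
        return None
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        return text  # unquoted text such as order-17
    if isinstance(value, (bool, int, float, str, dict)):
        return value
    return text  # assumption: arrays and null fall back to a plain string
```

So `42` is sent as a number, `"42"` (with quotes) as the string `42`, and `order-17` as a plain string.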
Variables
The instance's current variables as a collapsible JSON block. Expands to a scrollable code view.
Activity history
A condensed view of the activity table from the detail page — node, status (with badge), start time, duration. Useful when you want to glance at recent activity without leaving the diagram view.
Cancelling an instance
A Cancel action is available in two places:
- The cancel icon next to each running row in the instances list.
- The cancel button in the instance detail header (running instances only).
Both ask for confirmation and terminate the workflow immediately. Cancellation propagates to child instances spawned by call activities.
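The propagation can be modelled as a walk over the parent/child tree. The sketch below is a toy in-memory model, assuming a status map and a child map that are both hypothetical; it only illustrates that running descendants are cancelled while instances already in a terminal status are left alone.

```python
from typing import Dict, List


def cancel_tree(
    statuses: Dict[str, str],
    children: Dict[str, List[str]],
    root_id: str,
) -> List[str]:
    """Cancel root_id and every running descendant; return the cancelled IDs.

    statuses maps workflow ID to status and is updated in place.
    children maps a workflow ID to the IDs spawned by its call activities.
    """
    cancelled: List[str] = []
    pending = [root_id]
    while pending:
        workflow_id = pending.pop()
        if statuses.get(workflow_id) != "RUNNING":
            continue  # terminal instances cannot be cancelled
        statuses[workflow_id] = "CANCELED"
        cancelled.append(workflow_id)
        pending.extend(children.get(workflow_id, []))
    return cancelled


statuses = {"parent": "RUNNING", "child-a": "RUNNING", "child-b": "COMPLETED", "grandchild": "RUNNING"}
children = {"parent": ["child-a", "child-b"], "child-a": ["grandchild"]}
cancelled = cancel_tree(statuses, children, "parent")
```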
Resolving incidents
When an instance has unresolved incidents, work the diagram view:
- Click View Diagram on the instance detail page.
- In the run panel, find the incident in the Incidents section.
- Click Retry. The engine clears the incident and re-runs the activity that produced it.
What "retry" actually does
A retry does not restart the whole instance. It re-runs only the activity that the incident is bound to:
- For a service task, the engine creates a fresh job in the queue. A worker (or an operator using Pending tasks) picks it up and tries again.
- For a user task, the task is re-registered as CREATED and waits for an assignee to act on it.
- For a script task or business-rule task, the engine re-evaluates the script or the decision.
Variables and tokens elsewhere in the instance are unaffected. The activity's scope is the same as it was at the original entry — input mappings are not re-evaluated.
Adjusting input before retry
The resolve API accepts an optional variables map that's merged into the activity's scope before the retry runs. Use this when the input itself was the problem — for example, a FEEL expression failed because a value was missing or had the wrong shape, and you want to inject a correction without changing the model.
The run panel's Retry button submits an empty variables map by default. To pass corrections, call the API directly:
POST /projects/{projectID}/bpmn/instances/{workflowID}/incidents/{incidentID}/resolve
{ "variables": { "amount": 49.95 } }
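A minimal sketch of building that request with the Python standard library follows. The path and the `variables` body come from the example above; the base URL, the `Content-Type` header, and the function name are assumptions, and any authentication your deployment requires is omitted.

```python
import json
import urllib.request
from typing import Any, Dict, Optional
from urllib.parse import quote


def build_resolve_request(
    base_url: str,
    project_id: str,
    workflow_id: str,
    incident_id: str,
    variables: Optional[Dict[str, Any]] = None,
) -> urllib.request.Request:
    """Build (but do not send) the resolve call for one incident."""
    path = "/projects/{}/bpmn/instances/{}/incidents/{}/resolve".format(
        quote(project_id, safe=""),
        quote(workflow_id, safe=""),
        quote(incident_id, safe=""),
    )
    # An empty map mirrors what the Retry button submits by default.
    body = json.dumps({"variables": variables or {}}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},  # assumed header
    )


request = build_resolve_request(
    "https://bpmn.example.test/api",  # hypothetical base URL
    "demo", "wf-123", "inc-1",
    {"amount": 49.95},
)
```

Passing the request to `urllib.request.urlopen(request)` would send it; the response format is not described on this page, so it is left out here.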
When the cause persists
Resolving doesn't fix the underlying problem; it only re-runs the activity. If the cause is still there (the worker is still buggy, the input is still bad, the decision is still misconfigured), the activity fails again and a new incident is created.
If you need to abandon the instance entirely, cancel it instead of retrying.
Drilling into child instances
Call activities spawn child workflows. The detail view exposes the parent/child relationship in two places:
- The Child instances accordion section lists all children spawned by this instance.
- The diagram view's call-activity nodes are clickable when an instance is running — clicking jumps to the child's detail page.
Both paths preserve a breadcrumb in the URL, so the navigation walks back up cleanly. If you deep-link directly to a child instance (no URL breadcrumb but the API knows it has a parent), an Up to parent button appears in the header for a single hop.
What's API-only today
Two operational features exist on the API but don't yet have a UI:
| Feature | Effect |
|---|---|
| Modify token state | Insert a new token before a node (START_BEFORE_NODE) or cancel an active token (CANCEL_TOKEN) |
| Migrate to a new process version | Move running instances to a new deployed version with an explicit migration plan |
Both are exposed through the REST API. UI surfaces are on the roadmap.