External Workers

External workers are the services that do the actual work for BPMN service tasks (and worker-mode send and manual tasks). The engine doesn't run task code itself — it dispatches jobs to a queue, and any service that can speak HTTP can pull from the queue, do the work, and report back.

This page is the worker-author guide: how the dispatch model works, what the HTTP endpoints look like, and how to monitor workers from the UI.

How dispatch works

┌──────────────────┐                ┌──────────────────┐
│ Process instance │  service task  │  External jobs   │
│  reaches a node  │ ─────────────► │ queue (PENDING)  │
└──────────────────┘                └────────┬─────────┘
                                             │ long-poll
                                             ▼
                                    ┌──────────────────┐
                                    │      Worker      │
                                    │  (any language)  │
                                    └────────┬─────────┘
                                             │
                                complete ◄───┴───► error
                                             │
                                             ▼
                                    ┌──────────────────┐
                                    │ Process resumes  │
                                    │ (or boundary     │
                                    │  error fires,    │
                                    │  or instance     │
                                    │  gets incident)  │
                                    └──────────────────┘

A job is created when the engine reaches an activity that needs external work — a service task with a quantum:taskDefinition, a send task in worker mode, a manual task with a task definition, or a generic task. The job carries the task type, the resolved input variables, the design-time headers, and an execution key that the worker uses when finalising it.

Workers select work by task type: each worker registers itself for one or more types (payment-worker, email-sender, etc.) and only sees jobs for those types.

Designing tasks for workers

The producing side of this is documented under Tasks. The fields that matter for the worker contract:

| Field | Notes |
| --- | --- |
| quantum:taskDefinition type | The selector workers poll on. Required; without it deployment fails |
| quantum:taskDefinition retries | How many error reports the engine accepts before surfacing the failure (see Failure handling) |
| quantum:ioMapping inputs | Variables prepared before the job is created; the worker sees them in variables |
| quantum:ioMapping outputs | Applied after the worker completes; extract fields from the worker's output back into the parent scope |
| quantum:taskHeaders | Static design-time metadata passed to the worker alongside each job. Headers are not merged back into the instance |

The worker HTTP API

All endpoints live under the project: /projects/{projectID}/bpmn/external-jobs. Authentication is the same as the rest of the API — a bearer token in the Authorization header. For service-account tokens, see Authentication.

Poll for jobs

POST /projects/{projectID}/bpmn/external-jobs/poll

Long-polls for jobs of one task type. Returns a batch of jobs as soon as any are available, or 204 No Content once the timeout elapses with no work.

| Field | Required | Notes |
| --- | --- | --- |
| taskType | yes | The selector this worker handles |
| clientID | yes | Stable identifier for this worker process. Used to attribute job locks and to count active workers |
| lockDuration | no | Exclusive lock on each acquired job. Duration string (5s, 2m). Defaults to 30s |
| timeout | no | How long to wait before returning 204. Duration string. Defaults to 30s |
| maxJobs | no | Maximum jobs to acquire in one call. Default 1, capped at 100 |

Each returned job looks like this:

{
  "id": "…",
  "executionKey": "wf-abc:node-charge:1",
  "workflowID": "wf-abc",
  "nodeID": "charge-card",
  "taskType": "payment-worker",
  "variables": { "order_id": "ORD-123", "amount": 49.95 },
  "headers": { "currency": "USD" },
  "retries": 3,
  "status": "PENDING",
  "lockedBy": "worker-eu-1",
  "lockExpiresAt": "2026-05-02T12:34:56Z",
  "createdAt": "2026-05-02T12:34:26Z"
}

The executionKey is the path parameter you'll use when finalising the job. The workflowID goes in the complete body.

curl -X POST "$API/projects/$PROJECT/bpmn/external-jobs/poll" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "taskType": "payment-worker",
    "clientID": "worker-eu-1",
    "lockDuration": "30s",
    "timeout": "20s",
    "maxJobs": 5
  }'
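
The same poll can be issued from inside a worker process. The following is a minimal Python sketch using only the standard library. The helper names `build_poll_request` and `poll_once` are illustrative, not part of any SDK, and the sketch assumes a 200 response body is a JSON array of job objects, which this page does not spell out.

```python
import json
import urllib.request


def build_poll_request(api, project, token, task_type, client_id,
                       lock_duration="30s", timeout="20s", max_jobs=5):
    """Build (but do not send) the long-poll request shown in the curl example."""
    body = {
        "taskType": task_type,
        "clientID": client_id,
        "lockDuration": lock_duration,
        "timeout": timeout,
        "maxJobs": max_jobs,
    }
    return urllib.request.Request(
        f"{api}/projects/{project}/bpmn/external-jobs/poll",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def poll_once(request, http_timeout=40):
    """Send the poll; return a list of jobs, or [] on 204 No Content."""
    # The client-side socket timeout should exceed the long-poll timeout,
    # otherwise the client gives up before the server answers.
    with urllib.request.urlopen(request, timeout=http_timeout) as resp:
        if resp.status == 204:
            return []
        # Assumption: a 200 response body is a JSON array of job objects.
        return json.loads(resp.read().decode("utf-8"))
```

A worker loop would call `poll_once(build_poll_request(...))` repeatedly and hand each returned job to its handler.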

Extend the lock (heartbeat)

POST /projects/{projectID}/bpmn/external-jobs/{executionKey}/heartbeat

Refreshes the exclusive lock on a job the caller acquired. Call this periodically when a job takes longer than the original lockDuration. Other workers can't claim the job until the lock expires or the job is completed.

| Field | Required | Notes |
| --- | --- | --- |
| clientID | yes | Must match the clientID that originally polled the job |
| lockDuration | no | New lock window from now. Defaults to 30s |

A 404 from this endpoint means the job is no longer yours — the lock expired and another worker took over, the instance was cancelled, or the job already terminated.
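
A hedged Python sketch of a heartbeat call that treats the 404 case as "the job is no longer ours" rather than as an unexpected failure. `heartbeat_request` and `extend_lock` are illustrative names; leaving `:` unescaped in the path mirrors the sample execution key and is an assumption about how the server parses the path.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def heartbeat_request(api, project, token, execution_key, client_id,
                      lock_duration="30s"):
    """Build the heartbeat request for a job this client polled."""
    # Assumption: ':' may stay literal in the path, as in the sample key.
    key = urllib.parse.quote(execution_key, safe=":")
    body = {"clientID": client_id, "lockDuration": lock_duration}
    return urllib.request.Request(
        f"{api}/projects/{project}/bpmn/external-jobs/{key}/heartbeat",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extend_lock(request, http_timeout=10):
    """Return True if the lock was extended, False on 404."""
    try:
        with urllib.request.urlopen(request, timeout=http_timeout):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # The job is no longer ours; the handler should stop working on it.
            return False
        raise  # any other status is left for the caller to handle
```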

Complete a job

POST /projects/{projectID}/bpmn/external-jobs/{executionKey}/complete

Finalises the job successfully. The supplied variables are merged into the originating instance's scope and the process resumes after the service task.

| Field | Required | Notes |
| --- | --- | --- |
| workflowID | yes | The workflow ID returned in the poll response |
| variables | no | Output map. Goes through any output mappings declared on the service task |

curl -X POST "$API/projects/$PROJECT/bpmn/external-jobs/$KEY/complete" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "workflowID": "wf-abc",
    "variables": { "transactionId": "txn-789", "approved": true }
  }'

complete is idempotent on the execution key — a second call after success is a no-op.
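
In a worker, the complete call is usually built straight from the polled job, since the job already carries both the executionKey (path) and the workflowID (body). A minimal Python sketch, assuming the job is the dict decoded from the poll response; `build_complete_request` is an illustrative name.

```python
import json
import urllib.parse
import urllib.request


def build_complete_request(api, project, token, job, output=None):
    """Build the complete call for a polled job; `output` becomes `variables`."""
    # Assumption: ':' may stay literal in the path, as in the sample key.
    key = urllib.parse.quote(job["executionKey"], safe=":")
    body = {"workflowID": job["workflowID"]}
    if output:
        # variables is optional, so it is omitted when there is nothing to report.
        body["variables"] = output
    return urllib.request.Request(
        f"{api}/projects/{project}/bpmn/external-jobs/{key}/complete",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request is then a single `urllib.request.urlopen(request)` call; because complete is idempotent on the execution key, retrying that call after a network error should be harmless.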

Fail a job (BPMN error)

POST /projects/{projectID}/bpmn/external-jobs/{executionKey}/error

Reports a worker-side failure with a BPMN error code. The error code is raised as a BPMN error on the originating service task — see Failure handling for what happens next.

| Field | Required | Notes |
| --- | --- | --- |
| errorCode | yes | BPMN error code; matched against any errorEventDefinition errorRef on the task's boundary handlers |
| variables | no | Variables submitted alongside the error. Available to error-boundary handlers |
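
One way to keep error codes consistent is to attach them to the worker's own exception types and derive the request body from the exception. The sketch below is illustrative only: `PaymentDeclined`, the code `PAYMENT_DECLINED`, and the fallback `WORKER_FAILURE` are hypothetical names, and the real codes must match whatever the process model's error boundary events reference.

```python
class PaymentDeclined(Exception):
    """Hypothetical domain exception raised by a payment handler."""
    error_code = "PAYMENT_DECLINED"  # hypothetical BPMN error code


def error_body(exc, variables=None, default_code="WORKER_FAILURE"):
    """Build the JSON body for the error endpoint from a caught exception."""
    # Exceptions without an error_code attribute fall back to a generic code.
    body = {"errorCode": getattr(exc, "error_code", default_code)}
    if variables:
        # variables is optional; include it only when there is context to pass on.
        body["variables"] = variables
    return body
```

A handler wrapper would catch the exception, call `error_body(exc, {...})`, and POST the result to the job's error endpoint.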

Batch endpoints

For high-throughput workers that pulled maxJobs > 1:

POST /projects/{projectID}/bpmn/external-jobs/batch/complete
POST /projects/{projectID}/bpmn/external-jobs/batch/error

Each request body wraps an items array. Per-item status is returned so a partial failure doesn't sink the whole batch:

| Item status | Meaning |
| --- | --- |
| completed | The job was finalised successfully |
| failed | The job exhausted its retries and surfaced as an incident |
| requeued | A retry budget remained; the job is back in PENDING |
| error | The per-item operation itself failed; the job's state is unchanged |
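
A worker typically needs to sort the per-item results so that only the items whose operation failed are resubmitted. The Python sketch below assumes each result item is an object with `executionKey` and `status` fields; the page does not document the response shape, so treat those field names as placeholders.

```python
from collections import defaultdict

# The four per-item statuses listed in the table above.
KNOWN_STATUSES = ("completed", "failed", "requeued", "error")


def group_by_status(results):
    """Group execution keys by per-item status."""
    grouped = defaultdict(list)
    for item in results:
        status = item["status"]  # assumed field name
        if status not in KNOWN_STATUSES:
            raise ValueError(f"unexpected item status: {status!r}")
        grouped[status].append(item["executionKey"])  # assumed field name
    return dict(grouped)


def keys_to_resubmit(results):
    """Items with status 'error' left the job unchanged, so they can be retried."""
    return group_by_status(results).get("error", [])
```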

Failure handling

When a worker calls error:

  • The error code is raised as a BPMN error on the producing activity.
  • Any matching error boundary event on the activity (or up the scope chain) fires and routes the token through it.
  • The retry budget set by quantum:taskDefinition retries governs whether the job is requeued for another attempt or surfaces as an incident on the instance.

If the worker simply lets the lock expire without calling complete, error, or heartbeat, the engine treats the job as available and another worker can poll it.

If the originating instance is cancelled while a job is in flight, the job moves to CANCELED with a cancelReason. A subsequent complete or error against the cancelled job is rejected.

Worker design considerations

A few things that matter when you're writing a worker:

  • Pick a stable clientID. It's used to attribute locks and to count active workers in the UI. A different clientID per process or replica is fine; per-poll randomness defeats both.
  • Tune the lock duration to your work. Shorter is better — it minimises retry latency when a worker dies — but never shorter than your handler's worst-case runtime. If you can't bound it, heartbeat.
  • Run the heartbeat separately from the actual work. A worker whose only thread is blocked on I/O can't heartbeat either, which means the lock expires and another worker double-processes the job. Dispatch heartbeats from a parallel goroutine / thread / coroutine.
  • complete is idempotent on executionKey, so retrying the complete call itself is safe. That does not prevent redelivery: if you crash after the side-effecting work but before calling complete, the lock expires and the job is polled again, so make the side effect itself safe to repeat.
  • Treat headers as configuration. They're set at design time and shouldn't carry per-instance values — that's what variables is for.
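
The heartbeat advice above can be sketched in Python with a background thread. This is a minimal illustration, not a reference implementation: `run_with_heartbeat` is an illustrative name, and `send_heartbeat` is a hypothetical callable that posts to the heartbeat endpoint and returns False when the server answers 404.

```python
import threading


def run_with_heartbeat(handler, job, send_heartbeat, interval_seconds):
    """Run handler(job) while a background thread keeps the lock alive.

    Returns (handler result, lost_lock) where lost_lock is True if a
    heartbeat reported that the job is no longer ours.
    """
    stop = threading.Event()
    lost_lock = threading.Event()

    def beat():
        # Event.wait returns False on timeout and True once stop is set,
        # so the loop sends one heartbeat per interval until the handler ends.
        while not stop.wait(interval_seconds):
            if not send_heartbeat(job):
                lost_lock.set()
                return

    thread = threading.Thread(target=beat, daemon=True)
    thread.start()
    try:
        result = handler(job)
    finally:
        # Always stop the heartbeat, even if the handler raised.
        stop.set()
        thread.join()
    return result, lost_lock.is_set()
```

Choose an interval comfortably shorter than lockDuration so that one delayed heartbeat doesn't let the lock lapse. When `lost_lock` comes back True, another worker may already hold the job.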

Operating workers from the UI

The BPMN Jobs view (in the BPMN navigation) is the live operator dashboard.

Queue depth

One bar per task type, with each bar's width scaled relative to the largest pending count. Click a bar to filter the list to that type with status PENDING.

Active workers

A badge per task type with the count of workers currently long-polling. Colours:

| Colour | Meaning |
| --- | --- |
| Green | Workers are polling |
| Gray | No workers polling, no pending jobs of this type |
| Red | Pending jobs exist but no workers are polling for them (your stuck-queue indicator) |

The view counts workers that are currently connected. Long-polls that have timed out or disconnected aren't counted.
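
For a custom dashboard or alert, the badge rule can be reproduced from a worker count and a pending count per task type. A small Python sketch of the rule as described in the table; `badge_colour` is an illustrative name and the lowercase strings are just labels chosen for this example.

```python
def badge_colour(polling_workers, pending_jobs):
    """Map a task type's worker and pending counts to a badge colour."""
    if polling_workers > 0:
        return "green"  # workers are polling
    if pending_jobs > 0:
        return "red"    # pending jobs but nobody polling: stuck queue
    return "gray"       # no workers and nothing pending
```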

Job list

Filterable table showing one row per job:

| Column | Description |
| --- | --- |
| Task type | The selector for the job |
| Workflow | Workflow ID of the originating instance |
| Node | The activity that produced the job |
| Status | PENDING, COMPLETED, FAILED, or CANCELED |
| Retries | Remaining retry budget |
| Locked by | The clientID currently holding the job |
| Created / Age | When the job was queued |

Click a row to expand it and see the cancel reason (cancelled jobs), lock expiry, input variables, and headers. The icon at the right of each row jumps to the originating instance.

Filters: time range (last hour / 24h / 7d, defaults to 1h to keep queries fast), task type, status, workflow ID. The list auto-refreshes every 3 seconds while any visible row is PENDING.

Manual completion from the UI

For pending jobs, operators can complete or fail a job by hand from the instance detail run panel — useful when you're testing a process before there's a worker, or recovering from a stuck worker.

Read-only endpoints for listing jobs and inspecting the queue:

| Endpoint | Use |
| --- | --- |
| GET /projects/{projectID}/bpmn/external-jobs | Paginated list with taskType, status, workflowID, createdAfter filters. The Jobs view uses this; it's also the right thing to call from a custom dashboard |
| GET /projects/{projectID}/bpmn/external-jobs/queue-depth | One entry per task type with at least one pending job |
| GET /projects/{projectID}/bpmn/external-jobs/workers | One entry per task type with the live count of connected workers |
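
A custom dashboard would start by building the list URL with its filters. The Python sketch below assumes the four filters are passed as query-string parameters under the names given above; the pagination parameters are not documented on this page, so they are left out. `jobs_list_url` is an illustrative name.

```python
import urllib.parse

# Filter names taken from the endpoint description above.
ALLOWED_FILTERS = ("taskType", "status", "workflowID", "createdAfter")


def jobs_list_url(api, project, **filters):
    """Build the job-list URL, rejecting filters the page does not mention."""
    unknown = set(filters) - set(ALLOWED_FILTERS)
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    # Emit parameters in a fixed order so the URL is stable across calls.
    params = {name: filters[name] for name in ALLOWED_FILTERS if name in filters}
    base = f"{api}/projects/{project}/bpmn/external-jobs"
    query = urllib.parse.urlencode(params)
    return f"{base}?{query}" if query else base
```

The request itself needs the same bearer token in the Authorization header as the worker endpoints.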