Device Message Lifecycle

How a device message moves through the Redis queue system, from initial enqueue to delivery confirmation or permanent failure. Three sections: the full state machine, a side-by-side deep dive of the push (LoRaWAN) and pull (Calin API) paths, and a reference for the reaper, retries, and timeouts.

Queue State Machine

Seven delivery statuses and how messages transition between them.

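Statuses and where the message lives in each:

  • QUEUED: initial queue
  • SENT_TO_NS: queue_in_flight_to_ns
  • DELIVERED_TO_NS: GW queue (push) · awaiting_task (pull)
  • SENT_TO_DEVICE: queue_in_flight_to_device (push only)
  • DELIVERY_SUCCESSFUL: cleaned up from Redis
  • TO_RETRY: queue_awaiting_retry
  • DELIVERY_FAILED: terminal · published to subscribers

Transitions:

  • QUEUED → SENT_TO_NS: dispatch
  • SENT_TO_NS → DELIVERED_TO_NS: NS accepted · 20s
  • DELIVERED_TO_NS → SENT_TO_DEVICE: push: tx-ack
  • SENT_TO_DEVICE → DELIVERY_SUCCESSFUL: ACK + uplink
  • DELIVERED_TO_NS → DELIVERY_SUCCESSFUL: pull: task complete (skips SENT_TO_DEVICE)
  • Timeout exits: SENT_TO_NS 20s; DELIVERED_TO_NS 15min (push) or 48h (pull); SENT_TO_DEVICE 12s / bad ACK
  • TO_RETRY → QUEUED: re-enqueued after backoff (2s → doubles → max 1h, +50% jitter)
  • TO_RETRY → DELIVERY_FAILED: after 11 retries or skipRetry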
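The diagram's transitions can be sketched as a lookup table. This is a minimal TypeScript sketch, not code from the actual system: `TRANSITIONS`, `canTransition`, and the type names are hypothetical, and it assumes that timeouts route to TO_RETRY while the 48h pull max age goes straight to DELIVERY_FAILED.

```typescript
// Status names come from the diagram; everything else is illustrative.
type DeliveryStatus =
  | "QUEUED"
  | "SENT_TO_NS"
  | "DELIVERED_TO_NS"
  | "SENT_TO_DEVICE"
  | "DELIVERY_SUCCESSFUL"
  | "TO_RETRY"
  | "DELIVERY_FAILED";

type Path = "push" | "pull";

// Allowed next statuses per current status (assumed reading of the diagram).
const TRANSITIONS: Record<DeliveryStatus, DeliveryStatus[]> = {
  QUEUED: ["SENT_TO_NS"],
  SENT_TO_NS: ["DELIVERED_TO_NS", "TO_RETRY"],
  DELIVERED_TO_NS: ["SENT_TO_DEVICE", "DELIVERY_SUCCESSFUL", "TO_RETRY", "DELIVERY_FAILED"],
  SENT_TO_DEVICE: ["DELIVERY_SUCCESSFUL", "TO_RETRY"],
  TO_RETRY: ["QUEUED", "DELIVERY_FAILED"],
  DELIVERY_SUCCESSFUL: [], // terminal
  DELIVERY_FAILED: [], // terminal
};

function canTransition(from: DeliveryStatus, to: DeliveryStatus, path: Path): boolean {
  if (TRANSITIONS[from].indexOf(to) === -1) return false;
  // SENT_TO_DEVICE exists only on the push path.
  if (to === "SENT_TO_DEVICE" && path !== "push") return false;
  if (from === "DELIVERED_TO_NS") {
    // Only pull completes directly from DELIVERED_TO_NS (task complete).
    if (to === "DELIVERY_SUCCESSFUL" && path !== "pull") return false;
    // Only pull fails directly from DELIVERED_TO_NS (48h max age, assumed).
    if (to === "DELIVERY_FAILED" && path !== "pull") return false;
  }
  return true;
}
```

A guard like this could reject status updates that the diagram does not allow, for example a pull message being marked SENT_TO_DEVICE.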

Push vs Pull — Detailed Paths

The divergence happens after the network server confirms receipt. LoRaWAN relies on ChirpStack webhooks; Calin API uses a polling loop with a stored task ID.

PUSH (LoRaWAN) · queue:lorawan_network:{grid_id}

  • Enqueued: priority-sorted Redis sorted set; flood-prevention lock (2s)
  • queue_in_flight_to_ns (SENT_TO_NS · 20s timeout): ChirpStack enqueueDeviceRequest
  • queue_in_flight_to_gw (DELIVERED_TO_NS · 15min timeout): reaper checks ChirpStack remote queue; extends timeout if item still present
  • webhook: downlink tx-ack (downlinkId present)
  • queue_in_flight_to_device (SENT_TO_DEVICE · 12s timeout): webhook: ACK event + uplink event
  • ACK / Uplink Correlation: 10s window · keyed by deduplicationId; both events paired → result resolved
  • DELIVERY_SUCCESSFUL: message cleaned up from Redis

PULL (Calin API) · queue:gateway:{gateway_id}

  • Enqueued: priority-sorted Redis sorted set; max 5 concurrent per gateway
  • queue_in_flight_to_ns (SENT_TO_NS · 20s timeout): Calin API creates task → returns task ID
  • queue_awaiting_task:{V1|V2} (DELIVERED_TO_NS · task_id stored): cron sweep every 5s; per-msg delay: 10s → 15s → 20s → 30s (increases with message age); adapter fetches task status from API
  • Task Response: null → still pending | success | failure (V1: Status field · V2: status code 0/1/2/3)
  • null: re-score for next poll
  • EXECUTION_SUCCESS or EXECUTION_FAILURE → DELIVERY_SUCCESSFUL (execution ok or failed)
  • if age > 48h → DELIVERY_FAILED: permanent failure, no retry
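The ACK / uplink correlation step on the push path can be illustrated with a small in-memory sketch. It assumes events arrive one at a time with a receive timestamp and are paired when an ACK and an uplink with the same deduplicationId arrive within 10s of each other. The `Correlator` class and its method are hypothetical names; the real system presumably keeps this state in Redis rather than in process memory.

```typescript
type EventKind = "ack" | "uplink";

interface PendingEvent {
  kind: EventKind;
  receivedAtMs: number;
}

// 10s window from the diagram.
const CORRELATION_WINDOW_MS = 10000;

class Correlator {
  // Waiting events, keyed by deduplicationId.
  private pending: Record<string, PendingEvent> = {};

  // Returns true when this event completes an ACK + uplink pair in time.
  onEvent(deduplicationId: string, kind: EventKind, nowMs: number): boolean {
    const other = this.pending[deduplicationId];
    if (
      other !== undefined &&
      other.kind !== kind &&
      nowMs - other.receivedAtMs <= CORRELATION_WINDOW_MS
    ) {
      delete this.pending[deduplicationId];
      return true;
    }
    // First event, repeated kind, or stale partner: keep only this event.
    this.pending[deduplicationId] = { kind: kind, receivedAtMs: nowMs };
    return false;
  }
}
```

In this sketch a late partner simply replaces the stale entry. How the actual system treats an unpaired event after the window closes is not stated here.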

Reaper, Retry & Timeouts

Reaper Cron

Runs every 2 seconds in production. In each pass:

  • Expires timed-out messages in queue_in_flight_to_ns (20s)
  • Scans push queues for GW (15min) and device (12s) timeouts; extends the GW timeout if the message is still on the ChirpStack queue
  • Scans pull queues for max-age violations (48h) → permanent failure
  • Releases messages in queue_awaiting_retry whose backoff has elapsed
  • Triggers a new distribute-to-NS cycle
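
The retry-release step of a reaper pass can be sketched as a scan over entries scored by release time. This is an in-memory illustration only: it assumes queue_awaiting_retry is scored by the timestamp at which the backoff elapses, and `RetryEntry` and `releaseDue` are hypothetical names. Against Redis, the equivalent would roughly be a range-by-score read up to the current time.

```typescript
interface RetryEntry {
  messageId: string;
  releaseAtMs: number; // assumed score: when the backoff elapses
}

interface ReleaseResult {
  due: string[]; // message ids ready to be re-enqueued
  waiting: RetryEntry[]; // entries whose backoff has not elapsed yet
}

function releaseDue(entries: RetryEntry[], nowMs: number): ReleaseResult {
  const due: string[] = [];
  const waiting: RetryEntry[] = [];
  // Sort a copy by release time so due ids come out earliest-first.
  const sorted = entries.slice().sort((a, b) => a.releaseAtMs - b.releaseAtMs);
  for (const entry of sorted) {
    if (entry.releaseAtMs <= nowMs) {
      due.push(entry.messageId);
    } else {
      waiting.push(entry);
    }
  }
  return { due: due, waiting: waiting };
}
```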

Retry Mechanics

Most failures go through retryOrFail before reaching DELIVERY_FAILED.

  • Max retries: 11 (12 total attempts)
  • Backoff: 2s base, doubles each retry, capped at 1 hour
  • Jitter: 0–50% random offset added to reduce thundering herd
  • skipRetry: bad token, device not in ChirpStack, unsupported command → immediate final failure
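
The backoff rules above can be worked through in a short sketch. It assumes the first retry waits the 2s base delay and that jitter is applied after the 1-hour cap; neither detail is confirmed here. `backoffMs` and `decideRetry` are hypothetical names, not the system's retryOrFail.

```typescript
const BASE_DELAY_MS = 2000; // 2s base
const MAX_DELAY_MS = 60 * 60 * 1000; // 1 hour cap
const MAX_RETRIES = 11; // 12 total attempts
const MAX_JITTER_FRACTION = 0.5; // 0-50% added on top

// retryNumber is 1 for the first retry. `random` is injectable for testing.
function backoffMs(retryNumber: number, random: () => number = Math.random): number {
  const exponential = BASE_DELAY_MS * Math.pow(2, retryNumber - 1);
  const capped = Math.min(exponential, MAX_DELAY_MS);
  // Assumption: jitter is added after capping.
  const jitter = capped * MAX_JITTER_FRACTION * random();
  return Math.round(capped + jitter);
}

// Decides the next status after a failure.
function decideRetry(retriesSoFar: number, skipRetry: boolean): "TO_RETRY" | "DELIVERY_FAILED" {
  if (skipRetry || retriesSoFar >= MAX_RETRIES) return "DELIVERY_FAILED";
  return "TO_RETRY";
}
```

Under these assumptions the delays before jitter run 2s, 4s, 8s, and so on up to 2048s (about 34 minutes) on the 11th retry, so the 1-hour cap would only apply if doubling started from a larger value or the retry limit were raised.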

Timeout Reference

  • NS accept: 20s — both push and pull
  • GW delivery (push): 15 min — extendable if still on ChirpStack queue
  • Device response (push): 12s after tx-ack
  • Correlation window: 10s — ACK and uplink must arrive within 10s of each other
  • Max age (pull): 48h — permanent failure, no retry
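
The timeout values above can be collected into a single lookup. This is a minimal sketch with hypothetical names (`timeoutMs`, `isExpired`); it treats the 48h pull limit as just another deadline, although in the text it is a maximum message age rather than a per-state timer.

```typescript
type DeliveryPath = "push" | "pull";
type TimedStatus = "SENT_TO_NS" | "DELIVERED_TO_NS" | "SENT_TO_DEVICE";

const SECOND_MS = 1000;
const MINUTE_MS = 60 * SECOND_MS;
const HOUR_MS = 60 * MINUTE_MS;

// Returns the limit in milliseconds, or null when the status does not apply.
function timeoutMs(status: TimedStatus, path: DeliveryPath): number | null {
  if (status === "SENT_TO_NS") return 20 * SECOND_MS; // both paths
  if (status === "DELIVERED_TO_NS") {
    // Push: GW delivery (extendable). Pull: max message age.
    return path === "push" ? 15 * MINUTE_MS : 48 * HOUR_MS;
  }
  // SENT_TO_DEVICE exists only on the push path.
  return path === "push" ? 12 * SECOND_MS : null;
}

// True once more than limitMs has passed since startMs.
function isExpired(startMs: number, nowMs: number, limitMs: number): boolean {
  return nowMs - startMs > limitMs;
}
```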