#167 Zombie workflow detection: workflow_run status not synced when absurd task fails

closed high Created 2025-11-29 05:32 · Updated 2025-11-29 05:41

Description

## Problem

Discovered zombie workflow f26b990f-e2ae-475a-aa9d-bb882db4270c, which was stuck in 'running' status for 109 minutes while the underlying absurd task was already 'failed'.

## Root Cause

When an absurd task fails with a transaction error:

```
current transaction is aborted, commands ignored until end of transaction block
```

workflow_run.status is not updated to reflect the failure.

## Evidence

- absurd.r_highway_default.state = 'failed' (failed_at: 04:49:01)
- workflow_run.status = 'running' (never updated)

## Impact

- Zombie workflows show as 'running' indefinitely
- Misleading monitoring data
- Activities show as cancelled while the workflow still appears to be running

## Fix Options

1. Add a trigger on absurd run state changes to update workflow_run
2. Add a periodic job to detect and fix zombie workflows
3. Fix the root cause: ensure the workflow_run update happens in the same transaction as the absurd state change

## Related

- Activities for this workflow had errors: 'Cancelled - consumer timeout fix'
- 2/4 activities failed, 2/4 completed
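
## Sketch: zombie detection (fix option 2)

A minimal sketch of the reconciliation logic a periodic job could run, written against in-memory records rather than the real tables. The class names, the `workflow_run_id` link between an absurd run and its workflow_run, and the function names are all assumptions made for illustration; only the 'running' and 'failed' values come from the evidence above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkflowRun:
    # Stand-in for a workflow_run row (hypothetical shape).
    id: str
    status: str


@dataclass
class AbsurdRun:
    # Stand-in for an absurd run row. The workflow_run_id link is an
    # assumption; the ticket does not say how the two are joined.
    workflow_run_id: str
    state: str


def find_zombies(workflow_runs: List[WorkflowRun],
                 absurd_runs: List[AbsurdRun]) -> List[WorkflowRun]:
    """Return workflow runs still 'running' whose absurd run is 'failed'."""
    failed_ids = {r.workflow_run_id for r in absurd_runs if r.state == "failed"}
    return [w for w in workflow_runs
            if w.status == "running" and w.id in failed_ids]


def reconcile(workflow_runs: List[WorkflowRun],
              absurd_runs: List[AbsurdRun]) -> List[str]:
    """Mark each zombie as 'failed' and return the ids that were fixed."""
    zombies = find_zombies(workflow_runs, absurd_runs)
    for w in zombies:
        w.status = "failed"
    return [w.id for w in zombies]


if __name__ == "__main__":
    runs = [
        WorkflowRun("f26b990f-e2ae-475a-aa9d-bb882db4270c", "running"),
        WorkflowRun("healthy-run", "running"),
    ]
    tasks = [
        AbsurdRun("f26b990f-e2ae-475a-aa9d-bb882db4270c", "failed"),
        AbsurdRun("healthy-run", "running"),
    ]
    print(reconcile(runs, tasks))
```

The job is idempotent: a second pass finds nothing to fix, so it can run on a short interval. It only limits how long a zombie stays visible; it does not address the missed update described under Root Cause.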
