#325 CRITICAL: Absurd retry creates new run_id while keeping same workflow_run_id, causing failed workflows to appear completed

closed critical bug data-integrity orchestrator retry Created 2025-12-09 05:51 · Updated 2025-12-09 08:19

Description

## Bug Summary

When an Absurd task fails and retries, a NEW run_id is created (e.g., attempt=2) but the workflow_run_id from headers is preserved. If the retry succeeds, it overwrites the workflow_run status to 'completed', hiding the original failure.

## Evidence

### Timeline

1. 05:39:39 - First run (attempt=1): **workflow_failed** with 'curl failed with code 60'
2. 05:40:12 - Second run (attempt=2): **workflow_completed** with IP address

### Database state

- absurd.r_highway_default: two runs with different run_ids but the same task_id
- workflow_run: status='completed' when it should show the failure history
- Event log shows both failure AND success for the same workflow_run_id

## Root Cause

In orchestrator.py:1077-1084, `_update_workflow_run_if_exists_first_start` clears `error` and `completed_at` on retry.

## Impact

- Failed workflows can appear as succeeded
- Error messages from original failures are lost
- Event log shows both failure AND success for the same workflow_run_id
- workflow_run.absurd_run_id still points to the FIRST (failed) run

## Suggested Fix

1. Do NOT clear error/completed_at on retry start
2. Consider making workflow_run immutable once failed
3. Or track final_attempt_success separately
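A minimal sketch of suggested fix 1, assuming a SQLite-style `workflow_run` table with the columns named in this report. The schema, the helper name `update_workflow_run_on_start`, and the literal values are hypothetical; the point is that the retry-start update touches only `status` and leaves `error`/`completed_at` from the failed first attempt intact:

```python
import sqlite3

def make_db():
    # Hypothetical minimal schema matching the columns cited in the report.
    con = sqlite3.connect(":memory:")
    con.execute("""
        CREATE TABLE workflow_run (
            workflow_run_id TEXT PRIMARY KEY,
            status          TEXT NOT NULL,
            error           TEXT,
            completed_at    TEXT
        )
    """)
    return con

def update_workflow_run_on_start(con, workflow_run_id):
    # Fixed behaviour: mark the run as running again on retry, but do NOT
    # clear error/completed_at left by a previous failed attempt.
    con.execute(
        "UPDATE workflow_run SET status = 'running' WHERE workflow_run_id = ?",
        (workflow_run_id,),
    )

con = make_db()
con.execute("INSERT INTO workflow_run VALUES ('wf-1', 'running', NULL, NULL)")

# Attempt 1 fails and records the error.
con.execute(
    "UPDATE workflow_run SET status = 'failed', error = ?, completed_at = ? "
    "WHERE workflow_run_id = 'wf-1'",
    ("curl failed with code 60", "05:39:39"),
)

# Attempt 2 starts: the prior failure details survive.
update_workflow_run_on_start(con, "wf-1")
row = con.execute(
    "SELECT status, error, completed_at FROM workflow_run "
    "WHERE workflow_run_id = 'wf-1'"
).fetchone()
```

With this guard, even if attempt 2 later completes, the original error text is still available for the event log; the alternative fixes (an immutable-once-failed row, or a separate `final_attempt_success` column) would preserve the history more explicitly.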
