.issue.db
/highway-workflow-engine
Edit Issue #126
Update issue details
Title *
Description
Increase max_attempts from 3 to 64 for workflow tasks to handle disruption scenarios.

RISK ANALYSIS:
- With 3 attempts and 30s claim_timeout, a task has ~90s total before permanent failure
- Under disruption, 3 workers could crash sequentially, exhausting retries in <2 minutes
- With 64 attempts, tasks can survive extended outages (up to ~32 minutes with 30s timeout)

CONFIGURATION:
- max_attempts: 64 (from 3)
- claim_timeout: 15s (reduced from 30s for faster failover)
- Total survival window: 64 * 15s = 16 minutes of continuous disruption

TRADEOFF:
- Higher retry count = longer before permanent failure detection
- But essential for enterprise resilience during rolling deployments, network partitions, etc.

Files to update:
- Default max_attempts in workflow submission
- Worker claim_timeout in systemd service
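
The survival-window figures in the description can be checked with a short sketch. It assumes the simple model the issue implies: every attempt is lost to a worker crash, and the task is only reclaimed once claim_timeout expires. The function name `survival_window_seconds` is hypothetical and is not part of the workflow engine.

```python
def survival_window_seconds(max_attempts: int, claim_timeout_s: float) -> float:
    """Worst-case time before permanent failure.

    Assumption (not confirmed by the engine's code): each attempt is lost
    to a crash and the task becomes claimable again only after
    claim_timeout_s, so the window is attempts * timeout.
    """
    return max_attempts * claim_timeout_s


# The three scenarios mentioned in the issue description.
scenarios = {
    "current (3 attempts, 30s timeout)": (3, 30),
    "64 attempts, 30s timeout": (64, 30),
    "proposed (64 attempts, 15s timeout)": (64, 15),
}

for label, (attempts, timeout) in scenarios.items():
    window = survival_window_seconds(attempts, timeout)
    print(f"{label}: {window:.0f}s = {window / 60:.1f} min")
```

Under this model, the proposed settings give a window of 960s (16 minutes), compared with 90s today. Halving claim_timeout halves the window for any given max_attempts, which is why the 64-attempt figure drops from ~32 minutes to 16.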