#186 CRITICAL: TimeoutService infinite loop on race condition

closed critical Created 2025-11-30 00:20 · Updated 2025-11-30 00:31

Description

TimeoutService enters an infinite tight loop when a race condition is detected in _fail_single_run(). The while True drain loop in _find_stuck_runs() keeps returning the same run_id because the run remains in the running state after the race condition. This caused 132,541 retries within seconds and crashed the system. Fix: track seen_run_ids, add a MAX_STUCK_RUNS_PER_CYCLE limit, and add backoff on errors.
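A minimal sketch of how the three parts of the fix could fit together is shown here. It is not the actual TimeoutService code: the function drain_stuck_runs, the RaceConditionError exception, the two callbacks, the limit of 100, and the backoff values are all assumptions made for illustration. Only the names seen_run_ids and MAX_STUCK_RUNS_PER_CYCLE come from the ticket.

```python
import time
from typing import Callable, Optional, Set

# Assumed value; the ticket names the constant but not its size.
MAX_STUCK_RUNS_PER_CYCLE = 100


class RaceConditionError(Exception):
    """Hypothetical error raised when another worker changed the run first."""


def drain_stuck_runs(
    find_next_stuck_run: Callable[[Set[str]], Optional[str]],
    fail_single_run: Callable[[str], None],
    max_runs: int = MAX_STUCK_RUNS_PER_CYCLE,
    base_backoff: float = 0.1,
    max_backoff: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Fail stuck runs for one cycle and return how many were failed.

    find_next_stuck_run receives the ids already handled in this cycle and
    should return a stuck run id outside that set, or None when none remain.
    """
    seen_run_ids: Set[str] = set()
    failed = 0
    consecutive_errors = 0

    # Hard cap: one cycle never touches more than max_runs distinct runs.
    while len(seen_run_ids) < max_runs:
        run_id = find_next_stuck_run(seen_run_ids)
        # Stop when nothing is left, or when the finder hands back a run we
        # already tried (it is still "running" after a race).
        if run_id is None or run_id in seen_run_ids:
            break
        seen_run_ids.add(run_id)

        try:
            fail_single_run(run_id)
            failed += 1
            consecutive_errors = 0
        except RaceConditionError:
            # Exponential backoff so repeated races cannot spin the CPU.
            consecutive_errors += 1
            delay = min(base_backoff * 2 ** (consecutive_errors - 1), max_backoff)
            sleep(delay)

    return failed
```

With this shape, a run that stays in the running state after a race is attempted at most once per cycle, and the next cycle can pick it up again. The injectable sleep parameter is only there to keep the sketch easy to test.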
