| #137 | API Key copy button not working and missing audit log<br>1. Copy button in secret display modal does not work when creating API key. 2. No audit log entry cr... | closed | critical | 2025-11-27 19:18 | - | |
| #136 | datashard: Add scan() method to Table class for reading all data<br>DataShard lacks a method to read all data from a table. Currently only current_snapshot() is availab... | closed | medium | 2025-11-27 11:59 | - | |
| #135 | Health metrics timeline recorder script for test analysis<br>Create a script that continuously records health metrics to DataShard for timeline analysis during t... | closed | medium | 2025-11-27 11:05 | - | |
| #134 | Implement Worker Registration and Capacity Management System<br>Implement enterprise-grade worker registration where workers: 1. Ping DB every 30 seconds announcing... | closed | high | 2025-11-27 09:23 | - | |
| #133 | Add health_metrics API endpoint with tenant awareness<br>Create /api/v1/health/metrics endpoint that exposes the health_metrics.sql query. Must be tenant-awa... | closed | high | 2025-11-27 07:58 | - | |
| #132 | Create health_metrics.sql query for disruption resilience monitoring<br>Create a comprehensive SQL query in engine/sql/queries/health_metrics.sql that provides real-time he... | closed | high | 2025-11-27 07:41 | - | |
| #131 | Circuit breaker + retry misalignment - tasks exhaust retries during CB cooldown<br>PROBLEM: When circuit breaker opens, tasks fail with ProtectedCallError. With no retry_strategy (def... | closed | critical | 2025-11-27 07:04 | - | |
| #130 | Add disruption resilience test to prevent SQL-bypass regression<br>Create a test that specifically validates the SQL-path bypass scenario doesn't regress. The test sh... | closed | high | 2025-11-27 06:53 | - | |
| #129 | AUDIT: Find all SQL-path bypasses that skip Python layer updates<br>During disruption testing, we discovered that when absurd.fail_run() is called directly from SQL (e.... | closed | high | 2025-11-27 06:45 | - | |
| #128 | CRITICAL: Heartbeats not persisted - use separate connection<br>During disruption testing, heartbeat tests fail because heartbeats are NOT being committed to the da... | closed | critical | 2025-11-27 06:28 | - | |
| #127 | CRITICAL: workflow_run.status not updated when task permanently fails<br>During disruption testing, tasks exhaust max_attempts and go to 'failed' state, but workflow_run.sta... | closed | critical | 2025-11-27 06:19 | - | |
| #126 | Increase max_attempts for workflow tasks - too low for disruption resilience<br>Increase max_attempts from 3 to 64 for workflow tasks to handle disruption scenarios. RISK ANALYSIS... | closed | high | 2025-11-27 06:07 | - | |
| #125 | CRITICAL: All workers use same worker_id - breaks claim isolation<br>All 8 workers use hardcoded worker_id='highway_worker_1' from config.ini, ignoring ABSURD_WORKER_ID ... | closed | critical | 2025-11-27 06:04 | - | |
| #124 | Fix: Event deduplication at storage level<br>LONG-TERM FIX for guaranteed event uniqueness. Problem: Even with idempotency keys, race conditions... | closed | critical | 2025-11-27 05:39 | - | |
| #123 | Fix: Implement incremental checkpoint commits<br>MEDIUM-TERM FIX for progress preservation during crashes. Problem: Checkpoints are committed atomic... | closed | critical | 2025-11-27 05:39 | - | |
| #122 | Fix: Add idempotency keys to event emission<br>SHORT-TERM FIX for event duplication during retries. Problem: EventLogger emits new events with inc... | closed | critical | 2025-11-27 05:39 | - | |
| #121 | Fix: Reduce claim_timeout from 30s to 10s for faster crash recovery<br>IMMEDIATE FIX for disruption resilience. Location: engine/orchestrator.py:727, engine/cli/worker.py... | closed | critical | 2025-11-27 05:39 | - | |
| #120 | Investigation: test_workflow_without_heartbeat_still_works timeout during disruption<br>Test failed with: TimeoutError: Workflow events did not stabilize within 40 seconds. Parent issue: #... | closed | critical | 2025-11-27 05:30 | - | |
| #119 | Investigation: test_heartbeat_with_many_iterations timeout during disruption<br>Test failed with: TimeoutError: Workflow did not complete within 45 seconds. Parent issue: #116. Inv... | closed | critical | 2025-11-27 05:30 | - | |
| #118 | Investigation: test_standard_sleep_wake_event timeout during disruption<br>Test failed with: TimeoutError: Workflow did not complete within 60 seconds. Parent issue: #116. Inv... | closed | critical | 2025-11-27 05:30 | - | |