/highway-workflow-engine
Edit Issue #114
Description
I am going to run pytest in another terminal. You need to write a bash script, scripts/disrupt.sh, that first lists all of the highway-* services except the API (including the internal worker, activity worker, ...) and then randomly kills one or two of them every 10 seconds. The system should keep operating at 60% of capacity. Kill the services by PID, and make sure they are restarted. Do this for about 3 minutes. At the end of the script, all services must be back to normal. Again, the API must not be killed; the rest can be.

The script must be comprehensive and not specific to this system's settings, so that any deployment can run the disruption. After you create the script, I will run it and report the results back to you. The goal is that all tests pass while workers are available, but at lower capacity.

---

IMPLEMENTATION COMPLETE:

Script: scripts/disrupt.sh

Protected services (never killed):
- highway-api.service
- highway-scheduler.service

Killable services (auto-discovered):
- highway-worker@{1..8}.service (8 workers)
- highway-activity-worker.service
- highway-internal-worker.service
- highway-frontend.service
- dsl-generator.service

Capacity math:
- 13 total killable services
- 60% minimum = ceil(13 * 0.6) = 8 must stay running
- Max 5 can be down simultaneously
- Kills 1-2 per cycle (random)

Chaos pattern:
- 10-second intervals
- 3-minute duration = 18 cycles
- PID-based kill -9
- systemd auto-restart handles recovery
- Final cleanup verifies all services are back
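The capacity math and a single kill cycle described above could be sketched roughly as follows. This is not the contents of scripts/disrupt.sh, only an illustration under stated assumptions: the function names, the PREFIX and PROTECTED variables, and the DRY_RUN switch are all hypothetical, and recovery is assumed to come from each unit's own systemd Restart= policy. Note that this sketch only discovers units matching the prefix.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; names and variables here are hypothetical.
PREFIX="${PREFIX:-highway-}"
PROTECTED="${PROTECTED:-highway-api.service highway-scheduler.service}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print what would be killed, do not kill

# Minimum services that must stay up: ceil(total * pct / 100) in integer math.
min_running() {
  echo $(( ($1 * $2 + 99) / 100 ))
}

# How many services may be down at the same time.
max_down() {
  echo $(( $1 - $(min_running "$1" "$2") ))
}

# Number of kill cycles for a duration and an interval, both in seconds.
cycle_count() {
  echo $(( $1 / $2 ))
}

# Cap the requested kills so the capacity floor is never crossed.
# Args: requested, total, pct, currently_down
kill_budget() {
  local want=$1 allowed
  allowed=$(( $(max_down "$2" "$3") - $4 ))
  [ "$allowed" -lt 0 ] && allowed=0
  [ "$want" -gt "$allowed" ] && want=$allowed
  echo "$want"
}

# List running units matching the prefix, skipping protected ones.
list_killable() {
  systemctl list-units --type=service --state=running --no-legend --plain "${PREFIX}*" \
    | awk '{print $1}' \
    | while read -r unit; do
        case " $PROTECTED " in
          *" $unit "*) ;;            # protected: skip
          *) echo "$unit" ;;
        esac
      done
}

# Kill one unit by its main PID (or only report it when DRY_RUN=1).
kill_unit() {
  local unit=$1 pid
  pid=$(systemctl show -p MainPID --value "$unit")
  [ "${pid:-0}" -gt 0 ] || return 0
  if [ "$DRY_RUN" = 1 ]; then
    echo "would kill $unit (pid $pid)"
  else
    kill -9 "$pid"
  fi
}

# One cycle: pick 1 or 2 random victims, capped by the kill budget.
# Args: total, pct, currently_down
run_cycle() {
  local n
  n=$(kill_budget "$(shuf -i 1-2 -n 1)" "$1" "$2" "$3")
  [ "$n" -gt 0 ] || return 0
  list_killable | shuf -n "$n" | while read -r unit; do kill_unit "$unit"; done
}
```

With the figures reported above, min_running 13 60 prints 8, max_down 13 60 prints 5, and cycle_count 180 10 prints 18. Keeping the arithmetic in small functions, separate from the systemctl calls, lets the capacity floor be checked without touching any service.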