#723
## Objective

Deploy Highway Workflow Engine on Kubernetes for production use.

## Current State

- Docker Compose: Working (local dev)
- Kubernetes: Not supported

## Required Changes

### 1. Secrets Management

**Current**: .env file with VAULT_TOKEN_ADMIN, Vault client reads tokens from env

**K8s Options**:
- Option A: Vault Agent sidecar (recommended) - K8s auth method, secrets injected as files
- Option B: K8s Secrets + External Secrets Operator - map K8s secrets to config paths
- Option C: Environment variable injection from K8s Secrets

**Code Changes Needed**:
- Support file-based secrets at /vault/secrets/* path
- Support HIGHWAY_* env var overrides for config values
- Graceful fallback chain: env vars → file secrets → Vault API

### 2. Storage

**Current**: Local filesystem via bind mounts
- /app/artifacts/ - workflow artifacts
- /app/highway-test-logs/ - datashard logs
- /app/highway-test-logs/uploads/ - file uploads

**K8s Options**:
- Option A: S3/MinIO (recommended for multi-replica) - re-enable S3StorageProvider with IAM auth
- Option B: PersistentVolumeClaim with ReadWriteMany (requires NFS/EFS)
- Option C: PVC per worker with node affinity (limits scaling)

**Code Changes Needed**:
- Re-enable S3 provider in s3_provider.py
- Add IAM/IRSA authentication for S3
- Config: storage_type = auto (detect S3 creds, fall back to local)

### 3. Docker-in-Docker Sandboxing

**Current**: Mounts /var/run/docker.sock for python_sandbox

**K8s Options**:
- Option A: Disable sandboxing (python_sandbox.mode = disabled) - acceptable for trusted tenants
- Option B: DinD sidecar container per worker pod
- Option C: Kaniko/Tekton for isolated execution
- Option D: gVisor/Kata for pod-level isolation

### 4. Configuration Delivery

**Current**: Bind-mounted config.ini from host

**K8s**: ConfigMap mounted as /etc/highway/config.ini

**Code Changes Needed**:
- Support HIGHWAY_DATABASE_HOST style env var overrides
- Environment variables take precedence over config.ini

### 5. Service Discovery

**Current**: Docker DNS (postgres, api, ollama, dsl-compiler)

**K8s**: K8s Service DNS - same pattern, just need Service manifests

### 6. Database

**Current**: Docker PostgreSQL container

**K8s Options**:
- Managed DB: RDS, CloudSQL, Azure Database (recommended)
- StatefulSet with PVC (self-managed)

Required PostgreSQL extensions: uuid-ossp, pgcrypto

Required databases: highway_db_v2, resilient_circuit_db

## Deliverables

### Helm Chart Structure

```
highway/
├── Chart.yaml
├── values.yaml
├── templates/
│   ├── configmap.yaml                  # config.ini
│   ├── secrets.yaml                    # JWT, DB password, encryption key
│   ├── deployment-api.yaml
│   ├── deployment-worker.yaml
│   ├── deployment-activity-worker.yaml
│   ├── deployment-internal-worker.yaml
│   ├── deployment-dsl-compiler.yaml
│   ├── deployment-ollama.yaml          (optional)
│   ├── service-api.yaml
│   ├── service-dsl-compiler.yaml
│   ├── ingress.yaml
│   ├── pvc.yaml                        (if not using S3)
│   └── hpa.yaml                        (horizontal pod autoscaler)
```

### Services to Deploy

| Service | Type | Replicas | Notes |
|---------|------|----------|-------|
| api | Deployment | 1+ | Ingress, port 7822 |
| worker | Deployment | 2+ | HPA based on queue depth |
| activity-worker | Deployment | 1+ | |
| internal-worker | Deployment | 2 | Async logging |
| dsl-compiler | Deployment | 1+ | Isolated, no DB access |
| ollama | Deployment | 0-1 | Optional, GPU preferred |
| postgres | External/StatefulSet | 1 | Prefer managed DB |

### Health Checks (from docker-compose)

- API: GET /api/v1/health
- Workers: Jumper heartbeat mechanism
- DSL Compiler: GET /health

## Migration Path

1. Create Helm chart structure
2. Implement config env var overrides
3. Implement file-based secrets support
4. Re-enable S3 with IAM auth (or configure PVC)
5. Deploy to K8s cluster
6. Debug and iterate
7. Document deployment process

## Testing

- Deploy all services
- Run platform bootstrap workflow
- Run demo workflows (v2, disaster, matrix)
- Verify multi-replica worker scaling
- Test pod restart recovery

## References

- Issue #721: Local storage support (completed)
- Issue #722: JoinMode consistency (completed)
- docker-compose.yml: Current service definitions
- docker/config.ini: Configuration reference
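
As an illustration of the two code changes named under Secrets Management and Configuration Delivery, the fallback chain (env vars → file secrets → Vault API) and the `HIGHWAY_DATABASE_HOST` style overrides could look roughly like the sketch below. This is not the project's implementation: the function names `resolve_secret` and `apply_env_overrides`, the `vault_lookup` hook, the secret name `db_password`, and the `HIGHWAY_<SECTION>_<KEY>` naming scheme are all assumptions made for the example. Only the `/vault/secrets` path and the `HIGHWAY_` prefix come from the issue text.

```python
import configparser
import os
from pathlib import Path
from typing import Callable, Mapping, Optional

SECRETS_DIR = Path("/vault/secrets")  # file-secret path named in the issue
ENV_PREFIX = "HIGHWAY_"               # env var prefix named in the issue


def resolve_secret(
    name: str,
    env: Mapping[str, str] = os.environ,
    secrets_dir: Path = SECRETS_DIR,
    vault_lookup: Optional[Callable[[str], Optional[str]]] = None,
) -> Optional[str]:
    """Return the first value found: env var, then file secret, then Vault API."""
    # 1. Environment variable, e.g. HIGHWAY_DB_PASSWORD for name "db_password".
    value = env.get(ENV_PREFIX + name.upper())
    if value is not None:
        return value
    # 2. File written by a Vault Agent sidecar or a mounted K8s Secret.
    path = secrets_dir / name
    if path.is_file():
        return path.read_text().strip()
    # 3. Caller-supplied Vault lookup (hypothetical hook, not a real client call).
    if vault_lookup is not None:
        return vault_lookup(name)
    return None


def apply_env_overrides(
    config: configparser.ConfigParser,
    env: Mapping[str, str] = os.environ,
) -> None:
    """Overwrite config.ini values in place with HIGHWAY_<SECTION>_<KEY> env vars."""
    for section in config.sections():
        # Copy the key list so assignment never interferes with iteration.
        for key in list(config[section]):
            env_name = f"{ENV_PREFIX}{section}_{key}".upper()
            if env_name in env:
                config[section][key] = env[env_name]
```

Under these assumptions, a worker would load the ConfigMap-mounted `/etc/highway/config.ini`, call `apply_env_overrides(config)` once at startup so that environment variables take precedence, and call `resolve_secret("db_password")` wherever it currently reads a token from `.env`. Passing `env` and `secrets_dir` as parameters keeps both functions testable without a cluster.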
closed
high
Created 2025-12-27 23:38
·
Updated 2026-01-02 06:26