#244 Critical: Activity worker holds DB connection for entire shell execution duration

closed critical activity-worker database scalability Created 2025-12-03 15:53 · Updated 2025-12-03 16:09

Description

## Problem

Activity worker holds a database connection checked out from the pool for the ENTIRE duration of shell command execution. For long-running activities (minutes/hours/days), this severely limits scalability.

## Root Cause

In engine/services/activity_worker.py line 459-475:

```python
with db.get_db_connection() as conn:  # Connection checked out HERE
    ctx = DurableContext(conn=conn, ...)
    result = tool_func(ctx, *args, **kwargs)  # Shell runs for hours
# Connection returned only when shell completes
```

## Impact

- 500 concurrent activities = 500 connections held idle
- Current: 13 processes x 10 max pool = 130 max connections
- Limits concurrent long-running activities to ~130
- Connections show as idle but still checked out from pool

## Actual Connection Needs

Shell tool only needs connection for:

1. Variable resolution (before shell starts)
2. PID storage (after shell starts, then committed)
3. NOTHING during actual shell execution

## Proposed Fix

Release connection after initial setup:

1. Acquire connection for variable resolution
2. Release connection
3. Execute shell WITHOUT holding connection
4. PID storage uses separate short-lived connection
5. Acquire new connection for completion

## Affected Files

- engine/services/activity_worker.py
- engine/tools/shell_output_streamer.py
- engine/tools/shell_command.py
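
## Sketch of the Proposed Fix

The five steps of the proposed fix could look roughly like the sketch below. It is not the actual change to activity_worker.py. `CountingPool`, its `store` dict, and `run_shell_activity` are hypothetical names made up for illustration, and the pool is a toy that only counts checkouts so the connection lifetime is visible. Only the `get_db_connection()` context-manager shape is taken from the snippet in the issue.

```python
import string
import subprocess
import threading
from contextlib import contextmanager


class CountingPool:
    """Hypothetical stand-in for the real pool; it only counts checkouts."""

    def __init__(self, max_size, variables):
        self._slots = threading.BoundedSemaphore(max_size)
        self._lock = threading.Lock()
        self.checked_out = 0
        # A plain dict plays the role of the database in this sketch.
        self.store = {"variables": dict(variables), "pids": {}, "results": {}}

    @contextmanager
    def get_db_connection(self):
        self._slots.acquire()  # blocks while the pool is exhausted
        with self._lock:
            self.checked_out += 1
        try:
            yield self.store
        finally:
            with self._lock:
                self.checked_out -= 1
            self._slots.release()


def run_shell_activity(pool, activity_id, argv_template):
    """Run a command while holding a connection only for short DB steps."""
    # Steps 1-2: resolve variables, then release the connection.
    with pool.get_db_connection() as conn:
        argv = [
            string.Template(arg).safe_substitute(conn["variables"])
            for arg in argv_template
        ]

    # Step 3: start the command with no connection checked out.
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, text=True)

    # Step 4: store the PID through a separate short-lived connection.
    with pool.get_db_connection() as conn:
        conn["pids"][activity_id] = proc.pid

    # Connections held by the pool while the command is (or may still be) running.
    held_during_run = pool.checked_out
    output, _ = proc.communicate()  # may take hours; the pool is untouched

    # Step 5: acquire a new connection only to record completion.
    with pool.get_db_connection() as conn:
        conn["results"][activity_id] = (proc.returncode, output.strip())

    return held_during_run
```

With this shape, a pool of size 1 is enough to run the activity, because each checkout lasts only as long as one short database step. If the connection were held across the whole call as in the current code, step 4 would block on such a pool.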
