#635 Add distributed rate limiter decorator (@with_rate_limit)

Description

## Executive Summary

Implement enterprise-grade rate limiting for Highway tools that combines:

1. Token bucket rate limiting (per minute/hour/day)
2. ScriptPlan calendar integration (business hours, holidays, variable capacity)
3. Multi-scope support (global, per-tenant, per-workflow)
4. Pre-computed availability reports (no CPU cost at runtime)

## Current State Analysis

### What Exists

- Tenant-level rate limiting (TenantRateLimiter) - token bucket in PostgreSQL
- Circuit breaker decorator (@with_circuit_breaker) for external services
- tools.scriptplan for schedule-based task execution (uses scriptplan library)
- App installations with per-tenant configuration
- ScriptPlan library already a dependency (from scriptplan.parser.tjp_parser import ProjectFileParser)

### What's Missing

- Per-tool rate limiting (all tools share tenant quota)
- Calendar-aware restrictions (business hours, holidays)
- Dynamic capacity based on schedule (100/day Monday, 200/day Wednesday)
- Per-workflow scoping option

## Architecture Design

### Key Insight

Use ScriptPlan to pre-compute availability windows, then rate limiter checks against cached report. This separates:

- Scheduling logic (complex, CPU-heavy) → done once on config change
- Runtime check (simple, fast) → check if now is in allowed window

### Two-Phase Architecture

```
┌──────────────────────────────────────────────────────────────┐
│                     CONFIGURATION PHASE                      │
│                      (On config update)                      │
└──────────────────────────────────────────────────────────────┘
                               │
                               ▼
              ┌─────────────────────────────────┐
              │ 1. Parse ScriptPlan definition  │
              │    (TJP format with shifts,     │
              │    holidays, limits)            │
              ├─────────────────────────────────┤
              │ 2. Generate availability report │
              │    for next 7 days              │
              ├─────────────────────────────────┤
              │ 3. Cache report in database     │
              │    (schedule_report JSONB)      │
              └─────────────────────────────────┘

┌──────────────────────────────────────────────────────────────┐
│                        RUNTIME PHASE                         │
│                     (On each tool call)                      │
└──────────────────────────────────────────────────────────────┘
                               │
         ┌─────────────────────┼─────────────────────┐
         ▼                     ▼                     ▼
 ┌───────────────┐     ┌───────────────┐     ┌───────────────┐
 │ 1. SCHEDULE   │     │ 2. CAPACITY   │     │ 3. TOKEN      │
 │    CHECK      │     │    LOOKUP     │     │    BUCKET     │
 │               │     │               │     │               │
 │ Is current    │     │ What's the    │     │ Have tokens   │
 │ time in an    │     │ limit for     │     │ remaining     │
 │ allowed       │     │ this time     │     │ in window?    │
 │ window?       │     │ slot?         │     │               │
 │               │     │               │     │               │
 │ (cached JSON) │     │ (from JSON    │     │ (PostgreSQL   │
 │               │     │  or static)   │     │  atomic ops)  │
 └───────────────┘     └───────────────┘     └───────────────┘
         │                     │                     │
         └─────────────────────┼─────────────────────┘
                               │
                     All pass? │ ──► Execute tool
                               │
                     Any fail? │ ──► RateLimitError
```

### ScriptPlan Definition Example

```
project email_limits "Email Service Limits" 2025-01-01 +1y {
  timezone "UTC"
}

# Working hours shift
shift business_hours "Business Hours" {
  workinghours mon - fri 09:00 - 17:00
}

# Email service resource with limits
resource email_service "Email Service" {
  workinghours business_hours
  limits {
    dailymax 500  # 500 emails per day
  }
}

# Holidays - no emails on these days
vacation "Christmas" 2025-12-25
vacation "Boxing Day" 2025-12-26
vacation "New Year" 2026-01-01
```

### Pre-computed Availability Report

Generated from ScriptPlan, cached in database:

```json
{
  "version": 1,
  "generated_at": "2025-12-20T00:00:00Z",
  "valid_until": "2025-12-27T00:00:00Z",
  "timezone": "UTC",
  "windows": [
    {
      "start": "2025-12-20T09:00:00Z",
      "end": "2025-12-20T17:00:00Z",
      "capacity": {"per_hour": 100, "per_day": 500}
    },
    {
      "start": "2025-12-23T09:00:00Z",
      "end": "2025-12-23T17:00:00Z",
      "capacity": {"per_hour": 100, "per_day": 500}
    }
  ],
  "blocked_dates": ["2025-12-25", "2025-12-26"],
  "default_capacity": {"per_minute": 10, "per_hour": 100, "per_day": 500}
}
```

## Database Schema

```sql
-- Rate limit configurations per tool per tenant
CREATE TABLE highway.tool_rate_limits (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id TEXT NOT NULL,  -- 'default' for global defaults
    tool_name TEXT NOT NULL,  -- 'tools.email.send', 'tools.http.request', etc.

    -- Scope: how to partition the limit
    limit_scope TEXT NOT NULL DEFAULT 'tenant',  -- 'global', 'tenant', 'workflow'

    -- Static rate limits (used when no schedule, or as override)
    requests_per_minute INT,
    requests_per_hour INT,
    requests_per_day INT,

    -- ScriptPlan integration (optional - for calendar-aware limiting)
    schedule_definition TEXT,   -- TJP format definition
    schedule_report JSONB,      -- Pre-rendered availability report
    schedule_report_valid_until TIMESTAMPTZ,

    -- Control
    enabled BOOLEAN DEFAULT TRUE,
    fail_open BOOLEAN DEFAULT FALSE,  -- Allow if rate limit check errors

    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW(),

    UNIQUE(tenant_id, tool_name)
);

-- Usage tracking (sliding window token bucket)
CREATE TABLE highway.tool_rate_limit_usage (
    tenant_id TEXT NOT NULL,
    tool_name TEXT NOT NULL,
    scope_key TEXT NOT NULL,    -- tenant_id, workflow_run_id, or 'global'
    window_type TEXT NOT NULL,  -- 'minute', 'hour', 'day'
    window_start TIMESTAMPTZ NOT NULL,
    request_count INT DEFAULT 0,
    last_updated TIMESTAMPTZ DEFAULT NOW(),

    PRIMARY KEY (tenant_id, tool_name, scope_key, window_type, window_start)
);

-- Index for cleanup of old usage records
CREATE INDEX idx_tool_rate_limit_usage_cleanup
    ON highway.tool_rate_limit_usage (window_start);

-- Index for fast config lookup
CREATE INDEX idx_tool_rate_limits_lookup
    ON highway.tool_rate_limits (tenant_id, tool_name)
    WHERE enabled = TRUE;
```

## Atomic Token Consumption (PostgreSQL Function)

```sql
CREATE OR REPLACE FUNCTION highway.consume_tool_rate_token(
    p_tenant_id TEXT,
    p_tool_name TEXT,
    p_scope_key TEXT,
    p_window_type TEXT,
    p_limit INT
) RETURNS BOOLEAN AS $$
DECLARE
    v_window_start TIMESTAMPTZ;
    v_current_count INT;
BEGIN
    -- Calculate window start based on type
    v_window_start := CASE p_window_type
        WHEN 'minute' THEN date_trunc('minute', NOW())
        WHEN 'hour' THEN date_trunc('hour', NOW())
        WHEN 'day' THEN date_trunc('day', NOW())
    END;

    -- Upsert and check limit atomically
    INSERT INTO highway.tool_rate_limit_usage
        (tenant_id, tool_name, scope_key, window_type, window_start, request_count)
    VALUES
        (p_tenant_id, p_tool_name, p_scope_key, p_window_type, v_window_start, 1)
    ON CONFLICT (tenant_id, tool_name, scope_key, window_type, window_start)
    DO UPDATE SET
        request_count = tool_rate_limit_usage.request_count + 1,
        last_updated = NOW()
    WHERE tool_rate_limit_usage.request_count < p_limit
    RETURNING request_count INTO v_current_count;

    RETURN v_current_count IS NOT NULL;
END;
$$ LANGUAGE plpgsql;
```

## Implementation Plan

### Phase 1: Core Rate Limiter (Token Bucket)

1. Create engine/policies/tool_rate_limiter.py
2. Implement sliding window token bucket with PostgreSQL
3. Add atomic consume_token() and check_quota() functions
4. Support minute/hour/day windows
5. Add scope key generation (global/tenant/workflow)

### Phase 2: Decorator Integration

1. Add @with_rate_limit decorator to engine/tools/decorators.py
2. Mirror @with_circuit_breaker pattern
3. Integrate with tool registry
4. Add to high-impact tools: email, http_request, llm_call

### Phase 3: ScriptPlan Integration

1. Create schedule report generator using ScriptPlan library
2. Implement report caching in database
3. Add is_in_allowed_window() check
4. Add get_window_capacity() for dynamic limits
5. Background job to refresh reports before expiry

### Phase 4: Configuration & API

1. Migration for new tables
2. Default limits per tool type
3. Admin API for managing rate limits
4. Per-tenant override capability

### Phase 5: Testing

1. Unit tests for token bucket
2. Unit tests for schedule checks
3. Integration tests with actual tools
4. Load tests for concurrent access

## Open Questions for User

1. ScriptPlan dependency approach?
2. Report refresh strategy?
3. Default limits per tool?
4. Fail-open vs fail-closed?
5. API for management needed?

## Files to Create/Modify

| File                                             | Action | Purpose                |
|--------------------------------------------------|--------|------------------------|
| migrations/xxx_tool_rate_limits.sql              | Create | Database schema        |
| engine/policies/tool_rate_limiter.py             | Create | Core rate limiter      |
| engine/policies/schedule_checker.py              | Create | ScriptPlan integration |
| engine/tools/decorators.py                       | Modify | Add @with_rate_limit   |
| engine/tools/email_tool.py                       | Modify | Apply decorator        |
| engine/tools/http_request.py                     | Modify | Apply decorator        |
| engine/tools/llm_call.py                         | Modify | Apply decorator        |
| tests/unit/test_tool_rate_limiter.py             | Create | Unit tests             |
| tests/integration/test_rate_limit_integration.py | Create | Integration tests      |
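
The three runtime checks from the architecture section (schedule check, capacity lookup, token consumption) could be wired into the decorator roughly as follows. This is a minimal sketch, not the planned implementation: the decorator parameters (`report`, `scope_key`, `clock`), the helper `consume_token()` signature, and the rule that window capacity overrides `default_capacity` are all assumptions made for illustration, and the in-memory counter only stands in for `highway.consume_tool_rate_token()`.

```python
import functools
from collections import defaultdict
from datetime import datetime, timezone

# Trimmed copy of the example availability report from the description.
REPORT = {
    "timezone": "UTC",
    "windows": [
        {"start": "2025-12-23T09:00:00Z", "end": "2025-12-23T17:00:00Z",
         "capacity": {"per_hour": 100, "per_day": 500}},
    ],
    "blocked_dates": ["2025-12-25", "2025-12-26"],
    "default_capacity": {"per_minute": 10, "per_hour": 100, "per_day": 500},
}


class RateLimitError(Exception):
    """Raised when a call is outside its schedule or over its quota."""


def _parse(ts):
    # Normalise a trailing "Z" so fromisoformat() works before Python 3.11.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def get_window_capacity(report, now):
    """Steps 1 and 2: return limits for `now` (tz-aware), or None if blocked.

    Assumptions: the report is in UTC, windows are half-open [start, end),
    and window capacity overrides default_capacity key by key.
    """
    if now.astimezone(timezone.utc).date().isoformat() in report.get("blocked_dates", []):
        return None
    for window in report.get("windows", []):
        if _parse(window["start"]) <= now < _parse(window["end"]):
            return {**report.get("default_capacity", {}), **window.get("capacity", {})}
    return None


def is_in_allowed_window(report, now):
    return get_window_capacity(report, now) is not None


# Step 3 stand-in: per-process counters keyed by window start.
# The real check would call the PostgreSQL function instead.
_usage = defaultdict(int)
_WINDOW_FORMATS = {"minute": "%Y-%m-%dT%H:%M", "hour": "%Y-%m-%dT%H", "day": "%Y-%m-%d"}


def consume_token(tool_name, scope_key, window_type, limit, now):
    key = (tool_name, scope_key, window_type, now.strftime(_WINDOW_FORMATS[window_type]))
    if _usage[key] >= limit:
        return False
    _usage[key] += 1
    return True


def with_rate_limit(report, scope_key="global", clock=lambda: datetime.now(timezone.utc)):
    """Hypothetical decorator signature; `clock` is injectable for tests."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            now = clock().astimezone(timezone.utc)
            capacity = get_window_capacity(report, now)
            if capacity is None:
                raise RateLimitError("outside allowed schedule window")
            for name, limit in capacity.items():  # e.g. "per_hour" -> "hour"
                window_type = name.split("_", 1)[1]
                if not consume_token(func.__name__, scope_key, window_type, limit, now):
                    raise RateLimitError(f"{name} limit of {limit} reached")
            return func(*args, **kwargs)
        return wrapper
    return decorator


@with_rate_limit(REPORT, scope_key="tenant-a")
def send_email(to):
    return f"sent to {to}"
```

One limitation worth noting: a call rejected at the hour or day level has already consumed its minute token, which is also how independent per-window calls to the SQL function would behave unless they run in one transaction that rolls back on rejection.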
