Stored Items

Category Key Value
lessons AUTH_DATABASE_TYPE REQUIRED for the auth library: without AUTH_DATABASE_TYPE=postgresql, the auth library defaults to SQLite. Always set AUTH_DATABASE_TYPE before AUTH_DATABASE_URL in K8s deployments.
general EVENT_SCOPING_WORKFLOW_RUN_ID Events MUST be scoped by workflow_run_id (NOT NULL enforced). Global events are forbidden: the SQL functions raise an exception if workflow_run_id is NULL. Migration 01KDX91GQGF3KDVG3EZWDWKRFT enforces the NOT NULL constraint. The Python methods (emit_event, await_event, wait_for_event, check_event) also validate and raise ValueError early.
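
A minimal sketch of the early validation, assuming a method shaped like the emit_event named above; the class context, parameter list, and _insert_event helper are illustrative assumptions, not the real API:

    # Sketch only: the surrounding class and parameters are assumptions; the
    # rule itself (reject a NULL workflow_run_id) comes from the note above.
    def emit_event(self, name: str, payload: dict, workflow_run_id: str | None) -> None:
        if not workflow_run_id:
            # Fail fast in Python, mirroring the NOT NULL constraint and the
            # exception raised by the SQL functions.
            raise ValueError("workflow_run_id is required; global events are forbidden")
        self._insert_event(name, payload, workflow_run_id)  # hypothetical helper
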
workflow activity_wait_pattern Activity workflows need a wait_for_event after each builder.activity() call to wait for completion via {{result.completion_event}}.
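
A hedged sketch of the pattern; builder.activity, wait_for_event, and the {{result.completion_event}} placeholder come from the note, but the argument names are illustrative and the exact signatures are unverified:

    # Sketch: queue the activity, then explicitly wait for its completion event.
    builder.activity("transcode_video", app="media_tools")          # argument names are assumptions
    builder.wait_for_event("{{result.completion_event}}")           # block until the activity reports back
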
guidelines api_idempotency ALL workflow engine APIs and actions must be strictly idempotent. No side effects on second call. Use idempotency keys where needed.
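
A minimal sketch of an idempotency-key guard with psycopg3; the idempotency_keys table and its columns are assumptions for illustration (key is assumed to be the primary key):

    # Sketch: return the stored result on a repeat call instead of redoing the side effect.
    def perform_once(conn, idempotency_key: str, action):
        with conn.transaction():
            row = conn.execute(
                "SELECT result FROM idempotency_keys WHERE key = %s FOR UPDATE",
                (idempotency_key,),
            ).fetchone()
            if row is not None:
                return row[0]                 # second call: no new side effect
            result = action()
            conn.execute(
                "INSERT INTO idempotency_keys (key, result) VALUES (%s, %s)",
                (idempotency_key, result),
            )
            return result
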
guidelines api_only ALL tools must use the API to interact with the workflow engine. Direct database access is forbidden outside of engine core code.
architecture app_code_no_cache Issue #326: All app code is stored in database and loaded fresh on each execution. NO CACHING. python_module entrypoint removed. Cache APIs deprecated. Changes take effect immediately.
guidelines async When async is needed, use the trio library only.
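
For reference, the basic structured-concurrency shape with trio:

    import trio

    async def main() -> None:
        # Both child tasks finish (or are cancelled together) before the nursery exits.
        async with trio.open_nursery() as nursery:
            nursery.start_soon(trio.sleep, 1)
            nursery.start_soon(trio.sleep, 2)

    trio.run(main)
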
lessons atomic_transactions Data integrity and atomicity are top priorities. TimeoutService uses atomic check-and-set (fencing) to prevent race conditions.
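
A sketch of the atomic check-and-set idea with a fencing column; the timers table and its columns are illustrative, not the real TimeoutService schema:

    # Sketch: only one caller can transition the row, and the fence counter
    # invalidates any stale holder that lost the race.
    def try_fire_timeout(conn, timer_id: str, expected_fence: int) -> bool:
        with conn.transaction():
            cur = conn.execute(
                """
                UPDATE timers
                   SET state = 'fired', fence = fence + 1
                 WHERE id = %s AND fence = %s AND state = 'armed'
                """,
                (timer_id, expected_fence),
            )
            return cur.rowcount == 1  # False means another worker won the race
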
general audit_log_transaction_isolation Audit logs in PostgreSQL: when logging to an audit table within a transaction that may roll back, always ensure critical audit events (especially failure events) are logged in a SEPARATE transaction. Use a separate connection, or log in the rollback handler's own transaction. Relying on async logging or sidecar telemetry alone creates gaps in the authoritative audit trail. Issue #755.
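
A sketch of the separate-transaction pattern with psycopg3; the connection string and the audit_log table are assumptions:

    import psycopg

    def log_failure(event: str, details: str) -> None:
        # Dedicated connection and transaction, so the record survives a
        # rollback of the business transaction.
        with psycopg.connect("dbname=highway") as audit_conn:      # conninfo is an assumption
            with audit_conn.transaction():
                audit_conn.execute(
                    "INSERT INTO audit_log (event, details) VALUES (%s, %s)",
                    (event, details),
                )

    def do_work(conn) -> None:
        try:
            with conn.transaction():
                ...  # business logic that may raise and roll back
        except Exception as exc:
            log_failure("work_failed", str(exc))  # logged even though the work rolled back
            raise
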
guidelines backups Write backups to /tmp/highway-workflow-backups/ with timestamps.
guidelines circuit_breaker Use the highway_circuitbreaker package only. Source at ~/develop/highway_circuitbreaker/. Build a wheel if changes are needed (pip install -e won't work).
guidelines code_quality Ruff for linting/formatting. All code must pass ruff and mypy checks.
architecture core_components 1. Absurd Task Queue (PostgreSQL-based), 2. AbsurdClient (transaction-aware, never commits), 3. Orchestrator (atomic SUCCESS/SLEEP/FAILURE paths), 4. DurableContext (ctx.step, ctx.wait_for_event, ctx.emit_event), 5. WorkflowInterpreter (Highway DSL executor), 6. CLI Tools (worker.py, submit.py, replay.py, monitor.py)
guidelines database_rules Use psycopg3 only, with explicit transactions. No ORMs. SQL queries must live in engine/sql/queries/*.sql files.
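
A sketch of the intended shape: SQL loaded from engine/sql/queries/, executed with psycopg3 inside an explicit transaction. The file name, connection string, and parameter are illustrative:

    from pathlib import Path
    import psycopg

    # Query text lives in a .sql file, not inline in Python.
    # The file is assumed to use a named placeholder like %(run_id)s.
    GET_RUN_SQL = Path("engine/sql/queries/get_workflow_run.sql").read_text()

    with psycopg.connect("dbname=highway") as conn:
        with conn.transaction():  # explicit transaction scope, no autocommit surprises
            row = conn.execute(GET_RUN_SQL, {"run_id": "01KDX-example"}).fetchone()
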
architecture database_tables absurd.t_*/r_*/c_*/e_* (task definitions, runs, checkpoints, events per queue), workflow_run (execution tracking), absurd_event_log (immutable audit trail), absurd_checkpoint (snapshot storage)
dependencies datashard DataShard v0.2.1 - Install from PyPI only: pip install datashard==0.2.1. NO source modifications. See docs/agent-docs/DATASHARD_INTEGRATION.md
guidelines db_migrations NEVER modify database schema/functions directly. Always create migration files in engine/migrations/sql/highway_X.X.XX_description.sql. Use IF EXISTS for idempotency.
guidelines debugging Use hwe replay <workflow_id> to replay workflows. Prefer this over direct DB access. Add missing features to engine/cli/replay.py.
guidelines demo_workflows Demo workflows (disaster recovery, demo v2, video demo) are golden standard. Must be kept up-to-date and re-tested after code changes. All demo workflows go in demo tenant.
deployment docker Isolated: docker compose up -d (4 workers + PostgreSQL 16). Multi-stage Dockerfile on Python 3.13-slim. Max ulimits: 65536 file descriptors, 4096 processes.
general docker_no_remove_param tools.docker.run does NOT have a 'remove' parameter. Containers are ALWAYS automatically removed after completion. Do not include remove=True in Docker workflows.
guidelines documentation Markdown files in docs/agent-docs/. No excessive summaries or MD files.
guidelines dsl Highway DSL is the only DSL for data modeling and schema definition. Use hwe dsl-prompt for syntax info.
general handler_task_ordering In WorkflowBuilder, handler tasks (on_failure/on_success) MUST be defined AFTER the tasks that reference them. Reason: auto-chaining makes the first defined task the start_task, and unrecognized handlers get chained into the main path. Always define main workflow tasks FIRST and handlers LAST.
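
A hedged sketch of the ordering rule only; the WorkflowBuilder call names and keyword arguments are illustrative stand-ins for the real DSL API:

    # Sketch: method names and arguments are assumptions; the ordering is the point.
    builder = WorkflowBuilder("etl_example")

    # Main tasks FIRST: auto-chaining treats the first defined task as the start_task.
    builder.task("extract")
    builder.task("transform", on_failure="notify_failure")
    builder.task("load", on_success="notify_success")

    # Handler tasks LAST, so they are never auto-chained into the main path.
    builder.task("notify_failure")
    builder.task("notify_success")
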
dependencies highway_dsl Highway DSL v1.9.0 - Build from source in ~/develop/highway_dsl/. Modifications allowed. See docs/agent-docs/HIGHWAY_DSL_INTEGRATION.md
guidelines logging Use logging module only. No print statements allowed.
general logging-best-practice CRITICAL: Never use f-strings in logger calls. ALWAYS use lazy formatting: logger.debug('message %s', value), NOT logger.debug(f'message {value}'). F-strings evaluate immediately even if the log level is disabled, causing: 1) performance overhead, 2) a security risk, because the value is interpolated into the message string regardless of whether the record is ever emitted. Use %s placeholders and pass values as args.
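
Example with the standard logging module (the task_id value is illustrative):

    import logging

    logger = logging.getLogger(__name__)
    task_id = "01KDX-example"

    # Bad: the f-string is built immediately, even if DEBUG records are discarded.
    logger.debug(f"claimed task {task_id}")  # anti-pattern, shown only for contrast

    # Good: lazy %s formatting; the message is rendered only if the record is emitted.
    logger.debug("claimed task %s", task_id)
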
general logging-security Never log actual variable values in workflow engines. Log only metadata: type, length, presence (is_set=True). Workflow variables may contain PII, secrets, or large content like 'THE COMPLETE SHERLOCK HOLMES'. Use DEBUG level for internal tracing, not INFO.
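
Example of metadata-only logging (function name is illustrative):

    import logging

    logger = logging.getLogger(__name__)

    def record_variable_set(name: str, value: object) -> None:
        # Log type, length, and presence only; never the value itself, which
        # may hold PII, secrets, or megabytes of text.
        logger.debug(
            "workflow variable set: name=%s type=%s length=%d is_set=%s",
            name, type(value).__name__, len(str(value)), value is not None,
        )
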
guidelines migrations Raw PostgreSQL only in engine/migrations/sql/ directory, executed in order. All DB changes must go via migration scripts.
lessons no_deprecations No deprecated code allowed: delete code outright instead of marking it deprecated. This is a production-grade codebase; no deprecations.
lessons parallel_operator_fix ParallelOperator and ActivityOperator are fork/queue-only (Nov 16, 2025). They spawn tasks and return immediately. No automatic waiting. Prevents double-fork bugs in crash recovery. Workflows must add explicit join/wait logic separately.
guidelines process_management Never kill gunicorn or processes directly. Use systemctl or docker compose to manage services.
workflow result_key_naming Avoid using a result_key with the same name as the task_id; it causes an 'int object does not support item assignment' error during workflow execution.
lessons resumability Resumability is critical. Workflows must be able to resume from checkpoints without data loss or corruption.
guidelines secrets No secrets or config in .env file. Configs in /etc/highway/config.ini and secrets in Vault only, even for development.
general service update Never create processes yourself. Deploy code by running 'make restart'.
lessons spawn_based_logging Use spawn-based logging instead of injection-based (Nov 15, 2025). Spawn async Absurd tasks for logging after task completion. Crash-safe, idempotent via idempotency keys, no workflow interference.
lessons sql_injection_prevention All services must use psycopg.sql.Identifier() for dynamic table names to prevent SQL injection.
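
Example with psycopg's sql composition helpers (the query and function are illustrative):

    from psycopg import sql

    def count_rows(conn, table_name: str) -> int:
        # Identifier() quotes the dynamic name as an SQL identifier instead of
        # splicing untrusted text into the statement.
        query = sql.SQL("SELECT count(*) FROM {}").format(sql.Identifier(table_name))
        return conn.execute(query).fetchone()[0]
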
general submit DSL, not JSON: submit workflows in Python DSL format, not JSON (unless requested).
deployment systemd Production: bash install-systemd-workers.sh (4 workers). Status: sudo systemctl status highway-worker.target. Logs: sudo journalctl -u highway-worker@* -f
apps tenant_app_installation Apps must be installed for a specific tenant (test vs default) to be usable in workflows submitted with that tenant.
guidelines testing Integration tests only in tests/integration/. API-based calls only. No mocking. No inline workflow definitions - all must be in tests/workflow_examples/. Tests must pass with -n 4 parallel. No sequential tests.
lessons threadpool_executor_danger NEVER use ThreadPoolExecutor as a context manager (with ThreadPoolExecutor()). Use try/finally with shutdown(wait=False, cancel_futures=True). Also: as_completed() MUST have a timeout.
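
The pattern the note prescribes, sketched with concurrent.futures (function name and worker count are illustrative):

    from concurrent.futures import ThreadPoolExecutor, as_completed

    def run_jobs(jobs, timeout_s: float = 30.0):
        executor = ThreadPoolExecutor(max_workers=4)
        try:
            futures = [executor.submit(job) for job in jobs]
            # as_completed with a timeout so one hung job cannot block forever.
            for fut in as_completed(futures, timeout=timeout_s):
                yield fut.result()
        finally:
            # Never block on outstanding work; cancel anything not yet started.
            executor.shutdown(wait=False, cancel_futures=True)
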
guidelines ui_components ALL UI components must be in api/ui/templates/components/<component_name>/ with .html, .css, .js files. Use Mako <%text> blocks for CSS/JS.
general verified_workflows Every Python DSL workflow that works well must be copied into the api/dsl_templates/ folder.
general zombie_worker_detection If API shows unhealthy with pending tasks but workers show NOTIFY received without Claiming logs, workers are in zombie state. Root cause is usually asyncio/anyio corruption from mixed sync/async execution. Restart workers to fix immediately. See issues #452 and #453 for underlying bugs to fix.