Memory
Persistent context for AI agents
Stored Items
| Category | Key | Value |
|---|---|---|
| lessons | AUTH_DATABASE_TYPE | REQUIRED for auth library - Without AUTH_DATABASE_TYPE=postgresql, auth library defaults to SQLite. Always set AUTH_DATABASE_TYPE before AUTH_DATABASE_URL in K8s deployments. |
| general | EVENT_SCOPING_WORKFLOW_RUN_ID | Events MUST be scoped by workflow_run_id (NOT NULL enforced). Global events are forbidden - SQL functions raise exception if workflow_run_id is NULL. Migration 01KDX91GQGF3KDVG3EZWDWKRFT enforces NOT NULL constraint. Python methods (emit_event, await_event, wait_for_event, check_event) also validate and raise ValueError early. |
| workflow | activity_wait_pattern | Activity workflows need wait_for_event after each builder.activity() to wait for completion via {{result.completion_event}} |
| guidelines | api_idempotency | ALL workflow engine APIs and actions must be strictly idempotent. No side effects on second call. Use idempotency keys where needed. |
| guidelines | api_only | ALL tools must use API to interact with the workflow engine. Direct database access is forbidden outside of engine core code. |
| architecture | app_code_no_cache | Issue #326: All app code is stored in database and loaded fresh on each execution. NO CACHING. python_module entrypoint removed. Cache APIs deprecated. Changes take effect immediately. |
| guidelines | async | When async needed, use trio library only. |
| lessons | atomic_transactions | Data integrity and atomicity are top priorities. TimeoutService uses atomic check-and-set (fencing) to prevent race conditions. |
| general | audit_log_transaction_isolation | Audit logs in PostgreSQL: When logging to an audit table within a transaction that may rollback, always ensure critical audit events (especially failure events) are logged in a SEPARATE transaction. Use a separate connection or log in the rollback handler's separate transaction. Relying on async logging or sidecar telemetry alone creates gaps in the authoritative audit trail. Issue #755. |
| guidelines | backups | Write backups to /tmp/highway-workflow-backups/ with timestamps. |
| guidelines | circuit_breaker | Use highway_circuitbreaker package only. Source at ~/develop/highway_circuitbreaker/. Build wheel if changes needed (pip install -e won't work). |
| guidelines | code_quality | Ruff for linting/formatting. All code must pass ruff and mypy checks. |
| architecture | core_components | 1. Absurd Task Queue (PostgreSQL-based), 2. AbsurdClient (transaction-aware, never commits), 3. Orchestrator (atomic SUCCESS/SLEEP/FAILURE paths), 4. DurableContext (ctx.step, ctx.wait_for_event, ctx.emit_event), 5. WorkflowInterpreter (Highway DSL executor), 6. CLI Tools (worker.py, submit.py, replay.py, monitor.py) |
| guidelines | database_rules | Use psycopg3 only with explicit transactions. No ORMs. SQL queries must be in engine/sql/queries/*.sql files. |
| architecture | database_tables | absurd.t_*/r_*/c_*/e_* (task definitions, runs, checkpoints, events per queue), workflow_run (execution tracking), absurd_event_log (immutable audit trail), absurd_checkpoint (snapshot storage) |
| dependencies | datashard | DataShard v0.2.1 - Install from PyPI only: pip install datashard==0.2.1. NO source modifications. See docs/agent-docs/DATASHARD_INTEGRATION.md |
| guidelines | db_migrations | NEVER modify database schema/functions directly. Always create migration files in engine/migrations/sql/highway_X.X.XX_description.sql. Use IF EXISTS for idempotency. |
| guidelines | debugging | Use hwe replay <workflow_id> to replay workflows. Prefer this over direct db access. Add missing features to engine/cli/replay.py |
| guidelines | demo_workflows | Demo workflows (disaster recovery, demo v2, video demo) are golden standard. Must be kept up-to-date and re-tested after code changes. All demo workflows go in demo tenant. |
| deployment | docker | Isolated: docker compose up -d (4 workers + PostgreSQL 16). Multi-stage Dockerfile, Python 3.13-slim. Max ulimits: 65536 fds, 4096 procs. |
| general | docker_no_remove_param | tools.docker.run does NOT have a 'remove' parameter. Containers are ALWAYS automatically removed after completion. Do not include remove=True in Docker workflows. |
| guidelines | documentation | Markdown files in docs/agent-docs/. No excessive summaries or MD files. |
| guidelines | dsl | Highway DSL is the only DSL for data modeling and schema definition. Use hwe dsl-prompt for syntax info. |
| general | handler_task_ordering | In WorkflowBuilder, handler tasks (on_failure/on_success) MUST be defined AFTER the tasks that reference them. Reason: auto-chaining makes the first defined task the start_task, and handlers defined early get chained into the main flow. Always define main workflow tasks FIRST, handlers LAST. |
| dependencies | highway_dsl | Highway DSL v1.9.0 - Build from source in ~/develop/highway_dsl/. Modifications allowed. See docs/agent-docs/HIGHWAY_DSL_INTEGRATION.md |
| guidelines | logging | Use logging module only. No print statements allowed. |
| general | logging-best-practice | CRITICAL: Never use f-strings in logger calls. ALWAYS use lazy formatting: logger.debug('message %s', value) NOT logger.debug(f'message {value}'). F-strings evaluate immediately even if log level is disabled, causing: 1) Performance overhead, 2) Security risk - values are rendered into the message even when the log level would suppress it. Use %s placeholders and pass values as args. |
| general | logging-security | Never log actual variable values in workflow engines. Log only metadata: type, length, presence (is_set=True). Workflow variables may contain PII, secrets, or large content like 'THE COMPLETE SHERLOCK HOLMES'. Use DEBUG level for internal tracing, not INFO. |
| guidelines | migrations | Raw PostgreSQL only in engine/migrations/sql/ directory, executed in order. All DB changes must go via migration scripts. |
| lessons | no_deprecations | No deprecated code allowed. Remove deprecated code instead. Production-grade codebase, no deprecations. |
| lessons | parallel_operator_fix | ParallelOperator and ActivityOperator are fork/queue-only (Nov 16, 2025). They spawn tasks and return immediately. No automatic waiting. Prevents double-fork bugs in crash recovery. Workflows must add explicit join/wait logic separately. |
| guidelines | process_management | Never kill gunicorn or processes directly. Use systemctl or docker compose to manage services. |
| workflow | result_key_naming | Avoid using result_key with same name as task_id - causes an "'int' object does not support item assignment" error during workflow execution |
| lessons | resumability | Resumability is critical. Workflows must be able to resume from checkpoints without data loss or corruption. |
| guidelines | secrets | No secrets or config in .env file. Configs in /etc/highway/config.ini and secrets in Vault only, even for development. |
| general | service update | Never create processes yourself. Deploy code by running 'make restart'. |
| lessons | spawn_based_logging | Use spawn-based logging instead of injection-based (Nov 15, 2025). Spawn async Absurd tasks for logging after task completion. Crash-safe, idempotent via idempotency keys, no workflow interference. |
| lessons | sql_injection_prevention | All services must use psycopg.sql.Identifier() for dynamic table names to prevent SQL injection. |
| general | submit DSL not JSON | Submit workflows in Python DSL format, not JSON (unless requested) |
| deployment | systemd | Production: bash install-systemd-workers.sh (4 workers). Status: sudo systemctl status highway-worker.target. Logs: sudo journalctl -u highway-worker@* -f |
| apps | tenant_app_installation | Apps must be installed for specific tenant (test vs default) to be usable in workflows submitted with that tenant |
| guidelines | testing | Integration tests only in tests/integration/. API-based calls only. No mocking. No inline workflow definitions - all must be in tests/workflow_examples/. Tests must pass with -n 4 parallel. No sequential tests. |
| lessons | threadpool_executor_danger | NEVER use 'with ThreadPoolExecutor()'. Use try/finally with shutdown(wait=False, cancel_futures=True). Also: as_completed() MUST have a timeout. |
| guidelines | ui_components | ALL UI components must be in api/ui/templates/components/<component_name>/ with .html, .css, .js files. Use Mako <%text> blocks for CSS/JS. |
| general | verified_workflows | Every single Python DSL workflow that works well must be copied into api/dsl_templates/ folder |
| general | zombie_worker_detection | If API shows unhealthy with pending tasks but workers show NOTIFY received without Claiming logs, workers are in zombie state. Root cause is usually asyncio/anyio corruption from mixed sync/async execution. Restart workers to fix immediately. See issues #452 and #453 for underlying bugs to fix. |
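The `EVENT_SCOPING_WORKFLOW_RUN_ID` entry calls for early validation in the Python event methods. A minimal sketch of that fail-fast pattern, where the function names mirror the entry but the signatures are illustrative, not the engine's real API:

```python
def _require_run_id(workflow_run_id: str | None) -> str:
    # Fail fast, before any SQL runs: global events are forbidden.
    if not workflow_run_id:
        raise ValueError(
            "events must be scoped by workflow_run_id; global events are forbidden"
        )
    return workflow_run_id


def emit_event(name: str, workflow_run_id: str | None, payload: dict) -> dict:
    run_id = _require_run_id(workflow_run_id)
    # ...a real implementation would insert the event row here...
    return {"event": name, "workflow_run_id": run_id, "payload": payload}


event = emit_event("job.done", "run-123", {"ok": True})
assert event["workflow_run_id"] == "run-123"

try:
    emit_event("job.done", None, {})
except ValueError:
    pass  # raised early, exactly as the entry requires
else:
    raise AssertionError("unscoped event was not rejected")
```

The same guard would sit at the top of `await_event`, `wait_for_event`, and `check_event`, so a missing run id surfaces as a `ValueError` long before the NOT NULL constraint or SQL exception fires.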
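The `api_idempotency` entry can be sketched with a result cache keyed by an idempotency key. The names and in-memory storage here are illustrative only; a real engine would persist the key alongside the result in the database:

```python
# Illustrative store of completed actions, keyed by idempotency key.
_completed: dict[str, dict] = {}


def perform_action(idempotency_key: str, payload: dict) -> dict:
    """Return the stored result on a repeat call instead of re-running
    the side effect, so a retry after a timeout is safe."""
    if idempotency_key in _completed:
        return _completed[idempotency_key]
    result = {"status": "created", "payload": payload}  # the one-time side effect
    _completed[idempotency_key] = result
    return result


first = perform_action("run-42:create-invoice", {"amount": 10})
second = perform_action("run-42:create-invoice", {"amount": 10})
assert first is second  # the second call performed no new side effect
```

The key should encode the caller's intent (for example, run id plus step name), so that a legitimate new action never collides with a retry of an old one.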
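The `audit_log_transaction_isolation` entry is about PostgreSQL, but the isolation idea can be demonstrated with stdlib sqlite3: write the failure audit event in its own transaction (here, in the rollback handler on a separate connection) so rolling back the main transaction cannot erase it.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "audit.db")

main = sqlite3.connect(path)
main.execute("CREATE TABLE orders (id INTEGER)")
main.execute("CREATE TABLE audit (event TEXT)")
main.commit()

# Separate connection = separate transaction for audit writes.
audit = sqlite3.connect(path)

try:
    main.execute("INSERT INTO orders (id) VALUES (1)")
    raise RuntimeError("simulated failure")
except RuntimeError:
    main.rollback()  # the business write is gone...
    # ...so log the failure in its own transaction, where the
    # rollback above cannot touch it.
    audit.execute("INSERT INTO audit (event) VALUES ('order_failed')")
    audit.commit()

assert main.execute("SELECT COUNT(*) FROM orders").fetchone() == (0,)
assert main.execute("SELECT event FROM audit").fetchall() == [("order_failed",)]
```

Had the audit insert gone through `main` inside the doomed transaction, the rollback would have deleted the failure event along with the order, which is exactly the gap the entry warns about.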
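The `logging-best-practice` entry can be verified directly: with lazy `%s` formatting a suppressed record never renders its arguments, while an f-string renders them immediately. A small self-contained demonstration:

```python
import logging

logger = logging.getLogger("workflow")
logger.setLevel(logging.INFO)  # DEBUG records are suppressed


class Expensive:
    """Counts how many times it is rendered to a string."""

    renders = 0

    def __str__(self) -> str:
        Expensive.renders += 1
        return "<expensive>"


value = Expensive()

# Lazy formatting: the suppressed DEBUG record is dropped before
# the argument is ever rendered.
logger.debug("payload %s", value)
assert Expensive.renders == 0

# An f-string renders the value immediately, regardless of level
# (this is the pattern the entry forbids):
logger.debug(f"payload {value}")
assert Expensive.renders == 1
```

The rendering cost is the visible half of the problem; the other half is that rendering a sensitive value at all, even into a discarded string, widens the blast radius if formatting or `__str__` has side effects.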
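The `sql_injection_prevention` entry prescribes `psycopg.sql.Identifier()`, as in `sql.SQL("SELECT * FROM {}").format(sql.Identifier(table))`. To keep this sketch dependency-free, here is the core idea that quoting relies on, in stdlib Python: escape and wrap the identifier so a hostile name can never terminate the quoting and become executable SQL.

```python
def quote_ident(name: str) -> str:
    """Minimal sketch of safe identifier quoting (psycopg.sql.Identifier
    does this plus encoding handling): double any embedded quotes and
    wrap the whole name in double quotes."""
    return '"' + name.replace('"', '""') + '"'


# A hostile "table name" trying to break out of the identifier:
table = 'events"; DROP TABLE users; --'
query = f"SELECT * FROM {quote_ident(table)}"

# The payload stays inside a strange but harmless quoted identifier:
assert query == 'SELECT * FROM "events""; DROP TABLE users; --"'
```

In production code, use `psycopg.sql` composition rather than a hand-rolled helper: it is the mechanism the entry mandates and it also handles connection encodings.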
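The `threadpool_executor_danger` entry, sketched: the `with` form calls `shutdown(wait=True)` on exit and can hang forever on a wedged task, whereas try/finally with `shutdown(wait=False, cancel_futures=True)` plus a timeout on `as_completed()` bounds both hazards (`cancel_futures` requires Python 3.9+).

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def work(n: int) -> int:
    time.sleep(0.01)
    return n * 2


# NOT `with ThreadPoolExecutor() as ex:` -- __exit__ would block
# indefinitely waiting for stuck tasks.
executor = ThreadPoolExecutor(max_workers=4)
results = []
try:
    futures = [executor.submit(work, n) for n in range(4)]
    # as_completed() MUST have a timeout, or one wedged task
    # blocks this loop forever.
    for fut in as_completed(futures, timeout=10):
        results.append(fut.result())
finally:
    # Non-blocking teardown: abandon running work, cancel queued work.
    executor.shutdown(wait=False, cancel_futures=True)

assert sorted(results) == [0, 2, 4, 6]
```

If the timeout fires, `as_completed` raises `concurrent.futures.TimeoutError`; the `finally` block still runs, so the pool is torn down without waiting on the stuck workers.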