
Glossary: in-flight context shapes

Quick reference for the context objects you'll encounter reading executor, worker, and pipeline code. Each entry covers what the shape carries, where it lives in the source tree, and when it is constructed vs. frozen.

Grouped thematically:

  • Job lifecycle types: JobContext, JobIdentity, JobParams, JobResult, ExecutorOutcome, AuditRun, pre-run config overlays
  • Workflow DAG types: WorkflowDAG, StageNode, DAGStage, Artifact, ArtifactStore, ArtifactNamespace
  • Shell + auth: Shell enum / bootShell
  • Errors: JobFailure

Job lifecycle types

JobContext

Source: src/hyrax/jobs/context.py

The single frozen Pydantic value object that captures every per-job configuration decision. Built once at worker job-claim entry by resolve_job_context and threaded into every downstream layer (executor → pipeline → agents) as PipelineContext.config. No callsite re-resolves config once a JobContext exists.

Carries:

  • Identity: job_id, tenant_id, tenant_schema, repo_name, commit_sha, workflow, job_type
  • Resolved executor / runner: executor (deploy mode: in-process | fargate), agent_runner (the fully-resolved runner name — params > tenant_config > env > default)
  • LLM knobs: model, effort, agent_max_tokens, agent_budget_usd
  • Tenant budgets: budgets (JobBudgets — the frozen ceiling rows for this tenant)
  • Repo overrides: test_command, build_command
  • Operational flags: auto_requeue_on_failure, include_virtual_patches, fingerprint_version, etc.

Resolution precedence (highest wins): per-job params → repo overrides → tenant_config KV → env vars → Python defaults. The model is frozen=True; every consumer reads ctx.foo, and any params.get("foo") inside an executor is blocked by a CI test.
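The precedence chain can be sketched as a first-match walk over layered mappings. This is a hypothetical helper for illustration (the name `resolve_layered` and the flat-dict layers are assumptions, not the real resolve_job_context signature):

```python
def resolve_layered(key, params, repo_overrides, tenant_config, env, defaults):
    """Return the value from the highest-precedence layer that defines `key`.

    Layers are ordered highest-precedence first, mirroring:
    per-job params > repo overrides > tenant_config KV > env vars > defaults.
    """
    for layer in (params, repo_overrides, tenant_config, env, defaults):
        if key in layer and layer[key] is not None:
            return layer[key]
    raise KeyError(f"no layer defines {key!r}")
```

A per-job param wins even when the tenant_config KV also sets the key; when params is silent, resolution falls through to the next layer.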

The helper method ctx.identity() returns a JobIdentity (see below). ctx.cache_key() returns the stable cache-bucket string.

JobIdentity

Source: src/hyrax/dag/identity.py

A frozen, hashable dataclass that names the subset of JobContext fields that make a job's result reusable. Used as the stage-cache key and as the canonical fold of the three independently-derived idempotency surfaces (submission dedup, forecast bucketing, DAG output caching).

Identifying fields: tenant_id, tenant_schema, repo_name, commit_sha, workflow, job_type, model, effort, agent_runner, agent_max_tokens, agent_budget_usd, include_virtual_patches, fingerprint_version.

Excluded (intentionally not in the key): job_id, executor, agent_timeout_s, test_command, build_command — these affect where or how long the job runs, not what it produces.

identity.cache_key() returns v3:<sha256>. The v3 prefix guards against collisions when the field set changes; prior prefix versions are invalidated automatically. Construct via JobContext.identity(), not directly.
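The shape of a versioned, frozen identity key can be sketched with a plain dataclass. The class name, field subset, and JSON canonicalization here are illustrative assumptions; only the `v3:<sha256>` key format comes from the source:

```python
import hashlib
import json
from dataclasses import asdict, dataclass

IDENTITY_VERSION = "v3"  # bumped whenever the identifying field set changes


@dataclass(frozen=True)
class JobIdentitySketch:
    # Hypothetical subset of the real identifying fields.
    tenant_id: str
    repo_name: str
    commit_sha: str
    workflow: str
    model: str

    def cache_key(self) -> str:
        # Canonical JSON (sorted keys) so field order can't change the digest.
        payload = json.dumps(asdict(self), sort_keys=True)
        digest = hashlib.sha256(payload.encode()).hexdigest()
        return f"{IDENTITY_VERSION}:{digest}"
```

Because the version prefix is part of the key, bumping IDENTITY_VERSION after a field-set change invalidates every previously cached entry without any explicit purge.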

JobParams

Source: src/hyrax/jobs/job_params.py

A discriminated union of per-workflow Pydantic models that types the jobs.params JSONB column. The workflow field is the discriminator (each variant pins a Literal value: "audit", "discover", …); every variant extends CanonicalJobParamsBase so the idempotency hash contract is preserved.

Lifecycle:

  1. Write side — API validates the incoming {workflow, params} body against the matching *Params subclass, then model_dump()s back to a JSON-shaped dict before persisting to the JSONB column.
  2. Read side — workflow run(ctx) entries call parse_job_params(<*Params>, params, ...) once at the entry point and use typed accessors (params.effort, params.pr_number, …). The CI gate tests/test_no_dict_get_in_workflows.py fails on any params.get(…) under src/hyrax/workflows/.

The 13 customer-facing verb names are audit, scan, discover, improve, fix, learn, revalidate, review, meta_review, task, publish, benchmark, ideate. scan / fix / meta_review were split out of the legacy audit(mode=scanner_only) / task(mode=ref_directed) / review(scope=meta) discriminators by the workflow-versioning rewrite W1+W2. W4 (2026-05-16) retired the legacy submission forms — typed *Params no longer declares mode / scope, and alembic v31 rejects the legacy shape at the DB layer. extra="allow" (inherited from the canonical base) keeps unknown forward-compat keys flowing through unchanged.

JobResult

Source: src/hyrax/jobs/job_result.py

A discriminated union of per-workflow Pydantic models that types the jobs.result JSONB column. Mirrors JobParams on the output side: every executor's run_*_job(ctx) → dict return value is persisted to this column; typed reads go through read_job_result(row) at src/hyrax/jobs/boundary.py.

JobResultBase (common base) carries the high-traffic fields all workflows stamp:

  • Cost / tokens: cost_usd, input_tokens, output_tokens, model, runner
  • Version stamps: hyrax_version, fingerprint_version
  • Snapshot state: snapshot, runner_content_hash
  • Observability: tools_run, skills_loaded, dedup_stats, snapshot_capture

extra="allow" is intentional: each workflow stamps additional observability fields (warnings, agent_costs, by_tool, audit_warnings, ran_groups, …); readers that only need the common fields use the base, while readers needing long-tail keys can model_dump(). frozen=False — the result is built incrementally by the workflow, then validated at the read seam.

ExecutorOutcome

Source: src/hyrax/jobs/result.py

The success-side return value from a JobExecutor.run() call. A frozen dataclass with three fields:

  • result_dict — the workflow's return payload (cost, tokens, findings summary, model/runner stamps), later persisted as tenant.jobs.result
  • outputs_dir — for executors that stage artifacts on disk (future Fargate task → S3 download path), the directory the worker reads after the task exits. InProcessExecutor returns None here — it writes directly to the per-tenant DB and has no on-disk staging.
  • duration_s — wall-clock seconds from spawn to exit

Used as the T in Result[ExecutorOutcome, JobFailure] at the worker → executor seam. Constructed by the executor; consumed by the worker's terminal-disposition path. The worker never sees a bare dict at this boundary — it always pattern-matches on Ok(ExecutorOutcome) vs Err(JobFailure).

AuditRun

Source: src/hyrax/workflows/observations/_run.py

A PipelineRun subclass specific to the audit workflow. Carries the mutable in-flight state for a single audit pipeline execution: the stage iterator, the accumulated findings list, the planner output, cost accumulators, and checkpoint state. Constructed at audit executor entry and passed down through the findings pipeline stages. Unlike JobContext (frozen at claim time) and JobResult (built then validated at return), AuditRun is the live write surface during execution — stages append findings, accumulate warnings, and update cost totals directly on it.

Pre-run config overlays

Source: src/hyrax/workflows/_shared/overlays.py

Two small frozen dataclasses — PendingPatchesOverlay and SnapshotPinOverlay — that group the fields each pre-run overlay writes onto a PipelineConfig-shaped object. Each carries an apply(config) method that stamps its fields onto the passed config and returns it. The audit / improve executors construct one of each (when applicable) and call .apply(config) inline at the executor entry — no Protocol, no registry, no composition harness.


Workflow DAG types

The shapes in this section live under src/hyrax/dag/ — the workflow-as-DAG framework that every workflow runs through today (the legacy hyrax.pipeline.Pipeline framework was deleted on 2026-05-04). The framework is content-addressed and pure-ish: same inputs (by Artifact.content_hash) plus same JobContext.cache_key() should produce the same outputs.

WorkflowDAG

Source: src/hyrax/dag/workflow.py

A declarative DAG of StageNodes with type-checked edges. Construction does cheap structural checks (duplicate names); validate() runs the expensive whole-graph checks — missing upstream references, cycles (Kahn's algorithm), and that the union of upstream output_types covers every downstream input_types. run() calls validate() first, then walks the DAG in topological order, threading each upstream's outputs into the downstream's flat inputs: Sequence[Artifact].

Self-loops are rejected by design — refine-style loops (fix-gauntlet's write→test→critique cycle) are implemented as in-stage iteration, not DAG self-loops, so the DAG topology stays static across max_iterations knobs and checkpoint reloads. Constructed at workflow-build time by each workflow's pipeline-builder; consumed by run_workflow_with_events (src/hyrax/dag/events.py), which adds the cross-cutting concerns the audit pipeline used to own (event emit, checkpoint, cancel, write-scope gate, partial cost).
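The validate-then-walk order can be sketched with Kahn's algorithm over a {name: upstream_names} mapping. This is a minimal standalone sketch, not the real WorkflowDAG API:

```python
from collections import deque


def topo_order(nodes: dict[str, tuple[str, ...]]) -> list[str]:
    """Kahn's algorithm over {name: upstream_names}.

    Rejects missing upstream references and self-loops up front,
    then detects cycles by checking that every node was emitted.
    """
    for name, ups in nodes.items():
        for up in ups:
            if up == name:
                raise ValueError(f"{name}: self-loops are rejected")
            if up not in nodes:
                raise ValueError(f"{name}: missing upstream {up!r}")

    indegree = {name: len(ups) for name, ups in nodes.items()}
    downstream: dict[str, list[str]] = {name: [] for name in nodes}
    for name, ups in nodes.items():
        for up in ups:
            downstream[up].append(name)

    ready = deque(n for n, d in indegree.items() if d == 0)  # source nodes
    order: list[str] = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for d in downstream[n]:
            indegree[d] -= 1
            if indegree[d] == 0:
                ready.append(d)

    if len(order) != len(nodes):
        raise ValueError("cycle detected")
    return order
```

Any node with an empty upstream tuple is a source and enters the queue immediately, which matches the StageNode source-node convention below.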

StageNode

Source: src/hyrax/dag/workflow.py

One node in a WorkflowDAG: a frozen dataclass of (name, stage, upstream). name must be unique within the DAG; upstream lists the names of nodes whose outputs flow into this stage's inputs. An empty upstream tuple marks a source node — it generates from JobContext and / or seed artifacts passed to run_workflow_with_events(seeds=…). Constructed inline in workflow pipeline-builders; consumed only by WorkflowDAG.

DAGStage

Source: src/hyrax/dag/stage.py

A runtime_checkable Protocol for a stage that consumes typed Artifacts and produces typed Artifacts. Carries name, input_types: tuple[str, ...], output_types: tuple[str, ...], and run(inputs, ctx, store) -> Sequence[Artifact]. The DAG validator checks that upstream output_types cover the stage's input_types; emitting an Artifact with a type_tag outside output_types is a runtime error so a stage can't smuggle types past the validator. Optional class-level capability flags with safe defaults: cacheable (skip on cache hit), requires_writable (gates the stage on WritableNamespace — skipped with read_only_scope reason on a ReadOnlyNamespace), critical (when False, failure logged and DAG continues), tool_name (per-call telemetry stamp).
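A structural sketch of the Protocol and the validator's edge check. The Protocol members mirror the source; the check_edge helper and PlanStage class are hypothetical:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class DAGStageSketch(Protocol):
    name: str
    input_types: tuple[str, ...]
    output_types: tuple[str, ...]

    def run(self, inputs, ctx, store): ...


def check_edge(upstream_outputs: tuple[str, ...], stage) -> None:
    """Hypothetical edge check: every declared input must be covered
    by the union of upstream output_types."""
    missing = set(stage.input_types) - set(upstream_outputs)
    if missing:
        raise TypeError(f"{stage.name}: uncovered inputs {sorted(missing)}")


class PlanStage:
    """Satisfies the Protocol structurally: no inheritance needed."""
    name = "plan"
    input_types: tuple[str, ...] = ("repo_data",)
    output_types: tuple[str, ...] = ("plan",)
    cacheable = True           # safe-default capability flags
    requires_writable = False

    def run(self, inputs, ctx, store):
        return []
```

Because the Protocol is runtime_checkable, isinstance works against any class with the right members, so stages stay decoupled from the framework module.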

Artifact

Source: src/hyrax/dag/artifact.py

A frozen dataclass (type_tag, content_hash, metadata) — a content-addressed reference to a payload that lives elsewhere in an ArtifactStore. Identity is the pair (type_tag, content_hash); two artifacts with the same identity are interchangeable, which is what makes the stage-output cache safe. content_hash is a stable SHA-256 of the payload via compute_content_hash (supports bytes, str, Pydantic BaseModel via model_dump(mode="json"), and any orjson-serializable value with sorted keys); the BaseModel-vs-dict paths intentionally collapse to the same hash so a producer emitting RepoData(BaseModel) and a consumer reading the same shape as a dict get identical cache keys. metadata is an immutable MappingProxyType for non-identifying annotations (timestamps, source stage, schema version) — explicitly excluded from content_hash so wall-clock fields don't poison cache identity. Constructed by ArtifactStore.put; consumed by downstream DAGStage.run calls.
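The identity rules can be sketched with stdlib tools: canonical-JSON hashing with sorted keys (so a BaseModel dumped to a dict and the dict itself collapse to one hash), plus compare=False to keep metadata out of equality. Function and class names here are hypothetical stand-ins for compute_content_hash / Artifact:

```python
import hashlib
import json
from dataclasses import dataclass, field


def compute_content_hash_sketch(payload) -> str:
    """Stable sha256: bytes and str hash their raw text; everything else
    goes through canonical JSON with sorted keys, so dict key order
    (or an equivalent model dump) cannot change the digest."""
    if isinstance(payload, bytes):
        blob = payload
    elif isinstance(payload, str):
        blob = payload.encode()
    else:
        blob = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()


@dataclass(frozen=True)
class ArtifactSketch:
    type_tag: str
    content_hash: str
    # compare=False excludes metadata from __eq__ and __hash__, mirroring
    # "identity is the pair (type_tag, content_hash)".
    metadata: dict = field(default_factory=dict, compare=False)
```

Two artifacts with the same tag and hash compare equal even when their timestamps differ, which is what makes the stage-output cache safe to key on them.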

ArtifactStore

Source: src/hyrax/dag/artifact.py

A runtime_checkable Protocol for the read/write surface backing artifact payloads. Two-method contract: put(*, type_tag, payload, metadata) -> Artifact lands a payload (computing the content_hash and deduping on it) and returns the reference; get(artifact) -> Any resolves a reference back to its payload. has(artifact) is convenience for cache hit-tests. Two implementations ship: InMemoryArtifactStore (dict-backed, used by tests + bare contexts; first-writer-wins on the underlying object held) and PostgresArtifactStore (backed by tenant-schema dag_artifacts table from the tenant v00 baseline, also exposing a per-(job, stage, namespace) checkpoint surface for resume-after-crash via has_stage / load_stage / save_stage / list_for_job). default_artifact_store(ctx) picks the right backend by ctx.tenant_schema. Stages depend on the Protocol, not the implementation.
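A minimal dict-backed sketch of the two-method contract, in the spirit of InMemoryArtifactStore. The Ref dataclass and first-writer-wins hashing here are illustrative, not the real implementation:

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for the Artifact reference the store returns."""
    type_tag: str
    content_hash: str


class InMemoryStoreSketch:
    def __init__(self):
        self._payloads: dict[Ref, object] = {}

    def put(self, *, type_tag: str, payload, metadata=None) -> Ref:
        digest = hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode()
        ).hexdigest()
        ref = Ref(type_tag, digest)
        # Dedupe on content hash: first writer wins, later puts are no-ops.
        self._payloads.setdefault(ref, payload)
        return ref

    def get(self, ref: Ref):
        return self._payloads[ref]

    def has(self, ref: Ref) -> bool:
        return ref in self._payloads
```

Because put computes the hash itself, two stages landing the same payload get the same reference back, and the second write costs nothing.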

ArtifactNamespace

Source: src/hyrax/dag/artifact_namespace.py

A phantom-typed chokepoint for canonical-table writes. Abstract base with two @final concrete subclasses: WritableNamespace (writes permitted; run(label, fn, *args, **kwargs) invokes fn and returns T) and ReadOnlyNamespace (writes suppressed; run logs a structured artifact_namespace.skipped event with label + reason and returns None without invoking fn). Snapshot runs (params.snapshot=true) build a ReadOnlyNamespace; everything else builds a WritableNamespace. Stages declare their requirement via requires_writable: ClassVar[bool] = True on the stage class — Pipeline.run (and the DAG runtime) skip such stages on a read-only scope with a read_only_scope reason, so any new client.<write> call site inside such a stage is unreachable on the snapshot path. The phantom-type discrimination shows up in return types: WritableNamespace.run returns strict T; the abstract base / ReadOnlyNamespace return T | None. Constructed via ArtifactNamespace.for_run(snapshot=...) (the canonical entry point) or ArtifactNamespace.from_params(params); immutable after construction (__setattr__ raises post-__init__).
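The write-suppression behaviour can be sketched in a few lines. The class and factory names are hypothetical echoes of the real WritableNamespace / ReadOnlyNamespace / for_run:

```python
class WritableNamespaceSketch:
    def run(self, label, fn, *args, **kwargs):
        # Writes permitted: invoke fn and return its result (strict T).
        return fn(*args, **kwargs)


class ReadOnlyNamespaceSketch:
    def run(self, label, fn, *args, **kwargs):
        # Writes suppressed: the real class logs a structured
        # artifact_namespace.skipped event with label + reason here.
        return None


def for_run_sketch(*, snapshot: bool):
    """Snapshot runs get the read-only scope; everything else can write."""
    return ReadOnlyNamespaceSketch() if snapshot else WritableNamespaceSketch()
```

The payoff is at call sites: a stage wraps every canonical-table write in ns.run("label", fn, ...), so the snapshot path never invokes fn at all rather than relying on each write site to check a flag.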


Shell + auth

Shell enum / bootShell

Source: apps/web/src/shell/ (frontend); apps/api/app/shell.py (backend mirror)

The SPA ships as a single Vite build that boots into one of two visual shells depending on the request hostname. The Shell enum is three-valued ('admin' | 'tenant' | 'iru') — iru is a host identifier that renders the same TenantApp component as tenant (no visual reskin), but the backend pins it to a single specific tenant:

| Shell value | Renders | Hostname pattern | Allowed tenants |
| --- | --- | --- | --- |
| "admin" | AdminApp | admin.* (e.g. admin.get-hydra.dev, admin.localhost) | Operator session (OperatorActor resolved from public.operators; no tenant chain) |
| "iru" | TenantApp | hydra.app.iru.dev / iru.* (e.g. iru.localhost) | slug == iru_host_pinned_slug() (kandji-inc dogfood pin) |
| "tenant" | TenantApp | anything else (default) | tier IN ('customer', 'internal') AND slug != iru_host_pinned_slug() |

iru_host_pinned_slug() lives in src/hyrax/db/tenants/core.py — a dynamic + process-cached resolver keyed on the HYRAX_IRU_GITHUB_ORG_ID env var (GitHub's stable numeric account id). The function returns the kandji-inc dogfood tenant's post-v14 hex slug; cache invalidation rides on the existing tenant_change NOTIFY listener so a purge + re-install of kandji-inc converges without a redeploy. Empty-string return (no env, or env set but no live matching tenant) fails closed: iru host 403s, tenant host excludes nothing. Re-exported from apps/api/app/shell.py so the gate site reads it from the natural place.

Shell is the discriminator string. The shell is resolved once at boot from resolveShellFromHostname(window.location.hostname) (apps/web/src/shell/resolveShell.ts — pure, unit-tested) and exported as the module-level const bootShell from apps/web/src/shell/current.ts. Components import bootShell directly — no Provider/Context indirection, since the value is fixed for the lifetime of the page.

The backend mirror resolve_shell_from_hostname in apps/api/app/shell.py plus ShellHostnameMiddleware (apps/api/app/middleware.py) — which reads the X-Forwarded-Host (or Host) header and stamps request.state.shell — must stay in sync with the frontend resolver via the shared apps/web/src/shell/shell_parity_cases.json fixture. require_shell_allows_tenant in apps/api/app/deps.py is the strict shell↔tenant gate for tenant-bearing routes — iru host → slug == iru_host_pinned_slug(); tenant host → tier IN ('customer','internal') AND slug != iru_host_pinned_slug(). (The admin host has no tenant to gate against; operator routes mount through create_operator_router, which wires require_admin_host instead.) A mismatch returns HTTP 403. The iru-pinned tenant is reachable only via the iru host — the tenant-host symmetric defence refuses it.
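A sketch of the backend-side hostname resolution, assuming the patterns in the table above. The exact matching rules are defined by the shared parity fixture, not by this illustrative function:

```python
def resolve_shell_sketch(hostname: str) -> str:
    """Hypothetical mirror of resolve_shell_from_hostname.

    Assumes: admin.* hosts map to "admin", the pinned iru hosts map to
    "iru", and everything else falls through to the default "tenant".
    """
    host = hostname.lower().split(":")[0]  # strip any port
    if host == "admin" or host.startswith("admin."):
        return "admin"
    if host in ("hydra.app.iru.dev", "iru") or host.startswith("iru."):
        return "iru"
    return "tenant"
```

Keeping "tenant" as the fall-through arm matches the table's "anything else (default)" row; new hostnames degrade to the broadest shell rather than 500ing.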


Errors

JobFailure

Source: src/hyrax/jobs/result.py

A Pydantic discriminated union of every failure mode an executor can return. The kind discriminator field routes the worker's match statement. Every variant extends _JobFailureBase, which carries:

  • message — human-readable string surfaced in jobs.error and log lines
  • outputs_dir, partial_result_dict — forensic state so the worker can ingest partial JSONL ledger writes regardless of which failure variant fired
  • duration_s — wall-clock duration

Concrete variants: BudgetExceededFailure, CancelledFailure, TimedOutFailure, ReapedFailure, ExecutorRaisedFailure, BadInputsFailure, NonZeroExitFailure. The naming convention uses a Failure suffix to distinguish variants from the matching exception classes in hyrax.job_exceptions (e.g. CancelledFailure vs JobCancelled) — exceptions are internal control flow inside an executor; JobFailure variants are the typed return value at the seam boundary.