Glossary: in-flight context shapes
Quick reference for the context objects you'll encounter reading executor, worker, and pipeline code. Each entry covers what the shape carries, where it lives in the source tree, and when it is constructed vs. frozen.
Grouped thematically:
- Job lifecycle types — JobContext, JobIdentity, JobParams, JobResult, ExecutorOutcome, AuditRun, pre-run config overlays
- Workflow DAG types — WorkflowDAG, StageNode, DAGStage, Artifact, ArtifactStore, ArtifactNamespace
- Shell + auth — Shell enum / bootShell
- Errors — JobFailure
Job lifecycle types
JobContext
Source: src/hyrax/jobs/context.py
The single frozen Pydantic value object that captures every per-job configuration decision. Built once at worker job-claim entry by resolve_job_context and threaded into every downstream layer (executor → pipeline → agents) as PipelineContext.config. No callsite re-resolves config once a JobContext exists.
Carries:
- Identity — job_id, tenant_id, tenant_schema, repo_name, commit_sha, workflow, job_type
- Resolved executor / runner — executor (deploy mode: in-process | fargate), agent_runner (the fully-resolved runner name — params > tenant_config > env > default)
- LLM knobs — model, effort, agent_max_tokens, agent_budget_usd
- Tenant budgets — budgets (JobBudgets — the frozen ceiling rows for this tenant)
- Repo overrides — test_command, build_command
- Operational flags — auto_requeue_on_failure, include_virtual_patches, fingerprint_version, etc.
Resolution precedence (highest wins): per-job params → repo overrides → tenant_config KV → env vars → Python defaults. The model is frozen=True; every consumer reads ctx.foo — params.get("foo") inside executors is blocked by a CI test.
The helper method ctx.identity() returns a JobIdentity (see below). ctx.cache_key() returns the stable cache-bucket string.
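The precedence chain is the part most often misread, so here is a minimal sketch of the highest-wins walk. resolve_field, the layer mappings, and the HYRAX_* env naming are hypothetical illustrations, not the real resolve_job_context internals:

```python
import os
from typing import Any, Mapping

_DEFAULTS: dict[str, Any] = {"effort": "medium", "agent_max_tokens": 64_000}

def resolve_field(
    name: str,
    params: Mapping[str, Any],
    repo_overrides: Mapping[str, Any],
    tenant_config: Mapping[str, Any],
) -> Any:
    """Return the first layer that sets `name`, walking highest-wins order."""
    for layer in (params, repo_overrides, tenant_config):
        if layer.get(name) is not None:
            return layer[name]
    env_val = os.environ.get(f"HYRAX_{name.upper()}")  # env naming is a guess
    if env_val is not None:
        return env_val
    return _DEFAULTS[name]  # Python defaults are the floor
```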
JobIdentity
Source: src/hyrax/dag/identity.py
A frozen, hashable dataclass that names the subset of JobContext fields that make a job's result reusable. Used as the stage-cache key and as the canonical fold of the three independently-derived idempotency surfaces (submission dedup, forecast bucketing, DAG output caching).
Identifying fields: tenant_id, tenant_schema, repo_name, commit_sha, workflow, job_type, model, effort, agent_runner, agent_max_tokens, agent_budget_usd, include_virtual_patches, fingerprint_version.
Excluded (intentionally not in the key): job_id, executor, agent_timeout_s, test_command, build_command — these affect where or how long the job runs, not what it produces.
identity.cache_key() returns v3:<sha256>. The v3 prefix guards against collisions when the field set changes; prior prefix versions are invalidated automatically. Construct via JobContext.identity(), not directly.
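A minimal sketch of the fold, assuming the digest is taken over a sorted-key JSON dump of the identifying fields (only the field list and the v3:<sha256> output shape are given by the source):

```python
import hashlib
import json
from dataclasses import asdict, dataclass

@dataclass(frozen=True)  # frozen + eq makes instances hashable
class JobIdentity:
    tenant_id: str
    tenant_schema: str
    repo_name: str
    commit_sha: str
    workflow: str
    job_type: str
    model: str
    effort: str
    agent_runner: str
    agent_max_tokens: int
    agent_budget_usd: float
    include_virtual_patches: bool
    fingerprint_version: int

    def cache_key(self) -> str:
        # Stable digest over the identifying fields only; job_id, executor,
        # timeouts, and repo commands are deliberately excluded.
        payload = json.dumps(asdict(self), sort_keys=True)
        return "v3:" + hashlib.sha256(payload.encode()).hexdigest()
```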
JobParams
Source: src/hyrax/jobs/job_params.py
A discriminated union of per-workflow Pydantic models that types the jobs.params JSONB column. The workflow field is the discriminator (each variant pins it to its own Literal["audit"], Literal["discover"], …); every variant extends CanonicalJobParamsBase so the idempotency hash contract is preserved.
Lifecycle:
- Write side — the API validates the incoming {workflow, params} body against the matching *Params subclass, then model_dump()s back to a JSON-shaped dict before persisting to the JSONB column.
- Read side — workflow run(ctx) entry points call parse_job_params(<*Params>, params, ...) once at entry and use typed accessors (params.effort, params.pr_number, …). The CI gate tests/test_no_dict_get_in_workflows.py fails on any params.get(…) under src/hyrax/workflows/.
The 13 customer-facing verb names are audit, scan, discover, improve, fix, learn, revalidate, review, meta_review, task, publish, benchmark, ideate. (scan / fix / meta_review were split out of the legacy audit(mode=scanner_only) / task(mode=ref_directed) / review(scope=meta) discriminators by the workflow-versioning rewrite W1+W2; W4 (2026-05-16) retired the legacy submission forms — the typed *Params no longer declare mode / scope, and alembic v31 rejects the legacy shape at the DB layer.) extra="allow" (inherited from the canonical base) keeps unknown forward-compat keys flowing through unchanged.
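A minimal sketch of the discriminated-union pattern, with two illustrative variants (the per-variant fields shown here are assumptions; only the workflow discriminator and the extra="allow" base come from the source):

```python
from typing import Annotated, Literal, Union
from pydantic import BaseModel, ConfigDict, Field, TypeAdapter

class CanonicalJobParamsBase(BaseModel):
    model_config = ConfigDict(extra="allow")  # forward-compat keys flow through

class AuditParams(CanonicalJobParamsBase):
    workflow: Literal["audit"]
    effort: str = "medium"          # illustrative per-variant field

class DiscoverParams(CanonicalJobParamsBase):
    workflow: Literal["discover"]
    max_candidates: int = 10        # illustrative per-variant field

JobParams = Annotated[
    Union[AuditParams, DiscoverParams],  # ... one variant per verb
    Field(discriminator="workflow"),
]

# Read side: one parse at the workflow entry point, typed accessors after
params = TypeAdapter(JobParams).validate_python({"workflow": "audit", "effort": "high"})
assert isinstance(params, AuditParams) and params.effort == "high"
```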
JobResult
Source: src/hyrax/jobs/job_result.py
A discriminated union of per-workflow Pydantic models that types the jobs.result JSONB column. Mirrors JobParams on the output side: every executor's run_*_job(ctx) → dict return value is persisted to this column; typed reads go through read_job_result(row) at src/hyrax/jobs/boundary.py.
JobResultBase (common base) carries the high-traffic fields all workflows stamp:
- Cost / tokens — cost_usd, input_tokens, output_tokens, model, runner
- Version stamps — hyrax_version, fingerprint_version
- Snapshot state — snapshot, runner_content_hash
- Observability — tools_run, skills_loaded, dedup_stats, snapshot_capture
extra="allow" is intentional: each workflow stamps additional observability fields (warnings, agent_costs, by_tool, audit_warnings, ran_groups, …); readers that only need the common fields use the base, while readers needing long-tail keys can model_dump(). frozen=False — the result is built incrementally by the workflow, then validated at the read seam.
ExecutorOutcome
Source: src/hyrax/jobs/result.py
The success-side return value from a JobExecutor.run() call. A frozen dataclass with three fields:
- result_dict — the workflow's return payload (cost, tokens, findings summary, model/runner stamps), later persisted as tenant.jobs.result
- outputs_dir — for executors that stage artifacts on disk (future Fargate task → S3 download path), the directory the worker reads after the task exits. InProcessExecutor returns None here — it writes directly to the per-tenant DB and has no on-disk staging.
- duration_s — wall-clock seconds from spawn to exit
Used as the T in Result[ExecutorOutcome, JobFailure] at the worker → executor seam. Constructed by the executor; consumed by the worker's terminal-disposition path. The worker never sees a bare dict at this boundary — it always pattern-matches on Ok(ExecutorOutcome) vs Err(JobFailure).
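A minimal sketch of the seam, with stand-in Ok/Err containers (whatever Result type the codebase actually uses, the worker-side match has this shape):

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")
E = TypeVar("E")

@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T

@dataclass(frozen=True)
class Err(Generic[E]):
    error: E

@dataclass(frozen=True)
class ExecutorOutcome:
    result_dict: dict
    outputs_dir: str | None   # None for InProcessExecutor
    duration_s: float

def persist_result(result_dict: dict) -> None:
    """Stand-in for the worker's success path (writes tenant.jobs.result)."""

def record_failure(failure: object) -> None:
    """Stand-in for the worker's terminal-disposition path."""

def handle(outcome: Ok[ExecutorOutcome] | Err[object]) -> None:
    # The worker never sees a bare dict here: always Ok(...) vs Err(...)
    match outcome:
        case Ok(value=out):
            persist_result(out.result_dict)
        case Err(error=failure):
            record_failure(failure)
```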
AuditRun
Source: src/hyrax/workflows/observations/_run.py
A PipelineRun subclass specific to the audit workflow. Carries the mutable in-flight state for a single audit pipeline execution: the stage iterator, the accumulated findings list, the planner output, cost accumulators, and checkpoint state. Constructed at audit executor entry and passed down through the findings pipeline stages. Unlike JobContext (frozen at claim time) and JobResult (built then validated at return), AuditRun is the live write surface during execution — stages append findings, accumulate warnings, and update cost totals directly on it.
Pre-run config overlays
Source: src/hyrax/workflows/_shared/overlays.py
Two small frozen dataclasses — PendingPatchesOverlay and SnapshotPinOverlay — that group the fields each pre-run overlay writes onto a PipelineConfig-shaped object. Each carries an apply(config) method that stamps its fields onto the passed config and returns it. The audit / improve executors construct one of each (when applicable) and call .apply(config) inline at the executor entry — no Protocol, no registry, no composition harness.
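A minimal sketch of the overlay contract; PendingPatchesOverlay's field names here are illustrative:

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class PendingPatchesOverlay:
    include_virtual_patches: bool          # field names are illustrative
    pending_patch_ids: tuple[str, ...]

    def apply(self, config: Any) -> Any:
        # Stamp this overlay's fields onto the passed config and hand it back
        config.include_virtual_patches = self.include_virtual_patches
        config.pending_patch_ids = self.pending_patch_ids
        return config

# Executor entry, inline (no Protocol, no registry):
#   config = PendingPatchesOverlay(True, ("patch-1",)).apply(config)
```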
Workflow DAG types
The shapes in this section live under src/hyrax/dag/ — the workflow-as-DAG framework that every workflow runs through today (the legacy hyrax.pipeline.Pipeline framework was deleted on 2026-05-04). The framework is content-addressed and pure-ish: same inputs (by Artifact.content_hash) plus same JobContext.cache_key() should produce the same outputs.
WorkflowDAG
Source: src/hyrax/dag/workflow.py
A declarative DAG of StageNodes with type-checked edges. Construction does cheap structural checks (duplicate names); validate() runs the expensive whole-graph checks — missing upstream references, cycles (Kahn's algorithm), and that the union of upstream output_types covers every downstream input_types. run() calls validate() first, then walks the DAG in topological order, threading each upstream's outputs into the downstream's flat inputs: Sequence[Artifact]. Self-loops are rejected by design — refine-style loops (fix-gauntlet's write→test→critique cycle) are implemented as in-stage iteration, not DAG self-loops, so the DAG topology stays static across max_iterations knobs and checkpoint reloads. Constructed at build time, returned from each workflow's pipeline-builder; consumed by run_workflow_with_events (src/hyrax/dag/events.py), which adds the cross-cutting concerns the audit pipeline used to own (event emit, checkpoint, cancel, write-scope gate, partial cost).
StageNode
Source: src/hyrax/dag/workflow.py
One node in a WorkflowDAG: a frozen dataclass of (name, stage, upstream). name must be unique within the DAG; upstream lists the names of nodes whose outputs flow into this stage's inputs. An empty upstream tuple marks a source node — it generates from JobContext and / or seed artifacts passed to run_workflow_with_events(seeds=…). Constructed inline in workflow pipeline-builders; consumed only by WorkflowDAG.
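A compressed sketch of StageNode and the whole-graph checks validate() runs; the real src/hyrax/dag/workflow.py will differ in detail:

```python
from collections import deque
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class StageNode:
    name: str
    stage: Any                       # a DAGStage (see the next entry)
    upstream: tuple[str, ...] = ()   # empty tuple marks a source node

def validate(nodes: list[StageNode]) -> None:
    by_name = {n.name: n for n in nodes}
    # Missing upstream references
    for n in nodes:
        for up in n.upstream:
            if up not in by_name:
                raise ValueError(f"{n.name}: unknown upstream {up!r}")
    # Type coverage: union of upstream output_types must cover input_types
    for n in nodes:
        provided = {t for up in n.upstream for t in by_name[up].stage.output_types}
        missing = set(n.stage.input_types) - provided
        if n.upstream and missing:
            raise ValueError(f"{n.name}: inputs not covered upstream: {missing}")
    # Cycle check (Kahn): peel zero-indegree nodes; leftovers mean a cycle.
    # A self-loop never reaches indegree 0, so it is rejected here too.
    indegree = {n.name: len(n.upstream) for n in nodes}
    downstream: dict[str, list[str]] = {n.name: [] for n in nodes}
    for n in nodes:
        for up in n.upstream:
            downstream[up].append(n.name)
    queue = deque(name for name, deg in indegree.items() if deg == 0)
    seen = 0
    while queue:
        cur = queue.popleft()
        seen += 1
        for nxt in downstream[cur]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if seen != len(nodes):
        raise ValueError("cycle detected in workflow DAG")
```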
DAGStage
Source: src/hyrax/dag/stage.py
A runtime_checkable Protocol for a stage that consumes typed Artifacts and produces typed Artifacts. Carries name, input_types: tuple[str, ...], output_types: tuple[str, ...], and run(inputs, ctx, store) -> Sequence[Artifact]. The DAG validator checks that upstream output_types cover the stage's input_types; emitting an Artifact with a type_tag outside output_types is a runtime error so a stage can't smuggle types past the validator. Optional class-level capability flags with safe defaults: cacheable (skip on cache hit), requires_writable (gates the stage on WritableNamespace — skipped with read_only_scope reason on a ReadOnlyNamespace), critical (when False, failure logged and DAG continues), tool_name (per-call telemetry stamp).
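A minimal sketch of the Protocol surface as this entry describes it (note that runtime_checkable isinstance checks only verify attribute presence, not types):

```python
from typing import Any, ClassVar, Protocol, Sequence, runtime_checkable

@runtime_checkable
class DAGStage(Protocol):
    name: str
    input_types: tuple[str, ...]
    output_types: tuple[str, ...]

    # Capability flags: class-level on implementations, with safe defaults
    cacheable: ClassVar[bool]
    requires_writable: ClassVar[bool]
    critical: ClassVar[bool]
    tool_name: ClassVar[str | None]

    def run(self, inputs: Sequence[Any], ctx: Any, store: Any) -> Sequence[Any]:
        ...
```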
Artifact
Source: src/hyrax/dag/artifact.py
A frozen dataclass (type_tag, content_hash, metadata) — a content-addressed reference to a payload that lives elsewhere in an ArtifactStore. Identity is the pair (type_tag, content_hash); two artifacts with the same identity are interchangeable, which is what makes the stage-output cache safe. content_hash is a stable SHA-256 of the payload via compute_content_hash (supports bytes, str, Pydantic BaseModel via model_dump(mode="json"), and any orjson-serializable value with sorted keys); the BaseModel-vs-dict paths intentionally collapse to the same hash so a producer emitting RepoData(BaseModel) and a consumer reading the same shape as a dict get identical cache keys. metadata is an immutable MappingProxyType for non-identifying annotations (timestamps, source stage, schema version) — explicitly excluded from content_hash so wall-clock fields don't poison cache identity. Constructed by ArtifactStore.put; consumed by downstream DAGStage.run calls.
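A minimal sketch of the hash contract, assuming orjson's OPT_SORT_KEYS provides the sorted-key serialization; the point is the BaseModel-vs-dict collapse:

```python
import hashlib

import orjson
from pydantic import BaseModel

def compute_content_hash(payload: object) -> str:
    if isinstance(payload, bytes):
        raw = payload
    elif isinstance(payload, str):
        raw = payload.encode()
    elif isinstance(payload, BaseModel):
        # Dump to a JSON-shaped dict first so a BaseModel hashes identically
        # to a plain dict of the same shape
        raw = orjson.dumps(payload.model_dump(mode="json"), option=orjson.OPT_SORT_KEYS)
    else:
        raw = orjson.dumps(payload, option=orjson.OPT_SORT_KEYS)
    return hashlib.sha256(raw).hexdigest()

class RepoData(BaseModel):
    name: str
    stars: int

# Producer emits a BaseModel, consumer reads a dict: identical cache keys
assert compute_content_hash(RepoData(name="hyrax", stars=3)) == \
       compute_content_hash({"name": "hyrax", "stars": 3})
```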
ArtifactStore
Source: src/hyrax/dag/artifact.py
A runtime_checkable Protocol for the read/write surface backing artifact payloads. Two-method contract: put(*, type_tag, payload, metadata) -> Artifact lands a payload (computing the content_hash and deduping on it) and returns the reference; get(artifact) -> Any resolves a reference back to its payload. has(artifact) is convenience for cache hit-tests. Two implementations ship: InMemoryArtifactStore (dict-backed, used by tests + bare contexts; first-writer-wins on the underlying payload object) and PostgresArtifactStore (backed by the tenant-schema dag_artifacts table from the tenant v00 baseline, also exposing a per-(job, stage, namespace) checkpoint surface for resume-after-crash via has_stage / load_stage / save_stage / list_for_job). default_artifact_store(ctx) picks the right backend by ctx.tenant_schema. Stages depend on the Protocol, not the implementation.
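A minimal sketch of the contract via the dict-backed implementation (the content-hash helper here is a simplified stand-in for the real compute_content_hash):

```python
import hashlib
import json
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Any, Mapping

def _content_hash(payload: Any) -> str:
    # Simplified stand-in for compute_content_hash (see the Artifact entry)
    return hashlib.sha256(
        json.dumps(payload, sort_keys=True, default=str).encode()
    ).hexdigest()

@dataclass(frozen=True)
class Artifact:
    type_tag: str
    content_hash: str
    metadata: Mapping[str, Any] = field(default_factory=dict, compare=False)

class InMemoryArtifactStore:
    """Dict-backed store: first writer wins on the payload object held."""

    def __init__(self) -> None:
        self._payloads: dict[tuple[str, str], Any] = {}

    def put(self, *, type_tag: str, payload: Any,
            metadata: Mapping[str, Any] | None = None) -> Artifact:
        h = _content_hash(payload)
        self._payloads.setdefault((type_tag, h), payload)  # dedupe on identity
        return Artifact(type_tag, h, MappingProxyType(dict(metadata or {})))

    def get(self, artifact: Artifact) -> Any:
        return self._payloads[(artifact.type_tag, artifact.content_hash)]

    def has(self, artifact: Artifact) -> bool:
        return (artifact.type_tag, artifact.content_hash) in self._payloads
```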
ArtifactNamespace
Source: src/hyrax/dag/artifact_namespace.py
A phantom-typed chokepoint for canonical-table writes. Abstract base with two @final concrete subclasses: WritableNamespace (writes permitted; run(label, fn, *args, **kwargs) invokes fn and returns T) and ReadOnlyNamespace (writes suppressed; run logs a structured artifact_namespace.skipped event with label + reason and returns None without invoking fn). Snapshot runs (params.snapshot=true) build a ReadOnlyNamespace; everything else builds a WritableNamespace. Stages declare their requirement via requires_writable: ClassVar[bool] = True on the stage class — Pipeline.run (and the DAG runtime) skip such stages on a read-only scope with a read_only_scope reason, so any new client.<write> call site inside such a stage is unreachable on the snapshot path. The phantom-type discrimination shows up in return types: WritableNamespace.run returns strict T; the abstract base / ReadOnlyNamespace return T | None. Constructed via ArtifactNamespace.for_run(snapshot=...) (the canonical entry point) or ArtifactNamespace.from_params(params); immutable after construction (__setattr__ raises post-__init__).
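A minimal sketch of the run-or-skip split; the overload machinery and the __setattr__ guard are elided, and the logging call is illustrative:

```python
import logging
from typing import Callable, TypeVar, final

T = TypeVar("T")
log = logging.getLogger("hyrax.dag")

class ArtifactNamespace:
    @staticmethod
    def for_run(*, snapshot: bool) -> "ArtifactNamespace":
        # Snapshot runs get the write-suppressing scope; everything else writes
        return ReadOnlyNamespace() if snapshot else WritableNamespace()

    def run(self, label: str, fn: Callable[..., T], *args: object, **kwargs: object) -> T | None:
        raise NotImplementedError

@final
class WritableNamespace(ArtifactNamespace):
    def run(self, label: str, fn: Callable[..., T], *args: object, **kwargs: object) -> T:
        return fn(*args, **kwargs)  # writes permitted: returns strict T

@final
class ReadOnlyNamespace(ArtifactNamespace):
    def run(self, label: str, fn: Callable[..., T], *args: object, **kwargs: object) -> None:
        # Suppress the write; emit the structured skip event instead of calling fn
        log.info("artifact_namespace.skipped",
                 extra={"label": label, "reason": "read_only_scope"})
        return None
```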
Shell + auth
Shell enum / bootShell
Source: apps/web/src/shell/ (frontend); apps/api/app/shell.py (backend mirror)
The SPA ships as a single Vite build that boots into one of two VISUAL shells depending on the request hostname. The Shell enum is three-valued ('admin' | 'tenant' | 'iru') — iru is a HOST identifier that renders the same TenantApp component as tenant (no visual reskin), but the backend pins it to a single specific tenant:
| Shell value | Renders | Hostname pattern | Allowed tenants |
|---|---|---|---|
| "admin" | AdminApp | admin.* (e.g. admin.get-hydra.dev, admin.localhost) | Operator session (OperatorActor resolved from public.operators; no tenant chain) |
| "iru" | TenantApp | hydra.app.iru.dev / iru.* (e.g. iru.localhost) | slug == iru_host_pinned_slug() (kandji-inc dogfood pin) |
| "tenant" | TenantApp | anything else (default) | tier IN ('customer', 'internal') AND slug != iru_host_pinned_slug() |
iru_host_pinned_slug() lives in src/hyrax/db/tenants/core.py — a dynamic + process-cached resolver keyed on the HYRAX_IRU_GITHUB_ORG_ID env var (GitHub's stable numeric account id). The function returns the kandji-inc dogfood tenant's post-v14 hex slug; cache invalidation rides on the existing tenant_change NOTIFY listener so a purge + re-install of kandji-inc converges without a redeploy. Empty-string return (no env, or env set but no live matching tenant) fails closed: iru host 403s, tenant host excludes nothing. Re-exported from apps/api/app/shell.py so the gate site reads it from the natural place.
Shell is the discriminator string. The shell is resolved once at boot from resolveShellFromHostname(window.location.hostname) (apps/web/src/shell/resolveShell.ts — pure, unit-tested) and exported as the module-level const bootShell from apps/web/src/shell/current.ts. Components import bootShell directly — no Provider/Context indirection, since the value is fixed for the lifetime of the page.
The backend mirror resolve_shell_from_hostname in apps/api/app/shell.py plus ShellHostnameMiddleware (apps/api/app/middleware.py) — which reads the X-Forwarded-Host (or Host) header and stamps request.state.shell — must stay in sync with the frontend resolver via the shared apps/web/src/shell/shell_parity_cases.json fixture. require_shell_allows_tenant in apps/api/app/deps.py is the strict shell↔tenant gate for tenant-bearing routes — iru host → slug == iru_host_pinned_slug(); tenant host → tier IN ('customer','internal') AND slug != iru_host_pinned_slug(). (Admin host has no tenant to gate against; operator routes mount through create_operator_router, which wires require_admin_host instead.) Mismatch returns HTTP 403. The iru-pinned tenant is reachable ONLY via the iru host — the tenant-host symmetric defence refuses it.
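A minimal sketch of the backend resolver's decision order, inferred from the hostname patterns in the table above; the real apps/api/app/shell.py (kept honest by the shared parity fixture) may match differently:

```python
from typing import Literal

Shell = Literal["admin", "tenant", "iru"]

def resolve_shell_from_hostname(hostname: str) -> Shell:
    host = hostname.lower().split(":")[0]  # strip any port before matching
    if host.startswith("admin."):
        return "admin"
    if host == "hydra.app.iru.dev" or host.startswith("iru."):
        return "iru"
    return "tenant"  # default shell for every other hostname
```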
Errors
JobFailure
Source: src/hyrax/jobs/result.py
A Pydantic discriminated union of every failure mode an executor can return. The kind discriminator field routes the worker's match statement. Every variant extends _JobFailureBase, which carries:
- message — human-readable string surfaced in jobs.error and log lines
- outputs_dir, partial_result_dict — forensic state so the worker can ingest partial JSONL ledger writes regardless of which failure variant fired
- duration_s — wall-clock duration
Concrete variants: BudgetExceededFailure, CancelledFailure, TimedOutFailure, ReapedFailure, ExecutorRaisedFailure, BadInputsFailure, NonZeroExitFailure. The naming convention uses a Failure suffix to distinguish variants from the matching exception classes in hyrax.job_exceptions (e.g. CancelledFailure vs JobCancelled) — exceptions are internal control flow inside an executor; JobFailure variants are the typed return value at the seam boundary.
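A minimal sketch of the union's shape, showing two of the seven variants (the variant-specific fields limit_usd and exit_code are assumptions beyond the common base):

```python
from typing import Annotated, Literal, Union
from pydantic import BaseModel, Field, TypeAdapter

class _JobFailureBase(BaseModel):
    message: str
    outputs_dir: str | None = None
    partial_result_dict: dict | None = None
    duration_s: float = 0.0

class BudgetExceededFailure(_JobFailureBase):
    kind: Literal["budget_exceeded"] = "budget_exceeded"
    limit_usd: float          # illustrative variant-specific field

class NonZeroExitFailure(_JobFailureBase):
    kind: Literal["non_zero_exit"] = "non_zero_exit"
    exit_code: int            # illustrative variant-specific field

JobFailure = Annotated[
    Union[BudgetExceededFailure, NonZeroExitFailure],  # ... plus the other five
    Field(discriminator="kind"),
]

# The kind discriminator routes the worker's match statement
failure = TypeAdapter(JobFailure).validate_python(
    {"kind": "non_zero_exit", "message": "tests failed", "exit_code": 2}
)
assert isinstance(failure, NonZeroExitFailure)
```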