WF_JSON_SCHEMA_VALIDATION_PLAN

Workflow JSON Schema Validation Plan

Context

We have meta-model-based workflow validation in validate-workflow.ts that:

We also have JSON Schema infrastructure:

Goal: add a --mode json-schema path to validate-workflow.ts that mirrors the meta-model validation but uses AJV + generated JSON Schema instead of S.decodeUnknownEither.


Step 1 — Add --mode CLI option

File: packages/cli/src/index.ts (or wherever validate-workflow subcommand is wired)

Test: CLI smoke — galaxy-tool-cache validate-workflow foo.ga --mode json-schema parses without error.


Step 2 — Structural validation via JSON Schema

File: new packages/cli/src/commands/validate-workflow-json-schema.ts (or inline in existing file — prefer separate to keep the file manageable)

2a — Generate structural JSON Schema from Effect schemas

import * as JSONSchema from "effect/JSONSchema";
const nativeStructSchema = JSONSchema.make(NativeGalaxyWorkflowSchema);
const format2StructSchema = JSONSchema.make(GalaxyWorkflowSchema);

Cache these (they’re static). Compile with AJV once.

2b — Validate structural

const ajv = new Ajv2020({ allErrors: true, strict: false });
const validate = ajv.compile(structSchema);
const valid = validate(data);
if (!valid) { /* collect errors */ }

Convert AJV errors to same StepValidationResult-compatible shape. Mirror Galaxy’s _convert_errors pattern.

2c — Short-circuit

If structural fails, skip tool-state validation (matches Galaxy behavior).

Test: Red-to-green: a structurally invalid workflow (missing steps) fails structural validation via JSON Schema mode.


Step 3 — Per-step tool state validation via JSON Schema (native)

Mirror _validateNativeStep but swap S.decodeUnknownEither for AJV:

async function validateNativeStepJsonSchema(
  step, stepLabel, toolId, toolVersion, cache, validatorCache
): Promise<StepValidationResult> {
  // 1. Resolve tool from cache (same as meta model path)
  // 2. Scan for replacements → skip if "yes" (same)
  // 3. Deep copy state, inject connections (same)
  // 4. Build JSON Schema:
  //    const effectSchema = createFieldModel(bundle, "workflow_step_native");
  //    const jsonSchema = JSONSchema.make(effectSchema);
  //    const validate = ajv.compile(jsonSchema);  // cache by tool_id@version
  // 5. validate(state) → collect errors
}

Key details:

Test: Red-to-green on a native workflow with a known-valid step — passes via JSON Schema mode.


Step 4 — Per-step tool state validation via JSON Schema (format2)

Mirror the two-pass strategy:

4a — Base pass (workflow_step)

const baseSchema = JSONSchema.make(createFieldModel(bundle, "workflow_step"));
const baseValidate = ajv.compile(baseSchema);
baseValidate(state);

4b — Linked pass (workflow_step_linked)

If connections exist:

const linkedState = structuredClone(state);
injectConnectionsIntoState(bundle.parameters, linkedState, connections);
const linkedSchema = JSONSchema.make(createFieldModel(bundle, "workflow_step_linked"));
const linkedValidate = ajv.compile(linkedSchema);
linkedValidate(linkedState);

Cache validators for both representations.

Test: Red-to-green on a format2 workflow with connections — linked pass catches/accepts connected values correctly.


Step 5 — Offline mode (—tool-schema-dir)

Mirror Galaxy’s _load_tool_state_validator_from_dir:

function loadToolSchemaFromDir(
  toolId: string,
  toolVersion: string | null,
  schemaDir: string,
): object | null {
  const safeId = toolId.replace(/\//g, "~");
  const version = toolVersion ?? "_default_";
  // Try {schemaDir}/{safeId}/{version}.json, fallback to _default_.json
}

When --tool-schema-dir is provided, load pre-exported schemas instead of generating from cache. Falls through to dynamic generation if file not found and cache is available.

Test: Export a schema with galaxy-tool-cache schema, put it in the dir layout, validate a workflow against it.


Step 6 — Subworkflow recursion

Both native and format2 JSON Schema paths need to recurse into subworkflows exactly as the meta-model paths do. Factor out the recursion logic or just duplicate the pattern (it’s small).

Test: Validate a workflow with an embedded subworkflow via JSON Schema mode.


Step 7 — IWC sweep test for JSON Schema mode

File: packages/cli/test/iwc-sweep.test.ts or new iwc-sweep-json-schema.test.ts

Add a parallel sweep that runs all IWC workflows through the JSON Schema validation path. Compare:

Any divergence between meta-model and JSON-Schema results = bug to investigate. The sweep is the proof that JSON Schema validation is equivalent.

Test: GALAXY_TEST_IWC_DIRECTORY=... pnpm test -- iwc-sweep passes with 0 new failures.


Step 8 — Error formatting parity

AJV errors are shaped differently than Effect ParseResult errors. Normalize to same output format:

Factor error conversion into a shared helper.


Step 9 — CLI parameter sync with Galaxy

Galaxy’s gxwf-state-validate has:

Our CLI should mirror:


Implementation Order

#StepEst. scopeDepends on
1CLI --mode optionS
2Structural JSON Schema validationM1
3Native per-step JSON Schema validationL1, 2
4Format2 two-pass JSON Schema validationL1, 2
5Offline --tool-schema-dirM3 or 4
6Subworkflow recursionS3, 4
7IWC sweep (JSON Schema)M3, 4, 6
8Error formatting parityS3, 4
9CLI param sync / --stripS

Red-to-green path: steps 1→2→3→7 gets us to a validated native JSON Schema path with IWC coverage. Then 4→7 extends to format2. Steps 5, 8, 9 are polish.


Architecture Notes


Resolved Questions

Resolved During Implementation

IWC Sweep Results

ModeValidatedSkippedFailed
Effect (meta-model)251500
JSON Schema (AJV)251500

Full parity between both validation modes.

Remaining Questions

  1. Should we add a --compare-modes flag for development that runs both effect and json-schema and reports divergences?