TS_JSON_SCHEMA_STATE_TESTING_PLAN

Plan: JSON Schema Generation & Validation Testing

Repo: galaxy-tool-util (TS)
Upstream branch: json_schema_parameters in Galaxy (4 commits)
Goal: Generate JSON Schema from Effect models, validate against parameter_specification.yml, match the Python-side test_parameter_specification_json_schema.py test coverage.


Context: What the Python Side Did (4 Commits)

Commit 1: d06e043a — Foundation

Commit 2: c2a667be — In-range, length, regex in JSON Schema

Commit 3: 92b430f4 — Color pattern + negated length

Commit 4: e8d71557 — Rename json_schema_skip → _json_schema_skip

Net YAML Changes

The Python-side parameter_specification.yml has 21 more lines than our copy — all _json_schema_skip entries for validators that remain AfterValidator-only:

Also some non-skip structural changes: dce src ordering in data collections, YAML anchor additions for data_optional job_runtime entries.


Current State: TS Side

What Works

The Gap

All 6 validators use S.filter() (runtime-only). JSONSchema.make() throws on S.filter() unless a jsonSchema annotation is provided. This means:

| Validator | Current Implementation | JSON Schema Emittable? |
| --- | --- | --- |
| in-range | S.filter() with manual min/max logic | Yes: use S.greaterThanOrEqualTo() etc. or jsonSchema annotation |
| regex (non-negated) | S.filter() with RegExp.test() | Yes: use S.pattern() or jsonSchema: {pattern} |
| length (non-negated) | S.filter() with .length checks | Yes: use S.minLength()/S.maxLength() or annotation |
| length (negated) | S.filter() with inverted check | Yes: jsonSchema: {not: {minLength, maxLength}} |
| expression | S.filter() parsing Python expression | No: runtime only, needs _json_schema_skip |
| empty_field | S.filter() checking length === 0 | No: runtime only, needs _json_schema_skip |

Key insight: Effect Schema provides first-class JSON Schema support. S.greaterThanOrEqualTo(), S.lessThan(), S.minLength(), S.maxLength(), and S.pattern() all emit the corresponding JSON Schema keywords. We can either:

Option A: replace S.filter() with these built-in combinators, or
Option B: keep S.filter() and attach a jsonSchema annotation.

Recommendation: Option A for in-range, regex, length. These combinators emit standard JSON Schema keywords directly and produce more specific error messages than a generic filter. Keep S.filter() for expression and empty_field (truly runtime-only).


Implementation Plan

Step 1: Sync parameter_specification.yml from Galaxy + Add Makefile Target

The TS copy is slightly out of sync with the Python source (dce ordering, YAML anchor deduplication for data_optional job_runtime, plus the new _json_schema_skip entries). The yaml npm package handles anchors/aliases natively, so we can copy the file directly.

Add a sync-param-spec Makefile target following the existing sync-golden pattern:

# Sync parameter_specification.yml from Galaxy repo.
#   GALAXY_ROOT=~/projects/worktrees/galaxy/branch/json_schema_parameters make sync-param-spec
PARAM_SPEC_SRC = $(GALAXY_ROOT)/test/unit/tool_util/parameter_specification.yml
PARAM_SPEC_DST = packages/schema/test/fixtures/parameter_specification.yml

sync-param-spec:
ifndef GALAXY_ROOT
	$(error GALAXY_ROOT is not set. Point it at your Galaxy checkout.)
endif
	@test -f "$(PARAM_SPEC_SRC)" || (echo "ERROR: $(PARAM_SPEC_SRC) not found" && exit 1)
	@echo "Syncing parameter_specification.yml from $(PARAM_SPEC_SRC)..."
	cp $(PARAM_SPEC_SRC) $(PARAM_SPEC_DST)
	@echo "Synced."

Also update the .PHONY line to include sync-param-spec, and consider a top-level sync target that runs both:

sync: sync-golden sync-param-spec

The existing parameter-specification.test.ts already skips unknown keys (anything not {rep}_valid/{rep}_invalid), so the new _json_schema_skip entries won’t break existing tests.
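As a rough illustration of that skip behavior, the key handling could look like the sketch below. The helper name parseSpecKey and the sample keys are hypothetical; the real parsing lives in parameter-specification.test.ts and may differ.

```typescript
// Hypothetical helper, not taken from the repo: splits "{rep}_valid" /
// "{rep}_invalid" keys and ignores everything else.
type ParsedSpecKey = { stateRep: string; isValid: boolean };

function parseSpecKey(specKey: string): ParsedSpecKey | null {
  // Metadata keys such as _json_schema_skip start with an underscore.
  if (specKey.startsWith("_")) return null;
  const idx = specKey.lastIndexOf("_");
  if (idx <= 0) return null;
  const verdict = specKey.slice(idx + 1);
  if (verdict !== "valid" && verdict !== "invalid") return null;
  return { stateRep: specKey.slice(0, idx), isValid: verdict === "valid" };
}

// Sample keys are illustrative only.
console.log(parseSpecKey("request_valid"));
console.log(parseSpecKey("request_internal_invalid"));
console.log(parseSpecKey("_json_schema_skip"));
```

Returning null for anything unrecognized is what keeps newly synced metadata keys from breaking the existing loop.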

Test: existing 2087 tests still pass after sync.

Step 2: Upgrade Validators to Emit JSON Schema Keywords

2a: in-range.ts — Use Effect Schema numeric combinators

Replace S.filter() with chained Effect Schema combinators:

function applyInRange(schema: S.Schema.Any, validator: unknown): S.Schema.Any {
  const v = validator as InRangeValidatorModel;
  if (v.negate) {
    // Negated range — keep S.filter, add jsonSchema annotation
    return (schema as S.Schema<number>).pipe(
      S.filter((value: number) => { /* existing logic */ }, {
        jsonSchema: { not: buildRangeConstraint(v) }
      }),
    ) as S.Schema.Any;
  }
  // Non-negated: chain Effect Schema combinators
  let s = schema as S.Schema<number>;
  if (v.min != null) {
    s = v.exclude_min ? s.pipe(S.greaterThan(v.min)) : s.pipe(S.greaterThanOrEqualTo(v.min));
  }
  if (v.max != null) {
    s = v.exclude_max ? s.pipe(S.lessThan(v.max)) : s.pipe(S.lessThanOrEqualTo(v.max));
  }
  return s as S.Schema.Any;
}

Test (red-to-green): JSON Schema test for gx_int_validation_range, gx_int_min_max, gx_float_validation_range, gx_float_min_max entries should pass.
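The negated branch above calls buildRangeConstraint, which the plan does not define. A minimal sketch of what it might return is shown here, assuming the validator model exposes min, max, exclude_min, and exclude_max as used in applyInRange, and assuming numeric exclusiveMinimum/exclusiveMaximum (JSON Schema draft-06 and later).

```typescript
// Assumed minimal shape; the real InRangeValidatorModel may carry more fields.
interface RangeLike {
  min?: number | null;
  max?: number | null;
  exclude_min?: boolean;
  exclude_max?: boolean;
}

// Hypothetical helper: maps range bounds onto JSON Schema numeric keywords.
function buildRangeConstraint(v: RangeLike): Record<string, number> {
  const constraint: Record<string, number> = {};
  if (v.min != null) {
    constraint[v.exclude_min ? "exclusiveMinimum" : "minimum"] = v.min;
  }
  if (v.max != null) {
    constraint[v.exclude_max ? "exclusiveMaximum" : "maximum"] = v.max;
  }
  return constraint;
}

console.log(JSON.stringify(buildRangeConstraint({ min: 0, max: 10, exclude_max: true })));
```

Wrapping the result in { not: ... } then expresses the negated range, mirroring the annotation passed to S.filter().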

2b: regex.ts — Use S.pattern() for non-negated

function applyRegex(schema: S.Schema.Any, validator: unknown): S.Schema.Any {
  const v = validator as RegexValidatorModel;
  // Python re.match anchors at start; add ^ if missing so both branches match that behavior
  let pattern = v.expression;
  if (!pattern.startsWith("^")) pattern = "^" + pattern;
  const re = new RegExp(pattern);
  if (v.negate) {
    // Negated: keep S.filter, add a jsonSchema annotation so JSONSchema.make() does not throw
    return (schema as S.Schema<string>).pipe(
      S.filter((value: string) => !re.test(value), {
        jsonSchema: { not: { pattern } },
      }),
    ) as S.Schema.Any;
  }
  // Non-negated: S.pattern emits JSON Schema "pattern" keyword
  return (schema as S.Schema<string>).pipe(S.pattern(re)) as S.Schema.Any;
}

Consideration: S.pattern() also validates at runtime, so we don’t lose validation coverage. But the regex semantics differ slightly — Python re.match anchors at start, JSON Schema pattern does not by default. We prepend ^ like the Python side does.
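The anchoring difference is easy to see with plain RegExp. The sketch below uses two hypothetical helpers: anchorAtStart is the simple prefix described above, and anchorGrouped is an alternative that wraps the expression in a non-capturing group; neither is taken from the repo.

```typescript
// Simple prefix, as described in the plan.
function anchorAtStart(expression: string): string {
  return expression.startsWith("^") ? expression : "^" + expression;
}

// Alternative (an assumption, not the plan's approach): anchor the whole expression.
function anchorGrouped(expression: string): string {
  return "^(?:" + expression + ")";
}

// Unanchored search finds "abc" inside "123abc"; the anchored pattern does not.
console.log(new RegExp("[a-z]+").test("123abc"));
console.log(new RegExp(anchorAtStart("[a-z]+")).test("123abc"));

// With top-level alternation, a bare ^ binds only to the first alternative.
console.log(new RegExp(anchorAtStart("foo|bar")).test("xbar"));
console.log(new RegExp(anchorGrouped("foo|bar")).test("xbar"));
```

If the Python side really does use the bare prefix, matching it exactly may matter more for parity than the grouped form; the alternation case is worth a fixture either way.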

Test: JSON Schema test for gx_text_regex_validation should pass.

2c: length.ts — Use S.minLength()/S.maxLength() or negated annotation

function applyLength(schema: S.Schema.Any, validator: unknown): S.Schema.Any {
  const v = validator as LengthValidatorModel;
  if (v.negate) {
    const notConstraint: Record<string, number> = {};
    if (v.min != null) notConstraint.minLength = v.min;
    if (v.max != null) notConstraint.maxLength = v.max;
    return (schema as S.Schema<string>).pipe(
      S.filter((value: string) => {
        let valid = true;
        if (v.min != null) valid = valid && value.length >= v.min;
        if (v.max != null) valid = valid && value.length <= v.max;
        return !valid;
      }, { jsonSchema: { not: notConstraint } }),
    ) as S.Schema.Any;
  }
  let s = schema as S.Schema<string>;
  if (v.min != null) s = s.pipe(S.minLength(v.min));
  if (v.max != null) s = s.pipe(S.maxLength(v.max));
  return s as S.Schema.Any;
}

Test: JSON Schema test for gx_text_length_validation, gx_text_length_validation_negate should pass.
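To sanity-check that { not: { minLength, maxLength } } means the same thing as the inverted runtime check, the two can be compared side by side. This is a hand-rolled sketch with hypothetical names; it does not use Effect Schema or a real JSON Schema validator.

```typescript
// Assumed minimal shape; the real LengthValidatorModel may have more fields.
interface LengthBounds {
  min?: number | null;
  max?: number | null;
}

// Runtime predicate, mirroring the negated branch of applyLength.
function negatedLengthAccepts(value: string, v: LengthBounds): boolean {
  let inRange = true;
  if (v.min != null) inRange = inRange && value.length >= v.min;
  if (v.max != null) inRange = inRange && value.length <= v.max;
  return !inRange;
}

// Hand-rolled evaluation of { not: { minLength, maxLength } }.
// JSON Schema counts characters (code points), hence Array.from.
function notConstraintAccepts(value: string, v: LengthBounds): boolean {
  const length = Array.from(value).length;
  const innerMatches =
    (v.min == null || length >= v.min) && (v.max == null || length <= v.max);
  return !innerMatches;
}

const bounds: LengthBounds = { min: 2, max: 4 };
for (const sample of ["", "ab", "abcd", "abcdefgh"]) {
  console.log(
    JSON.stringify(sample),
    negatedLengthAccepts(sample, bounds),
    notConstraintAccepts(sample, bounds),
  );
}
```

The two agree for ordinary text. They can disagree for characters outside the Basic Multilingual Plane, where .length counts UTF-16 code units; whether that matters here depends on the fixtures.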

2d: expression.ts and empty-field.ts — Add jsonSchema passthrough annotation

These validators can’t be represented in JSON Schema. Add jsonSchema: {} annotation so JSONSchema.make() doesn’t throw — it will simply emit the base type without constraint keywords.

// expression.ts - add jsonSchema annotation to prevent JSONSchema.make() throw
S.filter((value: string) => { /* existing */ }, {
  jsonSchema: {},  // not representable — covered by _json_schema_skip in tests
})

Same for empty-field.ts.

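Test: JSON Schema generation should no longer throw for tools with expression/empty_field validators. Because the emitted schema carries no constraint for them, their _invalid entries will be accepted; the _json_schema_skip entries tell the JSON Schema test to tolerate that.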
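The tolerance rule can be written as a small pure function, which may make the intended behavior easier to review. The name judge and the sample skip reason are hypothetical; the skip maps are assumed to be keyed by spec key, as in the test sketch in Step 5.

```typescript
type Verdict = "pass" | "tolerated" | "fail";
type SkipMap = Record<string, string>; // specKey -> reason (assumed shape)

// Decide the outcome of one test case given what the JSON Schema validator said.
function judge(
  specKey: string,
  isValid: boolean,   // whether the entry comes from a *_valid list
  accepted: boolean,  // whether the generated JSON Schema accepted it
  jsonSchemaSkip: SkipMap,
  jsonSchemaValidSkip: SkipMap,
): Verdict {
  if (isValid && !accepted) {
    return specKey in jsonSchemaValidSkip ? "tolerated" : "fail";
  }
  if (!isValid && accepted) {
    return specKey in jsonSchemaSkip ? "tolerated" : "fail";
  }
  return "pass";
}

// Illustrative data only.
const skip: SkipMap = { request_invalid: "expression validator is runtime-only" };
console.log(judge("request_invalid", false, true, skip, {}));
console.log(judge("request_internal_invalid", false, true, skip, {}));
```

Keeping the rule separate from the vitest loop would also allow reporting "tolerated" counts, should that be useful for tracking how many entries still rely on skips.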

Step 3: Fix gx-color JSON Schema Pattern

gx-color.ts already uses S.pattern(/^#[0-9a-fA-F]{6}$/). The Python side uses ^#[0-9a-f]{6}$ (lowercase only). The TS character class also accepts uppercase hex digits, so it is more permissive. Check whether this matters for JSON Schema validation; the test fixture only has lowercase color values, so this should be fine. No change needed unless tests fail.
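The difference between the two patterns shows up only for uppercase input. A quick check with plain RegExp (the sample colors are made up for illustration):

```typescript
const tsColorPattern = /^#[0-9a-fA-F]{6}$/; // pattern quoted from gx-color.ts
const pyColorPattern = /^#[0-9a-f]{6}$/;    // pattern quoted from the Python side

for (const color of ["#aabbcc", "#AABBCC", "#aabbc", "aabbcc"]) {
  console.log(color, tsColorPattern.test(color), pyColorPattern.test(color));
}
```

Only "#AABBCC" separates the two: the TS pattern accepts it and the Python-side pattern rejects it, so a fixture with an uppercase color listed as invalid would be the case to watch for.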

Step 4: Handle Structural JSON Schema Issues

The Python side needed three post-processing fixes in json.py. Check if Effect Schema’s JSONSchema.make() has analogous issues:

  1. Conditional oneOf disambiguation: Pydantic emits overlapping oneOf branches for conditionals because the discriminator is a callable. Effect Schema uses S.Union with S.Literal discriminators; check whether this produces a clean oneOf or needs fixing.

  2. Collection runtime oneOf → anyOf: on the Python side, nested collection discriminators overlap under Pydantic, so oneOf had to be relaxed to anyOf. Effect Schema may handle this differently; verify with a data_collection test case.

  3. Annotated types keyword normalization: Pydantic emits ge/gt instead of minimum/exclusiveMinimum for Union types. Effect Schema emits standard keywords, so likely no fix is needed.

Approach: Write JSON Schema test first, then fix issues as they surface. These fixes likely live in a new utility function that post-processes JSONSchema.make() output, or they may not be needed at all.
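If a post-processing utility does turn out to be necessary, it would probably be a recursive walk over the JSONSchema.make() output. The sketch below shows the general shape using a blanket oneOf → anyOf rewrite; the function name is hypothetical, and a real fix would presumably target only the affected subschemas.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Hypothetical post-processor: returns a copy with every "oneOf" key renamed
// to "anyOf". Limitation: it would also rename a property literally called
// "oneOf" under "properties", and it does not merge with an existing "anyOf".
function rewriteOneOfToAnyOf(node: Json): Json {
  if (Array.isArray(node)) {
    return node.map((item) => rewriteOneOfToAnyOf(item));
  }
  if (node !== null && typeof node === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, value] of Object.entries(node)) {
      out[key === "oneOf" ? "anyOf" : key] = rewriteOneOfToAnyOf(value);
    }
    return out;
  }
  return node; // primitives are returned unchanged
}

// Illustrative input, not real generator output.
const example: Json = {
  type: "object",
  properties: { c: { oneOf: [{ type: "string" }, { type: "number" }] } },
};
console.log(JSON.stringify(rewriteOneOfToAnyOf(example)));
```

Returning a copy rather than mutating in place keeps the original generator output available for debugging when a test fails.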

Step 5: Write JSON Schema Test

Create packages/schema/test/parameter-specification-json-schema.test.ts mirroring the Python test structure:

import { describe, it, expect, afterAll } from "vitest";
import * as JSONSchema from "@effect/schema/JSONSchema";
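import Ajv from "ajv/dist/2020";  // or jsonschema equivalent; pick the Ajv class matching the $schema draft that JSONSchema.make() declares
import { createFieldModel } from "../src/schema/model-factory.js";
// ... same imports as parameter-specification.test.ts

const ajv = new Ajv();  // shared validator instance used by every test case below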

describe("parameter specification (JSON Schema)", () => {
  for (const [toolName, combos] of Object.entries(specification)) {
    describe(toolName, () => {
      const bundle = loadBundle(toolName);
      if (!bundle) { it.skip(...); return; }
      // ... skip logic same as existing test

      const jsonSchemaSkip: Record<string, string> = combos._json_schema_skip ?? {};
      const jsonSchemaValidSkip: Record<string, string> = combos._json_schema_valid_skip ?? {};

      for (const [specKey, testCases] of Object.entries(combos)) {
        if (specKey.startsWith("_")) continue;
        // ... parse stateRep and isValid (valid vs. invalid) from specKey

        const effectSchema = createFieldModel(bundle, stateRep);
        if (!effectSchema) { it.skip(...); continue; }

        let jsonSchema;
        try { jsonSchema = JSONSchema.make(effectSchema); }
        catch { it.skip(`JSON Schema generation failed for ${stateRep}`); continue; }

        for (let i = 0; i < testCases.length; i++) {
          it(`${specKey}[${i}]`, () => {
            const valid = ajv.validate(jsonSchema, testCases[i]);
            if (isValid && !valid && !(specKey in jsonSchemaValidSkip)) {
              expect.fail(`valid entry REJECTED: ${JSON.stringify(testCases[i])}`);
            }
            if (!isValid && valid && !(specKey in jsonSchemaSkip)) {
              expect.fail(`invalid entry ACCEPTED: ${JSON.stringify(testCases[i])}`);
            }
          });
        }
      }
    });
  }
});

JSON Schema validator choice: Python uses jsonschema.Draft202012Validator. In TS, options:

Recommendation: ajv — well-maintained, fast, widely used. Add as devDependency to packages/schema.

Step 6: Validate & Iterate

Run the JSON Schema test. Expected outcomes:

Fix any issues discovered. The Python side had to iterate on conditional oneOf and collection oneOf — we may hit similar or different issues due to Effect Schema’s different JSON Schema generation approach.


Implementation Order

| Step | What | Effort | Blocked By | Test Approach |
| --- | --- | --- | --- | --- |
| 1 | Add sync-param-spec Makefile target + run sync | 10m | - | Existing 2087 tests still green |
| 2a | Upgrade in-range.ts to emit JSON Schema keywords | 20m | - | Red: JSON Schema test fails for int/float range. Green: passes |
| 2b | Upgrade regex.ts to use S.pattern() | 15m | - | Red: JSON Schema test fails for regex. Green: passes |
| 2c | Upgrade length.ts to use S.minLength/maxLength | 15m | - | Red: JSON Schema test fails for length. Green: passes |
| 2d | Add jsonSchema: {} to expression + empty_field validators | 10m | - | JSONSchema.make() no longer throws for these tools |
| 3 | Verify gx-color JSON Schema | 5m | - | Color pattern in JSON Schema output |
| 4 | Fix structural JSON Schema issues (if any) | 30m | 5 | Discovered during test run |
| 5 | Write parameter-specification-json-schema.test.ts + add ajv dep | 30m | 1 | All non-skipped entries pass |
| 6 | Iterate on failures | 30m | 2-5 | Full green |

Steps 1 and 5 should come first (test infrastructure), then 2a-2d (validator upgrades), then 3-4 (structural fixes), then 6 (iteration).

Practical order: 1 → 5 → 2d → 2a → 2b → 2c → 3 → 4 → 6

Start with the test (even though it’ll have many failures), then incrementally fix validators to go red→green.


Unresolved Questions