
Test Job Validation Plan

Follow-up to galaxyproject/galaxy#18884. That PR modeled TestJob.outputs but left TestJob.job: JobDict = Dict[str, Any] unvalidated. Goal: validate the job: block with a schema modeled on what the *.gxwf-tests.yml / Planemo *-tests.yml format should contain — not what the legacy input-staging helpers accept. Starting point is mvdbeek’s XSD-derived lib/galaxy/tool_util/schemas/job.py @ 805f429342f; this plan extends it only with fields we actually see in real workflow tests.

Guiding principle

The schema defines the canonical workflow-test job syntax. It is NOT a spec of every dict shape load_data_dict / stage_inputs() / populators._elements_to_test_data happens to consume — those helpers are called from many sites (tool tests, workflow tests, API upload, IWC) and have accreted tolerances over the years. Schema strictness is the forcing function that cleans those up at the fixture layer.

Concretely: the type: File | Directory | raw + value / content form that several of Galaxy’s own gxwf-tests.yml files use is a helper artifact — we migrate those fixtures to CWL-style rather than model the form.

IWC verification

Before finalizing “legacy type: syntax is local leftover, not schema territory”, we audited galaxyproject/iwc (~/projects/repositories/iwc/workflows, 119 *-tests.yml / *-test*.yml files):

IWC is 100% CWL-style (class: File / class: Collection, path:, filetype:, identifier, collection_type, elements). The legacy type: / value: / content: inputs exist only in lib/galaxy_test/workflow/ — genuine local leftovers. Safe to exclude from the schema.

The final commit message MUST mention this audit, so a future reader doesn’t need to re-derive why the legacy forms were left unmodeled.

Motivation

Scope

In:

Out:

Current state

lib/galaxy/tool_util_models/__init__.py:

JobDict = Dict[str, Any]

class TestJob(StrictModel):
    doc: Optional[str]
    job: JobDict
    outputs: Dict[str, TestOutputAssertions]
    expect_failure: Optional[bool] = False

TestJobDict (TypedDict) also uses JobDict.

21 of 44 files in lib/galaxy_test/workflow/*.gxwf-tests.yml use the legacy type: ... form; these need migrating before the stricter schema lands.
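A minimal sketch of how the legacy-fixture work list could be built, assuming legacy inputs are recognizable by a type: key whose value is File, Directory, or raw. The function name find_legacy_fixtures and the regex are illustrative, not existing Galaxy helpers. The pattern deliberately skips collection_type: and file_type: keys, so it will not flag legacy collections that only use content: elements.

```python
import re
from pathlib import Path

# Assumption: a legacy input carries a `type:` key set to File, Directory, or raw.
# Anchoring at line start keeps `collection_type:` and `file_type:` from matching.
LEGACY_TYPE = re.compile(
    r"^[ \t]*(?:-[ \t]*)?type:[ \t]*(?:File|Directory|raw)[ \t]*$",
    re.MULTILINE,
)


def find_legacy_fixtures(directory):
    """Return sorted file names of *.gxwf-tests.yml files with a legacy type: key."""
    hits = []
    for path in sorted(Path(directory).glob("*.gxwf-tests.yml")):
        if LEGACY_TYPE.search(path.read_text(encoding="utf-8")):
            hits.append(path.name)
    return hits
```

Running find_legacy_fixtures("lib/galaxy_test/workflow") from a Galaxy checkout would give a candidate list to compare against the count above; any mismatch is worth checking by hand.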

Reference: Marius’ model

job.py @ 805f429342f models only CWL-style inputs:

This is the baseline. Deviations below are additive or hygienic.

Target model

Port into new module lib/galaxy/tool_util_models/test_job.py (keeps __init__.py lean; matches the pattern used for tool_outputs.py).

Changes vs. Marius’ 2023 version

  1. Add hashes to BaseFile: list[HashEntry], where HashEntry = {hash_function: Literal[...], hash_value: str}. Widespread in IWC.
  2. Add identifier to BaseFile — not only on *Element forms. The IWC SampleMetadata pattern attaches an identifier to a top-level File, and current helpers accept it.
  3. Collapse *FileElement duplication — Marius’ version re-declared location/path/composite_data on each element variant. Replace with a single BaseFile plus an optional identifier; gate the element-vs-standalone distinction at the union level, not via parallel classes.
  4. Use Galaxy’s StrictModel — the one in tool_util_models/__init__.py (extra="forbid" + field title generator). Drops the bespoke Model class.
  5. Modernize typing — lowercase list[...] / dict[...]; Annotated, Literal from typing_extensions to match tool_outputs.py.
  6. Tighten collection_type — reuse the existing CollectionType annotated alias from __init__.py (list/paired/paired_or_unpaired/record/sample_sheet, colon-nested). Today Marius’ schema lets collection_type be any string.
  7. Discriminated union on class: Annotated[Union[File, Collection], Field(discriminator="class_")] for value-level dispatch. Requires populate_by_name=True in each model’s ConfigDict so both class (YAML) and class_ (Python) resolve.
  8. Drop RootModel subclassing — use Job = RootModel[Dict[str, JobParamValue]] generic per pydantic v2 idiom. No subclass of RootModel.
  9. Fix the top-level AnyJobParam — Marius had Union[dict[...], str] on Job.root, which allows the entire job: block to be a bare string. It’s never a string in the wild; top-level is always Dict[str, JobParamValue]. Scalars live as values under the dict, not as the dict itself.
  10. Typed lists, not bare list — a job-param value can be a typed list[JobParamValue] (list-of-files for data_collection params without nesting, list-of-scalars for multiple text, etc). Never a bare untyped list.
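To illustrate what tightening collection_type (item 6) means in practice, here is a stdlib-only sketch of a colon-nested check. It assumes the allowed segment names are exactly the five listed above; the real CollectionType alias in __init__.py may enforce additional rules (for example on which nestings are legal), and is_valid_collection_type is a hypothetical helper name.

```python
# Assumption: these five names are the complete set of allowed segments.
ALLOWED_SEGMENTS = {"list", "paired", "paired_or_unpaired", "record", "sample_sheet"}


def is_valid_collection_type(value: str) -> bool:
    """Accept a colon-nested collection type whose every segment is a known name."""
    segments = value.split(":")
    # An empty string or a trailing colon yields an empty segment, which is rejected.
    return all(segment in ALLOWED_SEGMENTS for segment in segments)
```

Under this sketch "list:paired" passes, while a typo such as "lits" or a dangling "list:" is rejected instead of being accepted as an arbitrary string.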

Union structure

FileT = Annotated[Union[LocationFile, PathFile, CompositeDataFile], ...]
# Collection.elements items: nested Collection or File-with-identifier
CollectionElementT = Annotated[Union[Collection, FileElement], Field(discriminator="class_")]
# Where FileElement = FileT intersected with required identifier (or FileT + validator)

JobParamValue = Union[
    FileT,                         # class: File (with path | location | composite_data)
    Collection,                    # class: Collection + elements
    str, int, float, bool, None,   # scalars for text/int/float/bool/select params
    list["JobParamValue"],         # typed list (e.g., multiple data / multiple text)
]

Job = RootModel[Dict[str, JobParamValue]]

Discriminator handling: class: File / class: Collection is the only discriminator. Scalars (and None) dispatch by type. There is no structural fallback for type: ... / value: / content: / elements without class: — those raise validation errors.
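The dispatch described above can be sketched with pydantic v2 roughly as follows. This is a simplified illustration, not the target model: File collapses the location/path/composite_data variants into one class, Collection.elements is not recursive, the typed-list arm and hashes are omitted, collection_type is left as a plain str, and StrictModel here is a local stand-in for Galaxy’s class of the same name.

```python
from typing import Annotated, Dict, List, Literal, Optional, Union

from pydantic import BaseModel, ConfigDict, Field, RootModel, ValidationError


class StrictModel(BaseModel):
    # Stand-in for Galaxy's StrictModel: reject unknown keys, allow class/class_.
    model_config = ConfigDict(extra="forbid", populate_by_name=True)


class File(StrictModel):
    class_: Literal["File"] = Field(alias="class")
    path: Optional[str] = None
    location: Optional[str] = None
    filetype: Optional[str] = None
    identifier: Optional[str] = None


class Collection(StrictModel):
    class_: Literal["Collection"] = Field(alias="class")
    collection_type: str
    elements: List[File]  # simplification: no nested collections


# Dicts dispatch on class:; scalars and None dispatch by type.
ClassTagged = Annotated[Union[File, Collection], Field(discriminator="class_")]
JobParamValue = Union[ClassTagged, str, int, float, bool, None]
Job = RootModel[Dict[str, JobParamValue]]


def is_valid_job(data: dict) -> bool:
    """Return True if the job block validates against the sketch model."""
    try:
        Job.model_validate(data)
    except ValidationError:
        return False
    return True
```

With this sketch, {"reads": {"class": "File", "path": "test-data/reads.fastq"}, "threshold": 0.5} validates, while a legacy {"reads": {"type": "File", "value": "reads.fastq"}} is rejected because the dict carries no class: tag and matches no scalar arm.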

composite_data — keep. Zero IWC hits today but it’s part of the documented CWL-style input syntax supported by stage_inputs / upload; cheap to model.

deferred: true — keep as Optional[bool] on BaseFile. Don’t enforce the location-only constraint with a model_validator in v1; leave as semantic rule and document.

hashes algorithms — align with galaxy.util.hash_util.HASH_NAMES. Resolve the exact enum when writing the model; do not invent a new set here.

Migration: lib/galaxy_test/workflow fixtures

Before switching TestJob.job to the strict model, migrate all legacy-form inputs in lib/galaxy_test/workflow/*.gxwf-tests.yml:

| Legacy | Migrated |
| --- | --- |
| type: File, value: X, file_type: Y | class: File, path: test-data/X, filetype: Y |
| type: File, content: "..." | class: File, path: <fixture> (write the content out) |
| type: Directory, value: X, file_type: Y | class: File, path: test-data/X, filetype: Y (bwa_mem2_index case) |
| type: raw, value: V | Bare scalar V (including null, "", booleans). Framework runner’s test_data_format="cwl_style" routes these as literal params via stage_inputs. |
| collection_type: L, elements: [{content: C, identifier: I}] | class: Collection, collection_type: L, elements: [{class: File, identifier: I, path: <fixture>}] |

Use grep -l "type:" lib/galaxy_test/workflow/*.gxwf-tests.yml (21 files) as the work list. Land migrations as their own commit ahead of the model switch so bisection against the test suite stays clean.

For the content: cases (inline string → dataset), either (a) promote the content to a checked-in test-data/ file or (b) skip the test. Prefer (a): these tests are small enough to convert to a file on disk.
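The per-input rewrite could be scripted along these lines. This is a hedged sketch covering only the type: File / Directory / raw forms; migrate_input is a hypothetical helper, the test-data/ prefix is taken from the mapping above, and inline content: inputs (including legacy collection elements) are deliberately left to manual conversion because a fixture file has to be written first.

```python
def migrate_input(value, test_data_dir="test-data"):
    """Convert one legacy job input to CWL-style; return non-legacy values unchanged."""
    if not isinstance(value, dict) or "type" not in value:
        return value  # already CWL-style, a bare scalar, or a collection
    kind = value["type"]
    if kind == "raw":
        # type: raw becomes the bare scalar itself (None, "", booleans included).
        return value.get("value")
    if kind in ("File", "Directory"):
        if "content" in value:
            raise ValueError("inline content: must be written to a fixture file first")
        migrated = {"class": "File", "path": f"{test_data_dir}/{value['value']}"}
        if "file_type" in value:
            migrated["filetype"] = value["file_type"]  # note the key rename
        return migrated
    raise ValueError(f"unrecognized legacy type: {kind!r}")
```

For example, {"type": "File", "value": "1.bed", "file_type": "bed"} becomes {"class": "File", "path": "test-data/1.bed", "filetype": "bed"}. Applying it to real fixtures would still need a YAML round-trip that preserves comments and ordering, which this sketch does not attempt.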

Integration

  1. lib/galaxy/tool_util_models/test_job.py — new module. Exports Job, LocationFile, PathFile, CompositeDataFile, Collection, CollectionElement, HashEntry.
  2. lib/galaxy/tool_util_models/__init__.py:
    • from .test_job import Job
    • TestJob.job: JobDict → Job.
    • Keep JobDict = Dict[str, Any] alias as deprecated shim in case external code imports it (grep the repo first — if unused, remove).
  3. lib/galaxy/tool_util/validate_test_format.py — no code change; stricter errors flow through automatically.
  4. Re-export Job from galaxy.tool_util_models top-level so downstream (galaxy-tool-util-ts) can import it alongside Tests.
  5. lib/galaxy_test/base/populators.py — add test_data_format: Optional[Literal["cwl_style"]] = None param to WorkflowPopulator.run_workflow (see Runtime dispatch below). No behavior change when unset.
  6. lib/galaxy_test/workflow/test_framework_workflows.py — pass test_data_format="cwl_style" when invoking run_workflow, so migrated .gxwf-tests.yml fixtures take the strict path.

Runtime dispatch

WorkflowPopulator.run_workflow is the single entry point for both .gxwf-tests.yml-driven framework tests and the ~290 procedural API tests in lib/galaxy_test/api/test_workflows.py (plus workflow_fixtures.py embedded test_data: blocks). The two call-sites disagree on bare-scalar semantics:

The dispatch inside run_workflow cannot distinguish a legacy bare-string (text_input: |\n a\n b\n c) from a post-migration scalar param (threshold: 0.5) without more signal. An earlier WIP tried a not any(isinstance(v, dict) ...) heuristic — it inverted semantics for every legacy multi-line-string dataset input and broke a large swath of API tests.

Approach: explicit signal from the caller.

This keeps the schema strict (Pydantic Job rejects legacy forms in .gxwf-tests.yml), keeps load_data_dict behavior frozen for API callers, and removes the runtime ambiguity without a heuristic.
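A toy sketch of the explicit-signal idea, reduced to the ambiguous case of a bare string. route_bare_string and its return labels are hypothetical; the real change lives in WorkflowPopulator.run_workflow and delegates to the existing staging helpers. The only behavior taken from this plan is that the caller’s test_data_format decides the route, with None keeping the legacy interpretation.

```python
from typing import Optional, Tuple


def route_bare_string(text: str, test_data_format: Optional[str] = None) -> Tuple[str, str]:
    """Decide how a bare string job value is treated, based only on the caller's signal."""
    if test_data_format == "cwl_style":
        # Strict path: bare scalars are literal parameter values.
        return ("parameter", text)
    if test_data_format is None:
        # Legacy path (API tests): a bare string is the content of a dataset.
        return ("dataset_content", text)
    raise ValueError(f"unknown test_data_format: {test_data_format!r}")
```

The same input string is routed differently depending on the caller, and nothing about the job dict itself is inspected, which is what avoids the failure mode of the earlier heuristic.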

Test strategy (red-to-green)

Add in test/unit/tool_util/test_test_format_model.py. Build fixtures in test/unit/tool_util/test_data/test_job_fixtures/:

Positive fixtures (one per union arm)

Negative fixtures (must fail validation)

Regression sweeps

Red-green order

  1. Add test_data_format="cwl_style" param to run_workflow and wire test_framework_workflows.py to pass it. No-op for all other callers; framework tests still pass because current fixtures are still legacy-form and the auto-detect path handles them when the kwarg is None — switch to the new path happens in step 2.
  2. Land migration commit for the 21 legacy fixtures. Framework runner now uses the strict path; API tests untouched.
  3. Author positive + negative fixture tests. The negative-case tests fail (red) against Dict[str, Any], since nothing is checked.
  4. Implement test_job.py arm by arm; flip TestJob.job to Job. The negative fixtures are now rejected by validation as expected, turning those tests green.
  5. Run test_validate_workflow_tests, test_iwc_directory, and the full framework-workflows suite — iterate model on any surprise.
  6. Sanity sweep a sample of test_workflows.py API tests (e.g. test_run_workflow, test_run_workflow_with_output_collections) to confirm the None-default path is unchanged.

Validation rollout

Port to galaxy-tool-util-ts (downstream)

Once merged, the make sync-test-format-schema target in galaxy-tool-util-ts can export Job.model_json_schema() alongside Tests.model_json_schema(). This replaces the hand-vendored 17128-era shapes in galaxy-workflows-vscode. No TS-side changes in this plan.
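For orientation, exporting a JSON Schema from a pydantic v2 RootModel looks roughly like this. The Job below is a scalar-only stand-in, export_schema is a hypothetical helper, and nothing here describes how the make target is actually implemented.

```python
import json
from typing import Dict, Union

from pydantic import RootModel

# Stand-in for the real Job model: scalar values only.
Job = RootModel[Dict[str, Union[str, int, float, bool, None]]]


def export_schema(model) -> str:
    """Serialize a pydantic model's JSON Schema with stable key order."""
    return json.dumps(model.model_json_schema(), indent=2, sort_keys=True)
```

Sorting keys keeps the exported file diff-friendly when the schema is re-synced downstream.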

Open questions