TEST_VALIDATION_PLAN

Python Workflow-Test Validation Plan (gxwf validate-tests)

Parallel to TEST_SCHEMA_TS_PLAN.md. The TS side vendors a JSON Schema dumped from Python; here we live inside Galaxy and validate directly against the Pydantic models (galaxy.tool_util_models.Tests / TestJob). No schema export, no Ajv — just Tests.model_validate() with structured error reporting.

Decisions

Work items

1. Validator module (lib/galaxy/tool_util/workflow_state/validation_tests.py)

2. Report models (_report_models.py)

3. CLI — single file (scripts/workflow_validate_tests.py)

4. CLI — tree (scripts/workflow_validate_tests_tree.py)

5. Register in gxwf.py

6. Tests (test/unit/tool_util/workflow_state/test_validate_tests.py)

Red-to-green. Reuse the same fixture set named in TEST_SCHEMA_TS_PLAN.md §3 so TS and Python exercise the same surface:

7. IWC sweep hook (test_iwc_sweep.py)

8. Docs (doc/source/dev/wf_tooling.md) — DONE

9. Accept legacy type: sugar on Collection (post-IWC-sweep triage) — DONE

Implementation landed:

Residual IWC failures (follow-ups, NOT this plan’s scope):

Original plan text below for reference.

Motivation: first IWC sweep surfaced ~52/120 failures, dominated by nested class: Collection elements that use type: paired instead of collection_type: paired. Staging code (load_data_dict / fetch API) already ignores that key — it’s documentary authoring sugar, not load-bearing. Rather than churn every IWC tests file, normalize it on the model side so the validator accepts the sugar and Python consumers always see a canonical collection_type.

Scope: model-level only. No consumer changes — validation is the only consumer of these models today, so we pay zero migration cost.

Fixtures:

Tests:

TS follow-up:

Unresolved:

Test strategy (red-to-green)

  1. Land fixtures + failing unit tests referencing validate_tests_file.
  2. Implement validation_tests.py; positives green, negatives produce expected diagnostics.
  3. Add single-file CLI + report plumbing; snapshot tests over text/JSON/markdown output.
  4. Add tree variant + discovery; mixed-directory snapshot.
  5. Wire into gxwf.py.
  6. Run IWC sweep; file issues for any real failures found.
  7. Land Phase 9 type → collection_type normalizer; re-run IWC sweep; triage residual failures.

Relationship to TS plan

Unresolved questions

User Input: Stick with the things here for consistency please.

User Input: Nah.