Interop Connection Testing Plan

Goal

Make Galaxy’s connection-validation tests consumable by the forthcoming TypeScript connection validator in galaxy-tool-util-ts, so both languages run the same corpus against the same expectations and the same tool definitions — without coupling the TS side to Galaxy’s Python tool-XML parser.

Scope: the pieces needed now to unblock TS connection validation at the CLI level. This plan does not cover cross-CLI diff (idea #7) or goldens (idea #4); those sit on top and can land later.

Context

Decisions (locked)

  1. Fixture home in Galaxy: stays at test/unit/tool_util/workflow_state/connection_workflows/. Established pattern, minimal blast radius. Sync target points here.
  2. TS destination: packages/core/test/fixtures/connection_workflows/. Mirrors packages/core/test/fixtures/golden/.
  3. ParsedTool JSON cache: lives in this repo at packages/core/test/fixtures/connection_workflows/parsed_tools/<tool_id>.json. Not in Galaxy.
  4. Tool list: auto-derived by walking the synced *.gxwf.yml fixtures for tool_id: refs. No hand-maintained manifest.
  5. Sidecar schema: ported verbatim. TS gets a dictVerifyEach helper with the same target: [path,…] / value: X entries.
  6. Fixtures without sidecars: allowed — ok_* just asserts valid, fail_* just asserts invalid, same as Python today.
  7. sample_sheet collection types: defer. Not blocking.
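Decisions 4 and 6 imply that a fixture's expectation can be read off its filename, with the sidecar being optional. A minimal sketch of that discovery step follows, written in Python for consistency with the sync script. The names ConnectionFixture and discover_fixtures are hypothetical, as are the assumptions that a sidecar lives at expected/<stem>.yml and that any prefix other than ok_ or fail_ should be rejected.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

FIXTURE_SUFFIX = ".gxwf.yml"


@dataclass(frozen=True)
class ConnectionFixture:
    stem: str                     # e.g. "ok_list_paired_to_paired"
    workflow_path: Path
    sidecar_path: Optional[Path]  # None when the fixture has no sidecar
    expect_valid: bool            # derived from the ok_/fail_ prefix


def discover_fixtures(fixtures_dir: Path) -> list[ConnectionFixture]:
    """Pair each *.gxwf.yml with its optional sidecar and its expected validity."""
    fixtures = []
    for workflow_path in sorted(fixtures_dir.glob("*" + FIXTURE_SUFFIX)):
        stem = workflow_path.name[: -len(FIXTURE_SUFFIX)]
        if stem.startswith("ok_"):
            expect_valid = True
        elif stem.startswith("fail_"):
            expect_valid = False
        else:
            # Assumption: unknown prefixes are an error rather than a silent skip.
            raise ValueError(f"fixture {stem!r} has neither an ok_ nor a fail_ prefix")
        # Assumption: sidecars are named after the fixture stem.
        sidecar = fixtures_dir / "expected" / f"{stem}.yml"
        fixtures.append(
            ConnectionFixture(
                stem=stem,
                workflow_path=workflow_path,
                sidecar_path=sidecar if sidecar.is_file() else None,
                expect_valid=expect_valid,
            )
        )
    return fixtures
```

A fixture with sidecar_path set to None only asserts valid or invalid; one with a sidecar additionally runs the sidecar's entries.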

Work Items

WI-1: Sync fixture corpus (Galaxy → TS)

Add two Makefile targets in the TS repo, sync-connection-workflows (below) and check-sync-connection-workflows:

CONN_WF_SRC = $(GALAXY_ROOT)/test/unit/tool_util/workflow_state/connection_workflows
CONN_WF_DST = packages/core/test/fixtures/connection_workflows

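sync-connection-workflows:
	@test -n "$(GALAXY_ROOT)" || { echo "GALAXY_ROOT is not set" >&2; exit 1; }
	# Note: clearing the destination also removes parsed_tools/; sync-parsed-tools regenerates it.
	rm -rf $(CONN_WF_DST)
	mkdir -p $(CONN_WF_DST)/expected
	cp $(CONN_WF_SRC)/*.gxwf.yml $(CONN_WF_DST)/
	cp $(CONN_WF_SRC)/expected/*.yml $(CONN_WF_DST)/expected/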

sync-connection-workflows copies *.gxwf.yml and expected/*.yml only. No parsed-tool work here; that’s WI-2.

Roll it into the top-level sync target alongside the existing sync-golden etc. Add a check-sync-connection-workflows target (SHA diff) matching the existing pattern.

WI-2: Sync ParsedTool JSON cache (new Python sync script)

New script: scripts/sync-parsed-tools.py (this repo). Invoked via $(GALAXY_PYTHON) (Galaxy’s venv). Responsibilities:

  1. Walk $CONN_WF_DST/*.gxwf.yml, collect every tool_id: value referenced in steps.*. Tool IDs like gx_data, collection_paired_test, collection_type_source, etc. (current fixtures use Galaxy’s functional test tools under $GALAXY_ROOT/test/functional/tools/).
  2. For each tool_id, resolve via the same logic FunctionalGetToolInfo uses: functional_test_tool_source(tool_id) with recursive directory walk fallback (port from test/unit/tool_util/workflow_state/functional_tool_info.py — ~20 lines).
  3. Parse with galaxy.tool_util.model_factory.parse_tool.
  4. Serialize via Pydantic: parsed_tool.model_dump_json(indent=2, exclude_none=True, by_alias=True) (match whatever serialization shape the TS ParsedTool Effect Schema expects — cross-check against packages/schema/src/tool/parsed-tool.ts before committing).
  5. Write one file per tool: packages/core/test/fixtures/connection_workflows/parsed_tools/<tool_id>.json.
  6. Write SHA256 manifest parsed_tools.sha256 for CI verification (match the sync-test-format-schema pattern).
  7. Exit non-zero on any unresolved tool_id, listing them. No silent skips (loud failure requested — counters Galaxy’s current FunctionalGetToolInfo exception-swallowing; see idea #8 from the brainstorm).
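Steps 1, 6, and 7 above can be sketched without importing Galaxy. The block below is a rough outline under stated assumptions, not the final script: collect_tool_ids does a line-oriented text scan rather than a real YAML walk of steps.*, the resolve callable stands in for the functional_test_tool_source lookup, and the manifest line format ("<digest>  <filename>") is a guess that should be checked against the sync-test-format-schema pattern.

```python
import hashlib
import re
from pathlib import Path
from typing import Callable, Iterable, Optional

# Matches "tool_id: foo", "- tool_id: foo", and quoted values; ignores empty values.
TOOL_ID_RE = re.compile(
    r"""^[ \t]*(?:-[ \t]+)?tool_id:[ \t]*["']?([^"'\s#]+)""", re.MULTILINE
)


def collect_tool_ids(fixtures_dir: Path) -> list[str]:
    """Return the sorted, de-duplicated tool IDs mentioned in *.gxwf.yml files."""
    tool_ids: set[str] = set()
    for path in sorted(fixtures_dir.glob("*.gxwf.yml")):
        tool_ids.update(TOOL_ID_RE.findall(path.read_text(encoding="utf-8")))
    return sorted(tool_ids)


def find_unresolved(
    tool_ids: Iterable[str], resolve: Callable[[str], Optional[object]]
) -> list[str]:
    """List tool IDs the (hypothetical) resolver cannot find; the caller exits non-zero if any."""
    return [tool_id for tool_id in tool_ids if resolve(tool_id) is None]


def write_manifest(out_dir: Path, manifest_name: str = "parsed_tools.sha256") -> Path:
    """Write one '<sha256>  <filename>' line per JSON file in out_dir."""
    lines = []
    for path in sorted(out_dir.glob("*.json")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.name}")
    manifest = out_dir / manifest_name
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest
```

In the real script, a non-empty find_unresolved result would be printed and followed by a non-zero exit before any manifest is written.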

Makefile:

PARSED_TOOLS_DST = $(CONN_WF_DST)/parsed_tools
GALAXY_PYTHON ?= $(GALAXY_ROOT)/.venv/bin/python

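sync-parsed-tools: sync-connection-workflows
	@test -n "$(GALAXY_ROOT)" || { echo "GALAXY_ROOT is not set" >&2; exit 1; }
	@command -v "$(GALAXY_PYTHON)" >/dev/null || { echo "GALAXY_PYTHON not found: $(GALAXY_PYTHON)" >&2; exit 1; }
	PYTHONPATH="$(GALAXY_ROOT)/lib" "$(GALAXY_PYTHON)" scripts/sync-parsed-tools.py \
	    --fixtures $(CONN_WF_DST) --out $(PARSED_TOOLS_DST) --galaxy-root $(GALAXY_ROOT)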

Order matters: sync-parsed-tools depends on sync-connection-workflows so tool discovery reads the just-synced fixtures (one source of truth). Both folded into the top-level sync.

WI-3: Fixture + tool loader in TS

Small util in packages/core/src/testing/ (or packages/core/test/helpers/ if it shouldn’t ship), split across load-connection-fixtures.ts, parsed-tool-cache.ts, and dict-verify-each.ts (see the File / Path Cheatsheet).

Export it from @galaxy-tool-util/core under a testing subpath so downstream packages (including the future connection-validation package) can consume it without duplication.
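The check that a dictVerifyEach-style helper performs over target: [path,…] / value: X entries can be sketched as below. It is written in Python for consistency with the sync script and is not Galaxy's actual helper: the function names are hypothetical, and the path semantics (string elements index dicts, integer elements index lists) are an assumption to confirm against the sidecars.

```python
from typing import Any, Iterable, Mapping, Sequence, Union

PathElement = Union[str, int]


def lookup(result: Any, path: Sequence[PathElement]) -> Any:
    """Follow path through nested dicts and lists; raises if a step is missing."""
    current = result
    for element in path:
        current = current[element]
    return current


def dict_verify_each(result: Any, entries: Iterable[Mapping[str, Any]]) -> list[str]:
    """Return one message per entry whose target path is missing or has the wrong value."""
    failures = []
    for entry in entries:
        path = entry["target"]
        expected = entry["value"]
        try:
            actual = lookup(result, path)
        except (KeyError, IndexError, TypeError):
            failures.append(f"{path}: path not found")
            continue
        if actual != expected:
            failures.append(f"{path}: expected {expected!r}, got {actual!r}")
    return failures
```

Collecting every mismatch instead of stopping at the first keeps a failing fixture's report complete, which helps when comparing Python and TS results side by side.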

WI-4: Truth-table for collection-type algebra (idea #3)

Replace the hand-ported test_connection_types.py cases with a YAML truth table that both sides consume.

Canonical location: new file in Galaxy: test/unit/tool_util/workflow_state/connection_type_cases.yml. Owned by the module under test; synced to TS.

Shape (informal):

- op: can_match          # can_match | can_map_over | effective_map_over
  output: list:paired
  input: paired
  expected: true
  semantics_ref: MAPPING_LIST_PAIRED_OVER_PAIRED   # optional
  note: "…"                                         # optional

- op: effective_map_over
  output: list:list
  input: list:paired_or_unpaired
  expected: list
  semantics_ref: MAPPING_LIST_LIST_OVER_LIST_PAIRED_OR_UNPAIRED

Special tokens for sentinels: NULL, ANY. Parser resolves via NULL_COLLECTION_TYPE / ANY_COLLECTION_TYPE.
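A loader for the shape above might look like the following sketch. It operates on already-parsed YAML (a list of dicts) and takes the algebra as a table of callables, so nothing here depends on Galaxy. The sentinel objects are stand-ins for NULL_COLLECTION_TYPE / ANY_COLLECTION_TYPE, and two details are assumptions: that each op is called as op(output, input), and that the NULL / ANY tokens may also appear in expected.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Mapping, Optional

NULL_SENTINEL = object()  # stand-in for NULL_COLLECTION_TYPE
ANY_SENTINEL = object()   # stand-in for ANY_COLLECTION_TYPE
_SENTINELS = {"NULL": NULL_SENTINEL, "ANY": ANY_SENTINEL}
VALID_OPS = ("can_match", "can_map_over", "effective_map_over")


def resolve_token(value: Any) -> Any:
    """Map the NULL / ANY tokens to sentinels; leave everything else untouched."""
    return _SENTINELS.get(value, value) if isinstance(value, str) else value


@dataclass(frozen=True)
class TypeCase:
    op: str
    output: Any
    input: Any
    expected: Any
    semantics_ref: Optional[str] = None
    note: Optional[str] = None


def parse_case(raw: Mapping[str, Any]) -> TypeCase:
    if raw["op"] not in VALID_OPS:
        raise ValueError(f"unknown op: {raw['op']!r}")
    return TypeCase(
        op=raw["op"],
        output=resolve_token(raw["output"]),
        input=resolve_token(raw["input"]),
        expected=resolve_token(raw["expected"]),
        semantics_ref=raw.get("semantics_ref"),
        note=raw.get("note"),
    )


def run_cases(
    raw_cases: Iterable[Mapping[str, Any]],
    ops: Mapping[str, Callable[[Any, Any], Any]],
) -> list[str]:
    """Run every case through its op and return a message per mismatch."""
    failures = []
    for index, raw in enumerate(raw_cases):
        case = parse_case(raw)
        actual = ops[case.op](case.output, case.input)
        if actual != case.expected:
            failures.append(f"case {index} ({case.op}): expected {case.expected!r}, got {actual!r}")
    return failures
```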

Order: land the YAML + Python loader in Galaxy first (red-to-green: new test passes against existing algebra). Then TS side consumes on top of workflow-graph Phase 2.

WI-5: workflow_format_validation tracking in collection_semantics.yml (idea #5)

Extend collection_semantics.yml (42 examples) with a new test-tracking key alongside existing tool_runtime, workflow_runtime, workflow_editor.

tests:
    tool_runtime:
        api_test: "test_tool_execute.py::test_map_over_collection"
    workflow_runtime:
        framework_test: "collection_semantics_cat_0"
    workflow_editor: "accepts paired data -> data connection"
    workflow_format_validation:
        fixture: "ok_list_paired_to_paired"     # fixture stem in connection_workflows/
        type_cases:                               # optional — which algebra cases map here
          - {op: can_map_over, output: "list:paired", input: "paired"}

Notes:

Action items:

  1. Add a collection_semantics.yml schema/validator pass (already happens for the other keys?) to enforce that referenced fixture stems exist.
  2. Sweep all 42 examples, fill in workflow_format_validation where coverage exists today. Gaps → new fixtures (folds into WI-6 backlog).
  3. Add a one-off Python script test/unit/tool_util/workflow_state/test_collection_semantics_coverage.py that cross-checks: every workflow_format_validation.fixture → real file; every .gxwf.yml fixture → referenced by at least one example (or flagged orphan: true).
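The cross-check in item 3 reduces to two set differences. The sketch below assumes the examples are already parsed into dicts that carry a tests mapping shaped like the snippet above (the real nesting in collection_semantics.yml may differ), and it models orphan: true as an explicit known_orphans set, since where that flag would live is not settled here. All function names are hypothetical.

```python
from pathlib import Path
from typing import Any, Iterable, Mapping

FIXTURE_SUFFIX = ".gxwf.yml"


def fixture_stems(fixtures_dir: Path) -> set[str]:
    """Stems of every *.gxwf.yml fixture on disk."""
    return {
        path.name[: -len(FIXTURE_SUFFIX)]
        for path in fixtures_dir.glob("*" + FIXTURE_SUFFIX)
    }


def referenced_stems(examples: Iterable[Mapping[str, Any]]) -> set[str]:
    """Fixture stems named under tests.workflow_format_validation.fixture."""
    stems = set()
    for example in examples:
        tracking = (example.get("tests") or {}).get("workflow_format_validation") or {}
        fixture = tracking.get("fixture")
        if fixture:
            stems.add(fixture)
    return stems


def check_coverage(
    examples: Iterable[Mapping[str, Any]],
    fixtures_dir: Path,
    known_orphans: Iterable[str] = (),
) -> tuple[list[str], list[str]]:
    """Return (referenced but missing on disk, on disk but unreferenced and unflagged)."""
    on_disk = fixture_stems(fixtures_dir)
    referenced = referenced_stems(examples)
    missing = sorted(referenced - on_disk)
    orphans = sorted(on_disk - referenced - set(known_orphans))
    return missing, orphans
```

A test built on this would assert that both returned lists are empty.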

WI-6: Convert convertible programmatic tests into fixtures (idea #6)

Walk test_connection_validation.py and identify cases that only need a workflow + tool defs. Candidates (based on my read of the current file):

Each converted test:

  1. New .gxwf.yml under connection_workflows/ — ideally the tool IDs it references already exist in functional test tools. If a test needs a synthetic tool shape not present, write a minimal tool XML under test/functional/tools/ (preferred) or skip the conversion.
  2. Sidecar expected/*.yml capturing the specific assertions the programmatic test made (map_over, connections[*].status, connections[*].mapping, etc.).
  3. Delete the programmatic test once the fixture covers it.

Keep programmatic (don’t convert):

Non-goal: 100% conversion. The goal is that every case that naturally fits the fixture shape gets to TS for free.


Implementation Order

  1. WI-1 (fixture sync) — trivial, unblocks everything.
  2. WI-2 (parsed-tools sync script) — sketch script against current fixtures; pnpm --filter @galaxy-tool-util/core test stays green with no consumers yet.
  3. WI-3 (TS loader + dictVerifyEach) — tested by snapshot-style “loads N fixtures” test; real consumers arrive with the validator.
  4. WI-4 (type-algebra truth table) — Galaxy side first, then TS after workflow-graph Phase 2 lands. Can proceed in parallel with 1-3.
  5. WI-5 (workflow_format_validation keys in collection_semantics.yml) — Galaxy-only, no TS dependency. Run anytime.
  6. WI-6 (programmatic → fixture conversion) — after WI-1 so new fixtures immediately sync to TS. Incremental; one commit per batch is fine.

WI-4, WI-5, WI-6 are independent of each other and of the TS validator; sequence by reviewer bandwidth.


Status (2026-04-25)

Galaxy-side foundations landed on wf_tool_state. Recap by WI:

WI-4 — type-algebra truth table

Done:

Remaining:

WI-5 — workflow_format_validation + algebra tracking

Done:

Remaining:

WI-6 — programmatic → fixture conversion

Done:

Remaining:

Out-of-scope work that landed alongside


File / Path Cheatsheet

Galaxy repo (source of truth)

test/unit/tool_util/workflow_state/
├── connection_workflows/
│   ├── *.gxwf.yml                       # synced
│   └── expected/*.yml                   # synced, verbatim schema
├── connection_type_cases.yml            # NEW (WI-4), synced
├── test_connection_types.py             # slimmed to sentinel tests + YAML loader (WI-4)
├── test_connection_validation.py        # shrinks as fixtures convert (WI-6)
└── test_collection_semantics_coverage.py  # NEW (WI-5)

lib/galaxy/model/dataset_collections/types/collection_semantics.yml
    # gains `workflow_format_validation:` entries (WI-5)

TS repo (this repo)

packages/core/test/fixtures/connection_workflows/
├── *.gxwf.yml                           # from sync-connection-workflows
├── expected/*.yml
└── parsed_tools/
    ├── <tool_id>.json                   # from sync-parsed-tools
    └── parsed_tools.sha256

packages/core/src/testing/ (or test/helpers/)
├── load-connection-fixtures.ts          # WI-3
├── parsed-tool-cache.ts                 # WI-3
└── dict-verify-each.ts                  # WI-3

packages/workflow-graph/test/fixtures/connection_type_cases.yml   # WI-4
packages/workflow-graph/test/connection-type-cases.test.ts        # WI-4

scripts/sync-parsed-tools.py             # WI-2
Makefile                                 # +targets: sync-connection-workflows, sync-parsed-tools, sync-connection-type-cases

Risks / Open Points

Unresolved Questions