INTEROP_CONNECTION_TESTING_HARDEN_PLAN

Interop Connection Testing — Hardening Plan

Goal

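Maximize declarative test coverage of the Galaxy connection validator so the same corpus can be consumed by galaxy-tool-util-ts. There are three workstreams (the WI-5 fixture sweep, the WI-6 programmatic-to-fixture conversion, and the orphan audit), all Galaxy-side and all incremental.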

Background: INTEROP_CONNECTION_TESTING_PLAN.md (the original interop design) and CONNECTION_REBASE_PLAN.md (the recent rebase). The Status section in the interop plan lists the post-rebase coverage baseline (40/42 examples with algebra, 7/42 with a fixture).

Both ends of this plan are enforced by test_collection_semantics_coverage.py. The cross-check fails the build if a new fixture is neither referenced by an example nor listed in KNOWN_ORPHANS, and it also fails if a new example is not covered by algebra:, workflow_format_validation:, or EXPECTED_NEITHER.
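
The two-sided cross-check can be pictured with a minimal sketch. This is not the code in test_collection_semantics_coverage.py: the function name `cross_check`, the in-memory data shapes, and the labels marked hypothetical are assumptions made for illustration only.

```python
def cross_check(examples, fixtures, known_orphans, expected_neither):
    """Return a list of problems; an empty list means the cross-check passes.

    examples: dict mapping example label -> its `tests:` mapping (assumed shape).
    fixtures: set of fixture stems found on disk.
    """
    problems = []
    referenced = set()
    for label, tests in examples.items():
        fixture = (tests.get("workflow_format_validation") or {}).get("fixture")
        if fixture:
            referenced.add(fixture)
        covered = bool(tests.get("algebra")) or bool(fixture) or label in expected_neither
        if not covered:
            problems.append(f"example {label} has no coverage")
    for stem in sorted(fixtures):
        if stem not in referenced and stem not in known_orphans:
            problems.append(f"fixture {stem} is unreferenced")
    return problems


examples = {
    # Algebra-only today; the fixture link is added after WI-5 lands.
    "MAPPING_LIST_OVER_PAIRED_OR_UNPAIRED": {"algebra": [{"op": "can_map_over"}]},
    # Runtime-only example, excused through EXPECTED_NEITHER.
    "BASIC_MAPPING_INCLUDING_SINGLE_DATASET": {},
    # Hypothetical label with no coverage at all.
    "NEW_UNCOVERED_EXAMPLE": {},
}
fixtures = {"ok_list_to_paired_or_unpaired", "ok_some_plumbing_case"}  # second stem is hypothetical
known_orphans = {"ok_some_plumbing_case"}
expected_neither = {"BASIC_MAPPING_INCLUDING_SINGLE_DATASET"}

problems = cross_check(examples, fixtures, known_orphans, expected_neither)
for problem in problems:
    print(problem)
```

In this toy data the new fixture is flagged until its example gains a workflow_format_validation: entry, which is the situation WI-5 resolves.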


Combined Effort & Output

| Workstream | New .gxwf.yml | New tool XMLs | New examples in collection_semantics.yml | Programmatic tests deleted/converted |
|---|---|---|---|---|
| WI-5 fixture sweep | 3 | 0 | 0 (links existing examples) | 0 |
| WI-6 conversion | 7 | 0 | 0 | 10 convert + 3 delete (trivial) |
| Orphan audit | 0 | 0 | 3 | 0 |
| Total | 10 | 0 | 3 | 13 |

Estimated effort: 6–8 hours total, splittable across reviewers.


WI-5 — Fixture Sweep

Triage criteria

Coverage today

40/42 examples carry algebra:. 7/42 also carry workflow_format_validation:. The two without algebra are runtime-only (BASIC_MAPPING_INCLUDING_SINGLE_DATASET, BASIC_MAPPING_TWO_INPUTS_WITH_IDENTICAL_STRUCTURE) and stay in EXPECTED_NEITHER.

New fixtures (3)

All three use the collection_paired_or_unpaired tool (already in test/functional/tools/), so no new tool XMLs are needed.

| Fixture stem | Example label | Workflow shape | Validates |
|---|---|---|---|
| ok_list_paired_to_paired_or_unpaired | MAPPING_LIST_PAIRED_OVER_PAIRED_OR_UNPAIRED | list:paired collection input → tool with paired_or_unpaired slot → dataset out | Sub-collection extraction + asymmetry (paired ⊂ paired_or_unpaired) |
| ok_list_list_paired_to_paired_or_unpaired | MAPPING_LIST_LIST_PAIRED_OVER_PAIRED_OR_UNPAIRED | list:list:paired collection input → tool with paired_or_unpaired slot → dataset out | Higher-rank nesting with same asymmetry rule |
| ok_list_to_paired_or_unpaired | MAPPING_LIST_OVER_PAIRED_OR_UNPAIRED | list collection input → tool with paired_or_unpaired slot → dataset out | single_datasets sub_collection_type mapping (each list element wrapped as unpaired) |
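
The three shapes above follow one pattern, which a simplified model can make concrete. The function `map_over_for_paired_or_unpaired` is hypothetical and is not the validator's implementation; it assumes the slot consumes a trailing `paired` rank when one exists and otherwise wraps each leaf dataset as unpaired.

```python
def map_over_for_paired_or_unpaired(output_type):
    """Return (map_over, consumed) for a paired_or_unpaired input slot.

    Simplified model (assumption): a trailing paired-style rank is consumed
    by the slot and the outer ranks become the map-over type. Without such a
    rank, the whole type is mapped over and each dataset is treated as an
    unpaired element ("single_datasets").
    """
    ranks = output_type.split(":")
    if ranks[-1] in ("paired", "paired_or_unpaired"):
        outer, consumed = ranks[:-1], ranks[-1]
    else:
        outer, consumed = ranks, "single_datasets"
    return (":".join(outer) or None, consumed)


for collection_type in ("list:paired", "list:list:paired", "list", "paired"):
    print(collection_type, "->", map_over_for_paired_or_unpaired(collection_type))
```

Under this model a bare `paired` output connects directly with no map-over, which is the asymmetry (paired ⊂ paired_or_unpaired) the first two fixtures rely on.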

Sidecar shape

- target: [step_results, 1, map_over]
  value: "list"
- target: [step_results, 1, connections, 0, status]
  value: "ok"
- target: [step_results, 1, connections, 0, mapping]
  value: "list"

After landing, add workflow_format_validation: { fixture: <stem> } to the three example entries in collection_semantics.yml.
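
Each sidecar entry above reads as a path into the validation result plus an expected value. A minimal sketch of that lookup follows; the helper names `resolve` and `check_sidecar` and the shape of `result` are assumptions inferred from the target paths, not the actual test harness.

```python
def resolve(result, target):
    """Walk a target path (string keys and integer indexes) into a nested result."""
    node = result
    for key in target:
        node = node[key]
    return node


def check_sidecar(result, expectations):
    """Return (target, expected, actual) tuples for every mismatching entry."""
    failures = []
    for expectation in expectations:
        actual = resolve(result, expectation["target"])
        if actual != expectation["value"]:
            failures.append((expectation["target"], expectation["value"], actual))
    return failures


# Assumed result shape: one entry per step, each with map_over and connections.
result = {
    "step_results": [
        {"map_over": None, "connections": []},
        {"map_over": "list", "connections": [{"status": "ok", "mapping": "list"}]},
    ]
}
expectations = [
    {"target": ["step_results", 1, "map_over"], "value": "list"},
    {"target": ["step_results", 1, "connections", 0, "status"], "value": "ok"},
    {"target": ["step_results", 1, "connections", 0, "mapping"], "value": "list"},
]

failures = check_sidecar(result, expectations)
print("sidecar failures:", failures)
```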

Why not more

The 32 remaining algebra-only examples are pure unary type rejections / matches (COLLECTION_INPUT_*_NOT_CONSUMES_*, *_REDUCTION_INVALID, SAMPLE_SHEET_* symmetric-or-asymmetric matches). Algebra captures them completely; a fixture would just confirm “the validator wires accepts into edge validation” — already implicitly true for every passing fixture.


WI-6 — Programmatic → Fixture Conversion

Triage criteria

Disposition (23 methods)

Convert (10):

Keep programmatic (10):

Delete (3):

Subworkflow rule of thumb

Up to one level of subworkflow nesting is fixturable, because sidecar paths stay readable. At two levels deep, paths like step_results[2].resolved_outputs[0]... grow with nesting depth and become brittle. The gxformat2 used in fixtures requires an explicit input_subworkflow_step_id on connections crossing the subworkflow boundary; verify this with the basic passthrough fixture before scaling.

Sidecar examples

# ok_two_list_inputs_map_over
- target: [step_results, 2, map_over]
  value: "list"
- target: [step_results, 2, connections, 0, status]
  value: "ok"

# ok_subworkflow_list_propagation
- target: [step_results, 1, map_over]
  value: "list"
- target: [step_results, 2, connections, 0, mapping]
  value: "list"

Orphan Audit

8 fixtures currently in KNOWN_ORPHANS. Of these:

Stay orphan (5)

Genuine plumbing or validator-internal concerns that don’t fit a single collection_semantics.yml label:

Tighten the KNOWN_ORPHANS comments to spell out why each one stays — current comments are terse.

Promote to collection_semantics.yml (3)

ok_collection_output_with_map_over → new COLLECTION_OUTPUT_PRODUCES_NESTED_MAPPING

Place after MAPPING_LIST_PAIRED_OVER_PAIRED. Tests sub-collection mapping where a tool’s static list:paired output composes with an outer list map-over to produce list:list:paired — distinct from MAPPING_LIST_PAIRED_OVER_PAIRED which only tests consuming list:paired.

- example:
    label: COLLECTION_OUTPUT_PRODUCES_NESTED_MAPPING
    assumptions:
    - datasets: ["d_1,...,d_n"]
    - tool1:
        in: {i: list}
        out: {c: "collection<list:paired>"}
    - tool2:
        in: {i: "collection<paired>"}
        out: {o: dataset}
    - collections:
        C: [list, {i1: d_1, ..., in: d_n}]
    then:
        type: map_over
        invocation:
            inputs:
                i: {type: map_over, collection: C}
        produces:
            c:
                type: collection
                collection_type: "list:paired"
                elements:
                    i1:
                        type: nested_elements
                        elements:
                            forward: {type: ellipsis}
                            reverse: {type: ellipsis}
    tests:
        workflow_format_validation:
            fixture: ok_collection_output_with_map_over
        algebra:
          - {op: can_map_over, output: list, input: NULL}
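
The composition this example documents, an outer `list` map-over wrapping a static `list:paired` output to give `list:list:paired`, can be sketched as a prefix operation. The helper `mapped_output_type` is hypothetical and only models the type arithmetic, not the validator's code path.

```python
def mapped_output_type(map_over, static_output_type):
    """Compose a step's map-over type with one output's static type.

    map_over: collection type the step is mapped over, or None.
    static_output_type: the output's declared collection type, or None for
    a plain dataset output.
    """
    if map_over is None:
        return static_output_type
    if static_output_type is None:
        # A dataset output mapped over a collection takes the map-over shape.
        return map_over
    return f"{map_over}:{static_output_type}"


nested = mapped_output_type("list", "list:paired")
print(nested)
```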

fail_incompatible_map_over → new SIBLING_MAP_OVER_TYPE_MISMATCH_INVALID

Place after PAIRED_OR_UNPAIRED_NOT_CONSUMED_BY_LIST_WHEN_MAPPING. Tests sibling map-over rejection: two inputs whose contributed map-over types are not compatible().

- example:
    label: SIBLING_MAP_OVER_TYPE_MISMATCH_INVALID
    assumptions:
    - datasets: ["d_1,...,d_n", d_f, d_r]
    - tool:
        in: {i1: dataset, i2: dataset}
        out: {o: dataset}
    - collections:
        C_list: [list, {i1: d_1, ..., in: d_n}]
        C_paired: [paired, {forward: d_f, reverse: d_r}]
    then:
        type: invalid
        invocation:
            inputs:
                i1: {type: map_over, collection: C_list}
                i2: {type: map_over, collection: C_paired}
    is_valid: false
    tests:
        workflow_format_validation:
            fixture: fail_incompatible_map_over
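
The rejection this example captures can be sketched as a fold over the map-over types contributed by each input. Both `resolve_step_map_over` and the strict-equality `compatible` below are illustrative assumptions; the validator's real compatible() may accept more than exact matches.

```python
def compatible(left, right):
    """Assumption for this sketch: two map-over types are compatible only if equal."""
    return left == right


def resolve_step_map_over(contributions):
    """Combine per-input map-over types into one step-level map-over type.

    contributions: dict mapping input name -> contributed map-over type,
    or None when the connection is a direct match.
    Raises ValueError when two siblings contribute incompatible types.
    """
    step_map_over = None
    for name, contributed in contributions.items():
        if contributed is None:
            continue
        if step_map_over is None:
            step_map_over = contributed
        elif not compatible(step_map_over, contributed):
            raise ValueError(
                f"input {name} maps over {contributed!r}, "
                f"but a sibling already maps over {step_map_over!r}"
            )
    return step_map_over


print(resolve_step_map_over({"i1": "list", "i2": "list"}))
try:
    resolve_step_map_over({"i1": "list", "i2": "paired"})
except ValueError as error:
    print("invalid:", error)
```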

ok_paired_maps_over_multi_data → new PAIRED_MAPS_OVER_MULTI_DATA

Place after PAIRED_REDUCTION_INVALID. Closes a real specification gap: collection_semantics.yml documents that paired cannot reduce into dataset<multiple=true> (PAIRED_REDUCTION_INVALID), but says nothing about whether map-over is still available. The validator path (connection_validation.py:268-277) takes the non-list-like fall-through and accepts the connection as a map-over (each pair element fed singly; step runs twice; multiple=true is satisfied because per-execution arity is 1).

The sidecar asserts step_results[1].map_over == "paired" and connections[0].mapping == "paired" — i.e. it’s testing map-over, not reduction. No contradiction with PAIRED_REDUCTION_INVALID; both behaviors are correct via different codepaths.

- example:
    label: PAIRED_MAPS_OVER_MULTI_DATA
    doc: |
        A ``paired`` collection cannot reduce into a ``multiple=true`` data
        input (see PAIRED_REDUCTION_INVALID — only list-like collections
        reduce). It can, however, map over such an input: each pair element
        is fed singly, and the step runs twice. ``multiple=true`` does not
        block this — the slot accepts >=1 dataset, and the per-execution
        cardinality is 1.
    assumptions:
    - datasets: [d_f, d_r]
    - tool:
        in: {i: "dataset<multiple=true>"}
        out: {o: dataset}
    - collections:
        C: [paired, {forward: d_f, reverse: d_r}]
    then:
        type: map_over
        invocation:
            inputs:
                i: {type: map_over, collection: C}
    tests:
        workflow_format_validation:
            fixture: ok_paired_maps_over_multi_data
        algebra:
          - {op: can_map_over, output: paired, input: NULL}
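
The two outcomes for a `multiple=true` data input can be contrasted in a small decision sketch. It does not reproduce connection_validation.py: the function `connect_to_multi_data`, the `LIST_LIKE` set, and the returned dictionary shape are assumptions, and only rank-1 collection types are modelled.

```python
# Assumption: only "list" is treated as list-like here; the validator may
# recognize further list-like collection types.
LIST_LIKE = {"list"}


def connect_to_multi_data(collection_type):
    """Classify a rank-1 collection connected to a dataset<multiple=true> input."""
    if ":" in collection_type:
        raise NotImplementedError("sketch covers rank-1 collection types only")
    if collection_type in LIST_LIKE:
        # All elements feed a single execution of the step.
        return {"kind": "reduction", "map_over": None}
    # Non-list-like fall-through: one execution per element, one dataset each.
    return {"kind": "map_over", "map_over": collection_type}


paired_result = connect_to_multi_data("paired")
list_result = connect_to_multi_data("list")
print(paired_result)
print(list_result)
```

The `paired` case yields a map-over whose type matches the values the sidecar asserts for ok_paired_maps_over_multi_data.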

Implementation Order

The workstreams are independent and can be done in any order or in parallel. Suggested sequence by reviewer load:

  1. WI-5 (3 fixtures) — smallest, validates the fixture-creation pipeline against current scaffolding.
  2. Orphan promotions (2 examples + risk-flag investigation) — catalog edits + cross-check still passes.
  3. WI-6 conversions (10 fixtures + 3 deletions) — biggest; do in batches (map-over variants → multi-data → subworkflow), one PR per batch.

Each step keeps test_collection_semantics_coverage.py and test_connection_workflows.py green; no flag-day cutover.


Sanity Checks (after each batch)


Open Questions