CONNECTION_REBASE_PLAN

Connection-Algebra Rebase Plan

Situation

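We’re working on TS↔Python connection-validation interop (see INTEROP_CONNECTION_TESTING_PLAN.md). The original Python connection validator landed on wf_tool_state per old/CONNECTION_VALIDATION.md. While building TS interop we noticed three upstream improvements that belong in Galaxy proper, not in the interop branch:

  1. 39597b3366: Split can_match_type into accepts + compatible
  2. 120f527c5a: Rename has_subcollections_of_type → can_map_over; drop is_subcollection_of_type
  3. 0683385f5d: Reframe algebra docstrings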

Those three commits were authored on map_match_logic (worktree at /Users/jxc755/projects/worktrees/galaxy/branch/map_match_logic, branched from 4cafd91e1d on dev). Their merge base with wf_tool_state is 4cafd91e1d.

The problem: wf_tool_state has already done the extraction (00155e913c: lib/galaxy/tool_util/collections.py is now the implementation; lib/galaxy/model/dataset_collections/type_description.py is a 32-line shim). All three rebase commits modify the OLD-location type_description.py, plus they touch query.py, structure.py, matching.py, terminals.ts, collectionTypeDescription.ts, collection_semantics.yml, and the matching test files. So a naive git rebase will produce one large conflict per commit on type_description.py, and the rename in commit 2 will silently break wf_tool_state-specific code that already calls has_subcollections_of_type (connection_types.py, connection_validation.py).
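One way to sidestep the relocation conflict is to retarget each commit's type_description.py hunks at the new file before applying them. The following is a minimal sketch, not part of the existing tooling: retarget_patch is a hypothetical helper that rewrites only the file-header lines of a unified diff, and whether the retargeted hunks then apply cleanly depends on how far collections.py has drifted from the old file.

```python
OLD_PATH = "lib/galaxy/model/dataset_collections/type_description.py"
NEW_PATH = "lib/galaxy/tool_util/collections.py"


def retarget_patch(patch_text: str, old_path: str = OLD_PATH, new_path: str = NEW_PATH) -> str:
    """Point a unified diff's file headers at new_path instead of old_path.

    Hypothetical helper: only the header lines of each file section
    (everything between "diff --git" and the first "@@") are rewritten,
    so hunk bodies that happen to mention the old path stay untouched.
    """
    out = []
    in_header = False
    for line in patch_text.splitlines(keepends=True):
        if line.startswith("diff --git "):
            in_header = True
        elif line.startswith("@@"):
            in_header = False
        if in_header and line.startswith(("diff --git ", "--- a/", "+++ b/")):
            line = line.replace("a/" + old_path, "a/" + new_path)
            line = line.replace("b/" + old_path, "b/" + new_path)
        out.append(line)
    return "".join(out)
```

The input could come from something like `git diff 39597b3366^ 39597b3366 -- <old path>`, with the output fed to `git apply --3way`; treat that pipeline as a suggestion to try, not a verified recipe.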

Working tree on wf_tool_state also has uncommitted WIP that anticipates the rename: it uses op: can_map_over in collection_semantics.yml algebra entries and adds the new connection_type_cases.yml (WI-4 from the interop plan). Both need a parking decision before rebasing.

Branches / Worktrees

Predicted Conflicts (per commit)

Commit 1: 39597b3366 — Split can_match_type into accepts + compatible

Files touched on map_match_logic:

| File | Conflict? | Resolution |
| --- | --- | --- |
| lib/galaxy/model/dataset_collections/type_description.py | YES (relocation) | Apply the diff to lib/galaxy/tool_util/collections.py instead. Leave the shim file alone: it subclasses the base, so new methods come along for free. No re-export edits needed. |
| lib/galaxy/model/dataset_collections/matching.py | Maybe clean | Renames can_match_type callers to compatible. Check for unrelated changes on wf_tool_state. |
| lib/galaxy/model/dataset_collections/query.py | Maybe clean | Same: rename callers. |
| lib/galaxy/model/dataset_collections/structure.py | Maybe clean | Same: rename callers in Tree.compatible_shape. |
| lib/galaxy/tools/execute.py | Likely clean | One-call rename. |
| client/src/components/Workflow/Editor/modules/collectionTypeDescription.ts | Likely clean | New methods + rename. |
| client/src/components/Workflow/Editor/modules/collectionTypeDescription.test.ts | New file, clean | |
| client/src/components/Workflow/Editor/modules/terminals.ts | Likely clean | Caller rename. |
| client/src/components/Workflow/Editor/modules/terminals.test.ts | Likely clean | |
| lib/galaxy/model/dataset_collections/types/collection_semantics.yml | CHECK (see “WIP collision” below) | The commit adds an algebra section (~92 lines). The wf_tool_state working tree adds algebra: keys to many examples. Cherry-pick first if WIP is parked; merge by hand if not. |
| test/unit/data/dataset_collections/test_matching.py | Likely clean | |
| test/unit/data/dataset_collections/test_structure.py | Likely clean | |
| test/unit/data/dataset_collections/test_type_descriptions.py | Likely clean | Tests the renamed methods on the shim. The shim re-exports the base class, so tests still hit the real implementation in tool_util/collections.py. |
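The "leave the shim alone" resolution relies on ordinary Python inheritance. A toy illustration follows, assuming the shim is a plain subclass; both class names and the placeholder method body are stand-ins, not the real Galaxy code.

```python
class BaseCollectionTypeDescription:
    """Stand-in for the implementation in lib/galaxy/tool_util/collections.py."""

    def __init__(self, collection_type: str):
        self.collection_type = collection_type

    def compatible(self, other: "BaseCollectionTypeDescription") -> bool:
        # Placeholder body; the real method comes from commit 1's diff.
        return self.collection_type == other.collection_type


class ShimCollectionTypeDescription(BaseCollectionTypeDescription):
    """Stand-in for the shim in model/dataset_collections/type_description.py.

    It defines nothing of its own here, so any method added to the base
    class is available on shim instances without touching this file.
    """


shim_a = ShimCollectionTypeDescription("list:paired")
shim_b = ShimCollectionTypeDescription("list:paired")
inherited_result = shim_a.compatible(shim_b)
```

Tests that import the class through the old module path therefore exercise the base implementation, which is why test_type_descriptions.py is expected to stay clean.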

Ripple effect on wf_tool_state code (no conflict, but will break at runtime/tests):

Commit 2: 120f527c5a — Rename has_subcollections_of_type → can_map_over; drop is_subcollection_of_type

| File | Conflict? | Resolution |
| --- | --- | --- |
| lib/galaxy/model/dataset_collections/type_description.py | YES (relocation) | Apply rename to lib/galaxy/tool_util/collections.py. |
| lib/galaxy/model/dataset_collections/query.py | Likely overlaps with commit 1 changes | Check: rename + inline is_subcollection_of_type. |
| lib/galaxy/model/dataset_collections/structure.py | Likely overlaps with commit 1 | |
| client/src/components/Workflow/Editor/modules/terminals.ts | Maybe clean | |
| lib/galaxy/model/dataset_collections/types/collection_semantics.yml | CHECK (WIP collision) | |
| test/unit/data/dataset_collections/test_type_descriptions.py | Renames | |
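Because the rename produces no textual conflict in wf_tool_state-only files, stale callers have to be found by search. A minimal sketch of such a scan follows; find_stale_names is a hypothetical helper, and the name list covers only the two identifiers mentioned in this commit's title.

```python
import re
from typing import List, Tuple

# Identifiers that commit 2 renames or drops (taken from the commit title).
STALE_NAMES = ("has_subcollections_of_type", "is_subcollection_of_type")
_STALE_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, STALE_NAMES)) + r")\b")


def find_stale_names(source_text: str) -> List[Tuple[int, str]]:
    """Return (line_number, name) for every whole-word use of a stale name."""
    hits = []
    for lineno, line in enumerate(source_text.splitlines(), start=1):
        for match in _STALE_PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits
```

Running it over the text of connection_types.py and connection_validation.py (for example via pathlib's Path.read_text) after the cherry-pick should return an empty list once every caller has been updated.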

Ripple on wf_tool_state code:

Commit 3: 0683385f5d — Reframe algebra docstrings

| File | Conflict? | Resolution |
| --- | --- | --- |
| lib/galaxy/model/dataset_collections/type_description.py | YES (relocation) | Port docstring rewrites to lib/galaxy/tool_util/collections.py. |
| lib/galaxy/model/dataset_collections/types/collection_semantics.yml | CHECK (WIP collision; 45 lines changed) | This is the biggest semantics.yml change of the three. |
| client/src/components/Workflow/Editor/modules/collectionTypeDescription.ts | Maybe clean | Docstring/comment changes. |
| client/src/components/Workflow/Editor/modules/terminals.ts | Maybe clean | |

WIP Collision: collection_semantics.yml

Working tree on wf_tool_state adds algebra: blocks to ~12 examples, e.g.:

algebra:
  - {op: can_map_over, output: paired, input: NULL}
  - {op: effective_map_over, output: paired, input: NULL}

The WIP already uses op: can_map_over — the post-rename name from commit 2. So the WIP was authored anticipating the rebase. It’s compatible with commits 2 and 3 in spirit but will textually conflict with commit 3’s algebra-section additions. Park (stash or commit) before starting; replay on top after the three commits land.

Sanity-Check List (post-rebase)

Run from the wf_tool_state worktree, with .venv sourced.

Open Design Questions (discuss before rebasing)

Follow-on Refactor (post-rebase, separate commit)

Commit 1 (39597b3366) doesn’t just rename: it introduces compatible specifically to fix order-dependent sibling map-over matching. The TS side gets the fix in the same commit (mappingConstraints in terminals.ts switches from .canMatch() to .compatible()). The Python validator has an equivalent site, and after the rebase it should mirror the TS change.

Site: _resolve_step_map_over in lib/galaxy/tool_util/workflow_state/connection_validation.py:299-318.

best = non_none[0]
for ctd in non_none[1:]:
    if ctd.collection_type != best.collection_type:
        step_result.errors.append(f"Incompatible map-over types: ...")
        return best
return best

This check is raw string equality on collection_type. It is stricter than even the pre-rebase can_match_type: it rejects sibling map-overs that should compose, e.g. one input contributing list:paired and another contributing list:paired_or_unpaired. TS post-rebase accepts that pair via compatible; Python emits a spurious error.

Refactor:

  1. Add a sentinel-aware free function compatible(a, b) to connection_types.py, alongside the existing can_match / can_map_over / effective_map_over wrappers.
  2. Rewrite _resolve_step_map_over to use compatible() for the pairwise check, and pick the higher-rank type as the resolved map-over (matches TS “most specific compatible” choice — see INTEROP_CONNECTION_TESTING_PLAN.md).
  3. Add a sibling-mismatch fixture under connection_workflows/ that today fails spuriously and will pass post-refactor (red-to-green).
  4. Add op: compatible cases to connection_type_cases.yml so the TS truth-table gets coverage for free.
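Steps 1 and 2 above might look roughly like the sketch below. It is a toy model under loud assumptions: collection types are plain strings, None stands in for whatever sentinel connection_types.py actually uses, and the accepts rule (a trailing paired_or_unpaired takes either paired or the bare prefix) is a simplification of the real algebra, not a copy of it. The error text is illustrative as well.

```python
from typing import List, Optional

PAIRED_OR_UNPAIRED = "paired_or_unpaired"


def accepts(wide: str, narrow: str) -> bool:
    """Toy one-directional rule: can a `wide` type take a `narrow` value?"""
    if wide == narrow:
        return True
    if wide == PAIRED_OR_UNPAIRED:
        return narrow == "paired"
    suffix = ":" + PAIRED_OR_UNPAIRED
    if wide.endswith(suffix):
        prefix = wide[: -len(suffix)]
        return narrow in (prefix, prefix + ":paired")
    return False


def compatible(a: Optional[str], b: Optional[str]) -> bool:
    """Symmetric check; None (assumed sentinel for 'no map-over') matches anything."""
    if a is None or b is None:
        return True
    return accepts(a, b) or accepts(b, a)


def rank(collection_type: str) -> int:
    """Nesting depth: 'list' is 1, 'list:paired' is 2."""
    return collection_type.count(":") + 1


def resolve_step_map_over(candidates: List[Optional[str]], errors: List[str]) -> Optional[str]:
    """Pairwise-compatible resolution that prefers the higher-rank type."""
    non_none = [c for c in candidates if c is not None]
    if not non_none:
        return None
    best = non_none[0]
    for ctd in non_none[1:]:
        if not compatible(best, ctd):
            errors.append(f"Incompatible map-over types: {best} vs {ctd}")
            return best
        if rank(ctd) > rank(best):
            best = ctd  # equal rank keeps the earlier candidate
    return best
```

With this shape, the list:paired / list:paired_or_unpaired sibling pair resolves without an error, while genuinely unrelated types such as list and paired still report one. Note that equal-rank ties keep the first candidate, so the returned type can still depend on input order even though the error decision does not.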

Out of scope for the refactor (leave alone):

Reference