COLLECTION_SEMANTICS_PLAN_REVIEW

Plan: Review Spec for Typos and Inaccuracies

Overview

28 total issues identified across 7 categories in the hand-written collection semantics specification.

Category 1: Grammar and Typos in Prose (5 issues) — DONE

| # | Line | Current | Fix | Status |
|---|---|---|---|---|
| 1.1 | 12 | "Typically, this explicitly annotated" | "Typically, this is explicitly annotated" | DONE |
| 1.2 | 184 | "referred to a “reduction”" | "referred to as a “reduction”" | DONE |
| 1.3 | 557 | "Due only implementation time" | "Due only to implementation time" | DONE |
| 1.4 | 359 | "as describe above" | "as described above" | DONE |
| 1.5 | 440 | "This inverse of this" | "The inverse of this" | DONE |

Category 2: Swapped/Incorrect Test References (3 issues) — DONE

2.1: Swapped paired/unpaired test refs — DONE

2.2: Duplicated path in LIST_REDUCTION — DONE

2.3: wf_editor field name instead of workflow_editor — DONE

Category 3: Bracket Mismatches in then Expressions (4 issues) — DONE

| # | Line | Label | Issue | Status |
|---|---|---|---|---|
| 3.1 | 86 | BASIC_MAPPING_LIST | Stray ]: [o]] should be [o] | DONE |
| 3.2 | 171 | BASIC_MAPPING_TWO_INPUTS_WITH_IDENTICAL_STRUCTURE | Same stray ] | DONE |
| 3.3 | 532 | MAPPING_LIST_OVER_PAIRED_OR_UNPAIRED | Same stray ] | DONE |
| 3.4 | 374 | NESTED_LIST_REDUCTION | on: floats outside inner braces | DONE |
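
Bracket mismatches like the ones in Findings 3.1-3.4 can be caught mechanically. The following is a minimal sketch, not part of the existing generator: `find_bracket_mismatch` is a hypothetical helper, and the sample expression is invented for illustration rather than copied from the spec.

```python
# Hypothetical helper: report the first unbalanced bracket in an expression.
CLOSER_TO_OPENER = {")": "(", "]": "[", "}": "{"}
OPENERS = set(CLOSER_TO_OPENER.values())


def find_bracket_mismatch(expression):
    """Return None if brackets balance, else (index, char) of the offender."""
    stack = []
    for index, char in enumerate(expression):
        if char in OPENERS:
            stack.append((index, char))
        elif char in CLOSER_TO_OPENER:
            # A closer must match the most recent unclosed opener.
            if not stack or stack[-1][1] != CLOSER_TO_OPENER[char]:
                return (index, char)
            stack.pop()
    # Anything left on the stack is an opener that was never closed.
    return stack[-1] if stack else None


# Invented example with the same kind of stray "]" as Finding 3.1.
broken = "{o: [o]]}"
fixed = "{o: [o]}"
print(find_bracket_mismatch(broken))
print(find_bracket_mismatch(fixed))
```

Running a check like this over every then expression before generating docs would flag a stray bracket at the point it occurs, instead of leaving it to be noticed in the rendered LaTeX.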

Category 4: Dataset Naming Inconsistency (1 issue) — DONE

Category 5: Notation Inconsistencies in Collection Definitions (5 issues) — DONE

5.1: Mixed = vs : in element definitions — DONE

5.2: list:paired collections with flat (non-nested) elements — DONE

5.3: f/r instead of d_f/d_r — DONE

Category 6: Generator Script Issues (4 issues) — 6.1/6.2 DONE, 6.3 planned, 6.4 separate

6.1: Incomplete WORDS_TO_TEXTIFY — DONE

6.2: expression_to_latex doesn’t handle paired_or_unpaired as compound word — DONE

6.3: Examples without then silently dropped from docs — PLANNED

6.4: check() unimplemented — SEPARATE PLAN
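
For Findings 6.1 and 6.2, one way to keep a compound word such as paired_or_unpaired from being split by single-word replacements is to swap it for a placeholder first. This is a rough sketch under assumptions: the word list below is illustrative, `COMPOUND_WORDS` and `textify` are hypothetical names, and the real `expression_to_latex` in semantics.py may be structured differently.

```python
import re

# Illustrative subset; the real WORDS_TO_TEXTIFY may contain other words.
WORDS_TO_TEXTIFY = ["list", "paired", "forward", "reverse"]
# Hypothetical list of compound words that must be wrapped as a whole.
COMPOUND_WORDS = ["paired_or_unpaired"]


def textify(expression):
    """Wrap known words in \\text{...}, protecting compound words first."""
    placeholders = {}
    for i, compound in enumerate(COMPOUND_WORDS):
        token = f"\x00{i}\x00"  # cannot collide with any word in the list
        placeholders[token] = r"\text{" + compound.replace("_", r"\_") + "}"
        expression = expression.replace(compound, token)
    for word in WORDS_TO_TEXTIFY:
        # Underscores count as separators here, which is exactly why
        # compound words need the placeholder protection above.
        pattern = rf"(?<![A-Za-z]){re.escape(word)}(?![A-Za-z])"
        # A function replacement avoids backslash-escape handling in re.sub.
        expression = re.sub(
            pattern, lambda m: r"\text{" + m.group(0) + "}", expression
        )
    for token, latex in placeholders.items():
        expression = expression.replace(token, latex)
    return expression


print(textify("list:paired_or_unpaired"))
```

Without the placeholder step, the leading "paired" in paired_or_unpaired would be wrapped on its own and the rest of the identifier left as raw math text.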

Category 7: Spacing Inconsistencies (2 issues) — DONE

Implementation Plan

Phase 1: Fix YAML spec (High Impact) — DONE

  1. Fix grammar/typos (Findings 1.1-1.5) — 5 line changes
  2. Fix swapped/incorrect test refs (Findings 2.1-2.3) — 3 changes
  3. Fix bracket mismatches (Findings 3.1-3.4) — 4 line changes
  4. Fix dataset naming (Finding 4.1) — 1 line change
  5. Fix notation inconsistencies (Findings 5.1-5.3) — ~6 line changes
  6. Normalize spacing (Findings 7.1-7.2) — cosmetic pass

Phase 2: Fix generator script (Medium Impact) — MOSTLY DONE

  1. Expand WORDS_TO_TEXTIFY (Finding 6.1)
  2. Handle paired_or_unpaired compound word (Finding 6.2)
  3. Include test-only examples in docs (Finding 6.3) — see separate plan
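
Until Finding 6.3 is resolved, it may help to at least list which examples would be dropped. The sketch below assumes each example can be viewed as a dict with a "label" key and an optional "then" key; that shape, the helper name `split_examples_by_then`, and the sample data are all assumptions, not the actual model in semantics.py.

```python
def split_examples_by_then(examples):
    """Partition example labels by whether a non-empty 'then' is present."""
    with_then, without_then = [], []
    for example in examples:
        target = with_then if example.get("then") else without_then
        target.append(example["label"])
    return with_then, without_then


# Invented sample data; the second label and both bodies are hypothetical.
examples = [
    {"label": "BASIC_MAPPING_LIST", "then": "placeholder expression"},
    {"label": "HYPOTHETICAL_TEST_ONLY_EXAMPLE", "tests": {}},
]
rendered, dropped = split_examples_by_then(examples)
print("rendered:", rendered)
print("would be dropped:", dropped)
```

Printing the dropped labels during generation would make the omission visible rather than silent, whichever fix the separate plan settles on.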

Phase 3: Regenerate and Verify — TODO

  1. Run semantics.py to regenerate docs
  2. Visually inspect the generated Markdown
  3. Build docs locally to confirm LaTeX renders

Testing Strategy

Critical Files

| File | Role |
|---|---|
| lib/galaxy/model/dataset_collections/types/collection_semantics.yml | Primary target: all 20+ YAML issues |
| lib/galaxy/model/dataset_collections/types/semantics.py | Generator script fixes |
| doc/source/dev/collection_semantics.md | Regenerate after fixes |
| lib/galaxy_test/api/test_tool_execute.py | Verify swapped test refs |
| lib/galaxy_test/api/test_tools.py | Verify duplicated path ref |

Unresolved Questions

  1. Swapped test refs (2.1) — verify actual test implementations or trust function names? Fixed by swapping.
  2. WORDS_TO_TEXTIFY — textify ALL identifier words or curated subset? Applied curated expansion + placeholder approach.
  3. Examples without then — render differently (no math block) or add then to each? Separate plan recommends adding then to each.
  4. check() implementation — part of this PR or separate? (Separate plan exists)
  5. ExampleTests model — set extra='forbid' or keep silent dropping?