Plan: Generate API Test Cases from Collection Semantics Spec
Overview
Generate parametrized pytest API tests from the YAML spec’s example entries, covering collection mapping, reduction, sub-collection mapping, and type validation at runtime.
Inventory: Current Test Coverage
31 examples in collection_semantics.yml:
- 9 have
api_testreferences (existing tests intest_tool_execute.pyandtest_tools.py) - 4 have
toolreferences (framework test tools likecollection_paired_test) - 18 have NO
tool_runtimetest at all
Typo found: LIST_REDUCTION (line 303) has doubled path: test_tools.py::TestToolsApi::test_tools.py::test_reduce_collections
Approach: Hybrid Parametrize
Prong 1: YAML-to-Test-Case Parser
New module test_cases.py that parses the spec into structured dataclasses:
@dataclass
class CollectionSemanticTestCase:
label: str
tool_shape: ToolShape # in/out type signature
collections: dict[str, CollectionDef]
input_bindings: dict[str, InputBinding]
expected_results: dict[str, OutputExpectation]
is_valid: bool
Extends existing Pydantic models in semantics.py to extract structured test information from assumptions + then expressions.
Prong 2: Parametrized Test File
New test_collection_semantics.py using pytest.mark.parametrize:
@pytest.mark.parametrize("test_case", load_valid_test_cases(), ids=lambda tc: tc.label)
def test_valid_collection_operation(test_case: CollectionSemanticTestCase, target_history):
"""Verify valid collection operations produce expected output structure."""
collection = build_collection_from_spec(target_history, test_case)
result = execute_tool(target_history, test_case, collection)
verify_output_structure(result, test_case.expected_results)
@pytest.mark.parametrize("test_case", load_invalid_test_cases(), ids=lambda tc: tc.label)
def test_invalid_collection_operation(test_case: CollectionSemanticTestCase, target_history):
"""Verify invalid collection operations are rejected."""
collection = build_collection_from_spec(target_history, test_case)
with pytest.raises(Exception): # 400 or appropriate error
execute_tool(target_history, test_case, collection)
Implementation Steps
Step 1: Fix spec typo
Line 303: test_tools.py::TestToolsApi::test_tools.py::test_reduce_collections -> test_tools.py::TestToolsApi::test_reduce_collections
Step 2: Build YAML-to-test-case parser
- Parse
assumptionsto extract tool shape, collection defs, dataset declarations - Parse
thenexpressions to extract input bindings and expected output structure - Output structured
CollectionSemanticTestCasedataclasses
Step 3: Create parametrized test file
lib/galaxy_test/api/test_collection_semantics.py(new)- Two test functions: valid operations, invalid operations
- Test IDs from spec labels for clear pytest output
Step 4: Collection setup infrastructure
Map spec notation to existing TargetHistory methods:
[paired, {forward: d_f, reverse: d_r}]->target_history.create_pair()[list, {i1: d_1, ..., in: d_n}]->target_history.create_list()["list:paired", {el1: {forward: d_f, reverse: d_r}}]->target_history.create_list_of_pairs()- Nested structures -> recursive builder
Step 5: Input binding infrastructure
Map spec’s then expressions to Galaxy API input formats:
mapOver(C)->{"i": {"src": "hdca", "id": collection_id}}tool(i=C)-> direct collection inputtool(i=[d_1,...,d_n])-> multi-dataset inputmapOver(C, 'paired')-> subcollection mapping
Step 6: Tool selection logic
Map abstract tool shapes to concrete tool IDs:
{i: dataset} -> {o: dataset}->cat1or framework test tool{i: "collection<paired>"} -> {o: dataset}->collection_paired_test{i: "dataset<multiple=true>"} -> {o: dataset}-> appropriate multi-input tool
Step 7: Create missing framework tool
collection_list_test.xml — for examples referencing list collection inputs without an existing test tool.
Step 8: Update spec with tool_runtime references
Add tool_runtime entries for all newly covered examples.
Testing Strategy (Red-to-Green)
- Write parametrized test with first valid case (BASIC_MAPPING_PAIRED) -> fails (no infrastructure)
- Implement collection builder -> test still fails (no tool execution)
- Implement tool execution + output verification -> test passes
- Add next case, repeat
- Add invalid cases last
Critical Files
| File | Role |
|---|---|
lib/galaxy/model/dataset_collections/types/collection_semantics.yml | Fix typo, extend with tool_runtime refs |
lib/galaxy/model/dataset_collections/types/semantics.py | Extend with test case extraction (or sibling module) |
lib/galaxy_test/api/test_tool_execute.py | Pattern to follow for fixture-based tests |
lib/galaxy_test/base/populators.py | Extend with collection-from-spec builder |
lib/galaxy_test/api/conftest.py | May need new fixtures |
lib/galaxy_test/api/test_collection_semantics.py | New parametrized test file |
test/functional/tools/ | New collection_list_test.xml framework tool |
Unresolved Questions
- Do invalid cases verify 400 status, or are some only workflow-editor constraints the API doesn’t enforce?
- Should generated tests use deterministic content for output assertions, or just verify structural properties?
- Should parametrized tests also run against tool request API format?
- Is
cat1the right tool for generated mapping tests (real tool, not test tool)? - Should examples with no
thenclause getthenclauses added to make them API-testable? - Should
test_cases.pylive in production code (lib/galaxy/model/...) or test code (lib/galaxy_test/)?