Connection Validation: Testing, Reporting, and Tracking
Replaces Phase 4 reporting and Phase 5 of CONNECTION_VALIDATION.md with a unified approach: fixture-based testing with sidecar expectations, Pydantic report models integrated into the existing validate.py infrastructure, and collection_semantics.yml tracking via fixture filenames.
1. Pydantic Report Models
The current connection validation uses dataclass result objects (ConnectionValidationResult, StepConnectionResult, WorkflowConnectionResult) that are separate from the existing _report_models.py Pydantic hierarchy. This section adds Pydantic models that:
- Integrate with
--report-json/--report-markdownoutput - Carry resolved output types (currently discarded after validation)
- Compose cleanly into the existing
WorkflowValidationResult
1.1 New Models in _report_models.py
ConnectionStatus = Literal["ok", "invalid", "skip"]
class ConnectionResult(BaseModel):
"""Single connection between two steps.
status and mapping are orthogonal:
- status: whether the connection is valid (ok/invalid/skip)
- mapping: what collection type is being mapped over (None = direct match)
A connection can be ok with mapping (valid map-over) or ok without (direct).
"""
source_step: str
source_output: str
target_step: str
target_input: str
status: ConnectionStatus
mapping: Optional[str] = None # collection type being mapped over, None = direct
errors: List[str] = []
class ResolvedOutputType(BaseModel):
"""Resolved collection type for a step output."""
name: str
collection_type: Optional[str] = None # None = plain dataset
class ConnectionStepResult(StepResultBase):
"""Connection validation result for one step."""
step_type: str = "tool"
map_over: Optional[str] = None
connections: List[ConnectionResult] = []
resolved_outputs: List[ResolvedOutputType] = []
errors: List[str] = []
class ConnectionValidationReport(BaseModel):
"""Connection validation results for one workflow."""
valid: bool
step_results: List[ConnectionStepResult] = []
summary: Dict[str, int] = {} # {ok, invalid, skip}
1.2 Extend WorkflowValidationResult
Add an optional connection_report field to the existing workflow result:
class WorkflowValidationResult(WorkflowResultBase):
step_results: List[ValidationStepResult] = Field(default=[], serialization_alias="results")
connection_report: Optional[ConnectionValidationReport] = None
# ... existing summary computed_field
When --connections is enabled, connection_report is populated. When disabled, it’s None and excluded from JSON output via model_dump(exclude_none=True). This keeps the existing --report-json shape stable for consumers that don’t care about connections, while enriching it for those that do.
1.3 Resolved Output Types
The validation engine currently builds resolved_output_types: Dict[str, Dict[str, CollectionTypeOrSentinel]] during traversal but discards it. Change validate_connection_graph() to return this alongside the result, and convert it to ResolvedOutputType entries on each ConnectionStepResult.
This is the key data that enables testing “what collection type does step X’s output Y resolve to after propagation?“
1.4 JSON Output Shape (example)
{
"path": "workflow.ga",
"results": [ ... tool_state validation ... ],
"connection_report": {
"valid": true,
"step_results": [
{
"step": "0",
"step_type": "data_collection_input",
"resolved_outputs": [
{"name": "output", "collection_type": "list:paired"}
]
},
{
"step": "1",
"tool_id": "collection_paired_test",
"step_type": "tool",
"map_over": "list",
"connections": [
{
"source_step": "0",
"source_output": "output",
"target_step": "1",
"target_input": "f1",
"status": "ok",
"mapping": "list"
}
],
"resolved_outputs": [
{"name": "out1", "collection_type": "list"}
]
}
],
"summary": {"ok": 1, "invalid": 0, "skip": 0}
}
}
1.5 Formatters
Add text and markdown formatters for ConnectionValidationReport following the existing patterns in validate.py. The text formatter replaces the current format_connection_text(). The markdown formatter adds a “Connection Validation” section to the existing markdown report.
2. Fixture-Based Test Infrastructure
2.1 Directory Layout
test/unit/tool_util/workflow_state/
connection_workflows/
ok_dataset_to_dataset.gxwf.yml
ok_paired_to_paired.gxwf.yml
ok_paired_to_paired_or_unpaired.gxwf.yml
ok_list_paired_to_paired.gxwf.yml
ok_list_to_multi_data.gxwf.yml
ok_collection_type_source_passthrough.gxwf.yml
ok_structured_like_passthrough.gxwf.yml
ok_two_inputs_compatible_map_over.gxwf.yml
ok_chain_map_over_propagates.gxwf.yml
ok_collection_output_with_map_over.gxwf.yml
ok_sample_sheet_to_list.gxwf.yml
fail_list_to_paired.gxwf.yml
fail_incompatible_map_over.gxwf.yml
fail_paired_or_unpaired_to_paired.gxwf.yml
...
connection_workflows/expected/
ok_list_paired_to_paired.yml
ok_chain_map_over_propagates.yml
fail_incompatible_map_over.yml
...
- Workflows are gxformat2 YAML — human-readable, minimal (5-15 lines each)
- Sidecar
.ymlfiles inexpected/contain hand-written[json_path, expected_value]pairs (see 2.5) - Not every workflow needs a sidecar —
ok_/fail_prefix is sufficient for the basic assertion - Sidecars are for cases where we want to verify specific details: mapping types, resolved output types, error messages, skip reasons
2.2 Fixture Workflow Format
class: GalaxyWorkflow
inputs:
input_collection:
type: collection
collection_type: list:paired
steps:
step1:
tool_id: collection_paired_test
in:
f1: input_collection
Tools referenced by tool_id are resolved via functional_test_tool_source → parse_tool. The functional test tools in test/functional/tools/ already cover the needed archetypes:
| Tool | Input | Output | Use For |
|---|---|---|---|
collection_paired_test | f1: paired | out1: data | paired consumption |
collection_paired_or_unpaired | f1: paired_or_unpaired | out1: data | pou consumption |
collection_list_paired_or_unpaired | f1: list:paired_or_unpaired | out1: data | compound pou |
collection_type_source | input_collect: collection(any) | list_output: collection(type_source=input_collect) | dynamic output types |
collection_paired_structured_like | input1: paired | list_output: paired(structured_like) | structured_like output |
collection_creates_pair | input1: data | paired_output: paired | paired creation |
multi_data_param | f1: data(multiple) | out1: data | multi-data reduction |
Missing tool: A simple data → data tool is needed for basic mapping tests (e.g., ok_list_to_dataset.gxwf.yml). None of the existing functional test tools have a single data input and single data output — collection_creates_pair takes data but outputs a paired collection. Add a minimal simple_data_tool.xml to test/functional/tools/ with one data input and one data output.
2.3 FunctionalGetToolInfo
Adapter that bridges functional_test_tool_source to the GetToolInfo protocol:
class FunctionalGetToolInfo:
"""GetToolInfo backed by functional test tool XMLs."""
def __init__(self):
self._cache: Dict[str, Optional[ParsedTool]] = {}
def get_tool_info(self, tool_id: str, tool_version: Optional[str]) -> Optional[ParsedTool]:
if tool_id not in self._cache:
try:
ts = functional_test_tool_source(tool_id)
self._cache[tool_id] = parse_tool(ts)
except Exception:
self._cache[tool_id] = None
return self._cache[tool_id]
Shared as a module-level or session-scoped pytest fixture to avoid re-parsing tools per test.
2.4 Test Runner: test_connection_workflows.py
import pytest
import yaml
from pathlib import Path
from gxformat2.converter import python_to_workflow
WORKFLOW_DIR = Path(__file__).parent / "connection_workflows"
EXPECTED_DIR = WORKFLOW_DIR / "expected"
def discover_workflows():
return sorted(WORKFLOW_DIR.glob("*.gxwf.yml"))
def workflow_ids():
return [p.stem for p in discover_workflows()]
@pytest.fixture(scope="module")
def tool_info():
return FunctionalGetToolInfo()
@pytest.mark.parametrize("workflow_path", discover_workflows(), ids=workflow_ids())
def test_connection_workflow(workflow_path, tool_info):
"""Validate fixture workflow and check ok_/fail_ expectation + sidecar assertions."""
wf_dict = yaml.safe_load(workflow_path.read_text())
native = python_to_workflow(wf_dict, MinimalGalaxyInterface())
result = validate_connections(native, tool_info)
report = to_connection_validation_report(result)
stem = workflow_path.stem
if stem.startswith("ok_"):
assert report.valid, f"Expected valid but got errors: {_collect_errors(report)}"
elif stem.startswith("fail_"):
assert not report.valid, f"Expected invalid but workflow validated"
else:
pytest.skip(f"Unknown prefix for {stem}")
# Check sidecar expectations if present
expected_path = EXPECTED_DIR / f"{stem}.yml"
if expected_path.exists():
raw = yaml.safe_load(expected_path.read_text())
expectations = [(e["target"], e["value"]) for e in raw]
report_dict = report.model_dump(by_alias=True)
dict_verify_each(report_dict, expectations)
The parametrized test auto-discovers workflows. Adding a new test case = dropping a YAML file.
2.5 Sidecar Expectation Files
Sidecar .yml files are hand-written lists of [json_path, expected_value] pairs — the same pattern as dict_verify_each in test/unit/tool_util/util.py. Each assertion is a deliberate human/agent judgment, not a snapshot.
Example for ok_list_paired_to_paired.yml:
- target: [valid]
value: true
- target: [step_results, 1, map_over]
value: "list"
- target: [step_results, 1, connections, 0, status]
value: "ok"
- target: [step_results, 1, connections, 0, mapping]
value: "list"
- target: [step_results, 1, resolved_outputs, 0, collection_type]
value: "list"
Example for fail_incompatible_map_over.yml:
- target: [valid]
value: false
- target: [step_results, 2, errors, 0]
value: "Incompatible map-over types: list vs paired"
Properties:
- Each pair is a deliberate assertion — someone decided this specific path should have this value
- Not every field is checked — only the behaviors that matter for the scenario
- Easy to review: each line is a self-contained claim about the validation result
- No regeneration scripts needed — these are authored, not generated
3. Relationship to Existing Tests
3.1 What Moves to Fixtures
The following tests from test_connection_validation.py become fixture workflows:
| Current Test | Fixture |
|---|---|
test_dataset_to_dataset_ok | ok_dataset_to_dataset.gxwf.yml |
test_collection_to_matching_collection_ok | ok_paired_to_paired.gxwf.yml |
test_incompatible_collection_types_invalid | fail_list_to_paired.gxwf.yml |
test_paired_to_paired_or_unpaired_ok | ok_paired_to_paired_or_unpaired.gxwf.yml |
test_list_to_any_collection_ok | ok_any_collection_accepts_list.gxwf.yml |
test_list_paired_over_paired | ok_list_paired_to_paired.gxwf.yml + expectations |
test_list_over_dataset | ok_list_to_dataset.gxwf.yml + expectations |
test_list_to_multi_data_ok | ok_list_to_multi_data.gxwf.yml |
test_paired_to_multi_data_not_reduction | ok_paired_maps_over_multi_data.gxwf.yml + expectations |
test_incompatible_map_over_types_error | fail_incompatible_map_over.gxwf.yml + expectations |
test_map_over_propagates_through_chain | ok_chain_map_over_propagates.gxwf.yml + expectations |
test_collection_output_with_map_over | ok_collection_output_with_map_over.gxwf.yml + expectations |
test_collection_type_source_resolution | ok_collection_type_source.gxwf.yml + expectations |
test_structured_like_resolution | ok_structured_like.gxwf.yml + expectations |
3.2 What Stays Programmatic
test_connection_types.py— unit tests for the adapter free functions. These testcan_match,can_map_over,effective_map_overdirectly against collection type pairs. No workflow context needed.test_connection_graph.py— unit tests for graph building internals:_parse_connections,_collect_data_inputs(conditional/repeat/section walking),_collect_outputs,_topological_sort, cycle detection. These test the graph builder’s parameter tree navigation, not validation outcomes.test_unresolved_tool_skips— stays programmatic since it requires a tool_id that doesn’t exist in functional test tools, and this is specifically about the skip behavior. Could also be a fixture with askip_prefix if we add that convention.
3.3 Migration Path
- Build the Pydantic report models and wire resolved outputs
- Build
FunctionalGetToolInfoand the test runner - Write fixture workflows one at a time, verifying against existing programmatic test assertions
- Write sidecar expectation files for fixtures needing detailed assertions
- Remove corresponding programmatic tests as fixtures take over
- Keep
test_connection_types.pyandtest_connection_graph.pyas-is
4. collection_semantics.yml Tracking
4.1 New Test Category
Add workflow_format_validation to each example’s tests: block. The value is the fixture workflow filename (without .gxwf.yml):
- example:
label: BASIC_MAPPING_PAIRED
# ... existing content ...
tests:
tool_runtime:
api_test: "test_tool_execute.py::test_map_over_collection"
workflow_runtime:
framework_test: "collection_semantics_cat_0"
workflow_editor: "accepts paired data -> data connection"
workflow_format_validation: "ok_list_paired_to_paired"
The entry IS the filename. The test runner discovers the file by name, validates it, and the ok/fail status maps directly to the collection_semantics.yml example’s expected behavior.
4.2 Fixture ↔ Example Mapping
Each collection_semantics.yml example that describes a connection scenario gets a fixture workflow. The fixture filename should be traceable to the example label:
| Example Label | Fixture | Expected |
|---|---|---|
BASIC_MAPPING_PAIRED | ok_basic_mapping_paired.gxwf.yml | valid, map_over=paired |
BASIC_MAPPING_LIST | ok_basic_mapping_list.gxwf.yml | valid, map_over=list |
COLLECTION_INPUT_PAIRED | ok_collection_input_paired.gxwf.yml | valid, direct match |
COLLECTION_INPUT_PAIRED_NOT_CONSUMES_LIST | fail_paired_not_consumes_list.gxwf.yml | invalid |
PAIRED_OR_UNPAIRED_CONSUMES_PAIRED | ok_pou_consumes_paired.gxwf.yml | valid, direct match |
PAIRED_OR_UNPAIRED_NOT_CONSUMED_BY_PAIRED | fail_pou_not_consumed_by_paired.gxwf.yml | invalid |
MAPPING_LIST_PAIRED_OVER_PAIRED | ok_list_paired_to_paired.gxwf.yml | valid, map_over=list |
LIST_REDUCTION | ok_list_to_multi_data.gxwf.yml | valid, reduction |
PAIRED_REDUCTION_INVALID | fail_paired_reduction.gxwf.yml | invalid |
SAMPLE_SHEET_MATCHES_LIST | ok_sample_sheet_to_list.gxwf.yml | valid |
LIST_NOT_MATCHES_SAMPLE_SHEET | fail_list_not_matches_sample_sheet.gxwf.yml | invalid |
Not every example maps 1:1 — some examples describe runtime behavior (job execution, element identity) that connection validation can’t test. Those get workflow_format_validation: ~ (null) to explicitly mark them as not applicable.
4.3 Coverage Verification
A script or test that loads collection_semantics.yml, extracts all examples with workflow_format_validation entries, and verifies that the referenced fixture file exists and passes/fails as expected. This prevents drift between the tracking document and the test fixtures.
def test_collection_semantics_coverage():
"""Verify all collection_semantics.yml entries have matching fixtures."""
examples = load_collection_semantics_examples()
for ex in examples:
fixture_name = ex.tests.get("workflow_format_validation")
if fixture_name is None:
continue # Not applicable for this example
fixture_path = WORKFLOW_DIR / f"{fixture_name}.gxwf.yml"
assert fixture_path.exists(), f"Missing fixture for {ex.label}: {fixture_name}"
4.4 Example Entries Without Fixture Workflows
Some examples describe behaviors that can’t be tested by connection validation alone:
- Runtime element identity/ordering — requires actual job execution
collection_type_from_rules— explicitly out of scope- Scatter/merge semantics — explicitly out of scope
These get workflow_format_validation: ~ with a comment explaining why.
5. Implementation Order
| Step | Delivers | Depends On |
|---|---|---|
| 5.1 | Pydantic report models (ConnectionResult, ConnectionStepResult, ConnectionValidationReport) | Existing dataclass models |
| 5.2 | Wire resolved output types into validation result | 5.1 |
| 5.3 | Extend WorkflowValidationResult with connection_report | 5.1 |
| 5.4 | Update validate.py to use Pydantic reports (replace format_connection_text, integrate into emit_reports) | 5.1, 5.3 |
| 5.5 | Text + markdown formatters for connection report | 5.4 |
| 5.6 | FunctionalGetToolInfo adapter + gxformat2 conversion helper | — |
| 5.7 | Test runner (test_connection_workflows.py) with fixture discovery + dict_verify_each sidecar loading | 5.6 |
| 5.8 | First batch of fixture workflows (10-15 covering the main cases) + hand-written expectation files | 5.7 |
| 5.9 | Remove migrated programmatic tests from test_connection_validation.py | 5.8 |
| 5.10 | collection_semantics.yml entries for all fixture workflows | 5.8 |
| 5.11 | Coverage verification test | 5.10 |
6. Unresolved Questions
- For the
sample_sheetasymmetry cases (LIST_NOT_MATCHES_SAMPLE_SHEET): the base classcan_match_typenormalizes symmetrically. Do we need to add asymmetry enforcement to the connection validator before writing those fixtures, or mark them as known gaps? - Format2 connection validation support: when it lands, do the same fixtures run through a second parametrize axis (
@pytest.mark.parametrize("format", ["native", "format2"]))? Seems like yes.