
Galaxy Workflow Testability Design

Design guidance for Galaxy workflow inputs, outputs, and checkpoints that make IWC-style workflow tests possible.

Revised 2026-05-06, Rev 2

Use this note when authoring or translating a Galaxy workflow before the -tests.yml file exists. It covers workflow structure choices that make later IWC-style tests meaningful: labels, promoted checkpoints, collection identifiers, and fixture-compatible inputs.

This is not a content/patterns/ page. It is cross-cutting design guidance for Molds that need testable Galaxy workflows. Assertion syntax lives in planemo-asserts-idioms. Test YAML fixture shapes live in iwc-test-data-conventions. Accepted shortcut vs smell calls live in iwc-shortcuts-anti-patterns. Corpus evidence trail lives in iwc-workflow-testability-survey.

1. Treat labels as API

Workflow input and output labels are not cosmetic. Planemo and IWC tests address workflow inputs and outputs by label, and the survey found exact label matches for every asserted output across 114 matched workflow/test pairs. A generated workflow should therefore pick stable, descriptive labels before test authoring starts.

Rules:

  • Label every output that may need a test assertion.
  • Treat input/output renames as breaking changes requiring sibling -tests.yml updates.
  • Prefer stable domain names over tool-step defaults or positional names.
  • Do not rely on unlabeled or positional outputs for tests.
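
As a hypothetical sketch (workflow, step, and label names invented), the coupling the rules above describe looks like this: the test file can only address what the workflow labels, and both sides must use the exact same string.

```yaml
# workflow (gxformat2) -- labels are the public surface
outputs:
  mapped_bam:                      # stable, descriptive label
    outputSource: bwa_mem/bam_output

# sibling -tests.yml -- assertions key on the same label
- doc: Smoke test for the mapping workflow
  job: {}
  outputs:
    mapped_bam:                    # must match the workflow label exactly
      asserts:
        has_size:
          value: 50000
          delta: 5000
```

Renaming `mapped_bam` in the workflow without touching the test file silently breaks the test, which is why renames are treated as breaking changes.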

2. Promote assertable checkpoints

IWC workflow tests assert workflow-level outputs. Intermediate step results are invisible unless promoted to top-level workflow outputs. When final reports are weakly assertable, expose intermediate checkpoints that carry deterministic content or structure.

Rules:

  • Promote intermediate outputs when they are the best deterministic or structural checkpoint.
  • Prefer a checkpoint table/text/HDF5 object that can prove content over a final plot/report that can only prove existence.
  • Accept some output-list clutter when it buys meaningful tests.
  • Do not promote every intermediate by default; expose checkpoints that map to concrete assertion intent.
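
A minimal sketch of the promotion decision (step and label names hypothetical): the deterministic counts table produced mid-workflow is lifted into the top-level outputs: block alongside the weakly assertable final report, so tests can reach it.

```yaml
# gxformat2 outputs: block -- only entries here are visible to workflow tests
outputs:
  filtered_counts:                 # promoted intermediate; deterministic table,
    outputSource: filter_counts/out_table   # strong content assertions possible
  final_report:                    # terminal HTML report; existence-level
    outputSource: report_step/html_report   # checks only
```

The cost is one extra entry in the output list; the benefit is that the test suite can prove content, not just that something was produced.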

3. Stabilize collection output identifiers

Collection tests key assertions by element identifier. If a workflow emits collections with unstable or opaque identifiers, the test cannot target elements cleanly.

Rules:

  • Preserve biologically or sample-meaningful identifiers through map-over and collection reshaping.
  • When generating or relabeling collections, make the identifier derivation deterministic and visible in workflow structure.
  • For nested collections, ensure each axis has predictable identifiers.
  • Quote special identifiers in tests when YAML requires it, but do not simplify identifiers merely for YAML convenience.
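
In planemo workflow-test YAML, collection assertions are keyed by element identifier under element_tests, so a sketch of the dependency (sample names hypothetical) looks like this:

```yaml
# -tests.yml fragment: element identifiers are the addressing scheme
outputs:
  per_sample_counts:
    element_tests:
      sample_A:                    # identifier must survive map-over unchanged
        asserts:
          has_text:
            text: "gene_id"
      "sample_B.rep1":             # quoted only because YAML requires it;
        asserts:                   # the identifier itself is untouched
          has_n_lines:
            n: 100
            delta: 10
```

If the workflow reshapes the collection and mangles `sample_A` into an opaque positional name, these keys stop matching and the test degrades to collection-level checks.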

4. Choose checkpoints by assertion strength

Assertion choice is not only a test-file decision. It should feed back into workflow output design. If the only exposed output is a stochastic plot or binary file, the best possible test may be a weak size check. Exposing a sibling table, report, HDF5 structure, or summary line can make the same workflow much more testable.

Rules:

  • For image-heavy workflows, expose data or summary outputs behind the plot when possible.
  • For stochastic statistical outputs, expose structural checkpoints and stable summary tokens.
  • For binary outputs, expose a text/table report or stats file when the tool can produce one.
  • Use planemo-asserts-idioms to select the assertion family after choosing the checkpoint.
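
A sketch of the resulting asymmetry (labels hypothetical, column name assumed from a typical differential-expression table): the plot can only support a hedged size check, while the sibling table behind it carries the real content assertion.

```yaml
# -tests.yml fragment: weak check on the image, strong check on its data
outputs:
  volcano_plot:                    # stochastic PNG; size tolerance only
    asserts:
      has_size:
        value: 150000
        delta: 50000
  results_table:                   # sibling tabular checkpoint
    asserts:
      has_text:
        text: "log2FoldChange"     # assumed header token
      has_n_columns:
        n: 7
```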

5. Design inputs with fixtures in mind

Workflow input labels and types constrain the eventual job: block. Fixture planning is not only a test-file activity: it should influence whether the workflow exposes a file input, a collection input, a data-table string input, or a typed parameter.

Rules:

  • Choose input labels that will be readable as test job: keys.
  • Match workflow input collection types to realistic fixture shapes.
  • Decide early whether reference data should be a portable remote file or a CVMFS/data-table string.
  • Keep typed parameters explicit when tests need to set them (int, boolean, string) rather than burying them in step defaults.
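
The rules above can be sketched as a job: block (all labels, sample names, and URLs hypothetical; the Zenodo record is a placeholder): each workflow input label becomes a job key, the collection input mirrors a realistic fixture shape, and typed parameters are set directly.

```yaml
# -tests.yml job: fragment shaped by workflow input design
- doc: Fixture-compatible inputs
  job:
    raw_reads:                        # readable label doubles as job key
      class: Collection
      collection_type: list:paired    # matches the fixture's real shape
      elements:
        - class: Collection
          type: paired
          identifier: sample_A
          elements:
            - class: File
              identifier: forward
              location: https://zenodo.org/record/0000000/files/sample_A_R1.fastq.gz
              filetype: fastqsanger.gz
            - class: File
              identifier: reverse
              location: https://zenodo.org/record/0000000/files/sample_A_R2.fastq.gz
              filetype: fastqsanger.gz
    reference_genome: hg38            # data-table/CVMFS string input
    min_quality: 20                   # explicit typed int parameter
```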

6. Know what a gxformat2 output entry contains

The top-level gxformat2 outputs: block is the public workflow-output surface. It is separate from per-step out: declarations and from step post-job actions such as change_datatype or rename.

Authoring rules:

  • Use label as the stable public name tests and users will address.
  • Use outputSource to point at the producing step output; do not rely on positional output order.
  • Use doc for short user-facing context when the label is not self-explanatory.
  • Keep type aligned with the exposed value (data, collection, or scalar vocabulary from gxformat2-workflow-inputs) when the schema needs it.
  • Apply change_datatype at the producing step output when Galaxy needs a stronger datatype than the tool reports; choose values from galaxy-datatypes-conf.
  • Use rename only for generated dataset names inside Galaxy histories. It is not a substitute for a stable workflow-output label.
  • Treat add_tags and remove_tags as metadata helpers, not as the test API. IWC tests key by labels and collection element identifiers, not tags.
  • Avoid hide or delete_intermediate_datasets on outputs that are promoted as test checkpoints.

Design inference: a workflow-output promotion decision should pick both the public outputs: entry and any producer-side post-job action needed to make that output useful. For example, a synthesized BED checkpoint needs a stable output label plus a producer-side change_datatype: bed; one without the other is incomplete for a testable workflow.
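
The BED example can be sketched as follows (step, tool, and label names hypothetical; the out:-level change_datatype placement follows the gxformat2 post-job-action convention, assumed here):

```yaml
# gxformat2 fragment: public outputs: entry + producer-side post-job action
outputs:
  synthesized_regions:               # stable public label tests will address
    outputSource: make_bed/outfile
    doc: Synthesized BED checkpoint for interval assertions
steps:
  make_bed:
    tool_id: hypothetical_bed_tool
    out:
      outfile:
        change_datatype: bed         # tool reports txt; Galaxy needs bed
```

Dropping either half leaves the checkpoint incomplete: without the outputs: entry the BED is invisible to tests, and without change_datatype the datatype-sensitive assertions and downstream tooling see plain text.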

Cross-references

Incoming References (9)

  • implement-galaxy-workflow-test (related note) — Assemble Galaxy workflow test fixtures and assertions.
  • gxformat2 structural JSON Schema (related note) — Vendored structural JSON Schema for gxformat2 workflows: vocabulary for inputs, outputs, steps, and step subtypes.
  • gxformat2 workflow inputs (related note) — Conceptual model, current aliases, and schema gaps for gxformat2 workflow inputs.
  • iwc-shortcuts-anti-patterns (related note) — What IWC test suites cut corners on (accepted) vs what's a code smell — existence-only probes, sim_size deltas, image dim checks, label coupling.
  • iwc-test-data-conventions (related note) — How IWC workflows organize and reference test data — Zenodo-first, SHA-1 integrity, collection shapes, CVMFS gotchas.
  • iwc-workflow-testability-survey (related note) — IWC evidence survey for Galaxy workflow structures that make workflow tests meaningful.
  • Nextflow params to Galaxy workflow inputs (related note) — Rules for translating Nextflow params, sample sheets, channels, and control flags into gxformat2 inputs.
  • planemo-asserts-idioms (related note) — Decision and idiom guide for picking planemo workflow-test assertions: which family per output type, how to size tolerances, when to validate.
  • Planemo workflow-test architecture (related note) — Reference for Planemo workflow test/run architecture, Galaxy modes, API polling, and noisy failure boundaries.