IWC conditionals survey
Source corpus: 120 cleaned gxformat2 workflows under $IWC_FORMAT2/. The local $IWC_SKELETONS tree was not present, so this survey used full $IWC_FORMAT2 reads for both topology and parameter evidence. Counts below come from rg -n "^\s*when:" $IWC_FORMAT2 --glob "*.gxwf.yml"; citations use $IWC_FORMAT2/path:line.
1. How IWC actually uses conditionals
The corpus uses Galaxy when: gates, but not as broad workflow-level branching. There are 111 when: occurrences across 21 workflow files. Every observed gate is serialized as when: $(inputs.when) on the gated step, with a sibling input named when connected to either a user boolean, a mapped boolean, or a small subworkflow that computes a boolean from data shape. Representative direct user gates include optional BUSCO assessment in BRAKER3 ($IWC_FORMAT2/genome_annotation/annotation-braker3/Genome_annotation_with_braker3.gxwf.yml:110-141, $IWC_FORMAT2/genome_annotation/annotation-braker3/Genome_annotation_with_braker3.gxwf.yml:341-385) and optional StringTie/Cufflinks FPKM branches in RNA-seq ($IWC_FORMAT2/transcriptomics/rnaseq-pe/rnaseq-pe.gxwf.yml:1082-1140, $IWC_FORMAT2/transcriptomics/rnaseq-pe/rnaseq-pe.gxwf.yml:1141-1214).
The largest dense uses are not arbitrary if/else logic. They are repeated gates around optional outputs or optional sub-recipes: the MGnify amplicon workflows gate Krona and BIOM conversion outputs only when their upstream collection is non-empty ($IWC_FORMAT2/amplicon/amplicon-mgnify/mgnify-amplicon-pipeline-v5-rrna-prediction/mgnify-amplicon-pipeline-v5-rrna-prediction.gxwf.yml:1330-1357, $IWC_FORMAT2/amplicon/amplicon-mgnify/mgnify-amplicon-pipeline-v5-rrna-prediction/mgnify-amplicon-pipeline-v5-rrna-prediction.gxwf.yml:1358-1483, $IWC_FORMAT2/amplicon/amplicon-mgnify/mgnify-amplicon-pipeline-v5-rrna-prediction/mgnify-amplicon-pipeline-v5-rrna-prediction.gxwf.yml:1484-1659). VGP Hi-C workflows use gates for optional haplotype suffixing, optional extraction/unboxing, and optional Pretext tracks ($IWC_FORMAT2/VGP-assembly-v2/hi-c-contact-map-for-assembly-manual-curation/hi-c-map-for-assembly-manual-curation.gxwf.yml:338-459, $IWC_FORMAT2/VGP-assembly-v2/hi-c-contact-map-for-assembly-manual-curation/hi-c-map-for-assembly-manual-curation.gxwf.yml:3057-3218, $IWC_FORMAT2/VGP-assembly-v2/hi-c-contact-map-for-assembly-manual-curation/hi-c-map-for-assembly-manual-curation.gxwf.yml:3289-3346).
Corpus-zero or near-zero points matter for pattern boundaries. __FILTER_NULL__ has zero hits in $IWC_FORMAT2, so IWC does not show the canonical “conditional step may produce null; filter nulls downstream” shape. The closest observed idioms are either not running the optional step at all, then choosing between alternatives with pick_value, or deriving a boolean from an empty/non-empty output before gating follow-on reporting ($IWC_FORMAT2/VGP-assembly-v2/Assembly-decontamination-VGP9/Assembly-decontamination-VGP9.gxwf.yml:243-305, $IWC_FORMAT2/VGP-assembly-v2/Assembly-decontamination-VGP9/Assembly-decontamination-VGP9.gxwf.yml:331-391, $IWC_FORMAT2/VGP-assembly-v2/Assembly-decontamination-VGP9/Assembly-decontamination-VGP9.gxwf.yml:392-409).
2. Inventory, demoted
| Measure | Count | Notes |
|---|---|---|
| Cleaned workflows scanned | 120 | $IWC_FORMAT2/**/*.gxwf.yml |
Workflows with when: | 21 | Concentrated in VGP, amplicon, microbiome, transcriptomics, genome annotation |
when: step occurrences | 111 | Includes nested subworkflows embedded in format2 files |
__FILTER_NULL__ occurrences | 0 | No corpus-backed pattern candidate |
Workflow-file count by top-level domain:
| Domain | Files with when: |
|---|---|
VGP-assembly-v2 | 5 |
amplicon | 3 |
microbiome | 3 |
bacterial_genomics | 2 |
genome_annotation | 2 |
transcriptomics | 2 |
comparative_genomics | 1 |
sars-cov-2-variant-calling | 1 |
scRNAseq | 1 |
virology | 1 |
Most common gated tool IDs, by observed when: occurrences:
| Gated tool | Count | Interpretation |
|---|---|---|
biom_convert | 16 | Optional format export after non-empty amplicon result collections |
taxonomy_krona_chart | 8 | Optional visualization after non-empty taxonomic outputs |
__EXTRACT_DATASET__ | 6 | Optional unboxing/extraction after shape tests |
pretext_graph | 6 | Optional Pretext track addition when track data exists |
tp_replace_in_line | 4 | Optional text rewrite in suffixing/masking recipes |
eggnog_mapper | 4 | One-of-N annotation mode routing |
samtools_merge | 4 | Optional merge/reduction branches in VGP workflows |
3. Recurring idioms
3a. User boolean gates optional analysis
This is the simplest conditional shape: expose a boolean workflow input, connect it to id: when, and put when: $(inputs.when) on every step in the optional branch. BRAKER3 gates both genome BUSCO and protein BUSCO behind Include BUSCO ($IWC_FORMAT2/genome_annotation/annotation-braker3/Genome_annotation_with_braker3.gxwf.yml:110-141, $IWC_FORMAT2/genome_annotation/annotation-braker3/Genome_annotation_with_braker3.gxwf.yml:341-385). RNA-seq gates StringTie and Cufflinks FPKM branches independently with Compute StringTie FPKM and Compute Cufflinks FPKM ($IWC_FORMAT2/transcriptomics/rnaseq-pe/rnaseq-pe.gxwf.yml:1082-1140, $IWC_FORMAT2/transcriptomics/rnaseq-pe/rnaseq-pe.gxwf.yml:1141-1214). The VGP Hi-C manual-curation workflow gates a multi-step suffixing branch from Do you want to add suffixes to the scaffold names? ($IWC_FORMAT2/VGP-assembly-v2/hi-c-contact-map-for-assembly-manual-curation/hi-c-map-for-assembly-manual-curation.gxwf.yml:312-367, $IWC_FORMAT2/VGP-assembly-v2/hi-c-contact-map-for-assembly-manual-curation/hi-c-map-for-assembly-manual-curation.gxwf.yml:368-459).
This is a strong candidate because it is small, direct, and repeated across unrelated domains. The authoring rule is not tool-specific; the tool-specific details live inside the optional branch.
3b. Mutually exclusive data-prep branches plus pick_value
Several workflows run one of two or more alternative steps, then collapse the possible outputs back to one downstream value with pick_value. Scanpy imports 10x matrices through either the legacy or v3 anndata_import mode, maps the user boolean to its opposite with map_param_value, gates both import steps, then uses pick_value to choose the first present AnnData output ($IWC_FORMAT2/scRNAseq/scanpy-clustering/Preprocessing-and-Clustering-of-single-cell-RNA-seq-data-with-Scanpy.gxwf.yml:180-211, $IWC_FORMAT2/scRNAseq/scanpy-clustering/Preprocessing-and-Clustering-of-single-cell-RNA-seq-data-with-Scanpy.gxwf.yml:212-241, $IWC_FORMAT2/scRNAseq/scanpy-clustering/Preprocessing-and-Clustering-of-single-cell-RNA-seq-data-with-Scanpy.gxwf.yml:337-399). Functional annotation fans out four eggnog_mapper modes from booleans derived upstream, then selects outputs with pick_value ($IWC_FORMAT2/genome_annotation/functional-annotation/functional-annotation-of-sequences/Functional_annotation_of_sequences.gxwf.yml:244-395, $IWC_FORMAT2/genome_annotation/functional-annotation/functional-annotation-of-sequences/Functional_annotation_of_sequences.gxwf.yml:396-429).
This is the main conditional routing idiom in IWC. It is not just a when: pattern; the reusable recipe is when-gated alternatives -> pick first available output.
3c. Data-derived boolean from empty/non-empty collection
MGnify amplicon pipelines compute a boolean from a collection, then use it to gate output generation. The embedded subworkflow labeled Map empty/not empty collection to boolean extracts collection identifiers, counts lines with wc_gnu, converts c1 != 0 into a boolean table with column_maker, and reads the result with param_value_from_file ($IWC_FORMAT2/amplicon/amplicon-mgnify/mgnify-amplicon-pipeline-v5-rrna-prediction/mgnify-amplicon-pipeline-v5-rrna-prediction.gxwf.yml:1358-1483). That boolean gates Krona and BIOM exports for SSU/LSU SILVA outputs ($IWC_FORMAT2/amplicon/amplicon-mgnify/mgnify-amplicon-pipeline-v5-rrna-prediction/mgnify-amplicon-pipeline-v5-rrna-prediction.gxwf.yml:1330-1357, $IWC_FORMAT2/amplicon/amplicon-mgnify/mgnify-amplicon-pipeline-v5-rrna-prediction/mgnify-amplicon-pipeline-v5-rrna-prediction.gxwf.yml:1484-1659).
VGP Hi-C uses the same broader shape with datasets rather than collections: param_value_from_file reads telomere BED outputs as text, map_param_value maps empty string to false and unmapped non-empty text to true, then pretext_graph steps are gated ($IWC_FORMAT2/VGP-assembly-v2/hi-c-contact-map-for-assembly-manual-curation/hi-c-map-for-assembly-manual-curation.gxwf.yml:3057-3218, $IWC_FORMAT2/VGP-assembly-v2/hi-c-contact-map-for-assembly-manual-curation/hi-c-map-for-assembly-manual-curation.gxwf.yml:3289-3346).
This is probably the highest-value conditionals pattern because it shows how IWC authors avoid running downstream reporting tools on empty results without relying on __FILTER_NULL__.
3d. Optional transform then fallback to original
VGP suffixing and decontamination recipes show a when-gated transform paired with pick_value fallback. In Hi-C manual curation, optional suffix expressions and replacements run only when the user requests suffixes, then pick_value chooses the suffixed haplotype when present or the original haplotype otherwise ($IWC_FORMAT2/VGP-assembly-v2/hi-c-contact-map-for-assembly-manual-curation/hi-c-map-for-assembly-manual-curation.gxwf.yml:338-459, $IWC_FORMAT2/VGP-assembly-v2/hi-c-contact-map-for-assembly-manual-curation/hi-c-map-for-assembly-manual-curation.gxwf.yml:460-479). In Assembly decontamination, an upstream mapped boolean gates adaptor filtering, replacement, and concatenation; later pick_value falls back to the original Adaptor Action report when the optional masking branch does not run ($IWC_FORMAT2/VGP-assembly-v2/Assembly-decontamination-VGP9/Assembly-decontamination-VGP9.gxwf.yml:243-305, $IWC_FORMAT2/VGP-assembly-v2/Assembly-decontamination-VGP9/Assembly-decontamination-VGP9.gxwf.yml:331-391, $IWC_FORMAT2/VGP-assembly-v2/Assembly-decontamination-VGP9/Assembly-decontamination-VGP9.gxwf.yml:392-409).
This differs from mutually exclusive alternatives: one side is a transformed version of the other, not a separate mode. It may deserve its own pattern if authors need to preserve downstream shape while letting an optional cleanup branch disappear.
3e. Optional subworkflow branch
IWC also gates entire embedded Galaxy subworkflows. MAGs-generation runs either co-assembly or individual assembly with metaSPAdes; the individual-assembly branch is an embedded workflow gated by _unlabeled_step_19/output_param_boolean, and a downstream pick_value chooses among individual, co-assembly, or custom assemblies ($IWC_FORMAT2/microbiome/mags-building/MAGs-generation.gxwf.yml:260-294, $IWC_FORMAT2/microbiome/mags-building/MAGs-generation.gxwf.yml:295-410, $IWC_FORMAT2/microbiome/mags-building/MAGs-generation.gxwf.yml:411-439).
This is a recipe extension of when-gated alternatives -> pick first available output; it probably should merge into that candidate rather than become a separate page.
4. Redundancy and decision points
The main redundancy is not between tools that do the same biological operation. It is between control-flow mechanisms:
| Need | Observed IWC mechanism | Nearby alternative | Survey call |
|---|---|---|---|
| User toggles optional analysis | Direct workflow boolean connected to when | Runtime parameter inside tool state | Keep direct when; it is explicit and corpus-backed |
| One-of-two or one-of-N route | Gate alternatives, then pick_value | One large wrapper/tool conditional | Keep as route pattern; Galaxy-native and visible in graph |
| Skip reporting when result is empty | Compute boolean from data/collection, then when | Run tool and filter null outputs | Keep data-derived gate; __FILTER_NULL__ has zero corpus uptake |
| Optional transform with original fallback | Gate transform, then pick_value original vs transformed | Duplicate downstream branches | Keep or merge with route pattern; evidence is good but boundary is close |
| Conditional outputs after failed map-over | __FILTER_EMPTY_DATASETS__/__FILTER_FAILED_DATASETS__ cleanup | when per element | Merge with collection cleanup; already covered by iwc-transformations-survey and existing collection cleanup patterns |
The anti-pattern note does not currently constrain conditionals directly. The strongest existing constraint comes from iwc-transformations-survey: __FILTER_NULL__ is not observed, while cleanup after map-over failure is already a collection-transform problem rather than a conditional-step problem.
5. Candidate pattern boundaries
Candidate A: conditional-run-optional-step
Scope: author a boolean-controlled optional branch by connecting the workflow boolean to id: when on every optional step, with no output selection unless downstream needs a replacement value.
Evidence:
- Optional BUSCO branches in BRAKER3:
$IWC_FORMAT2/genome_annotation/annotation-braker3/Genome_annotation_with_braker3.gxwf.yml:110-141,$IWC_FORMAT2/genome_annotation/annotation-braker3/Genome_annotation_with_braker3.gxwf.yml:341-385. - Optional StringTie/Cufflinks branches in RNA-seq:
$IWC_FORMAT2/transcriptomics/rnaseq-pe/rnaseq-pe.gxwf.yml:1082-1140,$IWC_FORMAT2/transcriptomics/rnaseq-pe/rnaseq-pe.gxwf.yml:1141-1214. - Optional suffixing branch in VGP Hi-C:
$IWC_FORMAT2/VGP-assembly-v2/hi-c-contact-map-for-assembly-manual-curation/hi-c-map-for-assembly-manual-curation.gxwf.yml:338-459.
Call: keep. This is the operation-level primitive all other conditional recipes build on.
Candidate B: conditional-route-between-alternative-outputs
Scope: run mutually exclusive alternatives with when, then use pick_value to collapse possible outputs to a single downstream value. Includes binary and one-of-N routing.
Evidence:
- Scanpy 10x legacy vs v3 import route:
$IWC_FORMAT2/scRNAseq/scanpy-clustering/Preprocessing-and-Clustering-of-single-cell-RNA-seq-data-with-Scanpy.gxwf.yml:180-211,$IWC_FORMAT2/scRNAseq/scanpy-clustering/Preprocessing-and-Clustering-of-single-cell-RNA-seq-data-with-Scanpy.gxwf.yml:337-399. - Functional annotation eggNOG mode fan-out:
$IWC_FORMAT2/genome_annotation/functional-annotation/functional-annotation-of-sequences/Functional_annotation_of_sequences.gxwf.yml:244-395,$IWC_FORMAT2/genome_annotation/functional-annotation/functional-annotation-of-sequences/Functional_annotation_of_sequences.gxwf.yml:396-429. - MAGs individual vs co-assembly/custom assemblies:
$IWC_FORMAT2/microbiome/mags-building/MAGs-generation.gxwf.yml:295-410,$IWC_FORMAT2/microbiome/mags-building/MAGs-generation.gxwf.yml:411-439.
Call: keep. This is the highest-value routing page because it explains the necessary downstream pick_value merge, not just the when field.
Candidate C: conditional-gate-on-nonempty-result
Scope: derive a boolean from a dataset or collection being empty/non-empty, then gate downstream reporting or conversion steps.
Evidence:
- MGnify collection identifier count to boolean:
$IWC_FORMAT2/amplicon/amplicon-mgnify/mgnify-amplicon-pipeline-v5-rrna-prediction/mgnify-amplicon-pipeline-v5-rrna-prediction.gxwf.yml:1358-1483. - MGnify gated Krona/BIOM outputs:
$IWC_FORMAT2/amplicon/amplicon-mgnify/mgnify-amplicon-pipeline-v5-rrna-prediction/mgnify-amplicon-pipeline-v5-rrna-prediction.gxwf.yml:1330-1357,$IWC_FORMAT2/amplicon/amplicon-mgnify/mgnify-amplicon-pipeline-v5-rrna-prediction/mgnify-amplicon-pipeline-v5-rrna-prediction.gxwf.yml:1484-1659. - VGP telomere track text-to-boolean gate:
$IWC_FORMAT2/VGP-assembly-v2/hi-c-contact-map-for-assembly-manual-curation/hi-c-map-for-assembly-manual-curation.gxwf.yml:3057-3218,$IWC_FORMAT2/VGP-assembly-v2/hi-c-contact-map-for-assembly-manual-curation/hi-c-map-for-assembly-manual-curation.gxwf.yml:3289-3346.
Call: keep. This is distinct from generic optional branching because the boolean is computed from data shape/content.
User story: a generated Galaxy workflow needs to skip optional reporting/export steps when an upstream dataset or collection is empty. The corpus-backed MGnify recipe proves the shape by turning collection membership into a boolean through collection_element_identifiers -> wc_gnu -> column_maker -> param_value_from_file, then using that boolean as inputs.when. This is structurally useful but clunky: it is a four-step shim for one boolean, so the pattern page should present it as verified IWC precedent, not necessarily the preferred authoring target.
Verification outcome: verification/workflows/conditional-gate-on-nonempty-result/gate-on-nonempty.gxwf-test.yml passes under Planemo against Galaxy release_25.1 with the MGnify-style shim. Direct when over the collection failed with when_not_boolean, and an embedded CWL ExpressionTool shim failed gxformat2 validation, so the known-good recipe remains collection_element_identifiers -> wc_gnu -> column_maker -> param_value_from_file -> when.
Candidate D: conditional-transform-or-pass-through
Scope: optionally transform an input, then choose transformed output if present or original input otherwise.
Evidence:
- Optional haplotype suffixing then fallback to original haplotypes:
$IWC_FORMAT2/VGP-assembly-v2/hi-c-contact-map-for-assembly-manual-curation/hi-c-map-for-assembly-manual-curation.gxwf.yml:338-479. - Optional masking action report then fallback to original report:
$IWC_FORMAT2/VGP-assembly-v2/Assembly-decontamination-VGP9/Assembly-decontamination-VGP9.gxwf.yml:243-409.
Call: keep separately for now. It shares pick_value with Candidate B, but the authoring decision is different: Candidate B routes among peer alternatives, while this recipe preserves one original value unless an optional transform runs. Merge later only if the page proves too thin after use.
Candidate E: conditional-filter-null-outputs
Scope: use __FILTER_NULL__ after conditional steps.
Evidence: zero __FILTER_NULL__ hits in $IWC_FORMAT2; iwc-transformations-survey also notes no corpus uptake.
Call: drop. Document as a catalog capability, not an IWC-backed pattern.
Candidate F: conditional-cleanup-after-mapover-failure
Scope: recover after mapped tools return empty or failed datasets.
Evidence: covered as collection cleanup in iwc-transformations-survey and existing collection cleanup patterns. The observed mechanisms are __FILTER_EMPTY_DATASETS__ and __FILTER_FAILED_DATASETS__, not when: gates.
Call: merge into the collection-cleanup pattern family; do not create a conditionals-specific page.
6. Open questions
-
Should conditionals get a MOC page (
galaxy-conditionals-patterns.md) now, or wait until at least two operation pages are authored? -
Answered for now: keep
conditional-transform-or-pass-throughseparate fromconditional-route-between-alternative-outputs. The boundary is “same value optionally modified” vs “peer alternatives routed to one output”. Revisit only if the separate page proves too thin. -
Answered for now:
verification/workflows/conditional-gate-on-nonempty-result/gate-on-nonempty.gxwf-test.ymlpasses with the MGnify shim, while the tested direct collection gate and embedded CWL expression-tool shim fail. Keep the MGnify recipe as lead known-good shape until a different short route validates. -
No. Do not add
__FILTER_NULL__to the anti-pattern note from zero uptake alone. Lack of uptake does not make a Galaxy feature an anti-pattern; users can be slow to adopt newer or esoteric features. Keep the survey call as “catalog capability, no IWC-backed pattern candidate” unless separate evidence shows a concrete reason not to endorse it. -
Does the Foundry need a background reference note for Galaxy
when:syntax and nullable downstream behavior before operation pattern pages are useful?