IWC runtime parameter shims survey
Source corpus: 120 cleaned gxformat2 workflows under $IWC_FORMAT2/, materialized from pinned IWC commit deafc4876f2c778aaf075e48bd8e95f3604ccc92. This is a narrower follow-up to iwc-parameter-derivation-survey: it looks only at tiny glue steps that adapt runtime values for downstream tool parameters.
Scope: small parameter shims that turn workflow inputs, scalar files, counts, or short strings into typed connected parameters: flags, enum dialects, booleans, column-list strings, expressions, labels, and counts.
Out of scope:
- Broad data-to-parameter derivation inventory; covered by iwc-parameter-derivation-survey.
- Conditional topology after the boolean exists; covered by iwc-conditionals-survey.
- Row-wise tabular computation whose output remains a dataset; covered by iwc-tabular-operations-survey and tabular-compute-new-column.
Shim Surface
The corpus uses four first-class shim mechanisms and one supporting count/computation chain. Parsed counts from iwc-parameter-derivation-survey put the authored-step surface at 63 compose_text_param steps, 50 param_value_from_file steps, 26 map_param_value steps, 8 wc_gnu count shims, and 20 column_maker/Add_a_column1 steps that are parameter-adjacent rather than always parameter shims.
| Shim operation | Main observed tool/recipe | Pattern status |
|---|---|---|
| Compose runtime text | compose_text_param | Already authored: compose-runtime-text-parameter |
| Read scalar file as typed parameter | param_value_from_file | Already authored: derive-parameter-from-file |
| Map enum/string/boolean/integer to another parameter value | map_param_value | Already authored for enum-to-tool dialect: map-workflow-enum-to-tool-parameter |
| Count file/collection then parameterize | wc_gnu -> param_value_from_file, sometimes with collection_element_identifiers | Folded into derive-parameter-from-file and conditional-gate-on-nonempty-result |
| Compute boolean/string in tabular-land, then parameterize | column_maker -> param_value_from_file | Merge boundary with tabular-compute-new-column plus derive-parameter-from-file |
No simple expression-tool shim is corpus-backed here. Searches for ExpressionTool, expression_tool, __EXPRESSION, and related CWL-expression markers returned zero IWC gxformat2 hits. The verified conditionals pattern separately records that an attempted embedded CWL ExpressionTool shim failed gxformat2 validation, so it should not be promoted as an IWC-backed replacement for the clunky collection-count boolean chain.
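The zero-hit claim can be re-checked with a plain text scan. The sketch below is one assumed way to do that, not the tooling this survey used: it walks a directory of .gxwf.yml files and counts how many contain each of the three markers named above. The function name count_markers is illustrative, and the "related CWL-expression markers" are not enumerated in this survey, so none are guessed here.

```python
from pathlib import Path

# Only the markers named in the survey text; "related markers" are not
# enumerated there, so this tuple deliberately stops at three.
MARKERS = ("ExpressionTool", "expression_tool", "__EXPRESSION")


def count_markers(root, markers=MARKERS):
    """Return {marker: number of *.gxwf.yml files under root that contain it}."""
    hits = {marker: 0 for marker in markers}
    for path in Path(root).rglob("*.gxwf.yml"):
        text = path.read_text(encoding="utf-8", errors="replace")
        for marker in markers:
            if marker in text:  # case-sensitive substring match
                hits[marker] += 1
    return hits
```

Pointed at the directory behind $IWC_FORMAT2, a scan like this would be expected to report zero for every marker if the claim above holds for the pinned commit.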
Recurring Shim Idioms
1. Enum or label normalization into downstream tool dialects
map_param_value is the core adapter when one workflow-facing value must become exact downstream syntax. RNA-seq maps one Strandedness input into separate dialects for featureCounts numeric codes, Cufflinks library types, and StringTie flags; each dialect gets a separate mapper step with on_unmapped: fail ($IWC_FORMAT2/transcriptomics/rnaseq-pe/rnaseq-pe.gxwf.yml:270-368). BREW3R repeats the strandedness-to-StringTie-flag and strandedness-to-boolean split in a smaller workflow ($IWC_FORMAT2/transcriptomics/brew3r/BREW3R.gxwf.yml:51-109). VGP maps haplotype labels into suffix fragments like H1, H2, pri, and alt, then composes downstream strings from those fragments ($IWC_FORMAT2/VGP-assembly-v2/Scaffolding-HiC-VGP8/Scaffolding-HiC-VGP8.gxwf.yml:276-344).
This is a stable pattern leaf when the output is a tool parameter value, not a graph-control boolean.
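As a rough model of the idiom, the Python sketch below treats each mapper step as a lookup table that fails on unmapped input, mirroring on_unmapped: fail. It is not gxformat2 syntax. The workflow-facing labels (unstranded, forward, reverse) and the exact target strings are assumptions for illustration; the real IWC values are not quoted in this survey.

```python
class UnmappedValueError(ValueError):
    """Raised when an input has no mapping and the policy is to fail."""


def map_value(value, mapping):
    """Model one mapper step with a fail-on-unmapped policy."""
    if value not in mapping:
        raise UnmappedValueError(f"no mapping for {value!r}")
    return mapping[value]


# One workflow-facing input, one table per downstream dialect.
# Keys and values are hypothetical placeholders, not the IWC tool state.
FEATURECOUNTS_CODE = {"unstranded": "0", "forward": "1", "reverse": "2"}
STRINGTIE_FLAG = {"unstranded": "", "forward": "--fr", "reverse": "--rf"}


def strandedness_dialects(strandedness):
    """Fan one input out into every downstream dialect it must feed."""
    return {
        "featurecounts": map_value(strandedness, FEATURECOUNTS_CODE),
        "stringtie": map_value(strandedness, STRINGTIE_FLAG),
    }
```

Keeping one table per consumer matches the corpus habit of a separate mapper step per dialect, so a change in one tool's syntax touches only one table.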
2. Boolean inversion and one-of-N boolean fanout
map_param_value also expresses boolean inversion and enum-to-boolean fanout for conditional routing. Scanpy maps a user boolean to its opposite so legacy 10x and 10x v3 import branches are mutually exclusive ($IWC_FORMAT2/scRNAseq/scanpy-clustering/Preprocessing-and-Clustering-of-single-cell-RNA-seq-data-with-Scanpy.gxwf.yml:173-241). Functional annotation maps Selected sequence type into four one-hot booleans, then gates four eggnog_mapper branches ($IWC_FORMAT2/genome_annotation/functional-annotation/functional-annotation-of-sequences/Functional_annotation_of_sequences.gxwf.yml:87-243). VGP subworkflows map collection-size text values into Is empty, Has a single sample, and Has multiple samples booleans ($IWC_FORMAT2/VGP-assembly-v2/Scaffolding-HiC-VGP8/Scaffolding-HiC-VGP8.gxwf.yml:760-937).
This is implementation detail inside conditional routing pages, not a standalone parameter-shim page. The user story is branch selection, so the candidate should merge into conditional-route-between-alternative-outputs and conditional-run-optional-step.
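The two routing shims reduce to very small functions. The sketch below models them in Python under stated assumptions: the option names are placeholders, because the four sequence types are not listed in this survey, and the helper names invert and one_hot are illustrative.

```python
def invert(flag):
    """Boolean inversion: lets two branches gate on opposite values."""
    return not flag


def one_hot(selected, options):
    """Map one enum value to a boolean per option; exactly one is True."""
    if selected not in options:
        raise ValueError(f"unknown option {selected!r}")
    return {option: option == selected for option in options}


# Placeholder option names; the real enum values are not quoted in the survey.
OPTIONS = ("type_a", "type_b", "type_c", "type_d")
gates = one_hot("type_b", OPTIONS)
```

Each value in gates would feed the when input of one branch, which is why these shims belong with the conditional pages rather than with parameter normalization.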
3. Empty/non-empty mapping for runtime gates
IWC uses two shapes for empty/non-empty booleans. MGnify converts a collection to a boolean by extracting collection element identifiers, counting lines with wc_gnu, computing c1 != 0 with column_maker, and reading the result as boolean_param ($IWC_FORMAT2/amplicon/amplicon-mgnify/mgnify-amplicon-pipeline-v5-rrna-prediction/mgnify-amplicon-pipeline-v5-rrna-prediction.gxwf.yml:1358-1483). VGP Hi-C reads telomere BED files as text, maps "" -> false, and defaults unmapped non-empty text to true before gating Pretext graph steps ($IWC_FORMAT2/VGP-assembly-v2/hi-c-contact-map-for-assembly-manual-curation/hi-c-map-for-assembly-manual-curation.gxwf.yml:3057-3218).
This is pattern-worthy, but the pattern is already conditional-gate-on-nonempty-result. The shim mechanics should remain a section there because the operation is “gate on a non-empty result,” not “derive a boolean for its own sake.”
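The two shapes can be compared side by side in a small Python model. This is a sketch of the logic only, not of the Galaxy tools: the function names are illustrative, and how the real tools treat whitespace or trailing newlines is not covered by this survey.

```python
def nonempty_from_identifiers(identifiers_text):
    """Shape 1: count identifier lines, then evaluate c1 != 0."""
    line_count = len(identifiers_text.splitlines())  # the wc_gnu step
    return line_count != 0                           # the column_maker step


def nonempty_from_text(file_text):
    """Shape 2: map "" to False and default every unmapped text to True."""
    mapping = {"": False}
    return mapping.get(file_text, True)
```

Shape 2 is shorter, but in this model any non-empty text, including a lone newline, counts as non-empty, so it is only as reliable as the upstream file is clean.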
4. Scalar-file bridge into parameter-land
param_value_from_file is the escape hatch from one-value datasets to typed runtime ports. Pox-virus amplicon reads sequence and pool positions from small files as text parameters, then composes genomic ranges and pool suffix strings ($IWC_FORMAT2/virology/pox-virus-amplicon/pox-virus-half-genome.gxwf.yml:409-479, $IWC_FORMAT2/virology/pox-virus-amplicon/pox-virus-half-genome.gxwf.yml:560-669). Consensus peaks counts replicates with wc_gnu before feeding count-driven downstream parameters ($IWC_FORMAT2/epigenetics/consensus-peaks/consensus-peaks-atac-cutandrun.gxwf.yml:299-318). MGnify’s collection gate uses the same bridge at the last step, but with param_type: boolean ($IWC_FORMAT2/amplicon/amplicon-mgnify/mgnify-amplicon-pipeline-v5-rrna-prediction/mgnify-amplicon-pipeline-v5-rrna-prediction.gxwf.yml:1447-1463).
This is already correctly centralized in derive-parameter-from-file. Count recipes are common enough to document inside that page, but not distinct enough to split into their own leaf.
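Conceptually, the bridge is a typed parse of a one-value file. The Python sketch below models that idea; it is not the tool's implementation. The text and boolean types come from the examples above, while the integer and float branches, the whitespace stripping, and the accepted boolean spellings are assumptions made for illustration.

```python
def param_from_text(text, param_type):
    """Parse the content of a one-value file into a typed parameter."""
    value = text.strip()  # assumption: surrounding whitespace is ignored
    if param_type == "text":
        return value
    if param_type == "integer":
        return int(value)
    if param_type == "float":
        return float(value)
    if param_type == "boolean":
        lowered = value.lower()
        if lowered in ("true", "false"):  # assumption: only these spellings
            return lowered == "true"
        raise ValueError(f"not a boolean: {value!r}")
    raise ValueError(f"unsupported param_type: {param_type!r}")
```

A count recipe is then just a line count written to a file followed by param_from_text(count_text, "integer"), which is why counts fit inside derive-parameter-from-file rather than needing their own leaf.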
5. Runtime text composition for expressions, column lists, config lines, and labels
compose_text_param is the strongest standalone runtime shim. Consensus peaks composes c4 >= <minimum overlap> and connects it to Filter1.cond ($IWC_FORMAT2/epigenetics/consensus-peaks/consensus-peaks-atac-cutandrun.gxwf.yml:102-128, $IWC_FORMAT2/epigenetics/consensus-peaks/consensus-peaks-atac-cutandrun.gxwf.yml:318-336). SRA manifest processing maps a zero column number to one, then composes a dynamic Cut1.columnList such as cN,cM ($IWC_FORMAT2/data-fetching/sra-manifest-to-concatenated-fastqs/sra-manifest-to-concatenated-fastqs.gxwf.yml:32-113). GROMACS dcTMD composes config lines such as pull_coord1_rate = <float>, dt = <float>, and nsteps = <integer> ($IWC_FORMAT2/computational-chemistry/gromacs-dctmd/gromacs-dctmd.gxwf.yml:553-654). Pox-virus amplicon composes genomic ranges and pool suffixes from file-derived text ($IWC_FORMAT2/virology/pox-virus-amplicon/pox-virus-half-genome.gxwf.yml:560-669).
This deserves to stay as its own operation page. It is the clearest answer to “how do I build a dynamic downstream parameter without authoring a custom wrapper?”
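The composition itself is ordered concatenation of literal and runtime pieces. The Python sketch below models that for the filter-predicate and config-line cases; the helper name compose_text and the numeric values are illustrative assumptions, not values taken from the workflows.

```python
def compose_text(*pieces):
    """Join literal and runtime pieces, in order, into one text parameter."""
    return "".join(str(piece) for piece in pieces)


# Filter predicate built from a runtime minimum-overlap value (value is made up).
filter_condition = compose_text("c4 >= ", 2)

# Config lines built from runtime numbers (values are made up).
config_lines = [
    compose_text("dt = ", 0.002),
    compose_text("nsteps = ", 5000),
]
```

Nothing in this step checks that the composed string is valid for its consumer, so a malformed predicate or config line would only surface when the downstream tool runs.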
6. Enum-to-code and integer normalization as small value repairs
Not every map_param_value is an enum dialect. SRA manifest processing maps user-provided column 0 to 1, passes all other integers through, and then composes the resulting value into a Cut1 column list ($IWC_FORMAT2/data-fetching/sra-manifest-to-concatenated-fastqs/sra-manifest-to-concatenated-fastqs.gxwf.yml:32-63). This is a tiny normalization repair, not a broad pattern leaf.
Keep this as an example inside map-workflow-enum-to-tool-parameter or compose-runtime-text-parameter only if those pages need an integer-normalization exemplar.
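For reference, the repair amounts to a one-entry map with pass-through for everything else, followed by composition. The Python sketch below models that under assumptions: the helper names are illustrative, and whether the workflow normalizes every column or only one is not stated in this survey.

```python
def normalize_column(column):
    """Map 0 to 1 and pass every other integer through unchanged."""
    mapping = {0: 1}
    return mapping.get(column, column)


def column_list(*columns):
    """Compose a Cut1-style column list such as c1,c3 from column numbers."""
    return ",".join(f"c{normalize_column(column)}" for column in columns)
```

Unlike the enum mappers, the unmapped policy here is pass-through rather than fail, which is the main reason this case reads as a value repair and not a dialect translation.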
7. Long snippets selected by enum are possible but brittle
The taxonomic-rank summary workflow maps Taxonomic rank to long awk programs, then connects the selected program into tp_awk_tool.code ($IWC_FORMAT2/amplicon/amplicon-mgnify/taxonomic-rank-abundance-summary-table/taxonomic-rank-abundance-summary-table.gxwf.yml:35-140). This proves the mechanism can select executable text, but it is a clunky authoring surface: long snippets in frontmatter-like tool state are hard to review and easy to damage.
Do not create a separate “enum-to-program” pattern from this alone. Mention it as a high-power, high-risk edge case in map-workflow-enum-to-tool-parameter and cross-link to tabular operation pages when the real operation is taxonomy splitting.
Candidate Pattern Boundaries
Candidate A: compose-runtime-text-parameter
Scope: build connected text parameters from ordered literal and runtime pieces.
Evidence:
- Filter predicate composition in consensus peaks: $IWC_FORMAT2/epigenetics/consensus-peaks/consensus-peaks-atac-cutandrun.gxwf.yml:102-128, $IWC_FORMAT2/epigenetics/consensus-peaks/consensus-peaks-atac-cutandrun.gxwf.yml:318-336.
- Dynamic column-list composition in SRA manifest processing: $IWC_FORMAT2/data-fetching/sra-manifest-to-concatenated-fastqs/sra-manifest-to-concatenated-fastqs.gxwf.yml:61-113.
- Config-line composition in GROMACS dcTMD: $IWC_FORMAT2/computational-chemistry/gromacs-dctmd/gromacs-dctmd.gxwf.yml:553-654.
Call: keep; already authored as compose-runtime-text-parameter. This remains the highest-value standalone shim leaf.
Candidate B: derive-parameter-from-file
Scope: read one scalar dataset as a typed parameter, including count-to-integer and file-text-to-parameter recipes.
Evidence:
- Pox-virus sequence and pool positions read as text parameters: $IWC_FORMAT2/virology/pox-virus-amplicon/pox-virus-half-genome.gxwf.yml:409-479.
- Consensus peak replicate count via wc_gnu: $IWC_FORMAT2/epigenetics/consensus-peaks/consensus-peaks-atac-cutandrun.gxwf.yml:299-318.
- MGnify boolean read from one-cell table: $IWC_FORMAT2/amplicon/amplicon-mgnify/mgnify-amplicon-pipeline-v5-rrna-prediction/mgnify-amplicon-pipeline-v5-rrna-prediction.gxwf.yml:1447-1463.
Call: keep; already authored as derive-parameter-from-file. Do not split wc_gnu count shims into a separate page yet.
Candidate C: map-workflow-enum-to-tool-parameter
Scope: convert workflow-facing enum/string/integer values into exact downstream tool dialects, flags, snippets, or fragments.
Evidence:
- RNA-seq strandedness to several downstream dialects: $IWC_FORMAT2/transcriptomics/rnaseq-pe/rnaseq-pe.gxwf.yml:270-368.
- VGP haplotype label to suffix fragment: $IWC_FORMAT2/VGP-assembly-v2/Scaffolding-HiC-VGP8/Scaffolding-HiC-VGP8.gxwf.yml:276-344.
- Taxonomic rank to awk program text: $IWC_FORMAT2/amplicon/amplicon-mgnify/taxonomic-rank-abundance-summary-table/taxonomic-rank-abundance-summary-table.gxwf.yml:35-140.
Call: keep; already authored as map-workflow-enum-to-tool-parameter. Add a future refinement only if that page needs an explicit “integer normalization” subsection for the SRA 0 -> 1 case.
Candidate D: map-parameter-for-conditional-routing
Scope: map booleans or enums to booleans used only for when gates.
Evidence:
- Scanpy boolean inversion for two import branches: $IWC_FORMAT2/scRNAseq/scanpy-clustering/Preprocessing-and-Clustering-of-single-cell-RNA-seq-data-with-Scanpy.gxwf.yml:173-241.
- Functional annotation one-of-N boolean fanout: $IWC_FORMAT2/genome_annotation/functional-annotation/functional-annotation-of-sequences/Functional_annotation_of_sequences.gxwf.yml:87-243.
- VGP collection-size classification booleans: $IWC_FORMAT2/VGP-assembly-v2/Scaffolding-HiC-VGP8/Scaffolding-HiC-VGP8.gxwf.yml:760-937.
Call: merge into conditional-route-between-alternative-outputs and conditional-run-optional-step. The authoring decision is conditional topology, not parameter normalization.
Candidate E: derive-nonempty-boolean-parameter
Scope: turn empty/non-empty datasets or collections into boolean runtime parameters.
Evidence:
- MGnify collection-to-boolean chain: $IWC_FORMAT2/amplicon/amplicon-mgnify/mgnify-amplicon-pipeline-v5-rrna-prediction/mgnify-amplicon-pipeline-v5-rrna-prediction.gxwf.yml:1358-1483.
- VGP text-empty mapping: $IWC_FORMAT2/VGP-assembly-v2/hi-c-contact-map-for-assembly-manual-curation/hi-c-map-for-assembly-manual-curation.gxwf.yml:3057-3218.
Call: merge into conditional-gate-on-nonempty-result. The boolean shim is important, but the reusable operation is gating downstream work on non-empty output.
Candidate F: compute-tabular-value-then-parameterize
Scope: use a tabular tool such as column_maker or table_compute to compute one value, then read it as a parameter.
Evidence:
- MGnify column_maker computes c1 != 0 before param_value_from_file: $IWC_FORMAT2/amplicon/amplicon-mgnify/mgnify-amplicon-pipeline-v5-rrna-prediction/mgnify-amplicon-pipeline-v5-rrna-prediction.gxwf.yml:1414-1463.
- Consensus peaks uses wc_gnu and downstream scalar parameterization around replicate counts: $IWC_FORMAT2/epigenetics/consensus-peaks/consensus-peaks-atac-cutandrun.gxwf.yml:299-318.
Call: merge. Keep tabular computation on tabular-compute-new-column or specific tabular pages; keep the escape back to parameter-land on derive-parameter-from-file.
Candidate G: simple expression shim
Scope: replace small shim chains with a single simple expression tool or CWL ExpressionTool-like adapter.
Evidence: zero IWC gxformat2 hits for ExpressionTool, expression_tool, __EXPRESSION, or related markers in this corpus scan.
Call: drop as corpus-backed pattern; catalog gap only. Do not recommend a shorter expression shim until it has a verified-pattern workflow and survives gxformat2 validation.
Relationship To Existing Surveys And Patterns
This focused survey does not supersede iwc-parameter-derivation-survey. It narrows that survey’s broad candidates into shim-specific authoring boundaries:
- derive-parameter-from-file owns scalar files, counts, and the final typed-parameter bridge.
- compose-runtime-text-parameter owns text assembly from constants plus runtime scalar values.
- map-workflow-enum-to-tool-parameter owns exact downstream dialect normalization.
- conditional-gate-on-nonempty-result owns empty/non-empty boolean derivation when the boolean feeds a when gate.
- conditional-route-between-alternative-outputs and conditional-run-optional-step own boolean inversion and one-of-N gate booleans.
- tabular-compute-new-column owns row-wise or one-cell tabular computation before any parameter bridge.
The main refinement from this pass is suppression, not expansion: many shim chains are implementation details inside already-authored operation pages.
Open Questions
- Should map-workflow-enum-to-tool-parameter add a short “integer normalization” subsection for SRA’s 0 -> 1 column repair, or is that too thin?
- Should compose-runtime-text-parameter explicitly warn about unvalidated downstream syntax for Filter1.cond, Cut1.columnList, and config-line consumers, or is the current pitfall coverage enough?
- Should conditional-gate-on-nonempty-result state more strongly that simple expression tools are not corpus-backed and failed the current verified-pattern route, or keep that as survey-only context?
- Should the taxonomic-rank enum-to-awk snippet stay as a pitfall/example inside map-workflow-enum-to-tool-parameter, or should it be referenced only from tabular taxonomy pages to avoid normalizing long generated programs?