CWL pickValue → Galaxy pick_value
Audience: a Mold author who just saw a pickValue:* marker in a summary-cwl.json edge via: array (or a WorkflowOutputParameter.output_source multi-value carrying a pickValue hint) and needs to emit gxformat2.
CWL pickValue — canonical semantics
Source: CWL v1.2 schema Workflow.yml (PickValueMethod) and the rendered spec at https://www.commonwl.org/v1.2/Workflow.html#PickValueMethod.
- `first_non_null`: “For the first level of a list input, pick the first non-null element. The result is a scalar. It is an error if there is no non-null element.”
- `the_only_non_null`: “For the first level of a list input, pick the single non-null element. The result is a scalar. It is an error if there is more than one non-null element.”
- `all_non_null`: “For the first level of a list input, pick all non-null values. The result is a list, which may be empty.”
Placement: declared on both WorkflowStepInput and WorkflowOutputParameter with identical semantics. Operates on the array produced when source: / outputSource: is multi-valued. First level only; composes with linkMerge (which builds the array pickValue then filters).
Interaction with when: — pickValue is the canonical fan-in idiom for N branches gated by complementary when: predicates. Skipped steps emit null for their outputs; the survivor is picked. (Inference, but corroborated by the PR’s stated motivation: unblocking 27+ CWL v1.2 conditional conformance tests.)
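The three modes above can be sketched in a few lines of Python. This is an illustration of the quoted semantics, not code from any CWL runner: the function name `pick_value` and the use of `None` for CWL `null` are assumptions, and the all-null case for `the_only_non_null` is treated as an error here to stay consistent with the mapping table later in this note.

```python
from typing import Any, List, Optional


def pick_value(values: List[Optional[Any]], method: str) -> Any:
    """Apply a CWL v1.2 pickValue method to the first level of a list.

    None stands in for CWL null (an assumption of this sketch).
    """
    non_null = [v for v in values if v is not None]

    if method == "first_non_null":
        if not non_null:
            raise ValueError("first_non_null: no non-null element")
        return non_null[0]

    if method == "the_only_non_null":
        # Strict: exactly one survivor is allowed.
        if len(non_null) != 1:
            raise ValueError(
                f"the_only_non_null: expected 1 non-null element, got {len(non_null)}"
            )
        return non_null[0]

    if method == "all_non_null":
        # The result is a list and may legitimately be empty.
        return non_null

    raise ValueError(f"unknown pickValue method: {method!r}")


# Two branches gated by complementary when: predicates; the skipped one is null.
survivor = pick_value([None, "branch_b_output"], "first_non_null")
```

Only the first level is filtered: a nested list such as `[[None], None]` keeps its inner `None` under `all_non_null`.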
Galaxy pick_value — what galaxy#22222 added
PR https://github.com/galaxyproject/galaxy/pull/22222 (merged 2026-03-31, author jmchilton, labels area/workflows, area/cwl).
- New workflow module type: `pick_value`. Registered in `lib/galaxy/workflow/modules.py` as `module_types["pick_value"] = PickValueModule` (~line 3108 at PR head). No DB migration: it reuses `WorkflowStep.type = "pick_value"` + `tool_state = {"mode": "...", "num_inputs": N}` + standard `WorkflowStepConnection` / `WorkflowOutput`.
- Four modes (Galaxy is a superset of CWL by one extra mode):

  | Galaxy mode | Maps to CWL | All-null behavior | Output shape |
  |---|---|---|---|
  | `first_non_null` | `first_non_null` | Fail workflow (`FailWorkflowEvaluation`) | scalar dataset |
  | `first_or_skip` | (no CWL equivalent) | Emit a “skipped” HDA (`extension=expression.json`, `blurb=skipped`) | scalar (or skipped) |
  | `the_only_non_null` | `the_only_non_null` | Fail; also fails when >1 non-null | scalar dataset |
  | `all_non_null` | `all_non_null` | Returns an HDCA `list` (may be empty) | `list` collection |

- Null detection is two-pronged (`PickValueModule._pick_from_replacements`, ~`modules.py:2060–2068`):
  - the `NO_REPLACEMENT` sentinel (no upstream connection, or upstream step was skipped via `when:`); or
  - an HDA with `extension == "expression.json"` and `blurb == "skipped"`.
- gxformat2 surface (from PR test fixtures, e.g. `lib/galaxy_test/workflow/pick_value_first_non_null_mapped.gxwf.yml`):

  ```yaml
  pick:
    type: pick_value
    in:
      input_0:
        source: branch_a/out_file1
      input_1:
        source: branch_b/out_file1
    state:
      mode: first_non_null
  ```

  Inputs are named `input_0` … `input_{N-1}`. The single output is named `output`. The editor exposes one extra empty terminal for grow-on-connect (PR body §“get_all_inputs()”).
- Mapping over collections is supported (`_execute_mapped`, `modules.py` ~2130). When inputs are collections, the module iterates per element. Output shape: `list` for the scalar modes, `list:list` for `all_non_null` (`modules.py` ~2211–2215).
- CWL importer not yet wired. The PR body explicitly says “Once CWL import integration is added, it will unblock 27+ CWL v1.2 conditional conformance tests.” The runtime is in `main` as of 2026-03-31; the importer mapping `pickValue → pick_value` is future work. Translator Molds cannot yet punt to Galaxy’s CWL importer: the gxformat2 file we emit must already contain the `pick_value` step.
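The mode table and the two-pronged null check can be modeled together in a small Python sketch. It is not Galaxy's implementation: `Hda`, `NO_REPLACEMENT`, and `FailWorkflowEvaluation` below are local stand-ins that borrow the names used in the PR, and `is_null` and `pick` are hypothetical helpers written only to make the described behavior concrete.

```python
from dataclasses import dataclass
from typing import Any, List

# Local stand-in for Galaxy's sentinel; the real object lives in Galaxy itself.
NO_REPLACEMENT = object()

MODES = ("first_non_null", "first_or_skip", "the_only_non_null", "all_non_null")


@dataclass
class Hda:
    """Minimal stand-in for a history dataset (not Galaxy's model class)."""
    name: str
    extension: str = "txt"
    blurb: str = ""


class FailWorkflowEvaluation(Exception):
    """Local stand-in for the exception named in the PR."""


def is_null(value: Any) -> bool:
    # Prong 1: no upstream connection, or the upstream step was skipped.
    if value is NO_REPLACEMENT:
        return True
    # Prong 2: a "skipped" placeholder dataset.
    return (
        isinstance(value, Hda)
        and value.extension == "expression.json"
        and value.blurb == "skipped"
    )


def pick(values: List[Any], mode: str) -> Any:
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    non_null = [v for v in values if not is_null(v)]
    if mode == "all_non_null":
        return non_null  # list result, may be empty
    if mode == "the_only_non_null" and len(non_null) > 1:
        raise FailWorkflowEvaluation("more than one non-null input")
    if non_null:
        return non_null[0]
    if mode == "first_or_skip":
        # All-null: propagate a skipped placeholder instead of failing.
        return Hda(name="skipped", extension="expression.json", blurb="skipped")
    raise FailWorkflowEvaluation("all inputs are null")
```

In this model, the result of `first_or_skip` over all-null inputs is itself null by the second prong, which is what lets a skip propagate through a downstream `pick_value` step.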
Mapping table
| CWL position | CWL mode | Galaxy translation |
|---|---|---|
| `WorkflowOutputParameter.outputSource: [a, b, …]` + `pickValue: first_non_null` | `first_non_null` | Insert a `type: pick_value` step with `state.mode: first_non_null`, `input_0: a`, `input_1: b`, …. Wire the pick step’s `output` into the workflow `outputs:` block. Direct mapping. |
| Same, `the_only_non_null` | `the_only_non_null` | Same shape; `mode: the_only_non_null`. Direct mapping. |
| Same, `all_non_null` | `all_non_null` | Same shape; `mode: all_non_null`. Output type changes: workflow output becomes a `list` HDCA. Surface to consumers. |
| `WorkflowStepInput.source: [a, b]` + `pickValue` | any | Insert a `pick_value` step upstream of the consuming step; rewire the step input to consume the pick step’s `output`. There is no inline-pick on a step’s input in gxformat2; it must be a real step. (Inference; PR adds a module, not an input-side attribute.) |
| `pickValue` over all-null inputs | `first_non_null` / `the_only_non_null` | Galaxy raises `FailWorkflowEvaluation`, which matches CWL “It is an error if there is no non-null element”. |
| Workflow author wants “skipped” rather than “fail” on all-null | (no CWL mode) | Use Galaxy-only `first_or_skip`. Flag in translator that this is a Galaxy extension, not CWL semantics. Round-tripping back to CWL would require codifying. |
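The direct-mapping rows can be sketched as a small builder that produces the step body in the shape shown by the PR fixtures. The function names `build_pick_step` and `pick_output_ref` are hypothetical, and the result is a plain dict that a translator would serialize to YAML however it already does.

```python
from typing import Dict, List

# Only modes that exist in CWL are accepted; first_or_skip is Galaxy-only.
CWL_MODES = ("first_non_null", "the_only_non_null", "all_non_null")


def build_pick_step(sources: List[str], mode: str) -> Dict:
    """Build a gxformat2-style pick_value step body from CWL sources."""
    if mode not in CWL_MODES:
        raise ValueError(f"not a CWL pickValue mode: {mode!r}")
    if not sources:
        raise ValueError("pickValue needs at least one source")
    return {
        "type": "pick_value",
        # Inputs are named input_0 ... input_{N-1}, in CWL declaration order.
        "in": {f"input_{i}": {"source": src} for i, src in enumerate(sources)},
        "state": {"mode": mode},
    }


def pick_output_ref(step_id: str) -> str:
    """Reference to the pick step's single output, which is named 'output'."""
    return f"{step_id}/output"


step = build_pick_step(["branch_a/out_file1", "branch_b/out_file1"], "first_non_null")
ref = pick_output_ref("pick")
```

The same builder serves the step-input row: the consuming step's input is rewired to `ref` instead of the original multi-valued `source`.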
Cases that don’t translate cleanly:
- `scatter` + `pickValue` on the same step input. Galaxy supports per-element mapping, but the composition is unusual in CWL; verify per-fixture (inference).
- Deeply nested `linkMerge: merge_nested` (>1 level). Galaxy `pick_value` operates at first level only (inference from MODES + execute logic); deeply nested cases may need a translator flag.
Translation guidance for cwl-summary-to-galaxy-data-flow
Given a summary-cwl.json whose graph.edges[].via contains "pickValue:<mode>" (or whose workflow_outputs[].output_source is multi-valued — when the typed-pick_value field is added to the schema, prefer that):
- Detect by edge fan-in pattern. Group edges by `to` (workflow-output id or step-input id) where `via` contains a `pickValue:*` token. Each group becomes one `pick_value` step.
- Emit a `pick_value` step per group. Suggested step id `pick_<output_id>`. Inputs `input_0`, `input_1`, … in CWL declaration order. `state.mode` is the CWL mode verbatim.
- Rewire the workflow output to read from the new pick step’s `output` rather than the multi-source `outputSource` array.
- When `pickValue` is on a step input, insert the `pick_value` step upstream of that step and rewire its input to a single source.
- Preserve upstream `when:` predicates unchanged. `pick_value` consumes whatever each branch emits (real dataset or skipped HDA) and decides at execution.
- Type-shape warning for `all_non_null`: each source is a scalar `T`, but the picked result is a list (per the CWL spec), which becomes a Galaxy `list` HDCA. Surface in the brief.
- Do not invent a “pick non-empty” custom tool. Use the native `pick_value` module; prior Foundry recommendations to synthesize a picker step are superseded.
- Two-sibling-workflow fallback stays valid for readability when the entire DAG splits cleanly on a single mode predicate (IWC convention: see `EBI-Metagenomics/pipeline-v5` ↔ `amplicon/amplicon-mgnify/mgnify-amplicon-pipeline-v5-quality-control-paired-end/…-single-end`). With #22222 merged, the single-workflow `pick_value` translation is now also viable; the template Mold may choose either.
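The detect-and-emit steps above can be sketched as follows. The edge fields `to` and `via` come from the guidance; the `from` field name and the function `plan_pick_steps` are assumptions made for illustration, so adjust them to the actual `summary-cwl.json` schema.

```python
from typing import Dict, List

PICK_PREFIX = "pickValue:"


def plan_pick_steps(edges: List[Dict]) -> Dict[str, Dict]:
    """Group pickValue edges by target and emit one pick_value step per group.

    Assumes each edge looks like {"from": ..., "to": ..., "via": [...]};
    "from" is a hypothetical field name.
    """
    groups: Dict[str, Dict] = {}
    for edge in edges:
        modes = [
            token[len(PICK_PREFIX):]
            for token in edge.get("via", [])
            if token.startswith(PICK_PREFIX)
        ]
        if not modes:
            continue  # ordinary edge, no fan-in pick
        group = groups.setdefault(edge["to"], {"mode": modes[0], "sources": []})
        if group["mode"] != modes[0]:
            raise ValueError(f"conflicting pickValue modes for {edge['to']!r}")
        # Edge order is taken as CWL declaration order.
        group["sources"].append(edge["from"])

    steps: Dict[str, Dict] = {}
    for target, group in groups.items():
        steps[f"pick_{target}"] = {
            "type": "pick_value",
            "in": {
                f"input_{i}": {"source": src}
                for i, src in enumerate(group["sources"])
            },
            "state": {"mode": group["mode"]},  # CWL mode verbatim
        }
    return steps


edges = [
    {"from": "branch_a/out", "to": "result", "via": ["pickValue:first_non_null"]},
    {"from": "branch_b/out", "to": "result", "via": ["pickValue:first_non_null"]},
    {"from": "qc/report", "to": "report", "via": []},
]
plan = plan_pick_steps(edges)
```

After planning, the workflow output `result` would read from `pick_result/output`; the `when:` predicates on `branch_a` and `branch_b` are left untouched.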
Risks, gotchas, open issues
- Importer not wired (PR-observed). Translators must emit the `pick_value` step themselves; Galaxy’s CWL importer won’t convert `pickValue` automatically yet.
- No CWL-direct mode for `first_or_skip`. Translator should never silently emit it from CWL input, only when the human author asks for skip-propagation semantics.
- `the_only_non_null` is strict. Fails on >1 non-null too. Don’t swap modes during translation.
- Empty `all_non_null` result is legal. Downstream tools must accept empty collections.
- `pick_value` is a step, not a step-input attribute. Translator can’t represent a CWL inline `pickValue` on a step input without adding a real Galaxy step, so step count increases vs CWL.
- Mapped-collection output shape. `all_non_null` over mapped inputs yields `list:list`. Consumers expecting flat `list` will break.
- Galaxy version floor. This module landed on `main` 2026-03-31. Workflows using `type: pick_value` will not import on older Galaxy releases. Translator should emit a metadata note / required-version hint when the draft uses `pick_value`.
- `summary-cwl` schema gap. The Foundry’s `summary-cwl.json` schema currently encodes workflow-level `pickValue` only via edge `via` markers; first-class `pick_value` on `WorkflowOutput` would make detection trivial. Open work tracked in `content/molds/summarize-cwl/refinements/2026-05-11-mgnify-seqprep-subwf.md`.
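Several of these gotchas are mechanical enough to check in the translator. The sketch below is one possible shape for such a check; `output_shape`, `translation_notes`, and the `author_requested_skip` flag are hypothetical names, and the note wording is illustrative.

```python
from typing import List

SCALAR_MODES = ("first_non_null", "first_or_skip", "the_only_non_null")


def output_shape(mode: str, mapped: bool) -> str:
    """Output shape per mode, with and without mapping over collections."""
    if mode == "all_non_null":
        return "list:list" if mapped else "list"
    if mode in SCALAR_MODES:
        return "list" if mapped else "dataset"
    raise ValueError(f"unknown mode: {mode!r}")


def translation_notes(
    mode: str, mapped: bool = False, author_requested_skip: bool = False
) -> List[str]:
    """Collect the warnings a translator should surface for one pick step."""
    if mode == "first_or_skip" and not author_requested_skip:
        # Never emit the Galaxy-only mode silently from CWL input.
        raise ValueError("first_or_skip has no CWL equivalent; needs author opt-in")

    notes = ["Requires a Galaxy release that includes pick_value (galaxy#22222)."]
    shape = output_shape(mode, mapped)
    if mode == "all_non_null":
        notes.append(f"Output shape is {shape}; empty results are legal.")
    if mode == "the_only_non_null":
        notes.append("Fails when more than one input is non-null.")
    if mode == "first_or_skip":
        notes.append("Galaxy extension; does not round-trip to CWL.")
    return notes


notes = translation_notes("all_non_null", mapped=True)
```

Raising on an unrequested `first_or_skip` is a design choice of this sketch; a translator could equally downgrade it to a warning, as long as it is not emitted silently.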
Citations
- PR metadata + body: https://github.com/galaxyproject/galaxy/pull/22222.
- `PickValueModule` implementation: `lib/galaxy/workflow/modules.py` lines 1951–2128 at PR head (mode list, error semantics, null detection, mapped execution).
- gxformat2 surface examples: `lib/galaxy_test/workflow/pick_value_first_non_null_mapped.gxwf.yml`, `pick_value_all_non_null_mapped.gxwf.yml`, `pick_value_skip_pja.gxwf.yml` at PR head.
- CWL spec: `PickValueMethod` doc strings in https://raw.githubusercontent.com/common-workflow-language/cwl-v1.2/main/Workflow.yml; rendered at https://www.commonwl.org/v1.2/Workflow.html#PickValueMethod.
- IWC fallback exemplar: `amplicon/amplicon-mgnify/mgnify-amplicon-pipeline-v5-quality-control-paired-end/…-single-end` (sibling-workflows convention).
Evidence quality
- PR-observed (concrete): module type, modes, gxformat2 surface, mapped-collection output shape, importer not yet wired, version floor.
- CWL-spec (concrete): mode definitions and error conditions; placement on `WorkflowStepInput` and `WorkflowOutputParameter`.
- Inference (marked inline): `pickValue` composing with `when:`; step-input-side translation requiring an upstream step; `scatter` + `pickValue` rarity; deeply-nested `linkMerge` interaction.