Claude skill · cast cwl-summary-to-galaxy-data-flow
Translate a CWL summary into a Galaxy data-flow design brief.
Install
/plugin marketplace add jmchilton/foundry
/plugin install foundry-skills@galaxy-workflow-foundry
Then invoke as:
/foundry-skills:cwl-summary-to-galaxy-data-flow
Skill Bundle / packaged cast
- attached files: 8
- upfront: 4
- on demand: 4
- cast rev: n/a
- validated: 0
Produces: 1 artifact.
Consumes: 2 artifacts.
Artifact Contract / skill handoff
Produces
cwl-galaxy-data-flow
Reviewable Markdown brief: abstract topology, Galaxy collection semantics, placeholder transformations, unresolved Galaxy tool needs.
markdown · cwl-galaxy-data-flow.md
Raw artifact contract
{
  "id": "cwl-galaxy-data-flow",
  "kind": "markdown",
  "default_filename": "cwl-galaxy-data-flow.md",
  "description": "Reviewable Markdown brief: abstract topology, Galaxy collection semantics, placeholder transformations, unresolved Galaxy tool needs."
}
Consumes
summary-cwl
Structured CWL summary emitted by [[summarize-cwl]]; consumed alongside the Galaxy interface brief.
Raw artifact contract
{
  "id": "summary-cwl",
  "description": "Structured CWL summary emitted by [[summarize-cwl]]; consumed alongside the Galaxy interface brief.",
  "inherited_schema": "[[summary-cwl]]",
  "producers": [
    "summarize-cwl"
  ]
}
cwl-galaxy-interface
Preceding Galaxy interface brief from [[cwl-summary-to-galaxy-interface]] that pins inputs, outputs, and labels.
Raw artifact contract
{
  "id": "cwl-galaxy-interface",
  "description": "Preceding Galaxy interface brief from [[cwl-summary-to-galaxy-interface]] that pins inputs, outputs, and labels.",
  "producers": [
    "cwl-summary-to-galaxy-interface"
  ]
}
Attached Files / runtime references
Load upfront
Use CWL's native graph and mark only the features that need Galaxy reinterpretation.
upfront runtime verbatim hypothesis deterministic 5.5 KB
- bundle: references/notes/component-cwl-workflow-anatomy.md
- source: content/research/component-cwl-workflow-anatomy.md
Preview md
---
type: research
subtype: component
title: "CWL workflow anatomy"
tags:
- research/component
- source/cwl
status: draft
created: 2026-05-10
revised: 2026-05-10
revision: 1
ai_generated: true
related_notes:
- "[[summary-cwl]]"
- "[[cwl-v1.2-schemas]]"
- "[[galaxy-collection-semantics]]"
related_molds:
- "[[summarize-cwl]]"
- "[[cwl-summary-to-galaxy-interface]]"
- "[[cwl-summary-to-galaxy-data-flow]]"
- "[[cwl-summary-to-galaxy-template]]"
sources:
- "https://www.commonwl.org/v1.2/Workflow.html"
- "https://cwltool.readthedocs.io/en/stable/"
- "https://github.com/common-workflow-language/cwl-utils#normalize-a-cwl-document"
- "https://pypi.org/project/cwl-utils/"
- "https://github.com/common-workflow-language/cwldep"
summary: "CWL structure relevant to summarize-cwl: normalized documents, steps, scatter, conditionals, requirements, and dependency handling."
---
# CWL Workflow Anatomy
CWL is a structured workflow language, not a pipeline framework that must be inferred from ecosystem conventions. The `summarize-cwl` Mold should therefore start from CWL's own validated object model and avoid recreating the heavy Nextflow extraction stack.
## Normalization Posture
Use `cwltool --validate` as the first gate. If validation fails, the summary should emit provenance plus validation diagnostics and stop before producing downstream-looking graph claims.
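A minimal sketch of that gate, assuming `cwltool` is on `PATH`. The function name, the returned field names, and the injectable `run` parameter are illustrative and are not part of `summarize-cwl`.

```python
import subprocess
from typing import Callable, List


def validation_gate(
    entrypoint: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> dict:
    """Run `cwltool --validate` and report whether extraction may continue.

    The returned keys are hypothetical; adapt them to the real summary layout.
    """
    cmd: List[str] = ["cwltool", "--validate", entrypoint]
    result = run(cmd, capture_output=True, text=True)
    # Keep both streams so no diagnostic is lost, dropping blank lines.
    lines = (result.stderr or "").splitlines() + (result.stdout or "").splitlines()
    return {
        "entrypoint": entrypoint,               # provenance
        "validated": result.returncode == 0,    # False means: stop here
        "diagnostics": [line for line in lines if line.strip()],
    }
```

A caller would check `validated` and, when it is false, emit only provenance and `diagnostics` rather than any graph claims.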
Use `cwl-normalizer` from `cwl-utils` as the default normalization surface. The cwl-utils README describes it as producing JSON CWL documents with dependencies packed together, upgrading to CWL v1.2 as needed, and optionally refactoring CWL expressions into separate steps. This is the right handoff for `summarize-cwl`: structured enough for extraction, still source-faithful, and not a Galaxy design
...
Default reference for translating CWL when:/pickValue branching: pick among `paired_or_unpaired` collection input, native `pick_value` workflow step, or sibling workflows per mode.
upfront runtime verbatim corpus-observed deterministic 7.7 KB
- bundle: references/notes/cwl-when-pickvalue-to-galaxy-branching.md
- source: content/research/cwl-when-pickvalue-to-galaxy-branching.md
Preview md
---
type: research
subtype: design-spec
title: "CWL when:/pickValue → Galaxy branching translation"
tags:
- research/design-spec
- source/cwl
- target/galaxy
status: draft
created: 2026-05-11
revised: 2026-05-11
revision: 1
ai_generated: true
related_notes:
- "[[cwl-pickvalue-to-galaxy]]"
- "[[galaxy-paired-or-unpaired-collections]]"
- "[[galaxy-collection-semantics]]"
- "[[component-cwl-workflow-anatomy]]"
- "[[galaxy-data-flow-draft-contract]]"
related_molds:
- "[[cwl-summary-to-galaxy-interface]]"
- "[[cwl-summary-to-galaxy-data-flow]]"
- "[[cwl-summary-to-galaxy-template]]"
- "[[compare-against-iwc-exemplar]]"
summary: "CWL `when:`/`pickValue` → Galaxy. Three honest translations (paired_or_unpaired input, native pick_value step, sibling workflows) plus how to pick among them."
---
# CWL `when:`/`pickValue` → Galaxy branching translation
Audience: a Mold author looking at a `summary-cwl.json` whose steps carry `when:` predicates and/or whose workflow outputs use `pickValue`, deciding which Galaxy translation to recommend.
## The three honest translations
CWL has two related branching mechanisms with no 1:1 gxformat2 equivalent (until galaxy#22222 — see `cwl-pickvalue-to-galaxy`):
- **`when:` on a step** — execute conditionally on a JS predicate.
- **`pickValue:` on a step input or workflow output** — fan in N candidate sources and pick `first_non_null` / `the_only_non_null` / `all_non_null`.
Three Galaxy-idiomatic translations are available; each is honest for a different source shape.
### Translation A — `paired_or_unpaired` collection (preferred when the discriminator is paired-vs-single)
When the CWL `when:` predicates discriminate the **paired-vs-single mode of read inputs** (the seqprep-subwf pattern: `single_reads: File?` trigger
...
Keep the data-flow brief separate from gxformat2 templating and concrete step implementation.
upfront runtime verbatim hypothesis deterministic 6.4 KB
- bundle: references/notes/galaxy-data-flow-draft-contract.md
- source: content/research/galaxy-data-flow-draft-contract.md
Preview md
---
type: research
subtype: design-spec
title: "Galaxy data-flow draft contract"
tags:
- research/design-spec
- target/galaxy
status: draft
created: 2026-05-02
revised: 2026-05-03
revision: 2
ai_generated: true
related_notes:
- "[[nextflow-to-galaxy-channel-shape-mapping]]"
- "[[nextflow-operators-to-galaxy-collection-recipes]]"
related_molds:
- "[[nextflow-summary-to-galaxy-data-flow]]"
- "[[cwl-summary-to-galaxy-data-flow]]"
- "[[paper-summary-to-galaxy-design]]"
- "[[nextflow-summary-to-galaxy-template]]"
- "[[cwl-summary-to-galaxy-template]]"
- "[[paper-summary-to-galaxy-template]]"
- "[[compare-against-iwc-exemplar]]"
sources:
- "https://github.com/jmchilton/foundry/issues/54"
summary: "Defines the proposed boundary between Galaxy data-flow drafts, gxformat2 templates, and concrete step implementation."
---
# Galaxy Data-Flow Draft Contract
This is an architectural contract, not a schema. Evidence is strongest for Mold and Pipeline boundaries. Proposed fields are speculative until exercised by two or three worked translations.
## Boundary
The data-flow draft owns a target-shaped abstract DAG for Galaxy. It should not be valid `gxformat2` and should not resolve exact Tool Shed tools.
Data-flow draft owns:
- Galaxy-facing workflow inputs and outputs.
- Abstract nodes, edges, branches, collection mapping, collection reduction, and placeholder transformations.
- Input/output shape decisions such as `File`, `list`, `paired`, `list:paired`, or `list:list`.
- Conceptual Galaxy idioms: map-over, reduction, Apply Rules, collection cleanup, identifier synchronization, tabular bridge.
- Abstract unresolved tool needs with input and output shapes.
- Confidence and rationale on inferred nodes, edges, transforms, and tool needs.
The Galaxy template
...
schema · summary-cwl · packaged
Read CWL step graph, edge markers, scatter, conditionals, secondary files, and tool requirements while drafting Galaxy-facing data flow.
upfront runtime verbatim cast-validated deterministic 19.2 KB
- bundle: references/schemas/summary-cwl.schema.json
- source: package://@galaxy-foundry/summary-cwl-schema#summaryCwlSchema
Preview json
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://galaxyproject.org/foundry/schemas/summary-cwl.schema.json",
"$comment": "Canonical source: packages/summary-cwl-schema/src/summary-cwl.schema.json in jmchilton/foundry. Mold frontmatter cites this schema via [[summary-cwl]] wiki-links; the cast pipeline imports the `summaryCwlSchema` runtime export and serializes it into cast bundles.",
"title": "CWL Workflow Summary",
"description": "Structured per-source summary emitted by the summarize-cwl Mold. CWL is already a typed workflow language, so this schema records validated and normalized workflow/tool structure rather than inferred pipeline semantics.",
"type": "object",
"additionalProperties": false,
"required": [
"summary_version",
"source",
"documents",
"workflow_inputs",
"workflow_outputs",
"steps",
"tools",
"graph",
"tests",
"warnings"
],
"properties": {
"summary_version": {
"type": "string",
"enum": [
"1"
],
"description": "Summary schema major version."
},
"source": {
"$ref": "#/$defs/SourceRecord"
},
"documents": {
"$ref": "#/$defs/DocumentSet"
},
"workflow_inputs": {
"type": "array",
"items": {
"$ref": "#/$defs/WorkflowInput"
}
},
"workflow_outputs": {
"type": "array",
"items": {
"$ref": "#/$defs/WorkflowOutput"
}
},
"steps": {
"type": "array",
"items": {
"$ref": "#/$defs/WorkflowStep"
}
},
"tools": {
"type": "array",
"items": {
"$ref": "#/$defs/CommandLineTool"
}
},
"graph": {
"$ref": "#/$defs/WorkflowGraph"
},
"tests": {
"type": "array",
"items": {
"$ref": "#/$defs/TestCase"
}
},
"warnings": {
"type": "array",
"items": {
"$ref": "#/$defs/Warning"
}
}
},
"$defs": {
"SourceRecord": {
"type": "object",
"additionalProperties": false,
"required": [
"ecosystem",
"workflow",
"url",
"version",
"license",
"slug",
"cwl_version",
"entrypoint"
],
"properties": {
"ecosystem": {
"type": "string",
"enum": [
"cwl"
],
"description": "Source ecosy
...
Load on demand
Ground collection reshape, relabel, cleanup, and map-over choices in corpus-observed Galaxy recipes.
Trigger: When CWL scatter, arrays, nested arrays, records, or secondary-file contracts require explicit Galaxy collection operations.
on-demand runtime verbatim corpus-observed deterministic 4.4 KB
- bundle: references/patterns/galaxy-collection-patterns.md
- source: content/patterns/galaxy-collection-patterns.md
Preview md
---
type: pattern
pattern_kind: moc
evidence: corpus-observed
title: "Galaxy: collection patterns"
aliases:
- "Galaxy collection pattern MOC"
- "collection transformation patterns"
- "IWC collection pattern map"
tags:
- pattern
- target/galaxy
- topic/galaxy-transform
- topic/collection-transform
status: draft
created: 2026-05-02
revised: 2026-05-02
revision: 1
ai_generated: true
summary: "Use this MOC to choose corpus-grounded Galaxy collection transformation patterns."
related_notes:
- "[[iwc-transformations-survey]]"
- "[[iwc-conditionals-survey]]"
related_patterns:
- "[[manifest-to-mapped-collection-lifecycle]]"
- "[[cleanup-sync-and-publish-nonempty-results]]"
- "[[reshape-relabel-remap-by-collection-axis]]"
- "[[fan-in-bundle-consume-and-flatten]]"
- "[[collection-cleanup-after-mapover-failure]]"
- "[[sync-collections-by-identifier]]"
- "[[harmonize-by-sortlist-from-identifiers]]"
- "[[regex-relabel-via-tabular]]"
- "[[relabel-via-rules-and-find-replace]]"
- "[[collection-swap-nesting-with-apply-rules]]"
- "[[collection-split-identifier-via-rules]]"
- "[[collection-build-list-paired-with-apply-rules]]"
- "[[tabular-to-collection-by-row]]"
- "[[tabular-concatenate-collection-to-table]]"
- "[[tabular-pivot-collection-to-wide]]"
related_molds:
- "[[implement-galaxy-tool-step]]"
- "[[nextflow-summary-to-galaxy-data-flow]]"
- "[[cwl-summary-to-galaxy-data-flow]]"
- "[[nextflow-summary-to-galaxy-template]]"
- "[[cwl-summary-to-galaxy-template]]"
- "[[paper-summary-to-galaxy-template]]"
- "[[compare-against-iwc-exemplar]]"
---
# Galaxy: collection patterns
This is the runtime-facing map for Galaxy collection transformation choices. Use it before loading raw survey notes. The survey remains evidence backing;
...
Map CWL pickValue (first_non_null / the_only_non_null / all_non_null) on workflow outputs or step inputs into Galaxy's native `pick_value` workflow module added by galaxy#22222.
Trigger: When any summary-cwl edge `via` contains a `pickValue:*` marker, OR any workflow_outputs[].output_source is multi-valued with pickValue, OR any steps[].in[].pick_value is non-null in the source workflow or referenced subworkflows.
on-demand runtime verbatim corpus-observed deterministic 10.9 KB
- bundle: references/notes/cwl-pickvalue-to-galaxy.md
- source: content/research/cwl-pickvalue-to-galaxy.md
Preview md
---
type: research
subtype: component
title: "CWL pickValue → Galaxy pick_value (post galaxy#22222)"
tags:
- research/component
- source/cwl
- target/galaxy
status: draft
created: 2026-05-11
revised: 2026-05-11
revision: 1
ai_generated: true
related_notes:
- "[[component-cwl-workflow-anatomy]]"
- "[[galaxy-data-flow-draft-contract]]"
- "[[galaxy-workflow-draft-format]]"
related_molds:
- "[[cwl-summary-to-galaxy-data-flow]]"
- "[[cwl-summary-to-galaxy-template]]"
summary: "CWL `pickValue` (first_non_null / the_only_non_null / all_non_null) → Galaxy's native `pick_value` workflow step added by galaxyproject/galaxy#22222."
---
# CWL `pickValue` → Galaxy `pick_value`
Audience: a Mold author who just saw a `pickValue:*` marker in a `summary-cwl.json` edge `via:` array (or a `WorkflowOutputParameter.output_source` multi-value carrying a `pickValue` hint) and needs to emit gxformat2.
## CWL `pickValue` — canonical semantics
Source: CWL v1.2 schema `Workflow.yml` (`PickValueMethod`) and the rendered spec at <https://www.commonwl.org/v1.2/Workflow.html#PickValueMethod>.
- **`first_non_null`** — "For the first level of a list input, pick the first non-null element. The result is a scalar. **It is an error if there is no non-null element.**"
- **`the_only_non_null`** — "For the first level of a list input, pick the single non-null element. The result is a scalar. **It is an error if there is more than one non-null element.**"
- **`all_non_null`** — "For the first level of a list input, pick all non-null values. **The result is a list, which may be empty.**"
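The three methods can be sketched in a few lines of Python, with `None` standing in for CWL `null`. This is an illustration of the quoted semantics, not code from Galaxy or cwltool. Treating an all-null input to `the_only_non_null` as an error is an assumption that goes beyond the sentence quoted above.

```python
from typing import Any, List, Optional


def pick_value(candidates: List[Optional[Any]], method: str) -> Any:
    """Apply a CWL pickValue method to the first level of a list."""
    # Only the first level is inspected: a nested [None] counts as non-null.
    non_null = [c for c in candidates if c is not None]
    if method == "first_non_null":
        if not non_null:
            raise ValueError("first_non_null: no non-null element")
        return non_null[0]
    if method == "the_only_non_null":
        # Assumption: zero non-null elements is rejected as well as two or more.
        if len(non_null) != 1:
            raise ValueError(
                f"the_only_non_null: expected exactly 1 non-null element, got {len(non_null)}"
            )
        return non_null[0]
    if method == "all_non_null":
        return non_null  # may be empty
    raise ValueError(f"unknown pickValue method: {method}")
```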
Placement: declared on **both** `WorkflowStepInput` and `WorkflowOutputParameter` with identical semantics. Operates on the array produced when `source:` / `outputSource:` is multi-valued. First level only
...
Translate CWL arrays, records, scatter, and secondary-file shapes into Galaxy dataset and collection semantics.
Trigger: When CWL input/output or step wiring implies Galaxy collections, map-over, reduction, or shape changes.
on-demand runtime verbatim corpus-observed deterministic 1.8 KB
- bundle: references/notes/galaxy-collection-semantics.md
- source: content/research/galaxy-collection-semantics.md
Preview md
---
type: research
subtype: component
title: "Galaxy collection semantics"
tags:
- research/component
- target/galaxy
status: draft
created: 2026-04-30
revised: 2026-05-05
revision: 3
ai_generated: false
related_notes:
- "[[galaxy-xsd]]"
- "[[galaxy-collection-tools]]"
- "[[galaxy-apply-rules-dsl]]"
- "[[nextflow-to-galaxy-channel-shape-mapping]]"
- "[[nextflow-operators-to-galaxy-collection-recipes]]"
- "[[galaxy-tool-job-failure-reference]]"
- "[[galaxy-workflow-invocation-failure-reference]]"
- "[[iwc-transformations-survey]]"
sources:
- "https://github.com/galaxyproject/galaxy/blob/7765fae934fbfdee77e3be5f5b235e43735273ae/lib/galaxy/model/dataset_collections/types/collection_semantics.yml"
summary: "Vendored formal spec of Galaxy dataset-collection mapping/reduction semantics, with labeled examples and pinned test references."
---
> **Vendored from upstream**, pinned at SHA `7765fae`. Two files live next to this note:
>
> - `galaxy-collection-semantics.yml` — the structured source. **Agents and casting should consume this.** It carries the `tests:` blocks that pin concrete Galaxy test names; the rendered upstream view drops them.
> - `galaxy-collection-semantics.upstream.myst` — Galaxy's auto-generated MyST/LaTeX rendering of the YAML, vendored only so the human view below has something to render. Sync is manual.
>
> **When to consult:** authoring or reasoning about Molds and patterns that touch `data_collection` inputs, map-over / reduction shape changes, sub-collection mapping, `paired_or_unpaired`, or `sample_sheet`.
```vendored-myst
file: galaxy-collection-semantics.upstream.myst
source: https://github.com/galaxyproject/galaxy/blob/7765fae934fbfdee77e3be5f5b235e43735273ae/doc/source/dev/collection_semantics.md
sha: 7765fae
```
When the interface brief adopted a `paired_or_unpaired` shape, model inner-tool branching via `has_single_item` semantics instead of a Galaxy-level mode switch.
Trigger: When the preceding cwl-galaxy-interface brief uses `paired_or_unpaired` (or `list:paired_or_unpaired`) as a workflow input, OR the data-flow brief is considering it as an option.
on-demand runtime verbatim corpus-observed deterministic 8.3 KB
- bundle: references/notes/galaxy-paired-or-unpaired-collections.md
- source: content/research/galaxy-paired-or-unpaired-collections.md
Preview md
---
type: research
subtype: component
title: "Galaxy paired_or_unpaired collection type"
tags:
- research/component
- target/galaxy
status: draft
created: 2026-05-11
revised: 2026-05-11
revision: 1
ai_generated: true
related_notes:
- "[[galaxy-collection-semantics]]"
- "[[component-cwl-workflow-anatomy]]"
related_molds:
- "[[cwl-summary-to-galaxy-interface]]"
- "[[cwl-summary-to-galaxy-data-flow]]"
- "[[nextflow-summary-to-galaxy-interface]]"
summary: "Galaxy's `paired_or_unpaired` collection type: discriminated-union shape for paired-or-single reads, no workflow-level mode switch needed. Galaxy PR #19377."
---
# Galaxy `paired_or_unpaired` collections
Audience: a Mold author shaping a Galaxy workflow interface from an upstream (CWL / Nextflow / paper) source whose reads can be paired-end *or* single-end *or* a mixed batch of both.
## The shape
`paired_or_unpaired` is a Galaxy collection type modeling a **discriminated union of 1 or 2 elements**:
- **Unpaired variant** — one element with identifier `unpaired`.
- **Paired variant** — two elements with identifiers `forward` and `reverse`.
`list:paired_or_unpaired` lifts the same shape to a *heterogeneous* batch where some samples are paired and some are single-end — a representation that did not exist before this type. A `list:paired` forces every sample to be paired; a plain `list` of flat datasets loses pairing structure.
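As a rough illustration of the union, the sketch below classifies a collection by its element identifiers and then splits a mixed batch. The function names and the plain-dict representation of a batch are hypothetical and do not reflect Galaxy's internal model.

```python
from typing import Dict, Iterable, List


def classify_variant(identifiers: Iterable[str]) -> str:
    """Return 'unpaired' or 'paired' for one paired_or_unpaired collection."""
    ids = list(identifiers)
    if ids == ["unpaired"]:
        return "unpaired"
    if len(ids) == 2 and set(ids) == {"forward", "reverse"}:
        return "paired"
    raise ValueError(f"not a paired_or_unpaired shape: {ids}")


def split_batch(samples: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Group sample names of a list:paired_or_unpaired-like batch by variant."""
    groups: Dict[str, List[str]] = {"paired": [], "unpaired": []}
    for sample, ids in samples.items():
        groups[classify_variant(ids)].append(sample)
    return groups
```

For example, a batch holding `s1` with `forward`/`reverse` and `s2` with `unpaired` yields one sample in each group, which is the heterogeneous case a `list:paired` cannot represent.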
`paired_or_unpaired` may appear inside nested types (`list:paired_or_unpaired`, `list:list:paired_or_unpaired`) but **only at the deepest (innermost) rank**, because the subtyping logic is implemented at the suffix level. See "Limitation: only deepest rank" below.
## When to reach for it (decision rule for translators)
Reach for `paired_or_unpaired` when the
...
SKILL.md
# cwl-summary-to-galaxy-data-flow
Follow the procedure below and use the artifact/reference sections as the runtime contract.
## When To Use
- Translate a CWL summary into a Galaxy data-flow design brief.
## Inputs
- Read artifact `summary-cwl`. Schema: summary-cwl. Produced by `summarize-cwl`. Structured CWL summary emitted by summarize-cwl; consumed alongside the Galaxy interface brief.
- Read artifact `cwl-galaxy-interface`. Produced by `cwl-summary-to-galaxy-interface`. Preceding Galaxy interface brief from cwl-summary-to-galaxy-interface that pins inputs, outputs, and labels.
## Outputs
- Write artifact `cwl-galaxy-data-flow` as `cwl-galaxy-data-flow.md`. Format: `markdown`. Reviewable Markdown brief: abstract topology, Galaxy collection semantics, placeholder transformations, unresolved Galaxy tool needs.
## Load Upfront
- `references/notes/component-cwl-workflow-anatomy.md`: Research note copied verbatim into the bundle. Use CWL's native graph and mark only the features that need Galaxy reinterpretation.
- `references/notes/cwl-when-pickvalue-to-galaxy-branching.md`: Research note copied verbatim into the bundle. Default reference for translating CWL when:/pickValue branching: pick among `paired_or_unpaired` collection input, native `pick_value` workflow step, or sibling workflows per mode.
- `references/notes/galaxy-data-flow-draft-contract.md`: Research note copied verbatim into the bundle. Keep the data-flow brief separate from gxformat2 templating and concrete step implementation.
- `references/schemas/summary-cwl.schema.json`: Schema file copied verbatim into the bundle. Read CWL step graph, edge markers, scatter, conditionals, secondary files, and tool requirements while drafting Galaxy-facing data flow.
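Before reading the graph, a quick shape check on the loaded summary can catch a malformed artifact early. The sketch below mirrors only the top-level `required` list, `additionalProperties: false`, and the `summary_version` enum from the bundled schema. It is not a substitute for validating against the schema itself, and the function name is hypothetical.

```python
from typing import List

# Top-level keys taken from the schema's `required` list.
REQUIRED_KEYS = (
    "summary_version", "source", "documents", "workflow_inputs",
    "workflow_outputs", "steps", "tools", "graph", "tests", "warnings",
)


def check_summary_shape(summary: dict) -> List[str]:
    """Return a list of top-level problems; an empty list means the shape looks right."""
    problems = [f"missing required key: {k}" for k in REQUIRED_KEYS if k not in summary]
    # The schema forbids additional top-level properties.
    problems += [f"unexpected key: {k}" for k in summary if k not in REQUIRED_KEYS]
    if "summary_version" in summary and summary["summary_version"] != "1":
        problems.append(f"unsupported summary_version: {summary['summary_version']!r}")
    return problems
```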
## Load On Demand
- `references/patterns/galaxy-collection-patterns.md`: Pattern note copied verbatim into the bundle. Ground collection reshape, relabel, cleanup, and map-over choices in corpus-observed Galaxy recipes. Use when: CWL scatter, arrays, nested arrays, records, or secondary-file contracts require explicit Galaxy collection operations.
- `references/notes/cwl-pickvalue-to-galaxy.md`: Research note copied verbatim into the bundle. Map CWL pickValue (first_non_null / the_only_non_null / all_non_null) on workflow outputs or step inputs into Galaxy's native `pick_value` workflow module added by galaxy#22222. Use when: any summary-cwl edge `via` contains a `pickValue:*` marker, OR any workflow_outputs[].output_source is multi-valued with pickValue, OR any steps[].in[].pick_value is non-null in the source workflow or referenced subworkflows.
- `references/notes/galaxy-collection-semantics.md`: Research note copied verbatim into the bundle. Translate CWL arrays, records, scatter, and secondary-file shapes into Galaxy dataset and collection semantics. Use when: CWL input/output or step wiring implies Galaxy collections, map-over, reduction, or shape changes.
- `references/notes/galaxy-paired-or-unpaired-collections.md`: Research note copied verbatim into the bundle. When the interface brief adopted a `paired_or_unpaired` shape, model inner-tool branching via `has_single_item` semantics instead of a Galaxy-level mode switch. Use when: the preceding cwl-galaxy-interface brief uses `paired_or_unpaired` (or `list:paired_or_unpaired`) as a workflow input, OR the data-flow brief is considering it as an option.
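The `pickValue` trigger in the list above can be read as a three-way check over the summary. The sketch below is one possible reading. It assumes `steps[].in` is a list of objects, that workflow outputs carry a `pick_value` field next to `output_source`, and it covers only the top-level workflow, not referenced subworkflows.

```python
def needs_pickvalue_note(summary: dict) -> bool:
    """Return True when cwl-pickvalue-to-galaxy.md should be loaded."""
    # 1. Any edge whose `via` array carries a `pickValue:*` marker.
    for edge in summary.get("graph", {}).get("edges", []):
        if any(str(m).startswith("pickValue:") for m in edge.get("via", [])):
            return True
    # 2. Any multi-valued workflow output source with pickValue
    #    (the `pick_value` field name here is an assumption).
    for out in summary.get("workflow_outputs", []):
        src = out.get("output_source")
        if isinstance(src, list) and len(src) > 1 and out.get("pick_value"):
            return True
    # 3. Any step input with a non-null pick_value.
    for step in summary.get("steps", []):
        for step_input in step.get("in", []):
            if step_input.get("pick_value") is not None:
                return True
    return False
```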
## Validation
- None declared.
## Procedure
Read a CWL summary plus the preceding Galaxy interface brief and emit a reviewable Markdown data-flow brief. Capture abstract topology, Galaxy collection semantics, placeholder transformations, unresolved Galaxy tool needs, confidence, and open questions.
CWL already carries structured workflow shape, so this skill should be lighter than nextflow-summary-to-galaxy-data-flow.
Start from `summary-cwl.graph.edges[]` instead of rediscovering the DAG. The main work is translation pressure: CWL scatter into Galaxy map-over or collection steps, `linkMerge`/`pickValue` into explicit fan-in choices, secondary files into output contracts, and `valueFrom`/`when` into reviewable placeholders when Galaxy cannot express them directly.
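One way to picture that pass is a walk over the edges that tags each marker with the kind of Galaxy decision it forces. In the sketch below only the `pickValue:*` marker spelling comes from the bundled notes. The other marker names, and the `source` and `target` edge fields, are placeholders to be replaced with whatever the summary schema actually defines.

```python
from typing import List, Tuple

# Marker prefix -> translation pressure named in the procedure.
# Every prefix except "pickValue" is a hypothetical spelling.
PRESSURES = {
    "scatter": "Galaxy map-over or collection step",
    "linkMerge": "explicit fan-in choice",
    "pickValue": "explicit fan-in choice",
    "secondaryFiles": "output contract",
    "valueFrom": "reviewable placeholder",
    "when": "reviewable placeholder",
}


def flag_edges(summary: dict) -> List[Tuple[str, str, str, str]]:
    """List (source, target, marker, pressure) for edges needing reinterpretation."""
    rows = []
    for edge in summary.get("graph", {}).get("edges", []):
        for marker in edge.get("via", []):
            kind = str(marker).split(":", 1)[0]
            if kind in PRESSURES:
                rows.append((edge.get("source"), edge.get("target"), marker, PRESSURES[kind]))
    return rows
```

Edges with an empty `via` array produce no rows, which matches the intent of marking only the features that need Galaxy reinterpretation.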
## Runtime Notes
- Do not read Foundry source files at runtime; use only files packaged in this skill bundle and user-supplied artifacts.
- Preserve declared artifact filenames unless the user or harness supplies explicit paths.
- Carry unresolved assumptions into the output artifact instead of silently inventing missing source evidence.
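The filename rule above can be sketched as a small resolver over the raw artifact contract. The function name and the `workdir` and `explicit_path` parameters are hypothetical; only `default_filename` comes from the contract.

```python
from pathlib import Path
from typing import Optional, Union


def resolve_artifact_path(
    contract: dict,
    workdir: Union[str, Path],
    explicit_path: Optional[Union[str, Path]] = None,
) -> Path:
    """Prefer a user- or harness-supplied path, else the declared filename."""
    if explicit_path is not None:
        return Path(explicit_path)
    return Path(workdir) / contract["default_filename"]
```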