Claude skill · cast

nextflow-summary-to-galaxy-data-flow

Translate a Nextflow summary into a Galaxy data-flow design brief.


Install

/plugin marketplace add jmchilton/foundry
/plugin install foundry-skills@galaxy-workflow-foundry

Then invoke as:

/foundry-skills:nextflow-summary-to-galaxy-data-flow

Skill Bundle

/ packaged cast
attached files: 11
upfront: 4
on demand: 7
cast rev: 3
validated: 0

Produces: 1 artifact.

Consumes: 3 artifacts.

Artifact Contract

/ skill handoff

Produces

nextflow-galaxy-data-flow

Reviewable Markdown brief: abstract operations, collection map/reduce choices, shape-changing placeholder steps, unresolved Galaxy tool needs, confidence, open questions.

markdown · nextflow-galaxy-data-flow.md
Raw artifact contract
{
  "id": "nextflow-galaxy-data-flow",
  "kind": "markdown",
  "default_filename": "nextflow-galaxy-data-flow.md",
  "description": "Reviewable Markdown brief: abstract operations, collection map/reduce choices, shape-changing placeholder steps, unresolved Galaxy tool needs, confidence, open questions."
}

Consumes

summary-nextflow

Structured Nextflow pipeline summary emitted by [[summarize-nextflow]]; the JSON the data-flow translation reads.

Raw artifact contract
{
  "id": "summary-nextflow",
  "description": "Structured Nextflow pipeline summary emitted by [[summarize-nextflow]]; the JSON the data-flow translation reads.",
  "inherited_schema": "[[summary-nextflow]]",
  "producers": [
    "summarize-nextflow"
  ]
}

nextflow-galaxy-reference-data

Reference-data shape brief from [[nextflow-summary-to-galaxy-reference-data]] that pins per-asset reference inputs and rebuild-on-absence behavior.

Raw artifact contract
{
  "id": "nextflow-galaxy-reference-data",
  "description": "Reference-data shape brief from [[nextflow-summary-to-galaxy-reference-data]] that pins per-asset reference inputs and rebuild-on-absence behavior.",
  "producers": [
    "nextflow-summary-to-galaxy-reference-data"
  ]
}

nextflow-galaxy-interface

Preceding Galaxy interface brief from [[nextflow-summary-to-galaxy-interface]] that pins inputs, outputs, and labels.

Raw artifact contract
{
  "id": "nextflow-galaxy-interface",
  "description": "Preceding Galaxy interface brief from [[nextflow-summary-to-galaxy-interface]] that pins inputs, outputs, and labels.",
  "producers": [
    "nextflow-summary-to-galaxy-interface"
  ]
}

Attached Files

/ runtime references

Load upfront

research

galaxy-data-flow-draft-contract

packaged

Keep the data-flow brief separate from gxformat2 templating and concrete step implementation.

upfront · runtime · verbatim · hypothesis · deterministic · 6.4 KB
bundle: references/notes/galaxy-data-flow-draft-contract.md
source: content/research/galaxy-data-flow-draft-contract.md
Preview md
---
type: research
subtype: design-spec
title: "Galaxy data-flow draft contract"
tags:
  - research/design-spec
  - target/galaxy
status: draft
created: 2026-05-02
revised: 2026-05-03
revision: 2
ai_generated: true
related_notes:
  - "[[nextflow-to-galaxy-channel-shape-mapping]]"
  - "[[nextflow-operators-to-galaxy-collection-recipes]]"
related_molds:
  - "[[nextflow-summary-to-galaxy-data-flow]]"
  - "[[cwl-summary-to-galaxy-data-flow]]"
  - "[[paper-summary-to-galaxy-design]]"
  - "[[nextflow-summary-to-galaxy-template]]"
  - "[[cwl-summary-to-galaxy-template]]"
  - "[[paper-summary-to-galaxy-template]]"
  - "[[compare-against-iwc-exemplar]]"
sources:
  - "https://github.com/jmchilton/foundry/issues/54"
summary: "Defines the proposed boundary between Galaxy data-flow drafts, gxformat2 templates, and concrete step implementation."
---

# Galaxy Data-Flow Draft Contract

This is an architectural contract, not a schema. Evidence is strongest for Mold and Pipeline boundaries. Proposed fields are speculative until exercised by two or three worked translations.

## Boundary

The data-flow draft owns a target-shaped abstract DAG for Galaxy. It should not be valid `gxformat2` and should not resolve exact Tool Shed tools.

Data-flow draft owns:

- Galaxy-facing workflow inputs and outputs.
- Abstract nodes, edges, branches, collection mapping, collection reduction, and placeholder transformations.
- Input/output shape decisions such as `File`, `list`, `paired`, `list:paired`, or `list:list`.
- Conceptual Galaxy idioms: map-over, reduction, Apply Rules, collection cleanup, identifier synchronization, tabular bridge.
- Abstract unresolved tool needs with input and output shapes.
- Confidence and rationale on inferred nodes, edges, transforms, and tool needs.

The Galaxy template
...
research

nextflow-operators-to-galaxy-collection-recipes

packaged

Classify Nextflow operators as Galaxy wiring, collection semantics, explicit steps, or review triggers.

upfront · runtime · verbatim · corpus-observed · deterministic · 6.5 KB
bundle: references/notes/nextflow-operators-to-galaxy-collection-recipes.md
source: content/research/nextflow-operators-to-galaxy-collection-recipes.md
Preview md
---
type: research
subtype: component
title: "Nextflow operators to Galaxy collection recipes"
tags:
  - research/component
  - source/nextflow
  - target/galaxy
status: draft
created: 2026-05-02
revised: 2026-05-02
revision: 1
ai_generated: true
related_notes:
  - "[[nextflow-to-galaxy-channel-shape-mapping]]"
  - "[[galaxy-collection-semantics]]"
  - "[[galaxy-collection-tools]]"
  - "[[galaxy-apply-rules-dsl]]"
  - "[[iwc-transformations-survey]]"
  - "[[iwc-tabular-operations-survey]]"
  - "[[galaxy-data-flow-draft-contract]]"
  - "[[iwc-map-over-lifecycle-survey]]"
  - "[[nextflow-patterns]]"
related_molds:
  - "[[nextflow-summary-to-galaxy-data-flow]]"
  - "[[implement-galaxy-tool-step]]"
  - "[[debug-galaxy-workflow-output]]"
sources:
  - "https://github.com/jmchilton/foundry/issues/53"
summary: "Classifies common Nextflow operators as Galaxy wiring, collection semantics, explicit steps, or review triggers."
---

# Nextflow Operators To Galaxy Collection Recipes

Most Nextflow operators are not Galaxy tools. Translate them first as source-side data-flow intent, then decide whether the Galaxy representation is simple wiring, collection semantics, an explicit Galaxy step, or a user-review checkpoint.

## Decision Vocabulary

| Label | Meaning |
|---|---|
| `channel-only rewiring` | The operator disappears into Galaxy connections, labels, branch wiring, or output selection. |
| `Galaxy collection semantics` | Translation relies on collection identifiers, collection type, map-over, reduction, or nesting behavior. |
| `explicit Galaxy step` | Add a collection-operation, tabular, text-processing, or domain tool step. |
| `user review` | Translation is likely lossy or semantically ambiguous. |

## Operator Recipes

| Nextflow operator | Galaxy recipe | Class | Confidenc
...
research

nextflow-to-galaxy-channel-shape-mapping

packaged

Translate Nextflow channel, tuple, and path shapes into Galaxy dataset and collection shapes.

upfront · runtime · verbatim · corpus-observed · deterministic · 8.6 KB
bundle: references/notes/nextflow-to-galaxy-channel-shape-mapping.md
source: content/research/nextflow-to-galaxy-channel-shape-mapping.md
Preview md
---
type: research
subtype: component
title: "Nextflow-to-Galaxy channel shape mapping"
tags:
  - research/component
  - source/nextflow
  - target/galaxy
status: draft
created: 2026-05-02
revised: 2026-05-06
revision: 2
ai_generated: true
related_notes:
  - "[[nextflow-workflow-io-semantics]]"
  - "[[nextflow-params-to-galaxy-inputs]]"
  - "[[nextflow-path-glob-to-galaxy-datatype]]"
  - "[[galaxy-collection-semantics]]"
  - "[[galaxy-collection-tools]]"
  - "[[galaxy-apply-rules-dsl]]"
  - "[[iwc-transformations-survey]]"
  - "[[nextflow-operators-to-galaxy-collection-recipes]]"
  - "[[galaxy-data-flow-draft-contract]]"
  - "[[iwc-conditionals-survey]]"
  - "[[manifest-to-mapped-collection-lifecycle]]"
  - "[[map-workflow-enum-to-tool-parameter]]"
  - "[[regex-relabel-via-tabular]]"
  - "[[relabel-via-rules-and-find-replace]]"
  - "[[reshape-relabel-remap-by-collection-axis]]"
  - "[[sync-collections-by-identifier]]"
  - "[[tabular-compute-new-column]]"
  - "[[tabular-concatenate-collection-to-table]]"
  - "[[tabular-cut-and-reorder-columns]]"
  - "[[tabular-filter-by-column-value]]"
  - "[[tabular-filter-by-regex]]"
  - "[[tabular-group-and-aggregate-with-datamash]]"
  - "[[tabular-join-on-key]]"
  - "[[tabular-pivot-collection-to-wide]]"
  - "[[tabular-prepend-header]]"
  - "[[tabular-relabel-by-row-counter]]"
  - "[[tabular-split-taxonomy-string]]"
  - "[[tabular-sql-query]]"
  - "[[tabular-synthesize-bed-from-3col]]"
  - "[[tabular-to-collection-by-row]]"
  - "[[iwc-map-over-lifecycle-survey]]"
  - "[[nextflow-patterns]]"
related_molds:
  - "[[nextflow-summary-to-galaxy-interface]]"
  - "[[nextflow-summary-to-galaxy-data-flow]]"
  - "[[nextflow-summary-to-galaxy-template]]"
  - "[[implement-galaxy-tool-step]]"
sources:
  - "https://github.com/jmchilton/foundry/issu
...
schema

summary-nextflow

packaged

Read process, channel, operator, and fixture structure while drafting Galaxy-facing abstract data flow.

upfront · runtime · verbatim · corpus-observed · deterministic · 56.7 KB
bundle: references/schemas/summary-nextflow.schema.json
source: package://@galaxy-foundry/summarize-nextflow#summaryNextflowSchema
Preview json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://galaxyproject.org/foundry/schemas/summary-nextflow.schema.json",
  "$comment": "Canonical source: packages/summarize-nextflow/src/schema/summary-nextflow.schema.json in jmchilton/foundry. Mold frontmatter cites this schema via [[summary-nextflow]] wiki-links; the cast pipeline imports the `summaryNextflowSchema` runtime export and serializes it into cast bundles.",
  "title": "Nextflow Pipeline Summary",
  "description": "Structured per-source summary emitted by the summarize-nextflow Mold.\n\nPer-source schema by design — paper, Nextflow, and CWL each have their own summary shape; downstream Molds (data flow, templates, tool wrappers) consume any source's summary and handle the polymorphism.\n\nField names mirror gxy-sketches' SketchSource / ToolSpec / TestDataRef / ExpectedOutputRef where parity exists; see content/research/gxy-sketches-alignment.md.",
  "$ref": "#/$defs/Summary",
  "$defs": {
    "Summary": {
      "title": "Summary",
      "description": "Top-level shape. Every Nextflow summary is exactly this object.",
      "type": "object",
      "additionalProperties": false,
      "required": [
        "source",
        "params",
        "sample_sheets",
        "profiles",
        "tools",
        "processes",
        "subworkflows",
        "workflow",
        "reference_assets",
        "reference_rebuilds",
        "test_fixtures",
        "nf_tests"
      ],
      "properties": {
        "source": {
          "$ref": "#/$defs/SourceRecord"
        },
        "params": {
          "type": "array",
          "items": {
            "$ref": "#/$defs/Param"
          }
        },
        "sample_sheets": {
          "type": "array",
          "items": {
            "$ref": "#/$defs/SampleSheet"
          },
          "description": "Structured sample-sheet inputs. Each entry binds one `params[]` parameter to a row schema (column names, types, path-vs-meta classification, required flags, enums, patterns). Promoted from prose inside `params[].description` so downstream target translations (Galaxy `sample_sheet*` collections, CWL records-of-arrays) can choose collection variants without re-parsing the source pipeline. Empty array when no sample-sheet idiom is detected. Discovery sources: nf-schema `schema:` references, `samplesheetToList()` calls, and `splitCsv(header: true)` m
...

Load on demand

pattern

galaxy-collection-patterns

packaged

Ground collection-shape choices in curated, corpus-observed operation and recipe patterns.

Trigger: When selecting collection cleanup, reshape, identifier, or collection-tabular bridge patterns.

on-demand · runtime · verbatim · corpus-observed · deterministic · 4.4 KB
bundle: references/patterns/galaxy-collection-patterns.md
source: content/patterns/galaxy-collection-patterns.md
Preview md
---
type: pattern
pattern_kind: moc
evidence: corpus-observed
title: "Galaxy: collection patterns"
aliases:
  - "Galaxy collection pattern MOC"
  - "collection transformation patterns"
  - "IWC collection pattern map"
tags:
  - pattern
  - target/galaxy
  - topic/galaxy-transform
  - topic/collection-transform
status: draft
created: 2026-05-02
revised: 2026-05-02
revision: 1
ai_generated: true
summary: "Use this MOC to choose corpus-grounded Galaxy collection transformation patterns."
related_notes:
  - "[[iwc-transformations-survey]]"
  - "[[iwc-conditionals-survey]]"
related_patterns:
  - "[[manifest-to-mapped-collection-lifecycle]]"
  - "[[cleanup-sync-and-publish-nonempty-results]]"
  - "[[reshape-relabel-remap-by-collection-axis]]"
  - "[[fan-in-bundle-consume-and-flatten]]"
  - "[[collection-cleanup-after-mapover-failure]]"
  - "[[sync-collections-by-identifier]]"
  - "[[harmonize-by-sortlist-from-identifiers]]"
  - "[[regex-relabel-via-tabular]]"
  - "[[relabel-via-rules-and-find-replace]]"
  - "[[collection-swap-nesting-with-apply-rules]]"
  - "[[collection-split-identifier-via-rules]]"
  - "[[collection-build-list-paired-with-apply-rules]]"
  - "[[tabular-to-collection-by-row]]"
  - "[[tabular-concatenate-collection-to-table]]"
  - "[[tabular-pivot-collection-to-wide]]"
related_molds:
  - "[[implement-galaxy-tool-step]]"
  - "[[nextflow-summary-to-galaxy-data-flow]]"
  - "[[cwl-summary-to-galaxy-data-flow]]"
  - "[[nextflow-summary-to-galaxy-template]]"
  - "[[cwl-summary-to-galaxy-template]]"
  - "[[paper-summary-to-galaxy-template]]"
  - "[[compare-against-iwc-exemplar]]"
---

# Galaxy: collection patterns

This is the runtime-facing map for Galaxy collection transformation choices. Use it before loading raw survey notes. The survey remains evidence backing; 
...
pattern

galaxy-tabular-patterns

packaged

Ground tabular bridge and table-operation choices in curated, corpus-observed operation patterns.

Trigger: When data-flow translation needs filtering, joining, aggregation, pivoting, or tabular-collection bridges.

on-demand · runtime · verbatim · corpus-observed · deterministic · 3.1 KB
bundle: references/patterns/galaxy-tabular-patterns.md
source: content/patterns/galaxy-tabular-patterns.md
Preview md
---
type: pattern
pattern_kind: moc
evidence: corpus-observed
title: "Galaxy: tabular patterns"
aliases:
  - "Galaxy tabular pattern MOC"
  - "tabular transformation patterns"
  - "IWC tabular pattern map"
tags:
  - pattern
  - target/galaxy
  - topic/galaxy-transform
  - topic/tabular-transform
status: draft
created: 2026-05-02
revised: 2026-05-02
revision: 1
ai_generated: true
summary: "Use this MOC to choose corpus-grounded Galaxy tabular transformation patterns."
related_notes:
  - "[[iwc-tabular-operations-survey]]"
related_patterns:
  - "[[tabular-filter-by-column-value]]"
  - "[[tabular-filter-by-regex]]"
  - "[[tabular-cut-and-reorder-columns]]"
  - "[[tabular-compute-new-column]]"
  - "[[tabular-join-on-key]]"
  - "[[tabular-group-and-aggregate-with-datamash]]"
  - "[[tabular-sql-query]]"
  - "[[tabular-prepend-header]]"
  - "[[tabular-synthesize-bed-from-3col]]"
  - "[[tabular-split-taxonomy-string]]"
  - "[[tabular-relabel-by-row-counter]]"
  - "[[tabular-to-collection-by-row]]"
  - "[[tabular-concatenate-collection-to-table]]"
  - "[[tabular-pivot-collection-to-wide]]"
related_molds:
  - "[[implement-galaxy-tool-step]]"
  - "[[nextflow-summary-to-galaxy-data-flow]]"
  - "[[cwl-summary-to-galaxy-data-flow]]"
  - "[[nextflow-summary-to-galaxy-template]]"
  - "[[cwl-summary-to-galaxy-template]]"
  - "[[paper-summary-to-galaxy-template]]"
  - "[[compare-against-iwc-exemplar]]"
---

# Galaxy: tabular patterns

This is the runtime-facing map for Galaxy tabular transformation choices. Use it before loading raw survey notes. The survey remains evidence backing; the operation pages are the actionable references.

## Row And Column Operations

- [[tabular-filter-by-column-value]] — keep/drop rows by string column value with `Filter1`.
- [[tabular-filter-by-regex]] — k
...
research

galaxy-sample-sheet-collections

packaged

Preserve per-row metadata on the data-flow side: keep sample_sheet column_definitions wired through identifier-keyed steps instead of dropping into parallel parameter inputs, and re-attach metadata after map-over steps that lose it.

Trigger: When the upstream interface brief carries a sample_sheet[:paired|:paired_or_unpaired|:record] input, or when the Nextflow summary shows tuple(meta, path...) channel shape originating from samplesheetToList or splitCsv(header: true).

on-demand · runtime · verbatim · corpus-observed · deterministic · 8.4 KB
bundle: references/notes/galaxy-sample-sheet-collections.md
source: content/research/galaxy-sample-sheet-collections.md
Preview md
---
type: research
subtype: component
title: "Galaxy sample_sheet collection types"
tags:
  - research/component
  - target/galaxy
status: draft
created: 2026-05-05
revised: 2026-05-06
revision: 2
ai_generated: true
related_notes:
  - "[[galaxy-collection-semantics]]"
  - "[[galaxy-collection-tools]]"
  - "[[nextflow-workflow-io-semantics]]"
  - "[[nextflow-params-to-galaxy-inputs]]"
  - "[[nextflow-path-glob-to-galaxy-datatype]]"
  - "[[nextflow-to-galaxy-channel-shape-mapping]]"
  - "[[nextflow-to-galaxy-reference-data-mapping]]"
related_molds:
  - "[[nextflow-summary-to-galaxy-interface]]"
  - "[[nextflow-summary-to-galaxy-data-flow]]"
sources:
  - "Galaxy PR #19305 (Implement Sample Sheets), merged 2025-07-30"
  - "lib/galaxy/model/dataset_collections/types/sample_sheet.py"
  - "lib/galaxy/model/dataset_collections/types/sample_sheet_util.py"
  - "lib/galaxy/model/dataset_collections/type_description.py"
  - "lib/galaxy/schema/schema.py (SampleSheetColumnDefinition, SampleSheetRow)"
  - "lib/galaxy/tools/wrappers.py (DatasetCollectionWrapper.sample_sheet_row)"
  - "lib/galaxy/tools/sample_sheet_to_tabular.xml"
  - "lib/galaxy/webapps/galaxy/api/dataset_collections.py (sample_sheet_workbook endpoints)"
  - "lib/galaxy/model/migrations/alembic/versions_gxy/3af58c192752_implement_sample_sheets.py"
summary: "Galaxy's sample_sheet collection family: typed column metadata, four variants, mapping rules, validator allowlist."
---

# Galaxy sample_sheet collection types

Reference for the Galaxy backend shape that targets structured per-row metadata — the natural landing zone for Nextflow `samplesheetToList` parameters and for any source-side idiom that pairs typed metadata columns with dataset references.

## Shape

A `sample_sheet` is a list-shaped collection where each el
...
research

nextflow-conditional-to-galaxy-subworkflow-when

packaged

Decide between subworkflow `when:` and inline tool-step `when:` for each source conditional, and pick the right output fan-in primitive (`pick_value` vs twin-cascade) so the data-flow brief carries a coherent conditional disposition forward.

Trigger: When the Nextflow summary's `workflow.conditionals[]` is non-empty, or when subworkflow boundaries in the source align with parameter-driven branches (step, aligner, wes, tools, skip_*, use_*).

on-demand · runtime · verbatim · corpus-observed · deterministic · 13.7 KB
bundle: references/notes/nextflow-conditional-to-galaxy-subworkflow-when.md
source: content/research/nextflow-conditional-to-galaxy-subworkflow-when.md
Preview md
---
type: research
subtype: component
title: "Nextflow conditional to Galaxy subworkflow / when"
tags:
  - research/component
  - source/nextflow
  - target/galaxy
status: draft
created: 2026-05-08
revised: 2026-05-08
revision: 1
ai_generated: true
related_notes:
  - "[[nextflow-to-galaxy-reference-data-mapping]]"
  - "[[nextflow-to-galaxy-channel-shape-mapping]]"
  - "[[summary-nextflow]]"
  - "[[gxformat2-schema]]"
related_molds:
  - "[[nextflow-summary-to-galaxy-data-flow]]"
  - "[[nextflow-summary-to-galaxy-template]]"
sources:
  - "https://github.com/galaxyproject/gxformat2"
  - "https://github.com/iwc-workflows"
summary: "Stub. Translate Nextflow conditionals into Galaxy `when:` (single-workflow v1). Subworkflow vs inline is an aesthetic call, not a rule."
---

# Nextflow conditional to Galaxy subworkflow / when

Stub. Surfaced from sarek emulation (2026-05-08). Companion to [[nextflow-to-galaxy-reference-data-mapping]] — same v1 posture (one Galaxy workflow per source pipeline; trench-coat shape is acceptable as a draft for human review), different gap (control flow rather than reference data).

## Posture

For v1 of the Nextflow-to-Galaxy translation Molds the output is a single Galaxy workflow per source pipeline, even when the source has substantial branching. IWC reviewers historically prefer sibling workflows for what looks like one pipeline with toggles, and we agree; but for the *translation step* a single artifact keeps the Mold pipeline deterministic, the harness simple, and the reviewer's mental model of "this draft maps 1:1 to the source" intact. Sibling-extraction is a polish pass a human or follow-up Mold runs *after* translation, not a decision the translation Mold makes.

The question this note addresses is: given that v1 is one Galaxy workflow, *h
...
research

nextflow-path-glob-to-galaxy-datatype

packaged

Preserve datatype confidence while translating path-like data-flow edges, process output patterns, and published outputs.

Trigger: When choosing or reviewing Galaxy datatype extensions for data-flow edges, collection elements, or output datasets.

on-demand · runtime · verbatim · corpus-observed · deterministic · 12.8 KB
bundle: references/notes/nextflow-path-glob-to-galaxy-datatype.md
source: content/research/nextflow-path-glob-to-galaxy-datatype.md
Preview md
---
type: research
subtype: component
title: "Nextflow path/glob to Galaxy datatype mapping"
tags:
  - research/component
  - source/nextflow
  - target/galaxy
status: draft
created: 2026-05-06
revised: 2026-05-06
revision: 1
ai_generated: true
related_notes:
  - "[[nextflow-workflow-io-semantics]]"
  - "[[gxformat2-workflow-inputs]]"
  - "[[galaxy-datatypes-conf]]"
  - "[[galaxy-sample-sheet-collections]]"
  - "[[nextflow-params-to-galaxy-inputs]]"
  - "[[nextflow-to-galaxy-channel-shape-mapping]]"
  - "[[summary-nextflow]]"
  - "[[nextflow-summary-to-galaxy-interface]]"
  - "[[nextflow-summary-to-galaxy-data-flow]]"
related_molds:
  - "[[summarize-nextflow]]"
  - "[[nextflow-summary-to-galaxy-interface]]"
  - "[[nextflow-summary-to-galaxy-data-flow]]"
sources:
  - "content/research/datatypes_conf.xml.sample"
  - "https://github.com/galaxyproject/galaxy/blob/7765fae934fbfdee77e3be5f5b235e43735273ae/config/datatypes_conf.xml.sample"
  - "https://www.nextflow.io/docs/latest/process.html"
  - "https://www.nextflow.io/docs/latest/reference/channel.html"
  - "https://nextflow-io.github.io/nf-schema/latest/nextflow_schema/nextflow_schema_specification/"
summary: "Rules for mapping Nextflow path, glob, sample-sheet, and output filename evidence to Galaxy datatype extensions."
---

# Nextflow path/glob to Galaxy datatype mapping

Use this note when a Nextflow-to-Galaxy Mold needs a gxformat2 `format` value for a `data` input, collection element, or workflow output. [[nextflow-params-to-galaxy-inputs]] decides whether something is a dataset or collection; this note only decides datatype extension and confidence.

Evidence quality:

- **Corpus-observed** claims cite pinned fixtures under `$NEXTFLOW_FIXTURES`, the shared clone at `/Users/jxc755/projects/repositories/workflow-fixt
...
research

nextflow-reference-data-classification

packaged

Cross-check source-side reference-data classifications before deciding how reference assets and optional rebuild branches flow through the Galaxy data-flow draft.

Trigger: When the reference-data or interface brief is silent, low-confidence, or conflicts with source evidence for iGenomes-derived params, coordinated bundles, compute-if-missing branches, multi-DB pick-lists, or cohort-specific assets.

on-demand · runtime · verbatim · corpus-observed · deterministic · 7.8 KB
bundle: references/notes/nextflow-reference-data-classification.md
source: content/research/nextflow-reference-data-classification.md
Preview md
---
type: research
subtype: component
title: "Nextflow reference-data classification"
tags:
  - research/component
  - source/nextflow
status: draft
created: 2026-05-10
revised: 2026-05-10
revision: 3
ai_generated: true
related_notes:
  - "[[summary-nextflow]]"
  - "[[nextflow-to-galaxy-reference-data-mapping]]"
  - "[[nextflow-summary-to-galaxy-reference-data]]"
  - "[[nextflow-summary-to-galaxy-interface]]"
  - "[[nextflow-summary-to-galaxy-data-flow]]"
  - "[[nextflow-summary-to-galaxy-template]]"
related_molds:
  - "[[summarize-nextflow]]"
  - "[[nextflow-summary-to-galaxy-reference-data]]"
  - "[[nextflow-summary-to-galaxy-interface]]"
  - "[[nextflow-summary-to-galaxy-data-flow]]"
  - "[[nextflow-summary-to-galaxy-template]]"
sources:
  - "https://nf-co.re/docs/usage/reference_genomes"
  - "https://github.com/nf-core/sarek/blob/master/conf/igenomes.config"
  - "https://github.com/nf-core/configs"
  - "https://github.com/jmchilton/foundry/issues/221"
summary: "Source-side taxonomy of how Nextflow pipelines use reference data — eight classifications detectable from a summary-nextflow artifact."
---

# Nextflow reference-data classification

Reference-data shape varies along several roughly orthogonal dimensions: whether the pipeline consumes or produces reference data, the cardinality of the assets, whether they're keyed or per-asset, whether rebuild fallback exists, and whether multiple bundles run in parallel. The classifications below are flags an LLM can detect from a `summary-nextflow` artifact; a single pipeline often matches more than one. Grounded in the complexity bridge fixtures from jmchilton/foundry#221.

For the Galaxy-side translation of these classifications, see [[nextflow-to-galaxy-reference-data-mapping]].

## None

Pipeline consumes no reference d
...
research

nextflow-to-galaxy-reference-data-mapping

packaged

Decide how reference assets and their indexes flow through the Galaxy data-flow draft (preserving dbkey through map-overs, deferring index-building to wrappers vs surfacing as workflow steps).

Trigger: When the upstream interface brief carries reference-data inputs (FASTA, fai, dict, indexes, known sites, intervals, PoN) or when the source pipeline's compute-if-missing branches imply rebuild semantics the data flow has to honor.

on-demand · runtime · verbatim · corpus-observed · deterministic · 12.0 KB
bundle: references/notes/nextflow-to-galaxy-reference-data-mapping.md
source: content/research/nextflow-to-galaxy-reference-data-mapping.md
Preview md
---
type: research
subtype: component
title: "Nextflow to Galaxy reference-data mapping"
tags:
  - research/component
  - source/nextflow
  - target/galaxy
status: draft
created: 2026-05-08
revised: 2026-05-10
revision: 5
ai_generated: true
related_notes:
  - "[[nextflow-reference-data-classification]]"
  - "[[nextflow-params-to-galaxy-inputs]]"
  - "[[nextflow-path-glob-to-galaxy-datatype]]"
  - "[[summary-nextflow]]"
  - "[[nextflow-summary-to-galaxy-reference-data]]"
  - "[[nextflow-summary-to-galaxy-template]]"
  - "[[galaxy-sample-sheet-collections]]"
  - "[[galaxy-datatypes-conf]]"
related_molds:
  - "[[summarize-nextflow]]"
  - "[[nextflow-summary-to-galaxy-reference-data]]"
  - "[[nextflow-summary-to-galaxy-interface]]"
  - "[[nextflow-summary-to-galaxy-data-flow]]"
  - "[[nextflow-summary-to-galaxy-template]]"
sources:
  - "https://github.com/jmchilton/foundry/issues/221"
summary: "Galaxy-side translation of Nextflow reference-data classifications: idioms available, the v1 posture, datatype defaults, and the in-tool rebuild trade-off."
---

# Nextflow to Galaxy reference-data mapping

Mapping research for [[nextflow-summary-to-galaxy-reference-data]]. Once a Nextflow pipeline's reference-data usage is classified per [[nextflow-reference-data-classification]], this note pins the Galaxy-side translation: idioms available, the v1 posture, datatype defaults, the in-tool rebuild trade-off, and known representation gaps the brief should flag.

## Galaxy side

Galaxy has multiple idioms for surfacing reference data. The bullets below are presented as available shapes; the recommendations that follow narrow them to the v1 posture.

- **`dbkey`-keyed cached lookups.** Workflow inputs carry a `dbkey` annotation; tools consume an admin-pre-loaded data table indexed by `db
...

SKILL.md


# nextflow-summary-to-galaxy-data-flow

Follow the procedure below and use the artifact/reference sections as the runtime contract.

## When To Use

- Translate a Nextflow summary into a Galaxy data-flow design brief.

## Inputs

- Read artifact `summary-nextflow`. Schema: summary-nextflow. Produced by `summarize-nextflow`. Structured Nextflow pipeline summary emitted by summarize-nextflow; the JSON the data-flow translation reads.
- Read artifact `nextflow-galaxy-reference-data`. Produced by `nextflow-summary-to-galaxy-reference-data`. Reference-data shape brief from nextflow-summary-to-galaxy-reference-data that pins per-asset reference inputs and rebuild-on-absence behavior.
- Read artifact `nextflow-galaxy-interface`. Produced by `nextflow-summary-to-galaxy-interface`. Preceding Galaxy interface brief from nextflow-summary-to-galaxy-interface that pins inputs, outputs, and labels.
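
A pre-flight check on the `summary-nextflow` input might look like the sketch below, assuming the summary has already been loaded as JSON. The required keys are copied from the `Summary` definition in the bundled schema; the helper name `missing_summary_keys` is hypothetical, and this check is far weaker than full JSON Schema validation.

```python
import json

# Required top-level keys, copied from the `Summary` definition in
# references/schemas/summary-nextflow.schema.json.
REQUIRED_SUMMARY_KEYS = (
    "source", "params", "sample_sheets", "profiles", "tools", "processes",
    "subworkflows", "workflow", "reference_assets", "reference_rebuilds",
    "test_fixtures", "nf_tests",
)


def missing_summary_keys(summary: dict) -> list[str]:
    """Return required top-level keys absent from a loaded summary (hypothetical helper)."""
    return [key for key in REQUIRED_SUMMARY_KEYS if key not in summary]


# Usage: a deliberately incomplete summary, for illustration only.
example = json.loads('{"source": {}, "params": [], "workflow": {}}')
print(missing_summary_keys(example))
```

Any key reported here is better recorded as an open question in the brief than silently filled in.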

## Outputs

- Write artifact `nextflow-galaxy-data-flow` as `nextflow-galaxy-data-flow.md`. Format: `markdown`. Reviewable Markdown brief: abstract operations, collection map/reduce choices, shape-changing placeholder steps, unresolved Galaxy tool needs, confidence, open questions.

## Required Tools

- None declared. Procedure should not assume external CLIs are present.

## Load Upfront

- `references/notes/galaxy-data-flow-draft-contract.md`: Research note copied verbatim into the bundle. Keep the data-flow brief separate from gxformat2 templating and concrete step implementation.
- `references/notes/nextflow-operators-to-galaxy-collection-recipes.md`: Research note copied verbatim into the bundle. Classify Nextflow operators as Galaxy wiring, collection semantics, explicit steps, or review triggers.
- `references/notes/nextflow-to-galaxy-channel-shape-mapping.md`: Research note copied verbatim into the bundle. Translate Nextflow channel, tuple, and path shapes into Galaxy dataset and collection shapes.
- `references/schemas/summary-nextflow.schema.json`: Schema file copied verbatim into the bundle. Read process, channel, operator, and fixture structure while drafting Galaxy-facing abstract data flow.

## Load On Demand

- `references/patterns/galaxy-collection-patterns.md`: Pattern note copied verbatim into the bundle. Ground collection-shape choices in curated, corpus-observed operation and recipe patterns. Use when: selecting collection cleanup, reshape, identifier, or collection-tabular bridge patterns.
- `references/patterns/galaxy-tabular-patterns.md`: Pattern note copied verbatim into the bundle. Ground tabular bridge and table-operation choices in curated, corpus-observed operation patterns. Use when: data-flow translation needs filtering, joining, aggregation, pivoting, or tabular-collection bridges.
- `references/notes/galaxy-sample-sheet-collections.md`: Research note copied verbatim into the bundle. Preserve per-row metadata on the data-flow side: keep sample_sheet column_definitions wired through identifier-keyed steps instead of dropping into parallel parameter inputs, and re-attach metadata after map-over steps that lose it. Use when: the upstream interface brief carries a sample_sheet[:paired|:paired_or_unpaired|:record] input, or when the Nextflow summary shows tuple(meta, path...) channel shape originating from samplesheetToList or splitCsv(header: true).
- `references/notes/nextflow-conditional-to-galaxy-subworkflow-when.md`: Research note copied verbatim into the bundle. Decide between subworkflow `when:` and inline tool-step `when:` for each source conditional, and pick the right output fan-in primitive (`pick_value` vs twin-cascade) so the data-flow brief carries a coherent conditional disposition forward. Use when: the Nextflow summary's `workflow.conditionals[]` is non-empty, or when subworkflow boundaries in the source align with parameter-driven branches (step, aligner, wes, tools, skip_*, use_*).
- `references/notes/nextflow-path-glob-to-galaxy-datatype.md`: Research note copied verbatim into the bundle. Preserve datatype confidence while translating path-like data-flow edges, process output patterns, and published outputs. Use when: choosing or reviewing Galaxy datatype extensions for data-flow edges, collection elements, or output datasets.
- `references/notes/nextflow-reference-data-classification.md`: Research note copied verbatim into the bundle. Cross-check source-side reference-data classifications before deciding how reference assets and optional rebuild branches flow through the Galaxy data-flow draft. Use when: the reference-data or interface brief is silent, low-confidence, or conflicts with source evidence for iGenomes-derived params, coordinated bundles, compute-if-missing branches, multi-DB pick-lists, or cohort-specific assets.
- `references/notes/nextflow-to-galaxy-reference-data-mapping.md`: Research note copied verbatim into the bundle. Decide how reference assets and their indexes flow through the Galaxy data-flow draft (preserving dbkey through map-overs, deferring index-building to wrappers vs surfacing as workflow steps). Use when: the upstream interface brief carries reference-data inputs (FASTA, fai, dict, indexes, known sites, intervals, PoN) or when the source pipeline's compute-if-missing branches imply rebuild semantics the data flow has to honor.
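
Two of the triggers above can be checked mechanically from the summary alone, as sketched below. The `workflow.conditionals[]` check follows the trigger as stated. Treating a non-empty `sample_sheets` array as a proxy for the sample-sheet trigger is an assumption, and the function name `on_demand_references` is hypothetical. The remaining triggers depend on the upstream briefs or on judgment and are not modeled here.

```python
def on_demand_references(summary: dict) -> list[str]:
    """Pick on-demand notes whose triggers can be checked from the summary alone."""
    selected = []
    # Trigger stated in the bundle: `workflow.conditionals[]` is non-empty.
    if summary.get("workflow", {}).get("conditionals"):
        selected.append(
            "references/notes/nextflow-conditional-to-galaxy-subworkflow-when.md"
        )
    # Assumption: a non-empty `sample_sheets` array stands in for the
    # sample-sheet trigger; the interface brief may also trigger this note.
    if summary.get("sample_sheets"):
        selected.append("references/notes/galaxy-sample-sheet-collections.md")
    return selected


# Usage with a minimal, made-up summary fragment.
print(on_demand_references({"workflow": {"conditionals": [{}]}, "sample_sheets": []}))
```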

## Validation

- None declared.

## Procedure

Read the Nextflow summary, the reference-data brief, and the preceding Galaxy interface brief, then emit a reviewable Markdown data-flow brief. Capture abstract operations, collection map/reduce choices, shape-changing placeholder transformations, unresolved Galaxy tool needs, confidence, and open questions.

The output is not gxformat2 and should not resolve exact Tool Shed tools. The downstream `nextflow-summary-to-galaxy-template` Mold turns this handoff and the interface brief into a template skeleton.
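
One possible starting layout for the brief is sketched below. The contract only names the topics the brief must cover; the heading text, their order, the `_TODO_` placeholders, and the helper name `brief_skeleton` are all assumptions made for illustration.

```python
# Topics taken from the artifact description; the heading wording is hypothetical.
BRIEF_SECTIONS = (
    "Abstract operations",
    "Collection map/reduce choices",
    "Shape-changing placeholder steps",
    "Unresolved Galaxy tool needs",
    "Confidence",
    "Open questions",
)


def brief_skeleton(pipeline_name: str) -> str:
    """Return an empty Markdown brief with one heading per required topic."""
    lines = [f"# Galaxy data-flow brief: {pipeline_name}", ""]
    for section in BRIEF_SECTIONS:
        lines += [f"## {section}", "", "_TODO_", ""]
    return "\n".join(lines)


# Usage: "demo-pipeline" is a made-up name.
print(brief_skeleton("demo-pipeline"))
```

The skeleton deliberately contains no gxformat2 and no tool identifiers, in keeping with the boundary stated above.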

## Runtime Notes

- Do not read Foundry source files at runtime; use only files packaged in this skill bundle and user-supplied artifacts.
- Preserve declared artifact filenames unless the user or harness supplies explicit paths.
- Carry unresolved assumptions into the output artifact instead of silently inventing missing source evidence.
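
The filename rule above can be sketched as a small path resolver. The default filename comes from the artifact contract; the function name `resolve_output_path` and its parameters are hypothetical, and how a harness actually passes explicit paths is not specified in this bundle.

```python
from pathlib import Path
from typing import Optional

# `default_filename` from the `nextflow-galaxy-data-flow` artifact contract.
DEFAULT_FILENAME = "nextflow-galaxy-data-flow.md"


def resolve_output_path(work_dir: str, explicit_path: Optional[str] = None) -> Path:
    """Prefer an explicit path from the user or harness; otherwise keep the declared filename."""
    if explicit_path:
        return Path(explicit_path)
    return Path(work_dir) / DEFAULT_FILENAME


# Usage: "out" and "custom/brief.md" are made-up paths.
print(resolve_output_path("out"))
print(resolve_output_path("out", "custom/brief.md"))
```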