Home Pattern

nf-core channel input → Galaxy data / collection

Map an nf-core process's tuple(meta, path) input channel to a Galaxy <param type="data"> or paired/list collection input.

Revised
2026-05-10
Rev
2

Pattern health

warn
  • IWC exemplar anchors

    No structured IWC exemplars yet.

  • Foundry verification fixture

    No structural verification fixture yet.

  • Pattern map coverage

    No pattern map links to this leaf yet.

  • Metadata contract

    Pattern frontmatter matches the site contract.

nf-core channel input → Galaxy data / collection

Cited modules pinned to nf-core/modules@9b261a459473bc8e2d830bfc626f480c0733f4fe.

Triage table

nf-core input shapemeta-driven branching?Galaxy mapping
tuple val(meta), path(input) (single artifact)noone <param type="data">
tuple val(meta), path(input), path(secondary)notwo <param type="data"> (one per role)
tuple val(meta), path(reads) with meta.single_end driving the scriptyes<conditional> switching <param type="data"> (single) ↔ <param type="data_collection" collection_type="paired"> (paired)
tuple val(meta), path(reads, stageAs: "input*/*") (multi-list)yes (single/paired branching)<conditional> switching <param type="data" multiple="true"><param type="data_collection" collection_type="list:paired">
tuple val(meta), path("*.bam") globno<param type="data_collection" collection_type="list">

The meta map itself is never an input — see [[nfcore-meta-map-to-galaxy-params]]. Its keys may gate a <conditional>, but the map data does not flow as a Galaxy param.

Cited cases

Single artifact, no meta branching → single <param type="data">

modules/nf-core/samtools/index/main.nf:

input:
tuple val(meta), path(input)

output:
tuple val(meta), path("*.{bai,csi,crai}"), emit: index

Galaxy mapping:

<inputs>
  <param name="input" type="data" format="bam,cram" label="BAM / CRAM to index"/>
</inputs>
<outputs>
  <data name="index" format_source="input">
    <discover_datasets pattern="__name_and_ext__" directory="."/>
  </data>
</outputs>

samtools/index’s upstream behavior already reads file-extension cues to pick the index format; the convert Mold should not surface index_format as a Galaxy param when the upstream tool can derive it. Surface it only if task.ext.args exposes it as a knob.

Two artifacts, role-split → two <param type="data">

modules/nf-core/seqkit/grep/main.nf:

input:
tuple val(meta), path(sequence)
path pattern

sequence is the FASTQ/FASTA to filter; pattern is a separate file of names. Galaxy:

<param name="sequence" type="data" format="fasta,fastq,fasta.gz,fastq.gz" label="Sequences to filter"/>
<param name="pattern"  type="data" format="txt" optional="true" label="Pattern file (one ID per line)"/>

Two roles → two params. A <repeat> would only fit if the module accepted N pattern files; here it’s exactly one.

Paired/single branching driven by meta.single_end<conditional>

modules/nf-core/fastp/main.nf:

input:
tuple val(meta), path(reads), path(adapter_fasta)
val   discard_trimmed_pass
val   save_trimmed_fail
val   save_merged

script:
def args = task.ext.args ?: ''
...
if ( task.ext.args?.contains('--interleaved_in') ) {
    """ ...fastp --stdout --in1 ... """          // interleaved-FASTQ mode
} else if (meta.single_end) {
    """ ...fastp --in1 ... """                   // single-end
} else {
    """ ...fastp --in1 ... --in2 ... """         // paired-end
}

Three branches: interleaved (driven by task.ext.args introspection — see [[nfcore-task-ext-args-to-galaxy-additional-options]]), single-end (driven by meta.single_end == true), paired (default). The first two are meta- and args-driven convergence onto a single-input shape; the third is the genuine paired shape.

tools-iuc/tools/fastp/fastp.xml (revision pinned by viewing master) shows the canonical Galaxy translation as a two-arm conditional:

<conditional name="single_paired">
  <param name="single_paired_selector" type="select" label="Single-end or paired reads">
    <option value="single" selected="true">Single-end</option>
    <option value="paired_collection">Paired Collection</option>
  </param>
  <when value="single">
    <param name="in1" type="data" format="fastqsanger,fastqsanger.gz"/>
    ...
  </when>
  <when value="paired_collection">
    <param name="paired_input" type="data_collection" collection_type="paired"
           format="fastqsanger,fastqsanger.gz"/>
    ...
  </when>
</conditional>

Two reasons IUC chose paired_collection over a literal paired (in1 + in2 separate datasets) arm:

  1. Galaxy collection idiom. Paired-end data in Galaxy lives in paired collections; pairing them into a collection at upload time is the recommended workflow shape (and the only shape that survives downstream collection-aware tools cleanly).
  2. Single source of truth for the pair. Two separate data params lets a user pair non-matching files; a paired collection enforces the pair structure at the dataset level.

The convert Mold’s posture should match IUC: emit single + paired_collection only. The interleaved-FASTQ path collapses into the single arm — task.ext.args carrying --interleaved_in is a usage choice the user makes via the additional-options bag (see [[nfcore-task-ext-args-to-galaxy-additional-options]]), not a third <conditional> arm.

Multi-list (cat/fastq style) → <conditional> over multi-data + list:paired

modules/nf-core/cat/fastq/main.nf consumes tuple val(meta), path(reads, stageAs: "input*/*") — N reads per sample, single-end or paired. The Galaxy mapping mirrors the fastp shape with multiple="true" on the single arm and collection_type="list:paired" on the paired arm.

Pitfalls

  • Don’t promote meta.id to a Galaxy input. Galaxy datasets carry their own identifier ($input.element_identifier); meta.id belongs in the wrapper script, not in <inputs>. See [[nfcore-meta-map-to-galaxy-params]].
  • Glob inputs (path('*.bam')) ≠ multiple="true". They map to a <param type="data_collection" collection_type="list">, because Galaxy’s collection abstraction is the right idiom for “N files of the same shape with stable identifiers.” multiple="true" discards element identifiers.
  • stageAs: directives carry information. path(reads, stageAs: "input*/*") tells Nextflow to stage each read under a numbered subdirectory; the convert Mold should preserve this in <command> via && mkdir input1 input2 && ln -sf style staging, not flatten to a single dir.
  • Optional artifacts are optional="true", not a separate <conditional> arm. tuple val(meta), path(adapter_fasta) where adapter_fasta may be empty maps to <param ... optional="true">, not a paired/no-paired conditional.

See also

  • [[nfcore-meta-map-to-galaxy-params]] — sibling: how to handle the meta map keys that travel with the data path.
  • [[galaxy-discover-datasets]] — reference for the <discover_datasets> element used in the glob-input mapping.
  • [[convert-nfcore-module-to-galaxy-tool]] — Mold that consumes this pattern.
  • tools-iuc/tools/fastp/fastp.xml — canonical IUC paired/single conditional shape.

Incoming References (3)

  • nf-core meta-map → Galaxy paramsrelated pattern— Promote nf-core meta-map keys to Galaxy <param>s only when they drive script behavior; drop identity-only keys; pull naming from $input.element_identifier.
  • Galaxy <discover_datasets>related note— Reference for the <discover_datasets> Galaxy XML element — attributes, named/regex patterns, <data> vs <collection> contexts, test assertions.
  • Galaxy <discover_datasets>related pattern— Reference for the <discover_datasets> Galaxy XML element — attributes, named/regex patterns, <data> vs <collection> contexts, test assertions.