Tabular: concatenate collection to table
Tool
toolshed.g2.bx.psu.edu/repos/nml/collapse_collections/collapse_dataset/5.1.0. The tabular survey found 44 step instances, making this the dominant collection-to-tabular bridge for row-binding per-element tabular outputs into one dataset.
When to reach for it
Use this when a Galaxy dataset collection of tabular-like files must become one tabular dataset for downstream Cut1, Filter1, datamash_ops, tp_find_and_replace, or reporting steps.
Do not use this for plain two-file concatenation; when the input is not a collection, use tp_cat or legacy cat1 instead. Do not use this when each collection element should become a column; use tabular-pivot-collection-to-wide. For grouped collapse within one table, use tabular-group-and-aggregate-with-datamash.
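The difference between row-binding and a wide pivot can be sketched in a few lines of Python. This is an illustration of the two output shapes only, not of how either Galaxy tool is implemented; the element identifiers, keys, and values are made up, and the function names row_bind and wide_pivot are hypothetical.

```python
# Illustrative only: two ways to combine per-element tables.
# Element identifiers, keys, and values are invented for this example.
elements = {
    "sampleA": [("geneX", "5"), ("geneY", "7")],
    "sampleB": [("geneX", "2"), ("geneY", "9")],
}


def row_bind(elements):
    """Stack rows from every element: one output row per input row."""
    return [
        (name, key, value)
        for name, rows in elements.items()
        for key, value in rows
    ]


def wide_pivot(elements):
    """Build one row per key, with one column per element."""
    names = list(elements)
    keys = []
    for rows in elements.values():
        for key, _ in rows:
            if key not in keys:
                keys.append(key)  # preserve first-seen key order
    lookup = {name: dict(rows) for name, rows in elements.items()}
    table = [("key", *names)]
    for key in keys:
        table.append((key, *(lookup[name].get(key, "") for name in names)))
    return table


long_table = row_bind(elements)    # shape this pattern produces
wide_table = wide_pivot(elements)  # shape of the wide-pivot pattern
```

The row-bound table grows by rows as elements are added, while the pivoted table grows by columns, which is why the two patterns are not interchangeable.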
Parameters
- input_list: connected collection input.
- filename.add_name: whether to inject the collection element identifier into output rows.
- filename.place_name: where/how to place the element identifier when add_name: true.
- one_header: whether to keep only one header row across all collection elements.
The canonical headered-table shape is:
tool_id: toolshed.g2.bx.psu.edu/repos/nml/collapse_collections/collapse_dataset/5.1.0
tool_state:
  filename:
    add_name: true
    place_name: same_multiple
  input_list: { __class__: ConnectedValue }
  one_header: true
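As a rough mental model of what this configuration asks for, the following Python sketch row-binds a list of (identifier, lines) pairs. It is not the tool's implementation. It models only place_name: same_multiple, assumes the identifier is injected as a leading tab-separated column, and uses a hypothetical name_header label for that column in the header row; none of these details are confirmed by the parameter descriptions above.

```python
def collapse(elements, add_name=True, one_header=True, name_header="Sample"):
    """Row-bind (identifier, lines) pairs into one list of tab-separated lines.

    Assumptions (not confirmed tool behavior): the identifier becomes a
    leading column, and `name_header` is a hypothetical label for it.
    Only the same_multiple placement is modeled.
    """
    out = []
    header_written = False
    for identifier, lines in elements:
        body = lines
        if one_header and lines:
            # Treat the first line of every element as a header row.
            header, body = lines[0], lines[1:]
            if not header_written:
                # Keep the header once, from the first non-empty element.
                out.append(f"{name_header}\t{header}" if add_name else header)
                header_written = True
        for line in body:
            # same_multiple: repeat the identifier on every data row.
            out.append(f"{identifier}\t{line}" if add_name else line)
    return out


# Invented example data: two headered per-sample tables.
elements = [
    ("sampleA", ["pos\tref\talt", "100\tA\tG"]),
    ("sampleB", ["pos\tref\talt", "200\tC\tT", "300\tG\tA"]),
]
table = collapse(elements)
```

With these inputs the sketch yields one header line followed by three data rows, each prefixed with its element identifier.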
Idiomatic shapes
Per-sample tabulars to one annotated table:
tool_state:
  filename:
    add_name: true
    place_name: same_multiple
  input_list: { __class__: ConnectedValue }
  one_header: true
Anchored by the SARS-CoV-2 variation reporting IWC exemplar.
Collection concat with no element identifier:
tool_state:
  filename:
    add_name: false
  input_list: { __class__: ConnectedValue }
  one_header: false
Anchored by the MAPseq-to-ampvis2 IWC exemplar.
Headerless outputs with row provenance:
tool_state:
  filename:
    add_name: true
    place_name: same_multiple
  input_list: { __class__: ConnectedValue }
  one_header: false
Anchored by the influenza consensus and subtyping IWC exemplar.
Pitfalls
- add_name: false loses provenance. This is silent if downstream needs sample or element identity.
- one_header: false duplicates headers when each collection element has its own header row.
- one_header: true on headerless data may drop a real first row. Only enable it when inputs have headers.
- place_name: same_multiple is the row-provenance idiom. It repeats the element name so each output row carries identity.
- same_once is a different shape. Use it only when downstream expects block labels, not per-row identity.
- This is row-bind, not wide pivot. If each collection element should become a column, use tabular-pivot-collection-to-wide.
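One way to reason about the one_header pitfalls before wiring the step is to check whether the collection elements share an identical first line. The sketch below is a heuristic written outside Galaxy, with a hypothetical helper name and invented data. Identical first lines only suggest a shared header: headerless inputs that happen to start with the same row would also match, so the result is a hint, not proof.

```python
def first_lines_match(elements):
    """Return True when at least two non-empty elements share a first line.

    Heuristic only: a shared first line suggests a common header row.
    A single element cannot be compared, so it returns False.
    """
    firsts = [lines[0] for _, lines in elements if lines]
    return len(firsts) >= 2 and len(set(firsts)) == 1


# Invented example data.
headered = [
    ("sampleA", ["pos\tref\talt", "100\tA\tG"]),
    ("sampleB", ["pos\tref\talt", "200\tC\tT"]),
]
headerless = [
    ("sampleA", ["100\tA\tG"]),
    ("sampleB", ["200\tC\tT"]),
]

looks_headered = first_lines_match(headered)
looks_headerless = first_lines_match(headerless)
```

When the check comes back False, one_header: true would risk removing a real data row from each element, so one_header: false is the safer setting.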
Legacy alternative
For non-collection two-file concatenation, older workflows may use tp_cat or legacy core cat1. Those are not replacements for this pattern because they do not carry collection element identity.
See also
- iwc-tabular-operations-survey — candidate 9 evidence.
- iwc-transformations-survey — collection-side cross-reference for the same bridge.
- tabular-pivot-collection-to-wide — collection elements become columns instead of rows.
- tabular-cut-and-reorder-columns — common cleanup after concatenation.