Collection: regex relabel via tabular
Tool
Use a tabular mapping source, optionally transform it with tp_find_and_replace, then relabel the collection with __RELABEL_FROM_FILE__.
Corpus-shaped chain:
- collection_element_identifiers or another mapping source emits identifiers as data.
- tp_find_and_replace rewrites identifier text.
- __RELABEL_FROM_FILE__ applies the modified labels to the target collection.
When to reach for it
Use this when the collection structure is right but element identifiers are wrong, noisy, or need a simple regex/string transformation.
Good fits include tool output that lost original sample names, identifiers that need prefix/suffix cleanup, and paired/unpaired outputs that need the same manifest-derived names.
Do not use this when relabeling is coupled to a structural reshape. Use relabel-via-rules-and-find-replace for the influenza-style shape: restructure, extract identifiers, regex-relabel, then restructure again.
This page is about label rewrite. Use sync-collections-by-identifier for membership sync and harmonize-by-sortlist-from-identifiers for order harmonization.
Parameters
collection_element_identifiers emits one identifier per line and no header.
For tp_find_and_replace, set regex mode deliberately. For no-header identifier lists, do not skip the first line.
For __RELABEL_FROM_FILE__, choose line-based labels only when row order matches collection order. Use tabular old-to-new mapping when order is uncertain.
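The difference between line-based labels and a tabular old-to-new mapping can be sketched in plain Python. This only illustrates the semantics described above and is not Galaxy's implementation: `relabel_by_line` and `relabel_by_mapping` are hypothetical helper names, and raising an error for unmapped identifiers is an assumption.

```python
def relabel_by_line(identifiers, new_labels):
    """Positional relabel: the i-th label replaces the i-th identifier."""
    if len(identifiers) != len(new_labels):
        raise ValueError("label count must match element count")
    return list(new_labels)


def relabel_by_mapping(identifiers, mapping_rows):
    """Keyed relabel: each row is (old, new); row order does not matter."""
    mapping = dict(mapping_rows)
    missing = [i for i in identifiers if i not in mapping]
    if missing:
        # Assumed behavior for this sketch: fail loudly on unmapped elements.
        raise KeyError(f"no mapping for: {missing}")
    return [mapping[i] for i in identifiers]


# Collection order differs from the (sorted) order of the label rows.
collection = ["s2.fastq", "s1.fastq"]
labels_sorted = ["s1", "s2"]
rows = [("s1.fastq", "s1"), ("s2.fastq", "s2")]

by_line = relabel_by_line(collection, labels_sorted)   # s2.fastq is mislabeled "s1"
by_mapping = relabel_by_mapping(collection, rows)      # each element keeps its own name
```

In this sketch the positional variant silently swaps the two sample names, while the keyed variant stays correct regardless of row order, which is why the tabular mapping is the safer choice when order is uncertain.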
Idiomatic shape
# collection_element_identifiers(input collection)
# -> optional tp_find_and_replace
# -> __RELABEL_FROM_FILE__(target collection, labels=modified identifiers)
Regex cleanup before relabel:
tool_id: toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/9.5+galaxy3
tool_state:
  infile: { __class__: ConnectedValue }
  find_pattern: "^(.*)\\.fastq(\\.gz)?$"
  replace_pattern: "\\1"
  is_regex: true
  skip_first_line: false
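To preview what this find/replace pair does to a list of identifiers, the pattern can be tried locally. The sketch below uses Python's `re` module as a stand-in; it assumes the tool's regex dialect treats this particular pattern the same way, and the sample identifiers are made up for illustration. The doubled backslashes in the quoted YAML values correspond to the single backslashes in the raw strings here.

```python
import re

# Same pattern and replacement as the tool_state above, written as raw strings.
FIND = r"^(.*)\.fastq(\.gz)?$"
REPLACE = r"\1"


def clean(identifier):
    """Strip a trailing .fastq or .fastq.gz; leave other identifiers untouched."""
    return re.sub(FIND, REPLACE, identifier)


# Hypothetical identifiers, one per line as collection_element_identifiers emits them.
identifiers = ["sampleA.fastq.gz", "sampleB.fastq", "sampleC.bam"]
cleaned = [clean(i) for i in identifiers]
print(cleaned)  # ['sampleA', 'sampleB', 'sampleC.bam']
```

Identifiers that do not end in `.fastq` or `.fastq.gz` pass through unchanged, so the line count of the output still matches the collection.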
Pitfalls
- Line-based relabeling assumes mapping rows match collection element order.
- collection_element_identifiers has no header; skipping the first line drops a real identifier.
- Relabeling does not reorder or filter. Use a filter or sort pattern when membership or order changes.
- Keep regex cleanup narrow enough that distinct elements do not collapse to the same identifier.
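The last pitfall can be checked before running the workflow. The following is a minimal sketch, not part of any Galaxy tool: `find_collisions` is a hypothetical helper, Python's `re` stands in for the tool's regex engine, and the identifiers and patterns are invented examples.

```python
import re
from collections import defaultdict


def find_collisions(identifiers, pattern, replacement):
    """Return {new_label: [old identifiers]} for labels produced more than once."""
    groups = defaultdict(list)
    for ident in identifiers:
        groups[re.sub(pattern, replacement, ident)].append(ident)
    return {new: olds for new, olds in groups.items() if len(olds) > 1}


identifiers = ["s1_R1.fastq", "s1_R2.fastq", "s2_R1.fastq"]

# Too broad: removing the read suffix merges both s1 files into one label.
too_broad = find_collisions(identifiers, r"_R[12]\.fastq$", "")

# Narrow: removing only the extension keeps every label distinct.
narrow = find_collisions(identifiers, r"\.fastq$", "")
```

An empty result means the cleanup keeps all elements distinct; any non-empty result lists the identifiers that would collapse to the same label.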
See also
- iwc-transformations-survey — Recipe G and candidate boundary.
- relabel-via-rules-and-find-replace — use when relabeling is fused with Apply Rules structural fan-out.
- sync-collections-by-identifier — use when identifiers filter or relabel sibling collections without regex cleanup.