
Collection: split identifier via rules

Use Apply Rules regex columns to split one collection identifier into nested list identifiers.

Draft pattern. Revised 2026-05-03, Rev 2.

Pattern health

warn
  • IWC exemplar anchors

    1 abstract workflow anchor declared.

  • Foundry verification fixture

    No structural verification fixture yet.

  • Pattern map coverage

    1 pattern map link here.

  • Metadata contract

    Pattern frontmatter matches the site contract.

Tool

Use __APPLY_RULES__ to turn a flat list into a nested list:list by splitting each element identifier into two parts.

When to reach for it

Use this when identifiers encode two nesting axes in one string, such as sampleA_rep1, and downstream tools need sampleA -> rep1 nesting.

Do not use this to swap two existing nesting levels; use collection-swap-nesting-with-apply-rules instead. Do not use it to build forward/reverse pairs either; when one parsed axis is a paired-end role, use collection-build-list-paired-with-apply-rules.

This page is about deriving list nesting from one identifier. Use regex-relabel-via-tabular when the collection shape is already right and only labels need cleanup.

Parameters

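The corpus shape uses two parallel add_column_regex rules that share one expression, each emitting a single capture group through its replacement (\1 for the outer key, \2 for the inner key). Do not encode this as one group_count: 2 rule when following the IWC exemplar.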

Conceptual Apply Rules shape:

tool_id: __APPLY_RULES__
tool_state:
  rules:
    - type: add_column_metadata
      value: identifier0
    - type: add_column_regex
      target_column: 0
      expression: "^(.*)_([^_]*)$"
      replacement: "\\1"
    - type: add_column_regex
      target_column: 0
      expression: "^(.*)_([^_]*)$"
      replacement: "\\2"
  mapping:
    list_identifiers: [1, 2]
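
To preview what the two regex columns would contain, the rules can be emulated outside Galaxy. The following is a minimal Python sketch, assuming that an add_column_regex rule with a replacement behaves like Python's re.sub applied to the target column; the helper names split_identifier and nest and the sample identifiers are hypothetical.

```python
import re

# Same expression as both add_column_regex rules in the shape above.
EXPRESSION = r"^(.*)_([^_]*)$"


def split_identifier(identifier):
    """Return (outer, inner) keys, emulating the two regex columns.

    Assumption: a rule with a replacement acts like re.sub on column 0.
    """
    outer = re.sub(EXPRESSION, r"\1", identifier)  # would become column 1
    inner = re.sub(EXPRESSION, r"\2", identifier)  # would become column 2
    return outer, inner


def nest(identifiers):
    """Group a flat list of identifiers into {outer: [inner, ...]}."""
    nested = {}
    for identifier in identifiers:
        outer, inner = split_identifier(identifier)
        nested.setdefault(outer, []).append(inner)
    return nested


flat = ["sampleA_rep1", "sampleA_rep2", "sampleB_rep1"]
print(nest(flat))
```

Both calls read the original identifier, mirroring target_column: 0 in each rule, so the second column never depends on the output of the first.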

Pitfalls

  • Use two regex rules, not one group_count: 2 rule, for corpus parity.
  • Target the original identifier column both times.
  • ^(.*)_([^_]*)$ splits on the last underscore; use a stricter regex if identifiers can contain multiple separators.
  • Validate how identifiers that do not match the expression behave, instead of silently creating empty nesting keys.
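
The last two pitfalls can be checked on a list of identifiers before the rules are applied. This is a hedged Python sketch, not part of Galaxy: the STRICT pattern (exactly one underscore, both parts non-empty) and the check_identifiers helper are hypothetical, and the sample identifiers are invented for illustration.

```python
import re

# Expression used by the pattern: splits on the last underscore.
LAST_UNDERSCORE = re.compile(r"^(.*)_([^_]*)$")
# Hypothetical stricter form: exactly one underscore, both parts non-empty.
STRICT = re.compile(r"^([^_]+)_([^_]+)$")


def check_identifiers(identifiers, pattern=STRICT):
    """Split identifiers with `pattern`; return (pairs, rejected)."""
    pairs, rejected = [], []
    for identifier in identifiers:
        match = pattern.fullmatch(identifier)
        if match is None:
            rejected.append(identifier)  # surface it, do not guess a key
        else:
            pairs.append(match.groups())
    return pairs, rejected


samples = ["sampleA_rep1", "sampleA", "sampleA_", "pool_1_rep2"]

strict_pairs, strict_rejected = check_identifiers(samples)
loose_pairs, loose_rejected = check_identifiers(samples, LAST_UNDERSCORE)

print("strict rejected:", strict_rejected)
print("loose pairs:", loose_pairs)
```

With the looser expression, sampleA_ still matches and yields an empty inner key, and pool_1_rep2 is split as pool_1 and rep2; the stricter form rejects both, which makes the ambiguity visible instead of encoding it in the nesting.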

See also

IWC Exemplars

epigenetics/average-bigwig-between-replicates/average-bigwig-between-replicates (high)

Splits flat bigWig identifiers into sample-prefix and replicate-suffix nesting with two regex-derived columns.
