
Planemo Asserts Idioms

Decision and idiom guide for picking planemo workflow-test assertions: which family per output type, how to size tolerances, when to validate.

Revised 2026-05-11 · Rev 6 · component

Planemo asserts: idiom and decision guide

Companion to iwc-test-data-conventions (input shapes), galaxy-workflow-testability-design (workflow structure before test YAML exists), and iwc-shortcuts-anti-patterns (what's accepted vs what's a smell). This note is forward-looking: when authoring a new <workflow>-tests.yml, it answers which assertion family fits which output, and which tolerances and operators are recommended.

The vocabulary itself is not restated here — every assertion’s parameter list, types, defaults, required fields, and Python docstring are rendered from the test-format JSON Schema at tests-format. Assertion names below deep-link into that page (e.g. has_text jumps straight to its $def).

1. Choose by output type

The single most useful decision table. Pick the entry that matches the file format the workflow emits; default to the recommended assertion family.

Each entry gives the output type, its Default assertion family, the Why, and a Fallback where one applies.

  • Plain text reports / logs (FastQC summary, MultiQC text section). Default: [[tests-format#has_text_model|has_text]] (substring on a known stable token) + [[tests-format#has_n_lines_model|has_n_lines]] with delta:.
  • HTML reports (MultiQC HTML, custom dashboards). Default: [[tests-format#has_text_model|has_text]] against stable section names. Why: HTML embeds timestamps and asset hashes; byte-diff is hopeless.
  • Tabular (TSV, CSV, BED-like). Default: [[tests-format#has_n_columns_model|has_n_columns]] + [[tests-format#has_text_model|has_text]] for headers + [[tests-format#has_n_lines_model|has_n_lines]].
  • VCF. Default: compare: diff with lines_diff: 6. Why: the lines_diff: 6 constant matches the typical VCF header preamble that embeds ##fileDate= and ##source=. Fallback: [[tests-format#has_text_matching_model|has_text_matching]].
  • BAM. Default: [[tests-format#has_size_model|has_size]] + [[tests-format#has_archive_member_model|has_archive_member]] (BAM is a gzipped block format).
  • FASTA (deterministic — assemblies, consensus). Default: file: exact comparison or [[tests-format#has_text_model|has_text]] for a known sequence. Why: output is byte-stable when the upstream tool is deterministic.
  • FASTA (non-deterministic — RepeatModeler libraries). Default: compare: sim_size with a large delta:. Why: family content varies run-to-run. Fallback: [[tests-format#has_n_lines_model|has_n_lines]].
  • FASTQ (rare as workflow output). Default: [[tests-format#has_n_lines_model|has_n_lines]] (must be a multiple of 4). Why: quality scores are read-id-dependent.
  • JSON (deterministic — config dumps, params). Default: [[tests-format#has_json_property_with_value_model|has_json_property_with_value]] / [[tests-format#has_json_property_with_text_model|has_json_property_with_text]].
  • JSON (stochastic — HyPhy stats, MCMC results). Default: has_text: text: "{" (existence-only). Why: embedded floats break any structural assertion; see iwc-shortcuts-anti-patterns §1.
  • HDF5 / AnnData. Default: [[tests-format#has_h5_keys_model|has_h5_keys]] + [[tests-format#has_h5_attribute_model|has_h5_attribute]] for known structure.
  • XML. Default: [[tests-format#is_valid_xml_model|is_valid_xml]] + [[tests-format#has_element_with_path_model|has_element_with_path]] + element_text_is / element_text_matches.
  • PNG / image plots. Default: [[tests-format#has_image_width_model|has_image_width]] + [[tests-format#has_image_height_model|has_image_height]] + [[tests-format#has_size_model|has_size]].
  • TIFF / multipage images. Default: [[tests-format#has_image_frames_model|has_image_frames]] + [[tests-format#has_image_channels_model|has_image_channels]] + [[tests-format#has_size_model|has_size]].
  • Archives (zip, tar.gz). Default: has_archive_member: path: "regex" with nested asserts:. Why: asserts on a specific member; archive timestamps are never byte-stable. Fallback: [[tests-format#has_size_model|has_size]].
  • GFF / GTF. Default: [[tests-format#has_n_lines_model|has_n_lines]] with delta: + [[tests-format#has_text_model|has_text]] for a known feature.
  • Cool / HiC matrices. Default: compare: sim_size with a multi-MB delta:. Why: binary, run-to-run variance. Fallback: [[tests-format#has_archive_member_model|has_archive_member]].

When in doubt: start with has_size + delta_frac: 0.1. It catches the catastrophic failure mode (empty / 10x bigger output). Then add a content probe.
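
In recipe form, a minimal sketch of that default; the output label, size value, and token are placeholders, and the has_size + delta_frac: pairing follows the recommendation above (confirm the exact parameter names against the schema entry before copying):

my_output:
  asserts:
    has_size: { size: 250000, delta_frac: 0.1 }   # catastrophic-failure guard
    has_text: { text: "expected_token" }          # content probe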

1a. If the assertion is too weak, revisit the workflow output

Assertion choice sometimes reveals a workflow-design problem. If the only available output supports nothing stronger than a size check or an image-dimension smoke test, check whether the workflow should expose a stronger checkpoint before settling for the weak assertion.

Rule: if a translated workflow exposes only weakly assertable final reports, consult galaxy-workflow-testability-design and consider promoting a table/text/HDF5 checkpoint before writing the final assertions.

2. The compare: operators

compare: sits at the top level of an output that supplies a file: fixture (a sibling of file:, as in the §9 VCF recipe), never inside asserts:. The operators, in decreasing strictness:

  • diff (default). Byte-for-byte equality with optional lines_diff: tolerance for a fixed number of header lines. Use only when the upstream tool is deterministic on fixed inputs and the output has no embedded timestamps, command lines, version banners, or hash-ordered Python-dict-style keys.
  • re_match / re_match_multiline. Each line of the expected fixture is a regex that must match the corresponding output line. Useful when a few fields per row are timestamped but the rest is canonical. Rare in the corpus.
  • contains. The expected fixture is a substring of the output. Cheap; weak. Prefer asserts: has_text for new code unless you genuinely have a multi-line block to assert as a whole.
  • sim_size. Output file size matches the fixture’s size within delta: (bytes) or delta_frac: (fraction). Use when the output is necessarily non-deterministic but its rough size is reproducible (RepeatModeler libraries, HiC matrices, Bayesian sampler outputs).
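
For orientation, a hedged sketch of the sim_size shape on an output, mirroring the VCF recipe in §9; the label, fixture path, and tolerance are illustrative, and this assumes delta_frac: sits alongside compare: the same way lines_diff: does:

repeat_library:
  file: test-data/expected_families.fa
  compare: sim_size
  delta_frac: 0.1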

To pick lines_diff:, count the mutable header lines in the output format. VCF: ~6 (##fileformat, ##fileDate, ##source, ##reference, and contig/info lines vary). SAM/text headers: count the @HD/@PG/@CO lines. Set lines_diff: to that count exactly — looser values mask real diffs.

3. Tolerance picking

delta: is in bytes (for has_size and compare: sim_size) or an absolute count (for has_n_lines, has_n_columns, has_image_width, etc.). Suffix multipliers are documented in the schema; 1K, 1M, and 1G work.

delta_frac: is a fraction (0.1 = 10%). Use when expected size scales with input volume. Three IWC tests use it (scRNAseq/baredsc/*, genome-assembly/polish-with-long-reads/*); the rest use absolute delta:.
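
A side-by-side sketch of the two tolerance styles; the labels and sizes are illustrative, and the 5M suffix assumes the schema accepts the string multiplier form mentioned above:

aligned_bam:
  asserts:
    has_size: { size: 52000000, delta: 5M }        # absolute tolerance with a suffix multiplier
umap_embedding:
  asserts:
    has_size: { size: 1200000, delta_frac: 0.1 }   # fractional tolerance, 10% of expected size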

Picking magnitudes (from corpus survey in iwc-shortcuts-anti-patterns §2):

  • Image dimensions: delta: 25–30 pixels (5% of typical matplotlib defaults).
  • Image file size: delta: 5K–60K (5–10% of file size).
  • Small text reports: delta: 1K–10K.
  • HTML reports: delta: 25K–100K.
  • BAM files: delta: 1M–10M.
  • RepeatModeler / Bayesian sampler outputs: delta: 30K–90M (extreme, but justified by the underlying nondeterminism).

Heuristic for new outputs: delta_frac: 0.1 is a defensible default. Tighten if the output proves more deterministic than expected.

4. Text family — has_text vs has_text_matching vs has_line vs has_line_matching vs has_n_lines

All five are common; choose by what you’re verifying.

  • has_text — output contains the substring text:. Anywhere in the output, any number of times. Add n: / min: / max: to constrain occurrence count. Add delta: to allow slack on the count.
  • has_text_matching — output matches the regex expression:. Use sparingly; prefer literal has_text when you can.
  • has_line — output has at least one line matching line: exactly. Use when line boundaries matter (e.g. asserting on a specific row in a table).
  • has_line_matching — same but with regex.
  • has_n_lines — assert the line count is n: ± delta:.

A common combination in IWC: has_n_lines: n: 100, delta: 5 + has_text: text: "expected_token" — line-count sanity-check plus a content marker. This catches both truncation and content drift in one assertion pair.

negate: true is supported on every assertion. Used for the “this output should NOT contain X” case.
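
A hedged sketch of the occurrence-count knobs plus negate:; the label, tokens, and counts are placeholders, and the list form with that: is used so the same assertion type can repeat:

filtered_report:
  asserts:
    - { that: has_text, text: "PASS", min: 90, max: 110 }
    - { that: has_text, text: "ERROR", negate: true }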

5. Collection output assertions (element_tests:)

For Galaxy collection outputs, the test format keys element assertions by element identifier:

my_collection_output:
  element_tests:
    sample_1:
      asserts:
        has_text:
          text: "expected"
    sample_2:
      file: test-data/expected_sample_2.txt

An optional attributes: block at the collection level can assert on the shape of the produced collection:

my_collection_output:
  attributes: {collection_type: list:list}
  element_tests:
    ...

Nested collections: outer element_tests: keyed by outer identifier; inner uses elements: (note plural, no _tests suffix on the inner). See iwc-test-data-conventions §2f for the live example.

For a list-of-files where every element should pass the same minimal check, the existence-probe pattern (has_text: "{" for JSON; has_size: min: 100 for any non-empty binary) is widely used and accepted in IWC.
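
As a sketch, the existence-probe shape for a list collection of binary outputs; the collection label and element identifiers are placeholders, and has_size: min: is the form quoted above:

per_sample_bams:
  element_tests:
    sample_1:
      asserts:
        has_size: { min: 100 }
    sample_2:
      asserts:
        has_size: { min: 100 }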

6. The validate-against-workflow inner loop

A -tests.yml file can be structurally invalid in two distinct ways:

  1. Schema-invalid — wrong field names, wrong nesting, wrong types. Caught by the test-format JSON Schema.
  2. Workflow-incoherent — schema-valid YAML, but the input/output labels don’t match the actual workflow. Renaming an output in the .ga and forgetting to update its sibling -tests.yml produces this case. Planemo will surface it as an “output not found” error at test-runtime, but only after a full workflow run.

The @galaxy-tool-util/schema npm package ships two validators that catch both cases statically — no Galaxy or Planemo invocation needed:

  • validateTestsFile(yaml) — runs the file against tests.schema.json (AJV). Reports schema violations with paths.
  • checkTestsAgainstWorkflow(workflow, tests) — cross-checks a .ga / format2 workflow against a tests file: missing input labels, missing output labels, type incompatibilities (e.g. test supplies a File for a parameter typed int).

Both are pure-JS, take milliseconds, and have no Galaxy dependency. Wire them into the inner authoring loop:

edit -tests.yml
  → validateTestsFile()                    # schema gate
  → checkTestsAgainstWorkflow(.ga, tests)  # coherence gate
  → planemo workflow_test_on_invocation    # assertion gate (no full re-run)
  → planemo test                           # full integration (slow)

The first two gates short-circuit cheap mistakes before a slow planemo run. They are the static-validation equivalent of gxwf for tests, and the implement-galaxy-workflow-test mold should reference them as its primary inner-loop tooling. Source: galaxy-tool-util-ts package, src/test-format/index.ts exports.

7. Authoring loop — generation, then refinement

Reviewer convention is to generate the initial -tests.yml rather than hand-write it. Two planemo subcommands cover this:

  • planemo workflow_test_init --from_invocation <invocation_id> (planemo-workflow_test_init) — given a successful Galaxy invocation ID, emit a -tests.yml with a job: block that captures all inputs (with SHA-1 hashes) and an outputs: block with file: references to the actual outputs (downloaded into test-data/). Hand-tighten the assertions afterward.
  • planemo workflow_test_on_invocation <tests.yml> <invocation_id> (planemo-workflow_test_on_invocation) — re-evaluate an edited -tests.yml against a saved invocation without re-running the workflow. The fast inner loop for assertion iteration; complements the static gates in §6.

Together these cut the assertion-iteration cost dramatically. An agent should:

  1. Run the workflow once on usegalaxy.* (or local) to get a known-good invocation.
  2. --from_invocation to bootstrap the test file.
  3. Replace the autogenerated file: exact-comparison assertions with assertion-family-appropriate alternatives per §1.
  4. planemo-workflow_test_on_invocation after each edit; full planemo-test at the end.

8. What the schema gives you for free

When the test-format schema lands as a Foundry-rendered note, the agent can consult any assertion’s $def directly for: parameter types, defaults, required fields, the that: discriminator constant, and the original Python docstring (carried through as description). This note does not restate that vocabulary — it complements it with the corpus-grounded which-and-when.

What’s still missing from the schema and worth keeping in research notes:

  • This decision table (§1) — output-type → assertion family.
  • Tolerance magnitudes (§3) — corpus-derived defaults.
  • The validateTestsFile / checkTestsAgainstWorkflow integration story (§6).
  • Anti-pattern flags — see iwc-shortcuts-anti-patterns.

9. Common combinations (recipes)

Seven recipes worth memorizing.

Stable text report (FastQC summary, simple stats).

my_report:
  asserts:
    has_n_lines: { n: 12, delta: 2 }
    has_text: { text: "Total Sequences" }

MultiQC HTML report.

multiqc_report:
  asserts:
    # list form (that:) so the same assertion type can appear twice without duplicate YAML keys
    - { that: has_text, text: "Filtered Reads" }
    - { that: has_text, text: "FastQC" }

VCF (pinned tool, fixed reference).

called_variants:
  file: test-data/expected.vcf
  compare: diff
  lines_diff: 6

Stochastic JSON (HyPhy-style).

hyphy_meme:
  element_tests:
    geneA: { asserts: { has_text: { text: "{" } } }
    geneB: { asserts: { has_text: { text: "{" } } }

Matplotlib plot.

umap_plot:
  asserts:
    has_size: { size: 68416, delta: 6000 }
    has_image_width: { width: 601, delta: 30 }
    has_image_height: { height: 429, delta: 25 }

AnnData (HDF5).

clustered_anndata:
  asserts:
    # list form (that:) so has_h5_keys can repeat without duplicate YAML keys
    - { that: has_h5_keys, keys: "obs/louvain" }
    - { that: has_h5_keys, keys: "var/highly_variable" }
    - { that: has_h5_keys, keys: "uns/rank_genes_groups" }
    - { that: has_size, size: 12000000, delta: 1500000 }
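
Archive with a known member (zip, tar.gz). A hedged sketch following the §1 entry: the member path regex and token are placeholders, and the nested asserts: form follows the has_archive_member description there.

report_bundle:
  asserts:
    has_archive_member:
      path: ".*summary\\.txt"
      asserts:
        has_text: { text: "expected_token" }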

10. Cross-references

Incoming References (11)

  • Validate Tests (related note) — Validate Galaxy workflow test files and optionally cross-check labels against their workflow.
  • implement-galaxy-workflow-test (related note) — Assemble Galaxy workflow test fixtures and assertions.
  • Component Nextflow Testing (related note) — nf-test patterns mapped to Galaxy planemo asserts and CWL test equivalents — backs nextflow-test-to-target-tests Mold and summarize-nextflow §7.
  • Galaxy <discover_datasets> (related note) — Reference for the <discover_datasets> Galaxy XML element — attributes, named/regex patterns, <data> vs <collection> contexts, test assertions.
  • Galaxy Workflow Testability Design (related note) — Design guidance for Galaxy workflow inputs, outputs, and checkpoints that make IWC-style workflow tests possible.
  • Iwc Shortcuts Anti Patterns (related note) — What IWC test suites cut corners on (accepted) vs what's a code smell — existence-only probes, sim_size deltas, image dim checks, label coupling.
  • Iwc Tabular Operations Survey (related note) — Corpus survey of tabular tools and operations across IWC workflows; map for the operation pattern hierarchy on row/column data manipulation.
  • Iwc Test Data Conventions (related note) — How IWC workflows organize and reference test data — Zenodo-first, SHA-1 integrity, collection shapes, CVMFS gotchas.
  • Nextflow nf-test snapshots to Galaxy/Planemo assertions (related note) — Translates nf-test snapshot assertions into Galaxy workflow test-format assertions, broken out by module-level vs pipeline-level test shape.
  • Planemo workflow-test architecture (related note) — Reference for Planemo workflow test/run architecture, Galaxy modes, API polling, and noisy failure boundaries.
  • Galaxy workflow test format (related note) — JSON Schema for the planemo workflow test format (`<workflow>-tests.yml`), vendored from `@galaxy-tool-util/schema`.