Home Mold

summarize-galaxy-tool

Pull JSON schema, container, source, inputs/outputs for a Galaxy tool.

Mold health

ok
  • Source layout

    Directory Mold with only index.md frontmatter.

  • Axis fields

    target-specific fields are coherent.

  • Eval plan

    eval.md declares cases and check type.

  • Typed refs

    5 typed references; 0 resolver issues.

  • On-demand triggers

    All on-demand references describe triggers.

  • Evidence checks

    Hypothesis references include verification.

axis
target-specific
target
galaxy
name
summarize-galaxy-tool
contract

Reference Loading

Typed Mold references describe what casting consumes and when the generated skill should load each artifact.

Schemaparsed-tool

Structured contract copied for validation or lookup.

Purpose
Resolve field-level questions about the upstream `ParsedTool` payload embedded under `parsed_tool`.
Trigger
When a downstream Mold needs a specific input/output/help/citation field from the embedded `ParsedTool`.
Researchgalaxy-tool-summary-input-source

Background synthesis loaded by explicit progressive-disclosure metadata.

Purpose
Treat cached ParsedTool JSON from galaxy-tool-cache as the v1 input source for Galaxy tool summaries.
Verify
Summarize FastQC, bwa_mem2, and samtools_sort from cached ParsedTool JSON and confirm downstream step implementation can bind inputs and outputs.

Artifact handoffs

/ pipeline contract

Produces

Consumes

  • Pin from [[discover-shed-tool]]; identifies which cached ParsedTool to summarize. Authored UDTs from [[author-galaxy-tool-wrapper]] bypass this Mold.

    produced by

summarize-galaxy-tool

Read a cached Galaxy ParsedTool object for an existing wrapper and emit a compact tool summary that downstream step implementation can bind to. This Mold runs after discover-shed-tool has selected a Tool Shed pin and after the caller has populated galaxy-tool-cache for that pin.

This Mold owns the wrapper summarization step only. It does not search the Tool Shed, choose a version, author XML, or decide how a workflow step should use the wrapper. Its job is to preserve the wrapper’s executable contract: identity, command shape, inputs, outputs, requirements, tests, and any conditional or data-table behavior that could affect binding.

The v1 input-source decision is galaxy-tool-summary-input-source: read cached ParsedTool JSON, using raw XML only as supporting evidence when the cache object is lossy or ambiguous.

Inputs

The Mold expects:

  • A Tool Shed pin from discover-shed-tool: tool_shed_url, owner, repo, tool_id, version, and changeset_revision.
  • A galaxy-tool-cache directory containing the cached ParsedTool JSON for that pin.
  • Optional raw XML source for ambiguity checks, normally fetched through cache metadata rather than treated as the primary input.
  • Optional step intent from the caller, used only to prioritize which wrapper details to explain; it must not change the wrapper facts.

Outputs

A single JSON document conforming to galaxy-tool-summary — the deterministic manifest emitted by galaxy-tool-cache summarize from @galaxy-tool-util/cli. Top-level fields: schema_version, tool_id, tool_version, cache_key, source, artifacts, parsed_tool, input_schemas, warnings. The parsed_tool subtree is the upstream parsed-tool payload verbatim; input_schemas.workflow_step and input_schemas.workflow_step_linked carry generated JSON Schemas describing the tool’s inputs at workflow-step authoring time.

This Mold does not hand-author the manifest — it invokes galaxy-tool-cache summarize against a cache populated for the Tool Shed pin. Wrapper-derived facts that are not yet exposed by upstream ParsedTool (currently: requirements, containers, stdio) flow into the manifest additively as Galaxy upstream extends ParsedTool; no Foundry-side schema change is needed when they ship.

Procedure

1. Load the cached wrapper

Locate the ParsedTool JSON in the configured galaxy-tool-cache directory using the Tool Shed pin. Fail early if the cache entry is missing; do not silently re-search the Tool Shed.

Confirm the cached identity matches the requested pin. If the cache exposes a tool id or version that conflicts with the pin, emit a hard failure rather than summarizing the wrong wrapper.

2. Capture identity and provenance

Populate source from the discovery pin and cache metadata. Populate tool from the parsed wrapper identity fields, not from the search hit. Search hits can omit version and changeset detail.

Keep both forms when they differ:

  • Short XML id for human matching.
  • Fully qualified installed Tool Shed id for gxformat2 step binding.

3. Extract executable requirements

Read <requirements> into structured package/container requirements. Preserve requirement type, name, version, and any container URI or resolver hints exposed by the cache.

Do not invent Bioconda equivalences here. Equivalence inference belongs to author-galaxy-tool-wrapper when authoring a new UDT. Existing wrapper summaries report what the wrapper declares.

4. Summarize command and failure behavior

Preserve the command template enough for downstream binding to understand which inputs and parameters are consumed. Record strict-shell, stdio regexes, exit-code handling, environment variables, and dynamic output behavior when present.

The command summary should be readable, but lossy prose is not enough. Keep template fragments and wrapper flags where they affect required inputs, output discovery, or runtime failure classification.

5. Enumerate inputs

For every wrapper input parameter, emit:

  • name and label/help text when available.
  • Parameter kind (data, data_collection, select, integer, float, boolean, text, conditional section, repeat, section).
  • Required/optional/default semantics.
  • Datatypes, collection types, and multiple-value behavior.
  • Select choices and dynamic options when statically available.
  • Data-table references separately from user-provided parameters.

Nested conditionals must preserve branch ownership. Do not flatten when branches into independent parameters without recording the controlling selector and branch value.

6. Enumerate outputs

For every output and collection output, emit:

  • Output name, datatype, label, and from_work_dir or discovery rule when available.
  • Conditional output ownership.
  • Dynamic output collection shape and naming rules.
  • Relationship to input collections when the wrapper maps over or preserves identifiers.

If output discovery depends on runtime filenames, record that as a warning for downstream test and debug Molds.

7. Capture tests and citations

Summarize wrapper tests into input fixture expectations, parameter settings, and output assertions. Do not treat wrapper tests as workflow tests; they are evidence about legal parameter combinations and output behavior.

Preserve citations, help text, and upstream URLs when present because they help resolve ambiguous wrappers during review.

8. Emit warnings

Warnings should identify missing or lossy surfaces, especially:

  • ParsedTool JSON omits raw XML behavior that affects binding.
  • Conditional inputs cannot be reconstructed completely.
  • Dynamic data-table options require Galaxy instance configuration.
  • Output discovery is runtime-dependent.
  • Tests are absent or too thin to confirm key outputs.

Non-goals

Incoming References (8)

  • CWL → GALAXYphase of pipeline— Path from a CWL Workflow to a Galaxy gxformat2 workflow. Lighter upstream extraction.
  • NEXTFLOW → GALAXYphase of pipeline— Direct path from a Nextflow pipeline to a Galaxy gxformat2 workflow.
  • PAPER → GALAXYphase of pipeline— Direct path from a paper to a Galaxy gxformat2 workflow. No CWL intermediate.
  • Component Nextflow Containers And Envsrelated mold— Container URL grammar (depot, BioContainers, mulled-v2, Wave, ORAS) and conda directive resolution rules backing summarize-nextflow §5.
  • Galaxy tool summary input sourcerelated mold— Decides that summarize-galaxy-tool reads cached ParsedTool JSON as its v1 input source.
  • Galaxy tool discovery recommendationrelated mold— JSON Schema for Tool Shed discovery hit, weak, and miss recommendations.
  • Galaxy tool summary manifestrelated mold— JSON Schema for the deterministic per-tool manifest emitted by `galaxy-tool-cache summarize`.
  • Galaxy ParsedToolrelated mold— JSON Schema for the upstream Galaxy `ParsedTool` model, vendored from `@galaxy-tool-util/schema`.