Claude skill · cast

summarize-cwl

Validate and normalize a CWL Workflow tree, then emit a lightweight structured summary for downstream Galaxy translation.


Install

/plugin marketplace add jmchilton/foundry
/plugin install foundry-skills@galaxy-workflow-foundry

Then invoke as:

/foundry-skills:summarize-cwl

Skill Bundle

/ packaged cast

attached files: 6
upfront: 5
on demand: 1
cast rev: n/a
validated: 0

Produces: 1 artifact.

Artifact Contract

/ skill handoff

Produces

summary-cwl

Structured summary of a CWL Workflow + CommandLineTool tree: inputs, outputs, scatter, conditionals, requirements.

json · summary-cwl.json · [[summary-cwl]]

Raw artifact contract

{
  "id": "summary-cwl",
  "kind": "json",
  "default_filename": "summary-cwl.json",
  "schema": "[[summary-cwl]]",
  "description": "Structured summary of a CWL Workflow + CommandLineTool tree: inputs, outputs, scatter, conditionals, requirements."
}

Attached Files

/ runtime references

Load upfront

cli-tool

cwl-utils

packaged

Normalize the CWL workflow (cwl-normalizer) into a single JSON document for extraction.

upfront runtime verbatim hypothesis deterministic 1.3 KB

bundle: references/cli/cwl-utils.md
source: content/cli/cwl-utils/index.md

Preview md

---
type: cli-tool
tool: cwl-utils
origin: pypi
package: cwl-utils
invoke: cwl-normalizer
invoke_fallback: "uvx --from cwl-utils cwl-normalizer"
availability_check: "cwl-normalizer --help"
docs_url: "https://github.com/common-workflow-language/cwl-utils"
tags:
  - cli-tool
  - cli/cwl-utils
status: draft
created: 2026-05-10
revised: 2026-05-10
revision: 1
ai_generated: true
summary: "CWL document utilities. summarize-cwl uses cwl-normalizer to gather references and upgrade to v1.2 JSON."
---

# cwl-utils

Maintained Python utilities for working with CWL documents. The Foundry's primary entry point is `cwl-normalizer`, which produces a single JSON document with referenced subdocuments gathered and CWL upgraded to v1.2 — the preferred extraction surface for summarize-cwl.

## Install

`uvx --from cwl-utils cwl-normalizer ...` runs a non-default bin in an ephemeral env. `uv tool install cwl-utils` exposes the package's bins on PATH.

Fallback without uv: `pip install cwl-utils`.

## Notes

- The package ships several bins (cwl-normalizer, cwl-graph-split, cwl-docker-extract, ...); the Foundry currently uses cwl-normalizer.
- Normalization is the preferred handoff because it pulls in referenced documents and rewrites references — downstream Molds extract from regular JSON.

cli-tool

cwltool

packaged

Validate the CWL entrypoint before normalization.

upfront runtime verbatim hypothesis deterministic 1.1 KB

bundle: references/cli/cwltool.md
source: content/cli/cwltool/index.md

Preview md

---
type: cli-tool
tool: cwltool
origin: pypi
package: cwltool
invoke: cwltool
invoke_fallback: "uvx cwltool"
availability_check: "cwltool --version"
docs_url: "https://cwltool.readthedocs.io/"
tags:
  - cli-tool
  - cli/cwltool
status: draft
created: 2026-05-10
revised: 2026-05-10
revision: 1
ai_generated: true
summary: "Reference CWL runner and validator. Used by summarize-cwl for entrypoint validation."
---

# cwltool

Reference implementation of the Common Workflow Language standard. The Foundry uses it for entrypoint validation (`cwltool --validate`) before normalization; runtime execution is out of scope for current Molds.

## Install

`uvx cwltool` runs cwltool in an ephemeral environment without a project venv. For repeat use, `uv tool install cwltool` puts the binary on PATH.

Fallback without uv: `pip install cwltool` (in a venv).

## Notes

- Validation is structural, not behavioral. A workflow that validates may still fail under execution.
- The Foundry pairs `cwltool --validate` with `cwl-normalizer` (from `cwl-utils`) for downstream extraction; the normalized JSON is the preferred surface.

cli-tool

foundry

packaged

Schema-check summary-cwl.json before returning it from the skill.

upfront runtime verbatim cast-validated deterministic 1.5 KB

bundle: references/cli/foundry.md
source: content/cli/foundry/index.md

Preview md

---
type: cli-tool
tool: foundry
origin: npm
package: "@galaxy-foundry/foundry"
invoke: foundry
invoke_fallback: "npx --package @galaxy-foundry/foundry foundry"
availability_check: "foundry --help"
docs_url: "https://github.com/jmchilton/foundry/blob/main/packages/foundry/README.md"
tags:
  - cli-tool
  - cli/foundry
status: draft
created: 2026-05-11
revised: 2026-05-11
revision: 1
ai_generated: true
summary: "Foundry CLI: bundles all Mold IO validators and a summarize-nextflow subcommand."
---

# foundry

Unified Foundry CLI. Subcommands cover every Mold IO validator plus a `summarize-nextflow` wrapper around the standalone `@galaxy-foundry/summarize-nextflow` package.

## Subcommands

- `foundry validate-summary-nextflow <file>` — AJV gate for [[summary-nextflow]] artifacts.
- `foundry validate-summary-cwl <file>` — AJV gate for [[summary-cwl]] artifacts.
- `foundry validate-galaxy-tool-discovery <file>` — AJV gate for [[galaxy-tool-discovery]] recommendations.
- `foundry validate-galaxy-tool-summary <file>` — AJV gate for [[galaxy-tool-summary]] manifests, including the nested `parsed_tool` subtree against [[parsed-tool]].
- `foundry validate-tests-format <file>` — AJV gate for planemo-format workflow tests against [[tests-format]].
- `foundry summarize-nextflow <pipeline>` — wraps `@galaxy-foundry/summarize-nextflow`'s `buildSummary` + `validateSummary` via library import.

## Install

`npx --package @galaxy-foundry/foundry foundry <subcommand>` runs without a global install. For repeat use, `npm install -g @galaxy-foundry/foundry`.

research

component-cwl-workflow-anatomy

packaged

Use CWL's native workflow, step, tool, scatter, conditional, and requirement structure without copying the heavier Nextflow inference pipeline.

upfront runtime verbatim hypothesis deterministic 5.5 KB

bundle: references/notes/component-cwl-workflow-anatomy.md
source: content/research/component-cwl-workflow-anatomy.md

Preview md

---
type: research
subtype: component
title: "CWL workflow anatomy"
tags:
  - research/component
  - source/cwl
status: draft
created: 2026-05-10
revised: 2026-05-10
revision: 1
ai_generated: true
related_notes:
  - "[[summary-cwl]]"
  - "[[cwl-v1.2-schemas]]"
  - "[[galaxy-collection-semantics]]"
related_molds:
  - "[[summarize-cwl]]"
  - "[[cwl-summary-to-galaxy-interface]]"
  - "[[cwl-summary-to-galaxy-data-flow]]"
  - "[[cwl-summary-to-galaxy-template]]"
sources:
  - "https://www.commonwl.org/v1.2/Workflow.html"
  - "https://cwltool.readthedocs.io/en/stable/"
  - "https://github.com/common-workflow-language/cwl-utils#normalize-a-cwl-document"
  - "https://pypi.org/project/cwl-utils/"
  - "https://github.com/common-workflow-language/cwldep"
summary: "CWL structure relevant to summarize-cwl: normalized documents, steps, scatter, conditionals, requirements, and dependency handling."
---

# CWL Workflow Anatomy

CWL is a structured workflow language, not a pipeline framework that must be inferred from ecosystem conventions. The `summarize-cwl` Mold should therefore start from CWL's own validated object model and avoid recreating the heavy Nextflow extraction stack.

## Normalization Posture

Use `cwltool --validate` as the first gate. If validation fails, the summary should emit provenance plus validation diagnostics and stop before producing downstream-looking graph claims.

Use `cwl-normalizer` from `cwl-utils` as the default normalization surface. The cwl-utils README describes it as producing JSON CWL documents with dependencies packed together, upgrading to CWL v1.2 as needed, and optionally refactoring CWL expressions into separate steps. This is the right handoff for `summarize-cwl`: structured enough for extraction, still source-faithful, and not a Galaxy design
...

schema

summary-cwl

packaged

Validate the emitted CWL summary JSON and provide downstream consumers the output contract.

upfront both verbatim cast-validated deterministic 19.2 KB

bundle: references/schemas/summary-cwl.schema.json
source: package://@galaxy-foundry/foundry#summaryCwlSchema

Preview json

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://galaxyproject.org/foundry/schemas/summary-cwl.schema.json",
  "$comment": "Canonical source: packages/foundry/src/schemas/summary-cwl/summary-cwl.schema.json in jmchilton/foundry. Mold frontmatter cites this schema via [[summary-cwl]] wiki-links; the cast pipeline imports the `summaryCwlSchema` runtime export and serializes it into cast bundles.",
  "title": "CWL Workflow Summary",
  "description": "Structured per-source summary emitted by the summarize-cwl Mold. CWL is already a typed workflow language, so this schema records validated and normalized workflow/tool structure rather than inferred pipeline semantics.",
  "type": "object",
  "additionalProperties": false,
  "required": [
    "summary_version",
    "source",
    "documents",
    "workflow_inputs",
    "workflow_outputs",
    "steps",
    "tools",
    "graph",
    "tests",
    "warnings"
  ],
  "properties": {
    "summary_version": {
      "type": "string",
      "enum": [
        "1"
      ],
      "description": "Summary schema major version."
    },
    "source": {
      "$ref": "#/$defs/SourceRecord"
    },
    "documents": {
      "$ref": "#/$defs/DocumentSet"
    },
    "workflow_inputs": {
      "type": "array",
      "items": {
        "$ref": "#/$defs/WorkflowInput"
      }
    },
    "workflow_outputs": {
      "type": "array",
      "items": {
        "$ref": "#/$defs/WorkflowOutput"
      }
    },
    "steps": {
      "type": "array",
      "items": {
        "$ref": "#/$defs/WorkflowStep"
      }
    },
    "tools": {
      "type": "array",
      "items": {
        "$ref": "#/$defs/CommandLineTool"
      }
    },
    "graph": {
      "$ref": "#/$defs/WorkflowGraph"
    },
    "tests": {
      "type": "array",
      "items": {
        "$ref": "#/$defs/TestCase"
      }
    },
    "warnings": {
      "type": "array",
      "items": {
        "$ref": "#/$defs/Warning"
      }
    }
  },
  "$defs": {
    "SourceRecord": {
      "type": "object",
      "additionalProperties": false,
      "required": [
        "ecosystem",
        "workflow",
        "url",
        "version",
        "license",
        "slug",
        "cwl_version",
        "entrypoint"
      ],
      "properties": {
        "ecosystem": {
          "type": "string",
          "enum": [
            "cwl"
          ],
          "description": "Sou
...

Load on demand

research

cwl-v1.2-schemas

packaged

Check official CWL v1.2 field names and source-language semantics when summarizing less-common features.

Trigger: When the workflow uses WorkflowStep features, requirements, hints, Operation, ExpressionTool, or CommandLineTool bindings not covered by the short procedure.

on-demand runtime verbatim corpus-observed deterministic 1.8 KB

bundle: references/notes/cwl-v1.2-schemas.md
source: content/research/cwl-v1.2-schemas.md

Preview md

---
type: research
subtype: component
title: "CWL v1.2 schema documents"
tags:
  - research/component
  - source/cwl
status: draft
created: 2026-05-10
revised: 2026-05-10
revision: 1
ai_generated: true
related_notes:
  - "[[component-cwl-workflow-anatomy]]"
  - "[[summary-cwl]]"
related_molds:
  - "[[summarize-cwl]]"
sources:
  - "https://github.com/common-workflow-language/cwl-v1.2/tree/v1.2.1"
  - "https://www.commonwl.org/v1.2/Workflow.html"
summary: "Vendored official CWL v1.2.1 JSON/SALAD schema documents used as source-structure reference for CWL summarization."
---

# CWL v1.2 Schema Documents

Vendored from `common-workflow-language/cwl-v1.2` tag `v1.2.1`, pinned at SHA `ae6899d`. These files are reference material for [[summarize-cwl]] and [[component-cwl-workflow-anatomy]], not Mold IO schemas.

Vendored files under `content/research/cwl-v1.2/`:

- `cwl.yaml` — generated JSON Schema for CWL v1.2.
- `CommonWorkflowLanguage.yml` — top-level SALAD schema imports.
- `Process.yml` — shared process, requirement, hint, and parameter definitions.
- `CommandLineTool.yml` — command-line tool schema.
- `CommandLineTool-standalone.yml` — standalone command-line tool import surface.
- `Workflow.yml` — workflow, step, scatter, link, and output-source schema.
- `Operation.yml` — abstract operation schema.

Re-sync:

```sh
pnpm sync:vendored
```

The vendored upstream manifest uses pinned raw GitHub URLs. Updating to a new CWL release should change the raw URLs and `pinned_ref` values together, then re-run `pnpm sync:vendored`.

## Foundry Role

Use these documents to check field names, enums, and source semantics while drafting or refining CWL Molds. Do not cite these files from polished Galaxy pattern pages as corpus evidence; they are language specification references, not
...

SKILL.md


# summarize-cwl

Follow the procedure below and use the artifact/reference sections as the runtime contract.

## When To Use

- Validate and normalize a CWL Workflow tree, then emit a lightweight structured summary for downstream Galaxy translation.

## Inputs

- No upstream artifact inputs declared. See the procedure for user-supplied runtime inputs.

## Outputs

- Write artifact `summary-cwl` as `summary-cwl.json`. Format: `json`. Schema: summary-cwl. Structured summary of a CWL Workflow + CommandLineTool tree: inputs, outputs, scatter, conditionals, requirements.

## Required Tools

- **`cwl-normalizer`** (cwl-utils). `uv tool install cwl-utils` (or `pip install cwl-utils`).
  Ephemeral run: `uvx --from cwl-utils cwl-normalizer`.
  Check: `cwl-normalizer --help`.
  Docs: https://github.com/common-workflow-language/cwl-utils
  Bundled reference: `references/cli/cwl-utils.md`.
- **`cwltool`** (cwltool). `uv tool install cwltool` (or `pip install cwltool`).
  Ephemeral run: `uvx cwltool`.
  Check: `cwltool --version`.
  Docs: https://cwltool.readthedocs.io/
  Bundled reference: `references/cli/cwltool.md`.
- **`foundry`** (foundry). `npm install -g @galaxy-foundry/foundry`.
  Ephemeral run: `npx --package @galaxy-foundry/foundry foundry`.
  Check: `foundry --help`.
  Docs: https://github.com/jmchilton/foundry/blob/main/packages/foundry/README.md
  Bundled reference: `references/cli/foundry.md`.
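
The install-or-fallback choice for the three tools above can be sketched as a small resolver. This is a minimal illustration, not part of the bundle: `TOOLS` and `resolve_command` are hypothetical names, the fallback strings are copied from the list above, and the sketch assumes `uvx` or `npx` is available whenever the fallback path is taken.

```python
import shlex
import shutil
from typing import Callable, Optional

# Ephemeral-run fallbacks copied from the Required Tools list.
# The dict name and layout are assumptions made for this sketch.
TOOLS = {
    "cwl-normalizer": "uvx --from cwl-utils cwl-normalizer",
    "cwltool": "uvx cwltool",
    "foundry": "npx --package @galaxy-foundry/foundry foundry",
}


def resolve_command(
    tool: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return argv for `tool`: the PATH binary if found, else the fallback."""
    if tool not in TOOLS:
        raise KeyError(f"unknown tool: {tool}")
    if which(tool):
        return [tool]
    # Not on PATH: split the documented ephemeral invocation into argv.
    return shlex.split(TOOLS[tool])
```

The `which` parameter exists only so the lookup can be stubbed in tests; a caller would normally use the default and append the tool's own arguments to the returned list.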

## Load Upfront

- `references/cli/cwl-utils.md`: CLI tool reference copied verbatim into the bundle. Normalize the CWL workflow (cwl-normalizer) into a single JSON document for extraction.
- `references/cli/cwltool.md`: CLI tool reference copied verbatim into the bundle. Validate the CWL entrypoint before normalization.
- `references/cli/foundry.md`: CLI tool reference copied verbatim into the bundle. Schema-check summary-cwl.json before returning it from the skill.
- `references/notes/component-cwl-workflow-anatomy.md`: Research note copied verbatim into the bundle. Use CWL's native workflow, step, tool, scatter, conditional, and requirement structure without copying the heavier Nextflow inference pipeline.
- `references/schemas/summary-cwl.schema.json`: Schema file copied verbatim into the bundle. Validate the emitted CWL summary JSON and provide downstream consumers the output contract.

## Load On Demand

- `references/notes/cwl-v1.2-schemas.md`: Research note copied verbatim into the bundle. Check official CWL v1.2 field names and source-language semantics when summarizing less-common features. Use when: the workflow uses WorkflowStep features, requirements, hints, Operation, ExpressionTool, or CommandLineTool bindings not covered by the short procedure.

## Validation

- Validate `summary-cwl.json` before returning it: run `foundry validate-summary-cwl summary-cwl.json` from `@galaxy-foundry/foundry`. If the command is not on PATH, run `npx --package @galaxy-foundry/foundry foundry validate-summary-cwl summary-cwl.json`. This checks artifact `summary-cwl` against the summary-cwl schema.

## Procedure

Read a CWL Workflow entrypoint, resolve referenced `Workflow`, `CommandLineTool`, `ExpressionTool`, and `Operation` documents, and emit `summary-cwl.json`. This skill is source-specific and target-agnostic: it records what the CWL says, validates and normalizes references, and leaves Galaxy interface/data-flow choices to downstream molds.

CWL is already a structured workflow language. Do not imitate summarize-nextflow's heavy inference machinery unless a real CWL fixture proves the need.

### Inputs

The skill expects:

- A local CWL entrypoint path or an HTTP(S) URL.
- Optional pin/version metadata supplied by the harness or user.
- Optional output directory/path for a normalized CWL document.
- Optional test/job file hints. If no test files are supplied or discoverable, emit `tests: []`.
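
As a rough illustration of the input handling above, the sketch below separates HTTP(S) URLs from local paths and defaults `tests` to an empty list. `describe_inputs` and its return shape are hypothetical; they are not defined by the skill or the schema.

```python
from typing import Iterable, Optional
from urllib.parse import urlparse


def describe_inputs(
    entrypoint: str,
    test_files: Optional[Iterable[str]] = None,
) -> dict:
    """Classify the entrypoint and normalize the optional test hints."""
    parsed = urlparse(entrypoint)
    # Only http/https with a host count as remote; everything else is
    # treated as a local path (existence is not checked in this sketch).
    is_url = parsed.scheme in ("http", "https") and bool(parsed.netloc)
    return {
        "entrypoint": entrypoint,
        "kind": "url" if is_url else "path",
        # No supplied test files means an empty list, never a missing key.
        "tests": list(test_files or []),
    }
```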

### Outputs

A single JSON document conforming to summary-cwl. Sketch shape:

```jsonc
{
  "summary_version": "1",
  "source": {
    "ecosystem": "cwl",
    "workflow": "rnaseq-qc",
    "url": "https://example.org/workflows/rnaseq-qc.cwl",
    "version": "abc123",
    "license": null,
    "slug": "rnaseq-qc",
    "cwl_version": "v1.2",
    "entrypoint": "rnaseq-qc.cwl#main"
  },
  "documents": {
    "entrypoint": "rnaseq-qc.cwl",
    "normalized_path": "normalized/rnaseq-qc.cwl.json",
    "validation": {
      "command": "cwltool --validate rnaseq-qc.cwl",
      "status": "valid",
      "diagnostics": []
    }
  },
  "workflow_inputs": [
    {
      "id": "reads",
      "label": "reads",
      "type": "File[]",
      "optional": false,
      "default": null,
      "doc": "Input FASTQ files.",
      "format": "edam:format_1930",
      "secondary_files": []
    }
  ],
  "workflow_outputs": [
    {
      "id": "report",
      "label": "report",
      "type": "File",
      "output_source": "multiqc/report",
      "doc": null,
      "format": "edam:format_2330",
      "secondary_files": []
    }
  ],
  "steps": [
    {
      "id": "fastqc",
      "run": "#fastqc_tool",
      "run_class": "CommandLineTool",
      "label": "FastQC",
      "doc": null,
      "in": [{ "id": "reads", "source": ["reads"], "value_from": null }],
      "out": ["html", "zip"],
      "scatter": ["reads"],
      "scatter_method": "dotproduct",
      "when": null,
      "requirements": [],
      "hints": []
    }
  ],
  "tools": [
    {
      "id": "fastqc_tool",
      "label": "FastQC",
      "base_command": ["fastqc"],
      "arguments": [],
      "inputs": [],
      "outputs": [],
      "requirements": [
        {
          "class": "DockerRequirement",
          "docker_pull": "quay.io/biocontainers/fastqc:0.12.1--hdfd78af_0",
          "docker_image_id": null,
          "packages": [],
          "raw": {}
        }
      ],
      "hints": []
    }
  ],
  "graph": {
    "nodes": [{ "id": "fastqc", "kind": "step", "label": "FastQC" }],
    "edges": [{ "from": "reads", "to": "fastqc/reads", "via": ["scatter"] }]
  },
  "tests": [],
  "warnings": []
}
```
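
Before handing the document to the schema gate, a cheap structural pre-check can catch missing or stray top-level keys. The sketch below is an assumption-laden convenience, not a replacement for schema validation: the key list is copied from the `required` array in the bundled schema preview, and the function names are hypothetical.

```python
# Top-level keys listed under "required" in summary-cwl.schema.json.
REQUIRED_TOP_LEVEL = (
    "summary_version",
    "source",
    "documents",
    "workflow_inputs",
    "workflow_outputs",
    "steps",
    "tools",
    "graph",
    "tests",
    "warnings",
)


def missing_top_level_keys(summary: dict) -> list[str]:
    """Return required keys absent from the summary, in schema order."""
    return [key for key in REQUIRED_TOP_LEVEL if key not in summary]


def unexpected_top_level_keys(summary: dict) -> list[str]:
    """Return keys the schema would reject (additionalProperties is false)."""
    return sorted(set(summary) - set(REQUIRED_TOP_LEVEL))
```

This only inspects the top level; nested records such as `source` still need the full schema check.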

### Procedure

1. Validate the entrypoint with `cwltool --validate` or equivalent library validation. If validation fails, emit source provenance, validation diagnostics, and `warnings[]`, and do not invent graph structure.
2. Normalize the workflow with `cwl-normalizer` from `cwl-utils` when possible. Use the normalized JSON document as the preferred extraction surface because referenced documents have been gathered, older CWL versions have been upgraded to v1.2 when needed, and the output is regular JSON.
3. Extract `Workflow` inputs/outputs, step wiring, `scatter`, `scatterMethod`, `when`, `requirements`, and `hints` directly from the normalized CWL object model.
4. Extract every referenced `CommandLineTool` command surface: `baseCommand`, `arguments`, input/output bindings, output globs, `DockerRequirement`, and `SoftwareRequirement`.
5. Build a simple graph from workflow inputs to step inputs, step outputs to step inputs, and step outputs to workflow outputs. Add `via` markers for `scatter`, `linkMerge`, `pickValue`, `valueFrom`, and `secondaryFiles`.
6. Record test/job files only when supplied or discoverable by convention. Do not infer expected outputs from command names.
7. Validate the assembled object with `foundry validate-summary-cwl summary-cwl.json` before returning it.
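
Step 5 can be sketched as a pure function over the workflow object. This is a simplified illustration under stated assumptions: steps, inputs, and outputs are in list form, identifiers have already been shortened to forms like `fastqc/zip` (real normalized output may use longer qualified ids), and the `secondaryFiles` marker is left out because it requires looking up parameter declarations. `build_edges` and `as_list` are hypothetical helper names.

```python
def as_list(value):
    """Wrap a scalar in a list; map None to an empty list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def build_edges(workflow: dict) -> list[dict]:
    """Build graph edges in the {"from", "to", "via"} shape of the sketch."""
    edges = []
    for step in workflow.get("steps", []):
        scattered = set(as_list(step.get("scatter")))
        for step_input in step.get("in", []):
            via = []
            if step_input["id"] in scattered:
                via.append("scatter")
            for field in ("linkMerge", "pickValue", "valueFrom"):
                if step_input.get(field) is not None:
                    via.append(field)
            for source in as_list(step_input.get("source")):
                edges.append({
                    "from": source,
                    "to": f"{step['id']}/{step_input['id']}",
                    "via": list(via),  # copy so edges do not share a list
                })
    for output in workflow.get("outputs", []):
        via = [f for f in ("linkMerge", "pickValue") if output.get(f) is not None]
        for source in as_list(output.get("outputSource")):
            edges.append({"from": source, "to": output["id"], "via": list(via)})
    return edges


# Hypothetical two-step workflow used only to exercise the function.
example_workflow = {
    "inputs": [{"id": "reads"}],
    "outputs": [{"id": "report", "outputSource": "multiqc/report"}],
    "steps": [
        {
            "id": "fastqc",
            "in": [{"id": "reads", "source": "reads"}],
            "out": ["html", "zip"],
            "scatter": "reads",
        },
        {
            "id": "multiqc",
            "in": [{"id": "zips", "source": "fastqc/zip"}],
            "out": ["report"],
        },
    ],
}
example_edges = build_edges(example_workflow)
```

Because every edge comes from an explicit `source` or `outputSource`, nothing in the graph is inferred from command names or file conventions.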

### Caveats Baked Into The Procedure

- **Expressions are preserved, not executed.** `valueFrom`, `when`, expression-based globs, and JavaScript-heavy tools should surface warnings when they affect data shape.
- **Directory is a review trigger.** Preserve `Directory` types; downstream Galaxy molds decide whether to use directory-capable wrappers, explicit files, or collections.
- **Nested workflows stay visible.** A nested `Workflow` in `run:` is a step target, not a reason to flatten blindly. Summarize its boundary and warn if downstream Galaxy translation needs expansion.
- **Dependency solving is downstream.** Capture `DockerRequirement` and `SoftwareRequirement`, but do not resolve them into Tool Shed tools or new wrappers here.
- **Remote document resolution is bounded.** Resolve referenced CWL documents and tool files; do not recursively download arbitrary input data.
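
The first caveat, preserving expressions without executing them, might look like the sketch below. It flags `when` and `valueFrom` strings that contain CWL's `$(...)` or `${...}` syntax and records them verbatim. The warning record shape is an assumption (the schema's `Warning` definition is not shown in the preview), the helper names are hypothetical, and escaped sequences such as `\$(` are not handled.

```python
import re

# Matches the start of a CWL parameter reference "$(" or a JavaScript
# expression body "${". Escaped forms are deliberately ignored here.
EXPRESSION_START = re.compile(r"\$\(|\$\{")


def has_expression(value) -> bool:
    """True when value is a string containing expression syntax."""
    return isinstance(value, str) and bool(EXPRESSION_START.search(value))


def expression_warnings(step: dict) -> list[dict]:
    """Collect warnings for expressions on a step; nothing is evaluated."""
    warnings = []
    if has_expression(step.get("when")):
        warnings.append({
            "step": step["id"],
            "field": "when",
            "expression": step["when"],
        })
    for step_input in step.get("in", []):
        value_from = step_input.get("valueFrom")
        if has_expression(value_from):
            warnings.append({
                "step": step["id"],
                "field": f"in/{step_input['id']}/valueFrom",
                "expression": value_from,
            })
    return warnings
```

Keeping the original expression text in the warning lets downstream molds decide how to handle it without re-reading the source.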

### Reference Dispatch

- summary-cwl — always validate output against this schema before emitting.
- component-cwl-workflow-anatomy — use for normalization, graph extraction, scatter/conditionals, requirements, and known non-goals.

### Non-Goals

- **Translation to Galaxy.** Collection choice, datatype choice, data-flow reshaping, IWC comparison, and gxformat2 authoring belong downstream.
- **Tool discovery or wrapper authoring.** Existing Galaxy wrapper search and new wrapper authoring are handled by the per-step Galaxy loop.
- **Runtime execution.** This skill summarizes and validates CWL structure; run-workflow-test owns execution.

## Runtime Notes

- Do not read Foundry source files at runtime; use only files packaged in this skill bundle and user-supplied artifacts.
- Preserve declared artifact filenames unless the user or harness supplies explicit paths.
- Carry unresolved assumptions into the output artifact instead of silently inventing missing source evidence.