Home Schema

CWL workflow summary

JSON Schema for the structured summary emitted by the summarize-cwl Mold.

Revised
2026-05-10
Rev
1
schema summary-cwl @galaxy-foundry/foundry @ 0.0.0 upstream ↗

CWL Workflow Summary

Structured per-source summary emitted by the summarize-cwl Mold. CWL is already a typed workflow language, so this schema records validated and normalized workflow/tool structure rather than inferred pipeline semantics.

18 definitions. Anchor links per definition (e.g. #has_text_model) are stable.

BindingSummary

BindingSummary
field type req description
glob string | array | null
position number | integer | null
prefix string | null
value_from string | null

CommandLineTool

CommandLineTool
field type req description
arguments string[]
base_command string[]
hints → Requirement[]
id string
inputs → ToolInput[]
label string | null
outputs → ToolOutput[]
requirements → Requirement[]

DocumentSet

DocumentSet
field type req description
entrypoint string Local path or URL passed to cwl-utils/cwltool.
normalized_path string | null Path to the cwl-utils cwl-normalizer JSON output when normalization was requested or possible.
validation → ValidationResult

ExpectedOutputRef

ExpectedOutputRef
field type req description
assertions string[]
checksum string | null
filetype string | null
id string
path string | null
url string | null

GraphEdge

GraphEdge
field type req description
from string
to string
via string[] Shape-affecting CWL features on this edge, e.g. scatter, linkMerge, pickValue, valueFrom.

GraphNode

GraphNode
field type req description
id string
kind string
label string | null

Requirement

Requirement
field type req description
class string Requirement or hint class, e.g. DockerRequirement, SoftwareRequirement, ScatterFeatureRequirement.
docker_image_id string | null DockerRequirement.dockerImageId when present.
docker_pull string | null DockerRequirement.dockerPull when present.
packages string[] SoftwareRequirement packages as name/version/spec strings.
raw object Verbatim requirement/hint object for unsupported or target-specific fields.

SourceRecord

SourceRecord
field type req description
cwl_version string | null Declared cwlVersion after loading the entry document, commonly v1.0, v1.1, or v1.2.
ecosystem string Source ecosystem; fixed for this per-source schema.
entrypoint string Main workflow document/id after resolution, e.g. workflow.cwl#main.
license string | null License identifier or detected license filename when available.
slug string Stable kebab-case run slug.
url string | null Original path or URL supplied by the user.
version string | null Pinned commit, tag, release, or digest when available.
workflow string Human-oriented workflow name, usually the entry Workflow id or source filename stem.

StepInput

StepInput
field type req description
default string | number | integer | boolean | object | array | null Verbatim WorkflowStepInput default value when present.
id string
link_merge string | null CWL linkMerge setting for multiple inbound sources.
pick_value string | null CWL pickValue setting for choosing non-null values among multiple sources.
source array | null Normalized CWL source links. Null means the input has no source and is driven by default or runtime binding.
value_from string | null Expression from valueFrom, preserved verbatim when present.

TestCase

TestCase
field type req description
expected_outputs → ExpectedOutputRef[]
job_path string | null
name string
provenance string

ToolInput

ToolInput
field type req description
doc string | null
format string | array | null
id string
input_binding → BindingSummary
secondary_files string[]
type string

ToolOutput

ToolOutput
field type req description
doc string | null
format string | array | null
id string
output_binding → BindingSummary
secondary_files string[]
type string

ValidationResult

ValidationResult
field type req description
command string Validation command or library operation used.
diagnostics string[]
status string

Warning

Warning
field type req description
code string
message string
path string | null

WorkflowGraph

WorkflowGraph
field type req description
edges → GraphEdge[]
nodes → GraphNode[]

WorkflowInput

WorkflowInput
field type req description
default string | number | integer | boolean | object | array | null Verbatim default value when present.
doc string | null
format string | array | null
id string
label string | null
optional boolean
secondary_files string[]
type string Compact CWL type expression, preserving arrays, records, enums, nullability, File, and Directory.

WorkflowOutput

WorkflowOutput
field type req description
doc string | null
format string | array | null
id string
label string | null
output_source string | array | null
secondary_files string[]
type string

WorkflowStep

WorkflowStep
field type req description
doc string | null
hints → Requirement[]
id string
in → StepInput[]
label string | null
out string[]
requirements → Requirement[]
run string Resolved run reference as it appears in the normalized document, e.g. #tool_id.
run_class string
scatter string[]
scatter_method string | null
when string | null CWL v1.2 conditional expression, preserved verbatim when present.

This page points to the JSON Schema authored in this repo and shipped as part of @galaxy-foundry/foundry (orphan schema — no TypeScript producer Mold owns it). The schema is intentionally smaller than summary-nextflow because CWL already carries typed workflow and tool structure.

Source-of-truth chain:

  1. packages/foundry/src/schemas/summary-cwl/summary-cwl.schema.json — canonical JSON, hand-edited as part of the Mold/cast loop.
  2. packages/foundry/scripts/sync-schema.mjs regenerates the typed summary-cwl.schema.generated.ts const wrapper at prebuild.
  3. Published as part of @galaxy-foundry/foundry, exporting summaryCwlSchema and the foundry validate-summary-cwl subcommand.

Generated skills should validate emitted summaries with:

foundry validate-summary-cwl summary-cwl.json

Why This Is Lighter Than Nextflow

Nextflow summarization has to infer process graphs, channel shapes, sample-sheet semantics, containers, and tests from DSL2 plus ecosystem conventions. CWL has first-class workflow inputs, workflow outputs, WorkflowStep.run, CommandLineTool inputs/outputs, scatter, when, and requirement blocks. The summary therefore records validated CWL structure and adds only the annotations downstream Galaxy molds need.

Top-Level Shape

The v1 summary contains:

  • sourceecosystem: cwl, workflow name, entrypoint, declared cwlVersion, URL/path, optional pin/license.
  • documents — entrypoint, optional cwl-utils normalized JSON output path, and validation diagnostics.
  • workflow_inputs / workflow_outputs — CWL ids, compact type strings, formats, secondary files, defaults, labels, and docs.
  • stepsrun, run class, input sources, outputs, scatter/scatterMethod, when, requirements, and hints.
  • toolsCommandLineTool command surface: base command, arguments, tool inputs/outputs, bindings, Docker/Software requirements, and hints.
  • graph — simple source-to-sink edges with via markers for shape-affecting features such as scatter, linkMerge, pickValue, and valueFrom.
  • tests — job files and expected outputs only when supplied or discoverable by convention.
  • warnings — unsupported extensions, expression-heavy edges, missing referenced files, and remote resolution failures.

Intentional Non-Goals

  • No target translation. Galaxy collection choice, datatype choice, Tool Shed discovery, and gxformat2 authoring belong to downstream molds.
  • No full JavaScript evaluation. valueFrom, when, and expression-derived globbing are preserved verbatim and flagged when they affect shape.
  • No deep package solving. DockerRequirement and SoftwareRequirement are recorded; dependency resolution into concrete Galaxy wrappers is downstream work.
  • No recursive data download by default. URL resolution is for CWL documents and directly referenced tool/workflow files, not arbitrary input data crawling.

Incoming References (8)

  • Validate Summary Cwlrelated note— AJV gate for summarize-cwl JSON documents.
  • cwl-summary-to-galaxy-data-flowrelated note— Translate a CWL summary into a Galaxy data-flow design brief.
  • cwl-summary-to-galaxy-interfacerelated note— Map a CWL summary into a Galaxy workflow interface design brief.
  • cwl-summary-to-galaxy-templaterelated note— gxformat2 skeleton with per-step TODOs from a CWL summary and prior Galaxy design briefs.
  • summarize-cwlrelated note— Validate and normalize a CWL Workflow tree, then emit a lightweight structured summary for downstream Galaxy translation.
  • summarize-cwlschema of artifact output— Validate and normalize a CWL Workflow tree, then emit a lightweight structured summary for downstream Galaxy translation.
  • CWL workflow anatomyrelated note— CWL structure relevant to summarize-cwl: normalized documents, steps, scatter, conditionals, requirements, and dependency handling.
  • CWL v1.2 schema documentsrelated note— Vendored official CWL v1.2.1 JSON/SALAD schema documents used as source-structure reference for CWL summarization.