IDEA_CWL_BACKGROUND

Pitch frame. OS4LS Track 2 supports open-source life-sciences software with an AI-readiness emphasis. We propose to revitalize the Common Workflow Language (CWL) not as a competing execution engine but as a portable intermediate representation (IR) for agent-driven workflow translation. The Galaxy Workflow Foundry already compiles informal artifacts (papers, Nextflow scripts, .ga files) into executable workflows; CWL — with its formal, schema-salad-grounded semantics, abstract Operation class, container-resolution determinism, and language-independent parameter schemas — is the most defensible canonical form an agent can target while still emitting Nextflow, gxformat2, or WDL on the other side. Investment lines: cwl-tool-util (a CommandLineTool registry/validator/LSP analog of galaxy-tool-util), Foundry molds for paper-to-cwl and nextflow-to-cwl, a curated CWL workflow corpus analogous to IWC/nf-core, and a portability test harness proving the same logical workflow round-trips across CWL, Galaxy, and Nextflow.

1. CWL current state (May 2026)

2. CWL tooling gaps vs Galaxy / Nextflow

| Capability | CWL today | Galaxy | Nextflow / nf-core |
| --- | --- | --- | --- |
| Reference runner | cwltool (active) | Galaxy server | nextflow (very active, Seqera) |
| Cloud / K8s runner | Arvados, Calrissian, Toil, arvados-cwl-runner | Pulsar / Kubernetes | Native + Fusion + Tower |
| Tool/process schema lib | schema-salad (foundational, low surface), no galaxy-tool-util analog | galaxy-tool-util (rich: lint, cache, test, citations, EDAM) | nf-core/tools (lint, sync, template) |
| IDE / LSP | Rabix Benten (last release Jan 2021, 64 stars, effectively unmaintained) | Planemo + Galaxy IDE plugins | nf-core VSCode + nextflow-language-server (active) |
| Validator | cwltool --validate, schema-salad-tool | planemo lint | nextflow lint, nf-core lint |
| Curated workflow registry | None CWL-specific. Workflows land in Dockstore, WorkflowHub (cross-language) | IWC (Intergalactic Workflow Commission) | nf-core: 124+ curated pipelines (Feb 2025); cited as 149 on portal |
| Test-data conventions | Ad-hoc; conformance tests for spec only | Planemo test conventions, baked-in `<test>` blocks | nf-test, test profiles, stub-run |
| Container resolution | DockerRequirement + SoftwareRequirement; Singularity/Podman supported | mulled / BioContainers / quay resolution chain | BioContainers + Wave (Seqera) |
| Parameter schemas | Strong (schema-salad, JSON-Schema-like, typed) | tool_state JSON; gxformat2 evolving | Groovy-ish DSL2; nf-schema plugin |

Most important gaps for an agentic IR play:

  1. No cwl-tool-util. Galaxy has a single library (galaxy-tool-util) that handles parsing, linting, dependency resolution, citation extraction, test discovery, EDAM annotation, and shed metadata. CWL has schema-salad (parser only) and cwltool (runtime). An agent wanting to manipulate CommandLineTool documents has to roll its own utilities.
  2. No curated corpus. IWC and nf-core are the obvious models. Dockstore/WorkflowHub indexes are heterogeneous and not curated to a single quality bar.
  3. No live LSP. Benten is abandoned. Modern agents and IDE users alike need diagnostics and completion driven by the schema.
  4. No agent-facing translation utilities. Translators exist (wdl-cwl-translator, the abandoned cwl2nxf, the UChicago CNT prototype claiming 81% coverage), but none is positioned as a Foundry-grade compiler.

3. Competitive landscape

4. The IR analogy

Compiler precedents:

Workflow-language translation literature is sparser but real:

No paper to date positions a single workflow language as a canonical IR with bidirectional lowering across the major engines. The OS4LS pitch is intellectually novel in framing but technically conservative: every individual translation pair already has a prototype.
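The practical argument for a canonical IR is the usual compiler one: with n languages, direct translation needs a translator per ordered pair, while routing through a hub needs only one translator into and one out of the hub for each other language. The sketch below counts both for the four formats named in this document and shows how a request would be routed. The function names are hypothetical and the "translators" are just labeled pairs, not real tools.

```python
from itertools import permutations

LANGUAGES = ["CWL", "gxformat2", "Nextflow", "WDL"]
IR = "CWL"  # assumption of this proposal: CWL plays the hub role


def pairwise_translators(languages):
    """Every ordered (source, target) pair needs its own translator."""
    return list(permutations(languages, 2))


def hub_translators(languages, ir):
    """Each non-IR language needs one translator into and one out of the IR."""
    others = [lang for lang in languages if lang != ir]
    return [(lang, ir) for lang in others] + [(ir, lang) for lang in others]


def route(source, target, ir):
    """Return the chain of translators used to get from source to target."""
    if source == target:
        return []
    if ir in (source, target):
        return [(source, target)]
    return [(source, ir), (ir, target)]


n_pairwise = len(pairwise_translators(LANGUAGES))  # n * (n - 1)
n_hub = len(hub_translators(LANGUAGES, IR))        # 2 * (n - 1)
nextflow_to_galaxy = route("Nextflow", "gxformat2", IR)
```

For these four formats the pairwise approach needs 12 translators and the hub approach needs 6; a Nextflow-to-gxformat2 request becomes two hops through CWL. The count says nothing about translation fidelity, which is what a portability test harness would have to measure.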

5. Politics

Curii is the de facto CWL commercial steward; Amstutz is on the leadership team alongside Crusoe (Project Lead) and Chilton (Galaxy). For this proposal to land cleanly:

6. AI-readiness framing

Why CWL is the right IR for agents (not necessarily for humans):

  1. Formal grammar. Schema-salad is a real schema language with named, typed records and explicit inheritance. Agents can validate generated artifacts structurally before any execution attempt.
  2. Static parameter schemas. Inputs and outputs are typed (File, Directory, int, enum, records). Nextflow channels and Galaxy tool_state are far harder to introspect.
  3. Container-resolution determinism. DockerRequirement + SoftwareRequirement give an agent a reproducible binding from logical tool to image without engine-specific resolution rules.
  4. Abstract Operation class (v1.2). Lets the agent emit the shape of a step before committing to a specific runtime — exactly what gxformat2.abstract_export already exploits to round-trip Galaxy steps.
  5. Provenance. CWL co-evolved with RO-Crate; an agent that generates CWL gets FAIR provenance metadata for free.
  6. Stable target. The standard has been deliberately frozen at v1.2 for >2 years. For an LLM training/fine-tuning corpus or a static codegen target, boring is a feature.

By contrast: Nextflow DSL2 is Turing-complete Groovy with closures and channel side effects; gxformat2 is improving but still tied to Galaxy server state. Neither is a comfortable static-analysis surface.
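Points 1, 2, and 4 can be illustrated together. The sketch below builds two abstract v1.2 `Operation` steps as plain dicts, using the `name: type` shorthand for parameters, and checks whether one step's output can feed another step's input before anything is bound to a runtime. The helpers `make_operation` and `can_connect` are hypothetical, and the compatibility rule is deliberately simplified to exact string-type matches plus optional inputs; it is not the full CWL type-checking algorithm.

```python
import json


def make_operation(label, inputs, outputs):
    """Build an abstract CWL v1.2 Operation (no runtime chosen yet)."""
    return {
        "cwlVersion": "v1.2",
        "class": "Operation",
        "label": label,
        "inputs": inputs,
        "outputs": outputs,
    }


def can_connect(producer, out_name, consumer, in_name):
    """Simplified check: same type, or the input is the optional form of it."""
    out_type = producer["outputs"][out_name]
    in_type = consumer["inputs"][in_name]
    return out_type == in_type or out_type + "?" == in_type


# Illustrative step shapes; the parameter names are made up for this example.
align = make_operation(
    "align reads",
    inputs={"reads": "File", "reference": "File"},
    outputs={"alignments": "File"},
)
call_variants = make_operation(
    "call variants",
    inputs={"alignments": "File", "min_quality": "int?"},
    outputs={"variants": "File"},
)

ok = can_connect(align, "alignments", call_variants, "alignments")
mismatch = can_connect(align, "alignments", call_variants, "min_quality")

# CWL documents may be written as JSON, so the dict serializes directly.
serialized = json.dumps(align, indent=2)
```

Because the step shapes are plain typed data, an agent can reject the `File` to `int?` wiring statically, which is the property the list above attributes to CWL and which is harder to obtain from Nextflow channels or Galaxy tool_state.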

7. Suggested LOI landscape-analysis paragraph (~180 words)

The bioinformatics workflow ecosystem has bifurcated into two camps: large execution engines with engaged commercial sponsors (Nextflow/Seqera with 120+ curated nf-core pipelines, Galaxy with the Intergalactic Workflow Commission, WDL/Cromwell on Terra) and a portable open standard, the Common Workflow Language (CWL v1.2.1), governed by a vendor-neutral multi-organization working group with Curii (Arvados) as its commercial anchor. CWL’s reference runner cwltool remains actively maintained (April 2026 release), but its IDE tooling, validators, and registries have stagnated — the Rabix Benten language server has not seen a release since January 2021, and CWL lacks a curated workflow corpus analogous to nf-core or IWC. Meanwhile, the rise of agent-driven workflow construction (Seqera AI, Galaxy’s Workflow Foundry, paper-to-pipeline translators) has created new demand for a machine-readable, formally specified workflow IR. We see an under-served niche: revitalize CWL not as a competing execution engine but as the portable IR through which agents compile across systems.

8. Risks / weaknesses to address proactively

Open questions for the human

Sources