CWL Conformance Tests in Galaxy

How Galaxy downloads, structures, and uses the official CWL conformance test suites for validating its CWL runtime.

What Are CWL Conformance Tests?

The Common Workflow Language project maintains official conformance test suites for each CWL spec version. These suites are the canonical way for CWL implementations (cwltool, Toil, Arvados, Galaxy, etc.) to prove spec compliance. Each suite lives in its own GitHub repository:

Version	Repository	Branch
v1.0	`common-workflow-language/common-workflow-language`	`main`
v1.1	`common-workflow-language/cwl-v1.1`	`main`
v1.2	`common-workflow-language/cwl-v1.2`	`main`

A conformance suite consists of:

conformance_tests.yaml — the test manifest describing every test case
A tests/ directory — CWL tool/workflow files, input job JSON, and test data referenced by the manifest

Galaxy tests against all three versions.

How Galaxy Downloads the Tests

Shell Script: `scripts/update_cwl_conformance_tests.sh`

The conformance suites are not vendored or submoduled. They are downloaded on-demand and excluded from git via .gitignore:

# CWL conformance tests
lib/galaxy_test/api/cwl/test_cwl_conformance_v1_?.py
test/functional/tools/cwl_tools/v1.?/

The shell script downloads each version as a zip from GitHub:

wget https://github.com/common-workflow-language/${repo}/archive/main.zip

For each version it:

Extracts the zip
Copies conformance_tests.yaml into test/functional/tools/cwl_tools/v${version}/
Copies the test tools/data directory alongside it
Runs scripts/cwl_conformance_to_test_cases.py to generate Python test files

The v1.0 layout is slightly different from v1.1/v1.2 due to the older repo structure:

Version	Conformance YAML source path	Tests dir source	Local tests dir
v1.0	`v1.0/conformance_test_v1.0.yaml`	`v1.0/v1.0/`	`cwl_tools/v1.0/v1.0/`
v1.1	`conformance_tests.yaml`	`tests/`	`cwl_tools/v1.1/tests/`
v1.2	`conformance_tests.yaml`	`tests/`	`cwl_tools/v1.2/tests/`

Resulting Directory Structure

test/functional/tools/cwl_tools/
├── v1.0_custom/                    # committed — Galaxy-specific CWL test tools
├── v1.0/                           # gitignored — downloaded
│   ├── conformance_tests.yaml
│   └── v1.0/                       # tool/workflow files + test data
│       ├── bwa-mem-tool.cwl
│       ├── cat1-testcli.cwl
│       ├── bwa-mem-job.json
│       └── ...
├── v1.1/                           # gitignored — downloaded
│   ├── conformance_tests.yaml
│   └── tests/
│       ├── bwa-mem-tool.cwl
│       └── ...
└── v1.2/                           # gitignored — downloaded
    ├── conformance_tests.yaml
    └── tests/
        ├── bwa-mem-tool.cwl
        ├── mixed-versions/
        │   └── test-index.yaml     # sub-index, referenced via $import
        ├── string-interpolation/
        │   └── test-index.yaml
        └── ...

Makefile Targets

make generate-cwl-conformance-tests   # download + generate
make update-cwl-conformance-tests     # clean + download + generate
make clean-cwl-conformance-tests      # remove downloaded dirs

Structure of `conformance_tests.yaml`

Each conformance_tests.yaml is a YAML list of test entries. Every entry describes one test case — a tool or workflow to run with specific inputs and expected outputs.

Entry Fields

Field	Required	Description
`id`	yes	Unique identifier (no spaces), e.g. `cl_basic_generation`
`doc`	yes	Unique human-readable description, used as lookup key at runtime
`tool`	yes	Relative path to the `.cwl` tool or workflow file
`job`	yes	Relative path to the input JSON file (or `null` / `tests/empty.json`)
`output`	yes	Expected output values for comparison
`tags`	yes	List of classification tags (see below)
`should_fail`	no	When `true`, the test expects the runner to report failure

Example Entries

A standard passing test:

- id: cl_basic_generation
  doc: General test of command line generation
  tool: tests/bwa-mem-tool.cwl
  job: tests/bwa-mem-job.json
  output:
    args: [bwa, mem, -t, '2', -I, '1,2,3,4', -m, '3',
      chr20.fa,
      example_human_Illumina.pe_1.fastq,
      example_human_Illumina.pe_2.fastq]
  tags: [ required, command_line_tool ]

A should_fail test (runtime failure expected):

- job: tests/empty.json
  tool: tests/echo-tool.cwl
  should_fail: true
  id: any_without_defaults_unspecified_fails
  doc: Test Any without defaults, unspecified, should fail.
  tags: [ command_line_tool, required ]

An intentionally invalid tool (schema-level invalid CWL):

- job: null
  tool: invalid-tool-v10.cwl
  id: invalid_syntax_v10_uses_v12_tool
  doc: test tool with v1.2 syntax marked as v1.0 (should fail)
  should_fail: true
  tags: [ command_line_tool, json_schema_invalid ]

`$import` Directives

v1.2’s conformance_tests.yaml uses $import to pull in sub-index files from subdirectories. These are standard CWL schema-salad $import references:

# In conformance_tests.yaml:
- $import: tests/string-interpolation/test-index.yaml
- $import: tests/conditionals/test-index.yaml
- $import: tests/secondaryfiles/test-index.yaml
- $import: tests/mixed-versions/test-index.yaml
- $import: tests/loadContents/test-index.yaml
- $import: tests/iwd/test-index.yaml
- $import: tests/scatter/test-index.yaml

Each test-index.yaml has the same structure as the top-level file. Tool paths within imported indexes are relative to that sub-directory.

v1.0 and v1.1 do not use $import at the top level.

Tag	Count	Meaning
`command_line_tool`	428	Tests a CommandLineTool
`workflow`	369	Tests a Workflow
`inline_javascript`	302	Requires InlineJavascriptRequirement
`required`	192	Required for minimal conformance
`scatter`	82	Tests scatter patterns
`expression_tool`	81	Tests an ExpressionTool
`initial_work_dir`	69	Tests InitialWorkDirRequirement
`shell_command`	64	Tests ShellCommandRequirement
`step_input`	53	Tests workflow step input features
`multiple_input`	51	Tests MultipleInputFeatureRequirement
`conditional`	46	Tests conditional workflow steps (v1.2)
`inputs_should_parse`	33	Tool definition is valid CWL even though test should fail
`subworkflow`	31	Tests SubworkflowFeatureRequirement
`docker`	30	Requires DockerRequirement
`resource`	27	Tests ResourceRequirement
`schema_def`	18	Tests SchemaDefRequirement
`timelimit`	18	Tests ToolTimeLimit requirement
`env_var`	12	Tests EnvVarRequirement
`format_checking`	8	Tests format validation
`input_object_requirements`	6	Tests input object requirements
`work_reuse`	5	Tests WorkReuse requirement
`networkaccess`	4	Tests NetworkAccess requirement
`inplace_update`	4	Tests InplaceUpdateRequirement
`json_schema_invalid`	4	Tool is intentionally invalid CWL schema
`load_listing`	3	Tests LoadListingRequirement
`secondary_files`	2	Tests secondaryFiles handling

Suite Sizes

Version	Total Entries	`should_fail`	`json_schema_invalid`	Unique Tools	Unique Workflows
v1.0	197	5	0	85	76
v1.1	253	18	0	122	88
v1.2	378	41	4	178	138

How Galaxy Uses the Conformance Tests

1. API Conformance Tests (Runtime Execution)

The primary use. Galaxy actually runs each conformance test against a live Galaxy server.

Generation: scripts/cwl_conformance_to_test_cases.py reads conformance_tests.yaml and generates a Python test class per version:

lib/galaxy_test/api/cwl/test_cwl_conformance_v1_0.py  (generated, gitignored)
lib/galaxy_test/api/cwl/test_cwl_conformance_v1_1.py  (generated, gitignored)
lib/galaxy_test/api/cwl/test_cwl_conformance_v1_2.py  (generated, gitignored)

Each generated test method looks like:

@pytest.mark.cwl_conformance
@pytest.mark.cwl_conformance_v1_0
@pytest.mark.required
@pytest.mark.command_line_tool
@pytest.mark.green
def test_conformance_v1_0_cl_basic_generation(self):
    """General test of command line generation"""
    self.cwl_populator.run_conformance_test("v1.0",
        "General test of command line generation")

Red/Green classification: The script maintains a hardcoded RED_TESTS dict mapping version -> list of test IDs known to fail in Galaxy. Tests not in the list get @pytest.mark.green; those in the list get @pytest.mark.red. Each red test is also annotated with # required or # not required comments.

Runtime: CwlPopulator.run_conformance_test(version, doc) in lib/galaxy_test/base/populators.py:

Looks up the test entry by matching doc string against conformance_tests.yaml
Resolves tool path and job input path relative to the conformance directory
Stages input files (uploads to Galaxy via the API)
If tool — dynamically creates it via create_tool_from_path() if not already loaded
If workflow — imports via import_workflow_from_path()
Executes via tool request API (POST /api/jobs) or workflow invocation
Compares outputs using cwltest.compare.compare() from the cwltest package

2. CI Pipeline

File: .github/workflows/cwl_conformance.yaml

Runs as a GitHub Actions matrix:

matrix:
  marker: ['green', 'red and required', 'red and not required']
  conformance-version: [cwl_conformance_v1_0, cwl_conformance_v1_1, cwl_conformance_v1_2]
  exclude:
    - marker: red and required
      conformance-version: cwl_conformance_v1_0

Green tests: must pass — CI failure blocks merge
Red and required: continue-on-error: true — tracked but non-blocking
Red and not required: continue-on-error: true — optional spec features

The CI command:

./run_tests.sh --coverage --skip_flakey_fails -cwl lib/galaxy_test/api/cwl \
  -- -m "${{ matrix.marker }} and ${{ matrix.conformance-version }}"

This starts a Galaxy test server, downloads conformance tools on first run, and executes the filtered subset of generated tests.

3. Tool Specification Loading Tests (Unit Tests)

File: test/unit/tool_util/test_cwl_tool_specification_loading.py

Tests that CWL tool files can be parsed into ToolParameterBundleModel objects — validating the type-mapping pipeline without running a Galaxy server.

Uses conformance_tests.yaml to discover tool files (rather than filesystem walking). This YAML-driven approach naturally excludes json_schema_invalid tools that are intentionally unparseable. See _conformance_cwl_tools() which iterates entries via conformance_tests_gen(), skips json_schema_invalid tags, deduplicates tool paths, and filters out workflows/$graph documents.

4. CWL Unit Tests

File: test/unit/tool_util/test_cwl.py

Lower-level tests that exercise the ToolProxy creation pipeline directly (cwltool loading, schema validation, proxy construction). These reference specific conformance tool files by path (e.g. v1.0/v1.0/cat1-testcli.cwl) and require the conformance tests to have been downloaded.

`conformance_tests_gen()` — The Shared Parser

Both the test generation script and runtime test infrastructure need to iterate conformance test entries, following $import directives. This is handled by conformance_tests_gen() in lib/galaxy/tool_util/unittest_utils/cwl_data.py:

def conformance_tests_gen(directory, filename="conformance_tests.yaml"):
    conformance_tests_path = os.path.join(directory, filename)
    with open(conformance_tests_path) as f:
        conformance_tests = yaml.safe_load(f)

    for conformance_test in conformance_tests:
        if "$import" in conformance_test:
            import_dir, import_filename = os.path.split(conformance_test["$import"])
            yield from conformance_tests_gen(
                os.path.join(directory, import_dir), import_filename)
        else:
            conformance_test["directory"] = directory
            yield conformance_test

Imported by:

lib/galaxy_test/base/populators.py — runtime conformance test execution
scripts/cwl_conformance_to_test_cases.py — pytest code generation
test/unit/tool_util/test_cwl_tool_specification_loading.py — tool loading tests

Relationship Between Conformance Entries and Tool Files

A single .cwl tool file may appear in multiple conformance entries (different input jobs testing different behaviors). Conversely, some .cwl files on disk may not be referenced by any entry (helper files, schema definitions, etc.).

The should_fail field and json_schema_invalid tag create important distinctions:

Category	`should_fail`	`json_schema_invalid`	Tool file valid?	Should load?
Normal test	no	no	yes	yes
Runtime failure test	yes	no	yes	yes
Schema-invalid test	yes	yes	no	no

Most should_fail tests use valid tool files with bad inputs — the tool itself is parseable. Only json_schema_invalid entries have intentionally broken .cwl files.

Galaxy-Specific Test Tools (Not From Conformance)

In addition to the downloaded conformance suites, Galaxy maintains its own CWL test tools that are committed to the repository:

Location	Count	Purpose
`test/functional/tools/parameters/cwl_*.cwl`	~10	CWL parameter type testing
`test/functional/tools/cwl_tools/v1.0_custom/`	~18	Galaxy-specific CWL features
`test/functional/tools/galactic_*.cwl`	2	`gx:Interface` hint testing

These are always available regardless of whether conformance tests have been downloaded.

DEPENDENCIES_CWL_CONFORMANCE_TESTS

CWL Conformance Tests in Galaxy

What Are CWL Conformance Tests?

How Galaxy Downloads the Tests

Shell Script: `scripts/update_cwl_conformance_tests.sh`

Resulting Directory Structure

Makefile Targets

Structure of `conformance_tests.yaml`

Entry Fields

Example Entries

`$import` Directives

Tags

Suite Sizes

How Galaxy Uses the Conformance Tests

1. API Conformance Tests (Runtime Execution)

2. CI Pipeline

3. Tool Specification Loading Tests (Unit Tests)

4. CWL Unit Tests

`conformance_tests_gen()` — The Shared Parser

Relationship Between Conformance Entries and Tool Files

Galaxy-Specific Test Tools (Not From Conformance)

DEPENDENCIES_CWL_CONFORMANCE_TESTS

CWL Conformance Tests in Galaxy

What Are CWL Conformance Tests?

How Galaxy Downloads the Tests

Shell Script: scripts/update_cwl_conformance_tests.sh

Resulting Directory Structure

Makefile Targets

Structure of conformance_tests.yaml

Entry Fields

Example Entries

$import Directives

Tags

Suite Sizes

How Galaxy Uses the Conformance Tests

1. API Conformance Tests (Runtime Execution)

2. CI Pipeline

3. Tool Specification Loading Tests (Unit Tests)

4. CWL Unit Tests

conformance_tests_gen() — The Shared Parser

Relationship Between Conformance Entries and Tool Files

Galaxy-Specific Test Tools (Not From Conformance)

Shell Script: `scripts/update_cwl_conformance_tests.sh`

Structure of `conformance_tests.yaml`

`$import` Directives

`conformance_tests_gen()` — The Shared Parser