CWL Conformance Tests in Galaxy
How Galaxy downloads, structures, and uses the official CWL conformance test suites for validating its CWL runtime.
What Are CWL Conformance Tests?
The Common Workflow Language project maintains official conformance test suites for each CWL spec version. These suites are the canonical way for CWL implementations (cwltool, Toil, Arvados, Galaxy, etc.) to prove spec compliance. Each suite lives in its own GitHub repository:
| Version | Repository | Branch |
|---|---|---|
| v1.0 | common-workflow-language/common-workflow-language | main |
| v1.1 | common-workflow-language/cwl-v1.1 | main |
| v1.2 | common-workflow-language/cwl-v1.2 | main |
A conformance suite consists of:
conformance_tests.yaml— the test manifest describing every test case- A
tests/directory — CWL tool/workflow files, input job JSON, and test data referenced by the manifest
Galaxy tests against all three versions.
How Galaxy Downloads the Tests
Shell Script: scripts/update_cwl_conformance_tests.sh
The conformance suites are not vendored or submoduled. They are downloaded
on-demand and excluded from git via .gitignore:
# CWL conformance tests
lib/galaxy_test/api/cwl/test_cwl_conformance_v1_?.py
test/functional/tools/cwl_tools/v1.?/
The shell script downloads each version as a zip from GitHub:
wget https://github.com/common-workflow-language/${repo}/archive/main.zip
For each version it:
- Extracts the zip
- Copies
conformance_tests.yamlintotest/functional/tools/cwl_tools/v${version}/ - Copies the test tools/data directory alongside it
- Runs
scripts/cwl_conformance_to_test_cases.pyto generate Python test files
The v1.0 layout is slightly different from v1.1/v1.2 due to the older repo structure:
| Version | Conformance YAML source path | Tests dir source | Local tests dir |
|---|---|---|---|
| v1.0 | v1.0/conformance_test_v1.0.yaml | v1.0/v1.0/ | cwl_tools/v1.0/v1.0/ |
| v1.1 | conformance_tests.yaml | tests/ | cwl_tools/v1.1/tests/ |
| v1.2 | conformance_tests.yaml | tests/ | cwl_tools/v1.2/tests/ |
Resulting Directory Structure
test/functional/tools/cwl_tools/
├── v1.0_custom/ # committed — Galaxy-specific CWL test tools
├── v1.0/ # gitignored — downloaded
│ ├── conformance_tests.yaml
│ └── v1.0/ # tool/workflow files + test data
│ ├── bwa-mem-tool.cwl
│ ├── cat1-testcli.cwl
│ ├── bwa-mem-job.json
│ └── ...
├── v1.1/ # gitignored — downloaded
│ ├── conformance_tests.yaml
│ └── tests/
│ ├── bwa-mem-tool.cwl
│ └── ...
└── v1.2/ # gitignored — downloaded
├── conformance_tests.yaml
└── tests/
├── bwa-mem-tool.cwl
├── mixed-versions/
│ └── test-index.yaml # sub-index, referenced via $import
├── string-interpolation/
│ └── test-index.yaml
└── ...
Makefile Targets
make generate-cwl-conformance-tests # download + generate
make update-cwl-conformance-tests # clean + download + generate
make clean-cwl-conformance-tests # remove downloaded dirs
Structure of conformance_tests.yaml
Each conformance_tests.yaml is a YAML list of test entries. Every entry describes
one test case — a tool or workflow to run with specific inputs and expected outputs.
Entry Fields
| Field | Required | Description |
|---|---|---|
id | yes | Unique identifier (no spaces), e.g. cl_basic_generation |
doc | yes | Unique human-readable description, used as lookup key at runtime |
tool | yes | Relative path to the .cwl tool or workflow file |
job | yes | Relative path to the input JSON file (or null / tests/empty.json) |
output | yes | Expected output values for comparison |
tags | yes | List of classification tags (see below) |
should_fail | no | When true, the test expects the runner to report failure |
Example Entries
A standard passing test:
- id: cl_basic_generation
doc: General test of command line generation
tool: tests/bwa-mem-tool.cwl
job: tests/bwa-mem-job.json
output:
args: [bwa, mem, -t, '2', -I, '1,2,3,4', -m, '3',
chr20.fa,
example_human_Illumina.pe_1.fastq,
example_human_Illumina.pe_2.fastq]
tags: [ required, command_line_tool ]
A should_fail test (runtime failure expected):
- job: tests/empty.json
tool: tests/echo-tool.cwl
should_fail: true
id: any_without_defaults_unspecified_fails
doc: Test Any without defaults, unspecified, should fail.
tags: [ command_line_tool, required ]
An intentionally invalid tool (schema-level invalid CWL):
- job: null
tool: invalid-tool-v10.cwl
id: invalid_syntax_v10_uses_v12_tool
doc: test tool with v1.2 syntax marked as v1.0 (should fail)
should_fail: true
tags: [ command_line_tool, json_schema_invalid ]
$import Directives
v1.2’s conformance_tests.yaml uses $import to pull in sub-index files from
subdirectories. These are standard CWL schema-salad $import references:
# In conformance_tests.yaml:
- $import: tests/string-interpolation/test-index.yaml
- $import: tests/conditionals/test-index.yaml
- $import: tests/secondaryfiles/test-index.yaml
- $import: tests/mixed-versions/test-index.yaml
- $import: tests/loadContents/test-index.yaml
- $import: tests/iwd/test-index.yaml
- $import: tests/scatter/test-index.yaml
Each test-index.yaml has the same structure as the top-level file. Tool paths
within imported indexes are relative to that sub-directory.
v1.0 and v1.1 do not use $import at the top level.
Tags
Tags classify tests by feature area. Galaxy uses them for pytest markers and CI matrix filtering. Complete tag inventory (across all 3 versions, 828 total entries):
| Tag | Count | Meaning |
|---|---|---|
command_line_tool | 428 | Tests a CommandLineTool |
workflow | 369 | Tests a Workflow |
inline_javascript | 302 | Requires InlineJavascriptRequirement |
required | 192 | Required for minimal conformance |
scatter | 82 | Tests scatter patterns |
expression_tool | 81 | Tests an ExpressionTool |
initial_work_dir | 69 | Tests InitialWorkDirRequirement |
shell_command | 64 | Tests ShellCommandRequirement |
step_input | 53 | Tests workflow step input features |
multiple_input | 51 | Tests MultipleInputFeatureRequirement |
conditional | 46 | Tests conditional workflow steps (v1.2) |
inputs_should_parse | 33 | Tool definition is valid CWL even though test should fail |
subworkflow | 31 | Tests SubworkflowFeatureRequirement |
docker | 30 | Requires DockerRequirement |
resource | 27 | Tests ResourceRequirement |
schema_def | 18 | Tests SchemaDefRequirement |
timelimit | 18 | Tests ToolTimeLimit requirement |
env_var | 12 | Tests EnvVarRequirement |
format_checking | 8 | Tests format validation |
input_object_requirements | 6 | Tests input object requirements |
work_reuse | 5 | Tests WorkReuse requirement |
networkaccess | 4 | Tests NetworkAccess requirement |
inplace_update | 4 | Tests InplaceUpdateRequirement |
json_schema_invalid | 4 | Tool is intentionally invalid CWL schema |
load_listing | 3 | Tests LoadListingRequirement |
secondary_files | 2 | Tests secondaryFiles handling |
Key semantic tags:
requiredvs absent: whether the test is needed for minimal spec conformancejson_schema_invalid: the.cwlfile itself is intentionally malformed — the test verifies runners reject it. These should NOT be loaded as valid tools.inputs_should_parse(v1.2 only): the tool definition is parseable CWL, but the test fails at runtime (bad inputs, time exceeded, etc.)should_fail(entry field, not a tag): the test expects execution failure — but the tool may still be valid CWL
Suite Sizes
| Version | Total Entries | should_fail | json_schema_invalid | Unique Tools | Unique Workflows |
|---|---|---|---|---|---|
| v1.0 | 197 | 5 | 0 | 85 | 76 |
| v1.1 | 253 | 18 | 0 | 122 | 88 |
| v1.2 | 378 | 41 | 4 | 178 | 138 |
How Galaxy Uses the Conformance Tests
1. API Conformance Tests (Runtime Execution)
The primary use. Galaxy actually runs each conformance test against a live Galaxy server.
Generation: scripts/cwl_conformance_to_test_cases.py reads conformance_tests.yaml
and generates a Python test class per version:
lib/galaxy_test/api/cwl/test_cwl_conformance_v1_0.py (generated, gitignored)
lib/galaxy_test/api/cwl/test_cwl_conformance_v1_1.py (generated, gitignored)
lib/galaxy_test/api/cwl/test_cwl_conformance_v1_2.py (generated, gitignored)
Each generated test method looks like:
@pytest.mark.cwl_conformance
@pytest.mark.cwl_conformance_v1_0
@pytest.mark.required
@pytest.mark.command_line_tool
@pytest.mark.green
def test_conformance_v1_0_cl_basic_generation(self):
"""General test of command line generation"""
self.cwl_populator.run_conformance_test("v1.0",
"General test of command line generation")
Red/Green classification: The script maintains a hardcoded RED_TESTS dict mapping
version -> list of test IDs known to fail in Galaxy. Tests not in the list get
@pytest.mark.green; those in the list get @pytest.mark.red. Each red test is also
annotated with # required or # not required comments.
Runtime: CwlPopulator.run_conformance_test(version, doc) in
lib/galaxy_test/base/populators.py:
- Looks up the test entry by matching
docstring againstconformance_tests.yaml - Resolves tool path and job input path relative to the conformance directory
- Stages input files (uploads to Galaxy via the API)
- If tool — dynamically creates it via
create_tool_from_path()if not already loaded - If workflow — imports via
import_workflow_from_path() - Executes via tool request API (
POST /api/jobs) or workflow invocation - Compares outputs using
cwltest.compare.compare()from thecwltestpackage
2. CI Pipeline
File: .github/workflows/cwl_conformance.yaml
Runs as a GitHub Actions matrix:
matrix:
marker: ['green', 'red and required', 'red and not required']
conformance-version: [cwl_conformance_v1_0, cwl_conformance_v1_1, cwl_conformance_v1_2]
exclude:
- marker: red and required
conformance-version: cwl_conformance_v1_0
- Green tests: must pass — CI failure blocks merge
- Red and required:
continue-on-error: true— tracked but non-blocking - Red and not required:
continue-on-error: true— optional spec features
The CI command:
./run_tests.sh --coverage --skip_flakey_fails -cwl lib/galaxy_test/api/cwl \
-- -m "${{ matrix.marker }} and ${{ matrix.conformance-version }}"
This starts a Galaxy test server, downloads conformance tools on first run, and executes the filtered subset of generated tests.
3. Tool Specification Loading Tests (Unit Tests)
File: test/unit/tool_util/test_cwl_tool_specification_loading.py
Tests that CWL tool files can be parsed into ToolParameterBundleModel objects —
validating the type-mapping pipeline without running a Galaxy server.
Uses conformance_tests.yaml to discover tool files (rather than filesystem walking).
This YAML-driven approach naturally excludes json_schema_invalid tools that are
intentionally unparseable. See _conformance_cwl_tools() which iterates entries via
conformance_tests_gen(), skips json_schema_invalid tags, deduplicates tool paths,
and filters out workflows/$graph documents.
4. CWL Unit Tests
File: test/unit/tool_util/test_cwl.py
Lower-level tests that exercise the ToolProxy creation pipeline directly (cwltool
loading, schema validation, proxy construction). These reference specific conformance
tool files by path (e.g. v1.0/v1.0/cat1-testcli.cwl) and require the conformance
tests to have been downloaded.
conformance_tests_gen() — The Shared Parser
Both the test generation script and runtime test infrastructure need to iterate
conformance test entries, following $import directives. This is handled by
conformance_tests_gen() in lib/galaxy/tool_util/unittest_utils/cwl_data.py:
def conformance_tests_gen(directory, filename="conformance_tests.yaml"):
conformance_tests_path = os.path.join(directory, filename)
with open(conformance_tests_path) as f:
conformance_tests = yaml.safe_load(f)
for conformance_test in conformance_tests:
if "$import" in conformance_test:
import_dir, import_filename = os.path.split(conformance_test["$import"])
yield from conformance_tests_gen(
os.path.join(directory, import_dir), import_filename)
else:
conformance_test["directory"] = directory
yield conformance_test
Imported by:
lib/galaxy_test/base/populators.py— runtime conformance test executionscripts/cwl_conformance_to_test_cases.py— pytest code generationtest/unit/tool_util/test_cwl_tool_specification_loading.py— tool loading tests
Relationship Between Conformance Entries and Tool Files
A single .cwl tool file may appear in multiple conformance entries (different input
jobs testing different behaviors). Conversely, some .cwl files on disk may not be
referenced by any entry (helper files, schema definitions, etc.).
The should_fail field and json_schema_invalid tag create important distinctions:
| Category | should_fail | json_schema_invalid | Tool file valid? | Should load? |
|---|---|---|---|---|
| Normal test | no | no | yes | yes |
| Runtime failure test | yes | no | yes | yes |
| Schema-invalid test | yes | yes | no | no |
Most should_fail tests use valid tool files with bad inputs — the tool itself is
parseable. Only json_schema_invalid entries have intentionally broken .cwl files.
Galaxy-Specific Test Tools (Not From Conformance)
In addition to the downloaded conformance suites, Galaxy maintains its own CWL test tools that are committed to the repository:
| Location | Count | Purpose |
|---|---|---|
test/functional/tools/parameters/cwl_*.cwl | ~10 | CWL parameter type testing |
test/functional/tools/cwl_tools/v1.0_custom/ | ~18 | Galaxy-specific CWL features |
test/functional/tools/galactic_*.cwl | 2 | gx:Interface hint testing |
These are always available regardless of whether conformance tests have been downloaded.