
CWL Conformance Tests in Galaxy

How Galaxy downloads, structures, and uses the official CWL conformance test suites for validating its CWL runtime.


What Are CWL Conformance Tests?

The Common Workflow Language project maintains official conformance test suites for each CWL spec version. These suites are the canonical way for CWL implementations (cwltool, Toil, Arvados, Galaxy, etc.) to prove spec compliance. Each suite lives in its own GitHub repository:

| Version | Repository | Branch |
|---|---|---|
| v1.0 | common-workflow-language/common-workflow-language | main |
| v1.1 | common-workflow-language/cwl-v1.1 | main |
| v1.2 | common-workflow-language/cwl-v1.2 | main |

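A conformance suite consists of a conformance_tests.yaml index plus the .cwl tool and workflow files, input job files, and test data that its entries reference.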

Galaxy tests against all three versions.


How Galaxy Downloads the Tests

Shell Script: scripts/update_cwl_conformance_tests.sh

The conformance suites are not vendored or submoduled. They are downloaded on-demand and excluded from git via .gitignore:

# CWL conformance tests
lib/galaxy_test/api/cwl/test_cwl_conformance_v1_?.py
test/functional/tools/cwl_tools/v1.?/

The shell script downloads each version as a zip from GitHub:

wget https://github.com/common-workflow-language/${repo}/archive/main.zip

For each version it:

  1. Extracts the zip
  2. Copies conformance_tests.yaml into test/functional/tools/cwl_tools/v${version}/
  3. Copies the test tools/data directory alongside it
  4. Runs scripts/cwl_conformance_to_test_cases.py to generate Python test files

The v1.0 layout is slightly different from v1.1/v1.2 due to the older repo structure:

| Version | Conformance YAML source path | Tests dir source | Local tests dir |
|---|---|---|---|
| v1.0 | v1.0/conformance_test_v1.0.yaml | v1.0/v1.0/ | cwl_tools/v1.0/v1.0/ |
| v1.1 | conformance_tests.yaml | tests/ | cwl_tools/v1.1/tests/ |
| v1.2 | conformance_tests.yaml | tests/ | cwl_tools/v1.2/tests/ |
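
The per-version layout can also be written down as data. The following is a minimal sketch, not the shell script's actual logic: the names `LAYOUT`, `CWL_TOOLS_ROOT`, and `local_paths` are hypothetical, and the values are transcribed from the table above.

```python
from pathlib import PurePosixPath

# Hypothetical names for illustration; the values come from the layout table.
CWL_TOOLS_ROOT = PurePosixPath("test/functional/tools/cwl_tools")

LAYOUT = {
    "1.0": {
        "yaml_source": "v1.0/conformance_test_v1.0.yaml",
        "tests_source": "v1.0/v1.0",
        "local_tests_dir": "v1.0",
    },
    "1.1": {
        "yaml_source": "conformance_tests.yaml",
        "tests_source": "tests",
        "local_tests_dir": "tests",
    },
    "1.2": {
        "yaml_source": "conformance_tests.yaml",
        "tests_source": "tests",
        "local_tests_dir": "tests",
    },
}


def local_paths(version):
    """Return (local conformance YAML path, local tests dir) for a CWL version."""
    base = CWL_TOOLS_ROOT / f"v{version}"
    # The YAML is always stored locally as conformance_tests.yaml,
    # whatever its name was in the upstream repository.
    return base / "conformance_tests.yaml", base / LAYOUT[version]["local_tests_dir"]
```

For example, `local_paths("1.0")` yields the `v1.0/v1.0` tests directory, while `local_paths("1.2")` yields `v1.2/tests`.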

Resulting Directory Structure

test/functional/tools/cwl_tools/
├── v1.0_custom/                    # committed — Galaxy-specific CWL test tools
├── v1.0/                           # gitignored — downloaded
│   ├── conformance_tests.yaml
│   └── v1.0/                       # tool/workflow files + test data
│       ├── bwa-mem-tool.cwl
│       ├── cat1-testcli.cwl
│       ├── bwa-mem-job.json
│       └── ...
├── v1.1/                           # gitignored — downloaded
│   ├── conformance_tests.yaml
│   └── tests/
│       ├── bwa-mem-tool.cwl
│       └── ...
└── v1.2/                           # gitignored — downloaded
    ├── conformance_tests.yaml
    └── tests/
        ├── bwa-mem-tool.cwl
        ├── mixed-versions/
        │   └── test-index.yaml     # sub-index, referenced via $import
        ├── string-interpolation/
        │   └── test-index.yaml
        └── ...

Makefile Targets

make generate-cwl-conformance-tests   # download + generate
make update-cwl-conformance-tests     # clean + download + generate
make clean-cwl-conformance-tests      # remove downloaded dirs

Structure of conformance_tests.yaml

Each conformance_tests.yaml is a YAML list of test entries. Every entry describes one test case — a tool or workflow to run with specific inputs and expected outputs.

Entry Fields

| Field | Required | Description |
|---|---|---|
| id | yes | Unique identifier (no spaces), e.g. cl_basic_generation |
| doc | yes | Unique human-readable description, used as lookup key at runtime |
| tool | yes | Relative path to the .cwl tool or workflow file |
| job | yes | Relative path to the input JSON file (or null / tests/empty.json) |
| output | yes | Expected output values for comparison |
| tags | yes | List of classification tags (see below) |
| should_fail | no | When true, the test expects the runner to report failure |

Example Entries

A standard passing test:

- id: cl_basic_generation
  doc: General test of command line generation
  tool: tests/bwa-mem-tool.cwl
  job: tests/bwa-mem-job.json
  output:
    args: [bwa, mem, -t, '2', -I, '1,2,3,4', -m, '3',
      chr20.fa,
      example_human_Illumina.pe_1.fastq,
      example_human_Illumina.pe_2.fastq]
  tags: [ required, command_line_tool ]

A should_fail test (runtime failure expected):

- job: tests/empty.json
  tool: tests/echo-tool.cwl
  should_fail: true
  id: any_without_defaults_unspecified_fails
  doc: Test Any without defaults, unspecified, should fail.
  tags: [ command_line_tool, required ]

An intentionally invalid tool (schema-level invalid CWL):

- job: null
  tool: invalid-tool-v10.cwl
  id: invalid_syntax_v10_uses_v12_tool
  doc: test tool with v1.2 syntax marked as v1.0 (should fail)
  should_fail: true
  tags: [ command_line_tool, json_schema_invalid ]

$import Directives

v1.2’s conformance_tests.yaml uses $import to pull in sub-index files from subdirectories. These are standard CWL schema-salad $import references:

# In conformance_tests.yaml:
- $import: tests/string-interpolation/test-index.yaml
- $import: tests/conditionals/test-index.yaml
- $import: tests/secondaryfiles/test-index.yaml
- $import: tests/mixed-versions/test-index.yaml
- $import: tests/loadContents/test-index.yaml
- $import: tests/iwd/test-index.yaml
- $import: tests/scatter/test-index.yaml

Each test-index.yaml has the same structure as the top-level file. Tool paths within imported indexes are relative to that sub-directory.

v1.0 and v1.1 do not use $import at the top level.

Tags

Tags classify tests by feature area. Galaxy uses them for pytest markers and CI matrix filtering. Complete tag inventory (across all 3 versions, 828 total entries):

| Tag | Count | Meaning |
|---|---|---|
| command_line_tool | 428 | Tests a CommandLineTool |
| workflow | 369 | Tests a Workflow |
| inline_javascript | 302 | Requires InlineJavascriptRequirement |
| required | 192 | Required for minimal conformance |
| scatter | 82 | Tests scatter patterns |
| expression_tool | 81 | Tests an ExpressionTool |
| initial_work_dir | 69 | Tests InitialWorkDirRequirement |
| shell_command | 64 | Tests ShellCommandRequirement |
| step_input | 53 | Tests workflow step input features |
| multiple_input | 51 | Tests MultipleInputFeatureRequirement |
| conditional | 46 | Tests conditional workflow steps (v1.2) |
| inputs_should_parse | 33 | Tool definition is valid CWL even though test should fail |
| subworkflow | 31 | Tests SubworkflowFeatureRequirement |
| docker | 30 | Requires DockerRequirement |
| resource | 27 | Tests ResourceRequirement |
| schema_def | 18 | Tests SchemaDefRequirement |
| timelimit | 18 | Tests ToolTimeLimit requirement |
| env_var | 12 | Tests EnvVarRequirement |
| format_checking | 8 | Tests format validation |
| input_object_requirements | 6 | Tests input object requirements |
| work_reuse | 5 | Tests WorkReuse requirement |
| networkaccess | 4 | Tests NetworkAccess requirement |
| inplace_update | 4 | Tests InplaceUpdateRequirement |
| json_schema_invalid | 4 | Tool is intentionally invalid CWL schema |
| load_listing | 3 | Tests LoadListingRequirement |
| secondary_files | 2 | Tests secondaryFiles handling |

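Key semantic tags: required marks the tests needed for minimal conformance, json_schema_invalid marks entries whose tool file is intentionally invalid CWL, and inputs_should_parse marks failing tests whose tool definition is still valid CWL.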

Suite Sizes

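| Version | Total Entries | should_fail | json_schema_invalid | Unique Tools | Unique Workflows |
|---|---|---|---|---|---|
| v1.0 | 197 | 5 | 0 | 85 | 76 |
| v1.1 | 253 | 18 | 0 | 122 | 88 |
| v1.2 | 378 | 41 | 4 | 178 | 138 |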
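
Counts like these can be derived from the parsed entries. The sketch below is illustrative only: `suite_stats` is a hypothetical helper, and the sample entries are the three examples shown earlier in this document. It does not split tools from workflows, since that would require reading the `class` of each .cwl file.

```python
from collections import Counter


def suite_stats(entries):
    """Summarize a list of conformance entries (dicts parsed from the YAML)."""
    tag_counts = Counter(tag for entry in entries for tag in entry.get("tags", []))
    return {
        "total": len(entries),
        "should_fail": sum(1 for entry in entries if entry.get("should_fail")),
        "json_schema_invalid": tag_counts["json_schema_invalid"],
        # Distinct tool paths; separating tools from workflows is out of scope here.
        "unique_tool_paths": len({entry["tool"] for entry in entries}),
        "tag_counts": tag_counts,
    }


# The three example entries from this document, trimmed to the fields used.
SAMPLE_ENTRIES = [
    {"id": "cl_basic_generation", "tool": "tests/bwa-mem-tool.cwl",
     "tags": ["required", "command_line_tool"]},
    {"id": "any_without_defaults_unspecified_fails", "tool": "tests/echo-tool.cwl",
     "should_fail": True, "tags": ["command_line_tool", "required"]},
    {"id": "invalid_syntax_v10_uses_v12_tool", "tool": "invalid-tool-v10.cwl",
     "should_fail": True, "tags": ["command_line_tool", "json_schema_invalid"]},
]

stats = suite_stats(SAMPLE_ENTRIES)
```

Summing the Total Entries column (197 + 253 + 378) gives the 828 entries quoted in the tag inventory.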

How Galaxy Uses the Conformance Tests

1. API Conformance Tests (Runtime Execution)

This is the primary use: Galaxy runs each conformance test against a live Galaxy server.

Generation: scripts/cwl_conformance_to_test_cases.py reads conformance_tests.yaml and generates a Python test class per version:

lib/galaxy_test/api/cwl/test_cwl_conformance_v1_0.py  (generated, gitignored)
lib/galaxy_test/api/cwl/test_cwl_conformance_v1_1.py  (generated, gitignored)
lib/galaxy_test/api/cwl/test_cwl_conformance_v1_2.py  (generated, gitignored)

Each generated test method looks like:

@pytest.mark.cwl_conformance
@pytest.mark.cwl_conformance_v1_0
@pytest.mark.required
@pytest.mark.command_line_tool
@pytest.mark.green
def test_conformance_v1_0_cl_basic_generation(self):
    """General test of command line generation"""
    self.cwl_populator.run_conformance_test("v1.0",
        "General test of command line generation")

Red/Green classification: The script maintains a hardcoded RED_TESTS dict mapping version -> list of test IDs known to fail in Galaxy. Tests not in the list get @pytest.mark.green; those in the list get @pytest.mark.red. Each red test is also annotated with # required or # not required comments.

Runtime: CwlPopulator.run_conformance_test(version, doc) in lib/galaxy_test/base/populators.py:

  1. Looks up the test entry by matching doc string against conformance_tests.yaml
  2. Resolves tool path and job input path relative to the conformance directory
  3. Stages input files (uploads to Galaxy via the API)
  4. If tool — dynamically creates it via create_tool_from_path() if not already loaded
  5. If workflow — imports via import_workflow_from_path()
  6. Executes via tool request API (POST /api/jobs) or workflow invocation
  7. Compares outputs using cwltest.compare.compare() from the cwltest package
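
Steps 1 and 2 can be sketched in isolation. The helpers below are hypothetical and are not taken from populators.py; they assume each entry carries the `directory` key added by conformance_tests_gen(), and raising on a missing or duplicated doc string is a choice made for this sketch.

```python
import posixpath


def find_entry_by_doc(entries, doc):
    """Return the single entry whose doc string equals `doc`."""
    matches = [entry for entry in entries if entry.get("doc") == doc]
    if len(matches) != 1:
        # Strictness about ambiguity is an assumption of this sketch.
        raise LookupError(f"expected exactly one entry for {doc!r}, found {len(matches)}")
    return matches[0]


def resolve_paths(entry):
    """Resolve the tool and job paths relative to the entry's directory."""
    directory = entry["directory"]
    tool_path = posixpath.join(directory, entry["tool"])
    job = entry.get("job")
    # Some entries declare `job: null`; there is nothing to resolve then.
    job_path = posixpath.join(directory, job) if job else None
    return tool_path, job_path


ENTRIES = [
    {"doc": "General test of command line generation",
     "tool": "tests/bwa-mem-tool.cwl", "job": "tests/bwa-mem-job.json",
     "directory": "test/functional/tools/cwl_tools/v1.2"},
    {"doc": "test tool with v1.2 syntax marked as v1.0 (should fail)",
     "tool": "invalid-tool-v10.cwl", "job": None,
     "directory": "test/functional/tools/cwl_tools/v1.2"},
]

entry = find_entry_by_doc(ENTRIES, "General test of command line generation")
tool_path, job_path = resolve_paths(entry)
```

Because the lookup key is the doc string, two entries sharing the same doc would be indistinguishable at runtime, which is why the field is required to be unique.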

2. CI Pipeline

File: .github/workflows/cwl_conformance.yaml

Runs as a GitHub Actions matrix:

matrix:
  marker: ['green', 'red and required', 'red and not required']
  conformance-version: [cwl_conformance_v1_0, cwl_conformance_v1_1, cwl_conformance_v1_2]
  exclude:
    - marker: red and required
      conformance-version: cwl_conformance_v1_0

The CI command:

./run_tests.sh --coverage --skip_flakey_fails -cwl lib/galaxy_test/api/cwl \
  -- -m "${{ matrix.marker }} and ${{ matrix.conformance-version }}"

This starts a Galaxy test server, downloads conformance tools on first run, and executes the filtered subset of generated tests.
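
To see which pytest marker expressions the matrix produces, the expansion can be reproduced in a few lines. This is a sketch of how a GitHub Actions matrix with an exclude rule expands, using the values shown above; the constant and function names are made up for illustration.

```python
from itertools import product

MARKERS = ["green", "red and required", "red and not required"]
VERSIONS = ["cwl_conformance_v1_0", "cwl_conformance_v1_1", "cwl_conformance_v1_2"]
# Mirrors the single `exclude` rule in the workflow matrix.
EXCLUDE = {("red and required", "cwl_conformance_v1_0")}


def marker_expressions():
    """Return the `-m` expression for every job the matrix would run."""
    return [
        f"{marker} and {version}"
        for marker, version in product(MARKERS, VERSIONS)
        if (marker, version) not in EXCLUDE
    ]


for expression in marker_expressions():
    print(expression)
```

The nine combinations minus the one excluded pair leave eight jobs. In an expression such as `red and not required and cwl_conformance_v1_2`, pytest applies `not` to `required` only, so the job selects red tests that lack the required marker.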

3. Tool Specification Loading Tests (Unit Tests)

File: test/unit/tool_util/test_cwl_tool_specification_loading.py

Tests that CWL tool files can be parsed into ToolParameterBundleModel objects — validating the type-mapping pipeline without running a Galaxy server.

The test module uses conformance_tests.yaml to discover tool files rather than walking the filesystem. This YAML-driven approach naturally excludes json_schema_invalid tools, which are intentionally unparseable. See _conformance_cwl_tools(), which iterates entries via conformance_tests_gen(), skips entries tagged json_schema_invalid, deduplicates tool paths, and filters out workflows and $graph documents.
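
The skip-and-deduplicate part of that discovery can be sketched as below. This is not the body of _conformance_cwl_tools(); `discover_tool_paths` is a hypothetical stand-in that works on already-parsed entries and omits the workflow and $graph filtering, which requires opening each file.

```python
import posixpath


def discover_tool_paths(entries):
    """Return unique tool paths, in first-seen order, skipping schema-invalid entries."""
    seen = set()
    paths = []
    for entry in entries:
        if "json_schema_invalid" in entry.get("tags", []):
            continue  # intentionally unparseable tool file
        path = posixpath.join(entry["directory"], entry["tool"])
        if path not in seen:
            seen.add(path)
            paths.append(path)
    return paths


ROOT = "test/functional/tools/cwl_tools/v1.2"
ENTRIES_FOR_DISCOVERY = [
    {"tool": "tests/bwa-mem-tool.cwl", "directory": ROOT, "tags": ["required"]},
    {"tool": "tests/echo-tool.cwl", "directory": ROOT, "tags": ["required"]},
    # Same tool referenced again by a different test case.
    {"tool": "tests/echo-tool.cwl", "directory": ROOT, "tags": ["command_line_tool"]},
    {"tool": "invalid-tool-v10.cwl", "directory": ROOT, "tags": ["json_schema_invalid"]},
]

tool_paths = discover_tool_paths(ENTRIES_FOR_DISCOVERY)
```

Keeping first-seen order makes the resulting list stable across runs, which helps when the paths are used as parametrized test IDs.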

4. CWL Unit Tests

File: test/unit/tool_util/test_cwl.py

Lower-level tests that exercise the ToolProxy creation pipeline directly (cwltool loading, schema validation, proxy construction). These reference specific conformance tool files by path (e.g. v1.0/v1.0/cat1-testcli.cwl) and require the conformance tests to have been downloaded.


conformance_tests_gen() — The Shared Parser

Both the test generation script and runtime test infrastructure need to iterate conformance test entries, following $import directives. This is handled by conformance_tests_gen() in lib/galaxy/tool_util/unittest_utils/cwl_data.py:

import os

import yaml


def conformance_tests_gen(directory, filename="conformance_tests.yaml"):
    """Yield every test entry under `directory`, following $import directives."""
    conformance_tests_path = os.path.join(directory, filename)
    with open(conformance_tests_path) as f:
        conformance_tests = yaml.safe_load(f)

    for conformance_test in conformance_tests:
        if "$import" in conformance_test:
            # Recurse into the sub-index, using its own directory as the new base.
            import_dir, import_filename = os.path.split(conformance_test["$import"])
            yield from conformance_tests_gen(
                os.path.join(directory, import_dir), import_filename)
        else:
            # Record where the entry was found so tool/job paths can be resolved later.
            conformance_test["directory"] = directory
            yield conformance_test

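Imported by: scripts/cwl_conformance_to_test_cases.py (test generation), lib/galaxy_test/base/populators.py (runtime lookup), and test/unit/tool_util/test_cwl_tool_specification_loading.py (tool discovery).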


Relationship Between Conformance Entries and Tool Files

A single .cwl tool file may appear in multiple conformance entries (different input jobs testing different behaviors). Conversely, some .cwl files on disk may not be referenced by any entry (helper files, schema definitions, etc.).

The should_fail field and json_schema_invalid tag create important distinctions:

| Category | should_fail | json_schema_invalid | Tool file valid? | Should load? |
|---|---|---|---|---|
| Normal test | no | no | yes | yes |
| Runtime failure test | yes | no | yes | yes |
| Schema-invalid test | yes | yes | no | no |

Most should_fail tests use valid tool files with bad inputs — the tool itself is parseable. Only json_schema_invalid entries have intentionally broken .cwl files.
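
The three categories can be expressed as a small classifier. This is an illustrative sketch with hypothetical function names, not code from Galaxy; it reads only the should_fail field and the tags list of an entry.

```python
def classify(entry):
    """Map a conformance entry to one of the three categories in the table."""
    if "json_schema_invalid" in entry.get("tags", []):
        return "schema-invalid"
    if entry.get("should_fail", False):
        return "runtime-failure"
    return "normal"


def tool_should_load(entry):
    """Only schema-invalid entries reference a tool file that must not parse."""
    return classify(entry) != "schema-invalid"


normal = {"tags": ["required", "command_line_tool"]}
runtime_failure = {"should_fail": True, "tags": ["command_line_tool", "required"]}
schema_invalid = {"should_fail": True, "tags": ["command_line_tool", "json_schema_invalid"]}
```

A loader test can therefore keep every entry for which `tool_should_load` is true, including the runtime failure tests, whose tool files are still valid CWL.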


Galaxy-Specific Test Tools (Not From Conformance)

In addition to the downloaded conformance suites, Galaxy maintains its own CWL test tools that are committed to the repository:

| Location | Count | Purpose |
|---|---|---|
| test/functional/tools/parameters/cwl_*.cwl | ~10 | CWL parameter type testing |
| test/functional/tools/cwl_tools/v1.0_custom/ | ~18 | Galaxy-specific CWL features |
| test/functional/tools/galactic_*.cwl | 2 | gx:Interface hint testing |

These are always available regardless of whether conformance tests have been downloaded.