Galaxy Workflow Expression Evaluation Context
Overview
Galaxy’s workflow expression system is CWL-based (Common Workflow Language v1.2.1). Expressions are JavaScript, evaluated via Node.js using cwl_utils.expression.do_eval(). In workflow scheduling, the primary use case is the when expression, which controls conditional step execution. value_from expressions are collected but not currently evaluated in the workflow scheduling function.
Key Files
| Purpose | Path | Lines |
|---|---|---|
| Expression evaluation entry point | lib/galaxy/workflow/modules.py | 257-310 |
| Galaxy-to-CWL type conversion | lib/galaxy/workflow/modules.py | 162-239 |
| CWL-to-Galaxy reverse conversion | lib/galaxy/workflow/modules.py | 241-254 |
| Evaluation engine wrapper | lib/galaxy/tools/expressions/evaluation.py | 21-47 |
| JavaScript VM context | lib/galaxy/tools/expressions/cwlNodeEngine.js | 1-46 |
| File properties utility | lib/galaxy/tool_util/cwl/util.py | 42-45 |
Expression Syntax
- $(expression): current CWL expression syntax
- ${expression}: legacy block syntax (Galaxy 23.0), auto-converted to $() at evaluation time
Available Variables
Galaxy’s cwlNodeEngine.js defines these globals in the JavaScript VM context:
| Variable | Contents |
|---|---|
| $job | All connected step inputs, converted to CWL types |
| $self | Current output/step metadata context |
| $runtime | Runtime environment info |
| $tmpdir | Temporary directory path |
| $outdir | Output directory path |
The inputs alias (as used in $(inputs.foo)) is provided by cwl_utils.expression.do_eval() upstream, not by Galaxy’s own code. Galaxy passes the step_state dict as the jobinput parameter, and cwl_utils makes it accessible as both $job and inputs in the expression context.
In practice, inputs is the primary variable used in workflow when expressions.
Type Mapping: Galaxy to Expression Context
The to_cwl() function (modules.py:162-239) converts Galaxy runtime values into CWL types:
| Galaxy Type | Expression Type | Details |
|---|---|---|
| HDA (extension != expression.json) | File object | Has .path, .location, .format, .basename, .nameroot, .nameext |
| HDA (extension == expression.json) | Deserialized JSON | Parsed via json.load() — can be any JSON type |
| HDCA (list type) | Array | [File, File, ...] recursively converted |
| HDCA (record/other) | Object | {identifier: File, ...} keyed by element_identifier |
| NoReplacement sentinel | null | Missing optional inputs |
| Text parameter | String | Primitive |
| Integer parameter | Number | Primitive |
| Float parameter | Number | Primitive |
| Boolean parameter | Boolean | true/false |
| RuntimeValue | null | Unresolved runtime values (via is_runtime_value() check) |
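The mapping in the table can be sketched in a few lines of Python. This is an illustration only, not Galaxy's to_cwl(): FakeDataset, FakeCollection, NO_REPLACEMENT, RUNTIME_VALUE and to_cwl_sketch are hypothetical stand-ins, the zero-based step_input://N numbering is an assumption, and the expression.json branch and readiness checks are left out.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple

# Hypothetical sentinels standing in for Galaxy's NoReplacement and RuntimeValue.
NO_REPLACEMENT = object()
RUNTIME_VALUE = object()


@dataclass
class FakeDataset:
    """Stand-in for an HDA; only the fields this sketch needs."""
    path: str
    extension: str


@dataclass
class FakeCollection:
    """Stand-in for an HDCA: a collection type plus (identifier, element) pairs."""
    collection_type: str
    elements: List[Tuple[str, Any]] = field(default_factory=list)


def to_cwl_sketch(value: Any, hda_references: List[Any]) -> Any:
    """Convert a stand-in Galaxy value to the shape an expression would see."""
    if value is NO_REPLACEMENT or value is RUNTIME_VALUE:
        return None  # both appear as null in the expression context
    if isinstance(value, FakeDataset):
        hda_references.append(value)
        # Assumption: N is the zero-based index into hda_references.
        return {
            "class": "File",
            "location": f"step_input://{len(hda_references) - 1}",
            "format": value.extension,
            "path": value.path,
        }
    if isinstance(value, FakeCollection):
        converted = [(name, to_cwl_sketch(el, hda_references)) for name, el in value.elements]
        if value.collection_type == "list":
            return [item for _, item in converted]  # list -> array
        return dict(converted)  # record/other -> object keyed by identifier
    return value  # text, integer, float and boolean pass through unchanged
```

Passing the same hda_references list to every call keeps the location URIs unique across all inputs of a step.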
File Object Properties
When an HDA is converted to a CWL File object:
{
"class": "File",
"location": "step_input://N", // internal reference URI
"format": "bed", // dataset extension
"path": "/path/to/file.bed", // filesystem path
"basename": "file.bed", // filename with extension
"nameroot": "file", // filename without extension
"nameext": ".bed" // extension with dot
}
The basename, nameroot, and nameext properties are set by set_basename_and_derived_properties() from tool_util/cwl/util.py.
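As a rough illustration of how the three name properties relate to the path, the sketch below derives them with os.path. The helper name add_basename_properties is hypothetical; it mirrors the property names shown above but is not the Galaxy function itself.

```python
import os


def add_basename_properties(file_obj: dict) -> dict:
    """Derive basename, nameroot and nameext from file_obj["path"]."""
    basename = os.path.basename(file_obj["path"])
    # splitext splits on the last dot only, so the extension keeps its dot.
    nameroot, nameext = os.path.splitext(basename)
    file_obj.update(basename=basename, nameroot=nameroot, nameext=nameext)
    return file_obj


example = add_basename_properties({"class": "File", "path": "/path/to/file.bed"})
```

Because only the last dot is treated as the separator, a path such as /data/archive.tar.gz would yield nameroot "archive.tar" and nameext ".gz" under this sketch.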
expression.json Special Handling
Datasets with extension expression.json are not wrapped as File objects. Instead they are deserialized directly via json.load(), allowing structured data (numbers, booleans, objects, arrays, null) to flow between expression-producing and expression-consuming steps.
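A small sketch of what this means for a consuming step: the file content is parsed with json.load() and the parsed value, not a File object, is what the expression sees. The helper load_expression_json and the sample payload are illustrative assumptions.

```python
import json
import os
import tempfile


def load_expression_json(path: str):
    """Return the parsed JSON value stored in an expression.json dataset file."""
    with open(path) as handle:
        return json.load(handle)


# Write a sample payload to a temporary file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "output.json")
    with open(sample_path, "w") as handle:
        json.dump({"value": 42, "flags": [True, None]}, handle)
    loaded = load_expression_json(sample_path)
```

Here loaded is a plain dict, so an expression can read loaded["value"] directly, and JSON null arrives as None.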
Context Construction Flow
evaluate_value_from_expressions() (modules.py:257-310):
- Collect expression mappings: when_expression from step, value_from from each input
- If neither exists, return empty dict early (line 268)
- Build hda_references list (tracks HDA objects for round-trip via step_input://N URIs)
- Build step_state dict:
  - From extra_step_state: each value converted via to_cwl()
  - From execution_state.inputs: each value converted via to_cwl()
- Call do_eval(when_expression, step_state): JavaScript evaluation
- Convert result back via from_cwl(): resolves step_input://N URIs back to HDAs
- Validate result is boolean (isinstance(result, bool))
Note: value_from expressions are collected (line 265) but not evaluated in this function. Only when_expression is evaluated here. The collected value_from_expressions dict is currently unused in this code path.
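The flow above can be sketched as a single function. Everything here is a simplification under stated assumptions: evaluate_when_sketch is a hypothetical name, the converters and the evaluator are injected as plain callables so the sketch runs without Node.js, and the return shape (None or a bool) does not reproduce the dict returned by the real function.

```python
from typing import Any, Callable, Dict, Optional


def evaluate_when_sketch(
    when_expression: Optional[str],
    extra_step_state: Dict[str, Any],
    inputs: Dict[str, Any],
    to_cwl: Callable[[Any], Any],
    from_cwl: Callable[[Any], Any],
    evaluator: Callable[[str, Dict[str, Any]], Any],
) -> Optional[bool]:
    """Build step_state, evaluate the when expression, and validate the result."""
    if when_expression is None:
        return None  # nothing to evaluate, so return early

    # Merge both sources into one expression context, converting every value.
    step_state: Dict[str, Any] = {}
    for key, value in extra_step_state.items():
        step_state[key] = to_cwl(value)
    for key, value in inputs.items():
        step_state[key] = to_cwl(value)

    # In Galaxy this step is do_eval(); here the evaluator is a stand-in callable.
    result = from_cwl(evaluator(when_expression, step_state))

    if not isinstance(result, bool):
        raise TypeError(f"when expression returned {type(result).__name__}, expected bool")
    return result
```

Injecting the evaluator keeps the conversion and validation logic testable in isolation, for example with a lambda that reads a key from step_state.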
when_expression vs value_from
| Aspect | when_expression | value_from |
|---|---|---|
| Scope | Step-level | Per-input |
| Purpose | Conditional execution | Compute input value |
| Return type | Must be boolean | Any type |
| Model field | WorkflowStep.when_expression | WorkflowStepInput.value_from |
| Evaluated in | evaluate_value_from_expressions() | Not evaluated in this function |
Note: value_from is a model field on WorkflowStepInput and is collected during expression evaluation setup, but evaluate_value_from_expressions() never evaluates it. Whether that evaluation is handled elsewhere or is simply incomplete is not established here.
Dataset Readiness Pre-checks
Before to_cwl() deserializes an expression.json or wraps an HDA as a File object, three checks run (modules.py:176-191):
- in_ready_state(): state is NOT in {NEW, UPLOAD, QUEUED, RUNNING, SETTING_METADATA}. If not ready → DelayedWorkflowEvaluation (retry later)
- is_ok: state must be specifically OK. If ready but not ok → FailWorkflowEvaluation with InvocationFailureDatasetFailed
- purged: dataset must not be purged. If purged → FailWorkflowEvaluation with InvocationFailureDatasetFailed
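The order of the three checks matters: a dataset that is still running delays the step rather than failing it. A minimal sketch, assuming lowercase state strings and using stand-in exception and dataset classes defined here (none of this is Galaxy's own code):

```python
from dataclasses import dataclass

# Assumption: states are lowercase strings matching the names listed above.
NOT_READY_STATES = {"new", "upload", "queued", "running", "setting_metadata"}


class DelayedWorkflowEvaluation(Exception):
    """Stand-in: the step should be retried on a later scheduling iteration."""


class FailWorkflowEvaluation(Exception):
    """Stand-in: the invocation should fail with the given reason."""


@dataclass
class FakeDataset:
    """Hypothetical dataset with only the attributes the checks read."""
    state: str
    purged: bool = False


def check_dataset_ready(dataset: FakeDataset) -> None:
    """Raise if the dataset cannot be handed to an expression yet (or ever)."""
    if dataset.state in NOT_READY_STATES:
        raise DelayedWorkflowEvaluation("dataset not ready")
    if dataset.state != "ok":
        raise FailWorkflowEvaluation("dataset_failed")
    if dataset.purged:
        raise FailWorkflowEvaluation("dataset_failed")
```

A caller would typically catch DelayedWorkflowEvaluation to reschedule and let FailWorkflowEvaluation propagate.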
Error Handling
| Condition | Exception | Invocation Failure Reason |
|---|---|---|
| Expression syntax error | FailWorkflowEvaluation | expression_evaluation_failed (details hidden for security) |
| When result not boolean | FailWorkflowEvaluation | when_not_boolean (includes actual type name) |
| Dataset not ready | DelayedWorkflowEvaluation | Step delayed, retried next iteration |
| Dataset failed/purged | FailWorkflowEvaluation | dataset_failed |
Security note: expression evaluation failure details are not exposed to users to avoid leaking secrets that may appear in expressions.
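One way to get this behavior is to catch whatever the evaluator raises and re-raise a generic failure without the original message. The sketch below uses a stand-in FailWorkflowEvaluation with hypothetical reason and details attributes; the reason strings come from the table above, the rest is assumed for illustration.

```python
from typing import Any, Callable, Dict


class FailWorkflowEvaluation(Exception):
    """Stand-in exception carrying a failure reason and optional details."""

    def __init__(self, reason: str, details: str = ""):
        super().__init__(reason)
        self.reason = reason
        self.details = details


def run_when(evaluator: Callable[[str, Dict[str, Any]], Any], expression: str, step_state: Dict[str, Any]) -> bool:
    """Evaluate a when expression and map problems to failure reasons."""
    try:
        result = evaluator(expression, step_state)
    except Exception:
        # Drop the original message and traceback chain: they may contain
        # values from the expression or its inputs.
        raise FailWorkflowEvaluation("expression_evaluation_failed") from None
    if not isinstance(result, bool):
        # The type name is safe to report; the value itself is not included.
        raise FailWorkflowEvaluation("when_not_boolean", details=type(result).__name__)
    return result
```

Using raise ... from None suppresses the implicit exception context, so the original error text does not show up in the chained traceback either.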
HDA Reference System
The step_input://N URI scheme enables serialization while maintaining object identity:
- to_cwl(): HDAs are assigned step_input://N locations and appended to hda_references list
- JavaScript evaluates using these URIs as opaque location strings
- from_cwl(): File objects with step_input:// locations are resolved back to the original HDA via hda_references[N]
from_cwl() Limitations
from_cwl() (modules.py:241-254) handles:
- File dicts with "class" and "location" keys → resolved via progress.raw_to_galaxy()
- Lists → recursively converted
- Primitives (strings, numbers, booleans, null) → passed through
It does not handle arbitrary dict structures — dicts without "class" and "location" keys raise NotImplementedError. This means complex object results from expressions (e.g. {key: value}) cannot be converted back to Galaxy types.
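The three handled cases and the unsupported one can be sketched as follows. from_cwl_sketch is a hypothetical function, the direct list lookup stands in for progress.raw_to_galaxy(), and treating N as a zero-based index is an assumption.

```python
from typing import Any, List

PREFIX = "step_input://"


def from_cwl_sketch(value: Any, hda_references: List[Any]) -> Any:
    """Convert an expression result back, resolving File dicts to references."""
    if isinstance(value, dict):
        if "class" in value and "location" in value:
            location = value["location"]
            if not location.startswith(PREFIX):
                raise ValueError(f"unexpected location: {location}")
            # Assumption: N indexes hda_references directly (zero-based).
            return hda_references[int(location[len(PREFIX):])]
        # Arbitrary objects have no Galaxy counterpart in this sketch.
        raise NotImplementedError("only File-like dicts can be converted back")
    if isinstance(value, list):
        return [from_cwl_sketch(item, hda_references) for item in value]
    return value  # strings, numbers, booleans and None pass through
```

For a when expression this limitation rarely matters, since the result must be a boolean and booleans pass through unchanged.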
Expression Examples
// Boolean conditional — when expression
$(inputs.should_run)
// Static skip
$(false)
// Access file properties
$(inputs.input_file.basename)
$(inputs.input_file.nameroot)
// Block expression with computation
${return parseInt($job.input1)}
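// Whole deserialized expression.json value
$(inputs.param_output) // if expression.json contains {"value": 42}, this is {"value": 42}
// Nested property access on deserialized expression.json
$(inputs.param_output.value) // 42 for the same expression.json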
JavaScript Engine Details
- Engine: Node.js via cwlNodeEngine.js (46 lines)
- Isolation: vm.runInNewContext(), an isolated V8 VM context
- Block expressions (${...}): wrapped as IIFE {return function() {...}();}
- Inline expressions ($(...)): wrapped as {return expression;}
- InlineJavascriptRequirement can provide expressionLib: additional JS prepended before evaluation
- CWL version: v1.2.1
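The two wrapping rules can be illustrated with a small string builder. This only reproduces the wrapper shapes quoted above; wrap_for_vm is a hypothetical helper, and the real wrapping happens inside cwl_utils and the Node.js engine, not in code like this.

```python
def wrap_for_vm(expression: str) -> str:
    """Return the JavaScript source a wrapped expression would take."""
    text = expression.strip()
    if text.startswith("${") and text.endswith("}"):
        # Block expression: the body becomes an immediately invoked function.
        return "{return function() {" + text[2:-1] + "}();}"
    if text.startswith("$(") and text.endswith(")"):
        # Inline expression: the body is returned directly.
        return "{return " + text[2:-1] + ";}"
    raise ValueError("not a $(...) or ${...} expression")
```

For example, wrap_for_vm("$(inputs.should_run)") produces {return inputs.should_run;}, while a block body keeps its own return statements inside the invoked function.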