Galaxy Workflow Expression Evaluation Context

Overview

Galaxy’s workflow expression system is CWL-based (Common Workflow Language v1.2.1). Expressions are JavaScript, evaluated via Node.js using cwl_utils.expression.do_eval(). In workflow scheduling, the primary use case is the when expression, which makes step execution conditional. value_from expressions are collected but not currently evaluated in the workflow scheduling function.

Key Files

Purpose	Path	Lines
Expression evaluation entry point	`lib/galaxy/workflow/modules.py`	257-310
Galaxy-to-CWL type conversion	`lib/galaxy/workflow/modules.py`	162-239
CWL-to-Galaxy reverse conversion	`lib/galaxy/workflow/modules.py`	241-254
Evaluation engine wrapper	`lib/galaxy/tools/expressions/evaluation.py`	21-47
JavaScript VM context	`lib/galaxy/tools/expressions/cwlNodeEngine.js`	1-46
File properties utility	`lib/galaxy/tool_util/cwl/util.py`	42-45

Expression Syntax

$(expression) — current CWL expression syntax
${expression} — legacy block syntax (Galaxy 23.0), auto-converted to $() at evaluation time

Available Variables

Galaxy’s cwlNodeEngine.js defines these globals in the JavaScript VM context:

Variable	Contents
`$job`	All connected step inputs, converted to CWL types
`$self`	Current output/step metadata context
`$runtime`	Runtime environment info
`$tmpdir`	Temporary directory path
`$outdir`	Output directory path

The inputs alias (as used in $(inputs.foo)) is provided by cwl_utils.expression.do_eval() upstream, not by Galaxy’s own code. Galaxy passes the step_state dict as the jobinput parameter, and cwl_utils makes it accessible as both $job and inputs in the expression context.

In practice, inputs is the variable most often used in a workflow step's when expression.

Type Mapping: Galaxy to Expression Context

The to_cwl() function (modules.py:162-239) converts Galaxy runtime values into CWL types:

Galaxy Type	Expression Type	Details
HDA (extension != expression.json)	File object	Has `.path`, `.location`, `.format`, `.basename`, `.nameroot`, `.nameext`
HDA (extension == expression.json)	Deserialized JSON	Parsed via `json.load()` — can be any JSON type
HDCA (list type)	Array	`[File, File, ...]` recursively converted
HDCA (record/other)	Object	`{identifier: File, ...}` keyed by element_identifier
NoReplacement sentinel	`null`	Missing optional inputs
Text parameter	String	Primitive
Integer parameter	Number	Primitive
Float parameter	Number	Primitive
Boolean parameter	Boolean	`true`/`false`
RuntimeValue	`null`	Unresolved runtime values (via `is_runtime_value()` check)
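
The dispatch in the table above can be sketched in plain Python. This is an illustration under stated assumptions, not Galaxy's to_cwl(): FakeHDA, FakeHDCA, NO_REPLACEMENT, and to_cwl_sketch are hypothetical stand-ins, the File object omits its location, and the expression.json and RuntimeValue branches are left out.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class FakeHDA:
    """Hypothetical stand-in for a Galaxy HDA."""
    extension: str
    path: str


@dataclass
class FakeHDCA:
    """Hypothetical stand-in for a Galaxy dataset collection."""
    collection_type: str
    elements: Dict[str, Any] = field(default_factory=dict)  # identifier -> element


NO_REPLACEMENT = object()  # stand-in for the NoReplacement sentinel


def to_cwl_sketch(value: Any) -> Any:
    """Convert a stand-in Galaxy value into a JSON-like expression value."""
    if value is NO_REPLACEMENT:
        return None  # missing optional input becomes null
    if isinstance(value, FakeHDA):
        return {"class": "File", "format": value.extension, "path": value.path}
    if isinstance(value, FakeHDCA):
        if value.collection_type == "list":
            # list collections become arrays, converted recursively
            return [to_cwl_sketch(e) for e in value.elements.values()]
        # record and other types become objects keyed by element identifier
        return {k: to_cwl_sketch(e) for k, e in value.elements.items()}
    return value  # text, integer, float, and boolean parameters pass through


pair = FakeHDCA("list", {"a": FakeHDA("bed", "/data/a.bed"), "b": FakeHDA("bed", "/data/b.bed")})
converted = to_cwl_sketch(pair)
```

Here converted is a two-element list of File-like dicts, which is the shape an expression such as $(inputs.some_list[0].path) would index into (some_list is an illustrative input name).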

File Object Properties

When an HDA is converted to a CWL File object:

{
    "class": "File",
    "location": "step_input://N",   // internal reference URI
    "format": "bed",                 // dataset extension
    "path": "/path/to/file.bed",    // filesystem path
    "basename": "file.bed",         // filename with extension
    "nameroot": "file",             // filename without extension
    "nameext": ".bed"               // extension with dot
}

The basename, nameroot, and nameext properties are set by set_basename_and_derived_properties() in lib/galaxy/tool_util/cwl/util.py.
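
The three name-derived properties can be reproduced with the standard library. The function below is a minimal sketch of the derivation implied by the example object; derive_name_properties is a hypothetical name, and the actual Galaxy helper may differ in signature and edge-case handling.

```python
import os


def derive_name_properties(path: str) -> dict:
    """Derive CWL-style basename, nameroot, and nameext from a filesystem path."""
    basename = os.path.basename(path)                # "file.bed"
    nameroot, nameext = os.path.splitext(basename)   # ("file", ".bed")
    return {"basename": basename, "nameroot": nameroot, "nameext": nameext}


props = derive_name_properties("/path/to/file.bed")
```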

expression.json Special Handling

Datasets with extension expression.json are not wrapped as File objects. Instead they are deserialized directly via json.load(), allowing structured data (numbers, booleans, objects, arrays, null) to flow between expression-producing and expression-consuming steps.
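
As a rough illustration of that deserialization, the sketch below writes a small JSON document to a temporary file and reads it back with json.load(). The helper name load_expression_json and the file contents are made up for the example.

```python
import json
import os
import tempfile


def load_expression_json(path: str):
    """Read an expression.json-style file and return the parsed JSON value."""
    with open(path) as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    dataset_path = os.path.join(tmp, "param.json")
    with open(dataset_path, "w") as handle:
        json.dump({"value": 42, "flags": [True, None]}, handle)
    # The expression context receives the parsed value, not a File object.
    loaded = load_expression_json(dataset_path)
```

Because the value arrives already parsed, an expression can read its fields directly instead of going through File properties such as path.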

Context Construction Flow

evaluate_value_from_expressions() (modules.py:257-310):

1. Collect expression mappings: when_expression from the step, value_from from each input.
2. If neither exists, return an empty dict early (line 268).
3. Build the hda_references list, which tracks HDA objects for the round trip via step_input://N URIs.
4. Build the step_state dict:
   - From extra_step_state: each value converted via to_cwl()
   - From execution_state.inputs: each value converted via to_cwl()
5. Call do_eval(when_expression, step_state) to run the JavaScript evaluation.
6. Convert the result back via from_cwl(), which resolves step_input://N URIs back to HDAs.
7. Validate that the result is a boolean (isinstance(result, bool)).

Note: value_from expressions are collected (line 265) but not evaluated in this function. Only when_expression is evaluated here. The collected value_from_expressions dict is currently unused in this code path.
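
The control flow of this function can be sketched without Galaxy or Node.js. Everything in the block is hypothetical: the evaluator is injected as a plain Python callable instead of do_eval(), convert stands in for to_cwl(), and the return value for the branch where only value_from expressions exist is an assumption, since the steps above do not describe it.

```python
from typing import Any, Callable, Dict, Optional


class FailWorkflowEvaluationSketch(Exception):
    """Hypothetical stand-in for Galaxy's FailWorkflowEvaluation."""

    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


def evaluate_when_sketch(
    when_expression: Optional[str],
    value_from_by_input: Dict[str, Optional[str]],
    extra_step_state: Dict[str, Any],
    inputs: Dict[str, Any],
    evaluator: Callable[[str, Dict[str, Any]], Any],
    convert: Callable[[Any], Any] = lambda value: value,
):
    # Collect expressions; value_from is gathered but never evaluated here.
    value_from_expressions = {k: v for k, v in value_from_by_input.items() if v is not None}
    if when_expression is None and not value_from_expressions:
        return {}  # early return when there is nothing to evaluate
    # Build step_state; later keys (inputs) overwrite earlier ones.
    step_state = {key: convert(value) for key, value in extra_step_state.items()}
    step_state.update({key: convert(value) for key, value in inputs.items()})
    if when_expression is None:
        return True  # assumption: branch not described in the text above
    try:
        result = evaluator(when_expression, step_state)
    except Exception:
        # Drop the original error so expression details are not exposed.
        raise FailWorkflowEvaluationSketch("expression_evaluation_failed") from None
    # The from_cwl() step is skipped: booleans would pass through unchanged.
    if not isinstance(result, bool):
        raise FailWorkflowEvaluationSketch("when_not_boolean")
    return result


def toy_evaluator(expression: str, state: Dict[str, Any]) -> Any:
    """Toy evaluator: only understands the form $(inputs.NAME)."""
    name = expression[len("$(inputs."):-1]
    return state[name]


decision = evaluate_when_sketch("$(inputs.should_run)", {}, {}, {"should_run": False}, toy_evaluator)
```

Injecting the evaluator keeps the sketch runnable anywhere and makes the two failure paths (evaluation error, non-boolean result) easy to exercise in isolation.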

when_expression vs value_from

Aspect	when_expression	value_from
Scope	Step-level	Per-input
Purpose	Conditional execution	Compute input value
Return type	Must be boolean	Any type
Model field	`WorkflowStep.when_expression`	`WorkflowStepInput.value_from`
Evaluated in	`evaluate_value_from_expressions()`	Not evaluated in this function

Note: value_from is a model field on WorkflowStepInput and is collected during expression evaluation setup, but the actual evaluation of value_from expressions does not occur in evaluate_value_from_expressions(). This may be handled elsewhere or may be incomplete.

Dataset Readiness Pre-checks

Before to_cwl() deserializes an expression.json or wraps an HDA as a File object, three checks run (modules.py:176-191):

1. in_ready_state(): the dataset state must not be one of NEW, UPLOAD, QUEUED, RUNNING, SETTING_METADATA. If the dataset is not ready, DelayedWorkflowEvaluation is raised and the step is retried later.
2. is_ok: the state must be specifically OK. If the dataset is ready but not OK, FailWorkflowEvaluation is raised with InvocationFailureDatasetFailed.
3. purged: the dataset must not be purged. If it is, FailWorkflowEvaluation is raised with InvocationFailureDatasetFailed.
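
The ordering of the three checks matters: a dataset that is still being produced delays the step, while a finished but unusable dataset fails it. A minimal sketch, with hypothetical exception classes and a FakeDataset stand-in, and with state names modelled as lowercase strings purely for illustration:

```python
from dataclasses import dataclass

# Assumption: state names are represented as lowercase strings in this sketch.
NOT_READY_STATES = {"new", "upload", "queued", "running", "setting_metadata"}


class DelayedWorkflowEvaluationSketch(Exception):
    """Hypothetical stand-in: retry the step on a later scheduling iteration."""


class FailWorkflowEvaluationSketch(Exception):
    """Hypothetical stand-in: fail the invocation with a reason."""


@dataclass
class FakeDataset:
    state: str
    purged: bool = False


def check_dataset_usable(dataset: FakeDataset) -> None:
    """Run the three pre-checks in order; return silently if the dataset is usable."""
    if dataset.state in NOT_READY_STATES:
        raise DelayedWorkflowEvaluationSketch()
    if dataset.state != "ok":
        raise FailWorkflowEvaluationSketch("dataset_failed")
    if dataset.purged:
        raise FailWorkflowEvaluationSketch("dataset_failed")


def outcome(dataset: FakeDataset) -> str:
    """Summarise which branch a dataset takes."""
    try:
        check_dataset_usable(dataset)
    except DelayedWorkflowEvaluationSketch:
        return "delayed"
    except FailWorkflowEvaluationSketch:
        return "failed"
    return "usable"
```

For example, outcome(FakeDataset("running")) reports a delay, whereas outcome(FakeDataset("ok", purged=True)) reports a failure.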

Error Handling

Condition	Exception	Invocation Failure Reason
Expression syntax error	`FailWorkflowEvaluation`	`expression_evaluation_failed` (details hidden for security)
When result not boolean	`FailWorkflowEvaluation`	`when_not_boolean` (includes actual type name)
Dataset not ready	`DelayedWorkflowEvaluation`	Step delayed, retried next iteration
Dataset failed/purged	`FailWorkflowEvaluation`	`dataset_failed`

Security note: expression evaluation failure details are not exposed to users to avoid leaking secrets that may appear in expressions.

HDA Reference System

The step_input://N URI scheme enables serialization while maintaining object identity:

to_cwl(): HDAs are assigned step_input://N locations and appended to hda_references list
JavaScript evaluates using these URIs as opaque location strings
from_cwl(): File objects with step_input:// locations are resolved back to the original HDA via hda_references[N]
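
The round trip can be illustrated with two small helpers. This is a sketch of the idea rather than Galaxy's code: register_hda and resolve_location are hypothetical names, and the zero-based index is an assumption of the example.

```python
from typing import Any, Dict, List

PREFIX = "step_input://"


def register_hda(hda: Any, hda_references: List[Any]) -> Dict[str, str]:
    """Record the object and return a File-like dict that points back at it."""
    hda_references.append(hda)
    index = len(hda_references) - 1  # assumption: zero-based index
    return {"class": "File", "location": f"{PREFIX}{index}"}


def resolve_location(file_obj: Dict[str, str], hda_references: List[Any]) -> Any:
    """Map a step_input:// location back to the original object."""
    location = file_obj["location"]
    if not location.startswith(PREFIX):
        raise ValueError(f"not a step_input reference: {location}")
    return hda_references[int(location[len(PREFIX):])]


references: List[Any] = []
original = object()  # stands in for an HDA
file_obj = register_hda(original, references)
restored = resolve_location(file_obj, references)
```

Because the JavaScript side only ever sees the URI string, the object that comes back is the very same Python object that went in, not a copy.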

from_cwl() Limitations

from_cwl() (modules.py:241-254) handles:

File dicts with "class" and "location" keys → resolved via progress.raw_to_galaxy()
Lists → recursively converted
Primitives (strings, numbers, booleans, null) → passed through

It does not handle arbitrary dict structures — dicts without "class" and "location" keys raise NotImplementedError. This means complex object results from expressions (e.g. {key: value}) cannot be converted back to Galaxy types.
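
The handled and unhandled cases can be summarised in a short recursive sketch. from_cwl_sketch is a hypothetical name, and resolve_file is an injected callable standing in for progress.raw_to_galaxy().

```python
from typing import Any, Callable


def from_cwl_sketch(value: Any, resolve_file: Callable[[dict], Any]) -> Any:
    """Convert an expression result back, mirroring the cases listed above."""
    if isinstance(value, dict):
        if "class" in value and "location" in value:
            return resolve_file(value)  # File-like dicts are resolved
        # Arbitrary objects have no Galaxy equivalent in this conversion.
        raise NotImplementedError("cannot convert arbitrary objects")
    if isinstance(value, list):
        return [from_cwl_sketch(item, resolve_file) for item in value]
    return value  # strings, numbers, booleans, and None pass through


def fake_resolver(file_obj: dict) -> str:
    """Toy resolver: returns a label instead of a real dataset."""
    return "resolved:" + file_obj["location"]


result = from_cwl_sketch([True, {"class": "File", "location": "step_input://0"}], fake_resolver)
```

Passing a plain object such as {"key": "value"} raises NotImplementedError, which is the limitation described above.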

Expression Examples

// Boolean conditional — when expression
$(inputs.should_run)

// Static skip
$(false)

// Access file properties
$(inputs.input_file.basename)
$(inputs.input_file.nameroot)

// Block expression with computation
${return parseInt($job.input1)}

// Nested property access on deserialized expression.json
$(inputs.param_output.value)  // if expression.json contains {"value": 42}, this is 42

JavaScript Engine Details

Engine: Node.js via cwlNodeEngine.js (46 lines)
Isolation: vm.runInNewContext() — isolated V8 VM context
Block expressions (${...}): wrapped as IIFE {return function() {...}();}
Inline expressions ($(...)): wrapped as {return expression;}
InlineJavascriptRequirement can provide expressionLib — additional JS prepended before evaluation
CWL version: v1.2.1
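
The two wrapping rules listed above amount to simple string construction. The sketch below builds the wrapped JavaScript fragment in Python; wrap_expression_sketch is a hypothetical helper that follows the wrappers as described in this section, and it ignores expressionLib and any outer function the engine adds around the fragment.

```python
def wrap_expression_sketch(expression: str) -> str:
    """Return the JavaScript fragment for an inline or block expression."""
    if expression.startswith("${") and expression.endswith("}"):
        body = expression[1:]  # keep the braces: "{...}"
        # Block form: the body becomes an immediately invoked function.
        return "{return function()" + body + "();}"
    if expression.startswith("$(") and expression.endswith(")"):
        body = expression[2:-1]  # strip "$(" and ")"
        # Inline form: the expression is returned directly.
        return "{return " + body + ";}"
    raise ValueError(f"not a recognised expression: {expression!r}")


inline_js = wrap_expression_sketch("$(inputs.should_run)")
block_js = wrap_expression_sketch("${return parseInt(inputs.input1);}")
```

With these inputs, inline_js is the string {return inputs.should_run;} and block_js is {return function(){return parseInt(inputs.input1);}();}.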
