CWL Workflow State: Loading, Persistence, and Execution Trace
1. Summary
CWL workflows go through three distinct phases where state is created, persisted, and recovered:
-
Import — CWL file is parsed by
WorkflowProxy, converted to a Galaxy-format dict, and loaded intoWorkflowStepmodel objects. Each tool step gets a minimaltool_inputsdict (nulls for connected inputs). -
Invocation Request Creation —
populate_module_and_state()injects modules and computes runtime state, thenworkflow_run_config_to_request()callsencode_runtime_state()to serialize each step’s state intoWorkflowRequestStepState.value. -
Execution — The scheduler recovers state via
decode_runtime_state(), thenToolModule.execute()builds the actual input dict from live step connections and dispatches jobs.
For CWL tools, the tool_inputs persisted in the database is essentially a placeholder — it contains {input_name: None} entries. The real input data comes from step connections at execution time. The problem is that encode_runtime_state() and decode_runtime_state() use params_to_strings/params_from_strings which assume Galaxy parameter objects (which CWL tools don’t have).
2. Detailed Trace
Phase 1: CWL Workflow Import
Entry Point
API receives a CWL file path. The workflow manager normalizes it:
WorkflowsManager.normalize_workflow_format()—/lib/galaxy/managers/workflows.py:622- Detects
workflow_class == "Workflow"(CWL class) at line 653 - Creates
WorkflowProxyviaworkflow_proxy(workflow_path)— line 662/668 - Calls
wf_proxy.to_dict()— line 673 — to get Galaxy-format dict
- Detects
WorkflowProxy.to_dict()
/lib/galaxy/tool_util/cwl/parser.py:697
Iterates over CWL workflow inputs (creating data_input/parameter_input steps) and CWL workflow steps (creating tool/subworkflow steps).
For each tool step, calls ToolStepProxy.to_dict(input_connections):
# /lib/galaxy/tool_util/cwl/parser.py:1103-1126
def to_dict(self, input_connections):
tool_state: ToolStateType = {}
for input_name in input_connections.keys():
tool_state[input_name] = None # <-- null entries for connected inputs
return {
"id": self._index,
"tool_uuid": self.tool_proxy.uuid,
"label": self.label,
"type": "tool",
"input_connections": input_connections,
"inputs": self.inputs_to_dicts(), # <-- scatter, valueFrom, defaults
"workflow_outputs": outputs,
}
Key point: The tool_state dict in the step dict only contains {input_name: None} for connected inputs. No "tool_state" key is set in the returned dict at all — the variable is created but not included in the return value rval. So d.get("tool_state") in from_dict() will return None.
Wait — re-reading more carefully. The variable tool_state is created but indeed NOT included in the rval dict at line 1113. So d.get("tool_state") returns None during from_dict().
Step Inputs (separate from tool_state)
The "inputs" key at line 1121 contains InputProxy.to_dict() dicts:
# /lib/galaxy/tool_util/cwl/parser.py:1021-1036
def to_dict(self):
as_dict = {"name": self.input_name}
# merge_type, scatter_type, value_from, default
These get stored as WorkflowStepInput objects (line 1916-1929 in workflows.py), including default_value, value_from, and scatter_type.
Module and Step Creation
__module_from_dict() — /lib/galaxy/managers/workflows.py:1880
- Creates
model.WorkflowStep()— line 1892 - Creates module via
module_factory.from_dict(trans, step_dict, ...)— line 1899 - Calls
module.save_to_step(step, ...)— line 1901
from_dict for ToolModule
ToolModule.from_dict() — /lib/galaxy/workflow/modules.py:2194
- Calls
super().from_dict(trans, d, ...)which isWorkflowModule.from_dict()at line 481 - This calls
module.recover_state(d.get("tool_state"), ...)— line 484 - For CWL steps,
d.get("tool_state")isNone(not included inToolStepProxy.to_dict())
recover_state for CWL
ToolModule.recover_state() — /lib/galaxy/workflow/modules.py:2476
- Calls
super().recover_state(state, **kwds)—WorkflowModule.recover_state()at line 543 from_tool_formdefaults toFalse, sostate = self.step_state_to_tool_state(state or {})— line 562stateisNone, so becomes{}
get_inputs()returnsself.tool.inputs— for CWL tools withhas_galaxy_inputs=False, this is{}if inputs:isFalse(empty dict), so takes the else branch — line 567-570:inputs = safe_loads(state) or {} # safe_loads({}) returns {}, or {} -> {} self.validate_state(inputs) # no-op for ToolModule self.state.inputs = inputs # state.inputs = {}
save_to_step
ToolModule.save_to_step() — line 2271 calls super().save_to_step() (line 497):
step.tool_inputs = self.get_state()
get_state() — line 523:
get_inputs()returns{}for CWLif inputs:is False- Returns
self.state.inputs— which is{}
Result: step.tool_inputs in the database = {} (empty dict) for CWL tool steps.
The step’s WorkflowStepInput objects (with default_value, value_from, scatter_type) are stored separately in the workflow_step_input table.
Phase 2: Invocation Request Creation
Entry Point
queue_invoke() — /lib/galaxy/workflow/run.py:128
populate_module_and_state(trans, workflow, param_map, ...)— line 138workflow_run_config_to_request(trans, workflow_run_config, workflow)— line 144
populate_module_and_state
/lib/galaxy/workflow/modules.py:3163
-
module_injector.inject_all(workflow, param_map=param_map)— line 3171- For each step:
module_injector.inject(step, ...)— line 3101module = step.module = module_factory.from_workflow_step(trans, step)— line 3120- This calls
WorkflowModule.from_workflow_step()— line 489:module.recover_state(step.tool_inputs, from_tool_form=False) - For CWL:
step.tool_inputsis{},recover_state({})setsself.state.inputs = {}
- For each step:
-
For each step:
module_injector.compute_runtime_state(step, step_args=step_args)— line 3174- Calls
step.module.compute_runtime_state(trans, step, step_args)— line 3154 ToolModule.compute_runtime_state()— line 2559:- Calls
super().compute_runtime_state(trans, step, step_updates, replace_default_values=False) - Parent
WorkflowModule.compute_runtime_state()— line 627:state = self.get_runtime_state()→state.inputs = self.state.inputs→{}visit_input_values(self.get_runtime_inputs(step, ...), state.inputs, update_value, ...)— line 659get_runtime_inputs()for CWL returns{}→ visit is a no-op (no inputs to visit)
- Result:
state.inputs = {},step_errors = {}
- Calls
step.state = state— line 3155
- Calls
After populate_module_and_state, each CWL step has step.state.inputs = {}.
workflow_run_config_to_request
/lib/galaxy/workflow/run_request.py:544
For every step:
serializable_runtime_state = step.module.encode_runtime_state(step, step.state)
step_state = WorkflowRequestStepState()
step_state.value = serializable_runtime_state
encode_runtime_state() — /lib/galaxy/workflow/modules.py:682:
def encode_runtime_state(self, step, runtime_state: DefaultToolState):
return runtime_state.encode(Bunch(inputs=self.get_runtime_inputs(step)), self.trans.app)
DefaultToolState.encode() — /lib/galaxy/tools/__init__.py:831:
def encode(self, tool, app, nested=False):
value = params_to_strings(tool.inputs, self.inputs, app, nested=nested)
value["__page__"] = self.page
value["__rerun_remap_job_id__"] = self.rerun_remap_job_id
return value
For CWL: tool.inputs = {} (from get_runtime_inputs()), self.inputs = {}.
params_to_strings({}, {}, app) — /lib/galaxy/tools/parameters/__init__.py:330:
- Iterates over
param_values.items()— empty dict — so returns{} - Then
value["__page__"] = 0,value["__rerun_remap_job_id__"] = None
Result: WorkflowRequestStepState.value = {"__page__": 0, "__rerun_remap_job_id__": None}
This currently WORKS when state.inputs is empty (no NoReplacement values). But it would FAIL if compute_runtime_state populated any NoReplacement values into state.inputs.
When does NoReplacement appear? In the replace_default_values=True path (line 640-641 in compute_runtime_state):
if replace_default_values and step:
state.inputs = step.state.inputs
This is called with replace_default_values=True during EXECUTION (line 2642-2643 in ToolModule.execute). But during invocation request creation, replace_default_values defaults to False, so NoReplacement values are NOT injected during this phase — the issue described in the research doc would only arise if compute_runtime_state was called differently.
Actually, re-reading the code more carefully: the update_value callback at line 646 returns NO_REPLACEMENT when no step input or no default. But this is used with no_replacement_value=NO_REPLACEMENT in visit_input_values — which means those values are NOT stored in state.inputs when the return value matches the sentinel. So the state stays clean of NoReplacement during this phase. The encode_runtime_state crash described in the research doc must happen in a different scenario — perhaps when replace_default_values=True is used with default values that resolve to NoReplacement, or in a case where CWL step inputs DO have some Galaxy runtime inputs.
Let me re-examine: for CWL tools, get_runtime_inputs() returns {}, so visit_input_values() visits nothing. The only way NoReplacement could appear is if state.inputs already contained them from a prior operation. Since state.inputs = {} for CWL, and visit_input_values with empty params is a no-op, the encode path currently works for CWL steps with empty state.
The real question is: if/when state.inputs contains actual values (e.g., from step defaults that were loaded differently), those would be passed through params_to_strings with empty params, and json.dumps would be called on them. Most values would survive this, but complex objects or sentinels would not.
Phase 3: Workflow Execution
Entry Point — Scheduler Picks Up Invocation
schedule() — /lib/galaxy/workflow/run.py:66
- Creates
WorkflowInvoker(trans, workflow, workflow_run_config, workflow_invocation)— line 91 invoker.invoke()— line 100
WorkflowInvoker.invoke() — Recovering State
remaining_steps() — /lib/galaxy/workflow/run.py:434:
-
module_injector.inject_all(workflow, ...)— line 445- For each step:
module_factory.from_workflow_step(trans, step)— creates module,recover_state(step.tool_inputs, ...)→state.inputs = {}
- For each step:
-
module_injector.compute_runtime_state(step, step_args=step_args)— line 449- Same as Phase 2: for CWL,
state.inputs = {}
- Same as Phase 2: for CWL,
-
Decode the persisted runtime state — line 455-457:
runtime_state = step_states[step_id].value # from WorkflowRequestStepState step.state = step.module.decode_runtime_state(step, runtime_state)
ToolModule.decode_runtime_state() — line 2578:
state = super().decode_runtime_state(step, runtime_state)
WorkflowModule.decode_runtime_state() — line 686:
state = DefaultToolState()
state.decode(runtime_state, Bunch(inputs=self.get_runtime_inputs(step)), self.trans.app)
return state
DefaultToolState.decode() — line 840:
values = safe_loads(values) or {}
self.page = values.pop("__page__") if "__page__" in values else 0
self.rerun_remap_job_id = values.pop("__rerun_remap_job_id__") ...
self.inputs = params_from_strings(tool.inputs, values, app, ignore_errors=True)
For CWL: runtime_state = {"__page__": 0, "__rerun_remap_job_id__": None}. After popping those keys, values = {}. params_from_strings({}, {}, app) returns {}.
Result: step.state.inputs = {} after decode.
_invoke_step → ToolModule.execute()
_invoke_step() — line 364:
invocation_step.workflow_step.module.execute(trans, progress, invocation_step, ...)
ToolModule.execute() — line 2626:
First, it recomputes runtime state with replace_default_values=True:
self.state, _ = self.compute_runtime_state(
trans, step, step_updates=progress.param_map.get(step.id), replace_default_values=True
)
step.state = self.state
This time replace_default_values=True, so line 640-641: state.inputs = step.state.inputs (still {}). The visit_input_values with empty runtime inputs is still a no-op. step.state.inputs remains {}.
Then checks use_cwl_path:
use_cwl_path = tool.tool_type in CWL_TOOL_TYPES and not tool.has_galaxy_inputs
CWL path (line 2657-2698):
-
cwl_input_dict = build_cwl_input_dict(step, progress, trans)— line 2660- Builds dict from live step connections (not from persisted state!)
- For each
input_connections_by_name, resolves the replacement fromprogress - Fills defaults from
step.inputs[].default_value - Result:
{"input1": {"src": "hda", "id": 5}, "threshold": 0.5, ...}
-
evaluate_cwl_value_from_expressions(...)— line 2663 -
find_cwl_scatter_collections(...)→ scatter expansion — line 2666 -
Build
param_combinations— line 2680-2693:- Each slice_dict is a dict like
{"input1": {"src": "hda", "id": 5}, "__when_value__": true}
- Each slice_dict is a dict like
-
Creates MappingParameters WITHOUT validated_param_combinations:
mapping_params = MappingParameters(param_template, param_combinations)— line 2872
-
Calls
execute(trans, tool, mapping_params, ...)— line 2882
execute() → _execute() → job creation
/lib/galaxy/tools/execute.py:176
In execute_single_job() — line 213:
if execution_slice.validated_param_combination:
tool_state = execution_slice.validated_param_combination.input_state
job.tool_state = tool_state
Since validated_param_combinations was not provided (None), validated_param_combination is None for every slice. job.tool_state is never set — remains None.
Job Execution — CwlToolEvaluator
set_compute_environment() — /lib/galaxy/tools/evaluation.py:167
Since UserToolEvaluator.param_dict_style = "json" (not “regular”), takes the else branch at line 227:
tool_state: Optional[JobInternalToolState] = None
if job.tool_state: # <-- None, so False
tool_state = JobInternalToolState(job.tool_state)
self.param_dict = self.build_param_dict(
incoming, inp_data, out_data,
output_collections=out_collections,
validated_tool_state=tool_state, # <-- None
)
UserToolEvaluator.build_param_dict() — line 1122:
if validated_tool_state is not None:
# runtimeify() path — the correct new path
...
else:
# deprecated to_cwl() fallback
from galaxy.workflow.modules import to_cwl
log.info("Building CWL style inputs using deprecated to_cwl function...")
cwl_style_inputs = to_cwl(incoming, hda_references=hda_references, compute_environment=compute_environment)
Result: Falls back to deprecated to_cwl() path because job.tool_state was never set.
3. Assessment of CWL_WORKFLOW_STATE_VALIDATION_RESEARCH.md
Accuracy of the Research Doc
The research doc’s analysis is largely correct with some refinements:
1. The encode_runtime_state crash
The doc states NoReplacement in state.inputs causes the crash. Based on my trace:
-
During invocation request creation (
populate_module_and_statethenworkflow_run_config_to_request),compute_runtime_stateis called withreplace_default_values=False. For CWL tools,get_runtime_inputs()returns{}, sovisit_input_valuesis a no-op andstate.inputsstays as{}. NoNoReplacementsentinels appear. -
The current code actually does not crash for CWL steps with empty state. The crash would only happen if:
- CWL steps somehow got
NoReplacementvalues in their state (perhaps via a different code path I haven’t traced) - Or if
state.inputscontained complex non-JSON-serializable objects
- CWL steps somehow got
If the encode path currently works for CWL steps, then the first problem is actually just that WorkflowRequestStepState.value contains {"__page__": 0, "__rerun_remap_job_id__": None} — a meaningless placeholder. This is harmless but wasteful.
However, if future changes populate CWL step state with actual values (or if some CWL workflows already do this through paths I haven’t traced), the crash IS a real risk.
2. The missing validated_param_combinations
The doc is exactly right. ToolModule.execute() CWL path (line 2872) creates MappingParameters(param_template, param_combinations) without validated_param_combinations. This means job.tool_state is never set, and CwlToolEvaluator falls back to to_cwl().
3. The state flow diagram
The doc’s “Workflow Path (broken)” diagram is accurate. The CWL workflow path:
- Does NOT produce
JobInternalToolStateinstances - Does NOT set
job.tool_state - Falls back to deprecated
to_cwl(incoming)inCwlToolEvaluator
Impact on Path B Proposal
The research doc’s Path B proposal is sound. Based on my trace, here are additional details:
What step.tool_inputs contains for CWL
The persisted step.tool_inputs = {} (empty dict). The CWL step proxy’s to_dict() creates a tool_state variable but does NOT include it in the returned dict. So d.get("tool_state") in from_dict() returns None, which becomes {} after or {} in step_state_to_tool_state.
This means no CWL parameter state is persisted in tool_inputs — it’s entirely reconstructed from step connections at execution time by build_cwl_input_dict().
Additional considerations for Path B
-
compute_runtime_stateis a no-op for CWL —get_runtime_inputs()returns{}, sovisit_input_valuesnever visits anything. Path B’sencode_runtime_state/decode_runtime_stateoverrides don’t need to worry about values coming fromcompute_runtime_state— the state will be{}or whatever was instep.state.inputsfromrecover_state. -
step.state.inputsvs actual execution inputs — There’s a fundamental disconnect:step.state.inputsis always{}for CWL, while the actual inputs are built dynamically bybuild_cwl_input_dict()from step connections. TheWorkflowRequestStepState.valuefor CWL steps is effectively unused. Path B should acknowledge that the persisted state for CWL is informational/empty, and the real execution inputs come from the workflow graph. -
The
__when_value__key — The CWL execution path adds__when_value__toparam_combinationsat line 2692. IfJobInternalToolState.validate()doesn’t allow extra keys, this needs to be stripped before validation. The legacy path also adds this key at line 2829. -
to_cwl()fallback forincomingparam — In the legacy fallback path (line 1156-1162),incomingcomes fromjob.parameters(line 175 inset_compute_environment). This is populated by thehandle_single_execution→ tool action chain. For CWL workflow jobs,incomingcontains Galaxy-style refs, whichto_cwl()converts to CWL File/Directory objects. The new runtimeify path gets the same data fromjob.tool_state. Ifjob.tool_stateis set correctly, theto_cwl()fallback is eliminated.
Additional State Read/Write Locations
The doc doesn’t mention these:
-
check_and_update_state—ToolModule.check_and_update_state()(line 2438) callsself.tool.check_and_update_param_values(self.state.inputs, ...). For CWL tools withtool.inputs = {}, this should be a no-op but is worth verifying. Called during inject at line 3158. -
add_dummy_datasets—module.add_dummy_datasets(connections=step.input_connections, steps=steps)at line 3124. For CWL tools withtool.inputs = {},visit_input_valuesis a no-op. No issue. -
get_config_form—ToolModule.get_config_form()(line 2431) callsparams_to_incoming(incoming, self.tool.inputs, self.state.inputs, self.trans.app). With emptytool.inputsand emptystate.inputs, this is a no-op. Used for editor rendering, not execution. -
Workflow export —
save_to_step→get_state()persistsstate.inputs = {}totool_inputs. When re-exporting, this empty dict is preserved. CWL workflows can be round-tripped through Galaxy’s native format, but the CWL-specific information (step inputs, defaults, valueFrom) lives inWorkflowStepInputobjects, nottool_inputs. -
step.stateattribute — This is a non-persistent, transient attribute (line 8818:self.state: Optional[DefaultToolState]). It’s only populated during module injection and used during the current request/scheduling cycle. It is NOT stored in the database directly —tool_inputs(the persistent column) is separate.
4. Unresolved Questions
-
Is the
NoReplacementcrash actually reachable today? Based on my trace, CWL steps getstate.inputs = {}throughout invocation request creation, soencode_runtime_statewith empty inputs should not crash. Under what scenario doesNoReplacementend up in CWL step state? Perhaps when a CWL workflow has runtime-settable parameters? -
What about CWL
parameter_inputsteps? These are created bycwl_input_to_galaxy_step()(line 734) and useParameterModule, notToolModule. They have their ownget_runtime_inputs()andencode/decodepaths. Do they also hit problems? -
What does
tool.check_and_update_param_values({}, ...)do? For CWL tools with emptyinputs, this might still try to add default values or produce upgrade messages. Could cause unexpected behavior during inject. -
Are there CWL tools where
has_galaxy_inputs = True? If a CWL tool somehow gets Galaxy inputs populated (perhaps through a hybrid tool definition), it would take the legacy path instead ofuse_cwl_path, and the analysis would be different. -
What happens with CWL subworkflow steps?
SubworkflowStepProxy.to_dict()also doesn’t includetool_state. TheSubWorkflowModuleclass has its ownsave_to_step,recover_state, etc. The state handling for subworkflows within CWL workflows may have similar or different issues.
File References
| File | Key Lines | Purpose |
|---|---|---|
lib/galaxy/tool_util/cwl/parser.py | 697-719 | WorkflowProxy.to_dict() — CWL → Galaxy dict |
lib/galaxy/tool_util/cwl/parser.py | 1103-1126 | ToolStepProxy.to_dict() — CWL step → Galaxy step dict (tool_state NOT included) |
lib/galaxy/tool_util/cwl/parser.py | 1021-1036 | InputProxy.to_dict() — step input metadata (scatter, valueFrom, default) |
lib/galaxy/managers/workflows.py | 622-675 | normalize_workflow_format() — CWL import entry point |
lib/galaxy/managers/workflows.py | 808-944 | _workflow_from_raw_description() — creates Workflow + steps from dict |
lib/galaxy/managers/workflows.py | 1880-1980 | __module_from_dict() — creates WorkflowStep + module from dict |
lib/galaxy/workflow/modules.py | 481-486 | WorkflowModule.from_dict() — calls recover_state(d.get("tool_state")) |
lib/galaxy/workflow/modules.py | 489-493 | WorkflowModule.from_workflow_step() — calls recover_state(step.tool_inputs) |
lib/galaxy/workflow/modules.py | 497-499 | WorkflowModule.save_to_step() — step.tool_inputs = self.get_state() |
lib/galaxy/workflow/modules.py | 523-535 | WorkflowModule.get_state() — returns state.inputs when get_inputs() is empty |
lib/galaxy/workflow/modules.py | 543-570 | WorkflowModule.recover_state() — takes else branch for CWL (empty inputs) |
lib/galaxy/workflow/modules.py | 627-680 | WorkflowModule.compute_runtime_state() — no-op for CWL (empty runtime inputs) |
lib/galaxy/workflow/modules.py | 682-690 | encode_runtime_state / decode_runtime_state — uses params_to/from_strings |
lib/galaxy/workflow/modules.py | 2313-2314 | ToolModule.get_inputs() — returns tool.inputs (empty for CWL) |
lib/galaxy/workflow/modules.py | 2556-2557 | ToolModule.get_runtime_inputs() — returns get_inputs() (empty for CWL) |
lib/galaxy/workflow/modules.py | 2626-2944 | ToolModule.execute() — CWL path builds inputs from connections |
lib/galaxy/workflow/modules.py | 245-288 | build_cwl_input_dict() — builds live CWL input dict from step connections |
lib/galaxy/workflow/modules.py | 3093-3160 | WorkflowModuleInjector — inject, compute_runtime_state |
lib/galaxy/workflow/modules.py | 3163-3186 | populate_module_and_state() — called during invocation request creation |
lib/galaxy/workflow/run_request.py | 544-651 | workflow_run_config_to_request() — serializes step states |
lib/galaxy/workflow/run.py | 128-151 | queue_invoke() — API entry point for workflow invocation |
lib/galaxy/workflow/run.py | 434-463 | remaining_steps() — recovers state via decode_runtime_state |
lib/galaxy/workflow/run.py | 364-372 | _invoke_step() — calls module.execute() |
lib/galaxy/tools/__init__.py | 810-857 | DefaultToolState — encode/decode via params_to/from_strings |
lib/galaxy/tools/__init__.py | 1717-1728 | parse_inputs() — sets has_galaxy_inputs |
lib/galaxy/tools/execute.py | 77-91 | MappingParameters — includes optional validated_param_combinations |
lib/galaxy/tools/execute.py | 254-256 | execute_single_job() — sets job.tool_state only if validated |
lib/galaxy/tools/evaluation.py | 167-237 | set_compute_environment() — recovers JobInternalToolState from job.tool_state |
lib/galaxy/tools/evaluation.py | 1122-1163 | UserToolEvaluator.build_param_dict() — runtimeify vs to_cwl fallback |
lib/galaxy/tools/evaluation.py | 1194-1200 | CwlToolEvaluator — extends UserToolEvaluator |
lib/galaxy/tools/parameters/__init__.py | 330-353 | params_to_strings() — fails on non-JSON values when params is empty |
lib/galaxy/model/__init__.py | 8762 | WorkflowStep.tool_inputs — JSONType column |
lib/galaxy/model/__init__.py | 10471-10492 | WorkflowRequestStepState — persists encoded runtime state |