PR 18758 Tool Execution Typing And Decomposition

Adds type aliases documenting the tool state lifecycle through execution, from request to job completion.

Revised: 2026-05-16
Revision: 6
GitHub PR: #18758
Related Notes: Component - Tool State Dynamic Models, Component - Tool State Specification, Component - YAML Tool Runtime, PR 18641 - Parameter Model Improvements Research, PR 20935 - Tool Request API, PR 21828 - YAML Tool Hardening and Tool State, PR 21842 - Tool Execution Migrated to api jobs, PR 21932 - History Graph API, PR 22706 - Workflow Extraction by IDs, Problem - YAML Tool Post-Hoc State Divergence

PR #18758: More Typing, Docs, and Decomposition Around Tool Execution

PR: https://github.com/galaxyproject/galaxy/pull/18758
Title: More typing, docs, and decomposition around tool execution
Status: Merged

Overview

PR #18758 introduced structured type annotations for the tool state lifecycle and decomposed monolithic tool execution methods into focused, well-typed functions. This PR is foundational to the structured tool state work — it creates the vocabulary of type aliases that describe how tool state transforms through the execution pipeline.

Key Changes

1. lib/galaxy/tools/_types.py — Tool State Type Aliases (NEW FILE)

Created a new module defining type aliases for each stage of tool state transformation. While all are Dict[str, Any] at runtime, the type aliases serve as documentation markers describing what processing has occurred.

Type Lifecycle Table (from the module docstring):

Type                             | State For   | Object References                        | Validated?
ToolRequestT                     | request     | src dicts of encoded ids                 | no
ToolStateJobInstanceT            | a job       | src dicts of encoded ids                 | no
ToolStateJobInstancePopulatedT   | a job       | model objs loaded from db                | check_param
ToolStateDumpedToJsonT           | a job       | src dicts of encoded ids (normalized)    | yes
ToolStateDumpedToJsonInternalT   | a job       | src dicts of decoded ids (normalized)    | yes
ToolStateDumpedToStringsT        | a job       | src dicts dumped to strs (normalized)    | yes
ParameterValidationErrorsT       | errors      | nested dict of str/Exception             | n/a
InputFormatT                     | format flag | Literal["legacy", "21.01"]               | n/a

Current location: lib/galaxy/tools/_types.py (69 lines)

Key insight: The lifecycle is ToolRequestT → (expand) → ToolStateJobInstanceT → (populate/check_param) → ToolStateJobInstancePopulatedT → (dump) → ToolStateDumpedToJson*T / ToolStateDumpedToStringsT
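A minimal sketch of how the lifecycle reads in code, assuming toy stand-ins for each stage. Only the alias names come from the module; FakeDataset, the expand/populate/dump helpers, and the payload shape are hypothetical and far simpler than Galaxy's real logic.

```python
from typing import Any, Dict, List

# Alias names mirror lib/galaxy/tools/_types.py; all are plain dicts at runtime.
ToolRequestT = Dict[str, Any]
ToolStateJobInstanceT = Dict[str, Any]
ToolStateJobInstancePopulatedT = Dict[str, Any]
ToolStateDumpedToJsonT = Dict[str, Any]


class FakeDataset:
    """Hypothetical stand-in for a model object loaded from the database."""

    def __init__(self, encoded_id: str) -> None:
        self.encoded_id = encoded_id


def expand(request: ToolRequestT) -> List[ToolStateJobInstanceT]:
    # Hypothetical: a "batch" entry fans out into one state dict per job.
    entry = request["input"]
    refs = entry["values"] if entry.get("batch") else [entry]
    return [{"input": ref} for ref in refs]


def populate(state: ToolStateJobInstanceT) -> ToolStateJobInstancePopulatedT:
    # Hypothetical: replace {src, id} references with loaded objects.
    return {"input": FakeDataset(state["input"]["id"])}


def dump(state: ToolStateJobInstancePopulatedT) -> ToolStateDumpedToJsonT:
    # Hypothetical: turn objects back into normalized {src, id} references.
    return {"input": {"src": "hda", "id": state["input"].encoded_id}}


request: ToolRequestT = {
    "input": {"batch": True, "values": [{"src": "hda", "id": "abc"}, {"src": "hda", "id": "def"}]}
}
job_states = expand(request)
populated = [populate(state) for state in job_states]
dumped = [dump(state) for state in populated]
```

A type checker treats all four aliases as interchangeable, so the names guide readers rather than enforce anything; the value is in signatures that say which stage a function expects.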

2. lib/galaxy/tools/__init__.py — expand_incoming() Decomposition

The monolithic expand_incoming() method (~40 lines of inline logic) was decomposed into focused methods:

Before (single method):

def expand_incoming(self, trans, incoming, request_context, input_format="legacy"):
    # inline: decode rerun_remap_job_id
    # inline: expand meta parameters
    # inline: validate expansion
    # inline: loop over expanded, populate each
    ...

After (decomposed):

a) expand_incoming() — orchestrator, typed signature:

def expand_incoming(
    self, request_context: WorkRequestContext, incoming: ToolRequestT, input_format: InputFormatT = "legacy"
) -> Tuple[List[ToolStateJobInstancePopulatedT], List[ParameterValidationErrorsT], Optional[int], Optional[MatchingCollections]]

Note: trans parameter removed — now uses request_context directly.

b) _rerun_remap_job_id() — module-level function extracted:

def _rerun_remap_job_id(trans, incoming, tool_id: Optional[str]) -> Optional[int]

c) _ensure_expansion_is_valid() — validation guard:

def _ensure_expansion_is_valid(self, expanded_incomings: List[ToolStateJobInstanceT], rerun_remap_job_id: Optional[int]) -> None

d) _populate() — per-job parameter population:

def _populate(self, request_context, expanded_incoming: ToolStateJobInstanceT, input_format: InputFormatT) -> Tuple[ToolStateJobInstancePopulatedT, ParameterValidationErrorsT]
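The shape of the decomposition in (a) through (d) can be sketched with free functions. This is an illustration under stated assumptions, not Galaxy's implementation: the helper bodies, the single-job rule for remapped reruns, and the "value is required" check are all hypothetical, and the real methods also take a request context and return collection info.

```python
from typing import Any, Dict, List, Optional, Tuple

ToolRequestT = Dict[str, Any]
ToolStateJobInstanceT = Dict[str, Any]
ToolStateJobInstancePopulatedT = Dict[str, Any]
ParameterValidationErrorsT = Dict[str, Any]


def rerun_remap_job_id(incoming: ToolRequestT) -> Optional[int]:
    # Hypothetical: the real function decodes an encoded job id.
    raw = incoming.get("rerun_remap_job_id")
    return int(raw) if raw is not None else None


def expand_meta(incoming: ToolRequestT) -> List[ToolStateJobInstanceT]:
    # Hypothetical: a list value fans out into one job state per element.
    values = incoming["x"] if isinstance(incoming["x"], list) else [incoming["x"]]
    return [{"x": value} for value in values]


def ensure_expansion_is_valid(expanded: List[ToolStateJobInstanceT], remap_id: Optional[int]) -> None:
    # Assumed rule: a remapped rerun must expand to exactly one job.
    if remap_id is not None and len(expanded) != 1:
        raise ValueError("rerun remapping requires a single expanded job")


def populate(
    expanded: ToolStateJobInstanceT,
) -> Tuple[ToolStateJobInstancePopulatedT, ParameterValidationErrorsT]:
    # Hypothetical validation: empty strings are rejected, everything else passes.
    state: ToolStateJobInstancePopulatedT = {}
    errors: ParameterValidationErrorsT = {}
    for key, value in expanded.items():
        if value == "":
            errors[key] = "value is required"
        else:
            state[key] = value
    return state, errors


def expand_incoming(
    incoming: ToolRequestT,
) -> Tuple[List[ToolStateJobInstancePopulatedT], List[ParameterValidationErrorsT], Optional[int]]:
    # The orchestrator only sequences the focused helpers.
    remap_id = rerun_remap_job_id(incoming)
    expanded = expand_meta(incoming)
    ensure_expansion_is_valid(expanded, remap_id)
    all_params, all_errors = [], []
    for job_state in expanded:
        params, errors = populate(job_state)
        all_params.append(params)
        all_errors.append(errors)
    return all_params, all_errors, remap_id


all_params, all_errors, remap_id = expand_incoming({"x": ["a", ""]})
```

Because each step has its own typed signature, a later variant can swap one helper (for example, population) without touching the rest.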

e) completed_jobs() — job caching lookup extracted from handle_input():

def completed_jobs(self, trans, use_cached_job: bool, all_params: List[ToolStateJobInstancePopulatedT]) -> Dict[int, Optional[model.Job]]

The same loop was previously duplicated in workflow/modules.py; extracting it into completed_jobs() removed that duplication, and the workflow module now calls the method instead.
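A rough sketch of what such a lookup helper looks like, assuming the result is keyed by the index of each parameter combination. The Job class and the find_cached callback are hypothetical stand-ins for model.Job and the job search that Galaxy performs.

```python
from typing import Any, Callable, Dict, List, Optional

ToolStateJobInstancePopulatedT = Dict[str, Any]


class Job:
    """Hypothetical stand-in for model.Job."""

    def __init__(self, job_id: int) -> None:
        self.id = job_id


def completed_jobs(
    use_cached_job: bool,
    all_params: List[ToolStateJobInstancePopulatedT],
    find_cached: Callable[[ToolStateJobInstancePopulatedT], Optional[Job]],
) -> Dict[int, Optional[Job]]:
    # One entry per parameter combination; None means "run a new job".
    result: Dict[int, Optional[Job]] = {}
    for index, params in enumerate(all_params):
        result[index] = find_cached(params) if use_cached_job else None
    return result


# Hypothetical cache: only the combination with x == "a" has a finished job.
cache = {"a": Job(101)}
lookup = completed_jobs(True, [{"x": "a"}, {"x": "b"}], lambda params: cache.get(params["x"]))
no_cache = completed_jobs(False, [{"x": "a"}], lambda params: cache.get(params["x"]))
```

Keeping the helper in one place means the tool and workflow code paths agree on what an index in the result refers to.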

3. lib/galaxy/tools/execute.py — Execution Framework Typing

a) MappingParameters NamedTuple — typed fields:

class MappingParameters(NamedTuple):
    param_template: ToolRequestT                           # was ToolParameterRequestT
    param_combinations: List[ToolStateJobInstancePopulatedT]  # was ToolParameterRequestInstanceT

The field types were renamed from the old ToolParameterRequestT/ToolParameterRequestInstanceT aliases, which were deleted from execute.py and redefined in _types.py under the new names.

b) ExecutionSlice — typed param_combination:

param_combination: ToolStateJobInstancePopulatedT  # was ToolParameterRequestInstanceT

c) ExecutionTracker — class-level attribute type annotations added:

execution_errors: List[ExecutionErrorsT]
successful_jobs: List[model.Job]
output_datasets: List[Tuple[str, model.HistoryDatasetAssociation]]
output_collections: List[Tuple[str, model.HistoryDatasetCollectionAssociation]]
implicit_collections: Dict[str, model.HistoryDatasetCollectionAssociation]

d) ExecutionErrorsT — new type alias:

ExecutionErrorsT = Union[str, Exception]
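As an illustration of why the union is useful, a consumer can normalize both members to text. The error_messages helper below is hypothetical and is not part of the PR.

```python
from typing import List, Union

ExecutionErrorsT = Union[str, Exception]


def error_messages(errors: List[ExecutionErrorsT]) -> List[str]:
    # Strings pass through; exceptions are rendered with str().
    return [error if isinstance(error, str) else str(error) for error in errors]


messages = error_messages(["bad input", ValueError("missing dataset")])
```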

e) Null safety for collection_info — throughout ExecutionTracker, self.collection_info accesses were guarded with assert collection_info or if collection_info is not None checks, replacing direct attribute access on potentially-None objects.
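The two guard styles can be sketched as follows. The Tracker and CollectionInfo classes and their methods are hypothetical; only the pattern of narrowing an Optional attribute before use reflects the description above.

```python
from typing import List, Optional


class CollectionInfo:
    """Hypothetical stand-in for the matched-collections object."""

    def __init__(self, names: List[str]) -> None:
        self.names = names


class Tracker:
    def __init__(self, collection_info: Optional[CollectionInfo]) -> None:
        self.collection_info = collection_info

    def implicit_names(self) -> List[str]:
        # Branch guard: a missing collection_info is a normal case here.
        collection_info = self.collection_info
        if collection_info is None:
            return []
        return collection_info.names

    def first_name(self) -> str:
        # Assert guard: callers must only reach this when mapping over collections.
        collection_info = self.collection_info
        assert collection_info
        return collection_info.names[0]


plain = Tracker(None)
mapped = Tracker(CollectionInfo(["forward", "reverse"]))
```

Binding the attribute to a local before the check lets a type checker narrow Optional[CollectionInfo] to CollectionInfo for the rest of the method.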

4. lib/galaxy/tools/actions/ — Typed Action execute() Methods

All ToolAction subclass execute() methods had their incoming parameter retyped:

  • Before: incoming: Optional[ToolParameterRequestInstanceT]
  • After: incoming: Optional[ToolStateJobInstancePopulatedT]

Affected files:

  • actions/__init__.py: ToolAction (abstract), DefaultToolAction
  • actions/data_manager.py: DataManagerToolAction
  • actions/history_imp_exp.py: ImportHistoryToolAction, ExportHistoryToolAction
  • actions/metadata.py: SetMetadataToolAction
  • actions/model_operations.py: ModelOperationToolAction
  • actions/upload.py: UploadToolAction

Also added get_output_name() as an @abstractmethod on ToolAction.
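The effect of marking get_output_name() abstract can be shown with a simplified hierarchy. The signatures and method bodies below are assumptions made for illustration; the real execute() and get_output_name() take more arguments.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional

ToolStateJobInstancePopulatedT = Dict[str, Any]


class ToolAction(ABC):
    """Simplified stand-in for the abstract base class."""

    @abstractmethod
    def execute(self, incoming: Optional[ToolStateJobInstancePopulatedT] = None) -> str: ...

    @abstractmethod
    def get_output_name(self, output_name: str) -> str: ...


class CompleteAction(ToolAction):
    def execute(self, incoming: Optional[ToolStateJobInstancePopulatedT] = None) -> str:
        return f"ran with {sorted(incoming or {})}"

    def get_output_name(self, output_name: str) -> str:
        return output_name.upper()  # hypothetical naming rule


class IncompleteAction(ToolAction):
    # Missing get_output_name(), so instantiation fails with TypeError.
    def execute(self, incoming: Optional[ToolStateJobInstancePopulatedT] = None) -> str:
        return "ran"


def can_instantiate(cls: type) -> bool:
    try:
        cls()
        return True
    except TypeError:
        return False
```

With the abstract declaration in place, a subclass that forgets the method fails at construction time rather than at the first call.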

5. lib/galaxy/tools/parameters/__init__.py — New Functions and Typed Signatures

a) ToolInputsT — new type alias:

ToolInputsT = Dict[str, Union[Group, ToolParameter]]

b) params_to_json_internal() — new convenience function:

def params_to_json_internal(params: ToolInputsT, param_values: ToolStateJobInstancePopulatedT, app) -> ToolStateDumpedToJsonInternalT

Wraps params_to_strings() with nested=True, use_security=False → decoded IDs.

c) params_to_json() — new convenience function:

def params_to_json(params: ToolInputsT, param_values: ToolStateJobInstancePopulatedT, app) -> ToolStateDumpedToJsonT

Wraps params_to_strings() with nested=True, use_security=True → encoded IDs.
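A toy version of the wrapper relationship, assuming each populated value is just a raw integer id. The encode_id function and the dumped payload shape are hypothetical; the sketch only shows how the two flags select among the three dumped representations.

```python
import json
from typing import Any, Dict

ToolStateJobInstancePopulatedT = Dict[str, Any]


def encode_id(raw_id: int) -> str:
    # Hypothetical encoding; Galaxy's real id encoding is different.
    return f"enc-{raw_id}"


def params_to_strings(
    param_values: ToolStateJobInstancePopulatedT, nested: bool = False, use_security: bool = False
) -> Dict[str, Any]:
    dumped: Dict[str, Any] = {}
    for name, raw_id in param_values.items():
        ref = {"src": "hda", "id": encode_id(raw_id) if use_security else raw_id}
        # nested=True keeps JSON-like structures; otherwise values become strings.
        dumped[name] = ref if nested else json.dumps(ref)
    return dumped


def params_to_json_internal(param_values: ToolStateJobInstancePopulatedT) -> Dict[str, Any]:
    return params_to_strings(param_values, nested=True, use_security=False)


def params_to_json(param_values: ToolStateJobInstancePopulatedT) -> Dict[str, Any]:
    return params_to_strings(param_values, nested=True, use_security=True)


state = {"input": 42}
as_json = params_to_json(state)
as_internal = params_to_json_internal(state)
as_strings = params_to_strings(state)
```

Naming the two flag combinations gives each a precise return type, where the general function can only promise a union.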

d) params_to_strings() — enhanced signature and docs:

def params_to_strings(
    params: ToolInputsT, param_values: ToolStateJobInstancePopulatedT, app, nested=False, use_security=False
) -> Union[ToolStateDumpedToJsonT, ToolStateDumpedToJsonInternalT, ToolStateDumpedToStringsT]

e) populate_state() — typed parameters:

def populate_state(
    request_context, inputs: ToolInputsT, incoming: ToolStateJobInstanceT,
    state: ToolStateJobInstancePopulatedT, errors: Optional[ParameterValidationErrorsT] = None,
    ..., input_format: InputFormatT = "legacy"
)

6. lib/galaxy/tools/parameters/grouping.py — Constructor Refactoring

All Group subclasses changed to require name in constructor:

Before:

group = Repeat()
group.name = "r"

After:

group = Repeat("r")

Affected classes: Group, Repeat, Section, UploadDataset, Conditional

Also added class-level type annotations:

  • Group.name: str
  • Repeat.inputs: ToolInputsT, Repeat.min: int, Repeat.max: float
  • Section.inputs: ToolInputsT
  • UploadDataset.inputs: ToolInputsT
  • Conditional.cases: List[ConditionalWhen], Conditional.value_ref: Optional[str]

Repeat.min defaults to 0 and Repeat.max defaults to inf (from math.inf), replacing None.
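A small sketch of why numeric defaults are convenient, using a pared-down Repeat class. The accepts() method is hypothetical and exists only to show that bounds checks no longer need a None branch.

```python
import math


class Repeat:
    """Pared-down stand-in; the real class carries inputs and more."""

    name: str
    min: int
    max: float

    def __init__(self, name: str) -> None:
        self.name = name
        self.min = 0
        self.max = math.inf  # float, which is why max is annotated as float

    def accepts(self, count: int) -> bool:
        # Works for bounded and unbounded repeats alike.
        return self.min <= count <= self.max


unbounded = Repeat("r")
bounded = Repeat("pairs")
bounded.min = 1
bounded.max = 2
```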

7. Other Typed Improvements

a) lib/galaxy/tools/parameters/meta.py:

  • ExpandedT = Tuple[List[ToolStateJobInstanceT], Optional[matching.MatchingCollections]]
  • expand_meta_parameters(trans, tool, incoming: ToolRequestT) -> ExpandedT

b) lib/galaxy/managers/jobs.py:

  • by_tool_input() method typed with ToolStateJobInstancePopulatedT and ToolStateDumpedToJsonInternalT
  • New type aliases: JobStateT = str, JobStatesT = Union[JobStateT, List[JobStateT]]
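A union alias like JobStatesT usually pairs with a normalization step. The as_state_list helper below is a hypothetical example of that pattern, not a function from the PR.

```python
from typing import List, Union

JobStateT = str
JobStatesT = Union[JobStateT, List[JobStateT]]


def as_state_list(states: JobStatesT) -> List[JobStateT]:
    # Accept a single state or a list and always return a list.
    return [states] if isinstance(states, str) else list(states)


single = as_state_list("ok")
several = as_state_list(["ok", "error"])
```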

c) lib/galaxy/webapps/galaxy/api/jobs.py:

  • search() endpoint updated to use proxy_work_context_for_history() instead of constructing WorkRequestContext directly
  • expand_incoming() call site updated for new signature (no trans param)

d) lib/galaxy/work/context.py:

  • proxy_work_context_for_history() now has explicit -> WorkRequestContext return type

e) lib/galaxy/workflow/modules.py:

  • Duplicated completed_jobs loop replaced with tool.completed_jobs(trans, use_cached_job, param_combinations)

f) lib/galaxy/tools/parameters/basic.py:

  • ToolParameter.name: str class-level annotation added

g) test/unit/app/tools/test_evaluation.py:

  • Test code updated for new Group constructor signatures

Current Codebase State (Post-PR Evolution)

Cross-referencing PR #18758 with the current structured_tool_state branch:

All PR Changes Intact

Every change from PR #18758 is present in the current codebase at the expected locations.

Significant Evolution Since PR

1. Async Variants Added:

  • expand_incoming_async() — async version of expand_incoming() in Tool class
  • _populate_async() — async version of _populate() in Tool class
  • populate_state_async() — async version in parameters module
  • These support the tool request/tasks API for async tool execution

2. MappingParameters Enhanced:

class MappingParameters(NamedTuple):
    param_template: ToolRequestT
    param_combinations: List[ToolStateJobInstancePopulatedT]
    validated_param_template: Optional[RequestInternalDereferencedToolState] = None
    validated_param_combinations: Optional[List[JobInternalToolState]] = None

    def ensure_validated(self): ...

Added optional schema-validated state fields for the structured tool state execution path.
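The dual-carrier idea can be sketched with a NamedTuple whose validated fields default to None. The body of ensure_validated() is elided in the source, so the check below is an assumption, and plain dict aliases stand in for RequestInternalDereferencedToolState and JobInternalToolState.

```python
from typing import Any, Dict, List, NamedTuple, Optional

ToolRequestT = Dict[str, Any]
ToolStateJobInstancePopulatedT = Dict[str, Any]
ValidatedStateT = Dict[str, Any]  # hypothetical stand-in for the validated state classes


class MappingParameters(NamedTuple):
    param_template: ToolRequestT
    param_combinations: List[ToolStateJobInstancePopulatedT]
    validated_param_template: Optional[ValidatedStateT] = None
    validated_param_combinations: Optional[List[ValidatedStateT]] = None

    def ensure_validated(self) -> None:
        # Assumed behavior: fail loudly if the validated fields were never set.
        if self.validated_param_template is None or self.validated_param_combinations is None:
            raise ValueError("validated tool state is required on this path")


legacy = MappingParameters({"x": "a"}, [{"x": "a"}])
validated = MappingParameters({"x": "a"}, [{"x": "a"}], {"x": "a"}, [{"x": "a"}])

try:
    legacy.ensure_validated()
    legacy_is_validated = True
except ValueError:
    legacy_is_validated = False
```

Defaulting the new fields to None keeps existing two-argument call sites working while letting the schema-validated path opt in.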

3. ExecutionSlice Enhanced:

  • Added validated_param_combination: Optional[JobInternalToolState] field
  • Supports both legacy and schema-validated execution modes

4. _ensure_expansion_is_valid() Updated:

def _ensure_expansion_is_valid(
    self,
    expanded_incomings: Union[List[JobInternalToolState], List[ToolStateJobInstanceT]],
    rerun_remap_job_id: Optional[int],
) -> None

Union type now includes JobInternalToolState for schema-validated paths.

Relationship to Structured Tool State

PR #18758 is the bridge PR between the old untyped tool execution and the structured tool state system:

  1. Type aliases as documentation — Even though all types are Dict[str, Any] at runtime, the aliases document what has happened to the state at each point in the pipeline
  2. Decomposition enables insertion points — Breaking expand_incoming() into parts created clean insertion points for the async/schema-validated variants added later
  3. MappingParameters as dual carrier — The later addition of validated_param_template/validated_param_combinations shows MappingParameters became the bridge carrying both legacy and schema-validated state through execution
  4. Foundation for _types.py — This module is imported across the tools package and serves as the central type vocabulary for tool state

File Index

File                                          | Lines (PR)                  | Current Lines | Status
lib/galaxy/tools/_types.py                    | 65 (new)                    | 69            | Intact
lib/galaxy/tools/__init__.py                  | major changes               | ~5000+        | Intact + async variants
lib/galaxy/tools/execute.py                   | major changes               | ~700+         | Intact + validated state fields
lib/galaxy/tools/actions/__init__.py          | typed execute()             | ~1000+        | Intact
lib/galaxy/tools/actions/data_manager.py      | typed execute()             | ~80+          | Intact
lib/galaxy/tools/actions/history_imp_exp.py   | typed execute()             | ~200+         | Intact
lib/galaxy/tools/actions/metadata.py          | typed execute()             | ~200+         | Intact
lib/galaxy/tools/actions/model_operations.py  | typed execute()             | ~200+         | Intact
lib/galaxy/tools/actions/upload.py            | typed execute()             | ~200+         | Intact
lib/galaxy/tools/parameters/__init__.py       | typed + new fns             | ~700+         | Intact + async populate
lib/galaxy/tools/parameters/grouping.py       | constructor refactor        | ~800+         | Intact
lib/galaxy/tools/parameters/meta.py           | typed expand                | ~250+         | Intact
lib/galaxy/tools/parameters/basic.py          | name annotation             | ~2500+        | Intact
lib/galaxy/managers/jobs.py                   | typed by_tool_input         | ~500+         | Intact
lib/galaxy/webapps/galaxy/api/jobs.py         | updated call site           | ~500+         | Intact
lib/galaxy/workflow/modules.py                | deduplicated completed_jobs | ~2500+        | Intact
lib/galaxy/work/context.py                    | return type                 | ~200+         | Intact
