Workflow Extraction in Galaxy - Overview
Frontend Code
User Entry Point
File: /client/src/components/History/HistoryOptions.vue (lines 210-217)
The “Extract Workflow” option appears in the history dropdown menu:
<BDropdownItem
    v-if="historyStore.currentHistoryId === history.id"
    :disabled="isAnonymous"
    :title="userTitle('Convert History to Workflow')"
    @click="iframeRedirect(`/workflow/build_from_current_history?history_id=${history.id}`)">
    <FontAwesomeIcon fixed-width :icon="faFileExport" />
    <span v-localize>Extract Workflow</span>
</BDropdownItem>
Key points:
- Only available for the current history (not other histories in multiview)
- Requires user to be logged in
- Uses iframeRedirect to load the legacy Mako-based extraction UI
Legacy Mako Template
File: /templates/build_from_current_history.mako
This is the actual extraction UI - a server-rendered Mako template (not Vue). It provides:
- Workflow name input - Pre-filled with “Workflow constructed from history ‘{history_name}’”
- Job listing table with columns:
  - Tool name (with checkbox to include/exclude)
  - Output datasets created by that job
- Input dataset marking - Datasets can be marked as workflow inputs with custom names
- Tool compatibility warnings - Non-workflow-compatible tools shown in gray/disabled
- Version warnings - Alerts when tool version differs from extraction version
The form submission posts back to the same endpoint with the selected job_ids, dataset_ids, and workflow_name.
Web Controller (serves the Mako template)
File: /lib/galaxy/webapps/galaxy/controllers/workflow.py (lines 182-230)
Method: build_from_current_history()
Two-phase handling:
- GET request (initial load):
  - Calls summarize(trans, history) to analyze history
  - Returns jobs dict and warnings
  - Renders build_from_current_history.mako template
- POST request (form submission):
  - Calls extract_workflow() with selected job_ids, dataset_ids, workflow_name
  - Returns success message with links to edit/run the new workflow
if (job_ids is None and dataset_ids is None) or workflow_name is None:
    jobs, warnings = summarize(trans, history)
    return trans.fill_template("build_from_current_history.mako", jobs=jobs, warnings=warnings, history=history)
else:
    stored_workflow = extract_workflow(trans, user=user, history=history, job_ids=job_ids, ...)
    # Returns success message with edit/run links
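The two-phase dispatch can be illustrated in isolation. The following is a minimal, self-contained sketch, not Galaxy's controller code: summarize_stub and extract_stub are hypothetical stand-ins for summarize() and extract_workflow(), the history is a plain dict invented for the example, and the returned dicts only indicate which branch was taken.

```python
def summarize_stub(history):
    """Hypothetical stand-in for summarize(): returns a (jobs, warnings) tuple."""
    return dict(history["jobs"]), []


def extract_stub(history, job_ids, dataset_ids, workflow_name):
    """Hypothetical stand-in for extract_workflow(): echoes the selection."""
    return {
        "name": workflow_name,
        "job_ids": list(job_ids or []),
        "dataset_ids": list(dataset_ids or []),
    }


def build_from_current_history(history, job_ids=None, dataset_ids=None, workflow_name=None):
    # Same condition as the controller excerpt: without a selection
    # or without a workflow name, fall back to rendering the form.
    if (job_ids is None and dataset_ids is None) or workflow_name is None:
        jobs, warnings = summarize_stub(history)
        return {"phase": "form", "jobs": jobs, "warnings": warnings}
    workflow = extract_stub(history, job_ids, dataset_ids, workflow_name)
    return {"phase": "extracted", "workflow": workflow}
```

Note that, under this condition, a request carrying job_ids but no workflow_name still lands in the form-rendering branch.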
Backend Code
Core Extraction Module
File: /lib/galaxy/workflow/extract.py (463 lines)
Main Functions
- extract_workflow() (lines 34-77)
  - Entry point for workflow extraction
  - Takes trans, user, history, job_ids, dataset_ids, dataset_collection_ids, workflow_name
  - Returns a stored workflow object
- extract_steps() (lines 80-197)
  - Builds workflow steps from history content
  - Handles job-to-step mapping
  - Manages input/output connections
- summarize() (lines 233-240)
  - Called by web controller to prepare data for Mako template
  - Returns (jobs, warnings) tuple
  - Creates WorkflowSummary instance internally
Key Classes
WorkflowSummary (lines 243-389)
- Analyzes history for extractable content
- Identifies jobs, datasets, collections
- Maps job IDs to representative jobs (for implicit collection jobs)
- Tracks hda_hid_in_history and hdca_hid_in_history for HID lookups
- __summarize() method builds the jobs dict shown in the extraction UI
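The idea of mapping job IDs to a representative job can be sketched with toy data. This is an illustration under stated assumptions, not the WorkflowSummary implementation: the job dicts, the "group" key (a hypothetical marker for jobs that belong to the same implicit collection mapping), and the function name are all invented for the example.

```python
def map_to_representative_jobs(jobs):
    """Map each job id to the id of a representative job.

    `jobs` is a list of dicts with an "id" key and an optional "group" key.
    Jobs that share a group are represented by the first job seen in that
    group; ungrouped jobs represent themselves.
    """
    representative_for_group = {}
    job_id_to_representative = {}
    for job in jobs:
        group = job.get("group")
        if group is None:
            job_id_to_representative[job["id"]] = job["id"]
            continue
        # setdefault stores the first job id seen for the group and returns it afterwards
        representative = representative_for_group.setdefault(group, job["id"])
        job_id_to_representative[job["id"]] = representative
    return job_id_to_representative
```

With such a mapping, many jobs produced by one mapped-over tool run would collapse into a single entry in the listing, which is the effect the representative-job bullet describes.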
API Controller
File: /lib/galaxy/webapps/galaxy/api/workflows.py
- Create workflow endpoint (lines 196-317)
  - Handles POST /api/workflows
  - Supports from_history_id parameter for extraction mode
Manager Layer
File: /lib/galaxy/managers/workflows.py
- Coordinates between API and extraction logic
- Handles permissions and storage
APIs
Workflow Extraction Endpoint
Route: POST /api/workflows
Key Parameters for Extraction:
- from_history_id (required for extraction) - History ID to extract from
- job_ids (optional) - Specific jobs to include
- dataset_ids (optional) - Specific datasets to include
- dataset_collection_ids (optional) - Specific collections to include
- workflow_name (optional) - Name for extracted workflow
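As a rough sketch of how a client might assemble these parameters, the helper below builds a payload dict from the names listed above. The function name build_extraction_payload is hypothetical, and the sketch assumes nothing about how the payload is sent (authentication, ID encoding, and request body format are outside what this overview covers).

```python
def build_extraction_payload(
    from_history_id,
    job_ids=None,
    dataset_ids=None,
    dataset_collection_ids=None,
    workflow_name=None,
):
    """Assemble the extraction parameters, omitting optional ones left unset."""
    if not from_history_id:
        raise ValueError("from_history_id is required for extraction")
    payload = {"from_history_id": from_history_id}
    optional = {
        "job_ids": job_ids,
        "dataset_ids": dataset_ids,
        "dataset_collection_ids": dataset_collection_ids,
        "workflow_name": workflow_name,
    }
    for key, value in optional.items():
        # Only include parameters the caller actually supplied
        if value is not None:
            payload[key] = value
    return payload
```

For example, build_extraction_payload("abc123", job_ids=["j1"], workflow_name="My workflow") yields a dict with exactly the three supplied keys, where "abc123" and "j1" are placeholder IDs.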
Response:
{
  "id": "workflow_encoded_id",
  "name": "Extracted Workflow",
  "steps": {...},
  "annotation": "...",
  ...
}
Error Handling:
- 400: Invalid parameters or unextractable content
- 403: Permission denied on history
- 404: History not found
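A client could translate these status codes into readable messages along the following lines. This is a minimal sketch: the mapping mirrors the three codes listed above, while the function name and the fallback message for other codes are assumptions made for the example.

```python
# Status codes and meanings as listed in the error handling notes
EXTRACTION_ERRORS = {
    400: "Invalid parameters or unextractable content",
    403: "Permission denied on history",
    404: "History not found",
}


def describe_extraction_status(status_code):
    """Return "ok" for 2xx responses, otherwise a human-readable description."""
    if 200 <= status_code < 300:
        return "ok"
    # Codes outside the documented set get a generic fallback message
    return EXTRACTION_ERRORS.get(status_code, f"Unexpected status {status_code}")
```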
Roundtrip Workflow Extraction Tests
File: /lib/galaxy_test/api/test_workflow_extraction.py (687 lines)
What “Roundtrip” Means
Roundtrip testing validates that:
- A workflow can be run on input data
- The resulting history can have a workflow extracted from it
- The extracted workflow matches the original (structurally)
- The extracted workflow can be run again with equivalent results
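What "matches structurally" could mean is sketched below with a deliberately simplified workflow shape. The dict layout (steps keyed by ID, each with a tool_id and an inputs mapping from input name to source step ID) is a hypothetical stand-in, not Galaxy's workflow format, and the comparison ignores step IDs and names so that renumbered steps still compare equal.

```python
from collections import Counter


def structure_signature(workflow):
    """Summarize a workflow as (tool counts, connection counts).

    Expects {"steps": {step_id: {"tool_id": str or None,
                                 "inputs": {input_name: source_step_id}}}}.
    A tool_id of None marks an input step in this toy representation.
    """
    steps = workflow["steps"]
    tools = Counter(step["tool_id"] for step in steps.values())
    connections = Counter(
        (steps[source]["tool_id"], step["tool_id"], input_name)
        for step in steps.values()
        for input_name, source in step["inputs"].items()
    )
    return tools, connections


def structurally_equal(first, second):
    """Compare two workflows by signature, ignoring step IDs."""
    return structure_signature(first) == structure_signature(second)
```

This check is coarse: two different graphs that use the same tools with the same kinds of connections would compare equal, so it is best read as an illustration of the idea and not as a substitute for the assertions in the test suite.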
Test Scenarios
- Basic extraction - Simple tool chains
- Copied content handling - Datasets copied between histories
- Collection mapping - Tools that map over collections
- Collection reduction - Tools that reduce collections to single outputs
- Subcollection nesting - Nested collection structures
- Output collections - Tools producing collection outputs
- Multiple inputs - Workflows with multiple input datasets
- Conditional steps - Steps with when clauses
- Subworkflows - Nested workflow extraction
Key Test Patterns
# Typical roundtrip test pattern
def test_extract_workflow_xxx(self):
    # 1. Run a workflow
    history_id = self.dataset_populator.new_history()
    self._run_workflow(workflow_def, history_id=history_id)
    # 2. Extract workflow from history
    extracted = self._extract_workflow(history_id)
    # 3. Verify structure matches
    self._assert_workflow_structure(extracted, expected_structure)
    # 4. Optionally run extracted workflow in a fresh history
    new_history_id = self.dataset_populator.new_history()
    self._run_workflow(extracted, history_id=new_history_id)
Helper Methods
- _extract_workflow(history_id, **kwargs) - Calls extraction API
- _assert_workflow_structure() - Validates extracted workflow
- _run_workflow() - Executes workflow for testing
- _wait_for_history() - Waits for history to settle
Test Limitations Noted
- Some complex nested subworkflow scenarios may not roundtrip perfectly
- Collection type inference can be imperfect in edge cases
- Tool version changes between extraction and re-run can cause issues