GRAPH_WORKFLOW_EXTRACTION_PLAN

Graph-Driven / Notebook-Driven Workflow Extraction — Plan (Blocked)

Date: 2026-05-16 (corrected 2026-05-17; see “Correction” below)

Status: BLOCKED, do not start.

Blocked on: EXTRACT_TOOL_REQUEST_STATE_PLAN (the gate). Implemented on graph_workflow_extract (commit 5699c2c324) but not yet merged to dev, so this plan stays blocked until it lands. Resume only after extraction and the History Graph share the structured ToolRequest.request model.

Predecessor chain: ICJ_NATIVE_PLAN → EXTRACT_ICJ_PLAN → EXTRACT_TOOL_REQUEST_STATE_PLAN → this.

Tracking issue: TBD

Related research:

  • vault/research/PR 21932 - History Graph API.md
  • vault/research/PR 21935 - Workflow Extraction Vue Conversion.md
  • vault/projects/history_markdown/HISTORY_MARKDOWN_ARCHITECTURE.md

Why this exists

The North Star: a Galaxy Notebook narrates an analysis (what inputs to pick, what the outputs mean) and the provenance graph walked backward from the referenced content is what gets exported as a workflow. The notebook replaces the legacy extraction form-as-a-page; the graph (read-only, selectable) is the extraction surface.

Two framings settled during ideation and carried forward as constraints:

Why blocked: every variant needs extraction and the graph to share one identity space. Until EXTRACT_TOOL_REQUEST_STATE_PLAN lands they do not (the graph is ToolRequest-native; extraction was JobParameter-native). After it lands, the graph’s {src,id} walk and the extractor’s are literally the same code, so selection→payload becomes a thin translator instead of a bridge across diverging engines.
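To make the shared {src,id} walk concrete, here is a minimal sketch of a backward closure over a provenance graph. It is an illustration only, not the gate's code: the `Ref` tuple shape, the `parents` adjacency map, the `"hda"` src value, and the toy history are all assumptions made for this example.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical reference shape: (src, id), e.g. ("hda", "2") or ("tool_request", "10").
Ref = Tuple[str, str]


def backward_closure(seeds: Iterable[Ref], parents: Dict[Ref, List[Ref]]) -> Set[Ref]:
    """Return the seeds plus everything reachable by following parent edges."""
    seen: Set[Ref] = set()
    queue = deque(seeds)
    while queue:
        ref = queue.popleft()
        if ref in seen:
            continue  # already visited via another path
        seen.add(ref)
        queue.extend(parents.get(ref, []))
    return seen


# Toy history (assumed): hda 1 -> request 10 -> hda 2 -> request 11 -> hda 3,
# plus an unrelated branch hda 1 -> request 12 -> hda 4.
parents: Dict[Ref, List[Ref]] = {
    ("hda", "3"): [("tool_request", "11")],
    ("tool_request", "11"): [("hda", "2")],
    ("hda", "2"): [("tool_request", "10")],
    ("tool_request", "10"): [("hda", "1")],
    ("hda", "4"): [("tool_request", "12")],
    ("tool_request", "12"): [("hda", "1")],
}

closure = backward_closure([("hda", "2")], parents)
```

Seeding the walk with the content a notebook references yields exactly the subgraph that would be exported; nodes on unrelated branches (here hda 4 and request 12) stay out of the closure.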

Correction (2026-05-17 — post-gate-implementation review)

The gate commit (5699c2c324) is implemented; reviewing it against this plan surfaced three places this plan over-claimed:

None of these block resuming after the gate merges; they correct this plan’s estimates of how thin steps 1–2 are.

Resume prompt (for the implementing agent)

Once EXTRACT_TOOL_REQUEST_STATE_PLAN is merged (extraction reads structured ToolRequest.request; History Graph and extraction share the ref-walk), proceed roughly in this order, each independently shippable:

  1. Selection-aware Graph/* primitives — multi-select state + @select emit + backward-closure highlight on the existing read-only GraphView/GraphNode (PR 21932). No editing. This covers roughly the 80/20 of an “editable” graph.
  2. GraphSelection → WorkflowExtractionByIdsPayload translator — pure function: selected graph node ids → a valid #22706 payload. Not as trivial as originally framed (see Correction): the gate commit added the forward ImplicitCollectionJobs.tool_request helper, but the graph speaks tool_request_id (node type "tool_request") and WorkflowExtractionByIdsPayload has no tool_request_ids field (only job_ids / implicit_collection_jobs_ids). The translator still owns building the reverse tool_request_id → (ICJ id | job id) mapping (raw materials: ToolRequest.jobs, ToolRequest.implicit_collections, Job.implicit_collection_jobs_association). Roundtrip-testable against existing extraction tests.
  3. Notebook propose_workflow agent tool — one new history-agent tool (pattern: the existing 5 tools incl. resolve_hid) mapping notebook-referenced HIDs → translator → extract-by-ids.
  4. Extraction lineage stamp — analog to Page.source_invocation_id: record which history subgraph/seed an extracted workflow came from, so notebook ↔ workflow round-trips.
  5. linked:false cross-product map-over — the deferred follow-up from the gate plan: model cross-product extraction (currently hard-failed). Likely needs workflow-step connection semantics for MULTIPLIED inputs.
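The translator in step 2 could look roughly like the sketch below. It is hedged throughout: `ToolRequestRecord` and `ExtractionPayload` are hypothetical stand-ins (the latter models only the two fields this plan names on WorkflowExtractionByIdsPayload), and the rule "a mapped-over request contributes its ICJ id, otherwise its job ids" is an assumption about the reverse mapping, not confirmed behaviour.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class ToolRequestRecord:
    """Hypothetical flattened view of what one ToolRequest produced."""
    job_ids: List[int]
    # Assumed to be set when the request mapped over a collection.
    implicit_collection_jobs_id: Optional[int] = None


@dataclass
class ExtractionPayload:
    """Stand-in for WorkflowExtractionByIdsPayload (only the fields named in this plan)."""
    job_ids: List[int] = field(default_factory=list)
    implicit_collection_jobs_ids: List[int] = field(default_factory=list)


def selection_to_payload(
    selected_tool_request_ids: Iterable[int],
    records: Dict[int, ToolRequestRecord],
) -> ExtractionPayload:
    """Pure function: selected tool_request node ids -> extract-by-ids payload."""
    payload = ExtractionPayload()
    for tr_id in selected_tool_request_ids:
        record = records.get(tr_id)
        if record is None:
            raise KeyError(f"no reverse mapping for tool_request {tr_id}")
        icj_id = record.implicit_collection_jobs_id
        if icj_id is not None:
            # Mapped-over request: reference the ICJ once, not each job.
            if icj_id not in payload.implicit_collection_jobs_ids:
                payload.implicit_collection_jobs_ids.append(icj_id)
        else:
            for job_id in record.job_ids:
                if job_id not in payload.job_ids:
                    payload.job_ids.append(job_id)
    return payload


# Toy reverse mapping (assumed data): request 10 ran one job, request 11 mapped over a collection.
records = {
    10: ToolRequestRecord(job_ids=[100]),
    11: ToolRequestRecord(job_ids=[101, 102], implicit_collection_jobs_id=5),
}
payload = selection_to_payload([10, 11], records)
```

Keeping the translator pure (ids in, payload out, with the reverse mapping passed in as data) is what makes it roundtrip-testable against the existing extraction tests without a running graph UI.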

Carried-forward open questions

Out of scope until unblocked

Everything. This file exists to carry the vision and the resume prompt across the gate, not to be worked before it.