Draft Workflows — Design Spec Synthesis

1. Format summary

A draft Format2 workflow is gxformat2 with surgical, wrapper-tier relaxations and a _plan_* family of free-text planning fields layered onto tool steps. Topology — workflow inputs, outputs, step set, edges, branches, when: guards — is concrete and final at the draft stage; only wrapper/parameter decisions are deferred (spec L29-31, L41).

Allowed TODO sentinels (wrapper-tier, tool steps only) (spec L34-39):

_plan_* fields (spec L43-54): per-tool-step, free text, all optional but expected when the wrapper is deferred. They MUST disappear before the workflow is treated as runnable.

Tier boundary (spec L52, L41):

Two invariants: every TODO/TODO_* sentinel and every _plan_* field MUST disappear before the workflow is runnable; and the connection graph must still resolve syntactically even when port names are sentinels (spec L39).
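
The first invariant amounts to a recursive scan of the parsed workflow. The following is a minimal sketch, not the gxwf implementation: it assumes the draft has already been loaded into plain dicts and lists, and the names is_todo and remaining_draft_markers are hypothetical.

```python
from typing import Any, Iterator


def is_todo(value: Any) -> bool:
    """True for the bare TODO sentinel or any TODO_<hint> string."""
    return isinstance(value, str) and (value == "TODO" or value.startswith("TODO_"))


def remaining_draft_markers(node: Any, path: str = "") -> Iterator[str]:
    """Yield a dotted path for every TODO sentinel and _plan_* key under node."""
    if isinstance(node, dict):
        for key, value in node.items():
            here = f"{path}.{key}" if path else str(key)
            if str(key).startswith("_plan_"):
                yield here
                continue  # plan text is free-form, so it is not scanned further
            if is_todo(key):
                yield here  # sentinel used as a port name (dict key)
            yield from remaining_draft_markers(value, here)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from remaining_draft_markers(item, f"{path}[{index}]")
    elif is_todo(node):
        yield path


# Hypothetical draft fragment, already parsed from YAML.
draft = {
    "steps": {
        "fastp": {
            "tool_id": "TODO",
            "in": {"TODO_input": "reads"},
            "out": ["TODO_trimmed_paired"],
            "_plan_state": "adapter trimming on",
        }
    }
}
markers = list(remaining_draft_markers(draft))
```

An empty markers list would mean the workflow satisfies the first invariant; a non-empty list doubles as a strip-readiness preview.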

2. Validation surface

gxwf draft-validate MUST be a strict superset of gxwf state-validate --no-tool-state for topology checks, and a relaxation of the structural validator for wrapper-tier fields.

MUST check

MAY check (warn, don’t fail)

MUST NOT enforce

Open / strictness questions

3. Concrete subset semantics (gxwf _draft-concrete-subset)

Goal: extract the maximal subgraph that is a runnable gxformat2 workflow from a partially concretized draft.

Which steps qualify as “concrete”

A step is concrete iff all of:

Propagation rule (the hard question)

A concrete step that consumes from a TODO step is not actually runnable — its in: references a port that doesn’t exist yet. Two policy options:

Recommendation: (A) strict, with a --loose flag for (B). The downstream Foundry agent loop uses this command to answer “show me what I can actually execute today”, which is (A). Implementation: topological closure, where concrete_set = { s | s is concrete AND every input edge sources from (workflow_input ∪ concrete_set) }, iterated to a fixpoint.
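
The fixpoint can be sketched as a pruning loop. This is an illustration under stated assumptions, not the gxwf code: the graph is reduced to a mapping from step label to the labels it consumes from, the per-step "is concrete" decision is supplied as a precomputed set, and all names are hypothetical. It starts from every locally concrete step and removes steps until nothing changes, which on a DAG reaches the same set as growing it from the workflow inputs.

```python
from typing import Dict, List, Set


def strict_concrete_set(
    upstream: Dict[str, List[str]],
    locally_concrete: Set[str],
    workflow_inputs: Set[str],
) -> Set[str]:
    """Keep only concrete steps whose every source is a workflow input or a kept step."""
    concrete = set(locally_concrete)
    changed = True
    while changed:  # iterate to fixpoint
        changed = False
        for step in sorted(concrete):  # sorted copy: safe to remove while looping
            sources = upstream.get(step, [])
            if not all(s in workflow_inputs or s in concrete for s in sources):
                concrete.remove(step)  # cascade-drop: depends on a TODO or dropped step
                changed = True
    return concrete


# "align" still carries TODO sentinels, so it is not locally concrete.
upstream = {
    "fastp": ["reads"],
    "align": ["fastp", "reference"],
    "sort": ["align"],
    "index": ["sort"],
    "qc": ["reads"],
}
kept = strict_concrete_set(
    upstream,
    locally_concrete={"fastp", "sort", "index", "qc"},
    workflow_inputs={"reads", "reference"},
)
```

Here sort is dropped because it consumes from align, and index is dropped in turn because it consumes from sort.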

User note 2026-05-22: the DAG topology should make this case rare in practice — if draft-next-step always picks the topologically-earliest unresolved step, the agent loop fills bottom-up and never produces a concrete step depending on a TODO. The propagation logic is a safety net, not a workhorse.

Outputs with step/TODO_* outputSource

If a workflow-level outputs[].outputSource references a port that is still a TODO_* sentinel, or a step that was dropped, drop that output entry from the subset. Emit a report of dropped outputs so the agent loop sees what’s missing.
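
A minimal sketch of that pruning rule follows. It assumes outputs are given in list form with id and outputSource keys, and that every outputSource has the step/port shape; the function name and the reason strings are hypothetical, not a defined report format.

```python
from typing import List, Set, Tuple


def prune_outputs(outputs: List[dict], kept_steps: Set[str]) -> Tuple[List[dict], List[dict]]:
    """Split workflow outputs into those that survive and a report of those dropped."""
    kept, dropped = [], []
    for output in outputs:
        step, _, port = output["outputSource"].partition("/")
        if port == "TODO" or port.startswith("TODO_"):
            dropped.append({"id": output["id"], "reason": "todo_port"})
        elif step not in kept_steps:
            dropped.append({"id": output["id"], "reason": "dropped_step"})
        else:
            kept.append(output)
    return kept, dropped


outputs = [
    {"id": "trimmed", "outputSource": "fastp/TODO_trimmed_paired"},
    {"id": "bam", "outputSource": "align/bam"},
    {"id": "qc_report", "outputSource": "qc/html"},
]
kept_outputs, dropped_outputs = prune_outputs(outputs, kept_steps={"qc"})
```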

Order preservation

Subworkflow support

Output shape

Emit gxformat2 YAML on stdout (or -o). Optionally emit a sidecar JSON report describing what was dropped and why (--report-json).

4. draft-next-step semantics

Goal: tell the agent loop which step to concretize next and surface the planning context.
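
One way to pick that step is a topological walk that returns the first step still needing work. The sketch below uses Kahn's algorithm with a min-heap so that ties resolve the same way on every run. The tie-break by label is an assumption made here for illustration (this document only asks for a deterministic one), and the function and parameter names are hypothetical.

```python
import heapq
from typing import Dict, List, Optional, Set


def next_step(upstream: Dict[str, List[str]], needs_work: Set[str]) -> Optional[str]:
    """Return the topologically earliest step that still needs work, or None."""
    indegree = {step: 0 for step in upstream}
    downstream: Dict[str, List[str]] = {step: [] for step in upstream}
    for step, sources in upstream.items():
        for source in sources:
            if source in upstream:  # workflow inputs are not steps; skip them
                indegree[step] += 1
                downstream[source].append(step)
    ready = [step for step, degree in indegree.items() if degree == 0]
    heapq.heapify(ready)  # smallest label first: assumed tie-break
    while ready:
        step = heapq.heappop(ready)
        if step in needs_work:
            return step
        for consumer in downstream[step]:
            indegree[consumer] -= 1
            if indegree[consumer] == 0:
                heapq.heappush(ready, consumer)
    return None


upstream = {
    "fastp": ["reads"],
    "fastqc": ["reads"],
    "align": ["fastp"],
    "multiqc": ["fastp", "align"],
}
chosen = next_step(upstream, needs_work={"align", "multiqc"})
```

Because the result depends only on the graph and the needs-work set, calling it twice on an unchanged draft returns the same step.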

“Next step” definition

Step labeling under subworkflows

Spec teaser: step: [outer_label, subworkflow_label, ...]. Recommend a JSON Pointer-ish array:

step: ["align_reads", "filter_subworkflow", "samtools_filter"]
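
Resolving such a path is a walk through nested run: blocks. A minimal sketch, assuming steps are stored in dict form keyed by label and subworkflows are inlined under run (external references are not handled); resolve_step_path is a hypothetical helper name.

```python
from typing import List


def resolve_step_path(workflow: dict, path: List[str]) -> dict:
    """Follow [outer, sub, ..., leaf] through inline subworkflows and return the leaf step."""
    if not path:
        raise ValueError("step path must not be empty")
    current = workflow
    step: dict = {}
    for depth, label in enumerate(path):
        steps = current.get("steps", {})
        if label not in steps:
            raise KeyError("/".join(path[: depth + 1]))
        step = steps[label]
        if depth < len(path) - 1:
            run = step.get("run")
            if not isinstance(run, dict):  # not an inline subworkflow
                raise KeyError("/".join(path[: depth + 1]) + " has no inline run block")
            current = run
    return step


workflow = {
    "steps": {
        "align_reads": {
            "run": {
                "steps": {
                    "filter_subworkflow": {
                        "run": {"steps": {"samtools_filter": {"tool_id": "TODO"}}}
                    }
                }
            }
        }
    }
}
leaf = resolve_step_path(workflow, ["align_reads", "filter_subworkflow", "samtools_filter"])
```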

work[] shape

User decision 2026-05-22: v1 ships raw strings in work[] — just the TODO text or _plan_* values verbatim. Matches the original sketch. Downstream agent parses.

Reference v1 raw-string contents for one step:

{
  "draft": true,
  "step": ["fastp"],
  "work": [
    "tool_id: TODO",
    "in.TODO_input",
    "out.TODO_trimmed_paired",
    "out.TODO_html_report",
    "_plan_state: adapter trimming on, quality cutoff ~Q20, min length ~50.\npreserve paired-end pairing for downstream alignment.",
    "_plan_context: upstream: nf-core FASTP module. conda: bioconda::fastp=0.23.4 ...",
    "_plan_in: single semantic port `reads`: feeds workflow `reads` (list:paired). ...",
    "_plan_out: need a paired output that preserves list:paired shape ..."
  ]
}

(Exact string-shape conventions for sentinel work items — "in.TODO_input" vs "in[TODO_input]" vs just "TODO_input" — are an implementation-level decision; see Open Question 14.)
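
For illustration, the raw strings in the reference above could be produced by a function like the one below. It is a sketch only: it uses the "in.<name>" and "out.<name>" shapes from the example (which remain an open implementation choice), assumes in is a dict and out is a list of plain names, and work_items is a hypothetical name.

```python
from typing import Any, List


def is_todo(value: Any) -> bool:
    """True for the bare TODO sentinel or any TODO_<hint> string."""
    return isinstance(value, str) and (value == "TODO" or value.startswith("TODO_"))


def work_items(step: dict) -> List[str]:
    """Collect sentinel locations and verbatim _plan_* text for one step."""
    items: List[str] = []
    if is_todo(step.get("tool_id")):
        items.append(f"tool_id: {step['tool_id']}")
    items += [f"in.{name}" for name in step.get("in", {}) if is_todo(name)]
    items += [f"out.{name}" for name in step.get("out", []) if is_todo(name)]
    # _plan_* values are passed through untouched; the downstream agent parses them.
    items += [f"{key}: {value}" for key, value in step.items() if key.startswith("_plan_")]
    return items


step = {
    "tool_id": "TODO",
    "in": {"TODO_input": "reads"},
    "out": ["TODO_trimmed_paired", "TODO_html_report"],
    "_plan_state": "adapter trimming on, quality cutoff ~Q20, min length ~50.",
}
work = work_items(step)
```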

A future v2 may upgrade to structured { kind, location, hint, plan_text? } once worked examples motivate it (spec L106). v1 keeps it minimal.

Idempotence / stability

Output

JSON to stdout (machine-readable). --format markdown for human-readable. JSON shape lives in report-models.ts (per GXWF_AGENT.md convention — Python and TS report models stay in sync).

5. Connection / dataflow validation in draft mode

The spec’s core promise (L39, L97): the graph is well-formed even with TODO_* port names. The validator can do syntactic edge resolution but not type-aware checks.
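
A syntactic check of this kind needs only labels, not tool definitions. The sketch below is one possible reading, not the specified rule set: it assumes steps in dict form, in values that are a source string or a list of source strings, and source references shaped either as a bare label or as step/port. It checks only that the referenced input or step exists, so sentinel port names pass untouched. The function name and message format are hypothetical.

```python
from typing import Dict, List, Set


def check_edges(steps: Dict[str, dict], workflow_inputs: Set[str]) -> List[str]:
    """Report every in: source that does not name a known workflow input or step."""
    errors: List[str] = []
    for label, step in steps.items():
        for consumer_key, sources in step.get("in", {}).items():
            if isinstance(sources, str):
                sources = [sources]
            for source_ref in sources:
                source_step, slash, port = source_ref.partition("/")
                if slash:
                    # step/port form: the step must exist; the port may be a sentinel.
                    ok = source_step in steps and port != ""
                else:
                    ok = source_step in workflow_inputs or source_step in steps
                if not ok:
                    errors.append(f"{label}.in[{consumer_key}]: unknown source '{source_ref}'")
    return errors


steps = {
    "fastp": {"in": {"TODO_input": "reads"}},
    "align": {"in": {"TODO_reads": "fastp/TODO_trimmed_paired", "ref": "genome"}},
    "sort": {"in": {"input": "aling/bam"}},  # typo in the step label
}
edge_errors = check_edges(steps, workflow_inputs={"reads"})
```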

Syntactic edge check (MUST, v1)

For every step.in[<consumer_key>]: <source_ref>:

Failure modes:

Partial connection validation on the concrete subgraph (MAY, v2)

Once a meaningful subset of steps is concretized, the existing connection_validation.py engine (CURRENT_STATE.md) could run on the subgraph defined by concrete-subset semantics:

Recommend: defer to v2. v1 ships syntactic only. This is consistent with the Python side’s tiered strictness (--strict-state vs --strict-structure); a draft-mode tier sits even below --strict-structure.

What we lose

These all wait until the wrapper is picked; that’s exactly the point of the tier separation.

6. Open questions

Spec-level (resolved 2026-05-22 interview rounds 1 + 2)

  1. Subworkflow drafts: are nested run: blocks allowed to be drafts themselves? Yes, recursive — step path is [outer, sub, ...].
  2. Strict vs loose _plan_* placement on resolved steps: Strict — error if _plan_* on a fully-resolved step.
  3. Tests file in draft mode: v1-skip. Tests come after concretization.
  4. Bare TODO as a port name: Warn-level in v1. TODO_<hint> is canonical; bare TODO for ports is unusual but not blocking.
  5. _plan_* on non-tool steps (subworkflow / pause / pick_value): Allowed in v1. Simplifies the schema-salad modeling; may tighten later once we see usage.
  6. Schema strategy (spec L107): Model upstream in gxformat2 with an explicit TodoSentinel schema-salad type and _plan_* as optional fields on WorkflowStep. Publish a sibling format2-draft.schema.json alongside the existing format2.schema.json for non-Python consumers (VS Code, external tooling). Regenerate the TS Effect schema via make sync.
  7. Concrete-subset propagation policy default: Strict — cascade-drop transitive dependents with a warning.
  8. gxwf validate (concrete validator) on a draft file: Fails as today. No auto-route, no hint logic. Draft files must be invoked through draft-validate explicitly.
  9. gxwf lint on draft files: Skip with a clear “this is a draft” message in v1. Many best-practice rules don’t make sense pre-wrapper.
  10. _draft-concrete-subset exit code when output is empty: 0. An empty subset is a valid stage of the agent loop.
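
Decision 2 above (strict _plan_* placement) can be sketched as follows. The notion of "fully resolved" used here is an assumption for illustration: no TODO sentinel in tool_id, in the in keys, or in the out names. The helper names are hypothetical.

```python
from typing import Any, Dict, List


def is_todo(value: Any) -> bool:
    """True for the bare TODO sentinel or any TODO_<hint> string."""
    return isinstance(value, str) and (value == "TODO" or value.startswith("TODO_"))


def is_resolved(step: dict) -> bool:
    """Assumed definition: no sentinel in tool_id, input port names, or output names."""
    names = [step.get("tool_id")] + list(step.get("in", {})) + list(step.get("out", []))
    return not any(is_todo(name) for name in names)


def misplaced_plan_fields(steps: Dict[str, dict]) -> List[str]:
    """List every _plan_* field that sits on a fully resolved step."""
    errors: List[str] = []
    for label, step in steps.items():
        if is_resolved(step):
            errors += [f"{label}.{key}" for key in step if key.startswith("_plan_")]
    return errors


steps = {
    "fastp": {"tool_id": "TODO", "_plan_state": "adapter trimming on"},
    # "resolved_wrapper_id" is a placeholder, not a real tool id.
    "multiqc": {
        "tool_id": "resolved_wrapper_id",
        "in": {"results": "reads"},
        "_plan_context": "left over from an earlier pass",
    },
}
plan_errors = misplaced_plan_fields(steps)
```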

Implementation-level (we choose)

v2-deferred (out of scope)

The minimum to unblock the Foundry agent loop (template Mold → per-step implementation Mold):

Ship in v1

  1. gxwf draft-validate <file>
    • Sibling Effect schema (or augmented existing schema; see Open Q6) that relaxes wrapper-tier; allows _plan_*.
    • All MUST checks from §2 (structural skeleton + concrete topology + syntactic edge resolution).
    • MAY-checks for _plan_* placement (warn-level), with strict-mode error if _plan_* on concrete step per user decision.
    • Strip-readiness preview enumerating remaining TODOs + _plan_*.
    • JSON + text output. Markdown via the same Jinja/Nunjucks infra used for state-validate.
  2. gxwf draft-next-step <file>
    • Topological walk, first needs-work step.
    • work[] as raw strings (v1 shape per user decision).
    • Subworkflow-aware step: path arrays.
    • JSON output (markdown nice-to-have).
    • Idempotent, deterministic tie-break.
  3. gxwf _draft-concrete-subset <file>
    • Strict propagation policy (drop dependents-on-TODO with warning).
    • Subworkflow recursion in scope (per user decision on subworkflows v1).
    • Drops dangling outputs[] entries.
    • Emits gxformat2 YAML to stdout; sidecar JSON report of drops.

Defer

Cross-cutting

Why this minimum unblocks the loop

The Foundry agent loop runs in four steps: (1) the template Mold emits a draft, which is validated; (2) the loop picks the next step; (3) the per-step Mold fills it (via discover-shed-tool + tool-cache + wrapper resolution), and steps 2 and 3 repeat until the workflow is fully concrete; (4) the result is handed off as runnable gxformat2. Steps 1, 2, and 4 (validate, pick next, hand off) are the three v1 commands. Step 3 (the actual filling) is owned by the per-step Mold and discover-shed-tool, not by gxwf. Concrete-subset is the bonus: it lets the agent run the partially filled workflow against real Galaxy mid-loop to catch shape errors early, which is high-leverage and low-cost.