E-cli-draft-extract

Workstream E — CLI command gxwf _draft-extract

Goal

Wire extractConcreteSubset (workstream B) behind a hidden gxwf _draft-extract CLI command. Emit the trimmed workflow (YAML to stdout or -o file) and an optional sidecar JSON report listing drops + rewrites. The command is the agent-loop “freeze what’s concrete so far” step.

Inputs

Locked decisions (this subplan)

DecisionOutcome
Hidden commandYes — leading underscore in name (_draft-extract). Requires extending SpecCommand with hidden?: boolean and suppressing from --help in build-program.ts.
Handler keydraftExtract (camelCase). Filename: packages/cli/src/commands/_draft-extract.ts (leading underscore to match command).
_plan_* stripYes — via a new clean.ts option stripPlanFields: true. Strips _plan_state, _plan_context, _plan_in, _plan_out from every step (recursively into draft subworkflows) and from the workflow root. Lives in packages/schema/src/workflow/clean.ts so it’s reusable.
Class flipConditional. If, after extract + plan-field strip, the resulting dict has zero _plan_* and zero TODO sentinels remaining, flip class: GalaxyWorkflowDraftclass: GalaxyWorkflow. Recursively apply to inline subworkflow run: blocks that were already draft and are now fully concrete. Otherwise leave as GalaxyWorkflowDraft.
Empty extractExit 0. Per INDEX L181 — an empty extract is a valid step in the agent loop, not an error.
Subworkflow recursionAlready handled in B. E does not redo it; just passes the workflow through.
Output formatYAML default. Auto-detect from -o extension (.ga → JSON; .gxwf.yml → YAML). --format <fmt> overrides.
Sidecar report--report-json [file] writes a JSON SingleDraftExtractReport (drops + rewrites + class-flip flag). Filename omitted → stdout. See “stdout-sink collision” below--report-json to stdout conflicts with the trimmed workflow output (which also defaults to stdout); reject the combination with exit 2, matching the rule C established in draft-validate.
Cross-checkRun inside E’s tests, not as a CLI flag. When the result is class: GalaxyWorkflow, decode it against GalaxyWorkflowSchema and fail loudly if it doesn’t validate. This is plan test-step 9 (deferred from B).
Exit code0 always when input parses + extract runs. 2 only for parse/read failure.
Format flag--format <fmt> for input format detection alignment with other commands. Reject native (drafts are format2-only).
Diff modeNot in v1. clean.ts --diff exists; we could mirror it but skip for v1.

Pipeline gaps E closes

1. Spec-types extension — packages/cli/src/meta/spec-types.ts

export interface SpecCommand {
  // ...existing fields...
  /** When true, suppress from `--help` output. Useful for experimental / agent-internal commands. */
  hidden?: boolean;
}

2. build-program.ts — honor hidden

When the spec sets hidden: true, mark the commander subcommand as hidden (commander API: .command(...).addHelpText(...) / program.command(name).command(...) with .helpCommand(false) is wrong; the correct call is .command(name, { hidden: true }) per commander docs). Verify against the installed commander version (packages/cli/package.json).

3. New clean option — packages/schema/src/workflow/clean.ts

export interface CleanWorkflowOptions {
  // ...existing...
  /**
   * When true, strip `_plan_state` / `_plan_context` / `_plan_in` / `_plan_out`
   * from every step (recursively into inline draft subworkflows) and from the
   * workflow root. Used by `gxwf _draft-extract` after `extractConcreteSubset`.
   */
  stripPlanFields?: boolean;
}

Either:

If the latter, E imports stripPlanFields directly from @galaxy-tool-util/schema and the new CleanWorkflowOptions.stripPlanFields option is OPTIONAL (only useful for callers wanting one-stop clean+strip).

4. Class flip helper — also lives in clean.ts (or a new promote-draft.ts)

/**
 * Recursively flip `class: GalaxyWorkflowDraft` → `class: GalaxyWorkflow` on
 * any (sub)workflow that is now fully concrete: zero `_plan_*` fields, zero
 * TODO sentinels anywhere in steps / inputs / outputs / refs.
 * Leaves drafts that still carry work as-is.
 *
 * Returns the (possibly mutated) workflow and a list of step paths whose
 * inner workflow was promoted.
 */
export function promoteFullyConcreteDrafts(workflow: unknown): {
  workflow: unknown;
  promotedPaths: StepPath[];
};

Reuses detectDraft to check “is anything still drafty?” at each level. Pure function.

5. Spec entry — packages/cli/spec/gxwf.json

{
  "name": "_draft-extract",
  "description": "Extract the concrete subset of a draft workflow (agent-loop internal)",
  "handler": "draftExtract",
  "hidden": true,
  "args": [{ "raw": "<file>", "description": "Draft workflow file (.gxwf.yml)" }],
  "options": [
    { "flags": "-o, --output <file>", "description": "Write extracted workflow to file (default: stdout)" },
    { "flags": "--report-json [file]", "description": "Write extraction report JSON (drops, rewrites, class flip)" },
    { "flags": "--format <fmt>", "description": "Input format: format2 (default; native is rejected)" }
  ]
}

6. Handler — packages/cli/src/commands/_draft-extract.ts

export interface DraftExtractOptions {
  output?: string;
  reportJson?: string | boolean;
  format?: string;
}

export async function runDraftExtract(file: string, opts: DraftExtractOptions): Promise<void>;

Pipeline:

  1. Stdout-sink collision check (first thing in the handler). When -o is absent (workflow → stdout) AND --report-json is set with no filename or =- (report → stdout), reject with exit 2 and an explicit error message. Use or extend findStdoutSinkConflict from packages/cli/src/commands/report-output.ts (introduced during C’s review fixups). The current helper signature only knows about json / reportHtml / reportMarkdown; either generalize it to take a list of named sinks + their dest values, or wrap a tiny E-local check around the same idea.
  2. readWorkflowFile(file) → parse failure → exit 2.
  3. resolveFormat — native rejected.
  4. const extract = extractConcreteSubset(parsed);
  5. let trimmed = stripPlanFields(extract.workflow).workflow; (or via cleanWorkflow({ stripPlanFields: true }) if E chose that path).
  6. const promote = promoteFullyConcreteDrafts(trimmed); trimmed = promote.workflow;
  7. Serialize trimmed to YAML via serializeWorkflow (or JSON if -o ends in .ga/.json).
  8. Write output via writeWorkflowOutput (existing helper in workflow-io.ts).
  9. If --report-json: build a SingleDraftExtractReport (see report model) and write it.
  10. Exit 0.

7. Handler registry — packages/cli/src/programs/gxwf.ts

import { runDraftExtract } from "../commands/_draft-extract.js";
// ...
const handlers: HandlerRegistry = {
  // ...
  draftExtract: runDraftExtract,
};

8. Report model — report-models.ts

export interface DraftExtractDropReport {
  path: string[];       // step path (or workflow path for outputs)
  label?: string;       // for output drops
  reason: DropReason;   // re-uses B's union
}

export interface DraftExtractRewriteReport {
  path: string[];
  in_key: string;
  removed_refs: string[];
  surviving_refs: string[];
}

export interface SingleDraftExtractReport {
  workflow: string;                            // input file path
  output: string | null;                       // -o path, or null for stdout
  dropped_steps: DraftExtractDropReport[];
  dropped_outputs: DraftExtractDropReport[];
  rewritten_step_inputs: DraftExtractRewriteReport[];
  promoted_paths: string[][];                  // (sub)workflows flipped to concrete
  class_after: "GalaxyWorkflowDraft" | "GalaxyWorkflow";
  summary: string;
}

export function buildSingleDraftExtractReport(...): SingleDraftExtractReport;

Test plan (vitest)

Tests live in packages/cli/test/_draft-extract.test.ts (sibling-style; C settled on flat packages/cli/test/<command>.test.ts, no commands/ subdir). Schema-side tests for stripPlanFields + promoteFullyConcreteDrafts live in packages/schema/test/workflow/clean.test.ts and packages/schema/test/draft-checks.test.ts (or a new promote-draft.test.ts).

Reuse the CLI draft fixture dir at packages/cli/test/fixtures/draft/ (established in C). Use the shared createCliTestContext harness from packages/cli/test/helpers/cli-test-context.ts exactly like draft-validate.test.ts does.

Red-to-green:

  1. Schema: stripPlanFields removes the four _plan_* keys from a step (and workflow root) and reports removedPaths.
  2. Schema: stripPlanFields recurses into draft subworkflow run: but NOT into concrete run: and NOT into string-form run:.
  3. Schema: promoteFullyConcreteDrafts flips class when zero TODOs / zero _plan_* remain; leaves draft otherwise.
  4. Schema: promoteFullyConcreteDrafts recurses into inline draft subworkflows that are now fully concrete; flips them too. Leaves still-drafty inner workflows alone.
  5. CLI: happy path — fully-concrete draft (no TODOs, only _plan_* planning context) → exits 0, stdout is YAML with class: GalaxyWorkflow, no _plan_* keys present.
  6. CLI: cascade casedraft-extract-cascade.yml-style → stdout YAML has only the surviving step subset; sidecar report (when --report-json /tmp/x.json) lists both the directly-drafty drop and the cascaded drop.
  7. CLI: B’s test-9 cross-check — when promotion fires (output class is GalaxyWorkflow), decode the serialized output against GalaxyWorkflowSchema; must pass without errors. Failing this is a regression.
  8. CLI: empty extract — workflow with every step drafty → stdout is a valid draft workflow with empty steps; exit 0; report lists all the drops.
  9. CLI: hidden from —help — running gxwf --help does NOT mention _draft-extract. (Useful for verifying the hidden: true plumbing.)
  10. CLI: —report-json sidecar — file written; valid JSON; matches SingleDraftExtractReport shape.
  11. CLI: stdout-sink collision — no -o (workflow → stdout) AND --report-json with no filename (report → stdout) → exit 2, stderr cites the collision, neither artifact written. Mirror draft-validate.test.ts’s “rejects —json + —report-html (stdout) with exit 2” test.

Out of scope for E

Acceptance criteria

Sequencing inside E

commit 1  schema: stripPlanFields helper                                                  (+ unit tests)
commit 2  schema: promoteFullyConcreteDrafts helper                                       (+ unit tests + test-9 cross-check on promoted output)
commit 3  cli/meta: SpecCommand.hidden + build-program honors it                          (+ test)
commit 4  cli: _draft-extract handler + spec entry + handler registry                     (+ command tests)
commit 5  schema: SingleDraftExtractReport + buildSingleDraftExtractReport                (+ unit tests)
commit 6  cli: --report-json wiring                                                        (+ tests)
commit 7  Changesets + regen gxwf-cli skill doc (verify hidden command is hidden in skill)

(commits 1+2 could collapse if the helpers are small; commits 4+6 likewise.)

Post-C carry-overs

What landed during C (including review fixups) that E should inherit verbatim:

Open questions for E