UC1 → paper integration proposal (MRSA mobile-AMR, issue #12)

What this is: ideas for integrating the UC1 work into the Galaxy Notebooks paper (vault/papers/galaxy-notebooks/manuscript.md), generated by feeding the UC1 debrief pair (UC1_DEBRIEF.md + UC1_DEBRIEF_2.md) and the paper draft to a review subagent. The paper itself was not modified — this is a proposal for the author to apply (or not). Line numbers reference manuscript.md as of 2026-06-14.

Framing: what UC1 unlocks in the current draft

The manuscript repeatedly hedges its central claim — notebook-driven workflow extraction — as unproven:

Extraction section (~line 90): extraction “should be presented as implemented behavior only once the end-to-end vignette has been captured from the current Galaxy branch; otherwise it should be framed as a prototype design path.”
Evaluation Plan (~line 146): workflow handoff “can be presented as a figure sequence.”
Discussion (~line 178): “the notebook-driven extraction story still needs a polished demonstration… Without that evidence, the paper should retreat to a narrower application-note framing.”

UC1 is that captured, verified, byte-identical end-to-end vignette. The highest-value move is to use UC1 to flip these hedges from conditional to delivered. That is the must-include thread; everything else supports it.

MUST-INCLUDE

M1. Concrete worked example in the Extraction section (load-bearing)

Where: new paragraph after the “implementation status” paragraph (~line 90). Drop-in paraphrase:

We validated this path on a real comparative-genomics analysis. A four-isolate S. aureus mobile-resistome study was documented as a Galaxy Notebook in which every analytical step — ARG detection, insertion-sequence scanning, integron finding, coordinate reformatting, and ARG↔IS distance computation — ran as a collection map-over across the four isolates, and both result figures were on-graph tool outputs rather than pasted images. Backward extraction from the notebook’s referenced artifacts recovered a 14-step workflow (one input collection plus 13 tool steps), every input connection resolved (zero dangling), nine workflow outputs, eight collection map-over steps, and a seeded workflow report requiring no manual repair (zero leftover dataset-instance identifiers). The recovered workflow is sample-agnostic.

Then soften the ~line 90 conditional: report that the vignette has been captured on the current branch (page-based extraction, PR #22860 merged), and keep the “polished/contributed vignette” aspiration for the richer UC2/UC3 cases.
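The M1 counts could be re-derived from the exported workflow file before they go into the paper. The following is a minimal sketch, assuming the extracted workflow is available as a native Galaxy `.ga` JSON export; the function name, the toy workflow, and the definition of "unresolved" (a connection pointing at a step id that is not in the file) are illustrative assumptions and may not match how the debrief counted dangling inputs. Map-over steps are not counted here because they are not directly visible in the export.

```python
from typing import Any


def summarize_workflow(ga: dict[str, Any]) -> dict[str, int]:
    """Count steps, tool steps, workflow outputs, and unresolved connections."""
    steps = ga["steps"]  # keyed by step index as a string in .ga exports
    step_ids = {str(key) for key in steps}
    tool_steps = 0
    workflow_outputs = 0
    unresolved = 0
    for step in steps.values():
        if step.get("type") == "tool":
            tool_steps += 1
        workflow_outputs += len(step.get("workflow_outputs", []))
        for conn in step.get("input_connections", {}).values():
            # A connection is either a single dict or a list of dicts.
            links = conn if isinstance(conn, list) else [conn]
            for link in links:
                if str(link.get("id")) not in step_ids:
                    unresolved += 1
    return {
        "steps": len(steps),
        "tool_steps": tool_steps,
        "workflow_outputs": workflow_outputs,
        "unresolved_connections": unresolved,
    }


# Toy workflow for illustration only; this is NOT the UC1 workflow.
TOY_WORKFLOW = {
    "steps": {
        "0": {"type": "data_collection_input", "input_connections": {}, "workflow_outputs": []},
        "1": {
            "type": "tool",
            "input_connections": {"input": {"id": 0, "output_name": "output"}},
            "workflow_outputs": [{"output_name": "out", "label": "calls"}],
        },
    }
}

if __name__ == "__main__":
    print(summarize_workflow(TOY_WORKFLOW))
```

Running the same function on the real export (loaded with `json.load`) would give a quick check of the step and output counts quoted in the paragraph.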

M2. “Byte-identical science” claim

Where: Extraction section (after M1) + echoed in Discussion “reuse” benefit (~line 174).

The extracted workflow reproduces the original exactly: re-running it produced an ARG↔IS distance collection byte-identical to the validated original across all four isolates, with both figure matrices identical in content (one differs only in cosmetic row order). Extraction recovers the same computation, not an approximate reconstruction.

Of the three UCs, UC1 is the best evidence for this soundness claim; it substantiates the existing “deliberately not free-text workflow synthesis” contrast (~line 88).
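The two comparison levels in M2 (strict byte identity, and identity up to row order) are easy to state precisely in code. The following is a minimal sketch, assuming the original and re-run outputs have been downloaded as local files; the function names, the file names, and the single-header-line assumption are illustrative and not taken from the debrief.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex digest of the file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def byte_identical(a: Path, b: Path) -> bool:
    """True only if the two files have exactly the same bytes."""
    return sha256_of(a) == sha256_of(b)


def identical_up_to_row_order(a: Path, b: Path, header_lines: int = 1) -> bool:
    """True if headers match and the data rows are the same multiset."""
    rows_a = a.read_text().splitlines()
    rows_b = b.read_text().splitlines()
    same_header = rows_a[:header_lines] == rows_b[:header_lines]
    same_rows = sorted(rows_a[header_lines:]) == sorted(rows_b[header_lines:])
    return same_header and same_rows


if __name__ == "__main__":
    # Made-up matrices for illustration only; not UC1 data.
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "original.tsv"
        rerun = Path(tmp) / "rerun.tsv"
        original.write_text("isolate\tcount\nA\t1\nB\t2\n")
        rerun.write_text("isolate\tcount\nB\t2\nA\t1\n")
        print(byte_identical(original, rerun))             # False
        print(identical_up_to_row_order(original, rerun))  # True
```

Reporting which level each artifact passed keeps the paper's wording honest: "byte-identical" for the first check, "identical up to row order" for the second.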

M3. The `remove_short_is` gotcha → evidence that on-graph artifacts are auditable

Where: Design Goals “reference artifacts, not just describe them” (~lines 29–30) and/or Discussion limits (~line 176).

Because referenced outputs are real on-graph tool results, parameter choices that affect them remain auditable. A single tool default — the IS scanner’s remove_short_is flag — silently altered a figure (25 vs 17 element calls for one isolate, spurious zero-distance overlaps); because the figure was on-graph, the discrepancy was traceable through provenance and corrected. A pasted image would have hidden it.

This is a new, concrete argument for the core thesis, and it pre-empts the objection that “notebooks just paste prettier figures.” Frame it as having made the error auditable, not as having prevented it (see T1).
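The audit step in M3 amounts to comparing the recorded parameters of two runs of the same tool. The following is a minimal sketch, assuming the parameter dictionaries have already been pulled from each dataset's provenance record by whatever means is convenient; the function name, the sentinel, the `other_option` key, and the example values are hypothetical, and which `remove_short_is` setting produced which call count is not stated here.

```python
from typing import Any

ABSENT = "<absent>"  # marks a parameter recorded for one run but not the other


def diff_params(original: dict[str, Any], rerun: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {name: (original_value, rerun_value)} for every differing parameter."""
    changed: dict[str, tuple[Any, Any]] = {}
    for name in sorted(set(original) | set(rerun)):
        before = original.get(name, ABSENT)
        after = rerun.get(name, ABSENT)
        if before != after:
            changed[name] = (before, after)
    return changed


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    run_a = {"remove_short_is": True, "other_option": "default"}
    run_b = {"remove_short_is": False, "other_option": "default"}
    print(diff_params(run_a, run_b))
```

The point for the paper is that this comparison is only possible because the figure's inputs are on-graph; a pasted image carries no parameters to diff.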

M4. Ground the Evaluation Plan in the real vignette

Where: Evaluation Plan evidence layers 2–3 (~lines 144–146), currently generic.

The worked vignette is a four-isolate S. aureus mobile-AMR comparison (BioProject PRJDB8599); the notebook embeds two on-graph heatmaps and a comparative finding, and extraction recovers a 14-step, nine-output, sample-agnostic workflow with a clean seeded report.

UC1 turns layers 2 (worked vignette) and 3 (workflow handoff) into reported results. It does not cover layer 1 (test counts) or layer 4 (agent authorship); flag that those need separate sourcing.
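The "clean seeded report" claim can be backed by a mechanical scan of the report markdown. The following is a minimal sketch, assuming that a leftover dataset-instance identifier appears as a `history_dataset_id=` or `history_dataset_collection_id=` argument inside a Galaxy Markdown directive, whereas a clean workflow report refers to outputs by label; the function name and the sample strings are hypothetical, and the debrief may have counted leftovers differently.

```python
import re

# Matches instance-bound arguments such as history_dataset_id=0123456789abcdef.
LEFTOVER_ID = re.compile(
    r"\b(history_dataset_id|history_dataset_collection_id)\s*=\s*([0-9a-fA-F]+)"
)


def find_leftover_ids(report_markdown: str) -> list[tuple[str, str]]:
    """Return (argument_name, encoded_id) pairs still bound to a specific history."""
    return LEFTOVER_ID.findall(report_markdown)


if __name__ == "__main__":
    # Hypothetical directive lines for illustration only.
    clean = 'history_dataset_display(output="arg_is_distances")'
    dirty = "history_dataset_as_image(history_dataset_id=0123456789abcdef)"
    print(find_leftover_ids(clean))
    print(find_leftover_ids(dirty))
```

An empty result on the seeded UC1 report would be the concrete evidence behind "zero leftover dataset-instance identifiers".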

FIGURES / TABLES

F1 (must) — notebook-beside-workflow panel: rendered clean notebook (page eafb646da3b7aac5, two ggplot2_heatmap2 heatmaps) beside the extracted 14-step graph (33b43b4e7093c91f). This is the concrete instance of planned Figure 1 (“history → notebook → graph → extracted workflow”) and Figure 3 (“referenced outputs → backward walk → workflow/report”), which currently lack real screenshots.
F2 (strong) — the two heatmaps as the embedded-output exemplar (Fig1 29e36fb8642bf5ed, Fig2 579ae69ccbd17e45); caption: extractable tool outputs, not pasted images (visual payoff for M3).
F3 (nice) — extraction summary table: rows = the three UCs; cols = steps / map-over steps / exposed outputs / dangling / report warnings / science-identical. UC1 row: 14 / 8 / 9 / 0 / 0 / byte-identical (clean baseline vs UC2/UC3).

TENSIONS / HONESTY

T1. remove_short_is: the system made the error auditable, but a human caught it; never frame this as automatic validation. This aligns with the existing “edit_source is provenance, not quality control” limit (~line 176).
T2. Figures-must-be-on-graph: in the first UC1 build, the Python-computed pasted matrices extracted as dead ends; clean extraction required rebuilding the matrices as on-graph tools. State this honestly (it strengthens credibility), and note that UC2 hit and solved the same “computed outside Galaxy” pattern.
T3. The headline numbers come from a deliberate clean rebuild structured for extractability (consistent map-over, on-graph figures) — don’t imply any arbitrary history extracts this cleanly. Frame as design guidance.
T4. Collection-display rendering noise (staramr misc_info/stderr shown) is an environment-specific dev finding: omit it from paper claims, or give it a single limits sentence at most.
T5. Bakta is deliberately outside the extractable core: if the biology is mentioned, state that the extractable workflow is IS-family-level and that Bakta is optional enrichment (don’t let richer biology imply a richer extracted graph than 14 steps).

Where UC1 is best vs UC2/UC3

UC1 best for: the clean extraction baseline (14 steps, 0 dangling inputs, 9 outputs, 8 map-over steps, byte-identical), the “on-graph figures are auditable” argument (M3), and the headline notebook-beside-workflow figure (F1). It is the happy-path existence proof: no code change was needed.
UC2 stronger for: a harder case that forced a code fix (_original_hda) + robustness.
UC3 stronger for: PDF/figure-rendering breadth.
None of the three UCs covers agent authorship (layer 4) or test counts (layer 1).

Suggested arc: lead extraction evidence with UC1 (clean baseline), then UC2/UC3 as the harder cases that exercised real fixes.

Key IDs for figure capture: notebook eafb646da3b7aac5, workflow 33b43b4e7093c91f, Fig1 29e36fb8642bf5ed, Fig2 579ae69ccbd17e45, history 48916fac0de9a85d; extraction = page-based PR #22860.
