UC3_DEBRIEF_2

UC3 debrief 2 — clean extractable rebuild, live rendering, contributions (2026-06-14)

Second debrief for use case 3 (differential ATAC-seq accessibility, issue #14). The first debrief (UC3_DEBRIEF.md) covers the original build and the page-extraction audit (the “worst of three”: the notebook referenced re-uploaded PNGs and seeded nothing), and appends the clean rebuild and the renderer-fix recipe. This document is the standalone session-2 writeup, focused on the renderer feature that UC3 forced, the new directive, extractability, live rendering, and paper relevance.

One-line takeaway

UC3 is the figure-rendering case. It exposed the most damaging notebook anti-pattern: a notebook that displays only off-graph, re-uploaded screenshots seeds nothing on extraction. It also drove two reusable Galaxy contributions: server-side PDF-as-image rasterization, and a new history_dataset_as_pdf notebook directive with page control. With these, a PDF-emitting tool’s real output can be referenced directly, and that reference both displays and seeds extraction.

Clean rebuild & extraction (verified)

Reusable Galaxy contributions this UC forced

  1. PDF-as-image rendering. Pdf.handle_dataset_as_image rasterizes a PDF’s first page to PNG via PyMuPDF (optional import, graceful fallback, size-clamped). The markdown report renderer delegates to the datatype, and the extraction collector already records history_dataset_as_image references regardless of format. Net: referencing the real PDF output both displays (rasterized in the baked report) and seeds extraction. This fixes the entire class of PDF-emitting R/Bioconductor tools, not just UC3. Covered by unit tests and a Selenium E2E test; pymupdf is declared in packages/data and pinned.
  2. New history_dataset_as_pdf notebook directive with page control (history_dataset_as_pdf(history_dataset_id=ID, page=N)). Registered across the parser, the server renderers (report rasterizes page N; collector seeds; live is a no-op), the Pdf datatype (arbitrary 1-based page), and the client (new HistoryDatasetAsPdf.vue, which embeds the page with viewer chrome hidden), plus directives/help/toolbox/templates. It lets a multi-page PDF (e.g. the 5-page DESeq2 diagnostics) show a chosen page (PCA = page 1) as a figure. Covered by unit and Selenium tests; a review blocker (_ReportLabelRewriter missed-subclass) was caught and fixed.
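
The rasterization pattern behind both contributions (optional PyMuPDF import, graceful fallback, size clamp, 1-based page selection) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code of Pdf.handle_dataset_as_image: the names rasterize_pdf_page and clamp_zoom, the 2000 px limit, and the 2x default zoom are all hypothetical; only the PyMuPDF calls (open, page_count, load_page, get_pixmap, Matrix, tobytes) are real library API.

```python
from typing import Optional

# PyMuPDF is treated as optional. Newer releases import as `pymupdf`,
# older ones as `fitz`; if neither is present we fall back to None.
try:
    import pymupdf as _mupdf
except ImportError:
    try:
        import fitz as _mupdf
    except ImportError:
        _mupdf = None

MAX_DIMENSION_PX = 2000  # assumed clamp, not a value taken from Galaxy
DEFAULT_ZOOM = 2.0       # assumed default: 2x the PDF's native 72 dpi


def clamp_zoom(width_pt: float, height_pt: float,
               zoom: float = DEFAULT_ZOOM,
               max_px: int = MAX_DIMENSION_PX) -> float:
    """Lower the zoom so the longest rendered side stays within max_px."""
    longest = max(width_pt, height_pt)
    if longest <= 0:
        return zoom
    return min(zoom, max_px / longest)


def rasterize_pdf_page(path: str, page: int = 1) -> Optional[bytes]:
    """Return PNG bytes for a 1-based page, or None if rendering is not possible."""
    if _mupdf is None:
        return None  # graceful fallback: caller can show a placeholder instead
    try:
        with _mupdf.open(path) as doc:
            if page < 1 or page > doc.page_count:
                return None
            pdf_page = doc.load_page(page - 1)  # PyMuPDF pages are 0-based
            zoom = clamp_zoom(pdf_page.rect.width, pdf_page.rect.height)
            pixmap = pdf_page.get_pixmap(matrix=_mupdf.Matrix(zoom, zoom))
            return pixmap.tobytes("png")
    except Exception:
        # Unreadable or missing files degrade to "no image" rather than raising.
        return None
```

Returning None instead of raising keeps the report renderer's fallback simple: a missing library, an out-of-range page, and an unreadable PDF all take the same "no image" path.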

Live notebook rendering review (Playwright, dev client)

Paper relevance

Open refinements / next

A rasterize-to-<img> live path for true single-page rendering (unifies live and report rendering and removes the multi-page peek); inline ${galaxy} form support for the PDF directive (currently fenced-block only, which is documented); a donor-aware DESeq2 design; nearest-gene annotation.