UC1_MRSA_issue

UC1 / MRSA mobile-AMR context — interview input

This is a pulled-down GitHub issue, used as the effective result of the INTERVIEW → GALAXY interview. The live interview mechanics are harness-owned and precede pipeline phase 1.

Source: https://github.com/jmchilton/galaxy-brain/issues/12

Paired aspirational target: UC1_MRSA_extracted.ga (extracted from a Galaxy history; not human-validated).


Purpose

Develop a Galaxy Notebooks paper/demo vignette that uses existing Galaxy Training Network (GTN) material as scaffolding but has a new analysis shape: comparative mobile-AMR interpretation across related MRSA isolates. This is for the galaxy-brain Galaxy Notebooks work; it is not, by default, a proposal to add new GTN training material.

Objective

Show how a Galaxy Notebook can move from existing one-isolate AMR/annotation workflows to a focused comparative question: which resistance genes appear in mobile genomic contexts such as plasmids, insertion sequences, integrons, or isolate-specific loci?

Why this is a useful demo deviation

The existing training material already covers KUN1163/DRR187559 assembly-derived annotation and AMR detection. The notebook should reuse those available tools and data patterns, but add a more paper-worthy story: compare 3-4 related MRSA isolates and summarize how ARG content and mobile context differ.

This deviation is small enough to avoid new Galaxy plumbing, but significant enough to demonstrate the notebook as an interpretive layer over Galaxy histories rather than a direct tutorial rewrite.

Existing analysis anchors

Public data candidates

Fallback: if direct INSDC FASTA import is unstable, create a small Zenodo bundle containing one combined chromosome+plasmid FASTA per selected isolate, plus a metadata TSV.
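
A minimal sketch of how the fallback bundle could be assembled locally before upload, assuming each replicon (chromosome or plasmid) is already available as its own FASTA file. The function names and the metadata columns (isolate, fasta, n_replicons) are hypothetical placeholders, not something the issue specifies.

```python
import csv
import io
from pathlib import Path


def merge_fasta_texts(texts):
    """Concatenate FASTA texts so that each part ends with exactly one newline."""
    parts = []
    for text in texts:
        text = text.strip()
        if not text.startswith(">"):
            raise ValueError("each part must start with a FASTA header line")
        parts.append(text + "\n")
    return "".join(parts)


def write_combined_fasta(part_paths, out_path):
    """Write one combined chromosome+plasmid FASTA for a single isolate."""
    merged = merge_fasta_texts(Path(p).read_text() for p in part_paths)
    Path(out_path).write_text(merged)


def metadata_tsv(rows, fields=("isolate", "fasta", "n_replicons")):
    """Render the bundle metadata as TSV text (column names are assumptions)."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fields, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Keeping the merge step as a pure function over text makes it easy to check that no record loses its trailing newline, which is the usual way concatenated FASTA files end up with a header fused onto the previous sequence line.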

Notebook workflow plan

  1. Import 3-4 complete isolate FASTA files, one combined assembly per isolate.
  2. Build a Galaxy dataset collection named "MRSA isolate assemblies".
  3. Run staramr over the collection and collect summary.tsv, detailed_summary.tsv, resfinder.tsv, plasmidfinder.tsv, and mlst.tsv.
  4. Run Bakta on selected isolates, or on a representative subset if runtime is too high.
  5. Run ISEScan and IntegronFinder on selected assemblies.
  6. Convert AMR, plasmid, IS, and integron outputs into interval tables or GFF3 using existing GTN table-processing patterns.
  7. Classify ARG context as plasmid-located, IS-adjacent, integron-associated, SCCmec-region candidate, or unclassified.
  8. Build a JBrowse view for one representative mobile-AMR locus and one contrasting isolate.
  9. Export summary TSVs and notebook-ready figures.
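
The context classification in step 7 can be sketched as a small interval-overlap routine, assuming the step 6 interval tables have already been parsed into (contig, start, end) records. Everything here is illustrative: the Interval type, the 5 kb IS-adjacency flank, and the precedence order (taken from the order the categories are listed in step 7) are assumptions, not decisions recorded in the issue.

```python
from dataclasses import dataclass
from typing import Sequence, Set


@dataclass(frozen=True)
class Interval:
    """A feature on a contig; coordinates are 1-based and inclusive."""
    contig: str
    start: int
    end: int
    label: str = ""


def overlaps(a: Interval, b: Interval, flank: int = 0) -> bool:
    """True if a and b share a contig and lie within `flank` bp of each other."""
    if a.contig != b.contig:
        return False
    return a.start <= b.end + flank and b.start <= a.end + flank


def classify_arg(
    arg: Interval,
    plasmid_contigs: Set[str],
    is_elements: Sequence[Interval] = (),
    integrons: Sequence[Interval] = (),
    sccmec_regions: Sequence[Interval] = (),
    is_flank: int = 5000,  # hypothetical adjacency window, in bp
) -> str:
    """Assign one context label to an ARG hit.

    Precedence follows the order of the categories in step 7 (an assumption):
    the first matching category wins.
    """
    if arg.contig in plasmid_contigs:
        return "plasmid-located"
    if any(overlaps(arg, e, is_flank) for e in is_elements):
        return "IS-adjacent"
    if any(overlaps(arg, i) for i in integrons):
        return "integron-associated"
    if any(overlaps(arg, r) for r in sccmec_regions):
        return "SCCmec-region candidate"
    return "unclassified"
```

Returning a single label keeps the summary table simple, but it hides ARGs that satisfy several categories at once (for example, an IS-flanked gene on a plasmid). If that distinction matters for the paper story, the same overlap tests could instead collect every matching label.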

Expected paper/demo artifacts

Scope and risks

Tasks