ABSTRACT - Galaxy Brain

Achieving Reproducibility in the Age of Agents

Agent-driven science is changing how science is done in general, and computational science in particular. The scope of what agents are accomplishing is hard to fathom, and the volume of data, code, and documentation they generate is so overwhelming that the word "slop" has become synonymous with agentic artifacts. I will argue that if agents aren’t strongly encouraged to do science in a reproducible way, the democratization of data analysis will lead to more tangles of bash scripts and more results without provenance or metadata. At a time when reproducibility is easier than ever, we may see the reproducibility crisis worsen.

In this presentation, I will argue forcefully that the things we have done, are doing, and want to do to encourage humans to perform reproducible data analyses are exactly the things we should do to encourage reproducibility in agent-assisted and agent-led data analyses. I will do so in the context of two powerful new ways to build Galaxy workflows. These features were scoped and planned for humans, but agents can take advantage of them to supercharge an analysis.