Week 3 Progress
Date: 2026-03-29
Branch (Galaxy): wf_tool_state — 20 consolidated commits, 71 files changed, ~10,500 lines added
Branch (gxformat2): abstraction_applications — PRs #159 and #164
gxformat2 Foundation
Most of this week’s Galaxy-side progress depended on two gxformat2 PRs landing first. PR #159 migrated Planemo’s best-practice linting into gxformat2, added a gxformat2.examples module with a cataloged fixture library and Sphinx docs, introduced Pydantic schema validation as a lint step, and wired IWC integration tests into CI. PR #164 was the bigger structural shift. It rewrote every major gxformat2 consumer (lint.py, abstract.py, cytoscape.py, normalize.py) against the typed normalized models from #161. It moved connection resolution ($link, connect) from conversion time into normalization, so every consumer reads connections from the same place. It consolidated the circular to_format2.py/to_native.py/_expanded.py split into a single _conversion.py with polymorphic ensure_format2()/ensure_native() entry points. Finally, it added cross-format subworkflow expansion: a Format2 workflow can point run: at a native .ga URL, and vice versa, with recursive type-safe inlining. Together these two PRs gave us the typed model API and callback protocol (state_encode_to_format2, state_encode_to_native on ConversionOptions) that the Galaxy workflow_state package now builds on.
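To make the entry-point and callback shape concrete, here is a minimal sketch of how a polymorphic ensure_format2() might consult a state_encode_to_format2 callback. Only the names ensure_format2, ConversionOptions, and state_encode_to_format2 come from the notes above. The dataclasses, the callback signature, and the "return None to fall back" convention are assumptions made for illustration, not the actual gxformat2 API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Union


# Hypothetical stand-ins: the real gxformat2 models are richer Pydantic classes.
@dataclass
class NativeStep:
    tool_id: str
    tool_state: dict[str, Any]


@dataclass
class Format2Step:
    tool_id: str
    state: dict[str, Any]


@dataclass
class NativeWorkflow:
    steps: list[NativeStep]


@dataclass
class Format2Workflow:
    steps: list[Format2Step]


# Assumed callback signature: native step in, Format2 state out,
# or None to ask the converter to use its default behavior.
StateEncoder = Callable[[NativeStep], Optional[dict[str, Any]]]


@dataclass
class ConversionOptions:
    state_encode_to_format2: Optional[StateEncoder] = None


def ensure_format2(
    workflow: Union[NativeWorkflow, Format2Workflow],
    options: Optional[ConversionOptions] = None,
) -> Format2Workflow:
    """Return a Format2 workflow regardless of which format was passed in."""
    if isinstance(workflow, Format2Workflow):
        return workflow  # already in the target format, nothing to convert
    options = options or ConversionOptions()
    steps = []
    for step in workflow.steps:
        state = None
        if options.state_encode_to_format2 is not None:
            state = options.state_encode_to_format2(step)
        if state is None:
            # Default path: copy the state through without schema knowledge.
            state = dict(step.tool_state)
        steps.append(Format2Step(tool_id=step.tool_id, state=state))
    return Format2Workflow(steps=steps)


def encode_known_tools(step: NativeStep) -> Optional[dict[str, Any]]:
    """Hypothetical schema-aware encoder: pretends every cat1 parameter is an integer."""
    if step.tool_id != "cat1":
        return None  # unknown tool, let the converter fall back
    return {key: int(value) for key, value in step.tool_state.items()}
```

The point of the design is that the converter stays free of tool knowledge: a caller such as the workflow_state package supplies the schema-aware part through the options object, and steps it cannot handle still convert through the default path.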
Galaxy-side Work
On the Galaxy side, the week started with migrating the entire workflow_state package from raw dict access to gxformat2’s normalized Pydantic models (NormalizedNativeWorkflow, NormalizedNativeStep, NormalizedFormat2) and wiring up the callback protocols for schema-aware format conversion. From there the work branched in several directions.
The first was extracting shared infrastructure: a walk_format2_state() walker parallel to the existing native walker, a unified _state_merge.py for ConnectedValue injection that replaces duplicate logic in convert.py and validation_format2.py, and a shared validate_format2_state() used by both conversion and direct validation.
The second was legacy handling. Three-layer legacy detection was built: legacy_encoding.py classifies tool_state encoding age, legacy_parameters.py classifies ${...} replacement parameters, and precheck.py scans whole workflows and skips gracefully. All __current_case__ usage was eliminated in favor of the walker’s test-value-based branch resolution.
The third was the CLI. Boilerplate was deduplicated into _cli_common.py with ToolCacheOptions/build_base_parser()/cli_main(), the commands were renamed from galaxy-workflow-* to the gxwf-* namespace to unify with gxformat2’s structural commands, and gxwf-to-native-stateful and gxwf-lint-stateful were added as new CLIs.
The fourth was validation and testing: a two-level JSON Schema validation backend (structural workflow shape plus per-step tool state against exported Pydantic schemas) for external tooling that can’t use the Python runtime, golden cache tests with real ToolShed fixtures as a cross-language contract, and roundtrip models converted to Pydantic with structured markdown/JSON reports.
Finally, all operations were split into single-file and tree variants via a generic _tree_orchestrator.py that replaces six copy-pasted directory-walk implementations with a process_one/aggregate/format callback pattern.
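The process_one/aggregate/format pattern can be sketched as follows. This is an illustration of the idea rather than the contents of _tree_orchestrator.py: the TreeOperation class, run_tree(), discover_workflows(), and the file suffixes are all hypothetical names and assumptions introduced here.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Generic, TypeVar

R = TypeVar("R")  # result of processing a single workflow file
S = TypeVar("S")  # summary aggregated over the whole tree

# Assumption: which file names count as workflows.
WORKFLOW_SUFFIXES = (".ga", ".gxwf.yml")


@dataclass
class TreeOperation(Generic[R, S]):
    """The three callbacks that distinguish one tree command from another."""

    process_one: Callable[[Path], R]
    aggregate: Callable[[list[tuple[Path, R]]], S]
    format: Callable[[S], str]


def discover_workflows(root: Path) -> list[Path]:
    """Walk the directory once, in a stable order, keeping only workflow files."""
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.name.endswith(WORKFLOW_SUFFIXES)
    )


def run_tree(root: Path, operation: TreeOperation[R, S]) -> str:
    """Shared directory walk: every tree command reuses this body unchanged."""
    results = [(path, operation.process_one(path)) for path in discover_workflows(root)]
    return operation.format(operation.aggregate(results))


# A toy operation: count lines across all workflow files in a tree.
line_count: TreeOperation[int, int] = TreeOperation(
    process_one=lambda path: len(path.read_text().splitlines()),
    aggregate=lambda results: sum(count for _, count in results),
    format=lambda total: f"{total} lines",
)
```

With this split, a new tree variant only has to supply three small functions, and the single-file command can reuse the same process_one.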
Net Result
The package went from 5 CLI commands under galaxy-workflow-* to 12 under gxwf-* (6 single-file + 6 tree variants) plus galaxy-tool-cache with new structural-schema and schema subcommands. The JSON Schema export pipeline — structural workflow schema from gxformat2’s models plus per-tool WorkflowStepToolState schemas — is the foundation for the VS Code extension work (D9). The three-layer legacy detection means workflows with old encoding or replacement parameters get skipped gracefully instead of producing false failures. The 69 incremental commits from two weeks of development were rebased into 20 cohesive commits telling a clean story: normalized model migration, CLI dedup, helper consolidation, roundtrip model rework, continued model migration, walker extraction, legacy detection, state merge unification, __current_case__ elimination, prechecking, format2 validation unification, CLI namespace unification, JSON Schema validation, cleanup, connection validation pipeline unification, golden cache tests, and tree orchestrator extraction.
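As an illustration of the skip-instead-of-fail behavior described above, here is a minimal sketch of a three-layer precheck. It does not reproduce legacy_encoding.py, legacy_parameters.py, or precheck.py. The function names, the PrecheckResult class, and the double-encoding heuristic (a string value that itself parses as a JSON object or list) are assumptions made for this example.

```python
import json
import re
from dataclasses import dataclass, field
from typing import Any

# Matches ${...} style replacement parameters.
REPLACEMENT_PARAM = re.compile(r"\$\{[^}]+\}")


def has_legacy_encoding(tool_state: dict[str, Any]) -> bool:
    """Layer 1 (assumed heuristic): a string value that decodes to a JSON
    container suggests the state was double-encoded by an older writer."""
    for value in tool_state.values():
        if not isinstance(value, str):
            continue
        try:
            decoded = json.loads(value)
        except ValueError:
            continue  # an ordinary string, not nested JSON
        if isinstance(decoded, (dict, list)):
            return True
    return False


def has_replacement_parameter(value: Any) -> bool:
    """Layer 2: look for ${...} anywhere in the nested state."""
    if isinstance(value, str):
        return REPLACEMENT_PARAM.search(value) is not None
    if isinstance(value, dict):
        return any(has_replacement_parameter(item) for item in value.values())
    if isinstance(value, list):
        return any(has_replacement_parameter(item) for item in value)
    return False


@dataclass
class PrecheckResult:
    skip_reasons: list[str] = field(default_factory=list)

    @property
    def can_process(self) -> bool:
        return not self.skip_reasons


def precheck_workflow(steps: dict[str, dict[str, Any]]) -> PrecheckResult:
    """Layer 3: scan every step and collect skip reasons instead of raising."""
    result = PrecheckResult()
    for label, tool_state in steps.items():
        if has_legacy_encoding(tool_state):
            result.skip_reasons.append(f"{label}: legacy tool_state encoding")
        if has_replacement_parameter(tool_state):
            result.skip_reasons.append(f"{label}: replacement parameter")
    return result
```

A caller would check can_process before validating or converting, and report skip_reasons as a skip rather than counting the workflow as a failure.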