YAML_SCHEMA_PLAN

Plan: narrow the YAML tool parameter schema

Progress (2026-04-05)

Implemented on branch yaml_schema_harden in the Galaxy worktree. All eight steps closed. Review pass recorded in YAML_SCHEMA_REVIEW.md; its flagged correctness issues and style nits have been folded back in.

Review correctness findings addressed in the same pass:

A real discrepancy surfaced during that investigation and is out of scope for this pass: _from_input_source_galaxy in lib/galaxy/tool_util/parameters/factory.py:134-143 reads the boolean test parameter’s default from the XML-era checked key, while the narrow YamlBooleanParameter reads value. For a boolean-test conditional, the XML-path re-parse can therefore compute a different is_default_when flag than to_internal(). This needs a separate follow-up: either add checked support in the factory’s YAML path or document value as the canonical YAML key. Left as an open question.

Tests: test/unit/tool_util/test_yaml_parameters.py (34) and test/unit/tool_util/test_parsing.py (83) all green.


Plan: narrow the YAML tool parameter schema

Companion to YAML_SCHEMA_ISSUE.md. Goal: introduce a YAML-facing parameter model layer used by UserToolSource and AdminToolSource, with a mapping into the existing internal XML metamodel so execution, state transforms, and runtime JSON generation are unchanged.

Decisions already locked in

v1 supported parameter set

Leaf types: text, integer, float, boolean, select (static options only), color, data, data_collection.

Structural groups: conditional, repeat, section.

Everything else is rejected at parse time.

Narrow field set per type

Base (all types): name, type, label, help, optional. No argument, is_dynamic, hidden, parameter_type.

All models: model_config = ConfigDict(extra="forbid", populate_by_name=True).
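A minimal sketch of what the narrow base and one leaf model could look like under Pydantic v2, assuming the field set listed above. The class name YamlBaseParameter, the helper is_accepted, and the defaults shown are illustrative assumptions, not the contents of yaml_parameters.py.

```python
from typing import Literal, Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class YamlBaseParameter(BaseModel):
    # Hypothetical base class; fields mirror the "Base (all types)" list.
    # extra="forbid" makes any unlisted key a validation error.
    model_config = ConfigDict(extra="forbid", populate_by_name=True)

    name: str
    label: Optional[str] = None
    help: Optional[str] = None
    optional: bool = False  # assumed default


class YamlBooleanParameter(YamlBaseParameter):
    # The Literal pins the discriminating "type" value for this leaf.
    type: Literal["boolean"]
    value: bool = False  # assumed default


def is_accepted(raw: dict) -> bool:
    """Return True if the raw YAML-derived dict validates as a boolean parameter."""
    try:
        YamlBooleanParameter.model_validate(raw)
    except ValidationError:
        return False
    return True
```

With this shape, {"name": "flag", "type": "boolean"} validates, while the same dict with an added truevalue key, or with type set to an unsupported value, is rejected at parse time.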

Architecture

┌────────────────────────────────────────────────────┐
│ YAML authoring layer   (NEW)                       │
│   yaml_parameters.py                               │
│   YamlGalaxyToolParameter  (RootModel, narrow)     │
│   UserToolSource.inputs:  List[YamlGalaxy...]      │
│   AdminToolSource.inputs: List[YamlGalaxy...]      │
└──────────────────┬─────────────────────────────────┘
                   │ to_internal_parameter()
                   ▼
┌────────────────────────────────────────────────────┐
│ Internal metamodel (UNCHANGED)                     │
│   GalaxyParameterT / ToolParameterT                │
│   state transforms, runtimeify, runtime_model,     │
│   input_models_for_tool_source, XML parser path    │
└────────────────────────────────────────────────────┘

The YAML layer is purely authoring + publication. Everything below it continues to see the existing internal models, so execution, state validation, job persistence, and the client runtime_model pipeline need no structural changes.

Steps

Step 1 — New module lib/galaxy/tool_util_models/yaml_parameters.py

Step 2 — Wire into tool source models

In lib/galaxy/tool_util_models/__init__.py:

Step 3 — Mapping layer to_internal_parameter()

This is either a method on each YamlFooParameter or a visitor function in yaml_parameters.py. It returns the corresponding internal model (e.g. YamlBooleanParameter.to_internal() -> BooleanParameterModel with truevalue/falsevalue left at their defaults, since the YAML runner never consults them).

Structural groups recurse: YamlConditional.to_internal() builds a ConditionalParameterModel whose test_parameter is the mapped internal bool/select and whose whens contain mapped child parameters.
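The recursion can be sketched with plain dataclasses. Every class below is a hypothetical stand-in (InternalBoolean and InternalConditional are not the real BooleanParameterModel and ConditionalParameterModel, and whens is simplified to a dict keyed by the test value), so this only illustrates the shape of the mapping.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class InternalBoolean:
    # Stand-in for the internal boolean model.
    name: str
    value: bool = False
    truevalue: str = "true"    # assumed default, never set by the mapping
    falsevalue: str = "false"  # assumed default, never set by the mapping


@dataclass
class InternalConditional:
    # Stand-in for the internal conditional model.
    name: str
    test_parameter: InternalBoolean
    whens: Dict[bool, list] = field(default_factory=dict)


@dataclass
class YamlBoolean:
    name: str
    value: bool = False

    def to_internal(self) -> InternalBoolean:
        # Only the narrow fields are copied; the rest keep internal defaults.
        return InternalBoolean(name=self.name, value=self.value)


@dataclass
class YamlConditional:
    name: str
    test_parameter: YamlBoolean
    whens: Dict[bool, list] = field(default_factory=dict)

    def to_internal(self) -> InternalConditional:
        # Map the test parameter, then recurse into every child of every branch.
        return InternalConditional(
            name=self.name,
            test_parameter=self.test_parameter.to_internal(),
            whens={
                key: [child.to_internal() for child in children]
                for key, children in self.whens.items()
            },
        )
```

Because each child exposes the same to_internal() method, nested conditionals map without any special casing.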

Call site: wherever we currently take a validated UserToolSource and hand it off for execution. Candidates to inspect:

Note: runtime_model currently does payload.representation.model_dump(by_alias=True) → YamlToolSource(root_dict=...) → input_models_for_tool_source. Because YamlToolSource.parse_input_pages re-reads the raw dict, the internal-facing structure is still built by the XML-era parser from the dict. Two options:

a. Leave the current path alone. The YAML layer’s only job is validation at the API boundary: once UserToolSource has accepted the payload, the raw dict passed to YamlToolSource is guaranteed narrow, and input_models_for_tool_source happens to produce a subset of the internal metamodel that matches the YAML layer.

b. Short-circuit: if we already have a parsed UserToolSource, build the ToolParameterBundleModel directly via the mapping layer, skipping the re-parse through YamlToolSource.

Start with (a) — lowest blast radius. Add (b) later if we want a single code path.

Step 4 — Regenerate ToolSourceSchema.json

Run client/src/components/Tool/rebuild.py. Commit the new narrower ToolSourceSchema.json. This is the user-visible payoff: Monaco immediately stops advertising truevalue, falsevalue, argument, is_dynamic, hidden, parameter_type, and the deferred parameter types.

Step 5 — Tests (red → green)

Location: test/unit/tool_util/ (new file test_yaml_parameters.py).

Red cases (should raise ValidationError):

Green cases (round-trip YAML → UserToolSource → to_internal() → internal metamodel → create_job_runtime_model(...).model_json_schema(...)):

Snapshot test: ToolSourceSchema.json does not contain any of the blacklist substrings (truevalue, falsevalue, argument, is_dynamic, parameter_type, hierarchy, data_ref, genomebuild, group_tag, baseurl). Prevents silent regressions as the internal metamodel grows.
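One way the snapshot check could be written, as a sketch: the helper name find_forbidden is hypothetical, and the schema is passed in as text so the example does not assume where ToolSourceSchema.json lives. A real test would read the committed file and assert that the result is empty.

```python
import json
from typing import List

# Substrings taken from the blacklist above.
FORBIDDEN_SUBSTRINGS = (
    "truevalue",
    "falsevalue",
    "argument",
    "is_dynamic",
    "parameter_type",
    "hierarchy",
    "data_ref",
    "genomebuild",
    "group_tag",
    "baseurl",
)


def find_forbidden(schema_text: str) -> List[str]:
    """Return the forbidden substrings present in the serialized schema."""
    return [name for name in FORBIDDEN_SUBSTRINGS if name in schema_text]


# Two toy schemas standing in for the generated file.
narrow = json.dumps({"properties": {"name": {"type": "string"}, "value": {"type": "boolean"}}})
leaky = json.dumps({"properties": {"truevalue": {"type": "string"}, "data_ref": {"type": "string"}}})
```

A plain substring scan is deliberately blunt: it also catches the names inside descriptions or $defs keys, which is the point of a regression guard.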

API integration (lib/galaxy_test/api/test_unprivileged_tools.py):

Step 6 — Lock down runtimeify

In lib/galaxy/tool_util/parameters/convert.py, have runtimeify’s visitor callback explicitly enumerate the v1 supported parameter types and raise on anything else encountered in a tool originating from a YAML source. Right now it silently passes through unsupported types via VISITOR_NO_REPLACEMENT, hiding gaps. Since the tightened UserToolSource can no longer produce those types, this assertion should never fire in practice — it exists to catch mapping bugs.

This may need a way to know the tool is YAML-origin. If that’s awkward, make it a separate yaml_runtimeify wrapper and leave the XML path alone.
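The wrapper variant might look like the sketch below. The callback signature is simplified to (parameter_type, value), NO_REPLACEMENT is a local stand-in for VISITOR_NO_REPLACEMENT, and yaml_only and UnsupportedYamlParameterError are hypothetical names; the real visitor in convert.py receives parameter models rather than type strings.

```python
from typing import Any, Callable

# v1 supported set from the plan: leaf types plus structural groups.
V1_SUPPORTED_TYPES = frozenset({
    "text", "integer", "float", "boolean", "select", "color",
    "data", "data_collection", "conditional", "repeat", "section",
})

NO_REPLACEMENT = object()  # stand-in for VISITOR_NO_REPLACEMENT


class UnsupportedYamlParameterError(ValueError):
    """Raised when a YAML-origin tool reaches runtimeify with an unmapped type."""


def yaml_only(callback: Callable[[str, Any], Any]) -> Callable[[str, Any], Any]:
    """Wrap a visitor callback so unsupported types fail loudly instead of passing through."""

    def guarded(parameter_type: str, value: Any) -> Any:
        if parameter_type not in V1_SUPPORTED_TYPES:
            raise UnsupportedYamlParameterError(
                f"parameter type {parameter_type!r} is not in the v1 YAML set"
            )
        return callback(parameter_type, value)

    return guarded


def passthrough(parameter_type: str, value: Any) -> Any:
    # Mimics the current behaviour: leave the value untouched.
    return NO_REPLACEMENT


guarded_passthrough = yaml_only(passthrough)
```

Keeping the guard in a wrapper leaves the XML path on the original callback, so only YAML-origin tools pay for the stricter check.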

Step 7 — Compatibility for existing stored tools

user_dynamic_tool_association rows may already hold YAML blobs with fields the new schema rejects (e.g. tools created during the PR 19434 beta that happened to include truevalue).

Approach:

Decision needed: is the beta user population small enough to skip the lenient load path entirely and just migrate the rows? (See open questions.)

Step 8 — Docs

Files touched

New:

Modified:

Not touched (intentionally):

Unresolved questions