PLANEMO_STRUCTURE_PLAN

Planemo CLI Structure and Output Schemas

Target worktree: /Users/jxc755/projects/worktrees/planemo/branch/structure

Goal: make Planemo automation-friendly for downstream agents and build/casting systems without adding skill/runtime Python dependencies beyond planemo being on PATH.

Core Direction

Planemo should grow two structured surfaces upstream:

Foundry should consume these structured surfaces at Astro build and cast time. Generated skills should not import Planemo Python modules; they should invoke planemo as a subprocess and validate Planemo outputs against bundled schemas.
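
A minimal sketch of that runtime boundary, assuming only that planemo is on PATH. The helper names are illustrative, and the key check is a stand-in for validating against a bundled JSON Schema, which a real skill would do instead.

```python
import json
import shutil
import subprocess
from pathlib import Path

# Illustrative subset of keys from the PlanemoTestReport shape in this plan.
REQUIRED_REPORT_KEYS = ("version", "tests")


def build_test_command(tool_path: str, report_path: str) -> list[str]:
    """Build argv for `planemo test` that writes a JSON report."""
    return ["planemo", "test", "--test_output_json", report_path, tool_path]


def missing_report_keys(report: dict) -> list[str]:
    """Stand-in for schema validation: list required keys that are absent."""
    return [key for key in REQUIRED_REPORT_KEYS if key not in report]


def run_planemo_test(tool_path: str, report_path: str) -> dict:
    """Run planemo as a subprocess and load the report it wrote."""
    if shutil.which("planemo") is None:
        raise RuntimeError("planemo is not on PATH")
    # A non-zero exit may only mean failing tests, so check=True is not
    # used here; the report file is treated as the source of truth.
    subprocess.run(build_test_command(tool_path, report_path))
    report = json.loads(Path(report_path).read_text())
    missing = missing_report_keys(report)
    if missing:
        raise ValueError(f"report is missing keys: {missing}")
    return report
```

Nothing here imports a Planemo module, which is the point of the boundary: the skill depends on the CLI and the report file only.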

Research Findings

CLI command structure

Planemo commands are dynamically discovered from planemo/commands/cmd_*.py.

Relevant files:

Current docs are generated from help text, not structured introspection. That is useful for humans but too lossy for agents.

Important gap: planemo_option() rewrites defaults and uses callbacks for global config/env handling. If we introspect only raw Click fields, logical defaults and config/env metadata may be missing unless Planemo attaches its own metadata to the Click Option objects when creating them.

Likely bug found during research: planemo/commands/cmd_database_list.py appears to describe database_list, but its Click decorator declares @click.command("database_create"). Add a command-name consistency test before making metadata a contract.

Machine-readable outputs

Planemo already emits several JSON-ish outputs, but they are loose dicts without schemas.

Relevant files:

Important output gaps:

Tests and CI

Relevant files:

Relevant tests to reuse:

Architecture Proposal

Runtime boundary

Runtime users and generated skills should only need:

They should not need:

Build/cast boundary

Build and casting processes may use Python to extract Planemo metadata and schemas.

Allowed at build/cast time:

Not allowed at runtime/skilltime:

Structured surfaces to add upstream

Add these durable surfaces in Planemo:

Names are provisional. output_schema could be schema, json_schema, or metadata if Planemo has naming conventions I missed. Avoid overloading docs.

Work Plan

Current Branch State

Updated: 2026-05-03

Planemo branch /Users/jxc755/projects/worktrees/planemo/branch/structure has base commit a0d765fd ("Add structured CLI and output schemas") plus uncommitted updates for invocation manifests, report normalization, runtime/input validation, and docs.

Implemented:

Verified:

.venv/bin/pytest tests/test_planemo.py tests/test_cli_metadata.py tests/test_output_schema.py tests/test_cmd_test.py::CmdTestTestCase::test_cwltool_tool_test tests/test_run.py::RunTestCase::test_run_cat_cwltool
.venv/bin/flake8 planemo/cli.py planemo/config.py planemo/cli_metadata.py planemo/output_models.py planemo/output_schemas.py planemo/test/models.py planemo/commands/cmd_cli_metadata.py planemo/commands/cmd_output_schema.py planemo/commands/cmd_database_list.py tests/test_cli_metadata.py tests/test_output_schema.py tests/test_cmd_test.py tests/test_run.py scripts/commands_to_rst.py
.venv/bin/pytest tests/test_cli_metadata.py tests/test_output_schema.py tests/test_cmd_merge_reports.py tests/test_cmd_test_reports.py
.venv/bin/pytest tests/test_cmd_download_invocation_export.py::CmdTestTestCase::test_download_run_output
.venv/bin/flake8 planemo/commands/cmd_invocation_download.py planemo/output_models.py planemo/output_schemas.py planemo/galaxy/test/actions.py planemo/test/results.py planemo/reports/build_report.py tests/test_output_schema.py tests/test_cmd_download_invocation_export.py tests/test_cli_metadata.py tests/test_cmd_merge_reports.py tests/test_cmd_test_reports.py
.venv/bin/pytest tests/test_output_schema.py tests/test_cmd_test_reports.py tests/test_cmd_merge_reports.py tests/test_run.py::RunTestCase::test_run_cat_cwltool tests/test_cmd_test.py::CmdTestTestCase::test_cwltool_tool_test
.venv/bin/pytest tests/test_cmds_with_workflow_id.py::CmdsWithWorkflowIdTestCase::test_serve_workflow
make lint-docs

Note: make lint-docs regenerated command/API docs but currently fails on pre-existing Sphinx warnings about standards/docs/best_practices and an ambiguous type cross-reference.

Still remaining:

Phase 0: Guardrails and bug-finding tests

Purpose: add cheap tests that expose current drift before layering metadata on top.

Work items:

  1. Done in tests/test_cli_metadata.py: added command-name consistency test.
  2. Done: the test imports each planemo/commands/cmd_<name>.py module and asserts command.name == name.
  3. Done: moved the internal-command policy from scripts/commands_to_rst.py into planemo.cli.INTERNAL_COMMANDS; added a test documenting create_gist and shed_download as internal.
  4. Done: the guardrail exposed the mismatch in cmd_database_list.py; fixed its decorator to declare database_list.
  5. Done: added manifest helper/schema tests and updated Galaxy E2E test for invocation_download --output_json.
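
The command-name guardrail can be sketched as below. This is not the test in tests/test_cli_metadata.py; it is a standalone illustration in which the mapping from module file to declared Click name is passed in, so it runs without Planemo installed. In the real test that mapping would come from importing each module and reading command.name. Both function names are hypothetical.

```python
from pathlib import Path

PREFIX = "cmd_"


def expected_command_name(module_file: str) -> str:
    """Derive the command name from a cmd_<name>.py file name."""
    stem = Path(module_file).stem
    if not stem.startswith(PREFIX):
        raise ValueError(f"not a command module: {module_file}")
    return stem[len(PREFIX):]


def find_name_mismatches(declared: dict[str, str]) -> list[tuple[str, str, str]]:
    """Return (file, expected, declared) for each inconsistent command.

    `declared` maps a module file name to the name its Click command reports.
    """
    mismatches = []
    for module_file, declared_name in sorted(declared.items()):
        expected = expected_command_name(module_file)
        if declared_name != expected:
            mismatches.append((module_file, expected, declared_name))
    return mismatches
```

Fed the state found during research, `find_name_mismatches({"cmd_database_list.py": "database_create", "cmd_test.py": "test"})` returns `[("cmd_database_list.py", "database_list", "database_create")]`.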

Expected red failures:

Fix scope:

Tests to run:

pytest tests/test_planemo.py
pytest tests/test_cmd_download_invocation_export.py

Phase 1: CLI metadata extraction

Purpose: create a structured, automation-friendly view of Planemo commands from the actual Click objects.

New module:

Status: implemented.

Suggested API:

import click

def iter_command_names(include_internal: bool = False) -> list[str]: ...
def load_command_metadata(command_name: str) -> dict: ...
def load_planemo_metadata(include_internal: bool = False) -> dict: ...
def serialize_click_command(command_name: str, command: click.Command) -> dict: ...
def serialize_click_param(param: click.Parameter) -> dict: ...
def serialize_click_type(param_type: click.ParamType) -> dict: ...

Suggested root shape:

{
  "schema_version": "0.1",
  "program": "planemo",
  "planemo_version": "...",
  "commands": [
    {
      "name": "test",
      "module": "planemo.commands.cmd_test",
      "help": "Run specified tool or workflow tests within Galaxy.",
      "short_help": "Run specified tool or workflow tests within Galaxy.",
      "usage": "planemo test [OPTIONS] TOOL_PATH",
      "internal": false,
      "hidden": false,
      "params": []
    }
  ],
  "aliases": { "t": "test", "s": "serve", "l": "lint", "o": "open" }
}
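
A consumer of this root shape might look like the sketch below. It assumes the "commands" and "aliases" keys exactly as suggested above; the function names are hypothetical and not part of Planemo.

```python
def index_commands(metadata: dict) -> dict[str, dict]:
    """Index the exported command entries by their canonical name."""
    return {command["name"]: command for command in metadata.get("commands", [])}


def resolve_command(metadata: dict, name: str) -> dict:
    """Look up a command entry, following an alias such as "t" -> "test"."""
    canonical = metadata.get("aliases", {}).get(name, name)
    commands = index_commands(metadata)
    if canonical not in commands:
        raise KeyError(f"unknown planemo command: {name}")
    return commands[canonical]
```

Keeping aliases in a separate map, rather than duplicating command entries, means a consumer resolves an alias once and then works with a single canonical record.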

Suggested parameter shape:

{
  "kind": "option",
  "name": "test_output_json",
  "opts": ["--test_output_json"],
  "secondary_opts": [],
  "help": "Output test report (planemo json) defaults to tool_test_output.json.",
  "required": false,
  "multiple": false,
  "nargs": 1,
  "type": { "name": "path", "exists": false, "file_okay": true, "dir_okay": false },
  "default": "tool_test_output.json",
  "is_flag": false,
  "flag_value": null,
  "envvar": null,
  "planemo_config": {}
}
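
One way to produce that shape is sketched below. It is duck-typed so it runs without Click installed: it reads attribute names that Click parameters expose (name, opts, secondary_opts, required, multiple, nargs, type, default, envvar, and for options help, is_flag, flag_value). The `planemo_config` attribute is an assumption based on the idea of attaching Planemo metadata to Option objects, and `serialize_param` is a hypothetical name, not the implemented function.

```python
from types import SimpleNamespace
from typing import Any


def serialize_param(param: Any) -> dict:
    """Serialize a Click-like parameter object into the suggested shape."""
    param_type = getattr(param, "type", None)
    return {
        "kind": getattr(param, "param_type_name", "parameter"),
        "name": param.name,
        "opts": list(getattr(param, "opts", [])),
        "secondary_opts": list(getattr(param, "secondary_opts", [])),
        "help": getattr(param, "help", None),
        "required": bool(getattr(param, "required", False)),
        "multiple": bool(getattr(param, "multiple", False)),
        "nargs": getattr(param, "nargs", 1),
        "type": {"name": getattr(param_type, "name", None)},
        "default": getattr(param, "default", None),
        "is_flag": bool(getattr(param, "is_flag", False)),
        "flag_value": getattr(param, "flag_value", None),
        "envvar": getattr(param, "envvar", None),
        # Hypothetical attribute: where Planemo could attach logical
        # defaults and config/env metadata when it builds the option.
        "planemo_config": dict(getattr(param, "planemo_config", None) or {}),
    }


# Stand-in object mimicking the --test_output_json option.
example = SimpleNamespace(
    name="test_output_json",
    opts=["--test_output_json"],
    param_type_name="option",
    type=SimpleNamespace(name="path"),
    default="tool_test_output.json",
)
serialized = serialize_param(example)
```

Defaults that are callables, and type details such as exists or dir_okay, would need extra handling before the result is guaranteed to be JSON-serializable.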

Click metadata to preserve:

Planemo-specific metadata to add:

New command:

Status: implemented.

Command behavior:

planemo cli_metadata --format json
planemo cli_metadata --command test --format json
planemo cli_metadata --include-internal --format json

Rules:

Tests:

Tests to run:

pytest tests/test_planemo.py tests/test_cli_metadata.py

Phase 2: Output model layer

Purpose: define public parse contracts for Planemo JSON outputs before downstream systems rely on them.

New module options:

Recommendation: start under planemo/test/models.py for test reports, and add a broader module only when run/download schemas need shared non-test naming.

Preferred model source:

Initial Pydantic models:

Potential supporting models:

Do not over-model Galaxy internals in the first pass. Keep compatibility fields permissive:

Status enum:

Compatibility requirements:

Suggested schema command:

planemo output_schema --format json
planemo output_schema --schema test-report --format json
planemo output_schema --schema run-outputs --format json
planemo output_schema --schema invocation-download-manifest --format json

Status: implemented as planemo output_schema --format json; currently exports test-report, run-outputs, and invocation-download-manifest.

Suggested root output:

{
  "schema_version": "0.1",
  "planemo_version": "...",
  "schemas": {
    "test-report": { "$schema": "https://json-schema.org/draft/2020-12/schema", "...": "..." },
    "run-outputs": { "...": "..." },
    "invocation-download-manifest": { "...": "..." }
  }
}

Questions for implementation:

Phase 3: Validate existing outputs at boundaries

Purpose: introduce schemas without breaking existing users abruptly.

Work items:

  1. Done: added model validation in tests first, not runtime enforcement.
  2. Done: validates tests/data/issue381.json against PlanemoTestReport.
  3. Not done directly: no dedicated RunResponse.structured_data() unit validation yet.
  4. Done: validates the fast CWL tool_test_output.json generated by planemo test, the test-style report from planemo run, and the workflow_test_on_invocation report output.
  5. Done: validates fast CWL run --output_json against PlanemoRunOutputs.
  6. Done: implemented and validated invocation_download --output_json as PlanemoInvocationDownloadManifest.

Runtime enforcement path:

Avoid immediate strict rejection for old external JSON files unless the command is explicitly producing the file in the same run.
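
The distinction between files produced in the current run and older external files could be sketched as follows. The key list mirrors the full report shape described in this plan; the function name and the warn-versus-raise policy are assumptions for illustration, not the implemented behavior.

```python
import warnings

FULL_REPORT_KEYS = ("version", "tests", "summary", "exit_code")


def check_report(report: dict, produced_in_this_run: bool) -> list[str]:
    """Report shape problems; only fail hard for freshly produced files."""
    problems = [f"missing key: {key}" for key in FULL_REPORT_KEYS if key not in report]
    if problems and produced_in_this_run:
        # Planemo just wrote this file, so a bad shape is a Planemo bug.
        raise ValueError("; ".join(problems))
    for problem in problems:
        # Old external files are tolerated, but the drift is surfaced.
        warnings.warn(problem)
    return problems
```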

Central write points:

Phase 4: Normalize report behavior

Purpose: make all commands that consume or produce report JSON share one contract.

Work items:

  1. Done: updated merge_reports() to emit a full PlanemoTestReport with version, tests, summary, and exit_code, not just { "tests": [...] }.
  2. Done: StructuredData.calculate_summary_data() and reports/build_report.py use the same status vocabulary.
  3. Done: fixed skip vs skipped counting.
  4. Done: test_reports can read old reports without summary and calculate summary for rendering.
  5. Done: added compatibility note in docs for merged report output shape changes.
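
The normalization described in these items can be illustrated with a small standalone sketch. It is not the code in planemo/reports or StructuredData; the function names are hypothetical, and the exit_code rule (1 if any failure or error, else 0) is an assumption made for the example.

```python
LEGACY_STATUS_ALIASES = {"skipped": "skip"}


def normalize_status(status: str) -> str:
    """Map a legacy status spelling onto the canonical vocabulary."""
    return LEGACY_STATUS_ALIASES.get(status, status)


def calculate_summary(tests: list[dict]) -> dict:
    """Count outcomes; tolerates entries whose data is missing or null."""
    counts = {"failure": 0, "error": 0, "skip": 0}
    for test in tests:
        data = test.get("data") or {}
        status = normalize_status(data.get("status", ""))
        if status in counts:
            counts[status] += 1
    return {
        "num_tests": len(tests),
        "num_failures": counts["failure"],
        "num_skips": counts["skip"],
        "num_errors": counts["error"],
    }


def merge_reports(reports: list[dict]) -> dict:
    """Merge reports into one full report, recomputing the summary."""
    tests = [test for report in reports for test in report.get("tests", [])]
    summary = calculate_summary(tests)
    failed = summary["num_failures"] + summary["num_errors"] > 0
    return {
        "version": "0.1",
        "tests": tests,
        "summary": summary,
        # Assumption: 0 when nothing failed or errored, 1 otherwise.
        "exit_code": 1 if failed else 0,
    }
```

Because the summary is recomputed from the tests rather than summed from the inputs, old reports without a summary merge the same way as new ones.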

Tests:

Phase 5: Docs generator migration

Purpose: stop text-scraping help output as the only command documentation source.

Do not rewrite docs generation first. After cli_metadata is stable:

  1. Refactor scripts/commands_to_rst.py to reuse shared command listing/internal-command policy.
  2. Add a parity test between cli_metadata and generated docs command list.
  3. Later, generate RST from structured metadata plus callback docstrings instead of parsing planemo <command> --help output.
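
The parity test in item 2 amounts to a set comparison. A minimal sketch, assuming the metadata root shape suggested earlier and a plain list of documented command names (how that list is gathered from the generated docs is left open; the function name is hypothetical):

```python
def command_list_drift(metadata: dict, documented: list[str]) -> dict[str, list[str]]:
    """Compare exported command names with the documented command list."""
    exported = {command["name"] for command in metadata.get("commands", [])}
    docs = set(documented)
    return {
        "missing_from_docs": sorted(exported - docs),
        "missing_from_metadata": sorted(docs - exported),
    }
```

A parity test would assert that both lists are empty, so a command added without docs, or docs left behind after a rename, fails fast.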

Benefits:

Phase 6: Foundry integration after upstream branch stabilizes

Purpose: consume upstream Planemo structure without vendoring permanent snapshots.

Foundry build/cast behavior:

Initial Foundry Planemo pages:

Initial Foundry schema notes:

These should be temporary if Planemo exports schemas directly. Foundry can start with provisional schemas, but the plan should be to replace them with Planemo-derived schemas.

Runtime generated-skill behavior:

Initial Vertical Slice

Recommended first PR target in the Planemo structure branch:

  1. Done: added command-name consistency tests and fixed discovered mismatch.
  2. Done: added planemo cli_metadata --format json for command metadata.
  3. Mostly done: metadata tests cover test, run, workflow_test_on_invocation, and root command list includes workflow_test_init and invocation_download; dedicated detailed assertions for workflow_test_init and invocation_download remain.
  4. Done: added PlanemoTestReport Pydantic model and schema export.
  5. Done: validates existing tests/data/issue381.json and generated fast CWL tool_test_output.json in tests.
  6. Done: model accepts legacy skipped and normalizes to skip; report summary/rendering now count canonical skip.

This gives downstream consumers one command metadata surface and one high-value output schema without trying to model every Planemo output at once.

Second Vertical Slice

  1. Done: added permissive PlanemoRunOutputs schema.
  2. Done: validates planemo run --output_json in fast CWL test.
  3. Add or normalize Galaxy collection output compatibility if existing Galaxy tests expose shape variants.
  4. Ensure planemo run still emits test-style reports through --test_output_json and those reports validate as PlanemoTestReport.

Third Vertical Slice

  1. Done: implemented invocation_download --output_json manifest.
  2. Done: added PlanemoInvocationDownloadManifest schema.
  3. Done: added unit-level tests and updated tests/test_cmd_download_invocation_export.py E2E fixture.
  4. Done: manifest contains enough for agents:
    • invocation_id
    • output_directory
    • output_json path if applicable
    • output labels/IDs
    • downloaded file paths
    • missing outputs if ignored or encountered

Proposed CLI Metadata Schema Details

Command metadata should answer agent/build questions without invoking --help again:

Minimum required command fields:

Minimum required parameter fields:

Planemo-specific extensions:

Proposed Output Schemas

PlanemoTestReport

Current practical shape:

{
  "version": "0.1",
  "tests": [
    {
      "id": "string",
      "has_data": true,
      "data": {
        "status": "success|failure|error|skip",
        "inputs": {},
        "job": {},
        "invocation_details": {},
        "problem_log": "string",
        "output_problems": ["string"],
        "execution_problem": "string",
        "start_datetime": "ISO string",
        "end_datetime": "ISO string"
      },
      "doc": "string|null",
      "test_type": "galaxy_tool|galaxy_workflow|cwl_tool|cwl_workflow"
    }
  ],
  "summary": {
    "num_tests": 0,
    "num_failures": 0,
    "num_skips": 0,
    "num_errors": 0
  },
  "exit_code": 0
}

Allow legacy per-test shape:

{
  "has_data": false,
  "data": null
}

PlanemoRunOutputs

Current practical shape for planemo run --output_json:

{
  "output_id": {
    "path": "local file path",
    "basename": "filename"
  }
}

Keep value permissive initially because CWL and Galaxy collection outputs can differ.
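
A permissive consumer could read only what it recognizes and skip everything else, as in this sketch. The function name is hypothetical, and treating non-dict values as "not a single local file" is an assumption about how shape variants would appear.

```python
def local_output_paths(run_outputs: dict) -> dict[str, str]:
    """Collect output_id -> path for entries that look like a single file."""
    paths = {}
    for output_id, value in run_outputs.items():
        # Collections or other variants are skipped rather than rejected.
        if isinstance(value, dict) and isinstance(value.get("path"), str):
            paths[output_id] = value["path"]
    return paths
```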

PlanemoInvocationDownloadManifest

Proposed shape:

{
  "invocation_id": "...",
  "output_directory": "...",
  "outputs": {
    "label_or_id": {
      "path": "...",
      "basename": "...",
      "class": "File"
    }
  },
  "missing_outputs": []
}

This should be generated from the same RunResponse / output collection path used by collect_outputs() where feasible.
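
For illustration only, a manifest of the proposed shape could be assembled from a label-to-path mapping as below. This does not reflect how invocation_download builds it internally; `build_manifest` and its arguments are hypothetical.

```python
import os
from typing import Iterable, Mapping


def build_manifest(
    invocation_id: str,
    output_directory: str,
    downloaded: Mapping[str, str],
    missing: Iterable[str] = (),
) -> dict:
    """Assemble a manifest dict from downloaded file paths keyed by label."""
    outputs = {
        label: {"path": path, "basename": os.path.basename(path), "class": "File"}
        for label, path in downloaded.items()
    }
    return {
        "invocation_id": invocation_id,
        "output_directory": output_directory,
        "outputs": outputs,
        "missing_outputs": list(missing),
    }
```

An agent can then treat a non-empty missing_outputs list as the signal to stop or retry, without inspecting the output directory itself.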

Red-to-Green Test Strategy

Fast first:

pytest tests/test_planemo.py tests/test_cli_metadata.py
pytest tests/test_cmd_test_reports.py
pytest tests/test_cmd_test.py::CmdTestTestCase::test_cwltool_tool_test
pytest tests/test_run.py::RunTestCase::test_run_cat_cwltool

Broader quick:

PLANEMO_SKIP_SLOW_TESTS=1 PLANEMO_SKIP_GALAXY_TESTS=1 pytest tests
tox -e py3.10-unit-quick

Lint/type/docs:

tox -e py3.10-lint
tox -e py3.10-mypy
tox -e py3.10-lint-docs

Galaxy-specific only after fast paths are green:

pytest tests/test_cmds_with_workflow_id.py
pytest tests/test_cmd_download_invocation_export.py
pytest tests/test_run.py::RunTestCase::test_run_export_invocation

Risks

Foundry Follow-Up

Once Planemo branch has a usable metadata/schema export:

  1. Add Foundry build/cast helper to invoke planemo cli_metadata --format json.
  2. Add Foundry build/cast helper to invoke planemo output_schema --format json.
  3. Add curated Planemo manpages as operational overlays, not complete option lists.
  4. Update run-workflow-test, implement-galaxy-workflow-test, debug-galaxy-workflow-output, and debug-cwl-workflow-output molds to reference exact Planemo command pages and schemas.
  5. Cast run-workflow-test and test against a tiny workflow.
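
The helpers in steps 1 and 2 could be as small as the sketch below. It assumes both commands print pure JSON on stdout, which this plan implies but does not state; the function names are hypothetical. The `run` parameter exists so the helper can be exercised without Planemo installed.

```python
import json
import subprocess
from typing import Callable, Sequence


def fetch_planemo_json(args: Sequence[str], run: Callable = subprocess.run) -> dict:
    """Invoke planemo with the given arguments and parse stdout as JSON."""
    completed = run(["planemo", *args], capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


def fetch_cli_metadata(run: Callable = subprocess.run) -> dict:
    """Fetch the structured command metadata export."""
    return fetch_planemo_json(["cli_metadata", "--format", "json"], run)


def fetch_output_schemas(run: Callable = subprocess.run) -> dict:
    """Fetch the exported output schemas."""
    return fetch_planemo_json(["output_schema", "--format", "json"], run)
```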

Open Questions