PLAN_SEARCH_CLI

Plan: wire @galaxy-tool-util/search into the gxwf CLI for a find-shed-tool skill

Goal

Expose the new @galaxy-tool-util/search package via CLI so an agent skill can do instance-agnostic Tool Shed tool discovery: text → candidate tools → enough metadata to pick one → cache + retrieve the parsed tool definition for use in a Format-2 workflow.

The CLI is the only surface a skill needs to depend on. The skill never imports the package directly; it shells out to commands that emit JSON.

What’s already in place

packages/search/src/:

packages/cli/src/bin/:

Already wired: ToolSource[] config (core config.ts, galaxy.workflows.toolSources), ToolInfoService with caching, DEFAULT_TOOLSHED_URL = https://toolshed.g2.bx.psu.edu.

What’s missing (and constrains design)

From the Tool Shed research (Component - Tool Shed Search and Indexing.md):

A skill that calls into this needs to expect noisy results and a multi-call dance (search → versions → metadata) per candidate.

Iterative CLI plan

Each stage is a shippable increment. Stages 1–3 are the minimum viable surface for a find-shed-tool skill; later stages improve quality of selection, batch use, and offline workflow authoring.

Status (2026-04-25): Stages 1, 2, 3, 7 are shipped. Stage 5 removed by decision. Stages 4 (QoL filters, repo-search) and 6 (--enrich) remain as optional follow-ups, deferred until skill testing demonstrates need. The minimum viable CLI surface for find-shed-tool is complete.

Stage 1 — gxwf tool-search (single-shot text search, JSON-first) ✅ shipped

New file: packages/cli/src/commands/tool-search.ts. New subcommand on gxwf. Flat-verb naming (tool-search, tool-versions, tool-revisions) matches the existing validate-tree/lint-tree/clean-tree style in the same bin rather than introducing the first noun-namespace.

gxwf tool-search <query>
  --page-size <n>             # default 20
  --max-results <n>           # default 50
  --json                      # machine-readable
  [--cache-dir <dir>]         # for any future cache reuse; ignored at this stage

The CLI targets a single Tool Shed (DEFAULT_TOOLSHED_URL = https://toolshed.g2.bx.psu.edu). No --toolshed flag, no multi-source config. The package’s ToolSearchService retains its multi-source capability for other consumers; the CLI just doesn’t expose it.

Why a wrapper envelope: Stage 1 ships an array of hits only; later stages may add truncated/nextPage. Lock the envelope shape now.

Tests (mirror existing CLI test pattern): test/commands.test.ts style — run the binary against a recorded ToolShed response (test/helpers already has fixtures). Use a stub fetcher injected through a small test-only flag, OR — simpler — record a fixture and serve it via vitest mock.

Skill consumption pattern after Stage 1:

gxwf tool-search "fastqc" --json
→ pick hit by inspection of name/description/owner/repo

Stage 2 — gxwf tool-versions (resolve picks to installable versions) ✅ shipped

Search hits don’t carry versions. Skill needs a way to ask “what versions of this tool exist?” The search package already wraps getTRSToolVersions.

gxwf tool-versions <trs-id|owner/repo/tool_id>
  --json
  --latest                    # only print the latest version

Why a separate command, not flags on tool-search: getTRSToolVersions is one HTTP call per hit. Putting it on tool-search would either (a) be N+1 over every result (slow, often wasteful) or (b) require a --resolve-versions flag that complicates the JSON shape. Skills can call tool-versions selectively for the candidate(s) they care about.

Stage 3 — galaxy-tool-cache add already covers fetch, but expose a thin discovery→cache shortcut ✅ shipped (no new code; documented loop)

Once the skill has picked (trsToolId, version), it needs the parsed tool to wire it into a workflow. That’s already what galaxy-tool-cache add does — and it lands the result in the same on-disk cache that gxwf validate/lint/convert --stateful consult. Don’t duplicate.

Skill loop, end-to-end:

gxwf tool-search "<query>" --json                     → hits
gxwf tool-versions <trsId> --latest --json            → version
galaxy-tool-cache add <trsId> --version <v>           → cached ParsedTool
galaxy-tool-cache info <trsId>                        → for show-user step
galaxy-tool-cache schema <trsId> --representation workflow_step
                                                      → JSON Schema for editing the workflow

The only new piece in Stage 3 is documentation tying these together (skill-side, see find-shed-tool skeleton below).

Optional convenience if testing shows the four-step dance is awkward: add gxwf tool-fetch <query> that searches → picks top-1 → resolves latest → adds to cache, returning the cache key. Defer this until skill testing demonstrates need; it muddies the responsibility split.

Stage 4 — gxwf tool-search quality-of-life: filtering, repo search, paging — deferred

Once Stages 1–3 are in use and skill testing reveals where ranking falls short, layer on:

Repo search is materially better for “find me a package about X” queries (richer fields, popularity boost). Tool search is better for “find me a specific tool by exact name.” A skill can branch on whether the user said “I want a tool that does X” (tool search) vs. “I want the X package” (repo search).

Stage 5 — (removed) multi-toolshed config

Decision: there is only one Tool Shed. The CLI does not expose a --toolshed flag and does not read galaxy.workflows.toolSources. The find-shed-tool skill never prompts the agent to consider mirrors or alternative sheds. The package’s ToolSearchService retains its multi-source capability for non-CLI consumers.

Stage 6 — search-time enrichment for skill latency — deferred

ToolSearchService.searchTools(..., { enrich: true }) resolves each hit’s ParsedTool via ToolInfoService (and caches it). Surface as --enrich on gxwf tool-search. Off by default (cost: one fetch per hit).

When the skill knows it’ll only pick the top 1–3 results, enriching saves a follow-up galaxy-tool-cache add. JSON output gains a parsedTool field per hit when enriched.

Defer until the skill is exercised against real workflows — premature for Stage 1.

Stage 7 — repo→installable-revisions resolution for compose-time correctness ✅ shipped

The (owner, repo, tool_id) from a search hit doesn’t tell you which changeset_revisions of the repo actually contain that tool. Galaxy installs at (name, owner, changeset_revision) granularity. To produce a workflow that’s portably reinstallable, the skill needs the changeset.

Shipped form: gxwf tool-revisions <tool-id> [--tool-version <v>] [--latest] [--json]

The flag is --tool-version rather than --version because commander’s program-level --version flag intercepts. Implementation lives in packages/search/src/client/revisions.ts (getToolRevisions) and uses the 3-call workaround documented in PLAN_SEARCH_CLI_STAGE7.md: /api/repositories?owner=&name= → encoded repo id → /api/repositories/{id}/metadata?downloadable_only=true + get_ordered_installable_revisions (in parallel) → filter + sort.

Output JSON envelope: { trsToolId, version?, revisions: [{ changesetRevision, toolVersion }] }. Exit codes: 0 hits, 2 empty, 3 fetch error.

Tests: 6 search-client tests + 7 CLI tests, mock-fetcher style with real ToolShed fixtures captured by scripts/regen-toolshed-fixtures.mjs.

Tool Shed enhancement requests (P0–P6 in PLAN_SEARCH_CLI_STAGE7.md) are tabled — the workaround is good enough for the skill loop. Revisit if the metadata-blob fetch becomes a real latency problem.

find-shed-tool skill skeleton (informs the CLI design)

Single-skill, no router. Lives at skills/find-shed-tool/.

SKILL.md          # procedure: parse intent → search → narrow → resolve → cache
README.md         # quick start
references/
  toolshed-fields.md   # what name/description/help mean, what they don't
  ranking-caveats.md   # wildcard wrapping, case asymmetry, dead approved boost
examples/
  fastqc-pick.md       # walked example: query "fastq quality control"
  ambiguous-bwa.md     # walked example: same tool_id in multiple repos

Procedure (pseudocode):

1. From user prose, extract a search term (1–4 words; concrete tool name preferred).
2. Run `gxwf tool-search "<term>" --json` (Stage 1).
3. Triage hits:
   - Discard hits whose name/description don't plausibly match.
   - When multiple owners publish the same tool_id, prefer iuc/devteam/bgruening
     (configurable allowlist, mirror nf-to-galaxy convention).
   - Surface ambiguity to the user when no clear winner.
4. For the chosen hit, run `gxwf tool-versions <trsId> --latest --json` (Stage 2).
5. `galaxy-tool-cache add <trsId> --version <v>` (Stage 3).
6. Confirm: `galaxy-tool-cache info <trsId>` — show name/version/description to user.
7. (Workflow-authoring follow-up, owned by other skills:)
   `galaxy-tool-cache schema <trsId> --representation workflow_step`
   → JSON Schema the workflow-authoring skill uses to fill `tool_state`.

The skill is CLI-only, machine-readable JSON, no MCP/instance dependence. It is instance-agnostic: a Tool Shed is the registry; whether any Galaxy server has the tool installed is a separate question handled elsewhere.

Surfaces shared by both CLIs

Test strategy

Versioning

Each stage = one Changesets minor on @galaxy-tool-util/cli and (when new exports are added) on @galaxy-tool-util/search. The empty .changeset/violet-bags-search.md already in the worktree is the placeholder for Stage 1.

Unresolved questions