TOOL_CACHE

Tool Cache: Lazy Per-Workflow Resolution

Goal

When a workflow document is opened, automatically resolve any uncached tools rather than waiting for the user to manually run “Populate Tool Cache”. Report failures clearly without blocking the user.

The current behavior: a cache miss during validation emits an Information diagnostic:

“Tool ‘x’ is not in the local cache. Run ‘Populate Tool Cache’ to enable tool state validation.”

The new behavior: on document open, the server proactively fetches uncached tools from ToolShed. If successful, re-validate and the diagnostic disappears. If the fetch fails, report the failure in an output channel and keep the existing Information diagnostic (or update its text to reflect “could not be resolved”).


Architecture Overview

The server side is identical in native and web environments — both run GalaxyWorkflowLanguageServerImpl via ToolCacheService. The difference is the LSP transport (IPC vs. Web Worker) and, critically, whether network access from the server process works.

Design decision: auto-resolution is an opt-in capability the client declares at initialization. Native clients declare it; web clients do not. The server only auto-resolves for clients that declare the capability.


Component Design

1. Client capability flag

Add a custom client capability (in InitializeParams.capabilities.experimental) signaling whether the client supports/wants server-initiated tool resolution:

experimental: { toolAutoResolution: true }

The server reads this flag once during initialize() and stores it. The existing buildBasicLanguageClientOptions() in client/src/common/index.ts is the right place to inject this.

2. Server-side trigger — documents.onDidOpen

GalaxyWorkflowLanguageServerImpl.trackDocumentChanges() currently only wires up onDidChangeContent and onDidClose. Add documents.onDidOpen to trigger resolution without re-running validation immediately (validation already fires from onDidChangeContent which is also emitted on open).

documents.onDidOpen(event => {
  if (autoResolutionEnabled) {
    ToolCacheService.scheduleResolution(event.document);
  }
})

onDidChangeContent is still the validation trigger; onDidOpen is solely the resolution trigger, so tool fetches don’t happen on every keystroke.

3. Batching / debounce in ToolCacheService

Multiple files may open at startup (e.g. a workspace with 10 workflows). Batch them:

An in-flight Set prevents duplicate fetches when the same tool appears in multiple open documents.

4. Re-validation after fetch

After populateCache() resolves:

5. Failure reporting

populateCache() already returns { fetched, alreadyCached, failed[] }. On completion:

6. Diagnostic text update

ToolStateValidationService currently emits a single Information message on cache miss. After this change it should distinguish:

The simplest implementation: add a resolutionFailed: Set<string> to ToolCacheService that ToolStateValidationService can query to pick the right message. No new diagnostic codes needed.


Web Environment

When experimental.toolAutoResolution is absent/false, no change to existing behavior:

A future improvement could add a galaxy-instance proxy or an iframe-based auth flow for web ToolShed access, but that’s out of scope here.


New LSP Additions

TypeNameDirectionPurpose
NotificationTOOL_RESOLUTION_FAILEDserver → clientReports per-tool fetch failures after a resolution batch
(Reused)POPULATE_TOOL_CACHEclient → serverAlready exists; no change needed
(Reused)GET_WORKFLOW_TOOL_IDSclient → serverWalk already in ToolCacheService; extract to shared helper

No new requests. The auto-resolution loop is entirely server-initiated; the client only receives a notification after the fact.


Implementation Steps

  1. Extract tool-ID-walk helper from ToolCacheService.onGetWorkflowToolIds() into a standalone function extractToolRefsFromDocument(doc: DocumentContext): ToolRef[] (exported from toolCacheService.ts).

  2. Add toolAutoResolution client initialization option in client/src/common/index.ts (buildBasicLanguageClientOptions now accepts optional initializationOptions). Set to true for the gxFormat2 client in native extension.ts. Read from params.initializationOptions?.toolAutoResolution in server.ts initialize(). Note: used initializationOptions rather than capabilities.experimental for simplicity.

  3. Add documents.onDidOpen handler in GalaxyWorkflowLanguageServerImpl. Calls ToolCacheService.scheduleResolution(). Handler fires after onDidChangeContent (which populates the cache) because both are registered in that order.

  4. Implement debounced batch resolution in ToolCacheService: 300 ms debounce timer, _pending Map (key → {toolRef, documentUris}), _inFlight Set for deduplication, populateCache() call, try/catch for unhandled rejection, re-validation of affected docs.

  5. Add TOOL_RESOLUTION_FAILED notification to requestsDefinitions.ts (LSNotificationIdentifiers.TOOL_RESOLUTION_FAILED + ToolResolutionFailedParams). Sent from ToolCacheService._flushResolution() when failures occur.

  6. Client notification handler: registered on gxFormat2Client in initExtension(). Writes each failure to “Galaxy Workflows — Tool Resolution” output channel. Shows a one-time warning toast per session (intentional — all failures still logged to output).

  7. Update diagnostic text in ToolStateValidationService: cache-miss now checks toolRegistryService.hasResolutionFailed()DiagnosticSeverity.Warning + “Could not resolve tool ’…’ from ToolShed — see Output panel.” when true. hasResolutionFailed / markResolutionFailed added to ToolRegistryService interface and implemented in ToolRegistryServiceImpl with an in-memory _resolutionFailed Set.

  8. Tests:

    • Unit: debounce batching, deduplication against in-flight set
    • Unit: re-validation triggered after populate completes
    • Unit: TOOL_RESOLUTION_FAILED notification sent on failure
    • Unit: Warning diagnostic when hasResolutionFailed returns true
    • Integration: open a document with an uncached tool → verify populateCache called → verify diagnostic updates

Open Questions

  1. ✅ resolved — deferred to follow-up. Should onDidOpen also re-trigger when a step with a new tool_id is added to an already-open document? (Requires diffing tool ID sets between validation runs.) A TODO comment is in server.ts near the onDidOpen handler.

  2. Should auto-resolution respect a user-facing setting (galaxyWorkflows.toolCache.autoResolve: boolean, default true)? Or is it always-on for native?

  3. ✅ resolved — Warning severity. On resolution failure, the diagnostic now uses DiagnosticSeverity.Warning + “Could not resolve tool ’…’ from ToolShed”.

  4. ✅ resolved — tracked separately. ToolInfoService fetch-based work for web: https://github.com/jmchilton/galaxy-tool-util-ts/issues/44