VSCode Extension Phase 2: Tool Registry Service — Detailed Plan
Parent plan: VSCODE_INTEGRATION_PLAN.md Phase 2
Extension repo: /Users/jxc755/projects/repositories/galaxy-workflows-vscode
Fork: jmchilton/galaxy-workflows-vscode
Goal
Extension resolves tool definitions from the local filesystem cache (~/.galaxy/tool_info_cache) and fetches on-demand from ToolShed 2.0 TRS API. No proxy server required. Provides a “Populate Tool Cache” command and a status bar indicator.
This phase is infrastructure — it doesn’t change completions or validation yet. Phases 3-4 consume this service.
Existing Architecture (Relevant Details)
DI container: Inversify. Services are registered via symbols in the TYPES object at server/packages/server-common/src/languageTypes.ts (lines ~301-312).
Service registration: server/packages/server-common/src/inversify.config.ts
- Singleton bindings: container.bind<T>(TYPES.T).to(TImpl).inSingletonScope()
- Format2 server inherits the base container and adds its own bindings
Settings: ConfigService reads from "galaxyWorkflows" namespace via LSP workspace.getConfiguration(). Per-document caching. Settings declared in root package.json under contributes.configuration.
Current settings shape:
interface ExtensionSettings {
cleaning: { cleanableProperties: string[] };
validation: { profile: "basic" | "iwc" };
}
Command pattern: CustomCommand abstract class in client/src/commands/. Each command has an identifier string and execute() method. Registered in client/src/commands/setup.ts. Declared in package.json contributes.commands.
Custom LSP requests: Defined in shared/src/requestsDefinitions.ts. Client sends via client.sendRequest(), server handles via connection.onRequest().
AST access to tool_id: Steps are AST nodes with properties. Existing code extracts tool_id via:
const tool_id = step.properties.find((p) => p.keyNode.value === "tool_id");
const value = tool_id?.valueNode?.value?.toString();
(See server/packages/server-common/src/providers/validation/rules/TestToolShedValidationRule.ts)
Cache Format (from galaxy-tool-util monorepo)
The galaxy-tool-cache CLI writes to ~/.galaxy/tool_info_cache/:
- index.json: { entries: { [cacheKey]: CacheIndexEntry } }
- {cacheKey}.json: full ParsedTool JSON (id, version, name, description, inputs, outputs, etc.)
CacheIndexEntry:
{ tool_id: string, tool_version: string, source: string, source_url: string, cached_at: string }
cacheKey format: SHA-256-based hash of (toolshedUrl, trsToolId, version).
ParsedTool.inputs is the parameter tree used to generate per-tool JSON Schemas via createFieldModel() + JSONSchema.make().
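As a rough illustration of the cacheKey format, here is a minimal sketch. It assumes the key is the hex SHA-256 digest of the string {toolshedUrl}/{trsToolId}/{version}, as recorded under Unresolved Questions; the function name computeCacheKey follows this plan, and the output should still be checked against files written by galaxy-tool-cache before it is relied on.

```typescript
import { createHash } from "node:crypto";

// Assumption: key = hex SHA-256 of "{toolshedUrl}/{trsToolId}/{version}".
// Verify against the monorepo's cache-key.ts; any difference in separators
// or URL normalization would produce a different file name.
export function computeCacheKey(toolshedUrl: string, trsToolId: string, version: string): string {
  return createHash("sha256").update(`${toolshedUrl}/${trsToolId}/${version}`).digest("hex");
}

// Example call; the resulting 64-character hex string would be the file stem of {cacheKey}.json.
const exampleKey = computeCacheKey(
  "https://toolshed.g2.bx.psu.edu",
  "iuc~bedtools~bedtools_intersectbed",
  "2.31.1"
);
console.log(`${exampleKey}.json`);
```

Because the key depends on the exact input string, a trailing slash on the ToolShed URL or a missing https:// prefix would silently point at a different cache file, which is why the cross-compatibility test later in this plan matters.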
Steps
Step 1: Add Settings
File: package.json (root)
Add to contributes.configuration.properties:
"galaxyWorkflows.toolCache.directory": {
"type": "string",
"default": "~/.galaxy/tool_info_cache",
"description": "Directory containing cached tool definitions (shared with galaxy-tool-cache CLI)."
},
"galaxyWorkflows.toolShed.url": {
"type": "string",
"default": "https://toolshed.g2.bx.psu.edu",
"description": "ToolShed URL for fetching tool definitions on cache miss."
}
File: server/packages/server-common/src/configService.ts
Extend ExtensionSettings interface:
interface ExtensionSettings {
cleaning: CleaningSettings;
validation: ValidationSettings;
toolCache: { directory: string };
toolShed: { url: string };
}
Update defaultSettings:
const defaultSettings: ExtensionSettings = {
cleaning: { cleanableProperties: ["position", "uuid", "errors", "version"] },
validation: { profile: "basic" },
toolCache: { directory: "~/.galaxy/tool_info_cache" },
toolShed: { url: "https://toolshed.g2.bx.psu.edu" },
};
Test: Verify settings are read correctly by ConfigService (unit test with mock connection).
Step 2: Define ToolRegistryService Interface + TYPES Symbol
File: server/packages/server-common/src/languageTypes.ts
Add symbol:
// In TYPES object
ToolRegistryService: Symbol.for("ToolRegistryService"),
Add interface:
/** Metadata about a cached tool's parameter tree. */
export interface CachedToolInfo {
toolId: string;
toolVersion: string;
/** Raw ParsedTool JSON — inputs array is the parameter tree. */
parsedTool: ParsedToolJson;
source: "cache" | "fetched";
}
/** Lightweight representation of ParsedTool — no Effect dependency. */
export interface ParsedToolJson {
id: string;
version: string;
name: string;
description: string | null;
inputs: ToolParameterJson[];
outputs: ToolOutputJson[];
[key: string]: unknown;
}
export interface ToolParameterJson {
name: string;
argument: string | null;
type: string;
label: string;
help: string | null;
optional: boolean;
value: unknown;
[key: string]: unknown;
}
export interface ToolOutputJson {
name: string;
[key: string]: unknown;
}
export interface ToolRegistryService {
/** Resolve tool info by ID + version. Returns null if not found anywhere. */
getToolInfo(toolId: string, toolVersion?: string): Promise<CachedToolInfo | null>;
/** Check whether the tool is already in the local cache. Fast: checks cache only, never fetches. */
hasCached(toolId: string, toolVersion?: string): boolean;
/** List all tools in the local cache. */
listCached(): CachedToolEntry[];
/** Pre-fetch and cache all tools referenced in a set of tool_id/version pairs. */
populateCache(tools: Array<{ toolId: string; toolVersion?: string }>): Promise<PopulateCacheResult>;
/** Number of tools in local cache. */
readonly cacheSize: number;
}
export interface CachedToolEntry {
cacheKey: string;
toolId: string;
toolVersion: string;
source: string;
cachedAt: string;
}
export interface PopulateCacheResult {
fetched: number;
alreadyCached: number;
failed: Array<{ toolId: string; error: string }>;
}
Note: These types are deliberately plain JSON — no Effect/Schema dependency in the extension. The extension reads cached ParsedTool JSON files directly without the Effect decode step. If the JSON is malformed it just returns null.
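To make the "malformed JSON returns null" behavior concrete, here is a minimal sketch of a tolerant parser. The helper name parseCachedTool and the MinimalParsedTool interface are hypothetical; the checks cover only the fields this plan relies on (id, version, inputs) and are an assumption about what counts as usable.

```typescript
// Hypothetical minimal view of a cached tool; a subset of ParsedToolJson.
interface MinimalParsedTool {
  id: string;
  version: string;
  inputs: unknown[];
  [key: string]: unknown;
}

// Returns null instead of throwing when the text is not valid JSON
// or lacks the fields the extension needs.
export function parseCachedTool(text: string): MinimalParsedTool | null {
  let data: unknown;
  try {
    data = JSON.parse(text);
  } catch {
    return null; // syntactically invalid JSON
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) return null;
  const record = data as Record<string, unknown>;
  if (typeof record.id !== "string" || typeof record.version !== "string") return null;
  if (!Array.isArray(record.inputs)) return null;
  return record as MinimalParsedTool;
}
```

A shallow check like this keeps the extension free of the Effect decode step while still rejecting truncated or unrelated files.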
Step 3: Implement ToolRegistryServiceImpl
New file: server/packages/server-common/src/providers/toolRegistry.ts
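import * as os from "os";
import * as path from "path";
import { injectable } from "inversify";
import { CachedToolEntry, CachedToolInfo, ParsedToolJson, PopulateCacheResult, ToolRegistryService } from "../languageTypes";
@injectable()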
export class ToolRegistryServiceImpl implements ToolRegistryService {
private memoryCache = new Map<string, CachedToolInfo>();
private cacheDir: string;
private toolShedUrl: string;
private indexData: Record<string, CachedToolEntry> | null = null;
constructor() {
// Defaults — overridden by configure() after settings load
this.cacheDir = path.join(os.homedir(), ".galaxy", "tool_info_cache");
this.toolShedUrl = "https://toolshed.g2.bx.psu.edu";
}
/** Called after ConfigService loads settings. */
configure(settings: { cacheDir: string; toolShedUrl: string }): void;
async getToolInfo(toolId: string, toolVersion?: string): Promise<CachedToolInfo | null>;
// 1. Check memoryCache
// 2. Check filesystem (read {cacheKey}.json, parse as JSON, wrap in CachedToolInfo)
// 3. On miss: fetch from ToolShed TRS API
// 4. On successful fetch: write to filesystem cache + memory cache
// 5. Return null on all failures (log warning)
hasCached(toolId: string, toolVersion?: string): boolean;
// Check memoryCache, then check index.json entries
listCached(): CachedToolEntry[];
// Read index.json, return entries
async populateCache(tools): Promise<PopulateCacheResult>;
// For each tool: getToolInfo() with concurrency limit (5 parallel fetches)
// Track fetched/cached/failed counts
// --- Private helpers ---
private resolveCoordinates(toolId: string, toolVersion?: string): { toolshedUrl, trsToolId, version, cacheKey };
// Mirrors ToolCache.resolveToolCoordinates() logic from monorepo
// Parses toolshed URL from tool_id string (e.g. "toolshed.g2.bx.psu.edu/repos/iuc/bedtools/...")
// Falls back to configured toolShedUrl for bare tool IDs
private async readFromFilesystem(cacheKey: string): Promise<ParsedToolJson | null>;
// Read {cacheDir}/{cacheKey}.json, JSON.parse, return or null on error
private async fetchFromToolShed(trsToolId: string, version: string, toolshedUrl: string): Promise<ParsedToolJson>;
// HTTP GET {toolshedUrl}/api/tools/{trsToolId}/versions/{version}
// 30s timeout, Accept: application/json
// Parse response as JSON, return
private async writeToCache(cacheKey: string, tool: ParsedToolJson, meta: CachedToolEntry): Promise<void>;
// Write {cacheDir}/{cacheKey}.json
// Update index.json (read-modify-write)
// mkdir -p if needed
private loadIndex(): Record<string, CachedToolEntry>;
// Read {cacheDir}/index.json synchronously, cache in this.indexData
private computeCacheKey(toolshedUrl: string, trsToolId: string, version: string): string;
// SHA-256 hash matching monorepo's cacheKey() function
}
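The fetch step in the skeleton above (30s timeout, Accept: application/json, null on failure) could look roughly like the following sketch. The helper name fetchJsonOrNull and the injectable fetchImpl parameter are hypothetical, added here so that tests can pass a mock; the real method would build the URL from the template shown in the skeleton.

```typescript
// Hypothetical helper: GET a URL as JSON with a timeout.
// Resolves to null on non-2xx status, network error, timeout, or an invalid JSON body.
export async function fetchJsonOrNull(
  url: string,
  timeoutMs = 30_000,
  fetchImpl: typeof fetch = globalThis.fetch
): Promise<unknown> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(url, {
      headers: { Accept: "application/json" },
      signal: controller.signal,
    });
    if (!response.ok) return null;
    return await response.json();
  } catch {
    return null; // aborted, unreachable host, or unparsable body
  } finally {
    clearTimeout(timer); // avoid keeping the event loop alive after completion
  }
}
```

Swallowing every error keeps getToolInfo() from throwing, at the cost of hiding the cause; a real implementation would probably log a warning before returning null, as the skeleton's step 5 suggests.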
Key implementation details:
- Tool ID parsing: A tool_id like toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_intersectbed/2.31.1 encodes: toolshed URL (toolshed.g2.bx.psu.edu), TRS ID (iuc~bedtools~bedtools_intersectbed after repos→tilde conversion), version (2.31.1). The parsing regex is straightforward and already implemented in the monorepo at packages/core/src/cache/tool-id.ts. Replicate the logic (~30 lines).
- Cache key computation: Must match packages/core/src/cache/cache-key.ts exactly so the extension reads cache files written by galaxy-tool-cache. Check the monorepo’s implementation; it’s a simple hash of {toolshedUrl}/{trsToolId}/{version}.
- HTTP fetch: Use globalThis.fetch (available in Node 18+, which VSCode ships). For the web extension build, fetch is natively available. No new dependencies needed.
- Concurrency limit in populateCache: Simple semaphore pattern: Promise.all with a pool of 5.
- ~/ expansion: The cacheDir setting may contain ~; expand to os.homedir() at configure time.
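The tool ID parsing and ~ expansion described above can be sketched as follows. This is an approximation of the rules summarized in this plan (split on /repos/, join owner, repo and tool with tildes, prefix https:// when missing), not a copy of the monorepo's tool-id.ts; the names parseToolShedId, expandHome and ToolCoordinates are hypothetical.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical result shape for a parsed ToolShed tool_id.
interface ToolCoordinates {
  toolshedUrl: string;
  trsToolId: string;
  version: string | null;
}

// Returns null for IDs without a "/repos/" segment (for example the bare ID "cat1").
export function parseToolShedId(toolId: string): ToolCoordinates | null {
  const marker = "/repos/";
  const index = toolId.indexOf(marker);
  if (index <= 0) return null; // no marker, or nothing before it to use as host
  const host = toolId.slice(0, index);
  const parts = toolId.slice(index + marker.length).split("/");
  const [owner, repo, tool] = parts;
  if (!owner || !repo || !tool) return null;
  const version = parts[3] ? parts[3] : null; // version is the fourth segment, if present
  const toolshedUrl = /^https?:\/\//.test(host) ? host : `https://${host}`;
  return { toolshedUrl, trsToolId: `${owner}~${repo}~${tool}`, version };
}

// Expands a leading "~" or "~/" to the user's home directory; other paths pass through.
export function expandHome(dir: string): string {
  if (dir === "~") return os.homedir();
  if (dir.startsWith("~/")) return path.join(os.homedir(), dir.slice(2));
  return dir;
}
```

In this sketch a bare ID returns null so the caller can fall back to the configured toolShedUrl, and a missing version is reported as null rather than guessed.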
Test strategy:
- Unit test resolveCoordinates() with various tool_id formats
- Unit test filesystem read/write with temp directory
- Unit test getToolInfo() with mock fetch (cache miss → fetch → cache hit)
- Integration test: write a cache file manually, verify getToolInfo() reads it
- Verify cache key compatibility: run galaxy-tool-cache add for a tool, then verify the extension reads it
Step 4: Register in Inversify + Wire to Server
File: server/packages/server-common/src/inversify.config.ts
container
.bind<ToolRegistryService>(TYPES.ToolRegistryService)
.to(ToolRegistryServiceImpl)
.inSingletonScope();
File: server/packages/server-common/src/server.ts
In GalaxyWorkflowLanguageServerImpl:
@inject(TYPES.ToolRegistryService)
public readonly toolRegistryService: ToolRegistryService;
In the initialize() method, after config is loaded:
// Configure tool registry with loaded settings
const settings = await this.configService.getDocumentSettings("");
const toolRegistry = this.toolRegistryService as ToolRegistryServiceImpl;
toolRegistry.configure({
cacheDir: settings.toolCache.directory,
toolShedUrl: settings.toolShed.url,
});
In onConfigurationChanged(), re-configure if settings changed.
File: server/packages/server-common/src/server.ts — also expose on the interface:
Add to GalaxyWorkflowLanguageServer interface:
toolRegistryService: ToolRegistryService;
This makes it available to language services via this.server.toolRegistryService (set in LanguageServiceBase.setServer()).
Step 5: “Populate Tool Cache” Command
New file: client/src/commands/populateToolCache.ts
export class PopulateToolCacheCommand extends CustomCommand {
static readonly identifier = "galaxy-workflows.populateToolCache";
readonly identifier = PopulateToolCacheCommand.identifier;
async execute(): Promise<void> {
// 1. Collect tool_ids from all open workflow documents
// Send a new LSP request: GET_WORKFLOW_TOOL_IDS
// Server extracts tool_id + tool_version from all steps in all open workflows
//
// 2. Show progress notification:
// window.withProgress({ location: ProgressLocation.Notification, title: "Populating tool cache..." })
//
// 3. Send POPULATE_TOOL_CACHE request to server with the tool list
// Server calls toolRegistryService.populateCache()
//
// 4. Show result: "Fetched X tools, Y already cached, Z failed"
//
// 5. Update status bar
}
}
New LSP request identifiers in shared/src/requestsDefinitions.ts:
export const GET_WORKFLOW_TOOL_IDS = "galaxy-workflows/getWorkflowToolIds";
export const POPULATE_TOOL_CACHE = "galaxy-workflows/populateToolCache";
export const GET_TOOL_CACHE_STATUS = "galaxy-workflows/getToolCacheStatus";
Server-side handlers — new service in server/packages/server-common/src/services/toolCacheService.ts:
@injectable()
export class ToolCacheService extends ServiceBase {
activate(connection: Connection, server: GalaxyWorkflowLanguageServer): void {
connection.onRequest(GET_WORKFLOW_TOOL_IDS, () => {
return this.extractToolIds(server);
});
connection.onRequest(POPULATE_TOOL_CACHE, async (params: { tools: ToolRef[] }) => {
return server.toolRegistryService.populateCache(params.tools);
});
connection.onRequest(GET_TOOL_CACHE_STATUS, () => {
return { cacheSize: server.toolRegistryService.cacheSize };
});
}
private extractToolIds(server: GalaxyWorkflowLanguageServer): ToolRef[] {
// Walk all cached documents
// For each workflow document, iterate steps
// Extract tool_id + tool_version from step AST properties
// Deduplicate
const toolRefs: ToolRef[] = [];
for (const doc of server.documentsCache.all()) {
// Use the existing nodeManager to walk step nodes
// step.properties.find(p => p.keyNode.value === "tool_id")
// step.properties.find(p => p.keyNode.value === "tool_version")
}
return toolRefs;
}
}
Register ToolCacheService in inversify alongside other services.
File: package.json — declare command:
{
"command": "galaxy-workflows.populateToolCache",
"title": "Populate Tool Cache",
"category": "Galaxy Workflows"
}
File: client/src/commands/setup.ts — register:
new PopulateToolCacheCommand(context, gxFormat2LanguageClient).register();
Step 6: Status Bar Indicator
New file: client/src/statusBar.ts
export class ToolCacheStatusBar {
private item: StatusBarItem;
private client: LanguageClient;
private refreshInterval: NodeJS.Timeout | null = null;
constructor(client: LanguageClient) {
this.client = client;
this.item = window.createStatusBarItem(StatusBarAlignment.Right, 100);
this.item.command = "galaxy-workflows.populateToolCache";
this.item.tooltip = "Galaxy Tool Cache — click to populate";
}
async refresh(): Promise<void> {
try {
const status = await this.client.sendRequest<{ cacheSize: number }>(
GET_TOOL_CACHE_STATUS
);
this.item.text = `$(database) Tools: ${status.cacheSize}`;
this.item.show();
} catch {
this.item.hide();
}
}
startPolling(intervalMs = 30_000): void {
this.refresh();
this.refreshInterval = setInterval(() => this.refresh(), intervalMs);
}
dispose(): void {
if (this.refreshInterval) clearInterval(this.refreshInterval);
this.item.dispose();
}
}
Initialize in client/src/common/index.ts initExtension():
const statusBar = new ToolCacheStatusBar(gxFormat2LanguageClient);
statusBar.startPolling();
context.subscriptions.push(statusBar);
Step 7: Workspace Auto-Discovery (Optional, can defer)
On extension activation, if workspace contains workflow files:
- Extract tool_ids from all .gxwf.yml and .ga files in workspace
- Check how many are cached vs not
- If >50% uncached, show notification: “X tools not cached. Populate now?” with “Yes” / “Later” buttons
- “Yes” triggers the populate command
This is a nice-to-have. Can be a separate small PR after the core service lands.
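If this step is picked up, the ">50% uncached" decision could be isolated in a small pure function so it can be unit tested without VS Code. The sketch below is one possible shape; shouldPromptToPopulate, the ToolRef interface and the isCached predicate are hypothetical names, with isCached standing in for ToolRegistryService.hasCached().

```typescript
// Assumed shape, matching the argument type of populateCache() in this plan.
interface ToolRef {
  toolId: string;
  toolVersion?: string;
}

// Decide whether to show the "Populate now?" notification.
// Prompts only when strictly more than `threshold` of the tools are uncached.
export function shouldPromptToPopulate(
  tools: ToolRef[],
  isCached: (tool: ToolRef) => boolean,
  threshold = 0.5
): { prompt: boolean; uncached: number } {
  const uncached = tools.filter((tool) => !isCached(tool)).length;
  const prompt = tools.length > 0 && uncached / tools.length > threshold;
  return { prompt, uncached };
}
```

The returned uncached count can feed the "X tools not cached" message directly, and an empty workspace never triggers the prompt.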
File Summary
| Action | File | What |
|---|---|---|
| Edit | package.json | Add settings + command declaration |
| Edit | server/packages/server-common/src/languageTypes.ts | Add TYPES symbol + interfaces |
| Edit | server/packages/server-common/src/configService.ts | Extend ExtensionSettings |
| New | server/packages/server-common/src/providers/toolRegistry.ts | ToolRegistryServiceImpl |
| Edit | server/packages/server-common/src/inversify.config.ts | Bind ToolRegistryService |
| Edit | server/packages/server-common/src/server.ts | Inject + expose on interface |
| Edit | shared/src/requestsDefinitions.ts | Add 3 request identifiers |
| New | server/packages/server-common/src/services/toolCacheService.ts | LSP request handlers |
| New | client/src/commands/populateToolCache.ts | Populate command |
| Edit | client/src/commands/setup.ts | Register command |
| New | client/src/statusBar.ts | Cache status indicator |
| Edit | client/src/common/index.ts | Initialize status bar |
New files: 4
Edited files: 8
Testing Plan
Unit Tests
- Tool ID parsing, various formats:
  - Full ToolShed ID: toolshed.g2.bx.psu.edu/repos/iuc/bedtools/bedtools_intersectbed/2.31.1
  - Bare tool ID: cat1
  - TRS-style ID: iuc~bedtools~bedtools_intersectbed
  - Missing version → returns null
- Cache key compatibility: generate cache keys and verify they match files written by galaxy-tool-cache add:
  - Run galaxy-tool-cache add toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.74+galaxy0
  - Verify the extension’s computeCacheKey() produces the same filename
- Filesystem cache read: write a known {key}.json + index.json to temp dir, verify getToolInfo() returns it
- Memory cache: call getToolInfo() twice, verify second call doesn’t hit filesystem
- Fetch fallback: mock fetch, verify:
  - Cache miss triggers fetch
  - Successful fetch writes to filesystem + memory cache
  - Failed fetch returns null (doesn’t throw)
  - Timeout (>30s) returns null
- populateCache: mock fetch, pass 10 tools, verify concurrency ≤5, verify result counts
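For the populateCache concurrency check, a bounded worker pool is one way to get "at most 5 in flight". The sketch below is an assumption about how the limit could be implemented, not the plan's final code; mapWithLimit is a hypothetical helper name.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results keep the order of `items`. If any worker rejects, the returned promise rejects.
export async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item, shared by all lanes

  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++; // claim synchronously, before any await
      results[index] = await worker(items[index] as T);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results;
}
```

A test can wrap the mock fetch in a counter that increments on entry and decrements on exit, then assert that the observed peak never exceeds 5 for 10 tools. Since getToolInfo() is specified to return null rather than throw, rejection handling is less of a concern for that caller.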
Integration Tests
- Settings propagation: configure toolCache.directory via mock client settings, verify ToolRegistryService uses it
- Populate command: open a workflow fixture, run PopulateToolCacheCommand, verify tools appear in cache
- Cross-compatibility: use galaxy-tool-cache CLI to populate cache for a workflow, then verify extension reads the same cache correctly (critical for shared-cache story)
Red-to-Green Order
- Write test for resolveCoordinates() → implement
- Write test for computeCacheKey() → implement, verify against monorepo’s output
- Write test for filesystem read → implement readFromFilesystem()
- Write test for getToolInfo() with pre-populated cache → implement cache-hit path
- Write test for getToolInfo() with mock fetch → implement fetch + write-back
- Write test for populateCache() → implement with concurrency
- Write test for settings propagation → wire ConfigService
- Write integration test for populate command → implement command + LSP handlers
Deferred
- Proxy/gxwf-web integration: Phase 7. Not needed for desktop — the extension reads local cache and fetches directly from ToolShed.
- Galaxy instance as source: Would require API key settings, CORS handling. Future enhancement.
- Workspace auto-discovery notification: Nice UX, but not blocking. Small follow-up PR.
- Web extension support: fetch works in web context, but filesystem cache does not. Web path needs IndexedDB or server-backed storage (Phase 7 territory).
Unresolved Questions
- Cache key function: Verified. It’s SHA-256('{toolshedUrl}/{trsToolId}/{toolVersion}') as hex, 3 lines in packages/core/src/cache/cache-key.ts. Tool ID parsing is ~30 lines in packages/core/src/cache/tool-id.ts: splits on /repos/, extracts owner~repo~tool as TRS ID, prefixes https:// if missing. Both trivial to replicate. Use node:crypto createHash (available in both Node and VSCode’s electron).
- tool_version extraction from Format2: In format2, the version may be embedded in tool_id (after the last /) rather than a separate tool_version field. Need to handle both patterns.
- Settings scope: Should toolCache.directory be workspace-scoped or global? Probably global (one cache for all workspaces), but workspace override could be useful for isolated projects.
- Index.json locking: If galaxy-tool-cache CLI and the extension both write to the cache simultaneously, index.json could get corrupted. Should we use file locking, or just tolerate rare races (the extension is mostly a reader)?
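One low-cost option for the index.json question, short of file locking, is to write through a temporary file and rename it into place. The sketch below is only an illustration of that idea; writeIndexAtomically is a hypothetical helper, and nothing here reflects how galaxy-tool-cache itself writes the index.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: write index.json via temp file + rename so a concurrent reader
// sees either the old file or the new one, not a partially written file.
// This does NOT prevent lost updates when two writers read-modify-write concurrently.
export function writeIndexAtomically(cacheDir: string, index: unknown): void {
  fs.mkdirSync(cacheDir, { recursive: true });
  const target = path.join(cacheDir, "index.json");
  const temp = path.join(cacheDir, `index.json.${process.pid}.${Date.now()}.tmp`);
  fs.writeFileSync(temp, JSON.stringify(index, null, 2), "utf8");
  // Renaming within one directory replaces the target in a single step on POSIX
  // filesystems; behavior on Windows can differ if another process holds the file open.
  fs.renameSync(temp, target);
}
```

This addresses torn reads only. If both tools add entries at the same moment, one entry can still be dropped from the index, although its {cacheKey}.json file would remain on disk and could be picked up by a later rebuild or re-fetch.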