DROP_TRICA_PLAN

Drop ToolRequestImplicitCollectionAssociation; tighten TES back-pops to uselist=False

Goal

Remove the last producer-side row anchored on ToolRequest. Outputs of a tool execution are recoverable end-to-end via the TES → ICJ → HDCA chain that this branch already establishes. While doing so, tighten the four TES back-pop relationships from list[...] to Optional[...] — TES is a per-execution-event row, so every back-pop is genuinely 1..[0,1].

Why

TRICA was added when ImplicitCollectionJobs was not minted before constituent jobs ran, so the only way to find a tool-request’s output HDCAs at queued/grey/empty time was through TRICA. On this branch, precreate_output_collections (lib/galaxy/tools/execute.py:646) unconditionally mints an ICJ, attaches icj.tool_execution_state = tes (lines 647–648), wires hdca.implicit_collection_jobs = icj (line 678), and sets hdca.implicit_output_name (line 675) — all in the same transaction as the TRICA append. TRICA’s (tool_request_id, dataset_collection_id, output_name) shape is fully redundant with HDCA.implicit_collection_jobs_id + HDCA.implicit_output_name plus ICJ.tool_execution_state_id.

Scope

Unreleased branch — the four existing migrations have not shipped. Add a fifth revision on top of 395148707459; the chain will be rebased together at a later point.

Audits already done (this conversation)

Verdict: drop is safe.

Schema

TES back-pop tightening

All four TES back-pops are 1..[0,1] in the writer. Change relationship shape on ToolExecutionState:

| Back-pop | Current | New | Justification |
|---|---|---|---|
| tool_requests → tool_request | list[ToolRequest] | Optional[ToolRequest] | services/jobs.py::create mints exactly one TR per TES. TR↔TES is 1:1 in the writer. |
| jobs → job | list[Job] | Optional[Job] | Per ICJ-supersedes invariant, a TES has at most one direct Job FK (the simple-job case). Under an ICJ, Jobs have NULL TES. |
| implicit_collection_jobs → implicit_collection_jobs (rename optional) | list[ICJ] | Optional[ICJ] | precreate_output_collections mints one ICJ per call. |
| workflow_invocation_steps → workflow_invocation_step | list[WIS] | Optional[WIS] | _capture_workflow_tool_request_state mints one WIS per step execution; TES is per-execution-event. |

DB enforcement

Unique constraint on tool_execution_state_id for each of tool_request, job, implicit_collection_jobs, workflow_invocation_step. A plain unique constraint is enough and no partial (WHERE ... IS NOT NULL) index is needed: PostgreSQL and SQLite both treat NULLs as distinct under unique by default, so the import-path NULL-TES rows stay legal. Adding the constraint will fail loudly if any legacy data violates the invariant; surface and fix the duplicates before promoting.
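The NULL behaviour and the pre-flight duplicate check can be tried out in isolation. The following is a minimal sketch against an in-memory SQLite database, assuming a pared-down job table with only the two relevant columns; the helper name duplicate_tes_ids is hypothetical and not part of the codebase. It uses CREATE UNIQUE INDEX because SQLite cannot add a constraint to an existing table with ALTER TABLE; the NULL semantics are the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Pared-down stand-in for the real job table (assumption: only these columns matter here).
conn.execute(
    "CREATE TABLE job ("
    " id INTEGER PRIMARY KEY,"
    " tool_execution_state_id INTEGER"
    ")"
)
# Two import-path rows with NULL TES, two rows wrongly sharing TES 7, one clean row.
conn.executemany(
    "INSERT INTO job (tool_execution_state_id) VALUES (?)",
    [(None,), (None,), (7,), (7,), (8,)],
)


def duplicate_tes_ids(conn, table):
    """Return TES ids referenced by more than one row of `table`.

    `table` is a trusted literal supplied by the migration author, not user input.
    """
    rows = conn.execute(
        f"SELECT tool_execution_state_id FROM {table}"
        " WHERE tool_execution_state_id IS NOT NULL"
        " GROUP BY tool_execution_state_id HAVING COUNT(*) > 1"
        " ORDER BY tool_execution_state_id"
    ).fetchall()
    return [row[0] for row in rows]


dupes = duplicate_tes_ids(conn, "job")

# Resolve the violation (here: drop the second row pointing at TES 7), then enforce.
conn.execute("DELETE FROM job WHERE id = 4")
conn.execute(
    "CREATE UNIQUE INDEX uq_job_tool_execution_state_id"
    " ON job (tool_execution_state_id)"
)

# Another NULL-TES row is still accepted after the index exists.
conn.execute("INSERT INTO job (tool_execution_state_id) VALUES (NULL)")
null_count = conn.execute(
    "SELECT COUNT(*) FROM job WHERE tool_execution_state_id IS NULL"
).fetchone()[0]

# A second row for an already-used TES id is rejected.
try:
    conn.execute("INSERT INTO job (tool_execution_state_id) VALUES (8)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Running the duplicate query for each of the four tables before the upgrade turns the "fail loudly" case into a list of concrete TES ids to investigate.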

Migration

New revision <rev>_drop_trica_and_tighten_tes_backpops.py, down_revision = "395148707459".

Upgrade:

  1. op.drop_table("tool_request_implicit_collection_association").
  2. op.create_unique_constraint("uq_tool_request_tool_execution_state_id", "tool_request", ["tool_execution_state_id"]).
  3. Same for job, implicit_collection_jobs, workflow_invocation_step.

Downgrade:

  1. Drop the four unique constraints.
  2. Recreate tool_request_implicit_collection_association (mirror the original 1d1d7bf6ac02 create_table). Backfill from (hdca.implicit_collection_jobs_id → icj.tool_execution_state_id → tes.tool_requests[0]) joined with hdca.implicit_output_name. Skip rows where any link is NULL.
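The constraint half of the upgrade and downgrade steps might be organised as below. This is a sketch under assumptions, not the revision itself: run_upgrade, run_downgrade, constraint_name and RecordingOp are hypothetical names, and the op object is passed in explicitly so the snippet runs without Alembic installed. In a real revision the functions would be upgrade() and downgrade() using `from alembic import op`. Recreating the TRICA table and the backfill are deliberately left out because they must mirror the original 1d1d7bf6ac02 create_table.

```python
# Tables carrying a tool_execution_state_id FK, as listed in the plan.
TES_TABLES = [
    "tool_request",
    "job",
    "implicit_collection_jobs",
    "workflow_invocation_step",
]
TRICA_TABLE = "tool_request_implicit_collection_association"


def constraint_name(table: str) -> str:
    # Matches the naming used in upgrade step 2.
    return f"uq_{table}_tool_execution_state_id"


def run_upgrade(op) -> None:
    """Drop TRICA, then add one unique constraint per TES-referencing table."""
    op.drop_table(TRICA_TABLE)
    for table in TES_TABLES:
        # Caveat: SQLite cannot ALTER TABLE ADD CONSTRAINT; Alembic's batch mode
        # (or a project-specific helper) would be needed there.
        op.create_unique_constraint(
            constraint_name(table), table, ["tool_execution_state_id"]
        )


def run_downgrade(op) -> None:
    """Drop the four unique constraints; table recreation and backfill omitted."""
    for table in reversed(TES_TABLES):
        op.drop_constraint(constraint_name(table), table, type_="unique")


class RecordingOp:
    """Hypothetical stand-in for alembic.op that only records the calls made."""

    def __init__(self):
        self.calls = []

    def drop_table(self, name):
        self.calls.append(("drop_table", name))

    def create_unique_constraint(self, name, table, columns):
        self.calls.append(("create_unique_constraint", name, table, tuple(columns)))

    def drop_constraint(self, name, table, type_=None):
        self.calls.append(("drop_constraint", name, table, type_))


recorder = RecordingOp()
run_upgrade(recorder)
```

Deriving the constraint names from one helper keeps upgrade and downgrade from drifting apart if a table is added or renamed later.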

Downgrade backfill query sketch:

INSERT INTO tool_request_implicit_collection_association
  (tool_request_id, dataset_collection_id, output_name)
SELECT tr.id, hdca.id, hdca.implicit_output_name
FROM history_dataset_collection_association hdca
JOIN implicit_collection_jobs icj ON icj.id = hdca.implicit_collection_jobs_id
JOIN tool_execution_state tes ON tes.id = icj.tool_execution_state_id
JOIN tool_request tr ON tr.tool_execution_state_id = tes.id
WHERE hdca.implicit_output_name IS NOT NULL;
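To sanity-check the backfill query, it can be run against toy data. The sketch below uses an in-memory SQLite database with pared-down tables (assumption: only the columns the query touches are modelled, and all ids are made up). It shows that the inner joins already skip rows where any link is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Minimal stand-ins for the real tables; only the joined columns exist.
    CREATE TABLE tool_execution_state (id INTEGER PRIMARY KEY);
    CREATE TABLE tool_request (
        id INTEGER PRIMARY KEY, tool_execution_state_id INTEGER);
    CREATE TABLE implicit_collection_jobs (
        id INTEGER PRIMARY KEY, tool_execution_state_id INTEGER);
    CREATE TABLE history_dataset_collection_association (
        id INTEGER PRIMARY KEY,
        implicit_collection_jobs_id INTEGER,
        implicit_output_name TEXT);
    CREATE TABLE tool_request_implicit_collection_association (
        tool_request_id INTEGER, dataset_collection_id INTEGER, output_name TEXT);

    INSERT INTO tool_execution_state (id) VALUES (1);
    INSERT INTO tool_request VALUES (10, 1);
    -- ICJ 100 is TES-backed; ICJ 101 is an import-path row with NULL TES.
    INSERT INTO implicit_collection_jobs VALUES (100, 1), (101, NULL);
    INSERT INTO history_dataset_collection_association VALUES
        (1000, 100, 'out1'),
        (1001, 100, 'out2'),
        (1002, 101, 'out1'),   -- skipped: its ICJ has no TES
        (1003, NULL, NULL);    -- skipped: no ICJ at all
    """
)

BACKFILL = """
INSERT INTO tool_request_implicit_collection_association
  (tool_request_id, dataset_collection_id, output_name)
SELECT tr.id, hdca.id, hdca.implicit_output_name
FROM history_dataset_collection_association hdca
JOIN implicit_collection_jobs icj ON icj.id = hdca.implicit_collection_jobs_id
JOIN tool_execution_state tes ON tes.id = icj.tool_execution_state_id
JOIN tool_request tr ON tr.tool_execution_state_id = tes.id
WHERE hdca.implicit_output_name IS NOT NULL
"""
conn.execute(BACKFILL)

rows = conn.execute(
    "SELECT tool_request_id, dataset_collection_id, output_name"
    " FROM tool_request_implicit_collection_association"
    " ORDER BY dataset_collection_id"
).fetchall()
```

Only the two HDCAs that reach a ToolRequest through a TES-backed ICJ are restored; the explicit IS NOT NULL filter guards the one column the joins do not cover.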

Writers

| Site | Change |
|---|---|
| lib/galaxy/tools/execute.py:680-685 | Delete the if tool_request: assoc = ToolRequestImplicitCollectionAssociation(); ... block. HDCA→ICJ wiring at line 678 and implicit_output_name=output_name at line 675 already carry the data. |

That’s it for production writers.

Readers

| Site | Change |
|---|---|
| lib/galaxy/managers/history_graph.py:409-425 | Drop the TRICA join. Collection-side producer query becomes HDCA → ICJ → TES → ToolSource. Unifies shape with the job-side producer query (both walk to TES). |
| lib/galaxy/workflow/extract.py:638 | output_hdcas = tool_request.output_collections (helper, see below). |
| lib/galaxy/workflow/extract.py:797 | Replace o.tool_request_association.tool_request detection. Two shape options below. |
| lib/galaxy/workflow/extract.py:914 | Same shape as :638. |
| lib/galaxy/webapps/galaxy/services/workflows.py:316-325, 348 | Auth walk: hdca.implicit_collection_jobs.tool_execution_state.tool_request.history (after uselist=False, tool_request is Optional[TR] not a list). |
| lib/galaxy/webapps/galaxy/services/base.py:230-244 tool_request_detailed_to_model | Build ToolRequestImplicitCollectionReference[] from (hdca.id, hdca.implicit_output_name) off icj.output_dataset_collection_instances. Wire-transparent. |
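Every link in the auth walk for workflows.py can be None (no ICJ, an import-path ICJ without TES, a TES without a ToolRequest), so the chain needs a guard at each hop. A minimal sketch, assuming the attribute names given in the walk and using SimpleNamespace stand-ins instead of the real model classes; the function name owning_history is hypothetical:

```python
from types import SimpleNamespace
from typing import Any, Optional


def owning_history(hdca: Any) -> Optional[Any]:
    """Walk HDCA -> ICJ -> TES -> ToolRequest -> history, stopping at the first None."""
    icj = hdca.implicit_collection_jobs
    tes = icj.tool_execution_state if icj is not None else None
    tool_request = tes.tool_request if tes is not None else None
    return tool_request.history if tool_request is not None else None


# Stand-in objects; the real ones would be SQLAlchemy model instances.
history = SimpleNamespace(id=1)
tool_request = SimpleNamespace(history=history)
tes = SimpleNamespace(tool_request=tool_request)
icj = SimpleNamespace(tool_execution_state=tes)

tr_backed = SimpleNamespace(implicit_collection_jobs=icj)
imported = SimpleNamespace(
    implicit_collection_jobs=SimpleNamespace(tool_execution_state=None)
)
plain = SimpleNamespace(implicit_collection_jobs=None)
```

A caller would treat a None result as "not tool-request-backed" and fall back to whatever access check applies to the HDCA itself.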

extract.py:797 TR-backed detection — two shape options

OPTION_DETECT_AT_HDCA: keep per-HDCA detection. Walk hdca.implicit_collection_jobs.tool_execution_state.tool_request — non-None → TR-backed. Mirrors current shape; localized change.

OPTION_DETECT_AT_SERVICE: lift detection to the service layer. The wire payload already distinguishes tool_request_ids vs implicit_collection_jobs_ids — pass that distinction down to extract. Simpler downstream; no per-HDCA walk needed.

Recommend OPTION_DETECT_AT_SERVICE: the service already knows which input bucket the ID came from, and threading that through removes the only place extract.py needs to peek through HDCA back-pops. The current per-HDCA detection exists because TRICA’s presence was the only signal.
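One possible shape for OPTION_DETECT_AT_SERVICE is to tag each producer id with the bucket it arrived in and hand that tagged reference to extract. The sketch below is illustrative only: ProducerKind, ProducerRef, producer_refs_from_payload and is_tool_request_backed are hypothetical names; only the tool_request_ids and implicit_collection_jobs_ids buckets come from the wire payload described above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class ProducerKind(Enum):
    TOOL_REQUEST = "tool_request"
    IMPLICIT_COLLECTION_JOBS = "implicit_collection_jobs"


@dataclass(frozen=True)
class ProducerRef:
    """An id plus the payload bucket it came from (hypothetical carrier type)."""

    kind: ProducerKind
    id: int


def producer_refs_from_payload(
    tool_request_ids: Iterable[int],
    implicit_collection_jobs_ids: Iterable[int],
) -> List[ProducerRef]:
    # The service layer already knows the bucket, so it records it once here.
    refs = [ProducerRef(ProducerKind.TOOL_REQUEST, i) for i in tool_request_ids]
    refs += [
        ProducerRef(ProducerKind.IMPLICIT_COLLECTION_JOBS, i)
        for i in implicit_collection_jobs_ids
    ]
    return refs


def is_tool_request_backed(ref: ProducerRef) -> bool:
    # Downstream code checks the tag instead of walking HDCA back-pops.
    return ref.kind is ProducerKind.TOOL_REQUEST


refs = producer_refs_from_payload([3], [9])
flags = [is_tool_request_backed(ref) for ref in refs]
```

With this shape, extract never needs to infer TR-backing from the model graph; the decision is made once where the information is already explicit.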

ToolRequest.output_collections helper

Add @property on ToolRequest:

@property
def output_collections(self) -> list["HistoryDatasetCollectionAssociation"]:
    """Output HDCAs reached via TES → ICJ; empty when either link is missing."""
    tes = self.tool_execution_state
    icj = tes.implicit_collection_jobs if tes else None
    return icj.output_dataset_collection_instances if icj else []

Replaces the three readers that today walk tool_request.implicit_collections. Keeps the walk in one place; uselist=False on both tes.implicit_collection_jobs and tes.tool_request makes the chain read naturally.

Tests

Validation

  1. pytest test/unit/app/managers/test_HistoryGraphBuilder.py — history-graph fixtures touched.
  2. pytest test/unit/workflows/test_extract_tool_request_state.py and test_extract_by_ids_validation.py — extract walks.
  3. pytest test/unit/webapps/test_tool_request_payload_tolerance.py — tolerance reader.
  4. pytest test/unit/data/model/ — model relationships, migrations.
  5. Migration round-trip (upgrade/downgrade) via manage_db.sh.
  6. API integration: pytest test/integration/test_tool_requests.py (or equivalent if named differently) — wire payload still matches ToolRequestDetailedModel schema.

Visual update

Update both docs after this lands:

Out of scope

Unresolved questions

  1. OPTION_DETECT_AT_SERVICE vs OPTION_DETECT_AT_HDCA for the TR-backed ICJ detection at extract.py:797?