Dependency Injection in Galaxy

Learning Questions

Learning Objectives

Big Interconnected App Python 2

A God object

“a God object is an object that knows too much or does too much. The God object is an example of an anti-pattern and a code smell.”

https://en.wikipedia.org/wiki/God_object

Not only does app know and do too much, it is also used in far too many places. Every interesting component, every controller, the web transaction, and so on holds a reference to app.

Big Interconnected App Python 3 - no right

Problematic Dependency Graph

When managers depend directly on UniverseApplication:

class DatasetCollectionManager:

    def __init__(self, app: UniverseApplication):
        self.type_registry = DATASET_COLLECTION_TYPES_REGISTRY
        self.collection_type_descriptions = COLLECTION_TYPE_DESCRIPTION_FACTORY
        self.model = app.model
        self.security = app.security

        self.hda_manager = hdas.HDAManager(app)
        self.history_manager = histories.HistoryManager(app)
        self.tag_handler = tags.GalaxyTagHandler(app.model.context)
        self.ldda_manager = lddas.LDDAManager(app)

UniverseApplication creates a DatasetCollectionManager for the application and DatasetCollectionManager imports and annotates the UniverseApplication as a requirement. This creates an unfortunate dependency loop.

Dependencies should form a DAG (directed acyclic graph).

Why an Interface?

By using the StructuredApp interface instead of UniverseApplication:

class DatasetCollectionManager:

    def __init__(self, app: StructuredApp):
        self.type_registry = DATASET_COLLECTION_TYPES_REGISTRY
        self.collection_type_descriptions = COLLECTION_TYPE_DESCRIPTION_FACTORY
        self.model = app.model
        self.security = app.security

        self.hda_manager = hdas.HDAManager(app)
        self.history_manager = histories.HistoryManager(app)
        self.tag_handler = tags.GalaxyTagHandler(app.model.context)
        self.ldda_manager = lddas.LDDAManager(app)

Dependencies are now closer to a DAG: DatasetCollectionManager is no longer annotated with the type UniverseApplication, so it no longer needs to import it. Imports are cleaner.
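A minimal sketch of how an interface breaks the loop, assuming the interface is expressed with typing.Protocol (Galaxy's actual StructuredApp may be defined differently). The attribute values are placeholders, and only the two attributes used above are modelled.

```python
from typing import Any, Protocol


class StructuredApp(Protocol):
    """Interface listing only what managers need from the app (illustrative subset)."""

    model: Any
    security: Any


class DatasetCollectionManager:
    # Depends on the interface only; it never references the concrete application class.
    def __init__(self, app: StructuredApp):
        self.model = app.model
        self.security = app.security


class UniverseApplication:
    # The concrete application satisfies StructuredApp structurally and is
    # the only side that knows about the manager class.
    def __init__(self) -> None:
        self.model = "model-mapping"  # placeholder for the real model mapping
        self.security = "id-encoder"  # placeholder for the real ID encoding helper
        self.dataset_collection_manager = DatasetCollectionManager(self)


app = UniverseApplication()
print(app.dataset_collection_manager.model)
```

In a real code base the interface would live in its own module, so the manager module imports the interface and the application module imports the manager, with no import pointing back at the application.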

Big Interconnected App with Python 3 Types

Benefits of Typing

Design Problems with Handling Dependencies Directly

Using app to construct a manager for dealing with dataset collections.

Testing Problems with Handling Dependencies Directly

Design Benefits of Injecting Dependencies

class DatasetCollectionManager:
    def __init__(
        self,
        model: GalaxyModelMapping,
        security: IdEncodingHelper,
        hda_manager: HDAManager,
        history_manager: HistoryManager,
        tag_handler: GalaxyTagHandler,
        ldda_manager: LDDAManager,
    ):
        self.type_registry = DATASET_COLLECTION_TYPES_REGISTRY
        self.collection_type_descriptions = COLLECTION_TYPE_DESCRIPTION_FACTORY
        self.model = model
        self.security = security

        self.hda_manager = hda_manager
        self.history_manager = history_manager
        self.tag_handler = tag_handler
        self.ldda_manager = ldda_manager
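One consequence for testing, sketched under assumptions: when dependencies arrive through the constructor, a test can pass a small stand-in instead of building a whole application. CollectionNamer and StubHDAManager below are hypothetical names invented for this illustration; only the by_id method name is taken from the managers shown in these slides.

```python
class StubHDAManager:
    """Test double standing in for HDAManager; only by_id is needed here."""

    def __init__(self, datasets):
        self._datasets = datasets

    def by_id(self, hda_id):
        return self._datasets[hda_id]


class CollectionNamer:
    """Hypothetical, simplified manager that receives its dependency via the constructor."""

    def __init__(self, hda_manager):
        self.hda_manager = hda_manager

    def element_names(self, hda_ids):
        # Look up each dataset through the injected manager and collect its name.
        return [self.hda_manager.by_id(hda_id)["name"] for hda_id in hda_ids]


stub = StubHDAManager({1: {"name": "forward.fastq"}, 2: {"name": "reverse.fastq"}})
namer = CollectionNamer(stub)
names = namer.element_names([1, 2])
print(names)
```

No app object, database, or configuration is needed to exercise the logic, which is the point of injecting the dependency rather than constructing it from app.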

Constructing the Object Is Still Brittle

DatasetCollectionManager(
    self.model,
    self.security,
    HDAManager(self),
    HistoryManager(self),
    GalaxyTagHandler(self.model.context),
    LDDAManager(self)
)

What is Type-based Dependency Injection?

A dependency injection container keeps track of singletons or recipes for how to construct each type. By default, when it constructs an object, it asks itself for each dependency based on the type signature of the class being constructed.

If an object declares it consumes a dependency of type X (e.g. HDAManager), just query the container recursively for an object of type X.
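The recursive lookup can be sketched with the standard library alone. ToyContainer below is a hypothetical illustration of the idea, not Lagom's implementation, and the manager classes are simplified stand-ins with invented constructor signatures.

```python
from typing import Any, Callable, Dict, get_type_hints


class ToyContainer:
    """Minimal type-keyed container: singletons, recipes, or construction from type hints."""

    def __init__(self) -> None:
        self._singletons: Dict[type, Any] = {}
        self._recipes: Dict[type, Callable[[], Any]] = {}

    def __setitem__(self, dep_type: type, instance: Any) -> None:
        # Register an already-built object to be shared by every consumer.
        self._singletons[dep_type] = instance

    def recipe(self, dep_type: type, factory: Callable[[], Any]) -> None:
        # Register a zero-argument factory that is called on every lookup.
        self._recipes[dep_type] = factory

    def __getitem__(self, dep_type: type) -> Any:
        if dep_type in self._singletons:
            return self._singletons[dep_type]
        if dep_type in self._recipes:
            return self._recipes[dep_type]()
        # Nothing registered: read the constructor's annotations and
        # recursively resolve each parameter by its declared type.
        hints = get_type_hints(dep_type.__init__)
        hints.pop("return", None)
        return dep_type(**{name: self[hint] for name, hint in hints.items()})


class GalaxyModelMapping:
    def __init__(self) -> None:
        self.name = "model"


class HDAManager:
    def __init__(self, model: GalaxyModelMapping) -> None:
        self.model = model


class HistoryManager:
    def __init__(self, model: GalaxyModelMapping) -> None:
        self.model = model


class DatasetCollectionManager:
    def __init__(self, hda_manager: HDAManager, history_manager: HistoryManager) -> None:
        self.hda_manager = hda_manager
        self.history_manager = history_manager


container = ToyContainer()
container[GalaxyModelMapping] = GalaxyModelMapping()  # one shared singleton
dcm = container[DatasetCollectionManager]  # managers are built recursively
```

Here only the model mapping is registered explicitly; the container discovers from the annotations that DatasetCollectionManager needs an HDAManager and a HistoryManager, and that each of those needs the model mapping.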

Object Construction Simplification

Once all the dependencies have been type annotated properly and the needed singletons have been configured, constructing the object reduces to a single container lookup.

Before:

dcm = DatasetCollectionManager(
    self.model,
    self.security,
    HDAManager(self),
    HistoryManager(self),
    GalaxyTagHandler(self.model.context),
    LDDAManager(self)
)

After:

dcm = container[DatasetCollectionManager]

Picking a Library

Many of the existing DI libraries for Python predate the widespread adoption of Python 3 and don’t readily infer things based on types. The benefits of typing and DI each enhance the other, so it was important to pick a library that could do type-based injection.

We went with Lagom, but we’ve built abstractions that would make it very easy to switch.

Lagom

Lagom Website

https://lagom-di.readthedocs.io/en/latest/

Tips for Designing New Galaxy Backend Components

DI in FastAPI Controllers

def get_tags_manager() -> TagsManager:
    return TagsManager()


@cbv(router)
class FastAPITags:
    manager: TagsManager = Depends(get_tags_manager)
    ...

FastAPI's dependency injection allows for type checking but doesn’t use type inference: each dependency needs an explicit factory function passed to Depends.

https://fastapi.tiangolo.com/tutorial/dependencies/

DI and Controllers - FastAPI Limitations

We also have two different controller styles, and only the new FastAPI style supports dependency injection.

def get_tags_manager() -> TagsManager:
    return TagsManager()


@cbv(router)
class FastAPITags:
    manager: TagsManager = Depends(get_tags_manager)
    ...

class TagsController(BaseAPIController):

    def __init__(self, app):
        super().__init__(app)
        self.manager = TagsManager()

DI and Controllers - Unified Approach

-def get_tags_manager() -> TagsManager:
-    return TagsManager()
-
-
 @cbv(router)
 class FastAPITags:
-    manager: TagsManager = Depends(get_tags_manager)
+    manager: TagsManager = depends(TagsManager)

     @router.put(
         '/api/tags',
@@ -58,11 +54,8 @@ def update(
      self.manager.update(trans, payload)


-class TagsController(BaseAPIController):
-
-    def __init__(self, app):
-        super().__init__(app)
-        self.manager = TagsManager()
+class TagsController(BaseGalaxyAPIController):
+    manager: TagsManager = depends(TagsManager)

Building dependency injection into our application, rather than relying on FastAPI, gives us dependency injection that is less verbose, is available uniformly across the application, and works identically for the legacy controllers.

DI in Celery Tasks

From lib/galaxy/celery/tasks.py:

from lagom import magic_bind_to_container
...

def galaxy_task(func):
    CELERY_TASKS.append(func.__name__)
    app = get_galaxy_app()
    if app:
        return magic_bind_to_container(app)(func)
    return func

magic_bind_to_container binds function parameters to a specified Lagom DI container automatically.
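The idea behind this kind of binding can be sketched with inspect.signature. bind_to_container below is a hypothetical stand-in written for illustration, not Lagom's implementation; it uses a plain dict as the container, the stub HDAManager is invented, and it expects callers to pass the non-injected arguments by keyword.

```python
import functools
import inspect
from typing import Any, Callable, Dict


def bind_to_container(container: Dict[type, Any]) -> Callable:
    """Return a decorator that fills in parameters whose annotated type is in container."""

    def decorator(func: Callable) -> Callable:
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Work out which parameters the caller already supplied.
            bound = signature.bind_partial(*args, **kwargs)
            for name, param in signature.parameters.items():
                # Inject anything still missing whose annotation is a registered type.
                if name not in bound.arguments and param.annotation in container:
                    kwargs[name] = container[param.annotation]
            return func(*args, **kwargs)

        return wrapper

    return decorator


class HDAManager:
    """Stub manager that records what was purged (illustrative only)."""

    def __init__(self) -> None:
        self.purged = []

    def by_id(self, hda_id: int) -> str:
        return f"hda-{hda_id}"

    def purge(self, hda: str) -> None:
        self.purged.append(hda)


container = {HDAManager: HDAManager()}


@bind_to_container(container)
def purge_hda(hda_manager: HDAManager, hda_id: int) -> str:
    hda = hda_manager.by_id(hda_id)
    hda_manager.purge(hda)
    return hda


result = purge_hda(hda_id=7)  # hda_manager is supplied from the container
```

The caller only passes plain, serializable values such as hda_id; the manager is looked up by its annotated type when the function runs.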

DI in Celery Tasks - Examples

@celery_app.task(ignore_result=True)
@galaxy_task
def purge_hda(hda_manager: HDAManager, hda_id):
    hda = hda_manager.by_id(hda_id)
    hda_manager._purge(hda)


@celery_app.task
@galaxy_task
def set_metadata(
    hda_manager: HDAManager,
    ldda_manager: LDDAManager,
    dataset_id,
    model_class='HistoryDatasetAssociation'
):
    if model_class == 'HistoryDatasetAssociation':
        dataset = hda_manager.by_id(dataset_id)
    elif model_class == 'LibraryDatasetDatasetAssociation':
        dataset = ldda_manager.by_id(dataset_id)
    dataset.datatype.set_meta(dataset)

Dependencies are automatically injected based on type annotations!

Decomposed App

Key Takeaways