
Dependency Pydantic Dynamic Models

Pydantic create_model API dynamically builds BaseModel subclasses with field definitions

Revised: 2026-05-22
Revision: 4
Related Notes: Component - Tool State Dynamic Models, Component - Tool State Specification, Dependency - Pydantic Discriminated Unions, PR 18641 - Parameter Model Improvements Research, PR 20935 - Tool Request API, PR 21828 - YAML Tool Hardening and Tool State, PR 22615 - UserToolSource Pydantic Semantic Validation

Pydantic v2 Dynamic Models Reference

Reference document covering Pydantic v2 dynamic model features used in Galaxy’s tool state system and related patterns.

1. create_model() API

Dynamically creates a BaseModel subclass at runtime.

from pydantic import BaseModel, Field, create_model

# Signature
create_model(
    __model_name: str,
    *,
    __config__: ConfigDict | None = None,
    __doc__: str | None = None,
    __base__: type[BaseModel] | tuple[type[BaseModel], ...] | None = None,
    __module__: str = __name__,
    __validators__: dict[str, classmethod] | None = None,
    __cls_kwargs__: dict[str, Any] | None = None,
    __qualname__: str | None = None,
    **field_definitions,    # field_name=(type, default) or field_name=type
) -> type[BaseModel]

Field Definition Syntax

Fields are passed as keyword arguments. Four forms:

# Form 1: Bare type (required field, no default)
M = create_model('M', name=str)

# Form 2: Tuple of (type, default_value)
M = create_model('M', name=(str, "default"))

# Form 3: Tuple of (type, FieldInfo)
M = create_model('M', name=(str, Field(description="Name field")))

# Form 4: Annotated type carrying Field constraints (Annotated comes from typing); ... marks the field as required
M = create_model('M', name=(Annotated[str, Field(min_length=1)], ...))

The ... (Ellipsis) as default means the field is required:

M = create_model('M',
    required_field=(int, ...),     # required
    optional_field=(int, None),    # optional, defaults to None
    defaulted_field=(int, 42),     # optional, defaults to 42
)
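The sketch below exercises the three default styles shown above. The model name `Settings` is made up for illustration. It also shows a detail that is easy to miss: a `None` default on an `int` field is not validated, but an explicit `None` is validated against `int` and rejected.

```python
from pydantic import ValidationError, create_model

# "Settings" is a hypothetical model name used only for this sketch.
Settings = create_model(
    "Settings",
    required_field=(int, ...),    # required
    optional_field=(int, None),   # may be omitted, defaults to None
    defaulted_field=(int, 42),    # may be omitted, defaults to 42
)


def first_error_type(**data):
    """Return the first validation error type, or None if the data is valid."""
    try:
        Settings(**data)
    except ValidationError as exc:
        return exc.errors()[0]["type"]
    return None


ok = Settings(required_field=1)
defaults = (ok.optional_field, ok.defaulted_field)

missing_error = first_error_type()  # required_field omitted
# The annotation is int, not Optional[int], so an explicit None fails.
explicit_none_error = first_error_type(required_field=1, optional_field=None)
```

If explicit `None` input should be accepted, annotate the field as `Optional[int]` instead of `int`.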

Base Class Inheritance

class MyBase(BaseModel):
    model_config = ConfigDict(extra="forbid")
    common_field: str = "shared"

DerivedModel = create_model(
    'DerivedModel',
    __base__=MyBase,
    extra_field=(int, ...),
)
# DerivedModel inherits MyBase's config, validators, and fields

Multiple bases:

DerivedModel = create_model('D', __base__=(Base1, Base2), ...)
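A minimal check of the single-base case above, assuming the same `MyBase` definition: the derived model is a real subclass, carries the inherited field, and rejects unknown keys because `extra="forbid"` comes along from the base.

```python
from pydantic import BaseModel, ConfigDict, ValidationError, create_model


class MyBase(BaseModel):
    model_config = ConfigDict(extra="forbid")
    common_field: str = "shared"


DerivedModel = create_model(
    "DerivedModel",
    __base__=MyBase,
    extra_field=(int, ...),
)

instance = DerivedModel(extra_field=3)
field_names = sorted(DerivedModel.model_fields)

try:
    DerivedModel(extra_field=3, unknown=1)
    extra_error = None
except ValidationError as exc:
    # The inherited extra="forbid" rejects the unknown key.
    extra_error = exc.errors()[0]["type"]
```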

Config with __config__

from pydantic import ConfigDict, create_model

model_config = ConfigDict(
    extra="forbid",             # reject unknown fields
    protected_namespaces=(),    # allow model_* field names
    populate_by_name=True,      # allow field name or alias
)
M = create_model('M', __config__=model_config, name=(str, ...))

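Do not rely on combining __config__ with __base__: Pydantic v2 releases have rejected that combination with a PydanticUserError (code create-model-config-base), so check the behavior of the pinned version. The portable approach is to set model_config on the base class (or on an intermediate subclass) and pass only __base__.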

Validators with __validators__

from pydantic import create_model, field_validator

def check_positive(cls, v):
    # Raise ValueError instead of assert: assertions are skipped under python -O
    if v <= 0:
        raise ValueError("must be positive")
    return v

M = create_model(
    'M',
    __validators__={
        "check_value": field_validator("value")(check_positive),
    },
    value=(int, ...),
)

Each key becomes an attribute name on the generated class, so validator keys must be unique within the model. When assembling validators from multiple parameters, namespace them: f"{field_name}_{validator_name}".
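One way to apply the namespacing rule when validators are generated per parameter is sketched below. The helper `make_positive_check` and the field names are hypothetical, not taken from Galaxy's code.

```python
from pydantic import ValidationError, create_model, field_validator


def make_positive_check(field_name: str):
    """Build a field_validator bound to one field (hypothetical helper)."""

    def check_positive(cls, v):
        if v <= 0:
            raise ValueError(f"{field_name} must be positive")
        return v

    return field_validator(field_name)(check_positive)


field_names = ["width", "height"]  # illustrative parameter names

# Keys are namespaced per field, so the two validators do not collide
# even though both wrap a function called check_positive.
validators = {f"{name}_check_positive": make_positive_check(name) for name in field_names}
fields = {name: (int, ...) for name in field_names}

Box = create_model("Box", __validators__=validators, **fields)

valid = Box(width=1, height=2)

try:
    Box(width=1, height=-1)
    failing_loc = None
except ValidationError as exc:
    failing_loc = exc.errors()[0]["loc"]
```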


2. Discriminated Unions

String Discriminator (Simple)

All union members share a field with Literal values:

from typing import Literal, Union
from pydantic import BaseModel, Field

class Cat(BaseModel):
    pet_type: Literal['cat']
    meows: int

class Dog(BaseModel):
    pet_type: Literal['dog']
    barks: float

class Model(BaseModel):
    pet: Union[Cat, Dog] = Field(discriminator='pet_type')

Pydantic reads pet_type from the input and validates against only the matching model. Dispatch is a single lookup instead of a trial of every union member, and a failure reports errors for the selected branch alone.
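A short usage sketch for the models above, showing both the routed result and the single error produced for an unknown tag:

```python
from typing import Literal, Union

from pydantic import BaseModel, Field, ValidationError


class Cat(BaseModel):
    pet_type: Literal["cat"]
    meows: int


class Dog(BaseModel):
    pet_type: Literal["dog"]
    barks: float


class Model(BaseModel):
    pet: Union[Cat, Dog] = Field(discriminator="pet_type")


# The "dog" tag selects Dog; Cat is never tried.
dog = Model.model_validate({"pet": {"pet_type": "dog", "barks": 3.5}}).pet

try:
    Model.model_validate({"pet": {"pet_type": "fish"}})
    errors = []
except ValidationError as exc:
    errors = exc.errors()

error_types = [e["type"] for e in errors]
```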

Callable Discriminator with Tag

When union members have different field structures (no shared discriminator field):

from typing import Annotated, Any, Union
from pydantic import BaseModel, Discriminator, Tag

class Apple(BaseModel):
    fruit: str

class Pumpkin(BaseModel):
    filling: str

def pie_discriminator(v: Any) -> str:
    if isinstance(v, dict):
        if 'fruit' in v:
            return 'apple'
        if 'filling' in v:
            return 'pumpkin'
    # Handle model instances too (used during serialization)
    if hasattr(v, 'fruit'):
        return 'apple'
    return 'pumpkin'

Pie = Annotated[
    Union[
        Annotated[Apple, Tag('apple')],
        Annotated[Pumpkin, Tag('pumpkin')],
    ],
    Discriminator(pie_discriminator),
]

Key points:

  • The callable receives raw dict during validation and model instances during serialization
  • Always handle both cases (isinstance(v, dict) for raw input, hasattr or getattr for model instances)
  • Return value must match one of the Tag(...) strings
  • Returning a non-matching string or None causes a ValidationError
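The points above can be checked with a `TypeAdapter`. This sketch uses a stricter variant of the discriminator that returns `None` for unrecognized input instead of falling back to `'pumpkin'`; that variation is an assumption made here so the error path is visible.

```python
from typing import Annotated, Any, Optional, Union

from pydantic import BaseModel, Discriminator, Tag, TypeAdapter, ValidationError


class Apple(BaseModel):
    fruit: str


class Pumpkin(BaseModel):
    filling: str


def pie_discriminator(v: Any) -> Optional[str]:
    # Validation path: v is the raw input dict.
    if isinstance(v, dict):
        if "fruit" in v:
            return "apple"
        if "filling" in v:
            return "pumpkin"
        return None
    # Serialization path: v is a model instance.
    if isinstance(v, Apple):
        return "apple"
    if isinstance(v, Pumpkin):
        return "pumpkin"
    return None


Pie = Annotated[
    Union[Annotated[Apple, Tag("apple")], Annotated[Pumpkin, Tag("pumpkin")]],
    Discriminator(pie_discriminator),
]
pie_adapter = TypeAdapter(Pie)

apple = pie_adapter.validate_python({"fruit": "granny smith"})
dumped = pie_adapter.dump_python(Pumpkin(filling="spiced"))

try:
    pie_adapter.validate_python({"crust": "shortbread"})
    no_tag_error = None
except ValidationError as exc:
    no_tag_error = exc.errors()[0]["type"]
```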

Custom Error Messages

Discriminator(
    my_discriminator_func,
    custom_error_type='invalid_union_member',
    custom_error_message='Input does not match any expected type',
    custom_error_context={'discriminator': 'type_field'},
)

Nested Discriminators

Stack discriminators for hierarchical type selection:

Cat = Annotated[
    Union[Annotated[BlackCat, Tag('black')], Annotated[WhiteCat, Tag('white')]],
    Discriminator(color_discriminator),
]

Pet = Annotated[
    Union[Annotated[Cat, Tag('cat')], Annotated[Dog, Tag('dog')]],
    Discriminator(species_discriminator),
]

Dynamic Discriminated Union Construction

Build unions at runtime (Galaxy’s primary pattern):

from pydantic import Discriminator, Tag, create_model
from typing import Annotated, Union

def build_discriminated_union(branches: dict[str, type]) -> type:
    """branches maps tag_string -> model_class"""
    tagged = [Annotated[model, Tag(tag)] for tag, model in branches.items()]

    def discriminate(v):
        if isinstance(v, dict):
            return v.get('type', '')
        return getattr(v, 'type', '')

    return Annotated[Union[tuple(tagged)], Discriminator(discriminate)]
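A possible way to plug the returned annotation into `create_model()` is sketched below. The branch models (`IntBranch`, `TextBranch`) and the `State` wrapper are invented for illustration, and the sketch assumes at least two branches, since `Union` collapses a single member.

```python
from typing import Annotated, Any, Literal, Union

from pydantic import Discriminator, Tag, create_model


def build_discriminated_union(branches: dict) -> Any:
    """branches maps tag_string -> model_class (expects two or more entries)."""
    tagged = [Annotated[model, Tag(tag)] for tag, model in branches.items()]

    def discriminate(v):
        if isinstance(v, dict):
            return v.get("type", "")
        return getattr(v, "type", "")

    return Annotated[Union[tuple(tagged)], Discriminator(discriminate)]


# Hypothetical branch models, each carrying a literal "type" tag.
IntBranch = create_model("IntBranch", type=(Literal["int"], ...), value=(int, ...))
TextBranch = create_model("TextBranch", type=(Literal["text"], ...), value=(str, ...))

ParamUnion = build_discriminated_union({"int": IntBranch, "text": TextBranch})

# The Annotated union is used like any other field type.
State = create_model("State", param=(ParamUnion, ...))

state = State.model_validate({"param": {"type": "text", "value": "hello"}})
round_trip = state.model_dump()
```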

3. model_rebuild() for Forward References

When It’s Needed

Pydantic resolves type annotations when the model class is created. If a model references a type that isn’t defined yet (forward reference), schema generation is deferred. Call model_rebuild() after all referenced types are defined.

from __future__ import annotations  # makes all annotations strings (forward refs)
from pydantic import BaseModel

class Node(BaseModel):
    value: int
    children: list[Node]  # self-reference

Node.model_rebuild()  # harmless here: a self-reference resolves on its own; required when the referenced type is defined later

API

@classmethod
def model_rebuild(
    cls,
    *,
    force: bool = False,          # rebuild even if already complete
    raise_errors: bool = True,    # raise on resolution failure
    _parent_namespace_depth: int = 2,
    _types_namespace: MappingNamespace | None = None,  # custom namespace for resolution
) -> bool | None

Returns:

  • None if schema was already complete and force=False
  • True if rebuild succeeded
  • False if rebuild failed and raise_errors=False
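The three return values can be observed directly. In this sketch the class names are invented, and `_types_namespace` is passed explicitly only so the example does not depend on where it is executed; at module level a bare `model_rebuild()` is the usual call.

```python
from typing import Optional

from pydantic import BaseModel


class Complete(BaseModel):
    value: int


# Schema is already built, so there is nothing to do.
already_complete = Complete.model_rebuild()
# force=True rebuilds anyway.
forced = Complete.model_rebuild(force=True)


class Pending(BaseModel):
    later: Optional["Later"] = None  # "Later" does not exist yet


# The forward reference cannot be resolved; with raise_errors=False this reports failure.
unresolved = Pending.model_rebuild(raise_errors=False)


class Later(BaseModel):
    x: int


# Supply the missing name explicitly and rebuild.
resolved = Pending.model_rebuild(_types_namespace={"Later": Later})

pending = Pending(later={"x": 1})
```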

Mutual References

class Parent(BaseModel):
    children: list['Child']

class Child(BaseModel):
    parent: 'Parent'

# Must rebuild after BOTH are defined
Parent.model_rebuild()
Child.model_rebuild()

Cascading Rebuilds

Rebuilding a model may cascade to rebuild its dependencies. If those dependencies have unresolved references, errors can occur. Define all types before calling model_rebuild().

Module-Level Pattern

Standard pattern — define all models, then rebuild at module level:

class A(BaseModel):
    b: Optional['B'] = None

class B(BaseModel):
    a: Optional['A'] = None

# After all definitions
A.model_rebuild()
B.model_rebuild()

4. JSON Schema Generation

model_json_schema()

schema = MyModel.model_json_schema(
    mode='validation',                    # or 'serialization'
    schema_generator=GenerateJsonSchema,  # custom generator class
    ref_template='#/$defs/{model}',       # reference template
)

Custom Schema Generator

from pydantic.json_schema import GenerateJsonSchema

class CustomSchema(GenerateJsonSchema):
    def generate(self, schema, mode='validation'):
        json_schema = super().generate(schema, mode=mode)
        json_schema['$schema'] = self.schema_dialect  # add $schema field
        return json_schema

schema = Model.model_json_schema(schema_generator=CustomSchema)

Dynamic Models and JSON Schema

Dynamic models created via create_model() support model_json_schema() just like static models. Unions discriminated by a string field produce oneOf plus discriminator metadata (propertyName and mapping) in the schema.
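A small sketch of what that schema looks like for fully dynamic models. The model and field names are illustrative.

```python
from typing import Literal, Union

from pydantic import Field, create_model

Cat = create_model("Cat", pet_type=(Literal["cat"], ...), meows=(int, ...))
Dog = create_model("Dog", pet_type=(Literal["dog"], ...), barks=(float, ...))

Owner = create_model(
    "Owner",
    pet=(Union[Cat, Dog], Field(discriminator="pet_type")),
)

schema = Owner.model_json_schema()
pet_schema = schema["properties"]["pet"]

# Branch models are emitted under $defs and referenced from oneOf.
branch_refs = [choice["$ref"] for choice in pet_schema["oneOf"]]
discriminator = pet_schema["discriminator"]
```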

TypeAdapter for Non-Model Types

from pydantic import TypeAdapter

adapter = TypeAdapter(Union[Cat, Dog])
schema = adapter.json_schema(mode='validation')

5. __get_pydantic_core_schema__ Custom Types

On the Type Itself

from pydantic_core import CoreSchema, core_schema
from pydantic import GetCoreSchemaHandler

class PositiveInt(int):
    @classmethod
    def __get_pydantic_core_schema__(
        cls, source_type: type, handler: GetCoreSchemaHandler
    ) -> CoreSchema:
        return core_schema.int_schema(gt=0)

As Annotation Metadata

from dataclasses import dataclass

@dataclass(frozen=True)
class Uppercase:
    def __get_pydantic_core_schema__(self, source_type, handler):
        schema = handler(source_type)
        return core_schema.no_info_after_validator_function(
            lambda v: v.upper(), schema
        )

# Usage: Annotated[str, Uppercase()]
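Both styles can be used as field types in a dynamic model. The sketch below assumes the two definitions above and an invented `Reading` model. Note that `PositiveInt` as written validates with a plain int schema, so the stored value is an `int`, not a `PositiveInt` instance.

```python
from dataclasses import dataclass
from typing import Annotated

from pydantic import GetCoreSchemaHandler, ValidationError, create_model
from pydantic_core import CoreSchema, core_schema


class PositiveInt(int):
    @classmethod
    def __get_pydantic_core_schema__(
        cls, source_type: type, handler: GetCoreSchemaHandler
    ) -> CoreSchema:
        return core_schema.int_schema(gt=0)


@dataclass(frozen=True)
class Uppercase:
    def __get_pydantic_core_schema__(self, source_type, handler) -> CoreSchema:
        schema = handler(source_type)  # default schema for the annotated type
        return core_schema.no_info_after_validator_function(lambda v: v.upper(), schema)


# "Reading" is a hypothetical model name.
Reading = create_model(
    "Reading",
    count=(PositiveInt, ...),
    label=(Annotated[str, Uppercase()], ...),
)

reading = Reading(count=3, label="galaxy")

try:
    Reading(count=0, label="x")
    zero_error = None
except ValidationError as exc:
    zero_error = exc.errors()[0]["type"]
```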

Handler Methods

  • handler(type) — call next in chain or Pydantic’s default for type
  • handler.generate_schema(type) — generate schema for an unrelated type
  • handler.field_name — access current field name (v2.4+)

Common core_schema Functions

core_schema.str_schema()
core_schema.int_schema(gt=0)
core_schema.no_info_after_validator_function(fn, inner_schema)
core_schema.to_string_ser_schema()
core_schema.union_schema([schema1, schema2])
core_schema.tagged_union_schema(choices, discriminator)  # choices: dict mapping tag -> schema

6. RootModel for Single-Value Models

from typing import Union
from pydantic import RootModel, Field

class MyUnion(RootModel):
    root: Union[Cat, Dog] = Field(discriminator='type')

Useful for:

  • Wrapping discriminated unions as standalone models
  • Creating list/container models: root: List[SomeType]
  • Using as field types in create_model (the field holds the RootModel instance, the wrapped value is read via .root, and model_dump() emits the root value directly)
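A sketch of a RootModel-wrapped union used as a field type in a dynamic model. `Household` and the `type` tag values are invented for illustration.

```python
from typing import Literal, Union

from pydantic import BaseModel, Field, RootModel, create_model


class Cat(BaseModel):
    type: Literal["cat"]
    meows: int


class Dog(BaseModel):
    type: Literal["dog"]
    barks: float


class Pet(RootModel):
    root: Union[Cat, Dog] = Field(discriminator="type")


# "Household" is a hypothetical dynamic model using Pet as a field type.
Household = create_model("Household", pet=(Pet, ...))

home = Household.model_validate({"pet": {"type": "cat", "meows": 2}})

wrapper = home.pet       # the RootModel instance
inner = home.pet.root    # the selected union member
dumped = home.model_dump()  # no "root" key appears in the output
```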

7. TypeAdapter for Non-Model Validation

from pydantic import TypeAdapter

# Validate complex union types without a model wrapper
adapter = TypeAdapter(Annotated[Union[A, B], Field(discriminator='src')])
result = adapter.validate_python({'src': 'a', 'value': 1})

Module-level TypeAdapter instances are created once and reused (no lru_cache needed).


8. Caching Patterns for Dynamic Models

Problem

create_model() is relatively expensive — it generates pydantic-core schemas, compiles validators, and creates a new class. Calling it per-request is wasteful.

lru_cache on Factory Functions

from functools import lru_cache

@lru_cache(maxsize=256)
def get_model_for_tool(tool_id: str, state_repr: str) -> type[BaseModel]:
    bundle = load_tool_parameters(tool_id)
    return create_field_model(bundle.parameters, f"Model_{tool_id}", state_repr)

Requirements for caching:

  • Arguments must be hashable (strings, tuples, frozensets — not lists or dicts)
  • The tool parameter definition must be immutable for the cache to be valid
  • Choose maxsize from the number of unique tools × state representations

Galaxy’s Approach

Galaxy currently does NOT cache dynamic models. Models are created fresh on each validation call. This is acceptable because:

  • Tool definitions are read at startup and rarely change
  • Validation calls are relatively infrequent (per-job, not per-request in hot paths)
  • The model creation overhead is small compared to actual job execution

If caching were needed, the natural cache key would be (tool_id, tool_version, state_representation).
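A minimal sketch of a factory keyed that way, should caching ever be introduced. `TOOL_FIELDS` is a made-up stand-in for a tool registry and `model_for` is a hypothetical name; neither reflects Galaxy's code.

```python
from functools import lru_cache

from pydantic import BaseModel, create_model

# Hypothetical registry: (tool_id, tool_version) -> tuple of (field name, type).
# Tuples keep the definitions immutable.
TOOL_FIELDS = {
    ("cat1", "1.0"): (("lines", int), ("header", bool)),
}


@lru_cache(maxsize=256)
def model_for(tool_id: str, tool_version: str, state_representation: str) -> type[BaseModel]:
    """Build the model once per (tool_id, tool_version, state_representation)."""
    fields = {name: (py_type, ...) for name, py_type in TOOL_FIELDS[(tool_id, tool_version)]}
    return create_model(f"{tool_id}_{state_representation}", **fields)


first = model_for("cat1", "1.0", "request")
second = model_for("cat1", "1.0", "request")      # served from the cache
other = model_for("cat1", "1.0", "job_internal")  # different key, new class
cache_hits = model_for.cache_info().hits
```

Including the tool version in the key means an updated tool definition gets a fresh model instead of a stale cached one.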


9. Common Pitfalls

1. Forgetting model_rebuild() for Forward References

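Symptom: PydanticUserError stating that the class is not fully defined, raised when the model is first instantiated or validated. model_rebuild() itself raises PydanticUndefinedAnnotation if a name still cannot be resolved.
Fix: Call model_rebuild() after all referenced types are defined.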

2. Callable Discriminator Not Handling Model Instances

Pydantic calls the discriminator function during serialization too, where v is a model instance not a dict:

# WRONG
def discriminate(v):
    return v.get('type')  # fails on model instances

# CORRECT
def discriminate(v):
    if isinstance(v, dict):
        return v.get('type', '')
    return getattr(v, 'type', '')

3. Non-Unique __validators__ Keys

All validator keys must be unique within a model. When combining validators from multiple parameters, namespace them:

# WRONG: duplicate dict key, so the validator for "a" is silently overwritten
validators = {"check": field_validator("a")(fn), "check": field_validator("b")(fn)}

# CORRECT
validators = {"a_check": field_validator("a")(fn), "b_check": field_validator("b")(fn)}

4. extra="forbid" with Aliases

When a field has an alias, input must use the alias by default. Under extra="forbid", the plain field name is then treated as an unknown key and rejected. Set populate_by_name=True to accept either the alias or the field name:

ConfigDict(extra="forbid", populate_by_name=True)
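The difference is easy to see side by side. In this sketch the field `input_file` with alias `input-file` and the helper `make_model` are invented for illustration.

```python
from pydantic import ConfigDict, Field, ValidationError, create_model


def make_model(name: str, **config):
    """Build a one-field model with an aliased field (hypothetical helper)."""
    return create_model(
        name,
        __config__=ConfigDict(extra="forbid", **config),
        input_file=(str, Field(alias="input-file")),
    )


Strict = make_model("Strict")
Lenient = make_model("Lenient", populate_by_name=True)

by_alias = Strict.model_validate({"input-file": "a.txt"}).input_file

try:
    Strict.model_validate({"input_file": "a.txt"})
    strict_errors = []
except ValidationError as exc:
    # The alias is reported missing and the field name is reported as extra.
    strict_errors = sorted(e["type"] for e in exc.errors())

by_name = Lenient.model_validate({"input_file": "a.txt"}).input_file
lenient_by_alias = Lenient.model_validate({"input-file": "b.txt"}).input_file
```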

5. Dynamic Literal with Empty Options

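Union[()] raises TypeError, so a union built from an empty options list fails at type-construction time. Handle the zero-options case explicitly (union_type below is assumed to be a helper that builds Union[...] from a list of types):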

if len(options) > 0:
    py_type = union_type([Literal[o] for o in options])
else:
    py_type = type(None)  # no valid options -> only None accepted
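A self-contained version of this guard might look like the following. `literal_type_for` is a hypothetical helper that inlines the union construction, and the model names are illustrative.

```python
from typing import Literal, Union

from pydantic import ValidationError, create_model


def literal_type_for(options):
    """Return a union of Literal types, or NoneType when there are no options."""
    if len(options) > 0:
        return Union[tuple(Literal[o] for o in options)]
    return type(None)  # no valid options -> only None accepted


def rejects(model, **data) -> bool:
    """True if the model raises ValidationError for the given data."""
    try:
        model(**data)
    except ValidationError:
        return True
    return False


Choice = create_model("Choice", value=(literal_type_for(["a", "b"]), ...))
Empty = create_model("Empty", value=(literal_type_for([]), None))

picked = Choice(value="a").value
empty_default = Empty().value

bad_choice_rejected = rejects(Choice, value="c")
empty_rejects_anything = rejects(Empty, value="a")
```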

6. protected_namespaces Warning

Pydantic warns when a field name falls inside a protected namespace; earlier v2 releases protect the whole model_ prefix by default. Suppress with:

ConfigDict(protected_namespaces=())

7. Mutating Types After Model Creation

create_model() captures types at call time. Mutating the type objects after creation has no effect on already-created models.


10. Performance Notes

  • create_model() cost is dominated by pydantic-core schema compilation
  • Pydantic v2 internally caches some schema generation results
  • Discriminated unions with string discriminators are faster than callable discriminators (direct dict lookup vs function call)
  • TypeAdapter instances should be created once at module level, not per-call
  • For dynamic models that are created frequently with the same structure, lru_cache on the factory function provides significant speedup
  • model_rebuild() returns None without doing any work once the schema is complete; force=True triggers a full schema recompilation, so pass it only when a referenced type has actually changed
