Models
The primary means of defining objects in Pydantic is via models. Models are simply classes which inherit from pydantic.BaseModel.
You can think of models as similar to structs in languages like C, or as the requirements of a single endpoint in an API.
Models share many similarities with Python's dataclasses, but have been designed with some subtle-yet-important differences that streamline certain workflows related to validation, serialization, and JSON schema generation. You can find more discussion of this in the Dataclasses section of the docs.
Untrusted data can be passed to a model and, after parsing and validation, Pydantic guarantees that the fields of the resultant model instance will conform to the field types defined on the model.
Note
Pydantic is primarily a parsing and transformation library, not a validation library. Validation is a means to an end: building a model which conforms to the types and constraints provided.
In other words, Pydantic guarantees the types and constraints of the output model, not the input data.
This might sound like an esoteric distinction, but it is not. If you're unsure what this means or how it might affect your usage you should read the section about Data Conversion below.
Although validation is not the main purpose of Pydantic, you can use this library for custom validation.
Basic model usage

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str = 'Jane Doe'
In this example, User is a model with two fields:

- id, which is an integer and is required
- name, which is a string and is not required (it has a default value).

user = User(id='123')

In this example, user is an instance of User. Initialization of the object will perform all parsing and validation. If no ValidationError is raised, you know the resulting model instance is valid.

assert user.id == 123
assert isinstance(user.id, int)
# Note that '123' was coerced to an int and its value is 123

More details on pydantic's coercion logic can be found in Data Conversion.

Fields of a model can be accessed as normal attributes of the user object. The string '123' has been converted into an int as per the field type.

assert user.name == 'Jane Doe'

name wasn't set when user was initialized, so it has the default value.

assert user.model_fields_set == {'id'}

The fields which were supplied when user was initialized.

assert user.model_dump() == {'id': 123, 'name': 'Jane Doe'}

Either .model_dump() or dict(user) will provide a dict of fields, but .model_dump() can take numerous other arguments. (Note that dict(user) will not recursively convert nested models into dicts, but .model_dump() will.)

user.id = 321
assert user.id == 321

By default, models are mutable and field values can be changed through attribute assignment.
Model methods and properties

The example above only shows the tip of the iceberg of what models can do. Models possess the following methods and attributes:

- model_computed_fields: a dictionary of the computed fields of this model instance.
- model_construct(): a class method for creating models without running validation. See Creating models without validation.
- model_copy(): returns a copy (by default, shallow copy) of the model. See Serialization.
- model_dump(): returns a dictionary of the model's fields and values. See Serialization.
- model_dump_json(): returns a JSON string representation of model_dump(). See Serialization.
- model_extra: get extra fields set during validation.
- model_fields_set: set of fields which were set when the model instance was initialized.
- model_json_schema(): returns a dictionary representing the model as JSON Schema. See JSON Schema.
- model_modify_json_schema(): a method for how the "generic" properties of the JSON schema are populated. See JSON Schema.
- model_parametrized_name(): compute the class name for parametrizations of generic classes.
- model_post_init(): perform additional initialization after the model is initialized.
- model_rebuild(): rebuild the model schema.
- model_validate(): a utility for loading any object into a model with error handling if the object is not a dictionary. See Helper functions.
- model_validate_json(): a utility for validating the given JSON data against the Pydantic model. See Helper functions.
Note

See BaseModel for the class definition including a full list of methods and attributes.

Tip

See Changes to pydantic.BaseModel in the Migration Guide for details on changes from Pydantic V1.
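A minimal sketch, reusing the User model from the example above, showing a few of these methods in action:

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str = 'Jane Doe'

user = User(id=1)

# model_copy(): a shallow copy of the instance
print(user.model_copy())
#> id=1 name='Jane Doe'

# model_dump_json(): the JSON string form of model_dump()
print(user.model_dump_json())
#> {"id":1,"name":"Jane Doe"}

# model_json_schema(): the model as a JSON Schema dictionary
print(user.model_json_schema()['title'])
#> User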
Nested models

More complex hierarchical data structures can be defined using models themselves as types in annotations.

from typing import List, Optional

from pydantic import BaseModel

class Foo(BaseModel):
    count: int
    size: Optional[float] = None

class Bar(BaseModel):
    apple: str = 'x'
    banana: str = 'y'

class Spam(BaseModel):
    foo: Foo
    bars: List[Bar]

m = Spam(foo={'count': 4}, bars=[{'apple': 'x1'}, {'apple': 'x2'}])
print(m)
"""
foo=Foo(count=4, size=None) bars=[Bar(apple='x1', banana='y'), Bar(apple='x2', banana='y')]
"""
print(m.model_dump())
"""
{
    'foo': {'count': 4, 'size': None},
    'bars': [{'apple': 'x1', 'banana': 'y'}, {'apple': 'x2', 'banana': 'y'}],
}
"""
For self-referencing models, see postponed annotations.
Arbitrary class instances

(Formerly known as "ORM Mode"/from_orm.)

Pydantic models can also be created from arbitrary class instances by reading the instance attributes corresponding to the model field names. One common application of this functionality is integration with object-relational mappings (ORMs).

To do this, set the config attribute model_config['from_attributes'] = True. See Model Config and ConfigDict for more information.

The example here uses SQLAlchemy, but the same approach should work for any ORM.

from typing import List

from sqlalchemy import Column, Integer, String
from sqlalchemy.dialects.postgresql import ARRAY
from sqlalchemy.ext.declarative import declarative_base

from pydantic import BaseModel, ConfigDict, constr

Base = declarative_base()

class CompanyOrm(Base):
    __tablename__ = 'companies'

    id = Column(Integer, primary_key=True, nullable=False)
    public_key = Column(String(20), index=True, nullable=False, unique=True)
    name = Column(String(63), unique=True)
    domains = Column(ARRAY(String(255)))

class CompanyModel(BaseModel):
    model_config = ConfigDict(from_attributes=True)

    id: int
    public_key: constr(max_length=20)
    name: constr(max_length=63)
    domains: List[constr(max_length=255)]

co_orm = CompanyOrm(
    id=123,
    public_key='foobar',
    name='Testing',
    domains=['example.com', 'foobar.com'],
)
print(co_orm)
#> <__main__.CompanyOrm object at 0x0123456789ab>
co_model = CompanyModel.model_validate(co_orm)
print(co_model)
"""
id=123 public_key='foobar' name='Testing' domains=['example.com', 'foobar.com']
"""
Reserved names

You may want to name a Column after a reserved SQLAlchemy field. In that case, Field aliases will be convenient:

import typing

import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base

from pydantic import BaseModel, ConfigDict, Field

class MyModel(BaseModel):
    model_config = ConfigDict(from_attributes=True)
    metadata: typing.Dict[str, str] = Field(alias='metadata_')

Base = declarative_base()

class SQLModel(Base):
    __tablename__ = 'my_table'
    id = sa.Column('id', sa.Integer, primary_key=True)
    # 'metadata' is reserved by SQLAlchemy, hence the '_'
    metadata_ = sa.Column('metadata', sa.JSON)

sql_model = SQLModel(metadata_={'key': 'val'}, id=1)

pydantic_model = MyModel.model_validate(sql_model)

print(pydantic_model.model_dump())
#> {'metadata': {'key': 'val'}}
print(pydantic_model.model_dump(by_alias=True))
#> {'metadata_': {'key': 'val'}}

Note

The example above works because aliases have priority over field names for field population. Accessing SQLModel's metadata attribute would lead to a ValidationError.
Nested attributes

When using attributes to parse models, model instances will be created from both top-level attributes and deeper-nested attributes as appropriate.

Here is an example demonstrating the principle:

from typing import List

from pydantic import BaseModel, ConfigDict

class PetCls:
    def __init__(self, *, name: str, species: str):
        self.name = name
        self.species = species

class PersonCls:
    def __init__(self, *, name: str, age: float = None, pets: List[PetCls]):
        self.name = name
        self.age = age
        self.pets = pets

class Pet(BaseModel):
    model_config = ConfigDict(from_attributes=True)

    name: str
    species: str

class Person(BaseModel):
    model_config = ConfigDict(from_attributes=True)

    name: str
    age: float = None
    pets: List[Pet]

bones = PetCls(name='Bones', species='dog')
orion = PetCls(name='Orion', species='cat')
anna = PersonCls(name='Anna', age=20, pets=[bones, orion])
anna_model = Person.model_validate(anna)
print(anna_model)
"""
name='Anna' age=20.0 pets=[Pet(name='Bones', species='dog'), Pet(name='Orion', species='cat')]
"""
Error handling

Pydantic will raise ValidationError whenever it finds an error in the data it's validating.

A single exception of type ValidationError will be raised regardless of the number of errors found, and that ValidationError will contain information about all of the errors and how they happened.

See Error Handling for details on standard and custom errors.

As a demonstration:

from typing import List

from pydantic import BaseModel, ValidationError

class Model(BaseModel):
    list_of_ints: List[int]
    a_float: float

data = dict(
    list_of_ints=['1', 2, 'bad'],
    a_float='not a float',
)

try:
    Model(**data)
except ValidationError as e:
    print(e)
    """
    2 validation errors for Model
    list_of_ints.2
      Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='bad', input_type=str]
    a_float
      Input should be a valid number, unable to parse string as a number [type=float_parsing, input_value='not a float', input_type=str]
    """
Helper functions

Pydantic provides two classmethod helper functions on models for parsing data:

- model_validate: this is very similar to the __init__ method of the model, except it takes a dict rather than keyword arguments. If the object passed is not a dict a ValidationError will be raised.
- model_validate_json: this takes a str or bytes and parses it as json, then passes the result to model_validate.

from datetime import datetime
from typing import Optional

from pydantic import BaseModel, ValidationError

class User(BaseModel):
    id: int
    name: str = 'John Doe'
    signup_ts: Optional[datetime] = None

m = User.model_validate({'id': 123, 'name': 'James'})
print(m)
#> id=123 name='James' signup_ts=None

try:
    User.model_validate(['not', 'a', 'dict'])
except ValidationError as e:
    print(e)
    """
    1 validation error for User
      Input should be a valid dictionary or instance of User [type=model_type, input_value=['not', 'a', 'dict'], input_type=list]
    """

m = User.model_validate_json('{"id": 123, "name": "James"}')
print(m)
#> id=123 name='James' signup_ts=None

try:
    m = User.model_validate_json('{"id": 123, "name": 123}')
except ValidationError as e:
    print(e)
    """
    1 validation error for User
    name
      Input should be a valid string [type=string_type, input_value=123, input_type=int]
    """

try:
    m = User.model_validate_json('Invalid JSON')
except ValidationError as e:
    print(e)
    """
    1 validation error for User
      Invalid JSON: expected value at line 1 column 1 [type=json_invalid, input_value='Invalid JSON', input_type=str]
    """

If you want to validate serialized data in a format other than JSON, you should load the data into a dict yourself and then pass it to model_validate.

Note

Depending on the types and model configs involved, model_validate and model_validate_json may have different validation behavior. If you have data coming from a non-JSON source, but want the same validation behavior and errors you'd get from model_validate_json, our recommendation for now is to use model_validate_json(json.dumps(data)).
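A rough sketch of that recommendation, assuming the dict was loaded from some non-JSON source such as a YAML or TOML file:

import json

from pydantic import BaseModel

class Item(BaseModel):
    id: int
    name: str

# data loaded from a non-JSON source (YAML, TOML, CSV, ...)
data = {'id': '123', 'name': 'widget'}

# round-tripping through JSON gives the same validation behavior
# and errors as model_validate_json
item = Item.model_validate_json(json.dumps(data))
print(item)
#> id=123 name='widget'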
Creating models without validation

Pydantic also provides the model_construct() method, which allows models to be created without validation. This can be useful in at least a few cases:

- when working with complex data that is already known to be valid (for performance reasons)
- when one or more of the validator functions are non-idempotent, or
- when one or more of the validator functions have side effects that you don't want to be triggered.

Note

In Pydantic V2, the performance gap between BaseModel.__init__ and BaseModel.model_construct has been narrowed considerably. For simple models, calling BaseModel.__init__ may even be faster. If you are using model_construct for performance reasons, you may want to profile your use case before assuming that model_construct is faster.

Warning

model_construct() does not do any validation, meaning it can create models which are invalid. You should only ever use the model_construct() method with data which has already been validated, or that you definitely trust.
from pydantic import BaseModel

class User(BaseModel):
    id: int
    age: int
    name: str = 'John Doe'

original_user = User(id=123, age=32)

user_data = original_user.model_dump()
print(user_data)
#> {'id': 123, 'age': 32, 'name': 'John Doe'}
fields_set = original_user.model_fields_set
print(fields_set)
#> {'age', 'id'}

# ...
# pass user_data and fields_set to RPC or save to the database etc.
# ...

# you can then create a new instance of User without
# re-running validation which would be unnecessary at this point:
new_user = User.model_construct(_fields_set=fields_set, **user_data)
print(repr(new_user))
#> User(id=123, age=32, name='John Doe')
print(new_user.model_fields_set)
#> {'age', 'id'}

# construct can be dangerous, only use it with validated data!:
bad_user = User.model_construct(id='dog')
print(repr(bad_user))
#> User(id='dog', name='John Doe')
The _fields_set keyword argument to model_construct() is optional, but allows you to be more precise about which fields were originally set and which weren't. If it's omitted, model_fields_set will just be the keys of the data provided.

For example, in the example above, if _fields_set was not provided, new_user.model_fields_set would be {'id', 'age', 'name'}.

Note that for subclasses of RootModel, the root value can be passed to model_construct positionally, instead of using a keyword argument.

Here are some additional notes on the behavior of model_construct:
- When we say "no validation is performed" — this includes converting dicts to model instances. So if you have a field with a Model type, you will need to convert the inner dict to a model yourself before passing it to model_construct.
- In particular, the model_construct method does not support recursively constructing models from dicts (see the sketch after this list).
- If you do not pass keyword arguments for fields with defaults, the default values will still be used.
- For models with model_config['extra'] == 'allow', data not corresponding to fields will be correctly stored in the __pydantic_extra__ dict.
- For models with private attributes, the __pydantic_private__ dict will be initialized the same as it would be when calling __init__.
- When constructing an instance using model_construct(), no __init__ method from the model or any of its parent classes will be called, even when a custom __init__ method is defined.
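To make the first two points concrete, here is a minimal sketch (Inner and Outer are hypothetical models, not from the docs) showing that model_construct leaves a nested dict as a dict:

from pydantic import BaseModel

class Inner(BaseModel):
    a: int

class Outer(BaseModel):
    inner: Inner

# no validation: the nested dict is NOT converted to an Inner instance
outer = Outer.model_construct(inner={'a': 1})
print(type(outer.inner))
#> <class 'dict'>

# convert nested data yourself before calling model_construct
outer = Outer.model_construct(inner=Inner.model_construct(a=1))
print(type(outer.inner))
#> <class '__main__.Inner'>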
Generic models

Pydantic supports the creation of generic models to make it easier to reuse a common model structure.

In order to declare a generic model, you perform the following steps:

- Declare one or more typing.TypeVar instances to use to parameterize your model.
- Declare a pydantic model that inherits from pydantic.BaseModel and typing.Generic, where you pass the TypeVar instances as parameters to typing.Generic.
- Use the TypeVar instances as annotations where you will want to replace them with other types or pydantic models.

Here is an example using a generic BaseModel subclass to create an easily-reused HTTP response payload wrapper:
from typing import Generic, List, Optional, TypeVar

from pydantic import BaseModel, ValidationError

DataT = TypeVar('DataT')

class Error(BaseModel):
    code: int
    message: str

class DataModel(BaseModel):
    numbers: List[int]
    people: List[str]

class Response(BaseModel, Generic[DataT]):
    data: Optional[DataT] = None

data = DataModel(numbers=[1, 2, 3], people=[])
error = Error(code=404, message='Not found')

print(Response[int](data=1))
#> data=1
print(Response[str](data='value'))
#> data='value'
print(Response[str](data='value').model_dump())
#> {'data': 'value'}
print(Response[DataModel](data=data).model_dump())
#> {'data': {'numbers': [1, 2, 3], 'people': []}}

try:
    Response[int](data='value')
except ValidationError as e:
    print(e)
    """
    1 validation error for Response[int]
    data
      Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='value', input_type=str]
    """

If you set the model_config or make use of @field_validator or other Pydantic decorators in your generic model definition, they will be applied to parametrized subclasses in the same way as when inheriting from a BaseModel subclass. Any methods defined on your generic class will also be inherited.
Pydantic's generics also integrate properly with type checkers, so you get all the type checking you would expect if you were to declare a distinct type for each parametrization.
Note

Internally, Pydantic creates subclasses of BaseModel at runtime when generic models are parametrized. These classes are cached, so there should be minimal overhead introduced by the use of generic models.

To inherit from a generic model and preserve the fact that it is generic, the subclass must also inherit from typing.Generic:

from typing import Generic, TypeVar

from pydantic import BaseModel

TypeX = TypeVar('TypeX')

class BaseClass(BaseModel, Generic[TypeX]):
    X: TypeX

class ChildClass(BaseClass[TypeX], Generic[TypeX]):
    # Inherit from Generic[TypeX]
    pass

# Replace TypeX by int
print(ChildClass[int](X=1))
#> X=1
You can also create a generic subclass of a BaseModel that partially or fully replaces the type parameters in the superclass:

from typing import Generic, TypeVar

from pydantic import BaseModel

TypeX = TypeVar('TypeX')
TypeY = TypeVar('TypeY')
TypeZ = TypeVar('TypeZ')

class BaseClass(BaseModel, Generic[TypeX, TypeY]):
    x: TypeX
    y: TypeY

class ChildClass(BaseClass[int, TypeY], Generic[TypeY, TypeZ]):
    z: TypeZ

# Replace TypeY by str
print(ChildClass[str, int](x='1', y='y', z='3'))
#> x=1 y='y' z=3
If the name of the concrete subclasses is important, you can also override the default name generation:
from typing import Any, Generic, Tuple, Type, TypeVar

from pydantic import BaseModel

DataT = TypeVar('DataT')

class Response(BaseModel, Generic[DataT]):
    data: DataT

    @classmethod
    def model_parametrized_name(cls, params: Tuple[Type[Any], ...]) -> str:
        return f'{params[0].__name__.title()}Response'

print(repr(Response[int](data=1)))
#> IntResponse(data=1)
print(repr(Response[str](data='a')))
#> StrResponse(data='a')

Using the same TypeVar in nested models allows you to enforce typing relationships at different points in your model:
from typing import Generic, TypeVar

from pydantic import BaseModel, ValidationError

T = TypeVar('T')

class InnerT(BaseModel, Generic[T]):
    inner: T

class OuterT(BaseModel, Generic[T]):
    outer: T
    nested: InnerT[T]

nested = InnerT[int](inner=1)
print(OuterT[int](outer=1, nested=nested))
#> outer=1 nested=InnerT[int](inner=1)
try:
    nested = InnerT[str](inner='a')
    print(OuterT[int](outer='a', nested=nested))
except ValidationError as e:
    print(e)
    """
    2 validation errors for OuterT[int]
    outer
      Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='a', input_type=str]
    nested
      Input should be a valid dictionary or instance of InnerT[int] [type=model_type, input_value=InnerT[str](inner='a'), input_type=InnerT[str]]
    """

When using bound type parameters, and when leaving type parameters unspecified, Pydantic treats generic models similarly to how it treats built-in generic types like List and Dict:

- If you don't specify parameters before instantiating the generic model, they are treated as the bound of the TypeVar.
- If the TypeVars involved have no bounds, they are treated as Any.

Also, like List and Dict, any parameters specified using a TypeVar can later be substituted with concrete types.
Note

For serialization this means: when a TypeVar is constrained or bound using a parent model ParentModel and a child model ChildModel is used as a concrete value, Pydantic will serialize ChildModel as ParentModel. TypeVar needs to be wrapped inside SerializeAsAny for Pydantic to serialize ChildModel as ChildModel.

from typing import Generic, TypeVar

from pydantic import BaseModel, ValidationError

AT = TypeVar('AT')
BT = TypeVar('BT')

class Model(BaseModel, Generic[AT, BT]):
    a: AT
    b: BT

print(Model(a='a', b='a'))
#> a='a' b='a'

IntT = TypeVar('IntT', bound=int)
typevar_model = Model[int, IntT]
print(typevar_model(a=1, b=1))
#> a=1 b=1

try:
    typevar_model(a='a', b='a')
except ValidationError as exc:
    print(exc)
    """
    2 validation errors for Model[int, ~IntT]
    a
      Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='a', input_type=str]
    b
      Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='a', input_type=str]
    """

concrete_model = typevar_model[int]
print(concrete_model(a=1, b=1))
#> a=1 b=1
If a Pydantic model is used in a TypeVar constraint, SerializeAsAny can be used to serialize it using the concrete model instead of the model the TypeVar is bound to.

from typing import Generic, TypeVar

from pydantic import BaseModel, SerializeAsAny

class Model(BaseModel):
    a: int = 42

class DataModel(Model):
    b: int = 2
    c: int = 3

BoundT = TypeVar('BoundT', bound=Model)

class GenericModel(BaseModel, Generic[BoundT]):
    data: BoundT

class SerializeAsAnyModel(BaseModel, Generic[BoundT]):
    data: SerializeAsAny[BoundT]

data_model = DataModel()

print(GenericModel(data=data_model).model_dump())
#> {'data': {'a': 42}}

print(SerializeAsAnyModel(data=data_model).model_dump())
#> {'data': {'a': 42, 'b': 2, 'c': 3}}
Dynamic model creation

There are some occasions where it is desirable to create a model using runtime information to specify the fields. For this Pydantic provides the create_model function to allow models to be created on the fly:

from pydantic import BaseModel, create_model

DynamicFoobarModel = create_model(
    'DynamicFoobarModel', foo=(str, ...), bar=(int, 123)
)

class StaticFoobarModel(BaseModel):
    foo: str
    bar: int = 123

Here StaticFoobarModel and DynamicFoobarModel are identical.

Fields are defined by a tuple of the form (<type>, <default value>). The special keyword arguments __config__ and __base__ can be used to customise the new model. This includes extending a base model with extra fields.
from pydantic import BaseModel, create_model

class FooModel(BaseModel):
    foo: str
    bar: int = 123

BarModel = create_model(
    'BarModel',
    apple=(str, 'russet'),
    banana=(str, 'yellow'),
    __base__=FooModel,
)
print(BarModel)
#> <class 'pydantic.main.BarModel'>
print(BarModel.model_fields.keys())
#> dict_keys(['foo', 'bar', 'apple', 'banana'])
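The __base__ argument is shown above; __config__ works in a similar way. A minimal sketch (StrictModel is a hypothetical name), assuming you want the dynamic model to forbid extra fields:

from pydantic import ConfigDict, ValidationError, create_model

StrictModel = create_model(
    'StrictModel',
    foo=(str, ...),
    __config__=ConfigDict(extra='forbid'),
)

try:
    StrictModel(foo='bar', baz=1)
except ValidationError as e:
    print(e)
    """
    1 validation error for StrictModel
    baz
      Extra inputs are not permitted [type=extra_forbidden, input_value=1, input_type=int]
    """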
You can also add validators by passing a dict to the __validators__ argument.

from pydantic import ValidationError, create_model, field_validator

def username_alphanumeric(cls, v):
    assert v.isalnum(), 'must be alphanumeric'
    return v

validators = {
    'username_validator': field_validator('username')(username_alphanumeric)
}

UserModel = create_model(
    'UserModel', username=(str, ...), __validators__=validators
)

user = UserModel(username='scolvin')
print(user)
#> username='scolvin'

try:
    UserModel(username='scolvi%n')
except ValidationError as e:
    print(e)
    """
    1 validation error for UserModel
    username
      Assertion failed, must be alphanumeric [type=assertion_error, input_value='scolvi%n', input_type=str]
    """
Note

To pickle a dynamically created model:

- the model must be defined globally
- it must provide __module__ (a sketch follows below)
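A rough sketch of this, assuming the dynamic model is assigned to a module-level name that matches the name passed to create_model (PickleModel is a hypothetical name):

import pickle

from pydantic import create_model

# defined globally, with __module__ provided explicitly
PickleModel = create_model('PickleModel', foo=(str, ...), __module__=__name__)

original = PickleModel(foo='hello')
restored = pickle.loads(pickle.dumps(original))
print(restored)
#> foo='hello'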
RootModel and custom root types

Pydantic models can be defined with a "custom root type" by subclassing pydantic.RootModel.

The root type can be any type supported by Pydantic, and is specified by the generic parameter to RootModel. The root value can be passed to the model __init__ or model_validate via the first and only argument.
Here's an example of how this works:
from typing import Dict, List

from pydantic import RootModel

Pets = RootModel[List[str]]
PetsByName = RootModel[Dict[str, str]]

print(Pets(['dog', 'cat']))
#> root=['dog', 'cat']
print(Pets(['dog', 'cat']).model_dump_json())
#> ["dog","cat"]
print(Pets.model_validate(['dog', 'cat']))
#> root=['dog', 'cat']
print(Pets.model_json_schema())
"""
{'items': {'type': 'string'}, 'title': 'RootModel[List[str]]', 'type': 'array'}
"""

print(PetsByName({'Otis': 'dog', 'Milo': 'cat'}))
#> root={'Otis': 'dog', 'Milo': 'cat'}
print(PetsByName({'Otis': 'dog', 'Milo': 'cat'}).model_dump_json())
#> {"Otis":"dog","Milo":"cat"}
print(PetsByName.model_validate({'Otis': 'dog', 'Milo': 'cat'}))
#> root={'Otis': 'dog', 'Milo': 'cat'}

If you want to access items in the root field directly or to iterate over the items, you can implement custom __iter__ and __getitem__ functions, as shown in the following example.

from typing import List

from pydantic import RootModel

class Pets(RootModel):
    root: List[str]

    def __iter__(self):
        return iter(self.root)

    def __getitem__(self, item):
        return self.root[item]

pets = Pets.model_validate(['dog', 'cat'])
print(pets[0])
#> dog
print([pet for pet in pets])
#> ['dog', 'cat']

You can also create subclasses of the parametrized root model directly:

from typing import List

from pydantic import RootModel

class Pets(RootModel[List[str]]):
    root: List[str]

    def describe(self) -> str:
        return f'Pets: {", ".join(self.root)}'

my_pets = Pets.model_validate(['dog', 'cat'])

print(my_pets.describe())
#> Pets: dog, cat
Faux immutability

Models can be configured to be immutable via model_config['frozen'] = True. When this is set, attempting to change the values of instance attributes will raise errors. See Model Config for more details.

Note

This behavior was achieved in Pydantic V1 via the config setting allow_mutation = False. This config flag is deprecated in Pydantic V2, and has been replaced with frozen.
Warning
Immutability in Python is never strict. If developers are determined/stupid they can always modify a so-called "immutable" object.
from pydantic import BaseModel, ConfigDict, ValidationError

class FooBarModel(BaseModel):
    model_config = ConfigDict(frozen=True)

    a: str
    b: dict

foobar = FooBarModel(a='hello', b={'apple': 'pear'})

try:
    foobar.a = 'different'
except ValidationError as e:
    print(e)
    """
    1 validation error for FooBarModel
    a
      Instance is frozen [type=frozen_instance, input_value='different', input_type=str]
    """

print(foobar.a)
#> hello
print(foobar.b)
#> {'apple': 'pear'}
foobar.b['apple'] = 'grape'
print(foobar.b)
#> {'apple': 'grape'}

Trying to change a caused an error, and a remains unchanged. However, the dict b is mutable, and the immutability of foobar doesn't stop b from being changed.
Abstract base classes

Pydantic models can be used alongside Python's Abstract Base Classes (ABCs).

import abc

from pydantic import BaseModel

class FooBarModel(BaseModel, abc.ABC):
    a: str
    b: int

    @abc.abstractmethod
    def my_abstract_method(self):
        pass
Field ordering

Field order affects models in the following ways:

- field order is preserved in the model schema
- field order is preserved in validation errors
- field order is preserved by .model_dump() and .model_dump_json() etc.

from pydantic import BaseModel, ValidationError

class Model(BaseModel):
    a: int
    b: int = 2
    c: int = 1
    d: int = 0
    e: float

print(Model.model_fields.keys())
#> dict_keys(['a', 'b', 'c', 'd', 'e'])
m = Model(e=2, a=1)
print(m.model_dump())
#> {'a': 1, 'b': 2, 'c': 1, 'd': 0, 'e': 2.0}
try:
    Model(a='x', b='x', c='x', d='x', e='x')
except ValidationError as err:
    error_locations = [e['loc'] for e in err.errors()]

print(error_locations)
#> [('a',), ('b',), ('c',), ('d',), ('e',)]
Required fields

To declare a field as required, you may declare it using just an annotation, or you may use Ellipsis/... as the value:

from pydantic import BaseModel, Field

class Model(BaseModel):
    a: int
    b: int = ...
    c: int = Field(...)

Where Field refers to the field function.

Here a, b and c are all required. However, this use of b: int = ... does not work properly with mypy, and as of v1.0 should be avoided in most cases.
Note

In Pydantic V1, fields annotated with Optional or Any would be given an implicit default of None even if no default was explicitly specified. This behavior has changed in Pydantic V2, and there are no longer any type annotations that will result in a field having an implicit default value.
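A small sketch of the V2 behavior described above: an Optional field without a default is still required, it simply also accepts None as a value.

from typing import Optional

from pydantic import BaseModel, ValidationError

class Model(BaseModel):
    a: Optional[int]  # required in V2, but None is an allowed value

try:
    Model()
except ValidationError as e:
    print(e)
    """
    1 validation error for Model
    a
      Field required [type=missing, input_value={}, input_type=dict]
    """

print(Model(a=None))
#> a=None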
Fields with non-hashable default values

A common source of bugs in Python is to use a mutable object as a default value for a function or method argument, as the same instance ends up being reused in each call.

The dataclasses module actually raises an error in this case, indicating that you should use the default_factory argument to dataclasses.field.

Pydantic also supports the use of a default_factory for non-hashable default values, but it is not required. In the event that the default value is not hashable, Pydantic will deepcopy the default value when creating each instance of the model:

from typing import Dict, List

from pydantic import BaseModel

class Model(BaseModel):
    item_counts: List[Dict[str, int]] = [{}]

m1 = Model()
m1.item_counts[0]['a'] = 1
print(m1.item_counts)
#> [{'a': 1}]

m2 = Model()
print(m2.item_counts)
#> [{}]
Fields with dynamic default values

When declaring a field with a default value, you may want it to be dynamic (i.e. different for each model). To do this, you may want to use a default_factory.

Here is an example:

from datetime import datetime
from uuid import UUID, uuid4

from pydantic import BaseModel, Field

class Model(BaseModel):
    uid: UUID = Field(default_factory=uuid4)
    updated: datetime = Field(default_factory=datetime.utcnow)

m1 = Model()
m2 = Model()
assert m1.uid != m2.uid
assert m1.updated != m2.updated

You can find more information in the documentation of the Field function.
Automatically excluded attributes

Class vars

Attributes annotated with typing.ClassVar are properly treated by Pydantic as class variables, and will not become fields on model instances:

from typing import ClassVar

from pydantic import BaseModel

class Model(BaseModel):
    x: int = 2
    y: ClassVar[int] = 1

m = Model()
print(m)
#> x=2
print(Model.y)
#> 1
Private model attributes

Attributes whose name has a leading underscore are not treated as fields by Pydantic, and are not included in the model schema. Instead, these are converted into a "private attribute" which is not validated or even set during calls to __init__, model_validate, etc.

Here is an example of usage:

from datetime import datetime
from random import randint

from pydantic import BaseModel, PrivateAttr

class TimeAwareModel(BaseModel):
    _processed_at: datetime = PrivateAttr(default_factory=datetime.now)
    _secret_value: str

    def __init__(self, **data):
        super().__init__(**data)
        # this could also be done with default_factory
        self._secret_value = randint(1, 5)

m = TimeAwareModel()
print(m._processed_at)
#> 2032-01-02 03:04:05.000006
print(m._secret_value)
#> 3

Private attribute names must start with underscore to prevent conflicts with model fields. However, dunder names (such as __attr__) are not supported.
Data conversion

Pydantic may cast input data to force it to conform to model field types, and in some cases this may result in a loss of information. For example:

from pydantic import BaseModel

class Model(BaseModel):
    a: int
    b: float
    c: str

print(Model(a=3.000, b='2.72', c=b'binary data').model_dump())
#> {'a': 3, 'b': 2.72, 'c': 'binary data'}
This is a deliberate decision of Pydantic, and is frequently the most useful approach. See here for a longer discussion on the subject.
Nevertheless, strict type checking is also supported.
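For instance, a minimal sketch of strict checking enabled model-wide via ConfigDict(strict=True):

from pydantic import BaseModel, ConfigDict, ValidationError

class StrictModel(BaseModel):
    model_config = ConfigDict(strict=True)

    a: int

try:
    StrictModel(a='3')
except ValidationError as e:
    print(e)
    """
    1 validation error for StrictModel
    a
      Input should be a valid integer [type=int_type, input_value='3', input_type=str]
    """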
Model signature

All Pydantic models will have their signature generated based on their fields:

import inspect

from pydantic import BaseModel, Field

class FooModel(BaseModel):
    id: int
    name: str = None
    description: str = 'Foo'
    apple: int = Field(alias='pear')

print(inspect.signature(FooModel))
#> (*, id: int, name: str = None, description: str = 'Foo', pear: int) -> None

An accurate signature is useful for introspection purposes and libraries like FastAPI or hypothesis.

The generated signature will also respect custom __init__ functions:

import inspect

from pydantic import BaseModel

class MyModel(BaseModel):
    id: int
    info: str = 'Foo'

    def __init__(self, id: int = 1, *, bar: str, **data) -> None:
        """My custom init!"""
        super().__init__(id=id, bar=bar, **data)

print(inspect.signature(MyModel))
#> (id: int = 1, *, bar: str, info: str = 'Foo') -> None

To be included in the signature, a field's alias or name must be a valid Python identifier. Pydantic will prioritize a field's alias over its name when generating the signature, but may use the field name if the alias is not a valid Python identifier.

If a field's alias and name are both not valid identifiers (which may be possible through exotic use of create_model), a **data argument will be added. In addition, the **data argument will always be present in the signature if model_config['extra'] == 'allow'.
Structural pattern matching

Pydantic supports structural pattern matching for models, as introduced by PEP 636 in Python 3.10.

from pydantic import BaseModel

class Pet(BaseModel):
    name: str
    species: str

a = Pet(name='Bones', species='dog')

match a:
    # match `species` to 'dog', declare and initialize `dog_name`
    case Pet(species='dog', name=dog_name):
        print(f'{dog_name} is a dog')
        #> Bones is a dog
    # default case
    case _:
        print('No dog matched')
Note
A match-case statement may seem as if it creates a new model, but don't be fooled; it is just syntactic sugar for getting an attribute and either comparing it or declaring and initializing it.
Attribute copies

In many cases, arguments passed to the constructor will be copied in order to perform validation and, where necessary, coercion.

In this example, note that the ID of the list changes after the class is constructed because it has been copied during validation:

from typing import List

from pydantic import BaseModel

class C1:
    arr = []

    def __init__(self, in_arr):
        self.arr = in_arr

class C2(BaseModel):
    arr: List[int]

arr_orig = [1, 9, 10, 3]

c1 = C1(arr_orig)
c2 = C2(arr=arr_orig)
print('id(c1.arr) == id(c2.arr):', id(c1.arr) == id(c2.arr))
#> id(c1.arr) == id(c2.arr): False

Note

There are some situations where Pydantic does not copy attributes, such as when passing models — we use the model as is. You can override this behaviour by setting model_config['revalidate_instances'] = 'always'.
Extra fields

By default, Pydantic models won't error when you provide data for unrecognized fields; it will just be ignored:

from pydantic import BaseModel

class Model(BaseModel):
    x: int

m = Model(x=1, y='a')
assert m.model_dump() == {'x': 1}
If you want this to raise an error, you can achieve this via model_config:

from pydantic import BaseModel, ConfigDict, ValidationError

class Model(BaseModel):
    x: int

    model_config = ConfigDict(extra='forbid')

try:
    Model(x=1, y='a')
except ValidationError as exc:
    print(exc)
    """
    1 validation error for Model
    y
      Extra inputs are not permitted [type=extra_forbidden, input_value='a', input_type=str]
    """

To instead preserve any extra data provided, you can set extra='allow'. The extra fields will then be stored in BaseModel.__pydantic_extra__:

from pydantic import BaseModel, ConfigDict

class Model(BaseModel):
    x: int

    model_config = ConfigDict(extra='allow')

m = Model(x=1, y='a')
assert m.__pydantic_extra__ == {'y': 'a'}

By default, no validation will be applied to these extra items, but you can set a type for the values by overriding the type annotation for __pydantic_extra__:

from typing import Dict

from pydantic import BaseModel, ConfigDict, ValidationError

class Model(BaseModel):
    __pydantic_extra__: Dict[str, int]

    x: int

    model_config = ConfigDict(extra='allow')

try:
    Model(x=1, y='a')
except ValidationError as exc:
    print(exc)
    """
    1 validation error for Model
    y
      Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='a', input_type=str]
    """

m = Model(x=1, y='2')
assert m.x == 1
assert m.y == 2
assert m.model_dump() == {'x': 1, 'y': 2}
assert m.__pydantic_extra__ == {'y': 2}
The same configurations apply to TypedDict and dataclass, except the config is controlled by setting the __pydantic_config__ attribute of the class to a valid ConfigDict.
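For example, a minimal sketch with a TypedDict validated through a TypeAdapter (TypedDict is imported from typing_extensions here for compatibility with older Python versions):

from typing_extensions import TypedDict

from pydantic import ConfigDict, TypeAdapter, ValidationError

class Item(TypedDict):
    __pydantic_config__ = ConfigDict(extra='forbid')

    x: int

try:
    TypeAdapter(Item).validate_python({'x': 1, 'y': 'a'})
except ValidationError as exc:
    # the unrecognized key is rejected:
    # y
    #   Extra inputs are not permitted [type=extra_forbidden, ...]
    print(exc)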