Skip to content

Dataclasses

If you don't want to use Pydantic's BaseModel you can instead get the same data validation on standard dataclasses (introduced in Python 3.7).

from datetime import datetime

from pydantic.dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str = 'John Doe'
    signup_ts: datetime = None


user = User(id='42', signup_ts='2032-06-21T12:00')
print(user)
"""
User(id=42, name='John Doe', signup_ts=datetime.datetime(2032, 6, 21, 12, 0))
"""

Note

Keep in mind that pydantic.dataclasses.dataclass is not a replacement for pydantic.BaseModel. pydantic.dataclasses.dataclass provides a similar functionality to dataclasses.dataclass with the addition of Pydantic validation. There are cases where subclassing pydantic.BaseModel is the better choice.

For more information and discussion see pydantic/pydantic#710.

Some differences between Pydantic dataclasses and BaseModel include:

You can use all the standard Pydantic field types. Note, however, that arguments passed to constructor will be copied in order to perform validation and, where necessary coercion.

To perform validation or generate a JSON schema on a Pydantic dataclass, you should now wrap the dataclass with a TypeAdapter and make use of its methods.

Fields that require a default_factory can be specified by either a pydantic.Field or a dataclasses.field.

import dataclasses
from typing import List, Optional

from pydantic import Field, TypeAdapter
from pydantic.dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str = 'John Doe'
    friends: List[int] = dataclasses.field(default_factory=lambda: [0])
    age: Optional[int] = dataclasses.field(
        default=None,
        metadata=dict(title='The age of the user', description='do not lie!'),
    )
    height: Optional[int] = Field(None, title='The height in cm', ge=50, le=300)


user = User(id='42')
print(TypeAdapter(User).json_schema())
"""
{
    'properties': {
        'id': {'title': 'Id', 'type': 'integer'},
        'name': {'default': 'John Doe', 'title': 'Name', 'type': 'string'},
        'friends': {
            'items': {'type': 'integer'},
            'title': 'Friends',
            'type': 'array',
        },
        'age': {
            'anyOf': [{'type': 'integer'}, {'type': 'null'}],
            'default': None,
            'description': 'do not lie!',
            'title': 'The age of the user',
        },
        'height': {
            'anyOf': [
                {'maximum': 300, 'minimum': 50, 'type': 'integer'},
                {'type': 'null'},
            ],
            'default': None,
            'title': 'The height in cm',
        },
    },
    'required': ['id'],
    'title': 'User',
    'type': 'object',
}
"""

pydantic.dataclasses.dataclass's arguments are the same as the standard decorator, except one extra keyword argument config which has the same meaning as model_config.

Warning

After v1.2, The Mypy plugin must be installed to type check pydantic dataclasses.

For more information about combining validators with dataclasses, see dataclass validators.

Dataclass config

If you want to modify the config like you would with a BaseModel, you have two options:

  • Apply config to the dataclass decorator as a dict
  • Use ConfigDict as the config
from pydantic import ConfigDict
from pydantic.dataclasses import dataclass


# Option 1 - use directly a dict
# Note: `mypy` will still raise typo error
@dataclass(config=dict(validate_assignment=True))  # (1)!
class MyDataclass1:
    a: int


# Option 2 - use `ConfigDict`
# (same as before at runtime since it's a `TypedDict` but with intellisense)
@dataclass(config=ConfigDict(validate_assignment=True))
class MyDataclass2:
    a: int
  1. You can read more about validate_assignment in model_config.

Note

Pydantic dataclasses do not support extra='allow', where extra fields passed to the initializer would be stored as extra attributes on the dataclass.

extra='ignore' is still supported for the purpose of ignoring unexpected fields while parsing data; they just won't be stored on the instance.

Nested dataclasses

Nested dataclasses are supported both in dataclasses and normal models.

from pydantic import AnyUrl
from pydantic.dataclasses import dataclass


@dataclass
class NavbarButton:
    href: AnyUrl


@dataclass
class Navbar:
    button: NavbarButton


navbar = Navbar(button={'href': 'https://example.com'})
print(navbar)
#> Navbar(button=NavbarButton(href=Url('https://example.com/')))

When used as fields, dataclasses (Pydantic or vanilla) should use dicts as validation inputs.

Stdlib dataclasses and Pydantic dataclasses

Inherit from stdlib dataclasses

Stdlib dataclasses (nested or not) can also be inherited and Pydantic will automatically validate all the inherited fields.

import dataclasses

import pydantic


@dataclasses.dataclass
class Z:
    z: int


@dataclasses.dataclass
class Y(Z):
    y: int = 0


@pydantic.dataclasses.dataclass
class X(Y):
    x: int = 0


foo = X(x=b'1', y='2', z='3')
print(foo)
#> X(z=3, y=2, x=1)

try:
    X(z='pika')
except pydantic.ValidationError as e:
    print(e)
    """
    1 validation error for X
    z
      Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='pika', input_type=str]
    """

Use of stdlib dataclasses with BaseModel

Bear in mind that stdlib dataclasses (nested or not) are automatically converted into Pydantic dataclasses when mixed with BaseModel! Furthermore the generated Pydantic dataclass will have the exact same configuration (order, frozen, ...) as the original one.

import dataclasses
from datetime import datetime
from typing import Optional

from pydantic import BaseModel, ValidationError


@dataclasses.dataclass(frozen=True)
class User:
    name: str


@dataclasses.dataclass
class File:
    filename: str
    last_modification_time: Optional[datetime] = None


class Foo(BaseModel):
    file: File
    user: Optional[User] = None


file = File(
    filename=['not', 'a', 'string'],
    last_modification_time='2020-01-01T00:00',
)  # nothing is validated as expected
print(file)
"""
File(filename=['not', 'a', 'string'], last_modification_time='2020-01-01T00:00')
"""

try:
    Foo(file=file)
except ValidationError as e:
    print(e)
    """
    1 validation error for Foo
    file.filename
      Input should be a valid string [type=string_type, input_value=['not', 'a', 'string'], input_type=list]
    """

foo = Foo(file=File(filename='myfile'), user=User(name='pika'))
try:
    foo.user.name = 'bulbi'
except dataclasses.FrozenInstanceError as e:
    print(e)
    #> cannot assign to field 'name'

Use custom types

Since stdlib dataclasses are automatically converted to add validation, using custom types may cause some unexpected behavior. In this case you can simply add arbitrary_types_allowed in the config!

import dataclasses

from pydantic import BaseModel, ConfigDict
from pydantic.errors import PydanticSchemaGenerationError


class ArbitraryType:
    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return f'ArbitraryType(value={self.value!r})'


@dataclasses.dataclass
class DC:
    a: ArbitraryType
    b: str


# valid as it is a builtin dataclass without validation
my_dc = DC(a=ArbitraryType(value=3), b='qwe')

try:

    class Model(BaseModel):
        dc: DC
        other: str

    # invalid as it is now a pydantic dataclass
    Model(dc=my_dc, other='other')
except PydanticSchemaGenerationError as e:
    print(e.message)
    """
    Unable to generate pydantic-core schema for <class '__main__.ArbitraryType'>. Set `arbitrary_types_allowed=True` in the model_config to ignore this error or implement `__get_pydantic_core_schema__` on your type to fully support it.

    If you got this error by calling handler(<some type>) within `__get_pydantic_core_schema__` then you likely need to call `handler.generate_schema(<some type>)` since we do not call `__get_pydantic_core_schema__` on `<some type>` otherwise to avoid infinite recursion.
    """


class Model(BaseModel):
    model_config = ConfigDict(arbitrary_types_allowed=True)

    dc: DC
    other: str


m = Model(dc=my_dc, other='other')
print(repr(m))
#> Model(dc=DC(a=ArbitraryType(value=3), b='qwe'), other='other')

Initialization hooks

When you initialize a dataclass, it is possible to execute code before or after validation with the help of the @model_validator decorator mode parameter.

from typing import Any, Dict

from pydantic import model_validator
from pydantic.dataclasses import dataclass


@dataclass
class Birth:
    year: int
    month: int
    day: int


@dataclass
class User:
    birth: Birth

    @model_validator(mode='before')
    def pre_root(cls, values: Dict[str, Any]) -> Dict[str, Any]:
        print(values)
        #> ArgsKwargs((), {'birth': {'year': 1995, 'month': 3, 'day': 2}})
        return values

    @model_validator(mode='after')
    def post_root(self) -> 'User':
        print(self)
        #> User(birth=Birth(year=1995, month=3, day=2))
        return self

    def __post_init__(self):
        print(self.birth)
        #> Birth(year=1995, month=3, day=2)


user = User(**{'birth': {'year': 1995, 'month': 3, 'day': 2}})

The __post_init__ in Pydantic dataclasses is called after validation, rather than before.

from dataclasses import InitVar
from pathlib import Path
from typing import Optional

from pydantic.dataclasses import dataclass


@dataclass
class PathData:
    path: Path
    base_path: InitVar[Optional[Path]]

    def __post_init__(self, base_path):
        print(f'Received path={self.path!r}, base_path={base_path!r}')
        #> Received path=PosixPath('world'), base_path=PosixPath('/hello')
        if base_path is not None:
            self.path = base_path / self.path


path_data = PathData('world', base_path='/hello')
# Received path='world', base_path='/hello'
assert path_data.path == Path('/hello/world')

Difference with stdlib dataclasses

Note that the dataclasses.dataclass from Python stdlib implements only the __post_init__ method since it doesn't run a validation step.

When substituting usage of dataclasses.dataclass with pydantic.dataclasses.dataclass, it is recommended to move the code executed in the __post_init__ to methods decorated with model_validator.

JSON dumping

Pydantic dataclasses do not feature a .model_dump_json() function. To dump them as JSON, you will need to make use of the RootModel as follows:

import dataclasses
from typing import List

from pydantic import RootModel
from pydantic.dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str = 'John Doe'
    friends: List[int] = dataclasses.field(default_factory=lambda: [0])


user = User(id='42')
print(RootModel[User](User(id='42')).model_dump_json(indent=4))

JSON output:

{
  "id": 42,
  "name": "John Doe",
  "friends": [
    0
  ]
}