Overview

Documentation for version: v1.10.18

Data validation and settings management using Python type annotations.

pydantic enforces type hints at runtime, and provides user friendly errors when data is invalid.

Define how data should be in pure, canonical Python; validate it with pydantic.

Example

Python 3.7 and above:

from datetime import datetime
from typing import List, Optional
from pydantic import BaseModel


class User(BaseModel):
    id: int
    name = 'John Doe'
    signup_ts: Optional[datetime] = None
    friends: List[int] = []


external_data = {
    'id': '123',
    'signup_ts': '2019-06-01 12:22',
    'friends': [1, 2, '3'],
}
user = User(**external_data)
print(user.id)
#> 123
print(repr(user.signup_ts))
#> datetime.datetime(2019, 6, 1, 12, 22)
print(user.friends)
#> [1, 2, 3]
print(user.dict())
"""
{
    'id': 123,
    'signup_ts': datetime.datetime(2019, 6, 1, 12, 22),
    'friends': [1, 2, 3],
    'name': 'John Doe',
}
"""

Python 3.9 and above:

from datetime import datetime
from typing import Optional
from pydantic import BaseModel


class User(BaseModel):
    id: int
    name = 'John Doe'
    signup_ts: Optional[datetime] = None
    friends: list[int] = []


external_data = {
    'id': '123',
    'signup_ts': '2019-06-01 12:22',
    'friends': [1, 2, '3'],
}
user = User(**external_data)
print(user.id)
#> 123
print(repr(user.signup_ts))
#> datetime.datetime(2019, 6, 1, 12, 22)
print(user.friends)
#> [1, 2, 3]
print(user.dict())
"""
{
    'id': 123,
    'signup_ts': datetime.datetime(2019, 6, 1, 12, 22),
    'friends': [1, 2, 3],
    'name': 'John Doe',
}
"""

Python 3.10 and above:

from datetime import datetime
from pydantic import BaseModel


class User(BaseModel):
    id: int
    name = 'John Doe'
    signup_ts: datetime | None = None
    friends: list[int] = []


external_data = {
    'id': '123',
    'signup_ts': '2019-06-01 12:22',
    'friends': [1, 2, '3'],
}
user = User(**external_data)
print(user.id)
#> 123
print(repr(user.signup_ts))
#> datetime.datetime(2019, 6, 1, 12, 22)
print(user.friends)
#> [1, 2, 3]
print(user.dict())
"""
{
    'id': 123,
    'signup_ts': datetime.datetime(2019, 6, 1, 12, 22),
    'friends': [1, 2, 3],
    'name': 'John Doe',
}
"""

(This script is complete, it should run "as is")

What's going on here:

  • id is of type int; the annotation-only declaration tells pydantic that this field is required. Strings, bytes or floats will be coerced to ints if possible; otherwise an exception will be raised.
  • name is inferred as a string from the provided default; because it has a default, it is not required.
  • signup_ts is a datetime field which is not required (and takes the value None if it's not supplied). pydantic will process either a unix timestamp int (e.g. 1496498400) or a string representing the date & time.
  • friends uses Python's typing system, and requires a list of integers. As with id, integer-like objects will be converted to integers.
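
The coercion rules in the list above can be tried out with a short sketch. It assumes a trimmed copy of the User model (the defaulted name field is left out for brevity), and the input values are made up for illustration:

```python
from datetime import datetime
from typing import List, Optional

from pydantic import BaseModel, ValidationError


# Trimmed copy of the User model from the example (name is omitted here).
class User(BaseModel):
    id: int
    signup_ts: Optional[datetime] = None
    friends: List[int] = []


# id arrives as a string, signup_ts as a unix timestamp, friends as mixed types.
user = User(id='42', signup_ts=1496498400, friends=['7', 8])
print(user.id)
#> 42
print(user.signup_ts.year)
#> 2017
print(user.friends)
#> [7, 8]

# signup_ts is not required, so leaving it out keeps the default of None.
minimal = User(id=1)
print(minimal.signup_ts)
#> None

# A value that cannot be coerced to an int raises a ValidationError.
try:
    User(id='not a number')
    id_rejected = False
except ValidationError:
    id_rejected = True
print(id_rejected)
#> True
```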

If validation fails pydantic will raise an error with a breakdown of what was wrong:

from pydantic import ValidationError

try:
    User(signup_ts='broken', friends=[1, 2, 'not number'])
except ValidationError as e:
    print(e.json())

(This script requires User from previous example)

Outputs:

[
  {
    "loc": [
      "id"
    ],
    "msg": "field required",
    "type": "value_error.missing"
  },
  {
    "loc": [
      "signup_ts"
    ],
    "msg": "invalid datetime format",
    "type": "value_error.datetime"
  },
  {
    "loc": [
      "friends",
      2
    ],
    "msg": "value is not a valid integer",
    "type": "type_error.integer"
  }
]
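
Besides the JSON report, the same breakdown can be inspected programmatically. The following is a minimal sketch that uses ValidationError.errors() on a trimmed copy of the User model; the failed_locations variable is a name introduced here for illustration:

```python
from datetime import datetime
from typing import List, Optional

from pydantic import BaseModel, ValidationError


# Trimmed copy of the User model from the example (name is omitted here).
class User(BaseModel):
    id: int
    signup_ts: Optional[datetime] = None
    friends: List[int] = []


failed_locations = []
try:
    User(signup_ts='broken', friends=[1, 2, 'not number'])
except ValidationError as e:
    # errors() returns one dict per problem, with 'loc', 'msg' and 'type' keys.
    for error in e.errors():
        failed_locations.append(error['loc'])
        print(error['loc'], error['msg'])

# Each location is a tuple: a field name, plus an index for list items.
print(failed_locations)
#> [('id',), ('signup_ts',), ('friends', 2)]
```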

Rationale

So pydantic uses some cool new language features, but why should I actually go and use it?

plays nicely with your IDE/linter/brain
There's no new schema definition micro-language to learn. If you know how to use Python type hints, you know how to use pydantic. Data structures are just instances of classes you define with type annotations, so auto-completion, linting, mypy, IDEs (especially PyCharm), and your intuition should all work properly with your validated data.
dual use
pydantic's BaseSettings class allows pydantic to be used in both a "validate this request data" context and in a "load my system settings" context. The main differences are that system settings can be read from environment variables, and more complex objects like DSNs and Python objects are often required.
fast
pydantic has always taken performance seriously: most of the library is compiled with Cython, giving a ~50% speedup, and it is generally as fast as or faster than most similar libraries.
validate complex structures
Recursive pydantic models, typing's standard types (e.g. List, Tuple, Dict) and validators allow complex data schemas to be clearly and easily defined, validated, and parsed.
extensible
pydantic allows custom data types to be defined, and validation can be extended with methods on a model decorated with the validator decorator.
dataclasses integration
As well as BaseModel, pydantic provides a dataclass decorator which creates (almost) vanilla Python dataclasses with input data parsing and validation.
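
The "validate complex structures", "extensible" and "dataclasses integration" points can be sketched together. The Address, Customer and Point classes and their field names are hypothetical, chosen only for illustration:

```python
from typing import Dict, List

from pydantic import BaseModel, ValidationError, validator
from pydantic.dataclasses import dataclass


# A nested model: Customer holds a list of Address models.
class Address(BaseModel):
    city: str
    postcode: str


class Customer(BaseModel):
    name: str
    addresses: List[Address] = []
    tags: Dict[str, int] = {}

    # Extra validation via a method decorated with the validator decorator.
    @validator('name')
    def name_must_not_be_blank(cls, value):
        if not value.strip():
            raise ValueError('name must not be blank')
        return value.strip()


# Plain dicts are parsed into Address instances; '3' is coerced to an int.
customer = Customer(
    name='  Ada  ',
    addresses=[{'city': 'London', 'postcode': 'N1 9GU'}],
    tags={'orders': '3'},
)
print(customer.name)
#> Ada
print(customer.addresses[0].city)
#> London
print(customer.tags)
#> {'orders': 3}

# The custom validator rejects a blank name.
try:
    Customer(name='   ')
    blank_rejected = False
except ValidationError:
    blank_rejected = True
print(blank_rejected)
#> True


# The pydantic dataclass decorator validates input in the same way.
@dataclass
class Point:
    x: int
    y: int = 0


point = Point(x='10')
print(point.x, point.y)
#> 10 0
```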

Using Pydantic

Hundreds of organisations and packages are using pydantic, including:

FastAPI
a high-performance API framework, easy to learn, fast to code and ready for production, based on pydantic and Starlette.
Project Jupyter
developers of the Jupyter notebook are using pydantic for subprojects, through the FastAPI-based Jupyter server Jupyverse, and for FPS's configuration management.
Microsoft
are using pydantic (via FastAPI) for numerous services, some of which are "getting integrated into the core Windows product and some Office products."
Amazon Web Services
are using pydantic in gluon-ts, an open-source probabilistic time series modeling library.
The NSA
are using pydantic in WALKOFF, an open-source automation framework.
Uber
are using pydantic in Ludwig, an open-source TensorFlow wrapper.
Cuenca
are a Mexican neobank that uses pydantic for several internal tools (including API validation) and for open source projects like stpmex, which is used to process real-time, 24/7, inter-bank transfers in Mexico.
The Molecular Sciences Software Institute
are using pydantic in QCFractal, a massively distributed compute framework for quantum chemistry.
Reach
trusts pydantic (via FastAPI) and arq (Samuel's excellent asynchronous task queue) to reliably power multiple mission-critical microservices.
Robusta.dev
are using pydantic to automate Kubernetes troubleshooting and maintenance. For example, their open source tools to debug and profile Python applications on Kubernetes use pydantic models.

For a more comprehensive list of open-source projects using pydantic see the list of dependents on github.

Discussion of Pydantic

Podcasts and videos discussing pydantic.

Talk Python To Me
Michael Kennedy and Samuel Colvin, the creator of pydantic, dive into the history of pydantic and its many uses and benefits.
Podcast.__init__
Discussion about where pydantic came from and ideas for where it might go next with Samuel Colvin, the creator of pydantic.
Python Bytes Podcast
"This is a sweet simple framework that solves some really nice problems... Data validations and settings management using Python type annotations, and it's the Python type annotations that makes me really extra happy... It works automatically with all the IDE's you already have." --Michael Kennedy
Python pydantic Introduction – Give your data classes super powers
a talk by Alexander Hultnér originally for the Python Pizza Conference introducing new users to pydantic and walking through the core features of pydantic.