When a developer chooses Python, Django, or Django Rest Framework, it's usually not because of blazing fast performance. Python has always been the "comfortable" choice, the language you pick when you care more about ergonomics than about shaving a few microseconds off some process.
There is nothing wrong with ergonomics. Most projects don't really need a microsecond performance boost, but they do need to ship quality code fast.
None of this means performance doesn't matter. As this story taught us, major performance boosts can be gained with just a little attention and a few small changes.
Model Serializer Performance
A while back we noticed very poor performance from one of our main API endpoints. The endpoint fetched data from a very large table, so we naturally assumed that the problem must be in the database.
When we noticed that even small data sets get poor performance, we started looking into other parts of the app. This journey eventually led us to Django Rest Framework (DRF) serializers.
Versions
In the benchmarks we used Python 3.7, Django 2.1.1 and Django Rest Framework 3.9.4.
Simple Function
Serializers are used for transforming objects into data, and data back into objects. Serializing is essentially a simple function, so let's write one that accepts a User instance and returns a dict:
from typing import Dict, Any
from django.contrib.auth.models import User
def serialize_user(user: User) -> Dict[str, Any]:
    return {
        'id': user.id,
        'last_login': user.last_login.isoformat() if user.last_login is not None else None,
        'is_superuser': user.is_superuser,
        'username': user.username,
        'first_name': user.first_name,
        'last_name': user.last_name,
        'email': user.email,
        'is_staff': user.is_staff,
        'is_active': user.is_active,
        'date_joined': user.date_joined.isoformat(),
    }
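Because the function only reads attributes, you can sanity-check it outside a full Django setup with any object exposing the same fields. The SimpleNamespace stand-in below is my own addition, not part of the original benchmark:

```python
from datetime import datetime
from types import SimpleNamespace
from typing import Any, Dict


def serialize_user(user: Any) -> Dict[str, Any]:
    # Same shape as the function above, duck-typed so any object
    # with the right attributes works.
    return {
        'id': user.id,
        'last_login': user.last_login.isoformat() if user.last_login is not None else None,
        'is_superuser': user.is_superuser,
        'username': user.username,
        'first_name': user.first_name,
        'last_name': user.last_name,
        'email': user.email,
        'is_staff': user.is_staff,
        'is_active': user.is_active,
        'date_joined': user.date_joined.isoformat(),
    }


# A stand-in "user" with the same attributes as django.contrib.auth.models.User.
fake_user = SimpleNamespace(
    id=1,
    last_login=None,
    is_superuser=False,
    username='hakib',
    first_name='haki',
    last_name='benita',
    email='me@hakibenita.com',
    is_staff=False,
    is_active=True,
    date_joined=datetime(2019, 1, 1),
)

print(serialize_user(fake_user)['date_joined'])  # 2019-01-01T00:00:00
```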
Create a user to use in the benchmark:
>>> from django.contrib.auth.models import User
>>> u = User.objects.create_user(
...     username='hakib',
...     first_name='haki',
...     last_name='benita',
...     email='me@hakibenita.com',
... )
For our benchmark we use cProfile. To eliminate external influences such as the database, we fetch a user in advance and serialize it 5,000 times:
>>> import cProfile
>>> cProfile.run('for i in range(5000): serialize_user(u)', sort='tottime')
15003 function calls in 0.034 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
5000 0.020 0.000 0.021 0.000 {method 'isoformat' of 'datetime.datetime' objects}
5000 0.010 0.000 0.030 0.000 drf_test.py:150(serialize_user)
1 0.003 0.003 0.034 0.034 <string>:1(<module>)
5000 0.001 0.000 0.001 0.000 __init__.py:208(utcoffset)
1 0.000 0.000 0.034 0.034 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
The simple function took 0.034 seconds to serialize a User object 5,000 times.
ModelSerializer
Django Rest Framework (DRF) comes with a few utility classes, most notably the ModelSerializer. A ModelSerializer for the built-in User model might look like this:
from rest_framework import serializers
class UserModelSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = [
            'id',
            'last_login',
            'is_superuser',
            'username',
            'first_name',
            'last_name',
            'email',
            'is_staff',
            'is_active',
            'date_joined',
        ]
Running the same benchmark as before:
>>> cProfile.run('for i in range(5000): UserModelSerializer(u).data', sort='tottime')
18845053 function calls (18735053 primitive calls) in 12.818 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
85000 2.162 0.000 4.706 0.000 functional.py:82(__prepare_class__)
7955000 1.565 0.000 1.565 0.000 {built-in method builtins.hasattr}
1080000 0.701 0.000 0.701 0.000 functional.py:102(__promise__)
50000 0.594 0.000 4.886 0.000 field_mapping.py:66(get_field_kwargs)
1140000 0.563 0.000 0.581 0.000 {built-in method builtins.getattr}
55000 0.489 0.000 0.634 0.000 fields.py:319(__init__)
1240000 0.389 0.000 0.389 0.000 {built-in method builtins.setattr}
5000 0.342 0.000 11.773 0.002 serializers.py:992(get_fields)
20000 0.338 0.000 0.446 0.000 {built-in method builtins.__build_class__}
210000 0.333 0.000 0.792 0.000 trans_real.py:275(gettext)
75000 0.312 0.000 2.285 0.000 functional.py:191(wrapper)
20000 0.248 0.000 4.817 0.000 fields.py:762(__init__)
1300000 0.230 0.000 0.264 0.000 {built-in method builtins.isinstance}
50000 0.224 0.000 5.311 0.000 serializers.py:1197(build_standard_field)
It took DRF 12.8 seconds to serialize a user 5,000 times, or 2.56ms to serialize just a single user. That is 377 times slower than the plain function.
We can see that a significant amount of time is spent in functional.py. ModelSerializer uses the lazy function from django.utils.functional to evaluate validations. lazy is also used for verbose names and other model metadata, which DRF evaluates as well. This function seems to be weighing down the serializer.
Read Only ModelSerializer
Field validations are added by ModelSerializer only for writable fields. To measure the effect of validation, we create a ModelSerializer and mark all of its fields as read only:
from rest_framework import serializers
class UserReadOnlyModelSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = [
            'id',
            'last_login',
            'is_superuser',
            'username',
            'first_name',
            'last_name',
            'email',
            'is_staff',
            'is_active',
            'date_joined',
        ]
        read_only_fields = fields
When all fields are read only, the serializer cannot be used to create new instances.
Let's run our benchmark with the read only serializer:
>>> cProfile.run('for i in range(5000): UserReadOnlyModelSerializer(u).data', sort='tottime')
14540060 function calls (14450060 primitive calls) in 7.407 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
6090000 0.809 0.000 0.809 0.000 {built-in method builtins.hasattr}
65000 0.725 0.000 1.516 0.000 functional.py:82(__prepare_class__)
50000 0.561 0.000 4.182 0.000 field_mapping.py:66(get_field_kwargs)
55000 0.435 0.000 0.558 0.000 fields.py:319(__init__)
840000 0.330 0.000 0.346 0.000 {built-in method builtins.getattr}
210000 0.294 0.000 0.688 0.000 trans_real.py:275(gettext)
5000 0.282 0.000 6.510 0.001 serializers.py:992(get_fields)
75000 0.220 0.000 1.989 0.000 functional.py:191(wrapper)
1305000 0.200 0.000 0.228 0.000 {built-in method builtins.isinstance}
50000 0.182 0.000 4.531 0.000 serializers.py:1197(build_standard_field)
50000 0.145 0.000 0.259 0.000 serializers.py:1310(include_extra_kwargs)
55000 0.133 0.000 0.696 0.000 text.py:14(capfirst)
50000 0.127 0.000 2.377 0.000 field_mapping.py:46(needs_label)
210000 0.119 0.000 0.145 0.000 gettext.py:451(gettext)
Only 7.4 seconds, a 42% improvement compared to the writable ModelSerializer.
In the benchmark's output we can see a lot of time being spent in field_mapping.py and fields.py. These are related to the inner workings of the ModelSerializer. During initialization, the ModelSerializer uses a lot of the model's metadata to construct and validate the serializer fields, and that comes at a cost.
"Regular" Serializer
In the next benchmark, we want to measure exactly how much the ModelSerializer "costs" us. Let's create a "regular" Serializer for the User model:
from rest_framework import serializers
class UserSerializer(serializers.Serializer):
    id = serializers.IntegerField()
    last_login = serializers.DateTimeField()
    is_superuser = serializers.BooleanField()
    username = serializers.CharField()
    first_name = serializers.CharField()
    last_name = serializers.CharField()
    email = serializers.EmailField()
    is_staff = serializers.BooleanField()
    is_active = serializers.BooleanField()
    date_joined = serializers.DateTimeField()
Running the same benchmark using the "regular" serializer:
>>> cProfile.run('for i in range(5000): UserSerializer(u).data', sort='tottime')
3110007 function calls (3010007 primitive calls) in 2.101 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
55000 0.329 0.000 0.430 0.000 fields.py:319(__init__)
105000/5000 0.188 0.000 1.247 0.000 copy.py:132(deepcopy)
50000 0.145 0.000 0.863 0.000 fields.py:626(__deepcopy__)
20000 0.093 0.000 0.320 0.000 fields.py:762(__init__)
310000 0.092 0.000 0.092 0.000 {built-in method builtins.getattr}
50000 0.087 0.000 0.125 0.000 fields.py:365(bind)
5000 0.072 0.000 1.934 0.000 serializers.py:508(to_representation)
55000 0.055 0.000 0.066 0.000 fields.py:616(__new__)
5000 0.053 0.000 1.204 0.000 copy.py:268(_reconstruct)
235000 0.052 0.000 0.052 0.000 {method 'update' of 'dict' objects}
50000 0.048 0.000 0.097 0.000 fields.py:55(is_simple_callable)
260000 0.048 0.000 0.075 0.000 {built-in method builtins.isinstance}
25000 0.047 0.000 0.051 0.000 deconstruct.py:14(__new__)
55000 0.042 0.000 0.057 0.000 copy.py:252(_keep_alive)
50000 0.041 0.000 0.197 0.000 fields.py:89(get_attribute)
5000 0.037 0.000 1.459 0.000 serializers.py:353(fields)
Here is the leap we were waiting for!
The "regular" serializer took only 2.1 seconds. That's about 72% faster than the read only ModelSerializer, and a whopping 84% faster than the writable ModelSerializer.
At this point it becomes obvious that the ModelSerializer does not come cheap!
Read Only "regular" Serializer
In the writable ModelSerializer a lot of time was spent on validations, and we were able to make it faster by marking all fields as read only. The "regular" serializer does not define any validation, so marking its fields as read only is not expected to make it faster. Let's make sure:
from rest_framework import serializers
class UserReadOnlySerializer(serializers.Serializer):
    id = serializers.IntegerField(read_only=True)
    last_login = serializers.DateTimeField(read_only=True)
    is_superuser = serializers.BooleanField(read_only=True)
    username = serializers.CharField(read_only=True)
    first_name = serializers.CharField(read_only=True)
    last_name = serializers.CharField(read_only=True)
    email = serializers.EmailField(read_only=True)
    is_staff = serializers.BooleanField(read_only=True)
    is_active = serializers.BooleanField(read_only=True)
    date_joined = serializers.DateTimeField(read_only=True)
And running the benchmark for a user instance:
>>> cProfile.run('for i in range(5000): UserReadOnlySerializer(u).data', sort='tottime')
3360009 function calls (3210009 primitive calls) in 2.254 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
55000 0.329 0.000 0.433 0.000 fields.py:319(__init__)
155000/5000 0.241 0.000 1.385 0.000 copy.py:132(deepcopy)
50000 0.161 0.000 1.000 0.000 fields.py:626(__deepcopy__)
310000 0.095 0.000 0.095 0.000 {built-in method builtins.getattr}
20000 0.088 0.000 0.319 0.000 fields.py:762(__init__)
50000 0.087 0.000 0.129 0.000 fields.py:365(bind)
5000 0.073 0.000 2.086 0.000 serializers.py:508(to_representation)
55000 0.055 0.000 0.067 0.000 fields.py:616(__new__)
5000 0.054 0.000 1.342 0.000 copy.py:268(_reconstruct)
235000 0.053 0.000 0.053 0.000 {method 'update' of 'dict' objects}
25000 0.052 0.000 0.057 0.000 deconstruct.py:14(__new__)
260000 0.049 0.000 0.076 0.000 {built-in method builtins.isinstance}
As expected, marking the fields as read only didn't make a significant difference compared to the "regular" serializer. This reaffirms that the extra time in the ModelSerializer was spent on validations derived from the model's field definitions.
Results Summary
Here is a summary of the results so far:
| serializer | seconds |
|---|---|
| UserModelSerializer | 12.818 |
| UserReadOnlyModelSerializer | 7.407 |
| UserSerializer | 2.101 |
| UserReadOnlySerializer | 2.254 |
| serialize_user | 0.034 |
Why is This Happening?
A lot of articles have been written about serialization performance in Python. As expected, most of them focus on improving database access using techniques like select_related and prefetch_related. While both are valid ways to improve the overall response time of an API request, they don't address the serialization itself. I suspect this is because nobody expects serialization to be slow.
Prior Work
Other articles that do focus solely on serialization usually avoid fixing DRF, and instead motivate new serialization frameworks such as marshmallow and serpy. There is even a site dedicated to comparing serialization formats in Python. To save you a click, DRF always comes last.
In late 2013, Tom Christie, the creator of Django Rest Framework, wrote an article discussing some of DRF's drawbacks. In his benchmarks, serialization accounted for 12% of the total time spent processing a single request. In the summary, Tom recommends not always resorting to serialization:
> 4. You don't always need to use serializers. For performance critical views you might consider dropping the serializers entirely and simply use .values() in your database queries.
As we'll see in a bit, this is solid advice.
Fixing Django's lazy
In the first benchmark using ModelSerializer, we saw a significant amount of time being spent in functional.py, and more specifically in the function lazy.
lazy is used internally by Django for many things such as verbose names and templates. The source describes lazy as follows:

> Encapsulate a function call and act as a proxy for methods that are called on the result of that function. The function is not evaluated until one of the methods on the result is called.
The lazy function does its magic by creating a proxy of the result class. To create the proxy, lazy iterates over all the attributes and functions of the result class (and its superclasses), and creates a wrapper class that evaluates the function only when its result is actually used.
For large result classes it can take a while to create the proxy, so to speed things up, lazy caches the proxy. But as it turns out, a small oversight in the code completely broke the cache mechanism, making lazy very, very slow.
To get a sense of just how slow lazy is without proper caching, let's use a simple function that returns a str (the result class), such as upper. We pick str because it has a lot of methods, so setting up a proxy for it should take a while.
To establish a baseline, we benchmark str.upper directly, without lazy:
>>> import cProfile
>>> from django.utils.functional import lazy
>>> upper = str.upper
>>> cProfile.run('''for i in range(50000): upper('hello') + ""''', sort='cumtime')
50003 function calls in 0.034 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.034 0.034 {built-in method builtins.exec}
1 0.024 0.024 0.034 0.034 <string>:1(<module>)
50000 0.011 0.000 0.011 0.000 {method 'upper' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
Now for the scary part: the exact same function, but this time wrapped with lazy:
>>> lazy_upper = lazy(upper, str)
>>> cProfile.run('''for i in range(50000): lazy_upper('hello') + ""''', sort='cumtime')
4900111 function calls in 1.139 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 1.139 1.139 {built-in method builtins.exec}
1 0.037 0.037 1.139 1.139 <string>:1(<module>)
50000 0.018 0.000 1.071 0.000 functional.py:160(__wrapper__)
50000 0.028 0.000 1.053 0.000 functional.py:66(__init__)
50000 0.500 0.000 1.025 0.000 functional.py:83(__prepare_class__)
4600000 0.519 0.000 0.519 0.000 {built-in method builtins.hasattr}
50000 0.024 0.000 0.031 0.000 functional.py:106(__wrapper__)
50000 0.006 0.000 0.006 0.000 {method 'mro' of 'type' objects}
50000 0.006 0.000 0.006 0.000 {built-in method builtins.getattr}
54 0.000 0.000 0.000 0.000 {built-in method builtins.setattr}
54 0.000 0.000 0.000 0.000 functional.py:103(__promise__)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
No mistake! Wrapped with lazy, it took 1.139 seconds to turn 50,000 strings uppercase. The same function used directly took only 0.034 seconds. That is 33.5 times slower.
This was obviously an oversight. The developers were clearly aware of the importance of caching the proxy. A PR was issued, and merged shortly after (diff here). Once released, this patch should make Django's overall performance a bit better.
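For intuition, here is a toy, pure-Python version of the deferral idea. Django's real lazy builds a full proxy class up front by copying every method of the result class (the expensive setup this section is about); the class below is my own drastic simplification, not Django's code:

```python
class SimpleLazyProxy:
    """Toy stand-in for Django's lazy(): defer a function call until
    the result is actually used, then cache it."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args
        self._evaluated = False
        self._result = None

    def _evaluate(self):
        # Call the wrapped function once and cache the result.
        if not self._evaluated:
            self._result = self._func(*self._args)
            self._evaluated = True
        return self._result

    def __getattr__(self, name):
        # Attribute access forces evaluation and delegates to the result.
        return getattr(self._evaluate(), name)

    def __add__(self, other):
        return self._evaluate() + other


calls = []


def expensive_upper(s):
    calls.append(s)  # record that the function actually ran
    return s.upper()


proxy = SimpleLazyProxy(expensive_upper, 'hello')
assert calls == []       # nothing has been evaluated yet
print(proxy + ' WORLD')  # using the result forces evaluation: HELLO WORLD
```

The interesting property is that construction is cheap and evaluation happens at most once; Django's version pays its cost up front when building the proxy class, which is exactly why the broken cache hurt so much.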
Fixing Django Rest Framework
DRF uses lazy for validations and field verbose names. When all of these lazy evaluations are put together, you get a noticeable slowdown.
The fix to lazy in Django would have solved this issue for DRF as well after a minor adjustment, but nonetheless, a separate fix was made to DRF to replace lazy with something more efficient.
To see the effect of the changes, install the development versions of both Django and DRF:
(venv) $ pip install git+https://github.com/encode/django-rest-framework
(venv) $ pip install git+https://github.com/django/django
After applying both patches, we ran the same benchmark again. These are the results side by side:
| serializer | before | after | % change |
|---|---|---|---|
| UserModelSerializer | 12.818 | 5.674 | -55% |
| UserReadOnlyModelSerializer | 7.407 | 5.323 | -28% |
| UserSerializer | 2.101 | 2.146 | +2% |
| UserReadOnlySerializer | 2.254 | 2.125 | -5% |
| serialize_user | 0.034 | 0.034 | 0% |
To sum up the results of the changes to both Django and DRF:
- Serialization time for the writable ModelSerializer was cut by more than half.
- Serialization time for the read only ModelSerializer was cut by almost a third.
- As expected, there is no noticeable difference in the other serialization methods.
Takeaways
Our takeaways from this experiment were:

1. Upgrade DRF and Django once these patches make their way into a formal release. Both PRs were merged but not yet released.

2. In performance critical endpoints, use a "regular" serializer, or none at all. We had several places where clients were fetching large amounts of data through an API. The API was used only for reading data from the server, so we decided not to use a Serializer at all, and to inline the serialization instead.
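Inlining the serialization can be as simple as shaping the dicts that .values() already returns. The helper below is an illustrative sketch (the function name and field subset are mine); the rows stand in for the output of a query like User.objects.values('id', 'username', 'date_joined'):

```python
from datetime import datetime
from typing import Any, Dict, Iterable, List


def serialize_users(rows: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    # .values() already returns plain dicts, so the only remaining work
    # is making the values JSON-friendly (e.g. formatting datetimes).
    return [
        {
            'id': row['id'],
            'username': row['username'],
            'date_joined': row['date_joined'].isoformat(),
        }
        for row in rows
    ]


# Stand-in for User.objects.values('id', 'username', 'date_joined'):
rows = [{'id': 1, 'username': 'hakib', 'date_joined': datetime(2019, 1, 1)}]
print(serialize_users(rows))
# [{'id': 1, 'username': 'hakib', 'date_joined': '2019-01-01T00:00:00'}]
```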
3. Serializer fields that are not used for writing or validation should be read only. As we've seen in the benchmarks, the way validations are implemented makes them expensive. Marking fields as read only eliminates that unnecessary extra cost.
Bonus: Forcing Good Habits
To make sure developers don't forget to set read only fields, we added a Django check that verifies every ModelSerializer sets read_only_fields:
# common/checks.py
import django.core.checks


@django.core.checks.register('rest_framework.serializers')
def check_serializers(app_configs, **kwargs):
    import inspect
    from rest_framework.serializers import ModelSerializer
    import conf.urls  # noqa, force import of all serializers.

    for serializer in ModelSerializer.__subclasses__():
        # Skip third-party apps.
        path = inspect.getfile(serializer)
        if path.find('site-packages') > -1:
            continue

        if hasattr(serializer.Meta, 'read_only_fields'):
            continue

        yield django.core.checks.Warning(
            'ModelSerializer must define read_only_fields.',
            hint='Set read_only_fields in ModelSerializer.Meta',
            obj=serializer,
            id='H300',
        )
With this check in place, when a developer adds a serializer she must also set read_only_fields. If the serializer is writable, read_only_fields can be set to an empty tuple. If a developer forgets to set read_only_fields, she gets the following warning:
$ python manage.py check
System check identified some issues:
WARNINGS:
<class 'serializers.UserSerializer'>: (H300) ModelSerializer must define read_only_fields.
HINT: Set read_only_fields in ModelSerializer.Meta
System check identified 1 issue (4 silenced).
We use Django checks a lot to make sure nothing falls through the cracks. You can find many other useful checks in this article about how we use the Django system check framework.