Automating the Boring Stuff in Django Using the Check Framework

How we use inspect, ast and the Django system check framework to improve our development process


Every team has a unique development style. Some teams implement localization and require translations. Some teams are more sensitive to database issues and require more careful handling of indexes and constraints.

Existing tools cannot always address these specific issues out of the box, so we came up with a way to enforce our own development style using the Django check framework together with the inspect and ast modules from the Python standard library.

Image by Wright Design


Django checks

Django checks are part of the Django System Check framework. To quote the docs:

The system check framework is a set of static checks for validating Django projects. It detects common problems and provides hints for how to fix them. The framework is extensible so you can easily add your own checks.

One check you might be familiar with is this one from Django admin:

SystemCheckError: System check identified some issues:

ERRORS:
<class 'app.admin.BarAdmin'>
(admin.E108) The value of 'list_display[3]' refers to 'foo',
which is not a callable, an attribute of 'Bar', or an attribute
or method on 'app.Bar'.

The Django admin developers added a system check to warn developers about fields in the model admin that do not exist in the actual model. In this case, the field 'foo' does not exist in the model Bar.

Checks are executed by some management commands such as makemigrations and migrate. It's also possible to explicitly run check using manage.py:

$ ./manage.py check

It's a good idea to incorporate check in your CI. If you want to fail the CI on warnings you can do that by setting a flag:

$ ./manage.py check --fail-level=WARNING

A simple example of how Django uses checks can be found in the source code of the model Field checks.

Our first check

Most of our apps are not intended for English speakers, so we use translations extensively. We put a lot of focus during code review on making sure everything is translated properly.

One of the main issues that come up during code reviews is that developers often forget to set verbose_name on model fields.

Checking that a field has a verbose name is a pretty straightforward task and we wanted to automate the process of making sure it was set.

To get us started we are going to define a simple customer profile model:

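from django.contrib.auth.models import User
from django.db import models
from django.utils.translation import gettext_lazy as _


class CustomerProfile(models.Model):

    id = models.PositiveSmallIntegerField(
        primary_key=True,
        verbose_name=_('id'),
    )

    name = models.CharField(
        max_length=100,
    )

    created_by = models.ForeignKey(
        User,
        on_delete=models.PROTECT,
    )

    def __str__(self):
        return self.name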

The "name" field does not have a verbose_name (neither does "created_by", but we will focus on "name"). Let's see if we can identify that using only the model's _meta:

>>> name_field = CustomerProfile._meta.get_field('name')
>>> name_field.verbose_name
'name'

It looks like Django did something under the hood to set the verbose_name. Looking at the Field class, there is a method called set_attributes_from_name that populates verbose_name from the name of the field, replacing underscores with spaces - this is where the verbose_name "name" came from.
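As a rough illustration of that fallback, the sketch below mirrors the transformation in plain Python. The function default_verbose_name is a hypothetical name used only for this example and is not part of Django; it assumes nothing more than "no verbose_name given, so use the field name with underscores turned into spaces".

```python
def default_verbose_name(field_name, verbose_name=None):
    """Mimic the fallback: derive a verbose name from the field name."""
    if verbose_name is None and field_name:
        # Underscores become spaces, nothing is passed through gettext.
        return field_name.replace("_", " ")
    return verbose_name


print(default_verbose_name("name"))
print(default_verbose_name("created_by"))
print(default_verbose_name("id", verbose_name="identifier"))
```

Because the derived string never goes through gettext, it is invisible to the translation tooling, which is the problem described next.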

Because Django is setting the verbose_name on its own the string "name" will not be picked up by makemessages and will not be added to the po file automatically. This will probably cause the string "name" to go unnoticed. We don't want that.

Also, because Django is populating the field automatically we can't use the model _meta to check if verbose_name was originally set. To do that we need to inspect the actual source code.

Inspecting the code

I didn't use the word inspect for no reason - Python has a module called inspect that we can use to, well, inspect code:

The inspect module provides several useful functions to help get information about live objects such as modules, classes, methods, functions, tracebacks, frame objects, and code objects. For example, it can help you examine the contents of a class, retrieve the source code of a method, extract and format the argument list for a function, or get all the information you need to display a detailed traceback.

Let's see what we can get from inspect:

>>> import inspect
>>> inspect.getsource(CustomerProfile)

"class CustomerProfile(models.Model):\n id = models.PositiveSmallIntegerField(\n
primary_key=True,\n verbose_name=_('id'),\n )\n name = models.CharField(\n
max_length=100,\n )\n created_by = models.ForeignKey(\n User,\n on_delete=models.PROTECT,\n
)\n\n def __str__(self):\n return self.name\n"

That's pretty exciting. We gave inspect the class and got the source code for that class as text.
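Two practical caveats are worth keeping in mind when calling inspect.getsource. It raises an exception when no source is available (for example for built-in classes), and it returns the text exactly as it appears in the file, so a class defined inside a function or another class comes back indented. The sketch below is one way to handle both; get_class_source is a hypothetical helper name, and json.JSONEncoder is used only because it is a standard library class whose source is normally available.

```python
import inspect
import json
import textwrap


def get_class_source(cls):
    """Return the dedented source of a class, or None if it is unavailable."""
    try:
        source = inspect.getsource(cls)
    except (OSError, TypeError):
        # Raised for built-in classes or when the source file cannot be found.
        return None
    # Remove common leading whitespace so the text can be parsed on its own.
    return textwrap.dedent(source)


source = get_class_source(json.JSONEncoder)
first_line = source.splitlines()[0] if source else None
print(first_line)

# Built-in classes have no Python source to return.
print(get_class_source(int))
```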

Given the source code we could have used some fancy RegExp to parse the code but once again, Python already has us covered.

Parsing the code

Parsing code in Python is done by the ast module:

The ast module helps Python applications to process trees of the Python abstract syntax grammar.

Great! A tree is much easier to work with than text.

Let's use ast to parse the source code of our model:

>>> import inspect
>>> import ast
>>> model_source = inspect.getsource(CustomerProfile)
>>> model_node = ast.parse(model_source)
>>> ast.dump(model_node, False)

Module([
    ClassDef('CustomerProfile',
    [Attribute(Name('models', Load()), 'Model', Load())],
    [],
    [

        Assign(
            [Name('id', Store())],
            Call(
                Attribute(Name('models', Load()), 'PositiveSmallIntegerField', Load()),
                [],
                [
                keyword('primary_key', NameConstant(True)),
                keyword('verbose_name', Call(Name('_', Load()), [Str('id')], []))
                ]
            )
        ),

        Assign(
            [Name('name', Store())],
            Call(
                Attribute(Name('models', Load()), 'CharField', Load()),
                [],
                [keyword('max_length', Num(100))]
            )
        ),

        Assign(
            [Name('created_by', Store())],
            Call(
                Attribute(Name('models', Load()), 'ForeignKey', Load()),
                [Name('User', Load())],
                [keyword('on_delete', Attribute(Name('models', Load()), 'PROTECT', Load()))]
            )
        ),

        FunctionDef(
            '__str__',
            arguments([arg('self', None)],None,[],[],None,[]),
            [Return(Attribute(Name('self', Load()), 'name', Load()))], [], None
        )
    ],
    []
    )
])

If we look closely at the dump we can identify that our model fields are all Assign nodes.
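The dump above comes from an older Python version. On Python 3.8 and later, literals such as 100, True and 'id' are reported as Constant nodes instead of Num, NameConstant and Str, so the output will look slightly different. The sketch below parses a trimmed-down copy of the class from a string (the names in it do not need to be importable for ast.parse to work) and lists the node types at the top level of the class body.

```python
import ast

# A trimmed-down stand-in for the model source; ast.parse only needs valid syntax.
SOURCE = '''
class CustomerProfile:
    id = PositiveSmallIntegerField(primary_key=True, verbose_name=_('id'))
    name = CharField(max_length=100)

    def __str__(self):
        return self.name
'''

tree = ast.parse(SOURCE)
class_node = tree.body[0]

# Type of every statement at the top level of the class body.
node_types = [type(node).__name__ for node in class_node.body]
print(node_types)  # ['Assign', 'Assign', 'FunctionDef']

# The value of max_length=100 on the "name" field.
max_length = class_node.body[1].value.keywords[0].value
print(type(max_length).__name__, max_length.value)
```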

Let's zoom in on the "name" field:

Assign(
    [Name('name', Store())],
    Call(
        Attribute(Name('models', Load()), 'CharField', Load()),
        [],
        [keyword('max_length', Num(100))]
    )
)

The model field is an assignment of a Call node (CharField) to a Name node ("name"). The Call node has a list of positional arguments, which is empty here, and a list of keyword arguments. In this case we only have one keyword argument, "max_length", with the numeric value 100.

Our id field looks like this:

Assign(
    [Name('id', Store())],
    Call(
        Attribute(Name('models', Load()), 'PositiveSmallIntegerField', Load()),
        [],
        [
            keyword('primary_key', NameConstant(True)),
            keyword('verbose_name', Call(Name('_', Load()), [Str('id')], []))
        ]
    )
)

The id field is also an Assign node with a Name node and a Call node. The id field has two keywords - primary_key and verbose_name, which is the one we are looking for.

Evaluating a Model Field

To evaluate the fields we first need to identify them. We already saw that model fields are Assign nodes but we can't rely on them being the only Assign nodes in the class.

The only thing we can rely on is that at the top level of the class the attribute names are unique. Meaning, if we know there is a field called "name" we can assume the attribute "name" of the class is the field.
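To see why Assign nodes alone are not enough, consider a class body that also assigns a constant and a manager. This is only a sketch: the class body is made up for the example, and the known_fields set is a stand-in for the lookup against the model's _meta.

```python
import ast

# Hypothetical class body mixing a constant, a field and a manager.
SOURCE = '''
class CustomerProfile(models.Model):
    DEFAULT_NAME = 'unknown'
    name = models.CharField(max_length=100, default=DEFAULT_NAME)
    objects = CustomerManager()
'''

class_node = ast.parse(SOURCE).body[0]

# Every simple "name = value" assignment at the top level of the class.
assigned_names = [
    node.targets[0].id
    for node in class_node.body
    if isinstance(node, ast.Assign) and isinstance(node.targets[0], ast.Name)
]
print(assigned_names)  # ['DEFAULT_NAME', 'name', 'objects']

# Stand-in for model._meta.get_field(): the names Django knows as fields.
known_fields = {'name'}
field_names = [name for name in assigned_names if name in known_fields]
print(field_names)  # ['name']
```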

Let's join forces with Django model _meta to find the nodes of the model fields:

from django.core.exceptions import FieldDoesNotExist

# The model whose source we parsed into model_node.
model = CustomerProfile

for node in model_node.body[0].body:
    if not isinstance(node, ast.Assign):
        continue

    if len(node.targets) != 1:
        continue

    if not isinstance(node.targets[0], ast.Name):
        continue

    field_name = node.targets[0].id
    try:
        field = model._meta.get_field(field_name)
    except FieldDoesNotExist:
        continue

    # node is a field!

Let's break it down:

  1. Model fields are defined at the top level of the class - we only need to check attributes defined at the top level (no need to "visit" nodes recursively).
  2. Model fields will have a Name target - the name of the field.
  3. Finally, the field we assign will be registered in the Django model as a field.

Now we have the field node and we can check if there is a verbose_name attribute defined.

Let's iterate the keywords and search for verbose_name:

for kw in node.value.keywords:
    if kw.arg == 'verbose_name':
        verbose_name = kw
        break
else:
    verbose_name = None

At this point, if verbose_name is None we know that the attribute was not set and we are ready to issue our first warning!
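The same lookup can be wrapped in a small helper so it can be reused for other keywords such as help_text or db_index. This is a sketch, and find_keyword is a hypothetical name. Note one limitation of the static approach: when arguments are passed with a `**options` style expansion, the keyword's arg is None, so a verbose_name hidden inside it cannot be detected.

```python
import ast


def find_keyword(call_node, name):
    """Return the ast.keyword called `name` in a Call node, or None."""
    for kw in call_node.keywords:
        # kw.arg is None for a `**options` style argument.
        if kw.arg == name:
            return kw
    return None


# Parse single expressions; mode="eval" gives us the Call node directly.
with_name = ast.parse(
    "CharField(max_length=100, verbose_name=_('name'))", mode="eval"
).body
without_name = ast.parse("CharField(max_length=100, **options)", mode="eval").body

print(find_keyword(with_name, "verbose_name") is not None)     # True
print(find_keyword(without_name, "verbose_name") is not None)  # False
```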

Issuing Django checks

To issue checks we need to register a function with the check framework:

from django.core import checks

@checks.register(checks.Tags.models)
def run_custom_checks(app_configs, **kwargs):
    # implement check logic
    return []

Inside the function we implement the check logic and return a list of check messages.

We want to warn the developer that a field is missing a verbose_name attribute, so once we find a field that has no verbose_name we create a CheckMessage of type Warning:

from django.core import checks
from django.core.checks import Warning

@checks.register(checks.Tags.models)
def run_custom_checks(app_configs, **kwargs):

    # inspect and parse models...
    # `field` is a model field that was found to have no verbose_name.

    return [
        Warning(
            'Field has no verbose name',
            hint='Set verbose name on field {}.'.format(field.name),
            obj=field,
            id='H001',
        ),
    ]

I assigned the code H00X to my warnings (guess why…). For each warning we can also add a hint to inform the developer on how to address the issue raised by the warning.
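One practical benefit of giving each message an id is that Django's SILENCED_SYSTEM_CHECKS setting accepts a list of ids to suppress. The settings fragment below is only an illustration, assuming the H001 id from above; whether silencing a check is a good idea is a team decision.

```python
# settings.py (fragment)

# Ids of check messages that should not be reported.
# 'H001' is the id we assigned to the "Field has no verbose name" warning.
SILENCED_SYSTEM_CHECKS = [
    'H001',
]
```

Silenced messages are counted in the "(N silenced)" part of the summary line printed by the check command.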

Putting it all together

To recap what we did so far:

  1. Get the source code for a model using inspect.
  2. Parse the model source code using ast and identify the field nodes.
  3. Examine a field node and check if verbose_name is defined.
  4. Register a function with the check framework and issue a Warning.

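A function that checks a single model:

# common/checks.py

import ast
import inspect

from django.core.checks import Warning
from django.core.exceptions import FieldDoesNotExist


def check_model(model):
    """Check a single model.

    Yields (django.core.checks.CheckMessage)
    """
    model_source = inspect.getsource(model)
    model_node = ast.parse(model_source)

    for node in model_node.body[0].body:

        # Check if node is a model field.
        if not isinstance(node, ast.Assign) or len(node.targets) != 1:
            continue
        if not isinstance(node.targets[0], ast.Name):
            continue
        try:
            field = model._meta.get_field(node.targets[0].id)
        except FieldDoesNotExist:
            continue

        # Check if field has verbose name defined.
        if not isinstance(node.value, ast.Call):
            continue
        if any(kw.arg == 'verbose_name' for kw in node.value.keywords):
            continue

        yield Warning(
            'Field has no verbose name',
            hint='Set verbose name on field {}.'.format(field.name),
            obj=field,
            id='H001',
        )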

The next step is to implement a single function to iterate over all models, run our checks and register it with the Django check framework:

# common/checks.py

from django.apps import apps
from django.core import checks


@checks.register(checks.Tags.models)
def check_models(app_configs, **kwargs):
    errors = []
    for app in apps.get_app_configs():

        # Skip third party apps.
        if 'site-packages' in app.path:
            continue

        for model in app.get_models():
            errors.extend(check_model(model))

    return errors

We use a little trick to skip models from third party apps. We assume that when installing third party apps using pip install they are installed in a directory called "site-packages".
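The substring test is a heuristic. A slightly stricter variant, sketched below, compares whole path components instead of searching the full string, and also treats "dist-packages" as third party, since Debian and Ubuntu system installs use that directory name. The helper is_third_party and the example paths are hypothetical.

```python
from pathlib import PurePath

# Directory names assumed to hold installed third party packages.
THIRD_PARTY_DIRS = {"site-packages", "dist-packages"}


def is_third_party(app_path):
    """Return True if any component of the path is a package install directory."""
    return any(part in THIRD_PARTY_DIRS for part in PurePath(app_path).parts)


print(is_third_party("/venv/lib/python3.11/site-packages/rest_framework"))  # True
print(is_third_party("/srv/project/customers"))                             # False
```

Matching on path components avoids a false positive for a project directory that merely contains the text "site-packages" in its name, such as "my-site-packages-tool".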

The only thing left to do is to import this file somewhere in the code and that's it.

# app/__init__.py

from common.checks import *  # noqa

Let's see our new check in action:

$ ./manage.py check

System check identified some issues:

WARNINGS:
app.CustomerProfile.created_by: (H001) Field has no verbose name
        HINT: Set verbose name on field created_by.
app.CustomerProfile.name: (H001) Field has no verbose name
        HINT: Set verbose name on field name.

System check identified 2 issues (0 silenced).

Exactly what we wanted!


Custom Checks in the Real World

To give a sense of what you can do with Django checks, these are the checks we use in our code base:

  • H001: Field has no verbose name. This is the example we just saw.

  • H002: Verbose name should use gettext. Make sure verbose_name is always in the form of verbose_name=_('text'). If the value is not using gettext it will not be translated.

  • H003: Words in verbose name must be all upper case or all lower case. We decided to use only lower case in verbose names. Using lower case texts we were able to reuse more translations. One exception to the rule is acronyms such as API and ETL. The general rule we ended up with is making sure all words are either all lower or all upper case. For example, "etl run" is valid, "ETL run" is also valid, "Etl Run" is not valid.

  • H004: Help text should use gettext. Help text is displayed to the user in admin forms and detail views so it should use gettext and be translated.

  • H005: Model must define class Meta. The translation of the model name is defined in the model Meta class so every model must have a class Meta.

  • H006: Model has no verbose name. Model verbose names are defined in the class Meta and are displayed to the user in the admin so they should be translated.

  • H007: Model has no verbose name plural. Plural model names are used in the admin and are displayed to the user so they should be translated.

  • H008: Must set db_index explicitly on a ForeignKey field. This is probably the most useful check we defined. It forces the developer to explicitly set db_index on every ForeignKey field. I wrote in the past about how a database index is created implicitly for every foreign key field. By making sure developers are aware of that and making them decide whether an index is required, you are left with only the indexes you really need!
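To make two of the rules above concrete, here is a rough sketch of how H002 and H003 could be expressed. It is not the implementation from our code base; the helper names are hypothetical, and the H002 test assumes gettext is imported under the conventional alias `_`.

```python
import ast


def uses_gettext(value_node):
    """H002 sketch: True if the node is a call such as _('text')."""
    return (
        isinstance(value_node, ast.Call)
        and isinstance(value_node.func, ast.Name)
        and value_node.func.id == "_"
    )


def has_consistent_case(text):
    """H003 sketch: every word is entirely lower case or entirely upper case."""
    return all(word in (word.lower(), word.upper()) for word in text.split())


def parse_expr(source):
    """Parse a single expression and return its AST node."""
    return ast.parse(source, mode="eval").body


print(uses_gettext(parse_expr("_('etl run')")))  # True
print(uses_gettext(parse_expr("'etl run'")))     # False

cases = [has_consistent_case(t) for t in ("etl run", "ETL run", "Etl Run")]
print(cases)  # [True, True, False]
```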

This is it, go piss off some colleagues!

Source code

The complete source code for the checks above can be found in this gist.



