While working with Django, we all write code that does the job, but some code may be performing excessive computations or operations that we are unaware of. These operations may be ineffective and/or counterproductive in practice.
Here, I am going to mention some anti-patterns in Django models.
Querysets in Django are lazily evaluated, which means records aren't read from the database until we interact with the data.
len(queryset) counts the records in Python, at the application level. To do so, it must first fetch every record from the database, which is a computationally heavy operation.
queryset.count() calculates the count at the database level and returns only the number.
For a model like:

    from django.db import models


    class Post(models.Model):
        author = models.CharField(max_length=100)
        title = models.CharField(max_length=200)
        content = models.TextField()

If we use len(queryset), Django runs something like SELECT * FROM post, which returns every record, and the Python interpreter then calculates the length of the result, much as it would for a list. Imagine the waste of downloading many records only to check the length and throw them away at the end! But if we need the records after reading the length, then len(queryset) can be valid.
If we use queryset.count(), Django runs something like SELECT COUNT(*) FROM post at the database level. This makes the code run faster and reduces the load on the database.
While we kept praising queryset.count() for checking the length of a queryset, it can be unnecessarily expensive when we only want to check whether the queryset contains anything.
For the same Post model, when we want to check whether there are any posts written by the author Arjun, we may do something like:

    from django.db.models import QuerySet

    posts_by_arjun: QuerySet = Post.objects.filter(author__iexact='Arjun')
    if posts_by_arjun.count() > 0:
        print('Arjun writes posts here.')
    else:
        print("Arjun doesn't write posts here.")

posts_by_arjun.count() makes the database count every matching row. But if we are only interested in whether Arjun writes posts here or not, more efficient code would be:

    from django.db.models import QuerySet

    posts_by_arjun: QuerySet = Post.objects.filter(author__iexact='Arjun')
    if posts_by_arjun.exists():
        print('Arjun writes posts here.')
    else:
        print("Arjun doesn't write posts here.")

posts_by_arjun.exists() returns a boolean that tells us whether at least one result exists. It simply reads a single record in the most optimized way (removing ordering, clearing any user-defined
Also, checking the existence / truthiness of a queryset like this is inefficient:

    from django.db.models import QuerySet

    posts_by_arjun: QuerySet = Post.objects.filter(author__iexact='Arjun')
    if posts_by_arjun:
        print('Arjun writes posts here.')
    else:
        print("Arjun doesn't write posts here.")

This does a fine job of checking whether there are any posts by Arjun, but evaluating the queryset for truthiness fetches the matching records, which is computationally heavy for a large number of records. Hence, queryset.exists() is encouraged for checking the existence / truthiness of querysets.
Django signals are great for triggering jobs based on events. They have valid use cases, but they shouldn't be used excessively. Before reaching for a signal, look for an alternative within your codebase and, if possible, place the logic in your models themselves.
They are not executed asynchronously. There is no background thread or worker to execute them. If you want some background worker to do your job for you, try using
On a larger project, signals are spread over separate files, so they can be harder to trace for someone who has freshly joined the company, and that's not great. That said, django-debug-toolbar does help somewhat in tracing the triggered signals.
Let’s create a scenario where we want to keep a record of Post writings in a separate model:

    class PostWritings(models.Model):
        author = models.CharField(max_length=100, unique=True)
        posts_written = models.PositiveIntegerField(default=0)

If we want to automatically update the PostWritings record for a user based on the records created in the Post model, there are ways to achieve the task with or without signals.
A. With Signals
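
    from django.db.models import F
    from django.db.models.signals import post_save
    from django.dispatch import receiver

    from .models import Post, PostWritings


    @receiver(post_save, sender=Post)
    def post_writing_handler(sender, instance, created, **kwargs):
        # Only count newly created posts, not edits to existing ones.
        if created:
            writing, _ = PostWritings.objects.get_or_create(author=instance.author)
            # Model instances have no update(); increment through a queryset.
            PostWritings.objects.filter(pk=writing.pk).update(
                posts_written=F('posts_written') + 1
            )
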
B. Without Signals
We need to override the save() method for the Post model:

    from django.db import models
    from django.db.models import F


    class Post(models.Model):
        author = models.CharField(max_length=100)
        title = models.CharField(max_length=200)
        content = models.TextField()

        def save(self, *args, **kwargs):  # Overridden method.
            # A post without an id has not been saved yet, so it is new.
            created = self.id is None
            super().save(*args, **kwargs)
            if created:
                writing, _ = PostWritings.objects.get_or_create(author=self.author)
                # Model instances have no update(); increment through a queryset.
                PostWritings.objects.filter(pk=writing.pk).update(
                    posts_written=F('posts_written') + 1
                )

Since the same job can be accomplished without signals, the code is easier to trace and avoids unnecessary event triggers.
If the save() method here feels hard to read, breaking up the code is always a good option. Let’s do that.

    from django.db import models
    from django.db.models import F


    class Post(models.Model):
        author = models.CharField(max_length=100)
        title = models.CharField(max_length=200)
        content = models.TextField()

        def _update_post_writing(self, created=False, author=None):
            if author is not None and created:
                writing, _ = PostWritings.objects.get_or_create(author=author)
                # Model instances have no update(); increment through a queryset.
                PostWritings.objects.filter(pk=writing.pk).update(
                    posts_written=F('posts_written') + 1
                )

        def save(self, *args, **kwargs):  # Overridden method.
            author = self.author
            # Check before saving: the id is only assigned by the first save.
            created = self.id is None
            super().save(*args, **kwargs)
            self._update_post_writing(created, author)

Not defining a __str__ method for a model
When you define a model, it is important to give it a string representation so that its instances are displayed meaningfully. If the __str__ method is not defined, Django falls back to a generic label such as "Post object (1)", which makes records hard to tell apart in the admin, the shell, and anywhere else the object is rendered as text.
Not using the built-in ModelForm validation
Using the built-in ModelForm validation helps ensure that the data entered into a form is valid before it is saved. It checks the input against the model's field definitions, so invalid data is rejected and forms are processed correctly.
Having complex relationships or calculations within your model
Models should not contain too much complexity as this can lead to them becoming difficult to maintain. Keeping the relationships and any calculations within the model to a minimum helps to keep things organized and makes it easier to debug any potential issues.
When creating large numbers of objects - When creating large numbers of objects, it is important to use the bulk_create method rather than creating them one at a time. This makes the process faster because the objects are inserted in far fewer queries, generally a single one.
When creating model instances - It is important to pay attention to fields that use the unique_for_date option when creating model instances, as it is used to ensure that a value is unique for a given date. This is checked during model validation (for example in a ModelForm or a call to full_clean()), not at the database level, so skipping validation can lead to duplicate data being written to the database.
We have now looked at how to mitigate some Django model anti-patterns. For now, thank you everyone for having me here. I'll be back with more Django-related content soon. You can also find me on GitHub. Till then, keep coding :)