Django has a sophisticated signal system to trigger certain logic when you save or delete objects, change many-to-many relations, etc. People use such signals very often, and for some edge cases they are indeed the only effective solution, but there are only a few cases where using signals is appropriate.
Why is it a problem?
Signals have a variety of problems and unforeseen consequences. The sections below list a few.
Signals can be circumvented
One of the main problems with signals is that signals do not always run. Indeed, the pre_save and post_save signals will not run when we create or update objects in bulk. For example, if we create multiple Post objects with:
Post.objects.bulk_create([
    Post(title='foo'),
    Post(title='bar'),
    Post(title='qux'),
])
then the signals do not run. The same happens when you update posts in bulk, for example with the .update(…) method of a QuerySet.
Often people assume that signals will run in that case, and for example perform calculations with the signals: they recalculate a certain field, based on the updated values. Since one can update a field without triggering the corresponding signals, this results in an inconsistent value. Signals thus give a false sense of security that the handler will indeed update the object accordingly.
Signals are request unaware
Every now and then people try to implement a signal that needs a request object to perform a certain piece of logic. For example one might want to send an email each time a certain model is created to the person that made the HTTP request.
Django's signal processing however does not capture the request. This makes sense since these signals get fired by models that are created, updated, removed, etc. and models are request unaware as well.
The fact that signals are request unaware makes it harder to implement certain behavior, for example a signal that sets the owner field of a model object to the currently logged-in user. Strictly speaking it is possible to implement this, for example by inspecting the call stack: if the signal is triggered by a view, one will eventually find a call to that view, and thus can obtain the request object. Another way would be to implement middleware that keeps track of the user that makes the request and lets the handler use that, but this introduces a global state antipattern.
While there are technically some ways to make signals request-aware, these solutions are often ugly and can furthermore fail if the change is triggered by something other than a request, such as a management command.
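A less fragile alternative is to hand the user to the saving code explicitly, instead of having a handler search for it. The following is only a minimal sketch: it assumes a ModelForm whose model has an owner field, and the helper name save_with_owner is hypothetical.

```python
def save_with_owner(form, user):
    """Save a valid ModelForm and set the owner explicitly.

    Hypothetical helper: assumes the model of the form has an `owner` field.
    """
    # commit=False builds the model instance without writing it to the database
    obj = form.save(commit=False)
    obj.owner = user
    obj.save()
    # with commit=False the form postpones many-to-many data until save_m2m()
    form.save_m2m()
    return obj
```

In a view this could be called as save_with_owner(form, request.user) once form.is_valid() succeeds. A management command can pass whichever user is appropriate, so the logic does not depend on a request being present.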
Signals can raise exceptions and break the code flow
If the signals run, for example when we call .save() on a model object, then the handlers will run. Contrary to popular belief, signals do not run asynchronously, but in a synchronous manner: there is a list of functions and these will all run one after another. A further problem is that these handlers might raise an error, and this will result in the function that triggered the signals raising that error. Developers often do not take this into account.
If such an error is raised, then eventually the .save() call will raise an error. Even if the developer takes this into account, it is hard to anticipate the consequences: if there are multiple handlers for the same signal, then some of the handlers can have made changes whereas others might not have been invoked. This makes it more complicated to repair the object, since the handlers might already have changed the object partially.
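This behavior can be illustrated without Django. The sketch below is a simplified stand-in for a synchronous dispatcher, not Django's actual implementation; send and the three handlers are hypothetical names. The handlers run one after another in the thread of the caller, so an error in the second handler reaches the caller after the first handler has already changed the object, and before the third one runs.

```python
def send(handlers, instance):
    """Call every handler in order, in the caller's own thread."""
    responses = []
    for handler in handlers:
        # no try/except: an error in one handler aborts the whole loop
        responses.append((handler, handler(instance)))
    return responses


def set_slug(instance):
    instance['slug'] = instance['title'].lower()


def check_title(instance):
    if not instance['title'].isalpha():
        raise ValueError('title must be alphabetic')


def count_words(instance):
    instance['words'] = len(instance['title'].split())


post = {'title': 'Foo 2'}
error = None
try:
    send([set_slug, check_title, count_words], post)
except ValueError as exc:
    # the error of the handler surfaces in the code that triggered the handlers
    error = exc

# set_slug has run, count_words has not: the object is only partially updated
print(post, error)
```

Django also offers Signal.send_robust(), which catches the exceptions of the receivers and returns them, but the built-in model signals are sent with send(), so errors propagate as in the sketch.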
Signals can result in infinite recursion
It is also rather easy to get stuck in an infinite loop with signals. If we for example have a Profile model with a signal that will remove the User if we remove the Profile:
from django.conf import settings
from django.db import models
from django.db.models.signals import pre_delete
from django.dispatch import receiver


class Profile(models.Model):
    user = models.OneToOneField(
        settings.AUTH_USER_MODEL,
        on_delete=models.CASCADE
    )
    # …


@receiver(pre_delete, sender=Profile)
def delete_profile(sender, instance, using, **kwargs):
    # remove the related user whenever a Profile is about to be removed
    instance.user.delete()
If we now remove a Profile, this will get stuck in an infinite loop. Indeed, first we start removing a Profile. This triggers the signal, which removes the related user object. But because of on_delete=models.CASCADE, Django will, when removing the user, first remove the Profile again, which triggers the signal once more. It is easy to end up with infinite recursion when defining signals, especially if we use signals on two models that are related to each other.
Signals run before updating many-to-many relations
The pre_save and post_save signals of an object run immediately before and after the object is saved to the database. If we have a model with a ManyToManyField, then when we create that object and the signals run, the ManyToManyField is not yet populated. This is because a ModelForm first needs to create the object, so that the object has a primary key, before it can start populating the many-to-many relation. If we for example have two models Author and Book with a many-to-many relation, and we want to use a signal that counts the number of books an Author has written, then the following signal will not work when we create an Author and use the form to also specify the books:
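from django.db.models.signals import post_save
from django.dispatch import receiver


# the created argument is only sent by post_save, so post_save is used here
@receiver(post_save, sender=Author)
def save_author(sender, instance, created, raw, using, update_fields, **kwargs):
    # the form has not linked the books yet at this point
    instance.num_books = instance.books.count()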
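Regardless of whether we use a pre_save or a post_save signal, at that moment in time the many-to-many relation has not been populated yet, so instance.books.all() cannot return the books that were selected in the form.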
Signals make altering objects less predictable
Even if only one handler is attached to the signal, and that handler can never raise an error, the handler is still often not an elegant solution. Another developer might not be aware of its existence, since it has only a "weak" binding to the model, and thus it makes the effect of saving an object less predictable.
Signals do not run when other programs make changes
Finally other programs can also make changes to the database, and thus will not trigger the signals, and this eventually could lead to the database being in an inconsistent state. Another program could for example create a new book for an author, but might not update the field in the
Author model that keeps track of the number of books written by that author. It will be quite hard to "translate" all the handlers in Django to other programs that interact with the same database.
What can be done to resolve the problem?
Often it is better to avoid using signals. One can implement a lot of logic without signals.
Calculating properties on-demand
The most robust way to count the number of Books of an Author is not to store the number of books in a field, but to use .annotate(…) [Django-doc] to annotate the Authors with the number of Books they have written each time we query. We thus can make a query that looks like:
from django.db.models import Count

Author.objects.annotate(
    num_books=Count('books')
)
Often, if the number of Books is not that large, this will still scale quite well. It is more robust: if another program somehow removed a book, or a view was triggered that circumvented the update logic, it will still work with the correct number of books.
Here we of course recalculate the number of Books per Author each time we query. If the number of Authors grows, then this can become a performance bottleneck.
Encapsulating update logic in the view/form and ModelAdmin
Another option might be to encapsulate the handler logic in a specific function. For example, if we want to count the number of books of an Author each time we save/update a Book, we can implement that logic in a function named update_book, and then call this function in the views where we create/update the book. For example:
def my_view(request):
    if request.method == 'POST':
        form = BookForm(request.POST, request.FILES)
        if form.is_valid():
            book = form.save()
            update_book(book)
            # …
        # …
    # …
We can also construct a mixin that we can use in class-based views, and in the ModelAdmin we can call the function in save_model:
from django.contrib import admin


class MyModelAdmin(admin.ModelAdmin):
    def save_model(self, request, obj, form, change):
        # save the object first, so update_book works with the stored data
        super().save_model(request, obj, form, change)
        update_book(obj)
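The update_book function used in the snippets above is not shown. A minimal sketch could look as follows, assuming the Author model has a num_books field and that the authors of a Book are reachable through book.authors; both names are assumptions and should be adjusted to the actual models. UpdateBookMixin is likewise a hypothetical example of a mixin for class-based views.

```python
def update_book(book):
    """Recount the books of every author of the given (already saved) book."""
    for author in book.authors.all():
        author.num_books = author.books.count()
        # only write the counter, leave the other columns untouched
        author.save(update_fields=['num_books'])


class UpdateBookMixin:
    """Hypothetical mixin for a CreateView or UpdateView of Book."""

    def form_valid(self, form):
        # the parent class saves the form and stores the result in self.object
        response = super().form_valid(form)
        update_book(self.object)
        return response
```

The mixin has to be listed before the generic view, for example class BookCreateView(UpdateBookMixin, CreateView). In the admin, many-to-many data is only written in save_related(), which runs after save_model(), so for a many-to-many relation that hook is likely the safer place to call update_book.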
If the task takes too much time, you can set up a queue where a message is queued that will then trigger a task to update the data. This is however not specific to encapsulating logic in a function: signal handlers run synchronously as well, so a slow handler can equally make the request time out and render the server unresponsive.
Signals can still be a good solution if you want to handle events raised by a third-party Django application. In many cases, this is the only effective way to handle certain events. For example, the auth module provides signals when the user logs in, logs out, or fails to log in [Django-doc]. These signals are typically more reliable, since they are not triggered by the ORM. For third-party applications, signals are often an effective way to communicate with these applications.