
At WooMe, we’ve been using Django and Postgresql in anger for coming up on three years, scaling up to tens of thousands of queries per second means we’ve had to look long and hard at Django’s database support to meet our needs.
Our performance tweaks have run the gamut from simple queryset changes, through changing Django’s transactional behaviour, right the way up to writing our own table partitioning and master/slave replication support long before multidb support hit Django trunk.
Of biggest impact so far was rewriting Django’s transactional layer to allow database transactions to commit automatically and run with PgBouncer as a connection pooler.
Before we carried out this work (almost 2 years ago now), our 16-CPU database server was constantly being overwhelmed by Postgresql locks as it struggled to maintain a very high number of concurrent transactions. After the change we went from over 10,000 concurrent locks to under 100. Obviously with such a reduction resource utilisation fell dramatically.
In our production environment we chose to run PgBouncer in transaction pooling mode, an aggressive form of connection pooling. In transaction pooling, the connection is returned to the pool at the end of every single transaction. For single statement which don’t require a transaction block, this can offer a considerable performance gain.
By default, Django runs with an open transaction whenever it opens a connection to the database, and commits whenever the ORM sends the database a DML query (INSERT/UPDATE/DELETE). This means that in addition to the unnecessary performance overhead of maintaining many open transactions, we lose any potential gains that transaction pooling might offer.
from django.contrib.auth.models import User
from django.db import connection
from django.config import settings
settings.DEBUG=True
User.objects.get(username="test")
print connection.queries[-1]
# begin; select * from auth_user where username='test'; rollback;
User.objects.create(username='test2', email='test2@example.com')
print connection.queries[-1]
# begin; insert into auth_user (username, email) values ('test2', 'test2@example.com'); select currval(auth_user_id_seq); commit;
Obviously, for queries that don’t alter your data, running an open transaction has a high performance cost that could be better spent serving more queries. Ideally, what we’d want to do is only open a transaction when we know that we’re going to make use of Django’s transactional behaviour.
For Django 1.1, we wrote a patch to Django’s transactional behaviour which provides exactly this feature. The first step therefore is to add this to your settings:
# django 1.1
DATABASE_OPTIONS = {
'autocommit': True
}
# django 1.2+
DATABASES = {
'default': {
'OPTIONS': {
'autocommit': True
}
}
}
Now, our ORM queries don’t generate a transaction at all unless we specifically require one.
User.objects.get(username="test")
print connection.queries[-1]
# select * from auth_user where username='test';
User.objects.create(username='test2', email='test2@example.com')
print connection.queries[-1]
# insert into auth_user (username, email) values ('test2', 'test2@example.com') returning id;
from django.db import transaction
transaction.enter_transaction_management()
User.objects.create(username='test3', email='test3@example.com')
transaction.commit()
transaction.leave_transaction_management()
print connection.queries[-1]
# begin; insert into auth_user (username, email) values ('test2', 'test2@example.com') returning id;commit;
In the new queries we return the created id within the scope of the INSERT, rather than querying for the sequence value. One side effect of not having a transaction is that if we can’t guarantee that there are no additional inserts before the sequence look-up. Therefore we make use of Postgresql’s RETURNING operator to pass back the inserted id (or whatever field is designated as the primary key).
Of course, if you do want transactional control, the provided decorators and functions will still work perfectly. So you can decorate functions like this:
from django.db import transaction
@transaction.commit_on_success
def func():
# both of these queries will rollback if either fails
u, created = User.objects.get_or_create(username='test4', email='test4@example.org')
p, created = Profile.objects.get_or_create(user=u, info="a test")

Wow, this can be described with a simple word: awesome !!! This is the kind of work that I want to do with favorites rocks. I’ve used PostgreSQL intensively for 4 years and now I’m using Django and Python (I love it) for my daily work.
With the new release of PostgreSQL, there are many things that you can get of it:
- Built-in Binary Replication System(Streaming Replication and Hot Standby)
- Extended Frames of Windows Functions
- Triggers on columns
- WHEN Triggers
- A completed new VACUUM FULL
Simply amazing
All the features of the 9.0 version are described here:
http://wiki.postgresql.org/wiki/What's_new_in_PostgreSQL_9.0
If you need another young but very focused professional to getting the things done, please send me a message. I don’t really interested on the money, but in the knowledge that I can get here. Regards
You know, the transactional behaviour required has been a life saver a few weeks ago. I was actually having a hard time making a custom migration from a Drupal website. There has been a need to mix manual transaction commits with automatic commits, and it’s quite amazing to see that it was you who make those decorators.
There are many people out there including myself who would like to thank you, for sure.