Skip to main content

Django Optimization Processes for Write Operation for Postgresql

When optimizing a Django project for large write operations, especially when dealing with PostgreSQL, there are several strategies you can employ to reduce the time it takes to perform these operations:



1. Bulk Inserts

In django, we create objects using create() . Asynchronous version is acreate() .It's a  convenience method for creating an object and saving it all in one step. 



and 



These are same and equivalent. The create() method is used to create and save a single object in the database.

Example:



Instead of inserting one row at a time, consider using Django's bulk_create() method to insert multiple rows in a single query. This reduces the overhead of multiple database round trips.

Example: 



The bulk_create() method is used to create and save multiple objects in the database in a single query.

It accepts a list of model instances and inserts them into the database in a single batch operation, which significantly reduces the overhead compared to individual create() calls.


Cons:

1. It does not work with many-to-many relationships.

2. The model’s save() method will not be called, and the pre_save and post_save signals will not be sent.

3. If the model’s primary key is an AutoField and ignore_conflicts is False, the primary key attribute can only be retrieved on certain databases (currently PostgreSQL, MariaDB 10.5+, and SQLite 3.35+). On other databases, it will not be set.


2. Database optimization

Tune your PostgreSQL database configuration settings such as shared_buffers, work_mem, and checkpoint_segments to better suit your application's needs. This can help PostgreSQL handle large write operations more efficiently.


3. Transactions

Django gives you a few ways to control how database transactions are managed.Django’s default behavior is to run in autocommit mode. Each query is immediately committed to the database, unless a transaction is active.


# How to use transactions in django ?

Imagine you're building a web app for a bookstore where users can purchase books. Each purchase involves updating multiple tables in your database: deducting the purchased book's quantity from the inventory and recording the transaction details in the sales table.

Here's how you can use Django transactions to optimize this process:




Let's break down how this code works:

1. We import Django's transaction module and our models (Book and Sale).

2. Inside the purchase_book function, we use transaction.atomic() as a context manager to ensure that all database operations within it are treated as a single transaction. This means either all the operations succeed, or none of them do.

3. We use select_for_update() to lock the row in the Book table that corresponds to the book being purchased. This prevents other transactions from modifying the same row simultaneously, ensuring data consistency.

4. We check if the requested quantity is available in stock. If not, we raise a ValueError and the transaction will be rolled back.

5. If there's enough stock, we deduct the purchased quantity from the book's inventory and save the changes to the database.

6. We record the transaction details (user, book, quantity) in the Sale table.

7. If any exception occurs during the transaction (e.g., database error, insufficient stock), the transaction is rolled back, and any changes made within it are undone.


By using transactions, we ensure that the purchase operation is atomic, consistent, isolated, and durable. This not only optimizes the write operation by bundling multiple database changes into a single transaction but also ensures data integrity and reliability, even in the face of errors or concurrent access.


Mahfujul Hasan
Software Engineer

linkedIn: https://www.linkedin.com/in/shojibhasan/ 

Comments

Popular posts from this blog

Implementing Advance Query Optimization in Django ORM

 Django's ORM makes database interactions seamless, allowing developers to write queries in Python without raw SQL. However, as applications scale, inefficient queries can slow down performance, leading to high latency and database load.  This guide explores advanced query optimization techniques in Django ORM to go beyond basic CRUD (Create, Read, Update, Delete) operations and improve efficiency.  1. Use QuerySet Caching to Avoid Repeated Queries Using cache reduces redundant queries for frequently accessed data. Caching helps reduce repeated database hits. 2. Avoid .count() on Large Datasets Using .count() on large tables can be expensive Inefficient way: Optimized way ( .exists() is Faster) 3. Use Indexes for Faster Lookups Indexes speed up queries on frequently filtered fields. Add db_index=True for frequently queried fields: 4. Optimize Bulk Inserts and Updated Performing operations on multiple records one by one is inefficient. Use bulk_create() for mass insert...

Database Indexing in Django application

  Database Indexing Database indexing is a technique used to optimize the performance of database queries by allowing the database management system (DBMS) to quickly locate and retrieve specific rows of data. Indexes are data structures that provide a faster way to look up records based on the values stored in one or more columns of a table. When you create an index on a table, the DBMS creates a separate data structure that maps the values in the indexed columns to the corresponding rows in the table. Default Type of Index is B-Tree Index ( The king of all indexes) বইতে কোন টপিক খুজতে গেলে আমরা টেবিল অফ কনটেন্ট থেকে দেখি এই টপিক কত নম্বর পেজে আছে।যাতে করে আমাদের পুরো বই খুজতে না হয়। ডেটাবেজ ইনডেক্সিং ও তেমনই একটা ইফিসিয়েন্ট টেকনিক।ডেটাবেজে কোন ডেটাকে দ্রুত খুজে বের করার জন্য ইনডেক্সিং করা লাগে।যদি এমন হয় একটা কুয়েরি বার বার এক্সিকিউট করতে হচ্ছে এবং একটা কলাম থেকে ভ্যালু বার বার খুজতে হচ্ছে তখন আমরা সেই কলামে ইনডেক্সিং করতে পারি।এর মাধ্যমে কোন ডেটা দ্রুত রিট্রাইভ করা যায়।কিন্তু ই...

Django select_related and prefetch_related

  Difference between select_related and prefetch_related Reducing SQL queries is one of the first steps when optimizing a Django project. There are two powerful methods included in the Django ORM to help us that can boost performance by creating a single more complex QuerySet rather than multiple, smaller queries. In this project we will understand about  select_related and prefetch_related.  Django use these two orm method to reduce sql queries in database based on different scenario.  select_related Lets assume  this two model we have.  class Author ( models . Model ): name = models . CharField ( max_length = 200 ) def __str__ ( self ): return self . name class Courses ( models . Model ): name = models . CharField ( max_length = 200 ) author = models . ForeignKey ( Author , on_delete = models . CASCADE , related_name = 'courses' ) def __str__ ( self ): return self . name Here we have two mode. ...