Could Anyone Tell Me About Lazy Loading And Transactions In Django?
The background is this code:
    data = User.objects.get(pk=1)
    if data.age > 20:
        with transaction.atomic():
            data.age -= 2
            data.save()
I want to know what happens if many processes run this code at the same time. Suppose each process fetches the data at the same time, outside the transaction, and age is 30.
One process then continues, computes 30 - 2 = 28 and saves.
When the next process reaches data.age -= 2, is the value it gets from data.age 28 or 30? If it is 30, does that mean the transaction is in the wrong place? Or does it mean the transaction cannot work, because it should cover the line that selects the data as well as the lines that change and save it, but here it was added without the select line?
Second question:
What if I write it like this:
    data = User.objects.get(pk=1)
    with transaction.atomic():
        if data.age > 20:
            data.age -= 2
            data.save()
This version starts the transaction before the data.age > 20 test. Because of lazy loading, I expect the SQL to run only when I use the data, for example at data.age > 20, and by then the transaction has already been started. So does this version put the SQL statements inside the transaction?
Thanks a lot.
Answer
There are two issues we need to address here: transactions and locking, and lazy loading (which your code doesn't appear to use).
You have a race condition in all your examples; multiple requests fetching the age of the same user will all try to update the database table to set 28 if they all fetched 30 before any of them committed the transaction.
It doesn't matter here whether the column was fetched inside or outside of the transaction. All that the transaction guarantees is that all writes will succeed together or fail together. Depending on the database's isolation level, reads inside the transaction may also be consistent (so repeated reads produce the same data), but the transaction will not prevent other transactions from reading the same data and updating based on it.
That's because an atomic transaction only (briefly) locks rows when writing the data; all the changes in the transaction are written together, as one unit. But that doesn't mean that what you write to the database is correct: multiple transactions can read 30 as the age, and will all write 28 to the row when it is their turn to get the lock and have their commit succeed.
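The lost update can be reproduced in a few lines. This is only a sketch of the interleaving, not Django code: an in-memory SQLite table stands in for the User model, and the helper names read_age and write_age are made up for the example. Both workers read before either one writes, which is what happens when several requests run the code at the same time.

```python
import sqlite3

# In-memory SQLite table standing in for the Django User model (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, age INTEGER)")
conn.execute("INSERT INTO user (id, age) VALUES (1, 30)")
conn.commit()


def read_age(user_id):
    """Read step, comparable to fetching the row with get()."""
    row = conn.execute("SELECT age FROM user WHERE id = ?", (user_id,)).fetchone()
    return row[0]


def write_age(user_id, age):
    """Write step, comparable to save() inside its own transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute("UPDATE user SET age = ? WHERE id = ?", (age, user_id))


# Both workers read before either one writes.
age_seen_by_a = read_age(1)
age_seen_by_b = read_age(1)

# Each worker now decides and writes based on its own stale copy.
if age_seen_by_a > 20:
    write_age(1, age_seen_by_a - 2)
if age_seen_by_b > 20:
    write_age(1, age_seen_by_b - 2)

final_age = read_age(1)
print(final_age)  # 28, not 26: one of the two decrements was lost
```

Each write succeeds as its own transaction, yet the result is still wrong, because the decision was made on a value read before the other write happened.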
But, to address the lazy loading question: unless you explicitly marked the age column with defer(), you are not using any lazy loading. The age value will have been loaded with all other User data when executing the User.objects.get() method. It doesn't really matter here, because even if the data.age > 20 test triggered a separate statement to read the age column, you would still be reading stale data (you can read 30 just before another transaction commits and writes 28).
What you need, then, is to lock the row before reading, so that other requests can't read the wrong value. If you lock first, then read, then commit, then unlock, other requests will have to wait until the lock is released and then read the age column.
You can use the select_for_update() method to lock a specific row, at which point any other request trying to get a lock on the same row will have to wait until you are done with the lock:
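    from django.db import transaction

    with transaction.atomic():
        # The row stays locked until the transaction commits or rolls back.
        data = User.objects.select_for_update().get(pk=1)
        if data.age > 20:
            data.age -= 2
            data.save()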
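The lock, read, write, unlock sequence can also be illustrated without a database. The following is only a sketch: a threading.Lock stands in for the row lock that select_for_update() takes, a dict stands in for the row, and the names row, row_lock and decrement_age_if_over_20 are made up for this example.

```python
import threading

# Shared "row" and a lock standing in for the database row lock (assumption).
row = {"age": 30}
row_lock = threading.Lock()


def decrement_age_if_over_20():
    # Acquire the lock first, then read, decide, and write while holding it.
    with row_lock:
        age = row["age"]
        if age > 20:
            row["age"] = age - 2


workers = [threading.Thread(target=decrement_age_if_over_20) for _ in range(8)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(row["age"])  # 20: five decrements applied, the other three workers saw 20
```

Because every worker waits for the lock before it reads, no worker ever acts on a stale value. The price is that the workers run one after another, which is the bottleneck described next.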
However, you should only use locking as a last resort. Locking will create a performance bottleneck, because now requests have to wait for each other. Unless your actual use case is more complex and covers multiple reads and writes that all have to be executed as one unit and you can only use Python code to make the decisions, you do not need to resort to using row locking.
Instead, if you need to update a column atomically, you should use an update() query with a filter on the age, at which point it is the database that determines if the age needs updating. Together with an F() expression, you then leave the whole calculation to the database, which executes this atomically:
    from django.db.models import F

    # The database tests age > 20 and subtracts 2 in a single UPDATE statement.
    rowcount = User.objects.filter(pk=1, age__gt=20).update(age=F('age') - 2)
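To see what such a query amounts to at the SQL level, here is a rough equivalent using the standard library's sqlite3 module instead of the Django ORM. The table layout and the function name decrement_age_if_over_20 are assumptions made for the sketch; the point is that the condition and the arithmetic both live in one UPDATE statement, and that the returned row count tells you whether anything changed.

```python
import sqlite3

# In-memory SQLite table standing in for the Django User model (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, age INTEGER)")
conn.execute("INSERT INTO user (id, age) VALUES (1, 30)")
conn.commit()


def decrement_age_if_over_20(user_id):
    """Test and update in one statement; return the number of rows changed."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "UPDATE user SET age = age - 2 WHERE id = ? AND age > 20",
            (user_id,),
        )
    return cursor.rowcount


rowcounts = [decrement_age_if_over_20(1) for _ in range(6)]
final_age = conn.execute("SELECT age FROM user WHERE id = 1").fetchone()[0]
print(rowcounts)  # [1, 1, 1, 1, 1, 0]
print(final_age)  # 20
```

A row count of 0 means the filter no longer matched, so the caller can tell that no decrement was applied without reading the row first.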
For more complex scenarios, you could use a conditional expression to determine the final value in an update.
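As an illustration of the idea, the sketch below uses a plain SQL CASE expression through sqlite3 rather than Django's conditional expressions. The rule itself (subtract 5 above 40, subtract 2 above 20, otherwise leave the age alone) is hypothetical and only there to show the database picking the final value per row.

```python
import sqlite3

# In-memory SQLite table standing in for the Django User model (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, age INTEGER)")
conn.executemany(
    "INSERT INTO user (id, age) VALUES (?, ?)",
    [(1, 45), (2, 30), (3, 18)],
)
conn.commit()

# Hypothetical rule: subtract 5 above 40, subtract 2 above 20, else unchanged.
with conn:
    conn.execute(
        """
        UPDATE user
        SET age = CASE
            WHEN age > 40 THEN age - 5
            WHEN age > 20 THEN age - 2
            ELSE age
        END
        """
    )

ages = dict(conn.execute("SELECT id, age FROM user ORDER BY id").fetchall())
print(ages)  # {1: 40, 2: 28, 3: 18}
```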
Using UPDATE syntax with appropriate filters and expressions moves the work to the database, both the test for your condition and the value calculation, and the database does this as part of the write, so while the row is locked. That ensures the lock is held for the minimum amount of time possible, reducing the bottleneck.