Postgres Transactions In Multithreaded Environment

Posted by muzir on December 11, 2022 · 7 mins read

Few days ago, I was reviewing a simple change in one of our team’s repositories. It was one line change at a domain object. Simply inside a transaction there are below operations which change sets a status;

start transaction

Order order = getOrderFromDatabase(orderId);
order.setState(IN_PROGRESS)
update(order)
/*
* some other operations
*/ 
Order order = getOrderFromDatabase(orderId);
sendOrder(order)

end transaction

The code block is critical for our business use case. Also I was confused especially about the second getOrderFromDatabase method. The question was how the database behaves in that case where data is updated however transaction was not committed yet. Before proceeding with the deeper investigation better to give some context to our tech stack. This is a Spring boot java application and we use postgres as a database. Application has the Spring Transaction Manager which covers insert and update statements of the order domain object.

Deeper Investigation

First I had checked the spring transaction manager default isolation level, according to Spring documentation default transactional settings , it is DEFAULT. However it is not clear what is the meaning of DEFAULT. Because there is no DEFAULT isolation level in RDBMS isolation levels. So I need check the Spring java doc in here, according to its java doc it uses the default isolation level of the underlying datastore. Then the question is what is the Postgres database default isolation level? According to Postgres documentation it is READ COMMITTED. If its default isolation level is READ COMMITTED I shouldn’t see any change yet in the second getOrderFromDatabase, because the transaction has not been committed yet. However according to our logs that isn’t the case. When I log the order object after the second get from the database I see that status has updated to IN_PROGRESS, why?

start transaction
// First get from database before status update
Order order = getOrderFromDatabase(orderId);
order.setState(IN_PROGRESS)
update(order)
/*
* some other operations
*/ 
// Second get from database after status update 
Order order = getOrderFromDatabase(orderId);
sendOrder(order)

end transaction

Mystery Solved

To answer the above, why? I need to read the postgres READ COMMITTED documentation a bit more carefully, in here it states as

Read Committed is the default isolation level in PostgreSQL. 
When a transaction uses this isolation level, 
a SELECT query (without a FOR UPDATE/SHARE clause) sees only 
data committed before the query began; 
it never sees either uncommitted data or changes committed 
during query execution by concurrent transactions. 
In effect, a SELECT query sees a snapshot of the database as 
of the instant the query begins to run. However, SELECT does see 
the effects of previous updates executed within its own transaction, 
even though they are not yet committed. 

So the answer to my question is exactly here, there is a different behavior for concurrent and own transactions. Let’s test that behavior in a multithreaded environment.

Testing Postgres Transactions

I need to create a test scenario to check the difference between concurrent and own transactions behavior. To do that I create below scenarios

Testing with Single Thread

First create the order and save it to the database. Then in a single thread(thread1);

Wait 100 milliseconds Log the status of the order Set the order status to IN_PROGRESS and update. Get the updated order object and log the status

Let’s run the test testReadCommittedIsolationLevel_withSingleTransaction, here is the logs;

thread1 is starting
thread1 - orderAfterInsert orderStatus= NEW
thread1 is updated
thread1 - orderAfterUpdate orderStatus= IN_PROGRESS
thread1 is committing

So it proves that SELECT reads uncommitted changes in its own transaction, because orderStatus has changed from NEW to IN_PROGRESS after update.

Testing with Multi Threads

Now let’s test it with a multithreaded environment. I use the same thread1 method with thread2. Inside the thread2;

  1. Log the status of the order.
  2. Set the order status to IN_PROGRESS and update.
  3. Get the updated order object and log the status
  4. Wait 500 milliseconds.

Let’s run the test testReadCommittedIsolationLevel_withMultipleThreads, here is the logs;

2022-12-18 15:49:01.845 : thread2 is starting
2022-12-18 15:49:01.856 : thread2 - orderAfterInsert orderStatus= NEW
2022-12-18 15:49:01.859 : thread2 is updated
2022-12-18 15:49:01.862 : thread2 - orderAfterUpdate orderStatus= PROCESSED
2022-12-18 15:49:01.950 : thread1 is starting
2022-12-18 15:49:01.953 : thread1 - orderAfterInsert orderStatus= NEW
2022-12-18 15:49:02.365 : thread2 is committing
2022-12-18 15:49:02.368 : thread1 is updated
2022-12-18 15:49:02.371 : thread1 - orderAfterUpdate orderStatus= IN_PROGRESS
2022-12-18 15:49:02.372 : thread1 is committing

Let’s explain the logs step by step;

  1. thread1 and thread2 nearly start at the same time.
  2. thread1 sleeps 100ms, in this time thread2 has updated orderStatus as PROCESSED.
  3. thread1 starts and logs the orderStatus because thread2 has not committed yet thread1 logs it as NEW, READ COMMITTED applies.
  4. thread2 committed and orderStatus should be committed as PROCESSED.
  5. thread1 updated the orderStatus and then logged it as IN_PROGRESS 5.1. That step is critical because thread2 committed and orderStatus should be PROCESSED however thread1 updated the record 3ms later as IN_PROGRESS
  6. thread1 committed.

Notes

Threads are a good way to test a multithreaded environment with integration tests. Relying on the official documentation of the frameworks, libraries or databases is the best practice on investigations.

Result

Postgres transactions have different isolation level behavior based on its own transactions and concurrent transactions. Its own transaction they behave like READ UNCOMMITTED even if their default isolation level is READ COMMITTED.

You can find the all project on Github

References

Spring transaction manager strategies

Spring Doc DEFAULT

Spring Java doc default isolation level

What is the default isolation level of the Postgres database?

Background picture reference

Happy coding :)