Few days ago, I was reviewing a simple change in one of our team’s repositories. It was one line change at a domain object. Simply inside a transaction there are below operations which change sets a status;
start transaction
Order order = getOrderFromDatabase(orderId);
order.setState(IN_PROGRESS)
update(order)
/*
* some other operations
*/
Order order = getOrderFromDatabase(orderId);
sendOrder(order)
end transaction
The code block is critical for our business use case. Also I was confused especially about the second getOrderFromDatabase method. The question was how the database behaves in that case where data is updated however transaction was not committed yet. Before proceeding with the deeper investigation better to give some context to our tech stack. This is a Spring boot java application and we use postgres as a database. Application has the Spring Transaction Manager which covers insert and update statements of the order domain object.
First I had checked the spring transaction manager default isolation level, according to Spring documentation default transactional settings
, it is DEFAULT
. However it is not clear what is the meaning of DEFAULT
. Because there is no DEFAULT
isolation level in RDBMS isolation levels.
So I need check the Spring java doc in here, according to
its java doc it uses the default isolation level of the underlying datastore. Then the question is what is the Postgres database default isolation level? According to Postgres documentation it is READ COMMITTED.
If its default isolation level is READ COMMITTED I shouldn’t see any change yet in the second getOrderFromDatabase
, because the transaction has not been committed yet.
However according to our logs that isn’t the case. When I log the order
object after the second get from the database I see that status has updated to IN_PROGRESS
, why?
start transaction
// First get from database before status update
Order order = getOrderFromDatabase(orderId);
order.setState(IN_PROGRESS)
update(order)
/*
* some other operations
*/
// Second get from database after status update
Order order = getOrderFromDatabase(orderId);
sendOrder(order)
end transaction
To answer the above, why?
I need to read the postgres READ COMMITTED documentation a bit more carefully, in here it states as
Read Committed is the default isolation level in PostgreSQL.
When a transaction uses this isolation level,
a SELECT query (without a FOR UPDATE/SHARE clause) sees only
data committed before the query began;
it never sees either uncommitted data or changes committed
during query execution by concurrent transactions.
In effect, a SELECT query sees a snapshot of the database as
of the instant the query begins to run. However, SELECT does see
the effects of previous updates executed within its own transaction,
even though they are not yet committed.
So the answer to my question is exactly here, there is a different behavior for concurrent and own transactions. Let’s test that behavior in a multithreaded environment.
I need to create a test scenario to check the difference between concurrent and own transactions behavior. To do that I create below scenarios
First create the order and save it to the database. Then in a single thread(thread1
);
Wait 100 milliseconds
Log the status of the order
Set the order status to IN_PROGRESS
and update.
Get the updated order object and log the status
Let’s run the test testReadCommittedIsolationLevel_withSingleTransaction
, here is the logs;
thread1 is starting
thread1 - orderAfterInsert orderStatus= NEW
thread1 is updated
thread1 - orderAfterUpdate orderStatus= IN_PROGRESS
thread1 is committing
So it proves that SELECT
reads uncommitted changes in its own transaction, because orderStatus has changed from NEW
to IN_PROGRESS
after update.
Now let’s test it with a multithreaded environment. I use the same thread1
method with thread2
. Inside the thread2
;
IN_PROGRESS
and update.Let’s run the test testReadCommittedIsolationLevel_withMultipleThreads
, here is the logs;
2022-12-18 15:49:01.845 : thread2 is starting
2022-12-18 15:49:01.856 : thread2 - orderAfterInsert orderStatus= NEW
2022-12-18 15:49:01.859 : thread2 is updated
2022-12-18 15:49:01.862 : thread2 - orderAfterUpdate orderStatus= PROCESSED
2022-12-18 15:49:01.950 : thread1 is starting
2022-12-18 15:49:01.953 : thread1 - orderAfterInsert orderStatus= NEW
2022-12-18 15:49:02.365 : thread2 is committing
2022-12-18 15:49:02.368 : thread1 is updated
2022-12-18 15:49:02.371 : thread1 - orderAfterUpdate orderStatus= IN_PROGRESS
2022-12-18 15:49:02.372 : thread1 is committing
Let’s explain the logs step by step;
thread1
and thread2
nearly start at the same time.thread1
sleeps 100ms, in this time thread2 has updated orderStatus as PROCESSED
.thread1
starts and logs the orderStatus because thread2
has not committed yet thread1
logs it as NEW
, READ COMMITTED
applies.thread2
committed and orderStatus should be committed as PROCESSED
.thread1
updated the orderStatus and then logged it as IN_PROGRESS
5.1. That step is critical because thread2
committed and orderStatus should be PROCESSED
however thread1 updated the record 3ms
later as IN_PROGRESS
thread1
committed.Threads are a good way to test a multithreaded environment with integration tests. Relying on the official documentation of the frameworks, libraries or databases is the best practice on investigations.
Postgres transactions have different isolation level behavior based on its own transactions and concurrent transactions. Its own transaction they behave like
READ UNCOMMITTED
even if their default isolation level is READ COMMITTED
.
You can find the all project on Github
Spring transaction manager strategies
Spring Java doc default isolation level
What is the default isolation level of the Postgres database?
Happy coding :)