This content originally appeared on DEV Community and was authored by Franck Pachot
The previous post used the primary key to record read intents in YugabyteDB. Serializable isolation can be scalable, but it requires optimal schema and indexes with a good knowledge of how the database works. Indexes define the predicate locks for PostgreSQL Serializable Snapshot Isolation, and the primary key defines the range locks used by YugabyteDB's Two-Phase Commit.
Here an example taken from nathanl/postgresql_serializable_isolation.sql which simply inserts two rows from two transactions:
create table users(
id serial not null primary key,
username varchar not null
);
-- create index on users(username asc);
-- session 1
begin transaction isolation level serializable;
select * from users where username = 'alice';
insert into users ("username") values ('alice');
-- session 2
\! psql -c '\timing on' -c "begin transaction isolation level serializable; select * from users where username = 'bob'; insert into users ("username") values ('bob'); commit;" &
-- session 1
\! sleep 1
commit;
\! sleep 1
select * from users;
Let's examine the performance in PostgreSQL and YugabyteDB, with and without and index on "username"
PostgreSQL Seq Scan
Without an index on "username", the SELECT does a Sequential Scan. Session 2 can commit, but session 1 detects a conflict on commit:
COMMIT
Time: 2.844 ms
postgres=*# commit;
ERROR: could not serialize access due to read/write dependencies among transactions
DETAIL: Reason code: Canceled on identification as a pivot, during commit attempt.
HINT: The transaction might succeed if retried.
This serializable error is a false positive. The two sessions read different values, they are serializable. PostgreSQL employs predicate locks but only records the predicates used for the scan operation. Since it was read with a Seq Scan, the lock encompasses the entire table.
PostgreSQL Index Scan
I run the same with the index creation uncommented, and there's no error:
COMMIT
Time: 2.581 ms
postgres=*# commit;
COMMIT
postgres=# \! sleep 1
postgres=# select * from users;
id | username
----+----------
1 | alice
2 | bob
(2 rows)
In this case, PostgreSQL uses the index condition as a predicate lock, and they do not conflict between the two transactions.
Note that in this scenario, instead of creating an additional index, the "username" column should be declared unique. This creates an implicit index and has the same effect:
create table users(
id serial not null primary ley,
username varchar not null unique
);
YugabyteDB
YugabyteDB uses Wait-on-Conflict to prevent the serializable error, allowing both transactions to commit even without the index.
yugabyte=*# commit;
COMMIT
COMMIT
Time: 926.484 ms
I waited one second before committing to the first session, and this delay is reflected in the response time of the second session, which waited for the first one. The reason for this is similar to what we have observed with PostgreSQL: the read lock intent is at the table level.
This is visible as STRONG_READ at the level of the "relation," which is blocked by the insert operation that acquires a WEAK_WRITE lock on the relation:
yugabyte=*# select
substr(ybdetails->>'transactionid',1,8) transactionid , substr(ybdetails->>'blocked_by',1,8) blocked_by,
relkind, locktype, mode, relname , ybdetails->'keyrangedetails'->'cols'
from pg_locks natural join (select oid relation, relname, relkind from pg_class) pg_class
order by transactionid, relkind desc, relname, case locktype when 'relation' then 1 when 'keyrange' then 2 when 'row' then 3 end, mode desc, locktype desc;
transactionid | blocked_by | relkind | locktype | mode | relname | ?column?
---------------+------------+---------+----------+----------------------+---------+----------
74e1329c | | r | relation | WEAK_READ,WEAK_WRITE | users | null
74e1329c | | r | relation | STRONG_READ | users | null
74e1329c | | r | row | STRONG_WRITE | users | ["1"]
74e1329c | | r | row | STRONG_READ | users | ["1"]
c5611a22 | ["74e132 | r | relation | STRONG_READ | users | null
The index will add the range lock intents in the index, but there's still a conflict with the table lock intents at "relation" level:
yugabyte=*# select
substr(ybdetails->>'transactionid',1,8) transactionid , substr(ybdetails->>'blocked_by',1,8) blocked_by,
relkind, locktype, mode, relname , ybdetails->'keyrangedetails'->'cols'
from pg_locks natural join (select oid relation, relname, relkind from pg_class) pg_class
order by transactionid, relkind desc, relname, case locktype when 'relation' then 1 when 'keyrange' then 2 when 'row' then 3 end, mode desc, locktype desc;
transactionid | blocked_by | relkind | locktype | mode | relname | ?column?
---------------+------------+---------+----------+----------------------+--------------------+---------------------------------------------
02a108e3 | | r | relation | WEAK_READ,WEAK_WRITE | users | null
02a108e3 | | r | relation | STRONG_READ | users | null
02a108e3 | | r | row | STRONG_WRITE | users | ["1"]
02a108e3 | | r | row | STRONG_READ | users | ["1"]
02a108e3 | | i | relation | WEAK_WRITE | users_username_idx | null
02a108e3 | | i | keyrange | WEAK_WRITE | users_username_idx | ["\"alice\""]
02a108e3 | | i | row | STRONG_WRITE | users_username_idx | ["\"alice\"", "\"H\\x80\\x00\\x00\\x01!\""]
5b82a9e3 | ["02a108 | r | relation | STRONG_READ | users | null
As a consequence of it, simply adding the index in YugabyteDB doesn't reduce the wait. If we want to minimize the latency for this use case, it is preferable to define the username as a primary key rather than the generated identifier:
create table users(
id serial not null unique,
username varchar not null primary key
);
In PostgreSQL, table rows are stored in heap tables, and all indexes, including the primary key, are considered secondary indexes. However, in YugabyteDB, the rows are stored in the primary key LSM Tree, so it's essential to define the primary key for the main access pattern. There are no limitations because all secondary indexes are global and shared on their key. Another reason the primary key matters in YugabyteDB is to improve serializable locks. In PostgreSQL, even if the read intents are defined by the scan predicates, they are stored in memory. On the other hand, YugabyteDB is distributed and doesn't use a single-node shared memory, so it stores the intents with the table row in the primary key index.
With such a definition, both transactions can be committed without waiting for the other:
COMMIT
Time: 147.917 ms
yugabyte=*# commit;
COMMIT
Here are the locks just before the commits:
transactionid | blocked_by | relkind | locktype | mode | relname | ?column?
---------------+------------+---------+----------+----------------------+--------------+-----------------
350a076c | | r | relation | WEAK_READ,WEAK_WRITE | users | null
350a076c | | r | relation | WEAK_READ | users | null
350a076c | | r | row | STRONG_WRITE | users | ["\"alice\""]
350a076c | | r | row | STRONG_READ | users | ["\"alice\""]
350a076c | | r | row | STRONG_READ | users | ["\"alice\""]
350a076c | | i | relation | WEAK_READ,WEAK_WRITE | users_id_key | null
350a076c | | i | keyrange | WEAK_READ,WEAK_WRITE | users_id_key | ["1"]
350a076c | | i | row | STRONG_WRITE | users_id_key | ["1", "null"]
350a076c | | i | row | STRONG_READ | users_id_key | ["1", "null"]
97f5bd76 | | r | relation | WEAK_READ,WEAK_WRITE | users | null
97f5bd76 | | r | relation | WEAK_READ | users | null
97f5bd76 | | r | row | STRONG_WRITE | users | ["\"bob\""]
97f5bd76 | | r | row | STRONG_READ | users | ["\"bob\""]
97f5bd76 | | r | row | STRONG_READ | users | ["\"bob\""]
97f5bd76 | | i | relation | WEAK_READ,WEAK_WRITE | users_id_key | null
97f5bd76 | | i | keyrange | WEAK_READ,WEAK_WRITE | users_id_key | ["101"]
97f5bd76 | | i | row | STRONG_WRITE | users_id_key | ["101", "null"]
97f5bd76 | | i | row | STRONG_READ | users_id_key | ["101", "null"]
This situation does not involve any blocking. The STRONG_READ intents only affect individual rows, and each transaction accesses different rows.
As we discussed in the previous blog post, the Serializable isolation level can be scalable. However, like many other aspects of SQL database performance, its performance depends on the definition of primary keys and secondary indexes.
This content originally appeared on DEV Community and was authored by Franck Pachot
Franck Pachot | Sciencx (2024-10-19T21:28:17+00:00) Indexing for a Scalable Serialization Isolation Level. Retrieved from https://www.scien.cx/2024/10/19/indexing-for-a-scalable-serialization-isolation-level/
Please log in to upload a file.
There are no updates yet.
Click the Upload button above to add an update.