In this write up I have tried to give a very basic overview of DB2
Locking mechanism and Lock avoidance.
Maintaining data concurrency is one of the most important aspects
of DB2. DB2 uses locking to ensure that data can be shared by multiple
applications without compromising on data integrity.
To achieve locking DB2 uses IBM's IRLM. The disadvantage of using
IRLM is that it is a very cumbersome and time consuming process. I have
tried to explain in a simplistic way, how locking requests are
processed on behalf of DB2. DBM1 places a cross system call to IRLM
through MVS. IRLM Processes this request and hands the result back to
MVS and MVS hands that result back to DBM1.
To avoid doing a costly call to the IRLM, DB2 has some innovative
methods to determine if the data page or the row it is trying to access
is clean without going to the IRLM..
If the page has consistent data, then it need not go to the IRLM
and ask for a costly lock. This process is called lock
Generally you enable lock avoidance by the BIND parameter
CURRENTDATA(NO) and ISOLATION(CS).
There are two tests that are performed by DB2 to determine if it can
avoid taking a lock.
1. CLSN test
2. PUNC test
Before we go into the CLSN test we need to understand a few
RBA or the Relative Byte
Address is the DB2 equivalent of the human calendar/clock. How
ever, unlike our clock that keeps ticking continuously, the RBA ticks
only if there are log records that need to be written. You can think of
RBA as a simple counter that starts at 0 and is incremented by 1 when
ever there is a log record that needs to be written.
You can also think of RBA as an ever increasing counter that keeps
what time (in DB2 terms) each change took place (each log record was
Page Log RBA is recorded
in the page header of every page. Page Log RBA is the last RBA at which
any changes to that particular page were committed. Let us assume that
we have a table space TS1 (with DB2 Internal identifiers DBID=X, PSID=Y
and OBID=Z)and an application program DML(Insert, update, Delete)
wants to update a row in page 100. Let us further assume that DB2 Reads
the Page at time T1(say a log RBA value of 000000123000), and does some
program logic that takes 5 seconds. Next it updates the actual row in
that page at T1+5 seconds (say a log RBA value of 0000001230A3)
and it does more program logic. Finally it comes back after an
other 12 seconds and issues an implicit or an explicit commit at time
T1+5+12 seconds (say a log RBA value of 000000123456A). At this point,
DB2 updates a register in the page header of Page 100 with the
value 000000123456A. This is the Page Log RBA of Page 100.
CLSN or Commit Log Sequence Number
for a particular object is the Start
RBA of the oldest, still active but not yet committed Unit of
Recovery for that object.
You can think of this as a simple table that is held in DB2's memory
which has two columns. Column 1 identifies each object for which
there is an active and uncommitted Unit of Recovery and Column 2
identifies the START RBA of the oldest, uncommitted Unit of
Let us assume that there are two programs A and B that are updating a
table space TS2 (with DB2 Internal identifiers DBID=X, PSID=Y and
Program A wants to update Page number 325 and Program B wants to update
Page 129. Program A does an update on Page 325 at RBA 123450 and
goes to sleep without issuing COMMIT, Program B, on the other hand,
updates page 129 at RBA 123455, then goes on to update Page 871 at RBA
12345E and issues a COMMIT. The CLSN for the page set X,Y,Z is
still 123450 because it is the oldest, still active Unit of Recovery.
Now for the CLSN test. When DB2 wants to acquire a lock on a page it
compares the Page Log RBA of that page with the CLSN for that
object. If Page Log RBA is lesser than the CLSN then it means that all
the committed changes to this page have been written to this page and
hence DB2 does not need a lock.
In the figure above if DB2 wants to take a lock on DBID X, PSID
Y, OBID Z Page 100 it cannot use lock avoidance because the Page
Log RBA of Page 100 (12346A) is greater than the CLSN for Object X,Y,Z
(123450). This is because there is a chance that there might be some
changes made to this object by the Unit of Recovery that started at
123450 that have still
not been written to page 100. This page might have inconsistent
data. So, for Page 100 the CLSN test fails.
On the other hand if DB2 wants a lock on DBID U, PSID V, OBID W
Page 2127 it can use lock avoidance as the Page Log RBA (12345B)
is lesser than the CLSN for the object U,V,W (123FA0). So, DB2
can be sure that all the changes made to this page have already been
committed and there is no chance of any other changes in the pipe that
might be written to this page. For Page 2127 the CLSN test is
successful. Same is true for Page 45.
When CLSN test succeeds, DB2 uses lock avoidance. When CLSN test fails,
DB2 does an other test called PUNC test. The PUNC test operates at the
PUNC bit stands for Possibly Un Committed bit.
Every Data Row in the data page has a row header. One of the bits in
the row header is the PUNC bit. This PUNC bit is set every time an
update activity is done to that row.
It is important to understand that there is a chance that a data row
that has been committed can still have it's PUNC bit set. This is
because the PUNC bits are not reset as soon as the changes are
For a given Data page, the PUNC bit is reset (set to 0) if
1. The Unit of Recovery that is doing the update to the page is running
with Page or Row locking.
2. If the Page Log RBA is Less than the CLSN for the Page set.
3. If at least 25 % of the rows in the page have their PUNC bits set.
For a given Row, the PUNC bit is reset just for the row being read
(Assuming the CLSN test succeeded for that page).
When the CLSN test fails, DB2 does the PUNC test. It takes a
at the row header to see if the PUNC bit is 0. If it is not 0
the PUNC test also fails and DB2 has to do a costly cross system call
to the IRLM.