您的位置:首页 > 数据库

Advanced SQL Server Locking

2005-04-11 22:37 239 查看
I thought I knew SQL Server pretty well. I've been using the product for more than 6 years now, and I like to know my tools from the inside out.
While teaching a SQL Server programming course, I noticed that the Microsoft material presented a lock compatibility table. The same table is presented at MSDN.
Looking at this table I wondered -- is there no Intent Update lock? This led me to research locking. This article is the result of the research. I wrote this article with a specific reader in mind -- someone who understands isolation levels, intent locks, deadlocks, and lock granularity. If you don't have a good background in these areas, you might want to read up on these subjects before reading this article.
My hope is to expand your understanding of SQL Server locking, and maybe teach you a few tricks you can use when programming SQL Server.
I have to say that you can happily work with SQL Server for a long time, and never need to know how it locks its resources, and still produce high-quality code and database schemas. But, if you’re like me, and like to know the inside of things, or you work with a system that requires that little extra in performance, I might teach you something useful.

Update Locks
I'm sure you've heard of deadlocks where spid X holds a lock on resource A and waits to lock resource B, while spid Y holds the lock on B and is waiting for resource A to be freed. If not, surf over to MSDN and read more on the topic.
Well, guess what, there are more types of deadlocks. Consider this situation: Spid X has a shared lock on resource A, and so has spid Y. This is no problem, since two shared locks are compatible with each other. Now X wants to convert the shared lock to an exclusive lock to update the resource. X has to wait for Y to release the shared lock to do this, and while X waits, Y decides to do the same and convert its lock to an exclusive lock. X is waiting for Y, and Y is waiting for X. Deadlock. These kinds of deadlocks are called conversion deadlocks.
This is a fairly common situation, and to avoid it, update locks were introduced. Update locks allow a connection to read a resource, and at the same time, advertise its intention to lock it at some later point in time in order to edit the resource. An issue with update locks is that SQL Server doesn't know in advance if a transaction will want to convert a shared lock to an exclusive lock, unless it's a single SQL statement, like an UPDATE statement, that first has to read the table before it can update it. To use update locks in a SELECT-statement, you must explicitly specify that you want SQL Server to use them, and you do this with the lock hint UPDLOCK.
I like showing code examples of what I'm talking about, so here goes. In one window in QA, I run this batch:


Note that I open a transaction, but I don't close it. This is to hold on to the locks inside the transaction.
If another connection would try to get another update lock on the same row, it would be put on hold until the first connection releases its lock -- two update locks on the same resource are incompatible. The result from SP_LOCK, showing only the relevant rows and columns and looks like this:


As expected, the key we are locking, OrderId, is locked with an Update Lock. The strange value in the Resource column (89003da47679) is the hashed value of our key, 10633. Internally, SQL Server uses a hash table as the lock table, and this is what we see here.
The page containing this row is, also as expected, locked with an Intent Update lock. The value in the resource column (1:242) tells us that the page is placed on the first file of the database, and is the page #242 in that file. And lastly, not as expected, SQL Server has taken an IX lock on the table level. SQL Server will never use U/IU locks on the table level, so we'll only see X/IX locks on that level.
When you do an update with a where clause, SQL Server has to read the table and/or any indexes to determine which rows to change. Before it reads anything from the table/index, SQL Server first locks the object it's accessing. Since SQL Server knows you've issued an update, it will choose an update lock, and not a shared lock. This is to prevent the kind of deadlock I talked about earlier ? a conversion deadlock.
When SQL Server has found the row/rows to change, it will convert the update lock to an exclusive lock on the row -- on a RID if it's a heap, or on the KEY on a clustered index table. This means that normally you won’t see the update locks ‘when executing an UPDATE-statement’–- they are converted almost immediately to exclusive locks.
But, of course, there are some exceptions to this rule. If SQL Server uses one index to find the rows, it will lock the index pages and the index keys with update locks. If the update doesn’t change any of the columns contained in the index, the update locks won’t be converted to exclusive locks. I’ll give you an example:


Region is a heap table with a non-clustered primary key unique index on RegionId. So, to fulfill this query, SQL Server will traverse the index on RegionId, locking index pages and index keys with update locks on the way. Finally, it will find the row to change. Since the update statement doesn’t change RegionId, it doesn’t have to take an exclusive lock on the index. The output from SP_LOCK looks like this:


We see an IX lock on the table, and an X lock on the RID we’re working on. The KEY lock we see is on the RegionId index. This is evident on the Indid column. As you can see, we still have an update lock on the index. This is one of the few times you’ll see an update lock live.
And, to finish it off, we have two page locks ? one on an index page (1:306), and one on the heap (1:300). How do I know this? Look at the Indid column. Index id 0 is the always the heap.

Lock Granularity
SQL Server has a number of lock types, but it can also choose different granularities on the locks.
If you've run SP_LOCK any time at all, or looked in Enterprise Manager’s current activity, you're sure to have seen at least four or five different lock types. I'll go through each of them quickly.
Database (DB): This is a session lock -- i.e., it's not connected to any transaction, only to a user connected to a specific DB. This is to prevent a database being dropped while one or more users are connected to it. Note, however that SQL Server knows that master and tempdb can't be dropped, and so it doesn't take a DB lock on these databases.

Table (TAB): This is the coarsest logical lock SQL Server can use. Often you'll find intent locks on this level. (Feel insecure about intent locks? Read more here.)

Extent (EXT): These locks are not, as some believe, a lock used to lock logical rows, but is used when SQL Server is about to create new tables, or expand existing ones, and you might also see it when a file is grown.

Page (PAG): When SQL Server needs to lock a lot of rows at the same time, and the available lock slots are running out, it might choose to use page locks. More often, you'll see intent locks on the page level. Up to, and including SQL Server 6.5, this was the finest lock there was.

Key (KEY): The finest level of locking possible in SQL Server, together with the RID lock. KEY locks are used on indexes, and RID locks on heaps.
Now, when studying locking behavior in SQL Server 2000, I've found that SQL Server, most of the time, will value concurrency higher than speed. Having high concurrency means that many users can work at the same time. This is done with as small locks as possible, so as not to unnecessarily lock out other users from the data. High speed, on the other hand, can be achieved by using larger locks, which is faster than getting many smaller ones.
SQL Server 2000 can also escalate your locks, if it notices that you are locking more and more rows. What SQL Server does in this case is that it gets a table lock, and drops all the individual locks on the pages/keys/RIDs. Note -- all escalations are to table locks. SQL Server will not upgrade to page locks from RID/KEY locks.
When does SQL Server 2000 escalate your locks? Well, it has nothing to do with how large percentage of the table you are locking; the only thing that matters is how many locks are used in the system as a whole. When locks are using a large enough percentage of the memory, SQL Server 2000 will try to escalate the locks on all transaction of all connections. It will also try to escalate your locks if the lock slots are running out. You can try this yourself by lowering the number of lock slots SQL Server will use, with the procedure SP_CONFIGURE.
SQL Server will try to keep high concurrency by using as small locks as possible. But sometimes you know things about your data that SQL Server doesn’t, and this information changes which locking level you want to use. An example is a large look-up table, where only reads are done. Instead of getting a lot of key-locks, you want to lock the whole table in one go. Your options here are locking hints or SP_INDEXOPTION.
Locking hints are well known, and are well-documented in BOL, so I won't repeat them here. The system stored procedure SP_INDEXOPTION is a nice way to force SQL Server to use a specific size on the locks.

With SP_INDEXOPTION, you can turn off locking on the row or page level. This means that you don't need locking hints -- all locks on the table/index will be of the size you specify. Even though BOL claims that the procedure is used to choose locking granularities on indexes, it can also be used on a heap. Use the name of the table as the @IndexNamePattern variable. Very nice and fairly unknown.
The number of factors that matter do not end there though. If you are using one of the two higher isolation levels, and don't have any usable indexes on the selection criteria, SQL Server is forced to lock most, if not the whole table, to satisfy your query. I'll give you an example here.


We see an IX lock on the table, and an X lock on the RID we’re working on. The KEY lock we see is on the RegionId index. This is evident on the Indid column. As you can see, we still have an update lock on the index. This is one of the few times you’ll see an update lock live.
And, to finish it off, we have two page locks ? one on an index page (1:306), and one on the heap (1:300). How do I know this? Look at the Indid column. Index id 0 is the always the heap.

Lock Granularity
SQL Server has a number of lock types, but it can also choose different granularities on the locks.
If you've run SP_LOCK any time at all, or looked in Enterprise Manager’s current activity, you're sure to have seen at least four or five different lock types. I'll go through each of them quickly.
Database (DB): This is a session lock -- i.e., it's not connected to any transaction, only to a user connected to a specific DB. This is to prevent a database being dropped while one or more users are connected to it. Note, however that SQL Server knows that master and tempdb can't be dropped, and so it doesn't take a DB lock on these databases.

Table (TAB): This is the coarsest logical lock SQL Server can use. Often you'll find intent locks on this level. (Feel insecure about intent locks? Read more here.)

Extent (EXT): These locks are not, as some believe, a lock used to lock logical rows, but is used when SQL Server is about to create new tables, or expand existing ones, and you might also see it when a file is grown.

Page (PAG): When SQL Server needs to lock a lot of rows at the same time, and the available lock slots are running out, it might choose to use page locks. More often, you'll see intent locks on the page level. Up to, and including SQL Server 6.5, this was the finest lock there was.

Key (KEY): The finest level of locking possible in SQL Server, together with the RID lock. KEY locks are used on indexes, and RID locks on heaps.
Now, when studying locking behavior in SQL Server 2000, I've found that SQL Server, most of the time, will value concurrency higher than speed. Having high concurrency means that many users can work at the same time. This is done with as small locks as possible, so as not to unnecessarily lock out other users from the data. High speed, on the other hand, can be achieved by using larger locks, which is faster than getting many smaller ones.
SQL Server 2000 can also escalate your locks, if it notices that you are locking more and more rows. What SQL Server does in this case is that it gets a table lock, and drops all the individual locks on the pages/keys/RIDs. Note -- all escalations are to table locks. SQL Server will not upgrade to page locks from RID/KEY locks.
When does SQL Server 2000 escalate your locks? Well, it has nothing to do with how large percentage of the table you are locking; the only thing that matters is how many locks are used in the system as a whole. When locks are using a large enough percentage of the memory, SQL Server 2000 will try to escalate the locks on all transaction of all connections. It will also try to escalate your locks if the lock slots are running out. You can try this yourself by lowering the number of lock slots SQL Server will use, with the procedure SP_CONFIGURE.
SQL Server will try to keep high concurrency by using as small locks as possible. But sometimes you know things about your data that SQL Server doesn’t, and this information changes which locking level you want to use. An example is a large look-up table, where only reads are done. Instead of getting a lot of key-locks, you want to lock the whole table in one go. Your options here are locking hints or SP_INDEXOPTION.
Locking hints are well known, and are well-documented in BOL, so I won't repeat them here. The system stored procedure SP_INDEXOPTION is a nice way to force SQL Server to use a specific size on the locks.

With SP_INDEXOPTION, you can turn off locking on the row or page level. This means that you don't need locking hints -- all locks on the table/index will be of the size you specify. Even though BOL claims that the procedure is used to choose locking granularities on indexes, it can also be used on a heap. Use the name of the table as the @IndexNamePattern variable. Very nice and fairly unknown.
The number of factors that matter do not end there though. If you are using one of the two higher isolation levels, and don't have any usable indexes on the selection criteria, SQL Server is forced to lock most, if not the whole table, to satisfy your query. I'll give you an example here.


Now, in another window, check the output of sp_lock. On my computer, I see 853 locks corresponding to the connection on which I ran the UPDATE. The Orders table in Northwind has 830 rows, so I've locked every row. Rollback the transaction, and try reissuing the update, after creating an index, like this.


Now, the SP_LOCK output only shows 25 locks. This is a part of performance hunting often overlooked. Even if you are just using READ COMMITTED, the default isolation level, you'll see a dramatic difference -- going from 136 locks to 24 by creating the index.

Locking Trace Flags
There are a few trace flags that can help you fine tune your locking and find deadlock problems.
Trace flags are used to turn on or off certain behaviors in SQL Server. You can set a trace flag using DBCC TRACEON, and if you want a trace flag to be turned on every time SQL Server starts, you can specify it as a start parameter using ?T?
1200: Shows all locks acquired by all connections. Very voluminous output, my recommendation would be to use this in a controlled environment, where only a single connection is working at the same time.

1204: Outputs a lot of information about the participants in a deadlock. This is an example, beautified for your pleasure.


KEY: means that the resource we're deadlocking on is an index key. It could also be any of the other sizes of locks, like page, RID, table, etc.
ECID is taken from master.dbo.sysprocesses. It’s used to distinguish between the locks taken by different threads. Mode is the requested mode on the deadlocking lock, like S, X or U.
The string "6:885578193:2" means: database with database id 6, object with id 885578193 and the index with index id 2. The number inside the parentheses is the hash value used to recognize the lock. You can see this in the column rsc_text in master.dbo.syslockinfo. Unfortunately, this is a one-way hash, which means that you can't find out the row locked with only the hash value. Spid is, of course, the spid holding the lock.
The way this output should be read is by first looking at the two nodes, Node 1 & 2, which show you the locks already granted and involved in the deadlock. Then you see the two locks that are in the wait queue, marked with “Requested By:?
1205: Prints out information about the lock manager’s work. Every time a deadlock search is initiated, this trace flag makes the lock manager print out information about the search. Only works if trace flag 1204 has been given.

1211: Disable all lock escalation. This trace flag forces the lock manager not to escalate locks, even if it’s running out of locks.

Column Locks
As you probably know, the smallest unit of locking in SQL Server 2000 is the rowlock. Column locking is not directly available from SQL Server. Here, I’ll show you how you can use index locking to mimic column level locks.
Column locks are often deemed as too slow for practical purposes, and SQL Server is no exception. But since a rowlock doesn’t automatically mean that the index on the table is locked, you can always use the data on the index pages to work with data that’s really locked. Again, I’ll explain using the Region table in the Northwind database.
The Region table is a heap, with two columns, RegionDescription and RegionId. A unique, non-clustered index exists on the RegionId column.
What we’ll do now, is to change the RegionDescription on one of the rows, with a simple update, like this:


To satisfy this query, SQL Server can’t use the index -- the index doesn’t cover the RegionDescription column. So SQL Server will use a full table scan to find the row to change. Once it finds the row, it will upgrade the update lock on that row to an exclusive lock. Just to make sure, run SP_LOCK in another Query Analyzer window. There should be an exclusive RID lock on the correct object. In the same window that you ran SP_LOCK, you can try issuing a select, like this:


Unless you've specified READPAST, or set the transaction isolation level to read uncommitted, you should be put "on hold" waiting for the first connection to release its lock.

This is just as you would expect. But, here's the trick:


This time, you won't be kept waiting. If you're like me, you like looking at execution plans, and the execution plan to this query explains why we don't have to wait.


As you can see, SQL Server chooses an index scan to get the data it needs to answer your query. And since you asked for information that could be answered by reading an index, it doesn’t need to read the data on the heap. This kind of query is called a covering query.
Two criteria exist for this to happen. The first is that all the data needed by the query is contained in the index. Remember that if the table has a clustered index, all non-clustered indexes will have the index columns and the clustered index columns in it.
The second is that the original UPDATE mustn’t change any of the columns contained in the index. If the index is changed, it must be exclusively locked and this trick won’t work.

Extended Lock Capability Table
The tables currently available in BOL and MSDN aren't very comprehensive when it comes to which locks are compatible with each other. Because of this, I have included a more comprehensive table below. I hope you find it useful.



Conclusion
I did find the elusive intent update lock, and a lot more while searching for it. Locks and locking behavior are poorly documented in BOL, which has made this research a lot harder than it had to be. I’ve learned a lot writing this article, and I hope I’ve shared a little of that knowledge with you.

BIO
Andrés Taylor tries to give equal amounts of attention to his favorite pastimes: SQL Server studying, Brazilian Jiu-Jitsu, and his beautiful wife Mimi. He holds 20 MCPs, is a MCT and works at Dotway as a .NET/SQL Server consultant.

Published with the express written permission of the author. Copyright 2003



KEY: means that the resource we're deadlocking on is an index key. It could also be any of the other sizes of locks, like page, RID, table, etc.
ECID is taken from master.dbo.sysprocesses. It’s used to distinguish between the locks taken by different threads. Mode is the requested mode on the deadlocking lock, like S, X or U.
The string "6:885578193:2" means: database with database id 6, object with id 885578193 and the index with index id 2. The number inside the parentheses is the hash value used to recognize the lock. You can see this in the column rsc_text in master.dbo.syslockinfo. Unfortunately, this is a one-way hash, which means that you can't find out the row locked with only the hash value. Spid is, of course, the spid holding the lock.
The way this output should be read is by first looking at the two nodes, Node 1 & 2, which show you the locks already granted and involved in the deadlock. Then you see the two locks that are in the wait queue, marked with “Requested By:?
1205: Prints out information about the lock manager’s work. Every time a deadlock search is initiated, this trace flag makes the lock manager print out information about the search. Only works if trace flag 1204 has been given.

1211: Disable all lock escalation. This trace flag forces the lock manager not to escalate locks, even if it’s running out of locks.

Column Locks
As you probably know, the smallest unit of locking in SQL Server 2000 is the rowlock. Column locking is not directly available from SQL Server. Here, I’ll show you how you can use index locking to mimic column level locks.
Column locks are often deemed as too slow for practical purposes, and SQL Server is no exception. But since a rowlock doesn’t automatically mean that the index on the table is locked, you can always use the data on the index pages to work with data that’s really locked. Again, I’ll explain using the Region table in the Northwind database.
The Region table is a heap, with two columns, RegionDescription and RegionId. A unique, non-clustered index exists on the RegionId column.
What we’ll do now, is to change the RegionDescription on one of the rows, with a simple update, like this:


To satisfy this query, SQL Server can’t use the index -- the index doesn’t cover the RegionDescription column. So SQL Server will use a full table scan to find the row to change. Once it finds the row, it will upgrade the update lock on that row to an exclusive lock. Just to make sure, run SP_LOCK in another Query Analyzer window. There should be an exclusive RID lock on the correct object. In the same window that you ran SP_LOCK, you can try issuing a select, like this:


Unless you've specified READPAST, or set the transaction isolation level to read uncommitted, you should be put "on hold" waiting for the first connection to release its lock.

This is just as you would expect. But, here's the trick:


This time, you won't be kept waiting. If you're like me, you like looking at execution plans, and the execution plan to this query explains why we don't have to wait.


As you can see, SQL Server chooses an index scan to get the data it needs to answer your query. And since you asked for information that could be answered by reading an index, it doesn’t need to read the data on the heap. This kind of query is called a covering query.
Two criteria exist for this to happen. The first is that all the data needed by the query is contained in the index. Remember that if the table has a clustered index, all non-clustered indexes will have the index columns and the clustered index columns in it.
The second is that the original UPDATE mustn’t change any of the columns contained in the index. If the index is changed, it must be exclusively locked and this trick won’t work.

Extended Lock Capability Table
The tables currently available in BOL and MSDN aren't very comprehensive when it comes to which locks are compatible with each other. Because of this, I have included a more comprehensive table below. I hope you find it useful.



Conclusion
I did find the elusive intent update lock, and a lot more while searching for it. Locks and locking behavior are poorly documented in BOL, which has made this research a lot harder than it had to be. I’ve learned a lot writing this article, and I hope I’ve shared a little of that knowledge with you.

BIO
Andrés Taylor tries to give equal amounts of attention to his favorite pastimes: SQL Server studying, Brazilian Jiu-Jitsu, and his beautiful wife Mimi. He holds 20 MCPs, is a MCT and works at Dotway as a .NET/SQL Server consultant.

Published with the express written permission of the author. Copyright 2003
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: