SQL Server: Optimizing Update Queries for Large Data Volumes
2014-04-02 08:29
Updating very large tables can be a time-consuming task, sometimes taking hours to finish. It can also cause blocking issues. Here are a few tips for optimizing updates on large data volumes:

- Removing the index on the column to be updated.
- Executing the update in smaller batches.
- Disabling delete triggers.
- Replacing the update statement with a bulk-insert operation.
With that being said, let's apply the above points to optimize an update query. The code below creates a dummy table with 200,000 rows and the required indexes.
CREATE TABLE tblverylargetable
(
    sno  INT IDENTITY,
    col1 CHAR(800),
    col2 CHAR(800),
    col3 CHAR(800)
)
GO

-- Insert 200,000 rows of fixed-width dummy data
DECLARE @i INT = 0
WHILE (@i < 200000)
BEGIN
    INSERT INTO tblverylargetable
    VALUES ('Dummy',
            REPLICATE('Dummy', 160),
            REPLICATE('Dummy', 160))
    SET @i = @i + 1
END
GO

CREATE INDEX ix_col1
ON tblverylargetable(col1)
GO

CREATE INDEX ix_col2_col3
ON tblverylargetable(col2)
INCLUDE(col3)
Consider the following update query, which is the one to be optimized. It is a very straightforward query that updates a single column.

UPDATE tblverylargetable
SET col1 = 'D'
WHERE col1 = 'Dummy'

The query takes 2:19 minutes to execute. The execution plan shows that, in addition to the clustered index update, the index ix_col1 is also updated. The index update and the Sort operation together take 64% of the execution cost.

1. Removing the index on the column to be updated

The same query takes 14-18 seconds when there is no index on col1. Thus, an update query runs faster if the column to be updated is not an index key column. The index can always be recreated once the update completes, as sketched below.
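The article does not show this step explicitly, so here is a minimal sketch of the drop-and-recreate pattern it describes, using the table and index defined above:

-- Drop the index on the column being updated...
DROP INDEX ix_col1 ON tblverylargetable

-- ...run the update without the index maintenance overhead...
UPDATE tblverylargetable
SET col1 = 'D'
WHERE col1 = 'Dummy'

-- ...then recreate the index once the update completes
CREATE INDEX ix_col1
ON tblverylargetable(col1)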
2. Executing the update in smaller batches

The query can be further optimized by executing it in smaller batches. This is generally faster. The code below updates the records in batches of 20,000.
DECLARE @i INT = 1
WHILE (@i <= 10)
BEGIN
    UPDATE TOP (20000) tblverylargetable
    SET col1 = 'D'
    WHERE col1 = 'Dummy'
    SET @i = @i + 1
END

The above query takes 6-8 seconds to execute. Another benefit of updating in batches is that if the update fails or needs to be stopped, only the rows from the current batch are rolled back.

3. Disabling delete triggers

Triggers with cursors can severely slow down the performance of a delete query. Disabling AFTER DELETE triggers for the duration of the operation will considerably improve query performance; a minimal sketch follows.
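The article gives no code for this step, so here is a minimal sketch, assuming a hypothetical AFTER DELETE trigger named trg_tblverylargetable_del on the demo table:

-- Disable the (hypothetical) delete trigger before the large operation
DISABLE TRIGGER trg_tblverylargetable_del ON tblverylargetable

-- ... run the large delete/update here ...

-- Re-enable the trigger afterwards
ENABLE TRIGGER trg_tblverylargetable_del ON tblverylargetable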
4. Replacing the update statement with a bulk-insert operation

An update statement is a fully logged operation, so it will take a considerable amount of time when millions of rows are to be updated. The fastest way to speed up the update is to replace it with a bulk-insert operation, which is minimally logged under the simple and bulk-logged recovery models. This can be done by bulk-inserting the corrected data into a new table and then renaming the new table to the original name; the required indexes and constraints can be created on the new table as needed (a sketch of the rename step follows the code). The code below shows how the update can be converted to a bulk-insert operation. It takes 4 seconds to execute.

SELECT sno,
       CASE col1
           WHEN 'Dummy' THEN 'D'
           ELSE col1
       END AS col1,
       col2,
       col3
INTO tblverylargetabletemp
FROM tblverylargetable
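The article stops at the SELECT ... INTO, so here is a minimal sketch of the swap it describes, assuming the old table can simply be dropped; sp_rename and the index definitions from the setup script are reused:

-- Swap the new table in for the original one
DROP TABLE tblverylargetable
EXEC sp_rename 'tblverylargetabletemp', 'tblverylargetable'

-- Recreate the required indexes on the renamed table
CREATE INDEX ix_col1
ON tblverylargetable(col1)

CREATE INDEX ix_col2_col3
ON tblverylargetable(col2)
INCLUDE(col3)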
The bulk-insert can then be further optimized for an additional performance boost.

Reference: http://www.sqlservergeeks.com/blogs/AhmadOsama/personal/450/sql-server-optimizing-update-queries-for-large-data-volumes