您的位置:首页 > 其它

[原]由clob引发的性能问题所想到的

2010-01-02 18:52 429 查看
2009年最后一天下午比较闲,于是在生产库做了AWR report,看看生产库的性能如何,发现有一条Top SQL性能很差(貌似性能调优实例的开场白都是这样的),不多说了,上图看吧:













从这张图来看最Top 的SQL就是那条“ SELECT ‘receivedoc’,( ”了。

语句摘录如下:

SELECT
'receivedoc',
(
select COUNT(*) from
(
select
F_CURACTANDUSERIDLIST,
CAST(SUBSTR(SUBSTR(F_CURACTANDUSERIDLIST, 0, INSTR(F_CURACTANDUSERIDLIST, '*599;') - 1),
INSTR(SUBSTR(F_CURACTANDUSERIDLIST, 0, INSTR(F_CURACTANDUSERIDLIST, '*599;')), ';', -1) + 1)
as VARCHAR2(4000)) as F_ActId from TD_ReceiveDoc
) A
join Td_Activity B
on A.F_ActId = B.F_Id
where dbms_lob.instr(F_CURACTANDUSERIDLIST, '*599;', 1, 1 ) > 0
) AS ReceiveDoc,
'senddoc',
(
SELECT COUNT(*) FROM TD_SendDoc
WHERE DBMS_LOB.INSTR(F_CuractAndUserIDList, '*599;', 1, 1) > 0
) AS SendDoc,
'message',
(
SELECT COUNT(*) FROM TD_PublicInfo
WHERE dbms_lob.instr(', ' || F_NotifyUserIDList || ', ', ', 599, ', 1, 1 ) > 0
and F_Type = 0 and F_CurrentStatus = 4 AND F_UserID <> 599
and ((F_HadReadedIDList||', ' not like '%, 599, %' or F_HadReadedIDList || 'A' = 'A')
AND (SELECT COUNT(*) FROM TD_ReadPerson WHERE F_PublicID = TD_PublicInfo.F_ID AND F_UserID = 599) = 0)
) AS Message,
'bulletin',
(
SELECT COUNT(*) FROM TD_PublicInfo
WHERE F_Type = 1 and F_CurrentStatus = 4
and (F_HadReadedIDList||', ' not like '%, 599, %' or F_HadReadedIDList || 'A' = 'A')
AND F_UserID <> 599
AND (
SELECT COUNT(*) FROM TD_ReadPerson
WHERE F_PublicID = TD_PublicInfo.F_ID AND F_UserID = 599) = 0
) AS Affiche,
'email',
(
SELECT COUNT(*) FROM TD_EMail
WHERE F_Status=1 and F_IsRead <> '0'
and F_UserID=599 and trunc(F_MailTime) <= trunc(SYSDATE)
) AS EMail,
'publication',
(
SELECT COUNT(*) FROM TD_PublicInfo
WHERE F_Type = 3 and F_CurrentStatus = 4
and ((F_HadReadedIDList||', ' not like '%, 599, %' or F_HadReadedIDList || 'A' = 'A')
AND dbms_lob.instr(', ' || F_NotifyUserIDList || ', ', ', 599, ', 1, 1 ) > 0)
and F_UserID <> 599 AND (SELECT COUNT(*) FROM TD_ReadPerson
WHERE F_PublicID = TD_PublicInfo.F_ID AND F_UserID = 599) = 0
) AS InsidePublication
FROM DUAL


执行几乎和统计信息如下:

----------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                           |     1 |       |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE              |                           |     1 |   394 |            |          |
|   2 |   NESTED LOOPS               |                           |   685 |   263K|  1029   (1)| 00:00:13 |
|*  3 |    TABLE ACCESS FULL         | TD_RECEIVEDOC             |   685 |   260K|  1029   (1)| 00:00:13 |
|*  4 |    INDEX UNIQUE SCAN         | SYS_C0015018              |     1 |     4 |     0   (0)| 00:00:01 |
|   5 |  SORT AGGREGATE              |                           |     1 |   295 |            |          |
|*  6 |   TABLE ACCESS FULL          | TD_SENDDOC                |    80 | 23600 |   160   (1)| 00:00:02 |
|   7 |  SORT AGGREGATE              |                           |     1 |   507 |            |          |
|*  8 |   FILTER                     |                           |       |       |            |          |
|*  9 |    TABLE ACCESS FULL         | TD_PUBLICINFO             |     2 |  1014 |   133   (1)| 00:00:02 |
|  10 |    SORT AGGREGATE            |                           |     1 |     9 |            |          |
|* 11 |     INDEX RANGE SCAN         | IDX_TDREADPERSON_PUB_USER |     1 |     9 |     3   (0)| 00:00:01 |
|  12 |  SORT AGGREGATE              |                           |     1 |    22 |            |          |
|* 13 |   FILTER                     |                           |       |       |            |          |
|* 14 |    TABLE ACCESS FULL         | TD_PUBLICINFO             |    38 |   836 |   133   (1)| 00:00:02 |
|  15 |    SORT AGGREGATE            |                           |     1 |     9 |            |          |
|* 16 |     INDEX RANGE SCAN         | IDX_TDREADPERSON_PUB_USER |     1 |     9 |     3   (0)| 00:00:01 |
|  17 |  SORT AGGREGATE              |                           |     1 |    16 |            |          |
|* 18 |   TABLE ACCESS BY INDEX ROWID| TD_EMAIL                  |    13 |   208 |  9419   (1)| 00:01:54 |
|* 19 |    INDEX RANGE SCAN          | IDX_TDEMAIL_USERID        |  9987 |       |    23   (0)| 00:00:01 |
|  20 |  SORT AGGREGATE              |                           |     1 |   507 |            |          |
|* 21 |   FILTER                     |                           |       |       |            |          |
|* 22 |    TABLE ACCESS FULL         | TD_PUBLICINFO             |     2 |  1014 |   133   (1)| 00:00:02 |
|  23 |    SORT AGGREGATE            |                           |     1 |     9 |            |          |
|* 24 |     INDEX RANGE SCAN         | IDX_TDREADPERSON_PUB_USER |     1 |     9 |     3   (0)| 00:00:01 |
|  25 |  FAST DUAL                   |                           |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("DBMS_LOB"."INSTR"("F_CURACTANDUSERIDLIST",'*599;',1,1)>0)
4 - access("B"."F_ID"=TO_NUMBER(CAST(SUBSTR(SUBSTR("F_CURACTANDUSERIDLIST",0,INSTR("F_CURACTAND
USERIDLIST",'*599;')-1),INSTR(SUBSTR("F_CURACTANDUSERIDLIST",0,INSTR("F_CURACTANDUSERIDLIST",'*599
;')),';',(-1))+1) AS VARCHAR2(4000))))
6 - filter("DBMS_LOB"."INSTR"("F_CURACTANDUSERIDLIST",'*599;',1,1)>0)
8 - filter( (SELECT COUNT(*) FROM "TD_READPERSON" "TD_READPERSON" WHERE "F_USERID"=599 AND
"F_PUBLICID"=:B1)=0)
9 - filter(TO_NUMBER("F_CURRENTSTATUS")=4 AND TO_NUMBER("F_TYPE")=0 AND
("F_HADREADEDIDLIST"||', ' NOT LIKE '%, 599, %' OR "F_HADREADEDIDLIST"||'A'='A') AND
"DBMS_LOB"."INSTR"(', '||"F_NOTIFYUSERIDLIST"||', ',', 599, ',1,1)>0 AND "F_USERID"<>599)
11 - access("F_PUBLICID"=:B1 AND "F_USERID"=599)
13 - filter( (SELECT COUNT(*) FROM "TD_READPERSON" "TD_READPERSON" WHERE "F_USERID"=599 AND
"F_PUBLICID"=:B1)=0)
14 - filter(TO_NUMBER("F_CURRENTSTATUS")=4 AND TO_NUMBER("F_TYPE")=1 AND
("F_HADREADEDIDLIST"||', ' NOT LIKE '%, 599, %' OR "F_HADREADEDIDLIST"||'A'='A') AND
"F_USERID"<>599)
16 - access("F_PUBLICID"=:B1 AND "F_USERID"=599)
18 - filter(TO_NUMBER("F_STATUS")=1 AND "F_ISREAD"<>'0' AND
TRUNC(INTERNAL_FUNCTION("F_MAILTIME"))<=TRUNC(SYSDATE@!))
19 - access("F_USERID"=599)
21 - filter( (SELECT COUNT(*) FROM "TD_READPERSON" "TD_READPERSON" WHERE "F_USERID"=599 AND
"F_PUBLICID"=:B1)=0)
22 - filter(TO_NUMBER("F_CURRENTSTATUS")=4 AND TO_NUMBER("F_TYPE")=3 AND
("F_HADREADEDIDLIST"||', ' NOT LIKE '%, 599, %' OR "F_HADREADEDIDLIST"||'A'='A') AND
"DBMS_LOB"."INSTR"(', '||"F_NOTIFYUSERIDLIST"||', ',', 599, ',1,1)>0 AND "F_USERID"<>599)
24 - access("F_PUBLICID"=:B1 AND "F_USERID"=599)

Statistics
----------------------------------------------------------
1  recursive calls
136702  db block gets
52121  consistent gets
8182  physical reads
0  redo size
1323  bytes sent via SQL*Net to client
542  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
1  rows processed


该查询涉及的表的规模如下:

. . imported "SUSUNOAGC"."TD_EMAIL"                      783.9 MB 1264719 rows
. . imported "SUSUNOAGC"."TD_RECEIVEDOC"                 148.9 MB   13703 rows
. . imported "SUSUNOAGC"."TD_SENDDOC"                    76.55 MB    1605 rows
. . imported "SUSUNOAGC"."TD_PUBLICINFO"                 17.64 MB    5091 rows
. . imported "SUSUNOAGC"."TD_READPERSON"                 26.43 MB  614403 rows
. . imported "SUSUNOAGC"."TD_ACTIVITY"                   20.75 KB      86 rows


经过两天的调试,也就是2010年的1月1日,这个问题查了“两年”啊,终于找到导致 physical reads 如此之高的原因—— clob 。

仔细看以下这三个子句,其中TD_PublicInfo 的 NotifyUserIDList 和 TD_ReceiveDoc的F_CURACTANDUSERIDLIST 都是 clob 类型:

SELECT COUNT(*) FROM TD_PublicInfo
WHERE dbms_lob.instr(', ' || F_NotifyUserIDList || ', ', ', 599, ', 1, 1 ) > 0
and F_Type = 0 and F_CurrentStatus = 4 AND F_UserID <> 599
and ((F_HadReadedIDList||', ' not like '%, 599, %' or F_HadReadedIDList || 'A' = 'A')
AND (SELECT COUNT(*) FROM TD_ReadPerson WHERE F_PublicID = TD_PublicInfo.F_ID AND F_UserID = 599) = 0)

SELECT COUNT(*) FROM TD_PublicInfo
WHERE F_Type = 1 and F_CurrentStatus = 4
and (F_HadReadedIDList||', ' not like '%, 599, %' or F_HadReadedIDList || 'A' = 'A')
AND F_UserID <> 599
AND (
SELECT COUNT(*) FROM TD_ReadPerson
WHERE F_PublicID = TD_PublicInfo.F_ID AND F_UserID = 599) = 0
select COUNT(*) from
(
select
F_CURACTANDUSERIDLIST,
CAST(SUBSTR(SUBSTR(F_CURACTANDUSERIDLIST, 0, INSTR(F_CURACTANDUSERIDLIST, '*599;') - 1),
INSTR(SUBSTR(F_CURACTANDUSERIDLIST, 0, INSTR(F_CURACTANDUSERIDLIST, '*599;')), ';', -1) + 1)
as VARCHAR2(4000)) as F_ActId from TD_ReceiveDoc
) A
join Td_Activity B
on A.F_ActId = B.F_Id
where dbms_lob.instr(F_CURACTANDUSERIDLIST, '*599;', 1, 1 ) > 0


由于没有 lob 的 storag in row 属性,所以会产生大量 direct path read ,导致大量的物理读。

我尝试将这个几个表的clob字段缓存在内存中(storage in row),虽然可以大大缓解 physical reads,但是 consistent get 还是不能很好地减少。

思前想后,总觉得奇怪,为什么会在clob字段进行dbms_lob.instr搜索呢?

查看了一下示例数据,原来这些clob都共同的特点,都是保存一个长长的用户ID列表,做到这里我在想这样的设计究竟出了什么问题呢?有没有违反范式呢?如果违反了范式又是违法了哪一级范式呢?

我记得念大学的时候,教材有这么一句话,关系型数据库不允许违反第一范式,否则将无法运行。于是心中总有一个“概念”,这年代能跑的关系数据库都不用讨论是否遵守第一范式啦,于是乎从第二范式到第四范式,逐个对照,发现没有一个对应得上,于是再研究了一下最基础的第一范式 1NF 。

感谢wiki啊!,以下很多内容都是参照wiki

关系数据库之父 Edgar F. Codd 在1970年的时候提出了关系模型(Relational Model),并介绍了数据库范式的概念,其中有我们后来熟知的1NF。

简单地说1NF就是希望我们的数据组织成这样:



而不是这样





有些朋友可能会说,第二张图的数据更像我们喜欢的报表类型啊,注意,从1NF到5NF更加关心的是数据的存储方式和是否会出现更新异常,而不是展现方式。说回正题,Top SQL 中clob的数据类似于Transactions里面的数据,我们要访问其中的一些信息,首先就要将其拆分开(当然dbms_lob.instr不是做拆分),然后再做匹配、统计等,wiki中的原话:

Unpacking one or more customers' groups of transactions allowing the individual transactions in a group to be examined, and

Deriving a query result based on the results of the first stage

而事实上,我们的Top SQL就是这样做的,于是乎该情景就要求数据库访问所有的记录(TABLE ACCESS FULL)拆分数据、匹配数据,或者可以称为数据不能被索引化,最显而易见的结果就是查询时需要访问大量无关数据,系统的扩展性不好。

解决方案比较简单,这里就不说了,但是,能不能协调得动开发商就难说咯。
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: