An In-Depth Look at the Shared Pool
2013-08-19 14:12
Author: eygle
An In-Depth Look at the Shared Pool (Part 1)
Sizing the shared pool has always been a contentious topic. Many articles claim that an oversized shared pool adds management overhead and can, under some conditions, degrade performance. What exactly is that overhead? This article explores one aspect of the question in some depth; later installments cover other aspects.
Background:
We can dump the contents of the shared pool with the following command:
SQL> alter session set events 'immediate trace name heapdump level 2';

Session altered.

The two trace files referenced in this test:

9i:
SQL> @gettrcname

TRACE_FILE_NAME
--------------------------------------------------------------------------------
/opt/oracle/admin/hsjf/udump/hsjf_ora_24983.trc

8i:
SQL> @gettrcname

TRACE_FILE_NAME
--------------------------------------------------------------------------------
/usr/oracle8/admin/guess/udump/guess_ora_22038.trc

In Oracle8i, the size ranges of the buckets are as follows (size is the lower bound of each bucket):
oracle:/usr/oracle8/admin/guess/udump>cat guess_ora_22038.trc|grep Bucket
Bucket 0 size=44
Bucket 1 size=76
Bucket 2 size=140
Bucket 3 size=268
Bucket 4 size=524
Bucket 5 size=1036
Bucket 6 size=2060
Bucket 7 size=4108
Bucket 8 size=8204
Bucket 9 size=16396
Bucket 10 size=32780
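When Oracle requests space from the shared pool, it first searches the bucket that matches the requested size. If nothing suitable is found there, it moves on to the next non-empty bucket and takes the first chunk. That chunk is split, and the remainder is placed in the bucket matching its size, which increases fragmentation further.
The end result is that the chunks in Bucket 0 become ever more numerous and ever smaller. (On my small test database, Bucket 0 already held 9030 fragments with shared_pool_size set to only 150M.) As a rule of thumb, more than 2000 chunks in any single bucket is considered excessive shared pool fragmentation.
In most cases the requests are for relatively small chunks, so searching Bucket 0 often consumes considerable time and resources. This can cause the shared pool latch to be held for long periods, leading to more shared pool contention.
So before Oracle9i, blindly increasing shared_pool_size, or setting it too large, often backfired.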
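The bucket lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lower bounds are copied from the 8i heapdump, while the function names and the simplified search loop (which ignores whether a chunk in the starting bucket is actually large enough) are hypothetical and not Oracle's implementation.

```python
from bisect import bisect_right

# Lower bounds of the 11 free-list buckets, copied from the 8i heapdump above.
BUCKET_LOWER_BOUNDS_8I = [44, 76, 140, 268, 524, 1036, 2060, 4108, 8204, 16396, 32780]


def bucket_for_size_8i(size):
    """Return the index of the bucket whose size range contains `size`."""
    if size < BUCKET_LOWER_BOUNDS_8I[0]:
        raise ValueError("size is below the smallest bucket boundary")
    # bisect_right counts the lower bounds <= size; the last of them is our bucket.
    return bisect_right(BUCKET_LOWER_BOUNDS_8I, size) - 1


def first_usable_bucket(request, chunks_per_bucket):
    """Hypothetical, simplified search: start at the request's own bucket
    and walk upward to the first bucket that holds any chunk at all."""
    start = bucket_for_size_8i(request)
    for idx in range(start, len(BUCKET_LOWER_BOUNDS_8I)):
        if chunks_per_bucket[idx] > 0:
            return idx
    return None  # nothing free in this range
```

With only 11 buckets, every request between 44 and 75 bytes lands in Bucket 0, which is why that one list grows so long in the 8i layout.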
Let's look at how Oracle9i handles this:
[oracle@jumper oracle]$ sqlplus "/ as sysdba"

SQL*Plus: Release 9.2.0.3.0 - Production on Wed Aug 18 22:13:07 2004

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production

SQL> alter session set events 'immediate trace name heapdump level 2';

Session altered.

SQL> @gettrcname

TRACE_FILE_NAME
--------------------------------------------------------------------------------
/opt/oracle/admin/hsjf/udump/hsjf_ora_24983.trc

SQL>
SQL> !
[oracle@jumper oracle]$ cd $admin
[oracle@jumper udump]$ cat hsjf_ora_24983.trc|grep Bucket
Bucket 0 size=16 Bucket 1 size=20 Bucket 2 size=24 Bucket 3 size=28 Bucket 4 size=32 Bucket 5 size=36 Bucket 6 size=40 Bucket 7 size=44 Bucket 8 size=48 Bucket 9 size=52 Bucket 10 size=56 Bucket 11 size=60 Bucket 12 size=64 Bucket 13 size=68 Bucket 14 size=72 Bucket 15 size=76 Bucket 16 size=80 Bucket 17 size=84 Bucket 18 size=88 Bucket 19 size=92 Bucket 20 size=96 Bucket 21 size=100 Bucket 22 size=104 Bucket 23 size=108 Bucket 24 size=112 Bucket 25 size=116 Bucket 26 size=120 Bucket 27 size=124 Bucket 28 size=128 Bucket 29 size=132 Bucket 30 size=136 Bucket 31 size=140 Bucket 32 size=144 Bucket 33 size=148 Bucket 34 size=152 Bucket 35 size=156 Bucket 36 size=160 Bucket 37 size=164 Bucket 38 size=168 Bucket 39 size=172 Bucket 40 size=176 Bucket 41 size=180 Bucket 42 size=184 Bucket 43 size=188 Bucket 44 size=192 Bucket 45 size=196 Bucket 46 size=200 Bucket 47 size=204 Bucket 48 size=208 Bucket 49 size=212 Bucket 50 size=216 Bucket 51 size=220 Bucket 52 size=224 Bucket 53 size=228 Bucket 54 size=232 Bucket 55 size=236 Bucket 56 size=240 Bucket 57 size=244 Bucket 58 size=248 Bucket 59 size=252 Bucket 60 size=256 Bucket 61 size=260 Bucket 62 size=264 Bucket 63 size=268 Bucket 64 size=272 Bucket 65 size=276 Bucket 66 size=280 Bucket 67 size=284 Bucket 68 size=288 Bucket 69 size=292 Bucket 70 size=296 Bucket 71 size=300 Bucket 72 size=304 Bucket 73 size=308 Bucket 74 size=312 Bucket 75 size=316 Bucket 76 size=320 Bucket 77 size=324 Bucket 78 size=328 Bucket 79 size=332 Bucket 80 size=336 Bucket 81 size=340 Bucket 82 size=344 Bucket 83 size=348 Bucket 84 size=352 Bucket 85 size=356 Bucket 86 size=360 Bucket 87 size=364 Bucket 88 size=368 Bucket 89 size=372 Bucket 90 size=376 Bucket 91 size=380 Bucket 92 size=384 Bucket 93 size=388 Bucket 94 size=392 Bucket 95 size=396 Bucket 96 size=400 Bucket 97 size=404 Bucket 98 size=408 Bucket 99 size=412 Bucket 100 size=416 Bucket 101 size=420 Bucket 102 size=424 Bucket 103 size=428 Bucket 104 size=432 Bucket 105 size=436 Bucket 106 
size=440 Bucket 107 size=444 Bucket 108 size=448 Bucket 109 size=452 Bucket 110 size=456 Bucket 111 size=460 Bucket 112 size=464 Bucket 113 size=468 Bucket 114 size=472 Bucket 115 size=476 Bucket 116 size=480 Bucket 117 size=484 Bucket 118 size=488 Bucket 119 size=492 Bucket 120 size=496 Bucket 121 size=500 Bucket 122 size=504 Bucket 123 size=508 Bucket 124 size=512 Bucket 125 size=516 Bucket 126 size=520 Bucket 127 size=524 Bucket 128 size=528 Bucket 129 size=532 Bucket 130 size=536 Bucket 131 size=540 Bucket 132 size=544 Bucket 133 size=548 Bucket 134 size=552 Bucket 135 size=556 Bucket 136 size=560 Bucket 137 size=564 Bucket 138 size=568 Bucket 139 size=572 Bucket 140 size=576 Bucket 141 size=580 Bucket 142 size=584 Bucket 143 size=588 Bucket 144 size=592 Bucket 145 size=596 Bucket 146 size=600 Bucket 147 size=604 Bucket 148 size=608 Bucket 149 size=612 Bucket 150 size=616 Bucket 151 size=620 Bucket 152 size=624 Bucket 153 size=628 Bucket 154 size=632 Bucket 155 size=636 Bucket 156 size=640 Bucket 157 size=644 Bucket 158 size=648 Bucket 159 size=652 Bucket 160 size=656 Bucket 161 size=660 Bucket 162 size=664 Bucket 163 size=668 Bucket 164 size=672 Bucket 165 size=676 Bucket 166 size=680 Bucket 167 size=684 Bucket 168 size=688 Bucket 169 size=692 Bucket 170 size=696 Bucket 171 size=700 Bucket 172 size=704 Bucket 173 size=708 Bucket 174 size=712 Bucket 175 size=716 Bucket 176 size=720 Bucket 177 size=724 Bucket 178 size=728 Bucket 179 size=732 Bucket 180 size=736 Bucket 181 size=740 Bucket 182 size=744 Bucket 183 size=748 Bucket 184 size=752 Bucket 185 size=756 Bucket 186 size=760 Bucket 187 size=764 Bucket 188 size=768 Bucket 189 size=772 Bucket 190 size=776 Bucket 191 size=780 Bucket 192 size=784 Bucket 193 size=788 Bucket 194 size=792 Bucket 195 size=796 Bucket 196 size=800 Bucket 197 size=804 Bucket 198 size=808 Bucket 199 size=812 Bucket 200 size=876 Bucket 201 size=940 Bucket 202 size=1004 Bucket 203 size=1068 Bucket 204 size=1132 Bucket 205 size=1196 Bucket 
206 size=1260 Bucket 207 size=1324 Bucket 208 size=1388 Bucket 209 size=1452 Bucket 210 size=1516 Bucket 211 size=1580 Bucket 212 size=1644 Bucket 213 size=1708 Bucket 214 size=1772 Bucket 215 size=1836 Bucket 216 size=1900 Bucket 217 size=1964 Bucket 218 size=2028 Bucket 219 size=2092 Bucket 220 size=2156 Bucket 221 size=2220 Bucket 222 size=2284 Bucket 223 size=2348 Bucket 224 size=2412 Bucket 225 size=2476 Bucket 226 size=2540 Bucket 227 size=2604 Bucket 228 size=2668 Bucket 229 size=2732 Bucket 230 size=2796 Bucket 231 size=2860 Bucket 232 size=2924 Bucket 233 size=2988 Bucket 234 size=3052 Bucket 235 size=3116 Bucket 236 size=3180 Bucket 237 size=3244 Bucket 238 size=3308 Bucket 239 size=3372 Bucket 240 size=3436 Bucket 241 size=3500 Bucket 242 size=3564 Bucket 243 size=3628 Bucket 244 size=3692 Bucket 245 size=3756 Bucket 246 size=3820 Bucket 247 size=3884 Bucket 248 size=3948 Bucket 249 size=4012 Bucket 250 size=4108 Bucket 251 size=8204 Bucket 252 size=16396 Bucket 253 size=32780 Bucket 254 size=65548 |
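Size range covered by each bucket:
Bucket 0~199: lower bounds increase in steps of 4
Bucket 200~249: lower bounds increase in steps of 64
From Bucket 249 onward, the step between buckets grows further:
Bucket 249: 4012 ~4107 = 96
Bucket 250: 4108 ~8203 = 4096
Bucket 251: 8204 ~16395 = 8192
Bucket 252: 16396~32779 = 16384
Bucket 253: 32780~65547 = 32768
Bucket 254: >=65548
In Oracle9i, Oracle added many more buckets to manage small chunks: 200 buckets (0~199) whose sizes increase in steps of 4, and 50 buckets (200~249) whose sizes increase in steps of 64. Each bucket therefore holds far fewer chunks, and searches become more efficient.
This is the shared pool management enhancement in Oracle9i. Through this algorithmic improvement, the latch contention and related performance problems that an oversized shared pool caused in Oracle8i are resolved to some degree.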
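The 9i bucket boundaries listed above follow a regular pattern, which can be written down as a small Python sketch. The formulas are fitted to the boundaries shown in the trace output; the function names are hypothetical, and nothing here claims to be how Oracle itself computes the bucket.

```python
# Lower bounds of the last five buckets, copied from the 9i heapdump above.
LARGE_BUCKETS_9I = [4108, 8204, 16396, 32780, 65548]  # buckets 250..254


def lower_bound_9i(bucket):
    """Lower size bound of a 9i free-list bucket, fitted to the dump above."""
    if not 0 <= bucket <= 254:
        raise ValueError("bucket must be in 0..254")
    if bucket <= 199:
        return 16 + 4 * bucket              # 16, 20, ..., 812
    if bucket <= 249:
        return 812 + 64 * (bucket - 199)    # 876, 940, ..., 4012
    return LARGE_BUCKETS_9I[bucket - 250]


def bucket_for_size_9i(size):
    """Highest bucket whose lower bound does not exceed `size`."""
    if size < lower_bound_9i(0):
        raise ValueError("size is below the smallest bucket boundary")
    for bucket in range(254, -1, -1):
        if lower_bound_9i(bucket) <= size:
            return bucket
```

A 44-byte chunk, which shared Bucket 0 with everything up to 75 bytes in 8i, gets a bucket of its own (Bucket 7) in this layout.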
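An In-Depth Look at the Shared Pool (Part 2)
Let's expand on the previous discussion. We can in fact monitor shared pool fragmentation from inside the database.
This involves an internal view, x$ksmsp.
The name X$KSMSP stands for: [K]ernel [S]torage [M]emory Management [S]GA Hea[P]
Each row represents one chunk in the shared pool.
First, a record of the test environment: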
SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
PL/SQL Release 9.2.0.3.0 - Production
CORE 9.2.0.3.0 Production
TNS for Linux: Version 9.2.0.3.0 - Production
NLSRTL Version 9.2.0.3.0 - Production
SQL> desc x$ksmsp
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ADDR                                               RAW(4)
 INDX                                               NUMBER
 INST_ID                                            NUMBER
 KSMCHIDX                                           NUMBER
 KSMCHDUR                                           NUMBER
 KSMCHCOM                                           VARCHAR2(16)
 KSMCHPTR                                           RAW(4)
 KSMCHSIZ                                           NUMBER
 KSMCHCLS                                           VARCHAR2(8)
 KSMCHTYP                                           NUMBER
 KSMCHPAR                                           RAW(4)
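KSMCHCOM is a comment field; when a chunk is allocated, its comment is stored in this column.
x$ksmsp.ksmchsiz is the chunk size.
x$ksmsp.ksmchcls is the chunk class. There are four main classes:
free
Free chunks -- chunks that contain no objects and can be allocated without restriction.
recr
Recreatable chunks -- chunks that contain objects which can be temporarily aged out of memory and recreated when needed. For example, much of the memory that stores shared SQL code is recreatable.
freeabl
Freeable chunks -- chunks that contain objects which live for the duration of a session or call and can be freed afterwards. This memory can sometimes be released early, in whole or in part. Note, however, that some of these objects are intermediate results and cannot be temporarily aged out of memory (because they cannot be recreated).
perm
Permanent memory chunks -- chunks that contain permanent objects. These usually cannot be freed individually.
We can query the x$ksmsp view to examine the number of memory chunks in the shared pool.
Note, however, that in some Oracle versions (e.g. 10.1.0.2) on some platforms (e.g. HP-UX PA-RISC 64-bit), querying this view can cause excessive CPU consumption. This is caused by a bug.
Let's look at a test: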
Right after the database starts, x$ksmsp contains 2259 chunks:
SQL> select count(*) from x$ksmsp;

  COUNT(*)
----------
      2259

Run a query:
SQL> select count(*) from dba_objects;

  COUNT(*)
----------
     10491

The number of chunks in the shared pool has now increased:
SQL> select count(*) from x$ksmsp;

  COUNT(*)
----------
      2358
Now consider the following test:
First restart the database:
SQL> startup force;
ORACLE instance started.

Total System Global Area   47256168 bytes
Fixed Size                   451176 bytes
Variable Size              29360128 bytes
Database Buffers           16777216 bytes
Redo Buffers                 667648 bytes
Database mounted.
Database opened.

Create a temporary table to hold the earlier state of x$ksmsp:
SQL> CREATE GLOBAL TEMPORARY TABLE e$ksmsp ON COMMIT PRESERVE ROWS AS
  2  SELECT a.ksmchcom,
  3  SUM (a.CHUNK) CHUNK,
  4  SUM (a.recr) recr,
  5  SUM (a.freeabl) freeabl,
  6  SUM (a.SUM) SUM
  7  FROM (SELECT ksmchcom, COUNT (ksmchcom) CHUNK,
  8  DECODE (ksmchcls, 'recr', SUM (ksmchsiz), NULL) recr,
  9  DECODE (ksmchcls, 'freeabl', SUM (ksmchsiz), NULL) freeabl,
 10  SUM (ksmchsiz) SUM
 11  FROM x$ksmsp GROUP BY ksmchcom, ksmchcls) a
 12  where 1 = 0
 13  GROUP BY a.ksmchcom;

Table created.

Save the current shared pool state:
SQL> INSERT INTO E$KSMSP
  2  SELECT a.ksmchcom,
  3  SUM (a.CHUNK) CHUNK,
  4  SUM (a.recr) recr,
  5  SUM (a.freeabl) freeabl,
  6  SUM (a.SUM) SUM
  7  FROM (SELECT ksmchcom, COUNT (ksmchcom) CHUNK,
  8  DECODE (ksmchcls, 'recr', SUM (ksmchsiz), NULL) recr,
  9  DECODE (ksmchcls, 'freeabl', SUM (ksmchsiz), NULL) freeabl,
 10  SUM (ksmchsiz) SUM
 11  FROM x$ksmsp
 12  GROUP BY ksmchcom, ksmchcls) a
 13  GROUP BY a.ksmchcom
 14  /

41 rows created.
Run a query:
SQL> select count(*) from dba_objects;

  COUNT(*)
----------
     10492

Compare the shared pool memory allocation before and after:
SQL> select a.ksmchcom,a.chunk,a.sum,b.chunk,b.sum,(a.chunk - b.chunk) c_diff,(a.sum -b.sum) s_diff
  2  from
  3  (SELECT a.ksmchcom,
  4  SUM (a.CHUNK) CHUNK,
  5  SUM (a.recr) recr,
  6  SUM (a.freeabl) freeabl,
  7  SUM (a.SUM) SUM
  8  FROM (SELECT ksmchcom, COUNT (ksmchcom) CHUNK,
  9  DECODE (ksmchcls, 'recr', SUM (ksmchsiz), NULL) recr,
 10  DECODE (ksmchcls, 'freeabl', SUM (ksmchsiz), NULL) freeabl,
 11  SUM (ksmchsiz) SUM
 12  FROM x$ksmsp
 13  GROUP BY ksmchcom, ksmchcls) a
 14  GROUP BY a.ksmchcom) a,e$ksmsp b
 15  where a.ksmchcom = b.ksmchcom and (a.chunk - b.chunk) <>0
 16  /

KSMCHCOM              CHUNK        SUM      CHUNK        SUM     C_DIFF     S_DIFF
---------------- ---------- ---------- ---------- ---------- ---------- ----------
KGL handles             313     102080        302      98416         11       3664
KGLS heap               274     365752        270     360424          4       5328
KQR PO                  389     198548        377     192580         12       5968
free memory              93    2292076         90    2381304          3     -89228
library cache          1005     398284        965     381416         40      16868
sql area                287     547452        269     490052         18      57400

6 rows selected.
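First, free memory shrank by 89228 bytes (which moved into the other five components), showing that parsing and storing the SQL consumed some memory. The number of free chunks rose from 90 to 93, showing that fragmentation increased.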
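The arithmetic behind that observation can be checked with a short Python sketch. The figures are copied from the query output above; the dictionary layout and the helper name are purely illustrative.

```python
# (chunk count, total bytes) per component, copied from the comparison output.
after = {
    "KGL handles": (313, 102080),
    "KGLS heap": (274, 365752),
    "KQR PO": (389, 198548),
    "free memory": (93, 2292076),
    "library cache": (1005, 398284),
    "sql area": (287, 547452),
}
before = {
    "KGL handles": (302, 98416),
    "KGLS heap": (270, 360424),
    "KQR PO": (377, 192580),
    "free memory": (90, 2381304),
    "library cache": (965, 381416),
    "sql area": (269, 490052),
}


def snapshot_diff(after, before):
    """Per-component change in chunk count and bytes (C_DIFF, S_DIFF)."""
    return {
        name: (after[name][0] - before[name][0], after[name][1] - before[name][1])
        for name in after
    }


deltas = snapshot_diff(after, before)
# Bytes gained by everything except free memory.
consumed = sum(size for name, (_, size) in deltas.items() if name != "free memory")
print(deltas["free memory"], consumed)
```

The bytes gained by the five other components add up to exactly the bytes lost from free memory, which is what the text above asserts.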
In the following parts I will introduce KGL handles and KGLS heap, two very important memory structures in the shared pool.
An In-Depth Look at the Shared Pool (Part 3)
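Basic command: ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level LL';
Here LL is the dump level. For 9.2.0 and later, the levels have the following meanings:
Level =1 : dump library cache statistics
Level =2 : dump the hash table summary
Level =4 : dump library cache objects, basic information only
Level =8 : dump library cache objects with detailed information (including child references, pin waiters, etc.)
Level =16: add heap sizes information
Level =32: add heap information
The library cache consists of a hash table, and the hash table is an array of hash buckets.
Each hash bucket is a doubly linked list of library cache handles.
A library cache handle points to a library cache object and a list of references.
Library cache objects are further divided into dependency tables, child tables, authorization tables, and so on.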
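The layout just described (an array of buckets, each a chain of handles, each handle pointing to an object) can be pictured with a minimal Python sketch. Everything here is hypothetical: the class and field names are invented for illustration, a plain Python list stands in for the doubly linked list, and taking the hash value modulo the bucket count is an assumption about how a handle is mapped to a bucket.

```python
from dataclasses import dataclass, field


@dataclass
class LibraryObject:
    obj_type: str                                       # e.g. "CRSR" or "TABL"
    dependencies: list = field(default_factory=list)    # dependency table
    children: list = field(default_factory=list)        # child table


@dataclass
class Handle:
    name: str
    hash_value: int
    namespace: str
    obj: LibraryObject


class LibraryCacheSketch:
    def __init__(self, n_buckets):
        # One chain per bucket; a list stands in for the doubly linked list.
        self.buckets = [[] for _ in range(n_buckets)]

    def bucket_index(self, hash_value):
        # Assumption: bucket = hash value modulo the number of buckets.
        return hash_value % len(self.buckets)

    def add(self, handle):
        self.buckets[self.bucket_index(handle.hash_value)].append(handle)

    def find(self, name, hash_value, namespace):
        """Walk only the one chain the hash points to."""
        for handle in self.buckets[self.bucket_index(hash_value)]:
            if handle.name == name and handle.namespace == namespace:
                return handle
        return None
```

The point of the structure is that a lookup scans a single chain rather than the whole cache, so short chains mean cheap lookups.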
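Let's look at the structure of the library cache.
Running
ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 4'
produces the following output (this part comes from Oracle8i; the trace file can be found at www.eygle.com).
Part one (equivalent to Level 1):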
LIBRARY CACHE STATISTICS: gets hit ratio pins hit ratio reloads invalids namespace ---------- --------- ---------- --------- ---------- ---------- --------- 619658171 0.9999160 2193292112 0.9999511 9404 380 CRSR 79698558 0.9998832 424614847 0.9999108 13589 0 TABL/PRCD/TYPE 163399 0.9979926 163402 0.9978948 16 0 BODY/TYBD 0 0.0000000 0 0.0000000 0 0 TRGR 34 0.0294118 35 0.0571429 0 0 INDX 18948 0.9968862 24488 0.9953855 0 0 CLST 0 0.0000000 0 0.0000000 0 0 OBJE 0 0.0000000 0 0.0000000 0 0 PIPE 0 0.0000000 0 0.0000000 0 0 LOB 0 0.0000000 0 0.0000000 0 0 DIR 0 0.0000000 0 0.0000000 0 0 QUEU 0 0.0000000 0 0.0000000 0 0 OBJG 0 0.0000000 0 0.0000000 0 0 PROP 0 0.0000000 0 0.0000000 0 0 JVSC 0 0.0000000 0 0.0000000 0 0 JVRE 0 0.0000000 0 0.0000000 0 0 ROBJ 0 0.0000000 0 0.0000000 0 0 REIP 0 0.0000000 0 0.0000000 0 0 CPOB 115071 0.9992179 115071 0.9930999 704 0 EVNT 0 0.0000000 0 0.0000000 0 0 SUMM 0 0.0000000 0 0.0000000 0 0 DIMN 0 0.0000000 0 0.0000000 0 0 CTX 0 0.0000000 0 0.0000000 0 0 OUTL 0 0.0000000 0 0.0000000 0 0 RULS 0 0.0000000 0 0.0000000 0 0 RMGR 0 0.0000000 0 0.0000000 0 0 UNUSED 0 0.0000000 0 0.0000000 0 0 PPLN 0 0.0000000 0 0.0000000 0 0 PCLS 0 0.0000000 0 0.0000000 0 0 SUBS 0 0.0000000 0 0.0000000 0 0 LOCS 0 0.0000000 0 0.0000000 0 0 RMOB 0 0.0000000 0 0.0000000 0 0 RSMD 699654181 0.9999117 2618209955 0.9999440 23713 380 CUMULATIVE |
Part two (equivalent to the Level 2 output):
LIBRARY CACHE HASH TABLE: size=509 count=354 BUCKET 0: BUCKET 1: BUCKET 2: * BUCKET 3: BUCKET 4: BUCKET 5: * BUCKET 6: * BUCKET 7: BUCKET 8: ** BUCKET 9: *** BUCKET 10: * BUCKET 11: * BUCKET 12: *** …….. BUCKET 103: BUCKET 104: * BUCKET 105: BUCKET 106: BUCKET 107: **** BUCKET 108: BUCKET 109: BUCKET 110: BUCKET 111: * BUCKET 112: ** BUCKET 113: BUCKET 114: BUCKET 115: BUCKET 116: * BUCKET 117: BUCKET 118: ***** BUCKET 119: BUCKET 120: * BUCKET 121: BUCKET 122: BUCKET 123: BUCKET 124: BUCKET 125: * BUCKET 126: BUCKET 127: BUCKET 128: * BUCKET 129: BUCKET 130: * BUCKET 131: * BUCKET 132: BUCKET 133: BUCKET 134: BUCKET 135: * BUCKET 136: BUCKET 137: BUCKET 138: BUCKET 139: * BUCKET 140: * BUCKET 141: * BUCKET 142: BUCKET 143: * BUCKET 144: BUCKET 145: *** BUCKET 146: BUCKET 147: * BUCKET 148: BUCKET 149: BUCKET 150: ** BUCKET 151: BUCKET 152: BUCKET 153: * BUCKET 154: BUCKET 155: BUCKET 156: BUCKET 157: BUCKET 158: BUCKET 159: BUCKET 160: BUCKET 161: BUCKET 162: BUCKET 163: BUCKET 164: * BUCKET 165: * BUCKET 166: BUCKET 167: BUCKET 168: BUCKET 169: BUCKET 170: ** BUCKET 171: BUCKET 172: * BUCKET 173: BUCKET 174: BUCKET 175: * BUCKET 176: * BUCKET 177: BUCKET 178: BUCKET 179: BUCKET 180: BUCKET 181: * BUCKET 182: BUCKET 183: BUCKET 184: BUCKET 185: * BUCKET 186: BUCKET 187: BUCKET 188: ** BUCKET 189: BUCKET 190: * BUCKET 191: * BUCKET 192: BUCKET 193: BUCKET 194: * BUCKET 195: ** BUCKET 196: * BUCKET 197: ** BUCKET 198: **** BUCKET 199: * BUCKET 200: * ………. BUCKET 509: BUCKET 510: BUCKET 511: |
In the output above we can see that Bucket 198 contains four objects.
We can find bucket 198 in part three:
BUCKET 198: LIBRARY OBJECT HANDLE: handle=2c2b4ac4 name= SELECT a.statement_id, a.timestamp, a.remarks, a.operation, a.options, a.object_node, a.object_owner, a.object_name, a.object_instance, a.object_type, a.optimizer, a.search_columns, a.id, a.parent_id, a.position, a.cost, a.cardinality, a.bytes, a.other_tag, a.partition_start, a.partition_stop, a.partition_id, a.other, a.distribution , ROWID FROM plan_table a hash=60dd47a1 timestamp=08-27-2004 10:19:28 namespace=CRSR flags=RON/TIM/PN0/LRG/[10010001] kkkk-dddd-llll=0000-0001-0001 lock=0 pin=0 latch=0 lwt=2c2b4adc[2c2b4adc,2c2b4adc] ltm=2c2b4ae4[2c2b4ae4,2c2b4ae4] pwt=2c2b4af4[2c2b4af4,2c2b4af4] ptm=2c2b4b4c[2c2b4b4c,2c2b4b4c] ref=2c2b4acc[2c2b4acc,2c2b4acc] LIBRARY OBJECT: object=2c0b1430 type=CRSR flags=EXS[0001] pflags= [00] status=VALD load=0 CHILDREN: size=16 child# table reference handle ------ -------- --------- -------- 0 2c0b15ec 2c0b15b4 2c2c0d50 DATA BLOCKS: data# heap pointer status pins change ----- -------- -------- ------ ---- ------ 0 2c362290 2c0b14b4 I/-/A 0 NONE LIBRARY OBJECT HANDLE: handle=2c3675d4 name=SYS.DBMS_STANDARD hash=50748ddb timestamp=NULL namespace=BODY/TYBD flags=TIM/SML/[02000000] kkkk-dddd-llll=0000-0011-0011 lock=0 pin=0 latch=0 lwt=2c3675ec[2c3675ec,2c3675ec] ltm=2c3675f4[2c3675f4,2c3675f4] pwt=2c367604[2c367604,2c367604] ptm=2c36765c[2c36765c,2c36765c] ref=2c3675dc[2c3675dc,2c3675dc] LIBRARY OBJECT: object=2c1528e8 flags=NEX[0002] pflags= [00] status=VALD load=0 DATA BLOCKS: data# heap pointer status pins change ----- -------- -------- ------ ---- ------ 0 2c367564 2c1529cc I/-/A 0 NONE 4 2c15297c 0 -/P/- 0 NONE LIBRARY OBJECT HANDLE: handle=2c347dd8 name=select pos#,intcol#,col#,spare1 from icol$ where obj#=:1 hash=fa15ebe3 timestamp=07-28-2004 18:04:43 namespace=CRSR flags=RON/TIM/PN0/SML/[12010000] kkkk-dddd-llll=0000-0001-0001 lock=0 pin=0 latch=0 lwt=2c347df0[2c347df0,2c347df0] ltm=2c347df8[2c347df8,2c347df8] pwt=2c347e08[2c347e08,2c347e08] 
ptm=2c347e60[2c347e60,2c347e60] ref=2c347de0[2c347de0,2c347de0] LIBRARY OBJECT: object=2c1cd1a0 type=CRSR flags=EXS[0001] pflags= [00] status=VALD load=0 CHILDREN: size=16 child# table reference handle ------ -------- --------- -------- 0 2c1cd35c 2c1cd324 2c281678 1 2c1cd35c 2c352c50 2c0eeb8c 2 2c1cd35c 2c352c6c 2c2bb05c DATA BLOCKS: data# heap pointer status pins change ----- -------- -------- ------ ---- ------ 0 2c2e8c58 2c1cd224 I/-/A 0 NONE LIBRARY OBJECT HANDLE: handle=2c3a6484 name=SYS.TS$ hash=bb42852e timestamp=04-24-2002 00:04:15 namespace=TABL/PRCD/TYPE flags=PKP/TIM/KEP/SML/[02900000] kkkk-dddd-llll=0111-0111-0119 lock=0 pin=0 latch=0 lwt=2c3a649c[2c3a649c,2c3a649c] ltm=2c3a64a4[2c3a64a4,2c3a64a4] pwt=2c3a64b4[2c3a64b4,2c3a64b4] ptm=2c3a650c[2c3a650c,2c3a650c] ref=2c3a648c[2c0d4b14,2c09353c] LIBRARY OBJECT: object=2c3a626c type=TABL flags=EXS/LOC[0005] pflags= [00] status=VALD load=0 DATA BLOCKS: data# heap pointer status pins change ----- -------- -------- ------ ---- ------ 0 2c3a8ea4 2c3a63b0 I/P/A 0 NONE 3 2c3a5828 0 -/P/- 0 NONE 4 2c3a6300 2c3a5960 I/P/A 0 NONE 8 2c3a6360 2c3a4f00 I/P/A 0 NONE |
Now let's look at the situation in Oracle9i.
Reference file:
hsjf_ora_15800.trc
LIBRARY CACHE HASH TABLE: size=131072 count=217
Buckets with more than 20 objects:
NONE
Hash Chain Size     Number of Buckets
---------------     -----------------
              0                130855
              1                   217
              2                     0
              3                     0
              4                     0
              5                     0
              6                     0
              7                     0
              8                     0
              9                     0
             10                     0
             11                     0
             12                     0
             13                     0
             14                     0
             15                     0
             16                     0
             17                     0
             18                     0
             19                     0
             20                     0
            >20                     0
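Each Hash Chain Size row shows how many buckets hold that number of objects: 0 means free buckets, and >20 is the number of buckets that hold more than 20 objects.
From the list above, 217 buckets hold one object and 130855 buckets hold none.
Let's verify: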
[oracle@jumper udump]$ cat hsjf_ora_15800.trc |grep BUCKET|more
BUCKET 12:
BUCKET 12 total object count=1
BUCKET 385:
BUCKET 385 total object count=1
BUCKET 865:
BUCKET 865 total object count=1
...
[oracle@jumper udump]$ cat hsjf_ora_15800.trc |grep BUCKET|wc -l
434
[oracle@jumper udump]$
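The numbers in the histogram and in the grep output agree with each other, as the following Python sketch shows. The values are copied from the trace output above; the only assumption is the one visible in the grep sample, namely that each non-empty bucket contributes exactly two lines containing the word BUCKET.

```python
# Values copied from the LIBRARY CACHE HASH TABLE header and histogram.
table_size = 131072
object_count = 217
chain_histogram = {0: 130855, 1: 217}   # chain length -> number of buckets

total_buckets = sum(chain_histogram.values())
total_objects = sum(length * buckets for length, buckets in chain_histogram.items())
non_empty_buckets = total_buckets - chain_histogram[0]

# Each non-empty bucket prints "BUCKET n:" and "BUCKET n total object count=...".
expected_grep_lines = 2 * non_empty_buckets
print(total_buckets, total_objects, expected_grep_lines)
```

The bucket counts sum to the table size, the object total matches count=217, and twice the number of non-empty buckets gives the 434 lines reported by wc -l.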
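With the improved hash table algorithm, Oracle manages the library cache far more efficiently.
An In-Depth Look at the Shared Pool (Part 4)
Let's go further into how the shared pool works. First, run some queries to obtain test data: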
[oracle@jumper udump]$ sqlplus "/ as sysdba" SQL*Plus: Release 9.2.0.3.0 - Production on Thu Aug 26 10:21:54 2004 Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. Connected to: Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production With the Partitioning, OLAP and Oracle Data Mining options JServer Release 9.2.0.3.0 - Production SQL> connect eygle/eygle Connected. SQL> create table emp as select * from scott.emp; Table created. SQL> SQL> connect / as sysdba Connected. SQL> startup force; ORACLE instance started. Total System Global Area 47256168 bytes Fixed Size 451176 bytes Variable Size 29360128 bytes Database Buffers 16777216 bytes Redo Buffers 667648 bytes Database mounted. Database opened. SQL> set linesize 120 SQL> connect scott/tiger Connected. SQL> select * from emp; EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO ---------- ---------- --------- ---------- --------- ---------- ---------- ---------- 7369 SMITH CLERK 7902 17-DEC-80 800 20 8888 EYGLE MANAGER 11-AUG-04 9999 10 10 7566 JONES MANAGER 7839 02-APR-81 2975 20 7698 BLAKE MANAGER 7839 01-MAY-81 2850 30 7782 CLARK MANAGER 7839 09-JUN-81 2450 10 7788 SCOTT ANALYST 7566 19-APR-87 3000 20 7839 KING PRESIDENT 17-NOV-81 5000 10 7876 ADAMS CLERK 7788 23-MAY-87 1100 20 7900 JAMES CLERK 7698 03-DEC-81 950 30 7902 FORD ANALYST 7566 03-DEC-81 3000 20 7934 MILLER CLERK 7782 23-JAN-82 1300 10 11 rows selected. SQL> connect eygle/eygle Connected. 
SQL> select * from emp;

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7369 SMITH      CLERK           7902 17-DEC-80        800                    20
      8888 EYGLE      MANAGER              11-AUG-04       9999         10         10
      7566 JONES      MANAGER         7839 02-APR-81       2975                    20
      7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30
      7782 CLARK      MANAGER         7839 09-JUN-81       2450                    10
      7788 SCOTT      ANALYST         7566 19-APR-87       3000                    20
      7839 KING       PRESIDENT            17-NOV-81       5000                    10
      7876 ADAMS      CLERK           7788 23-MAY-87       1100                    20
      7900 JAMES      CLERK           7698 03-DEC-81        950                    30
      7902 FORD       ANALYST         7566 03-DEC-81       3000                    20
      7934 MILLER     CLERK           7782 23-JAN-82       1300                    10

11 rows selected.

SQL> select SQL_TEXT,VERSION_COUNT,HASH_VALUE,to_char(HASH_VALUE,'xxxxxxxxxx') HEX,ADDRESS
  2  from v$sqlarea where sql_text like 'select * from emp%';

SQL_TEXT             VERSION_COUNT HASH_VALUE HEX         ADDRESS
-------------------- ------------- ---------- ----------- --------
select * from emp                2 2648707557    9de011e5 52D9EA28

SQL> select sql_text,username,ADDRESS,HASH_VALUE,to_char(HASH_VALUE,'xxxxxxxxxx') HEX_HASH_VALUE,CHILD_NUMBER,CHILD_LATCH
  2  from v$sql a,dba_users b where a.PARSING_USER_ID = b.user_id and sql_text like 'select * from emp%';

SQL_TEXT             USERNAME                       ADDRESS  HASH_VALUE HEX_HASH_VA CHILD_NUMBER CHILD_LATCH
-------------------- ------------------------------ -------- ---------- ----------- ------------ -----------
select * from emp    SCOTT                          52D9EA28 2648707557    9de011e5            0           1
select * from emp    EYGLE                          52D9EA28 2648707557    9de011e5            1           1

Note: here we can see the difference between the v$sqlarea and v$sql views.
v$sql keeps one entry for each SQL cursor, while v$sqlarea groups by sql_text and reports the number of child pointers (child cursors) in version_count.
Notice that because the text of these two SQL statements is identical, their ADDRESS and HASH_VALUE are identical too.
This means the two statements are stored at the same location in the shared pool (even though their execution plans may differ), so the code is shared.
During this process Oracle completes the first step of SQL parsing: the syntax check.
Oracle performs the syntax check from right to left and from bottom to top, and returns an error as soon as it finds a syntax error.
Once the syntax check passes, Oracle converts the SQL text to its ASCII values and applies a hash function to those values to compute the HASH_VALUE.
It then looks in the shared pool for an identical SQL statement. If one exists, it proceeds to the next step; if not, it tries to acquire the shared pool latch, requests memory, and stores the SQL code.
One point needs explaining here:
Because uppercase and lowercase letters have different ASCII values, Oracle treats text that differs only in case as different SQL. Let's test this:

SQL> select * from scott.dept;

    DEPTNO DNAME          LOC
---------- -------------- -------------
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON

SQL> select * from scott.DEPT;

    DEPTNO DNAME          LOC
---------- -------------- -------------
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON

SQL> col sql_text for a30
SQL> select sql_text,hash_value from v$sql where sql_text like 'select * from scott%';

SQL_TEXT                       HASH_VALUE
------------------------------ ----------
select * from scott.DEPT       4096614922
select * from scott.dept       2089404358

Notice that a mere difference in case turned what is logically the same SQL statement into two "different pieces of code".
This shows how important it is to write SQL in a consistent style.
After this step, Oracle performs semantic analysis:
In this step Oracle verifies that the objects exist, that the user has the required privileges, whether the same objects are being referenced, and so on.
For our first query, the emp tables actually belong to different users, so the execution plans differ as well.
Of course many other factors affect the execution plan, including the optimizer mode.

SQL> select a.*,to_char(to_hash,'xxxxxxxxxx') Hex_HASH_VALUE
  2  from V$OBJECT_DEPENDENCY a where to_name='EMP';

FROM_ADD  FROM_HASH TO_OWNER             TO_NAME              TO_ADDRE    TO_HASH    TO_TYPE HEX_HASH_VA
-------- ---------- -------------------- -------------------- -------- ---------- ---------- -----------
52D9EA28 2648707557 SCOTT                EMP                  52D9DEBC  828228010          2    315dc1aa
52D9EA28 2648707557 EYGLE                EMP                  52D82E58 1930491453          2    7310f63d

SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 1';

Session altered.

SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 2';

Session altered.

SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 4';

Session altered.

SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 8';

Session altered.

SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 16';

Session altered.

SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 32';

Session altered.

SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 100';

Session altered.
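The two effects seen in this session (one shared parent with two children for the identical emp query, and two separate entries for text that differs only in case) can be mimicked with a toy Python model. This is a sketch under loud assumptions: MD5 of the exact text is only a stand-in for Oracle's hash function, and the class, its methods, and the use of the resolved table owner as the sole child-matching criterion are all hypothetical simplifications.

```python
import hashlib


def text_key(sql_text):
    """Stand-in for Oracle's hash: an MD5 digest of the exact statement text.
    This is NOT Oracle's real hash function."""
    return hashlib.md5(sql_text.encode("ascii")).hexdigest()


class CursorCacheSketch:
    def __init__(self):
        # text key -> list of resolved owners, one entry per child.
        self.parents = {}

    def parse(self, sql_text, resolved_owner):
        """Return the child number used for this (text, resolved owner) pair."""
        children = self.parents.setdefault(text_key(sql_text), [])
        if resolved_owner not in children:
            children.append(resolved_owner)   # new child under the same parent
        return children.index(resolved_owner)

    def version_count(self, sql_text):
        return len(self.parents.get(text_key(sql_text), []))


cache = CursorCacheSketch()
cache.parse("select * from emp", "SCOTT")          # child 0
cache.parse("select * from emp", "EYGLE")          # child 1, same parent
cache.parse("select * from scott.dept", "SCOTT")   # its own parent
cache.parse("select * from scott.DEPT", "SCOTT")   # different text, another parent
print(len(cache.parents), cache.version_count("select * from emp"))
```

In this model, as in the session above, the emp statement is stored once with a version count of 2, while the two dept statements share nothing.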
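Before turning to the trace file, let's review the structure of the library cache with a figure:
(see the figure uploaded with the original post)
A library cache handle points to a library cache object and contains the object name, namespace, timestamp, reference list, and the lists of locks and pins on the object.
Now let's look at the details in the dump file. From the v$sql view above, the hash_value of our query is 9de011e5 and its ADDRESS is 52D9EA28.
In bucket 4581 we find the SQL statement select * from emp.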
BUCKET 4581:
LIBRARY OBJECT HANDLE: handle=52d9ea28
name=select * from emp
hash=9de011e5 timestamp=08-26-2004 10:24:43
==>This hash is exactly the hash_value of this SQL statement in v$sql
namespace=CRSR flags=RON/TIM/PN0/SML/[12010000]
kkkk-dddd-llll=0000-0001-0001 lock=0 pin=0 latch#=1
lwt=0x52d9ea40[0x52d9ea40,0x52d9ea40] ltm=0x52d9ea48[0x52d9ea48,0x52d9ea48]
pwt=0x52d9ea58[0x52d9ea58,0x52d9ea58] ptm=0x52d9eab0[0x52d9eab0,0x52d9eab0]
ref=0x52d9ea30[0x52d9ea30, 0x52d9ea30] lnd=0x52d9eabc[0x52d9eabc,0x52d9eabc]
LIBRARY OBJECT: object=52d9e7b0
type=CRSR flags=EXS[0001] pflags= [00] status=VALD load=0
CHILDREN: size=16
child#    table reference   handle
------ -------- --------- --------
     0 52d9e96c  52d9e6cc 52d9e4ac
     1 52d9e96c  52d9e70c 52d885cc
==>These are the child pointers mentioned earlier; each one points to a different handle
DATA BLOCKS:
data#     heap  pointer status pins change alloc(K)  size(K)
----- -------- -------- ------ ---- ------ -------- --------
    0 52d9e9b8 52d9e838 I/-/A     0 NONE       0.86     1.09
==>The heap here is a memory address.
==>This is where the SQL code and the user connection information are stored
HEAP DUMP OF DATA BLOCK 0:
******************************************************
HEAP DUMP heap name="library cache" desc=0x52d9e9b8
extent sz=0x224 alt=32767 het=16 rec=9 flg=2 opc=0
parent=0x5000002c owner=0x52d9e7b0 nex=(nil) xsz=0x224
EXTENT 0 addr=0x52d9e558
Chunk 52d9e560 sz= 540 perm "perm " alo=448
52D9E560 5000021D 00000000 52D9E7A0 000001C0 [...P.......R....]
52D9E570 52D9E704 52D9E660 00020002 52D9E57C [...R`..R....|..R]
52D9E580 52D9E57C 00000000 52D9E588 52D9E588 [|..R.......R...R]
52D9E590 00000000 52D9E594 52D9E594 00000000 [.......R...R....]
52D9E5A0 52D9E5A0 52D9E5A0 00000000 52D9E5AC [...R...R.......R]
52D9E5B0 52D9E5AC 00000000 52D9E5B8 52D9E5B8 [...R.......R...R]
…….
52D9E8D0 52D9E630 00000000 52D9E570 00000000 [0..R....p..R....]
52D9E8E0 00000000 00000000 00000000 00000000 [................]
52D9E8F0 00000000 00000000 00000002 0000000D [................]
52D9E900 00000001 00000000 [........]
Permanent space = 900
******************************************************
BUCKET 4581 total object count=1

Taking handle 52d885cc as an example, let's look at the structure of a library cache object:

******************************************************
LIBRARY OBJECT HANDLE: handle=52d885cc
namespace=CRSR flags=RON/KGHP/PN0/[10010000]
kkkk-dddd-llll=0000-0041-0041 lock=0 pin=0 latch#=1
lwt=0x52d885e4[0x52d885e4,0x52d885e4] ltm=0x52d885ec[0x52d885ec,0x52d885ec]
pwt=0x52d885fc[0x52d885fc,0x52d885fc] ptm=0x52d88654[0x52d88654,0x52d88654]
ref=0x52d885d4[0x52d9e70c, 0x52d9e70c] lnd=0x52d88660[0x52d88660,0x52d88660]
LIBRARY OBJECT: object=52d82a24
type=CRSR flags=EXS[0001] pflags= [00] status=VALD load=0
DEPENDENCIES: count=1 size=16
dependency#    table reference   handle position flags
----------- -------- --------- -------- -------- -------------------
          0 52d82be0  52d82b20 52d82e58       14 DEP[01]
==>In the dependency section we see that this cursor depends on the object with handle 52d82e58
==>This handle points to the EYGLE.EMP table. Only if the two CRSRs above accessed the same object
==>would the two SQL statements be truly shared. Here the SQL text is identical, but the statements
==>access tables owned by different users, which is where the concept of child pointers comes from.
==>Here we see that 52d82e58 points to the EYGLE.EMP object, that is, the table queried by EYGLE.
ACCESSES: count=1 size=16
dependency# types
----------- -----
          0 0009
TRANSLATIONS: count=1 size=16
original    final
-------- --------
52d82e58 52d82e58
DATA BLOCKS:
data#     heap  pointer status pins change alloc(K)  size(K)
----- -------- -------- ------ ---- ------ -------- --------
    0 52d8c244 52d827e4 I/-/A     0 NONE       1.09     1.64
    6 52d82ac0 52d817c4 I/-/A     0 NONE       3.70     3.73
==>The Data Blocks here are an important part
==>Each control block contains a heap descriptor that points to the corresponding heap memory. This heap memory
==>holds important data such as the Diana Tree, P-Code, Source Code and Shared Cursor Context area,
==>that is, what we usually call the parsed SQL and the execution plan tree. Only at this point is the SQL
==>truly shared, and hard parsing truly avoided
HEAP DUMP OF DATA BLOCK 0:
******************************************************
HEAP DUMP heap name="library cache" desc=0x52d8c244
extent sz=0x224 alt=32767 het=16 rec=9 flg=2 opc=0
parent=0x5000002c owner=0x52d82a24 nex=(nil) xsz=0x224
EXTENT 0 addr=0x52d80ff0
Chunk 52d80ff8 sz= 464 free " "
52D80FF0 C00001D1 00000000 [........]
52D81000 52D8C28C 52D8C28C 00000000 00000000 [...R...R........]
52D81010 00000000 00000000 00000000 00000000 [................]
Repeat 26 times
52D811C0 00000000 00000000 [........]
Chunk 52d811c8 sz= 76 freeable "kgltbtab "
52D811C0 1000004D 52D80FF8 [M......R]
52D811D0 0A857928 52D82B68 00000000 00000000 [(y..h+.R........]
52D811E0 00000000 00000000 00000000 00000000 [................]
Repeat 2 times
52D81210 00000000 [....]
EXTENT 1 addr=0x52d827cc
Chunk 52d827d4 sz= 540 perm "perm " alo=532
52D827D0 5000021D 00000000 52D82A14 [...P.....*.R]
52D827E0 00000214 00000000 0000001A 00000069 [............i...]
…………….
52D825D0 00000000 00000000 00000000 52D81ACF [...............R]
52D825E0 00000000 00000000 00000000 00000000 [................]
Repeat 3 times
52D82620 52D8263C 00000010 000006D4 00000010 [<&.R............]
52D82630 000006F4 00000010 000006F8 00000010 [................]
52D82640 000006FC 00000010 00000714 00000010 [................]
52D82650 00000720 00000010 0000072C 00000010 [ .......,.......]
52D82660 00000744 00000010 0000074C 00000010 [D.......L.......]
52D82670 00000764 00000010 0000077C [d.......|...]
Permanent space = 3784
MARKS:
Mark 0x52d8237c
******************************************************

The handle=52d82e58 here holds the information about the object the SQL depends on:

BUCKET 63037:
LIBRARY OBJECT HANDLE: handle=52d82e58
name=EYGLE.EMP
hash=7310f63d timestamp=08-26-2004 10:23:40
namespace=TABL/PRCD/TYPE flags=KGHP/TIM/SML/[02000000]
kkkk-dddd-llll=0000-0501-0501 lock=0 pin=0 latch#=1
lwt=0x52d82e70[0x52d82e70,0x52d82e70] ltm=0x52d82e78[0x52d82e78,0x52d82e78]
pwt=0x52d82e88[0x52d82e88,0x52d82e88] ptm=0x52d82ee0[0x52d82ee0,0x52d82ee0]
ref=0x52d82e60[0x52d82e60, 0x52d82e60] lnd=0x52d82eec[0x52d7dcf0,0x52d89fc8]
LIBRARY OBJECT: object=52d81594
type=TABL flags=EXS/LOC[0005] pflags= [00] status=VALD load=0
==>Type: the object type; here it is a table
==>flags: the state of the object
DATA BLOCKS:
data#     heap  pointer status pins change alloc(K)  size(K)
----- -------- -------- ------ ---- ------ -------- --------
    0 52d8c1e4 52d8161c I/-/A     0 NONE       0.66     1.09
    8 52d81238 52d80a18 I/-/A     0 NONE       1.10     1.13
   10 52d8129c 52d80ea0 I/-/A     0 NONE       0.12     0.37
HEAP DUMP OF DATA BLOCK 0:
******************************************************
HEAP DUMP heap name="library cache" desc=0x52d8c1e4
extent sz=0x224 alt=32767 het=16 rec=9 flg=2 opc=0
parent=0x5000002c owner=0x52d81594 nex=(nil) xsz=0x224
==>Every heap descriptor contains an owner field that points to its owner. Here 52d81594 is
==>the library object that EYGLE.EMP points to: LIBRARY OBJECT: object=52d81594
EXTENT 0 addr=0x52d81220
Chunk 52d81228 sz= 540 perm "perm " alo=196
52D81220 5000021D 00000000 [...P....]
52D81230 52D81584 000000C4 5000002C 00000824 [...R....,..P$...]
52D81240 52D81594 52D80A00 52D80A08 00000000 [...R...R...R....]
52D81250 00000000 05010200 00000000 00000000 [................]
52D81260 534C474B 61656820 00000070 00000000 [KGLS heap.......]
52D81270 00107FFF 7FFF7FFF 00000401 00000000 [................]
52D81280 52D81280 52D81280 52D8129C 00000000 [...R...R...R....]
52D81290 52D80EA0 00040000 52FF5C14 5000002C [...R.....\.R,..P]
……..
52D80EE0 00000000 00000000 00000000 00000000 [................]
Repeat 15 times
52D80FE0 00000000 [....]
Total heap size = 340
FREE LISTS:
Bucket 0 size=0
Total free space = 0
UNPINNED RECREATABLE CHUNKS (lru first):
PERMANENT CHUNKS:
Chunk 52d80e90 sz= 340 perm "perm " alo=120
52D80E90 50000155 00000000 00000000 00000078 [U..P........x...]
52D80EA0 00000000 00000002 00000068 00000004 [........h.......]
52D80EB0 52D80EB0 52D80EB0 52D80EB8 52D80EB8 [...R...R...R...R]
52D80EC0 00000000 00000000 00000005 52D80ECC [...............R]
52D80ED0 52D80ECC 52D80ED4 52D80ED4 00000000 [...R...R...R....]
52D80EE0 00000000 00000000 00000000 00000000 [................]
Repeat 15 times
52D80FE0 00000000 [....]
Permanent space = 340
******************************************************
BUCKET 63037 total object count=1
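The bucket numbers in these dumps are consistent with a simple rule: the hash value modulo the size of the hash table. The Python sketch below checks this against the two handles shown. Two assumptions are involved: the table size of 131072 is taken from the header of the other 9i dump quoted in Part 3 (hsjf_ora_15800.trc), and the modulo rule itself is inferred from these values rather than taken from Oracle documentation. The function name is hypothetical.

```python
# size= value from the 9i "LIBRARY CACHE HASH TABLE" header quoted in Part 3
# (assumed to apply to this dump as well, since it is the same instance).
TABLE_SIZE = 131072


def bucket_of(hash_hex, table_size=TABLE_SIZE):
    """Bucket index inferred as: hash value modulo the hash table size."""
    return int(hash_hex, 16) % table_size


# hash= values copied from the two LIBRARY OBJECT HANDLE entries above.
print(bucket_of("9de011e5"))   # handle for "select * from emp"
print(bucket_of("7310f63d"))   # handle for EYGLE.EMP
print(int("9de011e5", 16))     # decimal HASH_VALUE as shown in v$sql
```

Both results match the BUCKET lines in the trace (4581 and 63037), and the decimal form of 9de011e5 matches the HASH_VALUE 2648707557 reported by v$sql.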
An In-Depth Look at the Shared Pool (Part 5)
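Oracle uses two data structures for concurrency control in the shared pool: locks and pins. A lock sits at a higher level than a pin.
A lock is acquired on the handle. Before pinning an object, a session must first obtain the lock on its handle.
There are three main lock modes: Null, Share, and Exclusive.
Reading an object normally requires a lock in Null or Share mode.
Modifying an object requires an Exclusive lock.
After locking a library cache object, a process must pin the object before accessing it.
Pins likewise have three modes: Null, Shared, and Exclusive.
Read-only access takes a shared pin; modification takes an exclusive pin.
When we access or execute procedures and packages, we normally acquire shared pins. If an exclusive pin is held, the database has to wait.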
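The mode rules above can be restated as a small compatibility check. This is only a sketch under an assumed, simplified matrix (Null is compatible with everything, Share with Null and Share, Exclusive only with Null); the names `COMPATIBLE`, `can_grant`, and `must_wait` are invented for illustration and are not Oracle structures.

```python
# Toy model of library cache lock/pin mode compatibility.
# ASSUMPTION: this matrix is a simplification chosen for illustration,
# not a statement of Oracle's internal implementation.
NULL, SHARE, EXCLUSIVE = "N", "S", "X"

# held mode -> set of requested modes that can still be granted
COMPATIBLE = {
    NULL: {NULL, SHARE, EXCLUSIVE},
    SHARE: {NULL, SHARE},
    EXCLUSIVE: {NULL},
}


def can_grant(requested, held_modes):
    """Return True if `requested` is compatible with every mode already held."""
    return all(requested in COMPATIBLE[held] for held in held_modes)


def must_wait(requested, held_modes):
    """Return True if the requester would have to wait."""
    return not can_grant(requested, held_modes)


if __name__ == "__main__":
    # Two readers can share; a writer has to wait for a reader.
    print(can_grant(SHARE, [SHARE]))      # shared + shared
    print(must_wait(EXCLUSIVE, [SHARE]))  # exclusive behind shared
```

The same function serves for both locks and pins in this model, because both are described here with the same three modes.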
In many Statspack reports we may see wait events like the following:
Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                             Wait     % Total
Event                                               Waits  Time (cs)   Wt Time
-------------------------------------------- ------------ ------------ -------
library cache lock                                 75,884    1,409,500   48.44
latch free                                     34,297,906    1,205,636   41.43
library cache pin                                     563      142,491    4.90
db file scattered read                            146,283       75,871    2.61
enqueue                                             2,211       13,003     .45
-------------------------------------------------------------
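(1). The LIBRARY CACHE PIN wait event
The Oracle documentation describes this wait event as follows:
"library cache pin" manages concurrent access to the library cache. Pinning an object causes the corresponding heaps to be loaded into memory (if they are not already loaded). Pins can be acquired in three modes, NULL, SHARE, and EXCLUSIVE, and a pin can be thought of as a special kind of lock.
When the library cache pin wait event appears, it usually means that another session holds the pin in an incompatible mode.
The "library cache pin" wait times out after 3 seconds (1 second for the PMON background process). A session therefore waits at most 3 seconds at a time for the pin; if the pin is still unavailable, the wait times out.
The parameters of "library cache pin" are listed below; P1 and P2 are the most useful:
P1 - KGL Handle address.
P2 - Pin address
P3 - Encoded Mode & Namespace
"library cache pin" waits typically occur when PL/SQL, views, types, and similar objects are compiled or recompiled. Compilation is usually explicit, for example when installing an application, upgrading, or applying a patch. In addition, operations such as ALTER, GRANT, and REVOKE can invalidate an object; these changes can be observed through the object's LAST_DDL time (the LAST_DDL_TIME column of DBA_OBJECTS).
When an object becomes invalid, Oracle tries to recompile it the first time it is accessed. If other sessions have already pinned the object in the library cache at that moment, problems arise, especially when there are many active sessions and complex dependencies. In some cases recompiling an object can take hours, blocking every other process that tries to access it.
The following example simulates and explains this wait: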
1. Create the test stored procedures
[oracle@jumper udump]$ sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.3.0 - Production on Mon Sep 6 14:16:57 2004
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area   47256168 bytes
Fixed Size                   451176 bytes
Variable Size              29360128 bytes
Database Buffers           16777216 bytes
Redo Buffers                 667648 bytes
Database mounted.
Database opened.
SQL> create or replace PROCEDURE pining
  2  IS
  3  BEGIN
  4  NULL;
  5  END;
  6  /
Procedure created.
SQL>
SQL> create or replace procedure calling
  2  is
  3  begin
  4  pining;
  5  dbms_lock.sleep(3000);
  6  end;
  7  /
Procedure created.
SQL>
First execute the calling procedure, which calls pining. While calling runs, a shared pin is held on pining. If we now try to grant privileges on pining or recompile it, a library cache pin wait occurs and lasts until calling finishes.
session 1:
[oracle@jumper oracle]$ sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.3.0 - Production on Mon Sep 6 16:13:43 2004
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production
SQL> exec calling
session 2:
[oracle@jumper udump]$ sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.3.0 - Production on Mon Sep 6 16:14:16 2004
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production
SQL> grant execute on pining to eygle;
OK, let's begin the investigation:
Starting from v$session_wait, we can find out which sessions are experiencing library cache pin waits:
SQL> select sid,seq#,event,p1,p1raw,p2,p2raw,p3,p3raw,state
  2  from v$session_wait where event like 'library%';

 SID       SEQ# EVENT                       P1 P1RAW            P2 P2RAW            P3  WAIT_TIME SECONDS_IN_WAIT STATE
---- ---------- ------------------- ---------- -------- ---------- -------- ---------- ---------- --------------- -------------------
   8        268 library cache pin   1389785868 52D6730C 1387439312 52B2A4D0        301          0               2 WAITING

The wait times out after 3 seconds, and seq# changes:

SQL>
 SID       SEQ# EVENT                       P1 P1RAW            P2 P2RAW            P3  WAIT_TIME SECONDS_IN_WAIT STATE
---- ---------- ------------------- ---------- -------- ---------- -------- ---------- ---------- --------------- -------------------
   8        269 library cache pin   1389785868 52D6730C 1387439312 52B2A4D0        301          0               2 WAITING

SQL>
 SID       SEQ# EVENT                       P1 P1RAW            P2 P2RAW            P3  WAIT_TIME SECONDS_IN_WAIT STATE
---- ---------- ------------------- ---------- -------- ---------- -------- ---------- ---------- --------------- -------------------
   8        270 library cache pin   1389785868 52D6730C 1387439312 52B2A4D0        301          0               0 WAITING
We can see that the handle address of the object being waited for on library cache pin is 52D6730C.
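P1RAW is simply the hexadecimal form of P1, which is why the handle address can be read straight from the P1RAW column. A minimal sketch of the conversion, using the values from the output above (the helper names `raw_to_decimal` and `decimal_to_raw` are made up for this example):

```python
def raw_to_decimal(raw_hex):
    """Convert a P1RAW/P2RAW style hex string to its decimal value."""
    return int(raw_hex, 16)


def decimal_to_raw(value, width=8):
    """Convert a decimal P1/P2/P3 value to a zero-padded uppercase hex string."""
    return format(value, f"0{width}X")


if __name__ == "__main__":
    print(raw_to_decimal("52D6730C"))   # P1RAW of the waiting session
    print(decimal_to_raw(1387439312))   # P2 of the waiting session
```

The hex form is the one to use when matching against address columns such as KGLHDADR.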
Using this address, we can query the X$KGLOB view to get the details of the object:
Note: X$KGLOB--[K]ernel [G]eneric [L]ibrary Cache Manager [OB]ject
col KGLNAOWN for a10
col KGLNAOBJ for a20
select ADDR,KGLHDADR,KGLHDPAR,KGLNAOWN,KGLNAOBJ,KGLNAHSH,KGLHDOBJ
from X$KGLOB
where KGLHDADR ='52D6730C'
/

ADDR     KGLHDADR KGLHDPAR KGLNAOWN   KGLNAOBJ               KGLNAHSH KGLHDOBJ
-------- -------- -------- ---------- -------------------- ---------- --------
404F9FF0 52D6730C 52D6730C SYS        PINING               2300250318 52D65BA4
From this we know that the library cache pin wait is on the PINING object.
Next we bring in another internal view, X$KGLPN:
Note: X$KGLPN--[K]ernel [G]eneric [L]ibrary Cache Manager object [P]i[N]s
select a.sid,a.username,a.program,b.addr,b.KGLPNADR,b.KGLPNUSE,b.KGLPNSES,b.KGLPNHDL,
b.kGLPNLCK, b.KGLPNMOD, b.KGLPNREQ
from v$session a,x$kglpn b
where a.saddr=b.kglpnuse and b.kglpnhdl = '52D6730C' and b.KGLPNMOD<>0
/

  SID USERNAME    PROGRAM                                  ADDR     KGLPNADR KGLPNUSE KGLPNSES KGLPNHDL KGLPNLCK   KGLPNMOD   KGLPNREQ
----- ----------- ---------------------------------------- -------- -------- -------- -------- -------- -------- ---------- ----------
   13 SYS         sqlplus@jumper.hurray.com.cn (TNS V1-V3) 404FA034 52B2A518 51E2013C 51E2013C 52D6730C 52B294C8          2          0
So what is this user waiting for?
SQL> select * from v$session_wait where sid=13;

       SID       SEQ# EVENT               P1TEXT            P1 P1RAW    P2TEXT          P2 P2RAW    P3TEXT          P3 P3RAW     WAIT_TIME SECONDS_IN_WAIT STATE
---------- ---------- ------------------- --------- ---------- -------- ------- ---------- -------- ------- ---------- -------- ---------- --------------- -------
        13         25 PL/SQL lock timer   duration      120000 0001D4C0                  0 00                        0 00                0            1200 WAITING

SQL> select sql_text from v$sqlarea where v$sqlarea.hash_value='3045375777';

SQL_TEXT
--------------------------------------------------------------------------------
BEGIN calling; END;
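The work calling is doing is dbms_lock.sleep(3000), which is why the session is waiting on PL/SQL lock timer.
At this point we have found the cause of the library cache pin wait.
The queries above can be simplified as follows:
1. Get the object being waited for on library cache pin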
SELECT addr, kglhdadr, kglhdpar, kglnaown, kglnaobj, kglnahsh, kglhdobj
FROM x$kglob
WHERE kglhdadr IN (SELECT p1raw
FROM v$session_wait
WHERE event LIKE 'library%')
/

ADDR     KGLHDADR KGLHDPAR KGLNAOWN   KGLNAOBJ               KGLNAHSH KGLHDOBJ
-------- -------- -------- ---------- -------------------- ---------- --------
404F2178 52D6730C 52D6730C SYS        PINING               2300250318 52D65BA4

SELECT a.SID, a.username, a.program, b.addr, b.kglpnadr, b.kglpnuse,
b.kglpnses, b.kglpnhdl, b.kglpnlck, b.kglpnmod, b.kglpnreq
FROM v$session a, x$kglpn b
WHERE a.saddr = b.kglpnuse
AND b.kglpnmod <> 0
AND b.kglpnhdl IN (SELECT p1raw
FROM v$session_wait
WHERE event LIKE 'library%')
/
SQL>
       SID USERNAME   PROGRAM                                          ADDR     KGLPNADR KGLPNUSE KGLPNSES KGLPNHDL KGLPNLCK   KGLPNMOD   KGLPNREQ
---------- ---------- ------------------------------------------------ -------- -------- -------- -------- -------- -------- ---------- ----------
        13 SYS        sqlplus@jumper.hurray.com.cn (TNS V1-V3)         404F6CA4 52B2A518 51E2013C 51E2013C 52D6730C 52B294C8          2          0

SELECT sql_text
FROM v$sqlarea
WHERE (v$sqlarea.address, v$sqlarea.hash_value) IN (
SELECT sql_address, sql_hash_value
FROM v$session
WHERE SID IN (
SELECT SID
FROM v$session a, x$kglpn b
WHERE a.saddr = b.kglpnuse
AND b.kglpnmod <> 0
AND b.kglpnhdl IN (SELECT p1raw
FROM v$session_wait
WHERE event LIKE 'library%')))
/

SQL_TEXT
--------------------------------------------------------------------------------
BEGIN calling; END;

SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 32';
Session altered.
From the earlier queries, the handle of pining is 52D6730C:
*****************************************************
BUCKET 67790:
LIBRARY OBJECT HANDLE: handle=52d6730c
name=SYS.PINING
hash=891b08ce timestamp=09-06-2004 16:43:51
namespace=TABL/PRCD/TYPE flags=KGHP/TIM/SML/[02000000]
kkkk-dddd-llll=0000-0011-0011 lock=N pin=S latch#=1
-- A shared pin exists on the object
-- A Null-mode lock exists on the handle; this mode allows other users to continue locking the object in Null/shared mode
lwt=0x52d67324[0x52d67324,0x52d67324] ltm=0x52d6732c[0x52d6732c,0x52d6732c]
pwt=0x52d6733c[0x52b2a4e8,0x52b2a4e8] ptm=0x52d67394[0x52d67394,0x52d67394]
ref=0x52d67314[0x52d67314, 0x52d67314] lnd=0x52d673a0[0x52d67040,0x52d6afcc]
LIBRARY OBJECT: object=52d65ba4
type=PRCD flags=EXS/LOC[0005] pflags=NST [01] status=VALD load=0
DATA BLOCKS:
data# heap     pointer  status pins change alloc(K) size(K)
----- -------- -------- ------ ---- ------ -------- --------
    0 52d65dac 52d65c90 I/P/A     0 NONE       0.30     0.55
    4 52d65c40 52d67c08 I/P/A     1 NONE       0.44     0.48

******************************************************
BUCKET 67790:
LIBRARY OBJECT HANDLE: handle=52d6730c
name=SYS.PINING
hash=891b08ce timestamp=09-06-2004 16:43:51
namespace=TABL/PRCD/TYPE flags=KGHP/TIM/SML/[02000000]
kkkk-dddd-llll=0000-0011-0011 lock=X pin=S latch#=1
-- Because calling has not finished, the shared pin is still held on the object
-- Because the grant causes the object to be recompiled, the exclusive lock on the handle is already held
-- The next step needs an Exclusive pin on the object; since calling holds the shared pin, the library cache pin wait appears
lwt=0x52d67324[0x52d67324,0x52d67324] ltm=0x52d6732c[0x52d6732c,0x52d6732c]
pwt=0x52d6733c[0x52b2a4e8,0x52b2a4e8] ptm=0x52d67394[0x52d67394,0x52d67394]
ref=0x52d67314[0x52d67314, 0x52d67314] lnd=0x52d673a0[0x52d67040,0x52d6afcc]
LIBRARY OBJECT: object=52d65ba4
type=PRCD flags=EXS/LOC[0005] pflags=NST [01] status=VALD load=0
DATA BLOCKS:
data# heap     pointer  status pins change alloc(K) size(K)
----- -------- -------- ------ ---- ------ -------- --------
    0 52d65dac 52d65c90 I/P/A     0 NONE       0.30     0.55
    4 52d65c40 52d67c08 I/P/A     1 NONE       0.44     0.48
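1. The stored procedure's library cache object is locked in exclusive mode; this lock is acquired on the handle.
The exclusive lock prevents other users from performing the same operation and from creating new objects that reference this procedure.
2. The object is pinned in shared mode so that security and error checks can be performed.
3. The shared pin is released and the object is pinned again in exclusive mode to perform the recompilation.
4. All objects that depend on the procedure are invalidated.
5. The exclusive lock and the exclusive pin are released.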
(2). The LIBRARY CACHE LOCK wait event
If we now issue another grant or compile command, the library cache lock wait event appears:
session 3:
[oracle@jumper oracle]$ sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.3.0 - Production on Tue Sep 7 17:05:25 2004
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production
SQL> alter procedure pining compile;

SQL> select * from v$session_wait;

 SID SEQ# EVENT               P1TEXT                  P1 P1RAW    P2TEXT               P2 P2RAW    P3TEXT                     P3 P3RAW    WAIT_TIME SECONDS STATE
---- ---- ------------------- --------------- ---------- -------- ------------ ---------- -------- ------------------ ---------- -------- --------- ------- -------
  11  143 library cache pin   handle address  1390239716 52DD5FE4 pin address  1387617456 52B55CB0 100*mode+namespace        301 0000012D         0       6 WAITING
  13   18 library cache lock  handle address  1390239716 52DD5FE4 lock address 1387433984 52B29000 100*mode+namespace        301 0000012D         0       3 WAITING
   8  415 PL/SQL lock timer   duration            120000 0001D4C0                       0 00                                   0 00               0      63 WAITING
....
13 rows selected
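The P3TEXT column in the output above reads 100*mode+namespace, so P3 can be split back into its two parts with integer division. A minimal sketch follows; the mode names in `MODE_NAMES` are an assumption added for readability (they do not appear in the output), and `decode_p3` and `describe_p3` are hypothetical helper names.

```python
# ASSUMPTION: mode numbering 1=null, 2=share, 3=exclusive is a commonly
# quoted convention; it is not shown in the article's output.
MODE_NAMES = {0: "none", 1: "null", 2: "share", 3: "exclusive"}


def decode_p3(p3):
    """Split P3 = 100*mode + namespace into (mode, namespace)."""
    mode, namespace = divmod(p3, 100)
    return mode, namespace


def describe_p3(p3):
    """Return a readable description of an encoded P3 value."""
    mode, namespace = decode_p3(p3)
    name = MODE_NAMES.get(mode, "unknown")
    return f"mode={mode} ({name}), namespace={namespace}"


if __name__ == "__main__":
    print(describe_p3(301))  # the P3 value reported for both waiting sessions
```

Under that assumed numbering, both waiters in the output are requesting mode 3 on namespace 1.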
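Our test case also shows that complex, interlocking dependencies among packages or procedures can very easily lead to library cache pin waits, so this deserves attention during application development.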
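The three sessions in this part can be replayed with a toy model. This is a sketch, not Oracle's algorithm: it assumes a simplified mode matrix (Null compatible with everything, Share with Null and Share, Exclusive only with Null), it lets session 2 ask for the exclusive pin directly instead of going through the shared-pin check first, and `Handle`, `request_lock`, `request_pin`, and `simulate` are names invented for the example.

```python
from dataclasses import dataclass, field

# ASSUMPTION: simplified compatibility matrix, held mode -> grantable requests.
COMPATIBLE = {"N": {"N", "S", "X"}, "S": {"N", "S"}, "X": {"N"}}


def compatible(requested, held_modes):
    """True if `requested` can coexist with every mode in `held_modes`."""
    return all(requested in COMPATIBLE[held] for held in held_modes)


@dataclass
class Handle:
    """A library cache handle with per-session locks and pins."""
    name: str
    locks: dict = field(default_factory=dict)  # session -> lock mode
    pins: dict = field(default_factory=dict)   # session -> pin mode

    def request_lock(self, session, mode):
        others = [m for s, m in self.locks.items() if s != session]
        if compatible(mode, others):
            self.locks[session] = mode
            return "granted"
        return "library cache lock"  # the wait event the session would report

    def request_pin(self, session, mode):
        if session not in self.locks:
            raise RuntimeError("a handle must be locked before it is pinned")
        others = [m for s, m in self.pins.items() if s != session]
        if compatible(mode, others):
            self.pins[session] = mode
            return "granted"
        return "library cache pin"


def simulate():
    """Replay: session 1 runs calling, session 2 grants, session 3 compiles."""
    handle = Handle("SYS.PINING")
    return [
        ("session 1 lock N", handle.request_lock("session 1", "N")),
        ("session 1 pin S", handle.request_pin("session 1", "S")),
        ("session 2 lock X", handle.request_lock("session 2", "X")),
        ("session 2 pin X", handle.request_pin("session 2", "X")),
        ("session 3 lock X", handle.request_lock("session 3", "X")),
    ]


if __name__ == "__main__":
    for step, outcome in simulate():
        print(f"{step}: {outcome}")
```

In this model the second exclusive lock request is the one that waits on library cache lock, because the first DDL session already holds the handle exclusively while it waits for its pin.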
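An In-Depth Look at the Shared Pool (Part 6)
After a few days of studying the shared pool, I unexpectedly ran straight into a real problem, so I am writing it up as a case study for reference. It began with our company's bulk SMS sending: that 4,000-character SMS novel bought for 180,000 yuan (shh, keep it down, I haven't read it either...). During bulk sends the message queue became congested at regular intervals; inside the database, the operation is simply writing send-log records to a table.
We stepped in to check the database, and during one congested period I began diagnosing: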
SQL> select sid,event,p1,p1raw from v$session_wait;

       SID EVENT                               P1 P1RAW
---------- --------------------------- ---------- --------
        76 latch free                  2147535824 8000CBD0
        83 latch free                  2147535824 8000CBD0
       148 latch free                  3415346832 CB920E90
       288 latch free                  2147535824 8000CBD0
       285 latch free                  2147535824 8000CBD0
       196 latch free                  2147535824 8000CBD0
       317 latch free                  2147535824 8000CBD0
         2 pmon timer                         300 0000012C
         1 rdbms ipc message                  300 0000012C
         4 rdbms ipc message                  300 0000012C
         6 rdbms ipc message               180000 0002BF20
        18 rdbms ipc message                 6000 00001770
       102 rdbms ipc message                 6000 00001770
       311 rdbms ipc message                 6000 00001770
       194 rdbms ipc message                 6000 00001770
       178 rdbms ipc message                 6000 00001770
         3 log file parallel write              1 00000001
        13 log file sync                     2705 00000A91
        16 log file sync                     2699 00000A8B
       104 log file sync                     2699 00000A8B
       308 log file sync                     2694 00000A86
       262 log file sync                     2705 00000A91
       172 log file sync                     2689 00000A81
       169 log file sync                     2705 00000A91
       108 log file sync                     2694 00000A86
        38 log file sync                     2707 00000A93
        34 db file scattered read              63 0000003F
         5 smon timer                         300 0000012C
        27 SQL*Net message to client   1413697536 54435000
        60 SQL*Net message to client   1413697536 54435000
       239 SQL*Net message to client   1413697536 54435000
...ignore some idle waiting here...
        11 SQL*Net message from client  675562835 28444553
        12 SQL*Net message from client 1413697536 54435000

170 rows selected.
SQL> select sid,event,p1,p1raw from v$session_wait where event not like 'SQL*Net%';

       SID EVENT                               P1 P1RAW
---------- --------------------------- ---------- --------
         2 pmon timer                         300 0000012C
         1 rdbms ipc message                  300 0000012C
         4 rdbms ipc message                  300 0000012C
         6 rdbms ipc message               180000 0002BF20
        18 rdbms ipc message                 6000 00001770
       102 rdbms ipc message                 6000 00001770
       178 rdbms ipc message                 6000 00001770
       194 rdbms ipc message                 6000 00001770
       311 rdbms ipc message                 6000 00001770
         3 log file parallel write              1 00000001
       148 log file sync                     2547 000009F3
       273 log file sync                     2544 000009F0
       190 log file sync                     2545 000009F1
         5 smon timer                         300 0000012C

14 rows selected.
SQL> select addr,latch#,name,gets,spin_gets from v$latch order by spin_gets;

ADDR     LATCH# NAME                                         GETS  SPIN_GETS
-------- ------ -------------------------------------- ---------- ----------
80001398      3 session switching                          111937          0
80002010      6 longop free list                            37214          0
800023A0      7 cached attr list                                0          0
80002628     10 event group latch                         2391668          0
.....
80003F3C     28 message pool operations parent latch            3          0
.....
80006030     60 mostly latch-free SCN                          19          0
80005F8C     59 file number translation table                  68          0
80005F14     58 dlm cr bast queue latch                         0          0
80005E8C     57 name-service request                            0          0
80005E14     56 name-service memory objects                     0          0
80005DA0     55 name-service namespace bucket                   0          0
80005D2C     54 name-service pending queue                      0          0
80005CB4     53 name-service request queue                      0          0
80004E08     52 name-service entry                              0          0
80008AB0     76 KCL lock element parent latch                   0          0
80008A48     75 KCL instance latch                              0          0
80007F18     73 redo copy                                     816          0
80007BBC     71 archive process latch                           0          0
80007B54     70 archive control                                 1          0
80006A10     68 Active checkpoint queue latch             2003308          0
800064B0     66 large memory latch                              0          0
80006448     65 cache protection latch                          0          0
800060EC     61 batching SCNs                                   0          0
8000CAB0     96 global transaction                        6833807          0
8000CA48     95 global tx free list                         58258          0
8000C238     93 cost function                                   0          0
80009FCC     91 temp lob duration state obj allocation          0          0
8000995C     87 ktm global data                              8118          0
80009228     84 transaction branch allocation              282388          0
80008EC4     80 begin backup scn array                       6968          0
80008D54     79 loader state object freelist                42712          0
80008B80     78 KCL freelist latch                              0          0
80008B18     77 KCL name table latch                            0          0
8000D484    118 presentation list                               0          0
8000D41C    117 session timer                              855944          0
.....
8000E9D0    129 process queue                                  44          0
8000E900    127 query server freelists                         66          0
8000FC84    140 AQ Propagation Scheduling System Load           0          0
8000E898    126 query server process                           10          0
8000E27C    125 job_queue_processes parameter latch        111937          0
8000DA1C    124 NLS data objects                                2          0
8000D95C    123 ncodef allocation latch                    111937          0
8000D674    122 virtual circuits                                0          0
8000D60C    121 virtual circuit queues                     159877          0
8000D5A4    120 virtual circuit buffers                         0          0
8000D4EC    119 address list                                    2          0
.....
8000CD70    102 Direct I/O Adaptor                              2          0
.....
80002408      8 GDS latch                                      30          0
800092E4     85 sort extent pool                            69834          1
8000EC38    132 parallel query alloc buffer                    80          1
8000E968    128 error message lists                            22          1
80001400      4 process group creation                    2615542          2
8000EAA0    131 parallel query stats                           14          2
8000CD08    101 Token Manager                             1151107          2
8000CB18     97 global tx hash mapping                     507846          2
80006378     63 cache buffer handles                       315924          4
8000EA38    130 process queue reference                    190993          5
80003E3C     26 channel handle pool latch                 2391680         18
80003EAC     27 channel operations parent latch           4783425         24
80009B90     89 intra txn parallel recovery                 32654         33
8000FCF8    141 fixed table rows for x$hs_session          161368         41
800012C8      1 process allocation                        2391688        154
80009B28     88 parallel txn reco latch                    174519        271
8000CCA0    100 library cache load lock                  14947545       5958
8000C8D0     94 user lock                                13086412       6078
8000914C     82 list of block allocation                120650357      12024
80006A78     69 Checkpoint queue latch                  154361751      17686
80009D34     90 sequence cache                           64611720      32027
80009090     81 dml lock allocation                     234465024      45351
800091C0     83 transaction allocation                  214227648      48345
800096AC     86 undo global data                        188271244      49641
800028A0     13 enqueue hash chains                     373244264     131322
80007E04     72 redo allocation                         439389808     201498
80001468      5 session idle bit                       2039097976     204969
80002838     12 enqueues                                471338482     273695
80001330      2 session allocation                      261826230     428312
800063E0     64 multiblock read objects                1380614923    1366278
800026B8     11 messages                                207935758    1372606
80001218      0 latch wait list                         203479569    1445342
80006310     62 cache buffers chains                   3.8472E+10    2521699
8000A17C     92 row cache objects                      1257586714    2555872
80007F80     74 redo writing                            264722932    4458044
80006700     67 cache buffers lru chain                5664313769   30046921
8000CBD0     98 shared pool                             122433688   59070585
8000CC38     99 library cache                          4414533796 1037032730

142 rows selected.

Check the database startup time:

SQL> select startup_time from v$instance;

STARTUP_T
---------
13-AUG-04
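To make the tail of that listing easier to read, the sketch below ranks a few of the rows by SPIN_GETS and also shows SPIN_GETS as a fraction of GETS. The five rows are copied from the output above; the ratio is my own illustrative metric rather than something the article computes, and `rank_by_spin_gets` is a hypothetical helper name.

```python
# (name, gets, spin_gets) copied from the v$latch listing above
LATCHES = [
    ("cache buffers lru chain", 5664313769, 30046921),
    ("shared pool", 122433688, 59070585),
    ("library cache", 4414533796, 1037032730),
    ("redo writing", 264722932, 4458044),
    ("row cache objects", 1257586714, 2555872),
]


def rank_by_spin_gets(latches, top=3):
    """Return the `top` latches by spin_gets as (name, spin_gets, spin_gets/gets)."""
    ranked = sorted(latches, key=lambda row: row[2], reverse=True)
    return [
        (name, spins, spins / gets if gets else 0.0)
        for name, gets, spins in ranked[:top]
    ]


if __name__ == "__main__":
    for name, spins, ratio in rank_by_spin_gets(LATCHES):
        print(f"{name:<26} spin_gets={spins:>12} ratio={ratio:.4f}")
```

The two latches that stand out, shared pool and library cache, are the ones the rest of this case study follows; the shared pool latch address 8000CBD0 is also the P1RAW value of most of the latch free waits seen earlier.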
Clearly this is very likely caused by excessive SQL parsing. Checking v$sqlarea further, we find:
SQL> select sql_text,VERSION_COUNT,INVALIDATIONS,PARSE_CALLS,OPTIMIZER_MODE,
PARSING_USER_ID,PARSING_SCHEMA_ID,ADDRESS,HASH_VALUE
from v$sqlarea where version_count >1000;

SQL_TEXT
------------------------------------------------------------------------------------------------------------------------
VERSION_COUNT INVALIDATIONS PARSE_CALLS OPTIMIZER_MODE            PARSING_USER_ID PARSING_SCHEMA_ID ADDRESS  HASH_VALUE
------------- ------------- ----------- ------------------------- --------------- ----------------- -------- ----------
insert into sms_log (MSGDATE,MSGTIME,MSGID,MSGKIND,MSGTYPE,MSGTYPE_MOMT,
MSGLEN,MSGSTATUS,AREAID,IFIDDEST,IFIDSRC,ADDRSRC,ADDRDEST,ADDRFEE,
ADDRUSER,SERVICECODE,PLANID,FEETYPE,FEEVALUE,DATACODING,FLAGS,SMLEN,
SMCONT) values
(:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:b17,:b18,:b19,:b20,:b21,:b22)
         7023             0        1596 MULTIPLE CHILDREN PRESENT              36                36 C82AF1C8 3974744754
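From the earlier parts we know that if this SQL has 7,023 child cursors, all of them sit on the linked list of the same bucket. This means that when the same SQL is executed again, Oracle has to search that list for a cursor it can share, which causes heavy library cache latch contention.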
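The cost of that search grows with the number of children. The following toy model (not Oracle's actual cursor-matching code) walks a list of child cursors until it finds one whose environment matches; `find_shareable_child`, `worst_case_comparisons`, and the `signature` key are hypothetical names used only for this sketch.

```python
def find_shareable_child(children, session_env):
    """Walk the child list in order; return (child_number, comparisons made)."""
    comparisons = 0
    for child_number, child_env in enumerate(children):
        comparisons += 1
        if child_env == session_env:
            return child_number, comparisons
    return None, comparisons


def worst_case_comparisons(child_count):
    """Comparisons needed when no existing child can be shared."""
    children = [{"signature": n} for n in range(child_count)]
    _, comparisons = find_shareable_child(children, {"signature": -1})
    return comparisons


if __name__ == "__main__":
    print(worst_case_comparisons(2))     # a healthy cursor with two children
    print(worst_case_comparisons(7023))  # the version count seen in this case
```

In this model every failed search also adds one more child, so the list keeps growing and each later parse has more entries to walk while the latch is held.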
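At this point I started guessing at the cause:
1. The code may be at fault: the program may change some session parameters before each execution, preventing the SQL from being shared.
2. v$sqlarea in 8.1.5 may record this incorrectly, so that what we see is an illusion :)
3. A bug.
OK, the diagnosis cannot stop here. The most direct approach is to dump memory and look: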
SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 4';
Examining the trace file gives the following (an excerpt containing this statement):
BUCKET 21049:
LIBRARY OBJECT HANDLE: handle=c82af1c8
name=
insert into sms_log (MSGDATE,MSGTIME,MSGID,MSGKIND,MSGTYPE,MSGTYPE_MOMT,MSGLEN,
MSGSTATUS,AREAID,IFIDDEST,IFIDSRC,ADDRSRC,ADDRDEST,ADDRFEE,ADDRUSER,
SERVICECODE,PLANID,FEETYPE,FEEVALUE,DATACODING,FLAGS,SMLEN,SMCONT) values
(:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:b17,:b18,:b19,:b20,:b21,:b22)
hash=ece9cab2 timestamp=09-09-2004 12:51:29
namespace=CRSR flags=RON/TIM/PN0/LRG/[10010001]
kkkk-dddd-llll=0000-0001-0001 lock=N pin=S latch=5
lwt=c82af1e0[c82af1e0,c82af1e0] ltm=c82af1e8[c82af1e8,c82af1e8]
pwt=c82af1f8[c82af1f8,c82af1f8] ptm=c82af250[c82af250,c82af250]
ref=c82af1d0[c82af1d0,c82af1d0]
LIBRARY OBJECT: object=c1588e84
type=CRSR flags=EXS[0001] pflags= [00] status=VALD load=0
CHILDREN: size=7024
child# table    reference handle
------ -------- --------- --------
     0 c1589040 c1589008  c668c2bc
     1 c1589040 bfd179c4  c6ec9ee8
     2 c1589040 bfd179e0  c2dd9b3c
     3 c1589040 bfd179fc  c5a46614
     4 c1589040 bfd17a18  c35f1388
     5 c1589040 bfd17a34  c77401bc
     6 c1589040 bfd17a50  c4092838
     7 c1589040 bfddb310  c6cd5258
     8 c1589040 bfddb32c  c63c6650
     9 c1589040 bfddb348  c7e4e3d0
    10 c1589040 bfddb364  c4c4b110
    11 c1589040 bfddb380  c5950348
    12 c1589040 bfddb39c  c6c33aa4
    13 c1589040 bfddb3b8  c672b0bc
...........................................
.....ignore lots of child cursor here.....
...........................................
  7001 bf595bc8 c641fba0  c6467890
  7002 bf595bc8 c641fbbc  c3417168
  7003 bf595bc8 c641fbd8  c3417bb0
  7004 bf595bc8 c641fbf4  c2fdccbc
  7005 bf595bc8 c641fc10  c7f7ca50
  7006 bf595bc8 c641fc2c  c7f508ec
  7007 bf595bc8 c641fc48  c268d8d8
  7008 c641fcb8 c641fc64  bec61ed8
  7009 c641fcb8 c641fc80  c4a6cc5c
  7010 c641fcb8 c641fc9c  c1a8aa34
  7011 c641fcb8 c0ae4ea0  c0ae4ddc
  7012 c641fcb8 c0ae4ebc  bd55fe60
  7013 c641fcb8 c0ae4ed8  c226914c
  7014 c641fcb8 c0ae4ef4  c51dd2e0
  7015 c641fcb8 c0ae4f10  c480c468
  7016 c641fcb8 c0ae4f2c  c60196d0
  7017 c641fcb8 c0ae4f48  c4675d2c
  7018 c641fcb8 c0ae4f64  bd5e2750
  7019 c641fcb8 c0ae4f80  c09b1bb0
  7020 c641fcb8 c0ae4f9c  bf2d6044
  7021 c641fcb8 c0ae4fb8  c332c1c4
  7022 c641fcb8 c0ae4fd4  cbdde0f8
DATA BLOCKS:
data# heap     pointer  status pins change
----- -------- -------- ------ ---- ------
    0 c3ef2c50 c1588f08 I/P/A     0 NONE
Querying v$sql gives the same result:
SQL> select CHILD_NUMBER,EXECUTIONS,OPTIMIZER_MODE,OPTIMIZER_COST,PARSING_USER_ID,
  2  PARSING_SCHEMA_ID,ADDRESS,HASH_VALUE
  3  from v$sql where HASH_VALUE='3974744754';

CHILD_NUMBER EXECUTIONS OPTIMIZER_ OPTIMIZER_COST PARSING_USER_ID PARSING_SCHEMA_ID ADDRESS  HASH_VALUE
------------ ---------- ---------- -------------- --------------- ----------------- -------- ----------
           0      12966 CHOOSE             238150              36                36 C82AF1C8 3974744754
           1       7111 CHOOSE             238150              36                36 C82AF1C8 3974744754
           2       9160 CHOOSE             238150              36                36 C82AF1C8 3974744754
           3       9127 CHOOSE             238150              36                36 C82AF1C8 3974744754
           4       8109 CHOOSE             238150              36                36 C82AF1C8 3974744754
           5       4386 CHOOSE             238150              36                36 C82AF1C8 3974744754
           6       4913 CHOOSE             238150              36                36 C82AF1C8 3974744754
           7       3764 CHOOSE             238150              36                36 C82AF1C8 3974744754
           8       3287 CHOOSE             238150              36                36 C82AF1C8 3974744754
           9       3156 CHOOSE             238150              36                36 C82AF1C8 3974744754
.....
        7015          1 CHOOSE             238150              36                36 C82AF1C8 3974744754
        7016          1 CHOOSE             238150              36                36 C82AF1C8 3974744754
        7017          0 CHOOSE             238150              36                36 C82AF1C8 3974744754
        7018       9396 NONE                                    0                 0 C82AF1C8 3974744754
        7019       5008 CHOOSE             237913              36                36 C82AF1C8 3974744754
        7020        625 CHOOSE             237913              36                36 C82AF1C8 3974744754
        7021      10101 CHOOSE             237913              36                36 C82AF1C8 3974744754
        7022       7859 CHOOSE             237913              36                36 C82AF1C8 3974744754

7023 rows selected.
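Searching Metalink turned up Bug 1210242, described as:
On certain SQL statements cursors are not shared when TIMED_STATISTICS is enabled.
As it happens, TIMED_STATISTICS was set to True on this database. After changing TIMED_STATISTICS to False and watching v$sql, the number of valid child cursors quickly dropped to 2.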
SQL> select CHILD_NUMBER,OPTIMIZER_COST,OPTIMIZER_MODE,EXECUTIONS,ADDRESS
from v$sql where hash_value=3974744754 and OPTIMIZER_MODE='CHOOSE';

CHILD_NUMBER OPTIMIZER_COST OPTIMIZER_ EXECUTIONS ADDRESS
------------ -------------- ---------- ---------- --------
           0         238167 CHOOSE          63943 C82AF1C8
           1         238300 CHOOSE          28915 C82AF1C8

SQL> select CHILD_NUMBER,OPTIMIZER_COST,OPTIMIZER_MODE,EXECUTIONS,ADDRESS
from v$sql where hash_value=3974744754 and OPTIMIZER_MODE='CHOOSE';

CHILD_NUMBER OPTIMIZER_COST OPTIMIZER_ EXECUTIONS ADDRESS
------------ -------------- ---------- ---------- --------
           0         238702 CHOOSE         578124 C82AF1C8
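Another option for this problem is to set a hidden parameter:
_sqlexec_progression_cost = 0
The description of this parameter is:
SQL execution progression monitoring cost threshold
That is, the cost threshold above which SQL execution progress is monitored.
This parameter uses the COST to decide which SQL statements need to be monitored. Execution progress monitoring introduces extra function calls and row sources, which can change the SQL's execution plan or cost and so produce different child cursors.
The default value of _sqlexec_progression_cost is 1000, so every SQL statement with a cost greater than 1000 is monitored.
If the parameter is set to 0, SQL execution progress is not tracked.
Execution progress information is recorded in the V$SESSION_LONGOPS view. If the TIMED_STATISTICS parameter is set to False, this information is not recorded.
So the TIMED_STATISTICS parameter and _sqlexec_progression_cost are two ways to solve the problem.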
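The rules just described can be restated as a small predicate. This is only a restatement of the behaviour as this article describes it, not Oracle's implementation; the function name `is_progress_monitored` and its arguments are hypothetical.

```python
DEFAULT_THRESHOLD = 1000  # default of _sqlexec_progression_cost, per the article


def is_progress_monitored(optimizer_cost, threshold=DEFAULT_THRESHOLD,
                          timed_statistics=True):
    """Return True if, by the rules described above, the statement is monitored."""
    if not timed_statistics:
        return False  # nothing is recorded when TIMED_STATISTICS is False
    if threshold == 0:
        return False  # a threshold of 0 switches progress tracking off
    return optimizer_cost > threshold


if __name__ == "__main__":
    print(is_progress_monitored(238150))                          # the sms_log insert
    print(is_progress_monitored(238150, threshold=0))             # hidden parameter fix
    print(is_progress_monitored(238150, timed_statistics=False))  # the fix used here
```

Either change makes the predicate false for the problem statement, which is why both appear as workarounds.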
A query also shows that in this database the SQL statements with OPTIMIZER_COST > 1000 are mainly the following five:
SQL> select distinct(sql_text) from v$sql where OPTIMIZER_COST >1000;

SQL_TEXT
--------------------------------------------------------------------------------
insert into sms_detail_error (msgdate,addruser,msgid,areaid,reason,spnumber,msgt
ime,ifiddest,msqkey,servicecode,planid,feetype,feevalue,smcont,submittimes,submi
tdate,submittime,msgstate_resp,errorcode_resp,msgstate_rept,errorcode_rept) valu
es (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:
b17,:b18,:b19,:b20)

insert into sms_detail_success (msgdate,addruser,msgid,areaid,spnumber,msgtime,i
fiddest,servicecode,planid,feetype,feevalue,smcont,submittimes,submitdate,submit
time,respdate,resptime,reptdate,repttime,msqkey) values (:b0,:b1,:b2,:b3,:b4,:b5
,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:b17,:b18,:b19)

insert into sms_log (MSGDATE,MSGTIME,MSGID,MSGKIND,MSGTYPE,MSGTYPE_MOMT,
MSGLEN,MSGSTATUS,AREAID,IFIDDEST,IFIDSRC,ADDRSRC,ADDRDEST,ADDRFEE,ADDRUSER,
SERVICECODE,PLANID,FEETYPE,FEEVALUE,DATACODING,FLAGS,SMLEN,SMCONT) values
(:b0,:b1,:b2,:b3,:b
4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:b17,:b18,:b19,:b20,:b2
1,:b22)

insert into sms_resprept_error (msgdate,areaid,addruser,msgid,submittimes,submit
date,submittime,msgid_gw,msgstate_resp,errorcode_resp,msgstate_rept,errorcode_re
pt,servicecode) values (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12)

insert into sms_statusrept (reptdate,addruser,msgid_gw,repttime,statustype,msgid
_stus,msgstate,errorcode) values (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7)

SQL> select sql_text,version_count from v$sqlarea where version_count>10;

SQL_TEXT
VERSION_COUNT
--------------------------------------------------------------------------------
insert into sms_detail_error (msgdate,addruser,msgid,areaid,reason,spnumber,msgt
ime,ifiddest,msqkey,servicecode,planid,feetype,feevalue,smcont,submittimes,submi
tdate,submittime,msgstate_resp,errorcode_resp,msgstate_rept,errorcode_rept) valu
es (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:
b17,:b18,:b19,:b20)
42

insert into sms_log (MSGDATE,MSGTIME,MSGID,MSGKIND,MSGTYPE,MSGTYPE_MOMT,MSGLEN,
MSGSTATUS,AREAID,IFIDDEST,IFIDSRC,ADDRSRC,ADDRDEST,ADDRFEE,ADDRUSER,
SERVICECODE,PLANID,FEETYPE,FEEVALUE,DATACODING,FLAGS,SMLEN,SMCONT) values
(:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:b17,:b18,:b19,:b20,:b21,:b22)
7026

insert into sms_resprept_error (msgdate,areaid,addruser,msgid,submittimes,submit
date,submittime,msgid_gw,msgstate_resp,errorcode_resp,msgstate_rept,errorcode_re
pt,servicecode) values (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12)
301

insert into sms_statusrept (reptdate,addruser,msgid_gw,repttime,statustype,msgid
_stus,msgstate,errorcode) values (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7)
41
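With that, the cause of this shared pool problem was found and resolved promptly. (End)
Editor's note: you can join the discussion of this article and talk directly with the author through the following links:
http://www.eygle.com/index-internal.htm or
http://www.itpub.net/255975.html An In-Depth Look at the Shared Pool (Part 1)
http://www.itpub.net/257107.html An In-Depth Look at the Shared Pool (Part 2)
http://www.itpub.net/260363.html An In-Depth Look at the Shared Pool (Part 3)
http://www.itpub.net/261764.html An In-Depth Look at the Shared Pool (Part 4)
http://www.itpub.net/263051.html An In-Depth Look at the Shared Pool (Part 5)
http://www.itpub.net/265161.html An In-Depth Look at the Shared Pool (Part 6)
http://epub.itpub.net/9/1.htm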