
ORA-1652: a diagnostic case where a full temporary tablespace stopped new sessions from loading data

2014-09-12 15:51

2. Locating the problem

Error symptom:

Fri Aug 17 13:37:39 EAT 2012

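ORA-1652: unable to extend temp segment by 128 in tablespace MDSTEMP

The temp segment could not be extended, which means the temporary tablespace was completely used up at that moment.

Note: the official explanation on Metalink is that no free extents are left to give to this temp segment, and that adding a data file to the tablespace resolves it. On the surface that is true, but let's dig deeper: what exhausted the temp segment? Temp segments hold the data for sorts and data movement. The underlying problem here is not a lack of space, because the same SQL may well succeed if it is run again a little later. The problem is SQL that is not well optimized. A batch DML operation can suddenly consume a large amount of temporary space for sorting, the temp segment runs out, and new data cannot be loaded at that moment. Once the space is released, loading works again. Solving this properly requires SQL optimization.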

The following is from Metalink:

Error: ORA-1652

Text: unable to extend temp segment by %s in tablespace %s

------- -----------------------------------------------------------------------

Cause: Failed to allocate an extent for temp segment in tablespace.

Action: Use ALTER TABLESPACE ADD DATAFILE statement to add one or more files to the tablespace indicated or create the object in another tablespace.

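select * from gnwebbrw12081720;  -- the table has data by now, which shows the temp space has been released

col file_name for a35

select file_name,file_id,bytes/1024/1024,status,autoextensible TABLESPACE_NAME from DBA_TEMP_FILES;

FILE_NAME                              FILE_ID BYTES/1024/1024 STATUS    TAB
----------------------------------- ---------- --------------- --------- ---
/oradata/mdsoss/temp01.dbf                   1           24671 AVAILABLE YES
/oradata/mdsoss/mdstmp.dbf                   2           20000 AVAILABLE NO

The last column is AUTOEXTENSIBLE. It appears under the heading TAB because the query as run has no comma between autoextensible and TABLESPACE_NAME, so the name acts as a column alias. The MDSTEMP file (mdstmp.dbf) is not autoextensible. If it were, the error above would not have occurred, but we do not recommend autoextend for data files because it makes space hard to monitor. With 24 GB + 20 GB available, capacity itself is not the problem; usually poorly written SQL is causing unnecessary sorts.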

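3. Solutions

(1) Restart the instance. After a restart the SMON process releases the sort segment, but ours is a 24x7 database that cannot be taken down.

(2) Add a data file (a tempfile, for a temporary tablespace). Our disk space is very tight, so this is not an option.

(3) Configure a reasonable sort area size. This is already done: the PGA is currently 4 GB and sort_area_size is 208 MB.

(4) SQL optimization. This is the best option.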

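(5) Summary: which operations make the temporary tablespace balloon?

Operations that use temp:

- Index creation or rebuild
- ORDER BY or GROUP BY
- DISTINCT operations
- UNION, INTERSECT, and MINUS
- Sort-merge joins
- ANALYZE operations
- Certain abnormal conditions can also make temp usage surge

While these operations are running, the DBA needs to pay extra attention to temp usage. Now let's look at who is using the temp segments.

(5) Temporary tablespace usage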

select tablespace_name,current_users,total_blocks,used_blocks,free_blocks from v$sort_segment;

TABLESPACE_NAME CURRENT_USERS TOTAL_BLOCKS USED_BLOCKS FREE_BLOCKS

------------------- ------------- ------------ ----------- -----------

TEMP 1 3157760 128 3157632

MDSTEMP 24 2559872 2337152 222720    <- about 91% of MDSTEMP is already in use
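As a quick cross-check of the usage figure, the block counts above can be turned into a percentage and into megabytes. The sketch below is in Python rather than SQL so that it runs without a database. The 8 KB block size is an assumption: it is consistent with the 20000 MB tempfile holding 2559872 blocks, but db_block_size itself is not shown in the output. The function name is made up for illustration.

```python
BLOCK_SIZE = 8192  # assumed db_block_size in bytes; not shown in the output above


def sort_segment_usage(total_blocks, used_blocks, block_size=BLOCK_SIZE):
    """Return (percent used, MB used, MB total) for one v$sort_segment row."""
    pct_used = 100.0 * used_blocks / total_blocks
    mb_used = used_blocks * block_size / 1024 / 1024
    mb_total = total_blocks * block_size / 1024 / 1024
    return pct_used, mb_used, mb_total


# (total_blocks, used_blocks) copied from the v$sort_segment output above.
segments = {"TEMP": (3157760, 128), "MDSTEMP": (2559872, 2337152)}

for name, (total, used) in segments.items():
    pct, mb_used, mb_total = sort_segment_usage(total, used)
    print(f"{name}: {pct:.1f}% used ({mb_used:.0f} MB of {mb_total:.0f} MB)")
```

Under that assumption MDSTEMP comes out at roughly 91% used, while TEMP is almost idle, so the pressure is confined to MDSTEMP.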

(6) Who is using the sort segments

select username,session_addr,sqladdr,sqlhash from v$sort_usage;

USERNAME SESSION_ADDR SQLADDR SQLHASH

------------------------------ ---------------- ---------------- ----------

MDSOSS C0000008483ECFB8 C0000008512150B8 3342809064

SABOCOUSR C00000084B405E50 C00000033F867510 141205382

MDSOSS C00000084740E988 C0000008508AB1C0 409467952

MDSOSS C0000008483DE390 C00000033B8914F0 2951877480

MDSOSS C00000084A404460 C0000003404007A0 2584373469

MDSOSS C0000008483F5088 C00000033FA63E18 2245874020

MDSOSS C0000008483FFC48 C00000084D5B5F98 3000467390

MDSOSS C0000008483F5088 C00000033FA63E18 2245874020

MDSOSS C000000852404A60 C00000084DD6F598 1491833069

MDSOSS C0000008483EBA40 C00000084DE28990 1530468420

MDSOSS C0000008483FD158 C00000084E13A648 3459409074

MDSOSS C000000852404A60 C00000084DD6F598 1491833069

MDSOSS C000000852404A60 C00000084DD6F598 1491833069

MDSOSS C000000852404A60 C00000084DD6F598 1491833069

MDSOSS C0000008523DDBC8 C000000850AE75E0 2349095001

MDSOSS C0000008474068B8 C00000084CCC7790 2982079417

MDSOSS C0000008494501E0 C00000085098B400 3836056470

MDSOSS C00000084B3E5B10 C00000084DA9B990 719858768

MDSOSS C0000008523F3348 C00000084D96F830 2434343698

MDSOSS C00000084740D410 C0000008174B3540 2103182003

USERNAME SESSION_ADDR SQLADDR SQLHASH

------------------------------ ---------------- ---------------- ----------

MDSOSS C00000084A3FD908 C00000084C944888 3846639713

MDSOSS C0000008523D5AF8 C00000084DC6BB30 3158920754

MDSOSS C0000008483FA668 C00000032FCDF3D8 1691040305

MDSOSS C00000084943BFD8 C00000084CE4CA70 2036150049

MDSOSS C00000084A4170F0 C00000084D75FF68 2058192145

MDSOSS C0000008483CF768 C00000084DC096B8 3375505524

MDSOSS C000000849452CD0 C00000031E5A32F8 346930689

MDSOSS C000000849433F08 C000000340106C48 2344782277

MDSOSS C0000008484067A0 C000000344414D88 4097260861

MDSOSS C00000084B3C57D0 C000000850BF8528 3690121153

MDSOSS C00000084740D410 C0000008174B3540 2103182003

MDSOSS C0000008523DDBC8 C000000850AE75E0 2349095001

MDSOSS C0000008483EBA40 C00000084DE28990 1530468420

MDSOSS C0000008483FD158 C00000084E13A648 3459409074

MDSOSS C00000084B3CC328 C00000084D2185C8 497018778

MDSOSS C0000008523F73B0 C00000034046B670 3944536657

MDSOSS C00000084B3CC328 C00000084D2185C8 497018778

MDSOSS C00000084B3CC328 C00000084D2185C8 497018778

MDSOSS C0000008473E3A88 C00000084F13BD98 1450489357

MDSOSS C0000008473E2510 C0000008518A3DA0 2446627624

MDSOSS C0000008473E2510 C0000008518A3DA0 2446627624

USERNAME SESSION_ADDR SQLADDR SQLHASH

------------------------------ ---------------- ---------------- ----------

MDSOSS C0000008473E2510 C0000008518A3DA0 2446627624

MDSOSS C0000008523DDBC8 C000000850AE75E0 2349095001

MDSOSS C0000008523DDBC8 C000000850AE75E0 2349095001

MDSOSS C0000008473E2510 C0000008518A3DA0 2446627624

MDSOSS C0000008473E2510 C0000008518A3DA0 2446627624

MDSOSS C00000084B3CC328 C00000084D2185C8 497018778

MDSOSS C00000084A3F98A0 C00000030871D7E0 1117602678

MDSOSS C00000084941A720 C00000084F2EB838 1495689429

MDSOSS C00000084A3F98A0 C00000030871D7E0 1117602678

MDSOSS C00000084A3F98A0 C00000030871D7E0 1117602678

MDSOSS C00000084941A720 C00000084F2EB838 1495689429

MDSOSS C00000084941A720 C00000084F2EB838 1495689429

MDSOSS C00000084B3CC328 C00000084D2185C8 497018778

MDSOSS C00000084941A720 C00000084F2EB838 1495689429

MDSOSS C00000084941A720 C00000084F2EB838 1495689429

MDSOSS C00000084A3F98A0 C00000030871D7E0 1117602678

MDSOSS C00000084A3F98A0 C00000030871D7E0 1117602678

MDSOSS C00000084A3F8328 C000000851A28740 87302580

MDSOSS C00000084A3F8328 C000000851A28740 87302580

MDSOSS C00000084A3F8328 C000000851A28740 87302580

61 rows selected

This tells us that SQL run by the MDSOSS user is the culprit filling up the temp segment.

(7) Find which SQL statements are using the sort segments, with a multi-table join

select se.username, se.sid, su.extents,
       su.blocks * to_number(rtrim(p.value)) as Space,
       tablespace, segtype, sql_text
  from v$sort_usage su, v$parameter p, v$session se, v$sql s
 where p.name = 'db_block_size'
   and su.session_addr = se.saddr
   and s.hash_value = su.sqlhash
   and s.address = su.sqladdr
 order by se.username, se.sid;

USERNAME SID EXTENTS SPACE TABLESPACE SEGTYPE SQL_TEXT

MDSOSS 840 1 1048576 MDSTEMP DATA insert into tmp1768202 select * from

TPA_F_EMAIL_SMTP_SUM_5 where first_result

MDSOSS 840 1 1048576 MDSTEMP INDEX insert into tmp1768202 select * from

TPA_F_EMAIL_SMTP_SUM_5 where first_result

MDSOSS 840 1 1048576 MDSTEMP INDEX insert into tmp1768202 select * from

TPA_F_EMAIL_SMTP_SUM_5 where first_result

MDSOSS 840 1 1048576 MDSTEMP DATA insert into tmp1768202 select * from

TPA_F_EMAIL_SMTP_SUM_5 where first_result

MDSOSS 840 1 1048576 MDSTEMP DATA insert into tmp1768202 select * from

TPA_F_EMAIL_SMTP_SUM_5 where first_result

MDSOSS 877 14 14680064 MDSTEMP DATA CREATE GLOBAL TEMPORARY TABLE tmp708304 AS SELECT * from

TPA_S_SP_SUM_5 where

MDSOSS 879 21 22020096 MDSTEMP DATA insert into tmp521803

(ne_id,ne_type,first_result,sum_level,compress_date,regio

MDSOSS 879 1 1048576 MDSTEMP DATA insert into tmp521803

(ne_id,ne_type,first_result,sum_level,compress_date,regio

MDSOSS 879 1 1048576 MDSTEMP INDEX insert into tmp521803

(ne_id,ne_type,first_result,sum_level,compress_date,regio

MDSOSS 879 1 1048576 MDSTEMP DATA insert into tmp521803

(ne_id,ne_type,first_result,sum_level,compress_date,regio

MDSOSS 922 389 407896064 MDSTEMP DATA insert into tmp2489701 select * from GNWEBBRW12081720 s

where capturetime >= to

MDSOSS 946 6 6291456 MDSTEMP DATA select * from dual

MDSOSS 948 417 437256192 MDSTEMP DATA insert into tmp2462001 SELECT s.*,decode(m.flag,1,0,1)

result1_flag FROM HTTP_

MDSOSS 1028 1 1048576 MDSTEMP DATA CREATE GLOBAL TEMPORARY TABLE tmp1804004 AS SELECT * from

TPA_S_SP_SUM_5 where

MDSOSS 1050 28 29360128 MDSTEMP DATA insert into tmp2407302 select substr(imei,1,8)

imei,useragent brand from GnWebb

MDSOSS 1050 434 455081984 MDSTEMP DATA insert into tmp2407302 select substr(imei,1,8)

imei,useragent brand from GnWebb

MDSOSS 1066 1 1048576 MDSTEMP INDEX update tmp1737203 t set apn= nvl(apn,'-1'),lac= nvl

(lac,'-1'),ci= nvl(ci,'-1')

MDSOSS 1066 2 2097152 MDSTEMP DATA update tmp1737203 t set apn= nvl(apn,'-1'),lac= nvl

(lac,'-1'),ci= nvl(ci,'-1')

MDSOSS 1066 1 1048576 MDSTEMP DATA update tmp1737203 t set apn= nvl(apn,'-1'),lac= nvl

(lac,'-1'),ci= nvl(ci,'-1')

MDSOSS 1066 2 2097152 MDSTEMP DATA update tmp1737203 t set apn= nvl(apn,'-1'),lac= nvl

(lac,'-1'),ci= nvl(ci,'-1')

MDSOSS 1066 1 1048576 MDSTEMP INDEX update tmp1737203 t set apn= nvl(apn,'-1'),lac= nvl

(lac,'-1'),ci= nvl(ci,'-1')

MDSOSS 1074 1 1048576 MDSTEMP DATA insert into tmp2336801 SELECT s.*,decode(m.flag,1,0,1)

result1_flag FROM WAP_R

MDSOSS 1074 34 35651584 MDSTEMP DATA insert into tmp2336801 SELECT s.*,decode(m.flag,1,0,1)

result1_flag FROM WAP_R

MDSOSS 1086 4 4194304 MDSTEMP DATA insert into tmp1261403

(ne_id,ne_type,first_result,sum_level,compress_date,regi

MDSOSS 1086 4 4194304 MDSTEMP DATA insert into tmp1261403

(ne_id,ne_type,first_result,sum_level,compress_date,regi

MDSOSS 1086 2 2097152 MDSTEMP INDEX insert into tmp1261403

(ne_id,ne_type,first_result,sum_level,compress_date,regi

MDSOSS 1086 900 943718400 MDSTEMP DATA insert into tmp1261403

(ne_id,ne_type,first_result,sum_level,compress_date,regi

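Looking at the statements that occupy the most sort-segment space, sessions 922, 948, and 1050 (389, 417, and 434 extents) all work on gnwebbrw / HTTP XDR data. When gnweb data is inserted into temporary tables for analysis, it generates heavy sorting and takes up a large share of the sort segment. Session 1086 holds even more (900 extents) for its insert into tmp1261403. The next step is to analyze whether these sorts are actually necessary.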
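To rank sessions rather than individual rows, the SPACE column can be summed per SID. The sketch below does this in Python for the larger rows of the listing only; the row data is copied from the output above, and the helper name is hypothetical.

```python
from collections import defaultdict

# (sid, bytes) pairs copied from the larger rows of the listing above.
rows = [
    (922, 407896064),
    (948, 437256192),
    (1050, 29360128),
    (1050, 455081984),
    (1086, 4194304),
    (1086, 4194304),
    (1086, 2097152),
    (1086, 943718400),
]


def temp_mb_by_sid(rows):
    """Sum temp bytes per SID and return (sid, MB) pairs, largest first."""
    totals = defaultdict(int)
    for sid, nbytes in rows:
        totals[sid] += nbytes
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(sid, nbytes // (1024 * 1024)) for sid, nbytes in ranked]


for sid, mb in temp_mb_by_sid(rows):
    print(sid, mb, "MB")
```

Summing per session matters because one session can hold several temp segments at once (DATA and INDEX rows for the same statement), so the largest single row does not always identify the largest consumer.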

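(8) Sort area allocation

How the sort area is allocated:

- With a dedicated server, the sort area is allocated in the PGA.
- With a shared server, the sort area is allocated in the UGA (the UGA is allocated in the shared pool, or in the large pool when one is configured).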

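We use dedicated server mode exclusively. The pga_aggregate_target parameter determines the size of the sort area, which in this case should be 5% of total PGA memory. With PGA = 4 GB, sort_area = 4 GB * 5% = 204.8 MB, which matches what we see when monitoring with Spotlight.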
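The arithmetic can be checked with a couple of lines of Python. This is only a sketch of the article's 5% rule of thumb: the fraction is taken from the text above, not from an Oracle API, and the real per-operation limit depends on the Oracle version and internal settings. The function name is made up for illustration.

```python
def sort_area_mb(pga_target_gb, fraction=0.05):
    """Sort area in MB as a fraction of the PGA target (the 5% rule used above)."""
    return pga_target_gb * 1024 * fraction


print(f"{sort_area_mb(4):.1f} MB")
```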

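Summary: how do we reduce temporary tablespace growth at the root? There are two ways:

1. Set a reasonable PGA size or sort_area_size

2. Optimize the SQL statements that cause disk sorts

Finally, I killed the processes that were using the most sort-segment space to relieve the pressure temporarily, and handed the statements to the developers for SQL tuning.

select sid,paddr from v$session where sid=1177;  -- look up the process address

select p.spid,se.sid,se.username,se.machine from v$process p,v$session se where se.paddr=p.addr and p.addr='C0000008482BF9B0';  -- map the process address to the OS process ID, then kill that process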

oracle@TJGRDB:[/oracle/admin/mdsoss/bdump] kill -9 29944

oracle@TJGRDB:[/oracle/admin/mdsoss/bdump] ps -ef | grep 15072

oracle 10587 8420 1 22:58:58 pts/tb 0:00 grep 15072

oracle 15072 1 250 20:40:04 ? 127:20 oraclemdsoss (LOCAL=NO)

oracle@TJGRDB:[/oracle/admin/mdsoss/bdump] kill -9 15072

oracle@TJGRDB:[/oracle/admin/mdsoss/bdump] ps -ef | grep 23259

oracle 11625 8420 1 23:02:02 pts/tb 0:00 grep 23259

oracle 23259 1 253 21:00:15 ? 53:31 oraclemdsoss (LOCAL=NO)

oracle@TJGRDB:[/oracle/admin/mdsoss/bdump] kill -9 23259

*******************************************************************************************************************************************

“ORA-1652: unable to extend temp segment”

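The temporary tablespace is a resource shared by many sessions, and quotas cannot limit how much temporary space each user consumes. When the temporary tablespace fills up, any user who tries to obtain more temporary space receives the "ORA-1652: unable to extend temp segment" error.

Basics of Oracle sorting

An Oracle session first tries to sort in memory. When Oracle needs to store data in a temporary table or build a hash table for a hash join, it likewise works in memory first. These two operations are not sorts, but Oracle handles them in the same way.

If an operation needs more memory than the threshold allows, Oracle splits it into smaller operations that each fit in memory and writes the partial results to the temporary tablespace on disk. How much memory a session may use depends on initialization parameters: if workarea_size_policy is AUTO, pga_aggregate_target governs it; otherwise sort_area_size, hash_area_size, and bitmap_merge_area_size control memory use.

When a sort is too large to run in memory, Oracle allocates space in the temporary tablespace to carry it out. The temp segment in a temporary tablespace, also called the "sort segment", is owned by SYS, not by the user running the sort. There is normally only one sort segment per temporary tablespace, because multiple sessions can share it. Users do not need a quota on the temporary tablespace to use it; in fact Oracle ignores any such quota.

A temporary tablespace can hold only temporary segments. Operations on temp segments therefore generate no undo or redo, and allocating temp space to a user does not have to be recorded in the data dictionary or in bitmap blocks, because the data in a temporary tablespace does not outlive the session that created it.

A single SQL statement can have several sort operations, and a database session can have several SQL statements active at once. When sort results on disk are no longer needed, their blocks in the sort segment are marked as free and can be allocated to other sort operations.

A sort operation fails when both of the following hold: the sort segment has no free blocks, and the temporary tablespace has no room for the sort segment to allocate additional extents. In most cases the statement then fails with "ORA-1652: unable to extend temp segment", and the error is recorded in the instance's alert log.

Note, however, that ORA-1652 does not always indicate a temporary tablespace problem. ALTER TABLE ... MOVE can also raise it when the target tablespace does not have enough room for the table being moved.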

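Identifying SQL statements that fail for lack of temporary space

Although Oracle logs ORA-1652 to the alert log to tell the DBA that a space problem occurred, it does not identify the statement that failed.

You can trace ORA-1652 with an Oracle diagnostic event. The event has very little overhead, because information is written only when ORA-1652 actually occurs.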

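Set it for the current session:

ALTER SESSION SET EVENTS '1652 trace name errorstack';

Set it instance-wide:

ALTER SYSTEM SET EVENTS '1652 trace name errorstack';

Set it permanently in the spfile:

ALTER SYSTEM SET EVENT = '1652 trace name errorstack' SCOPE = SPFILE;

You can also trace another session with "oradebug event".

To turn the event off, use the following statements: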

ALTER SYSTEM RESET EVENT SCOPE = SPFILE SID = '*';

ALTER SYSTEM SET EVENTS '1652 trace name context off';

ALTER SESSION SET EVENTS '1652 trace name context off';

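If a SQL statement fails for lack of temporary space while the ORA-1652 diagnostic event is active, the Oracle server process writes the error details to a trace file in the user_dump_dest directory, and the alert log names that trace file. For example: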

Tue Jan 2 17:21:14 2007

Errors in file

/u01/app/oracle/admin/rpkprod/udump/rpkprod_ora_10847.trc: ORA-01652: unable to extend temp segment by 128 in tablespace TEMP
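When scanning an alert log for these entries, it helps to pull out the tablespace name and the size of the failed request. The following Python sketch does that with a regular expression. The pattern is written against the two message variants quoted in this article ("ORA-1652" and "ORA-01652"), the 8 KB block size used to convert blocks to bytes is an assumption, and the function name is hypothetical.

```python
import re

# Matches both the "ORA-1652" and "ORA-01652" spellings quoted in this article.
ORA_1652 = re.compile(
    r"ORA-0?1652: unable to extend temp segment by (\d+) in tablespace (\S+)"
)


def parse_ora_1652(line, block_size=8192):
    """Return (tablespace, blocks, bytes requested), or None if there is no match.

    block_size is an assumed db_block_size; pass the real value if it differs.
    """
    match = ORA_1652.search(line)
    if match is None:
        return None
    blocks = int(match.group(1))
    return match.group(2), blocks, blocks * block_size


line = (
    "/u01/app/oracle/admin/rpkprod/udump/rpkprod_ora_10847.trc: "
    "ORA-01652: unable to extend temp segment by 128 in tablespace TEMP"
)
print(parse_ora_1652(line))
```

With 8 KB blocks, a request for 128 blocks is a single 1 MB extent, so the error can be raised by a very small allocation once the tablespace is already full.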

The trace file will contain information similar to the following:

Oracle Database 10g Release 10.2.0.2.0 - 64bit Production

ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_2

System name: SunOS

Node name: rpk

Release: 5.8

Version: Generic_108528-27

Machine: sun4u

Instance name: rpkprod

Redo thread mounted by this instance: 1

Oracle process number: 18

Unix process pid: 10847, image: oracle@rpk (TNS V1-V3)

*** ACTION NAME:() 2007-01-02 17:21:14.871

*** MODULE NAME:(SQL*Plus) 2007-01-02 17:21:14.871

*** SERVICE NAME:(SYS$USERS) 2007-01-02 17:21:14.871

*** SESSION ID:(130.13512) 2007-01-02 17:21:14.871

*** 2007-01-02 17:21:14.871

ksedmp: internal or fatal error

ORA-01652: unable to extend temp segment by 128 in tablespace TEMP

Current SQL statement for this session:

SELECT "A1"."INVOICE_ID", "A1"."INVOICE_NUMBER", "A1"."INVOICE_DAT

E", "A1"."CUSTOMER_ID", "A1"."CUSTOMER_NAME", "A1"."INVOICE_AMOUNT",

"A1"."PAYMENT_TERMS", "A1"."OPEN_STATUS", "A1"."GL_DATE", "A1"."ITE

M_COUNT", "A1"."PAYMENTS_TOTAL"

FROM "INVOICE_SUMMARY_VIEW" "A1"

ORDER BY "A1"."CUSTOMER_NAME", "A1"."INVOICE_NUMBER"

----- Call Stack Trace -----

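Although this method yields fairly detailed information, note that the statement it captures is not necessarily the root cause: an earlier statement may have consumed 99.9% of the temporary space, while a second statement is the one captured in the trace file.

The trace file also contains a call stack trace and a binary stack dump. These are usually of little value unless you want to study Oracle internals.

This diagnostic event should generally not be set at the instance level. If the error occurs frequently during batch processing, set it with ALTER SESSION at the start of the batch job to trace at the session level.

Monitoring the temporary tablespace

You can monitor temporary tablespace usage in real time to catch problems before the error occurs. At any moment Oracle can tell the DBA which temporary tablespaces exist, how much sort space each session is using, and how much each statement is using. All of this is available through the v$ views.

Temp segments

Oracle creates the sort segment the first time a disk sort is performed and extends it as needed, but never shrinks it.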

SELECT A.tablespace_name tablespace,

D.mb_total,

SUM(A.used_blocks * D.block_size) / 1024 / 1024 mb_used,

D.mb_total - SUM(A.used_blocks * D.block_size) / 1024 / 1024 mb_free

FROM v$sort_segment A,

(SELECT B.name, C.block_size, SUM(C.bytes) / 1024 / 1024 mb_total

FROM v$tablespace B, v$tempfile C

WHERE B.ts# = C.ts#

GROUP BY B.name, C.block_size) D

WHERE A.tablespace_name = D.name

GROUP by A.tablespace_name, D.mb_total;

Sort space used by sessions

SELECT S.sid || ',' || S.serial# sid_serial,

S.username,

S.osuser,

P.spid,

S.module,

S.program,

SUM(T.blocks) * TBS.block_size / 1024 / 1024 mb_used,

T.tablespace,

COUNT(*) sort_ops

FROM v$sort_usage T, v$session S, dba_tablespaces TBS, v$process P

WHERE T.session_addr = S.saddr

AND S.paddr = P.addr

AND T.tablespace = TBS.tablespace_name

GROUP BY S.sid,

S.serial#,

S.username,

S.osuser,

P.spid,

S.module,

S.program,

TBS.block_size,

T.tablespace

ORDER BY sid_serial;

Temporary space used by statements

SELECT S.sid || ',' || S.serial# sid_serial,

S.username,

T.blocks * TBS.block_size / 1024 / 1024 mb_used,

T.tablespace,

T.sqladdr address,

Q.hash_value,

Q.sql_text

FROM v$sort_usage T, v$session S, v$sqlarea Q, dba_tablespaces TBS

WHERE T.session_addr = S.saddr

AND T.sqladdr = Q.address(+)

AND T.tablespace = TBS.tablespace_name

ORDER BY S.sid;