
Oracle startup hang caused by MAXAIO

2013-02-21 09:53
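An Oracle database, 10.2.0.4 for Linux x86 (x86_64 according to the trace header), hung at the open stage during a normal restart. At the operating-system level, several processes started by scheduled jobs were each using nearly 100% CPU, and were evidently stuck waiting (busy-waiting, given the CPU usage) rather than making progress. Trace (.trc) files also appeared quickly in Oracle's bdump directory. The key lines in those files were: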

WARNING:io_submit failed due to kernel limitations MAXAIO for process=0 pending aio=0

WARNING:asynch I/O kernel limits is set at AIO-MAX-NR=65536 AIO-NR=65536

WARNING:Oracle process running out of OS kernel I/O resources (1)

Taken literally, the operating system's MAXAIO limit is preventing the Oracle processes from submitting I/O.
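The AIO-NR and AIO-MAX-NR figures in the warning correspond to the kernel counters exposed as /proc/sys/fs/aio-nr and /proc/sys/fs/aio-max-nr. Below is a minimal sketch for checking how much headroom is left. The function name aio_headroom and its optional file arguments are illustrative and not part of any Oracle or Linux tool:

```shell
#!/bin/sh
# Report how many kernel AIO slots are reserved and how many remain.
# The two arguments are optional and exist only so the function can be
# tried against sample files; by default it reads the live /proc values.
aio_headroom() {
    nr_file=${1:-/proc/sys/fs/aio-nr}
    max_file=${2:-/proc/sys/fs/aio-max-nr}
    nr=$(cat "$nr_file") || return 1
    max=$(cat "$max_file") || return 1
    echo "aio-nr=$nr aio-max-nr=$max free=$((max - nr))"
    # Zero free slots is the AIO-NR=65536 / AIO-MAX-NR=65536 state in the trace.
    if [ "$nr" -ge "$max" ]; then
        echo "AIO slots exhausted"
    fi
}

# Query the live system only when the /proc entries are readable.
if [ -r /proc/sys/fs/aio-nr ] && [ -r /proc/sys/fs/aio-max-nr ]; then
    aio_headroom
fi
```

With the values from the trace above, the function would report free=0 followed by "AIO slots exhausted".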

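The references I found describe this as a bug, but they give two workarounds: (1) increase the kernel parameter aio-max-nr (shown as AIO-MAX-NR in the trace); (2) disable disk asynchronous I/O. I chose to raise aio-max-nr.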

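1. The kernel parameter aio-max-nr can be changed temporarily (the change is lost at reboot):

# echo 1048576 > /proc/sys/fs/aio-max-nr

2. To change aio-max-nr permanently, add the following line to /etc/sysctl.conf:

fs.aio-max-nr = 1048576

Then apply the setting with the following command:

# /sbin/sysctl -p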
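If /etc/sysctl.conf already has an fs.aio-max-nr entry, appending a second one leaves two conflicting lines in the file. Here is a hedged sketch of an edit that replaces an existing entry or appends a new one. The helper name set_sysctl_entry is made up for this example, and it assumes the value contains no characters that are special to sed:

```shell
#!/bin/sh
# Set "key = value" in a sysctl-style file: replace an existing entry for
# the key, or append one if none exists. The file is a parameter so the
# function can be tried on a scratch copy before touching /etc/sysctl.conf.
set_sysctl_entry() {
    key=$1; value=$2; file=$3
    # Escape dots so that "fs.aio-max-nr" is matched literally.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -Eq "^[[:space:]]*${pattern}[[:space:]]*=" "$file"; then
        tmp=$(mktemp) || return 1
        # Write back with cat rather than mv to keep the file's owner and mode.
        sed -E "s|^[[:space:]]*${pattern}[[:space:]]*=.*|${key} = ${value}|" "$file" > "$tmp" \
            && cat "$tmp" > "$file"
        status=$?
        rm -f "$tmp"
        return $status
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}
```

As root, this could be used as `set_sysctl_entry fs.aio-max-nr 1048576 /etc/sysctl.conf`, followed by `/sbin/sysctl -p`. Afterwards `cat /proc/sys/fs/aio-max-nr` should show the new value.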

Appendix: top output

Tasks: 568 total, 6 running, 562 sleeping, 0 stopped, 0 zombie

Cpu(s): 20.4%us, 0.1%sy, 0.0%ni, 79.1%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st

Mem: 132051284k total, 117157820k used, 14893464k free, 197072k buffers

Swap: 5751260k total, 2404292k used, 3346968k free, 114662552k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

12975 oracle 25 0 1687m 25m 19m R 99.8 0.0 9:38.01 ora_p004_oncz

12981 oracle 25 0 1687m 25m 19m R 99.8 0.0 9:38.00 ora_p007_oncz

12983 oracle 25 0 1687m 25m 19m R 99.8 0.0 9:38.01 ora_p008_oncz

12985 oracle 25 0 1687m 25m 19m R 99.8 0.0 9:38.00 ora_p009_oncz

12002 oracle 25 0 1968m 1.6g 1.3g R 90.5 1.3 21:25.03 ora_j000_ofdb

Appendix: trace file contents from the bdump directory

/u01/app/oracle/admin/oncz/bdump/oncz_p008_12983.trc

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_1

System name: Linux

Node name: db-172-17-2-8

Release: 2.6.18-348.el5

Version: #1 SMP Tue Jan 8 17:53:53 EST 2013

Machine: x86_64

Instance name: oncz

Redo thread mounted by this instance: 1

Oracle process number: 29

Unix process pid: 12983, image: oracle@db-172-17-2-8 (P008)

*** SERVICE NAME:() 2013-02-19 15:55:08.764

*** SESSION ID:(142.1) 2013-02-19 15:55:08.764

ORA-27090: Message 27090 not found; product=RDBMS; facility=ORA

Additional information: 3

Additional information: 128

Additional information: 65536

WARNING:io_submit failed due to kernel limitations MAXAIO for process=0 pending aio=0

WARNING:asynch I/O kernel limits is set at AIO-MAX-NR=65536 AIO-NR=65536

WARNING:Oracle process running out of OS kernel I/O resources (1)

WARNING:Oracle process running out of OS kernel I/O resources (1)

WARNING:Oracle process running out of OS kernel I/O resources (1)

WARNING:Oracle process running out of OS kernel I/O resources (1)
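Since each affected process writes its own trace file, it can help to list every file in bdump that carries the warning. The following is a small sketch. The function name find_maxaio_traces is hypothetical, and the search string is the first warning line shown above:

```shell
#!/bin/sh
# Print each .trc file in a directory that contains the MAXAIO warning,
# followed by the number of matching lines in that file.
find_maxaio_traces() {
    dir=$1
    for f in "$dir"/*.trc; do
        # Skip the unexpanded glob when the directory has no .trc files.
        [ -f "$f" ] || continue
        n=$(grep -c 'io_submit failed due to kernel limitations MAXAIO' "$f")
        if [ "$n" -gt 0 ]; then
            echo "$f $n"
        fi
    done
}
```

For this instance, the call would be `find_maxaio_traces /u01/app/oracle/admin/oncz/bdump`.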

Appendix: reference material

Bug 9949948 Linux: Process spin under ksfdrwat0 if OS Async IO not configured high enough

This note gives a brief overview of bug 9949948.

The content was last updated on: 28-OCT-2011


Affects:

Product (Component) Oracle Server (Rdbms)

Range of versions believed to be affected Versions >= 10.2.0.4 but BELOW 11.1

Versions confirmed as being affected

10.2.0.5

Platforms affected

Linux X86-64bit

Linux 32bit

It is believed to be a regression in default behaviour thus:

Regression introduced in 10.2.0.5

Fixed:

This issue is fixed in

11.1.0.6 (Base Release)

10.2.0.5.2 Patch Set Update

10.2.0.5 Patch 5 on Windows Platforms

Symptoms:

Related To:

Hang (Process Spins)

Waits for "i/o slave wait"

DISK_ASYNCH_IO

Description

This problem is introduced in 10.2.0.5

It only affects platforms where Oracle has to reserve async IO slots,

such as Linux platforms.

If the OS async IO layer is underconfigured and an Oracle process

cannot get sufficient AIO slots then rather than reverting to

using non AIO call the process may go into an infinite spin

under ksfdrwat0.

Rediscovery notes:

The spin will be preceded by messages in the trace

file of the form:

WARNING:io_submit failed due to kernel limitations MAXAIO

for process=0 pending aio=0

WARNING:asynch I/O kernel limits is set at AIO-MAX-NR=65536 AIO-NR=65518

WARNING:1 Oracle process running out of OS kernelI/O resources aiolimit=0

Notice specifically that the value for aiolimit is reported as "0"

for this bug.

The process then spins in ksfdrwat0 typically with a stack showing

skgfqio ()

ksfdgo ()

ksfdwtio ()

ksfdwat1 ()

ksfdrwat0 () <<< Spin point

ksfdblock ()

kcflwi ()

kcflci ()

kcblci ()

kcblcio ()

kcblgt ()

kcbldrget ()

It will show repeated waits for "i/o slave wait", which can be

misleading as that is normally considered an idle wait event.

Workaround

Raise the OS AIO limits such that the number of concurrent slot

requirements never exceeds the OS limit.

ie: Increase AIO-MAX-NR

OR

Disable async IO (Set DISK_ASYNCH_IO=FALSE)

See Note:1313555.1 for additional notes on this issue.

Please note: The above is a summary description only. Actual symptoms can vary. Matching to any symptoms here does not confirm that you are encountering this problem. For questions about this bug please consult Oracle Support.

References

Bug:9949948 (This link will only work for PUBLISHED bugs)

Note:245840.1 Information on the sections in this article