FAQ: How to read an AWR report. [ID 1359094.1]

Posted 2011-12-15 17:50
Modified: 14-NOV-2011  Type: HOWTO  Status: PUBLISHED
In this Document

Goal

Solution

1. Load Profile

2. Instance Efficiency

3. Top 5 Timed Events

4. SQL Statistics

5. Latch Activity

5. ADDM reports can be reviewed along with AWR to assist in diagnosis.

6. Some notable wait events:

References

Applies to:

Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 11.2.0.2 - Release: 10.2 to 11.2

Information in this document applies to any platform.

Goal

The goal is to help customers interpret an AWR report.

Solution

Gather AWR reports during periods when performance is acceptable and keep them as a baseline. They provide statistics to compare against when a problem occurs: the load, top wait events, parse statistics, and so on. The comparison gives a better picture of what has changed and can frequently narrow down the issue. Also keep records of any other changes, such as OS, application, and database changes. This is especially important when opening SRs (service requests) with Support, as performance issues can be complex and involve many variables.

When creating snapshots, make sure the snapshot interval is not too long. By default, snapshots are taken every hour, which is fine. When comparing good performance to bad performance, make sure both reports cover the same length of time, for example 1 hour each.
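As a small illustration of the point about equal snapshot windows, the sketch below checks that two report intervals have (nearly) the same length before their statistics are compared. The timestamps, the timestamp format, and the 5% tolerance are made-up assumptions for illustration; they do not come from any AWR report.

```python
from datetime import datetime

def interval_minutes(begin: str, end: str) -> float:
    """Elapsed minutes between two snapshot times (the format is an assumption)."""
    fmt = "%Y-%m-%d %H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(begin, fmt)
    return delta.total_seconds() / 60.0

def comparable(baseline_min: float, problem_min: float, tolerance: float = 0.05) -> bool:
    """True when the two intervals differ by at most `tolerance` (hypothetical 5%)."""
    return abs(baseline_min - problem_min) <= tolerance * baseline_min

# Hypothetical snapshot ranges: one from a good period, one from a slow period.
baseline = interval_minutes("2011-11-07 09:00", "2011-11-07 10:00")
problem = interval_minutes("2011-11-14 09:00", "2011-11-14 10:00")
print(baseline, problem, comparable(baseline, problem))
```

If the check fails, regenerating one of the reports over a matching snapshot range is usually simpler than trying to rescale the totals.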

First, find exactly what is slow or slower:

a. application

b. certain program or SQL

c. OS

This will expedite the solution process and guide the DBA in running the right diagnostics.

If overall performance of the database is slow, AWR is a good diagnostic method.

To start reviewing AWR, check the load profile first under 'Load Profile':

1. Load Profile

Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:          4,585,414.80          3,165,883.14
              Logical reads:             94,185.63             65,028.07
              Block changes:             40,028.57             27,636.71
             Physical reads:              2,206.12              1,523.16
            Physical writes:              3,939.97              2,720.25
                 User calls:                 50.08                 34.58
                     Parses:                 26.96                 18.61
                Hard parses:                  1.49                  1.03
                      Sorts:                 18.36                 12.68
                     Logons:                  0.13                  0.09
                   Executes:              4,925.89              3,400.96
               Transactions:                  1.45

% Blocks changed per Read:   42.50    Recursive Call %:    99.19
Rollback per transaction %:   59.69       Rows per Sort:  1922.64


This load profile shows high redo activity together with high physical writes. There are more physical writes than physical reads in this load, and 42.50% of blocks are changed per read.

Furthermore, hard parses are few compared to soft parses. If a mutex wait such as 'library cache: mutex X' appears among the top waits, it may be related to a high parse rate.

When comparing to the baseline, check whether the load has changed by comparing redo size, user calls, and parsing.
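The ratios quoted from the Load Profile can be reproduced with a few lines of arithmetic. The sketch below uses the per-second figures from the sample report; the `pct_change` helper and the baseline value passed to it are hypothetical, shown only to illustrate a baseline comparison.

```python
# Per-second values copied from the sample Load Profile.
load = {
    "redo_size": 4_585_414.80,
    "logical_reads": 94_185.63,
    "block_changes": 40_028.57,
    "physical_reads": 2_206.12,
    "physical_writes": 3_939.97,
    "user_calls": 50.08,
    "parses": 26.96,
    "hard_parses": 1.49,
}

def pct(part: float, whole: float) -> float:
    return 100.0 * part / whole

# Matches the "% Blocks changed per Read" line of the report (42.50).
blocks_changed_per_read = pct(load["block_changes"], load["logical_reads"])
# Share of parses that are hard parses.
hard_parse_share = pct(load["hard_parses"], load["parses"])
# Physical writes per physical read; above 1 means a write-heavy load.
write_to_read = load["physical_writes"] / load["physical_reads"]

def pct_change(baseline: float, current: float) -> float:
    """Percent change of a metric relative to a baseline report (hypothetical helper)."""
    return 100.0 * (current - baseline) / baseline

print(f"blocks changed per read: {blocks_changed_per_read:.2f}%")
print(f"hard parse share:        {hard_parse_share:.2f}%")
print(f"writes per read:         {write_to_read:.2f}")
# The baseline redo rate below is a made-up number, not taken from any report.
print(f"redo change vs baseline: {pct_change(3_000_000.0, load['redo_size']):.1f}%")
```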

2. Instance Efficiency

Next review the Instance Efficiency:

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %:   99.91       Redo NoWait %:  100.00
Buffer  Hit   %:   98.14    In-memory Sort %:   99.98
Library Hit   %:   99.91        Soft Parse %:   94.48
Execute to Parse %:   99.45         Latch Hit %:   99.97
Parse CPU to Parse Elapsd %:   71.23     % Non-Parse CPU:   99.00


The Soft Parse % of 94.48 shows that hard parsing is low. The high Execute to Parse % indicates good cursor reuse. Generally, we want the statistics here to be close to 100%, but remember that on some systems this may not be the case. For example, in a data warehouse environment, hard parsing may be higher due to the use of materialized views and/or histograms. So again, comparing to a baseline report from when performance was good is important.
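Two of these percentages can be approximated from the Load Profile rates. The sketch below assumes the commonly documented Statspack/AWR definitions, Soft Parse % = 100 * (1 - hard parses / parses) and Execute to Parse % = 100 * (1 - parses / executes). Because the inputs are rounded per-second rates, the results only approximate the report's figures.

```python
# Per-second rates copied from the sample Load Profile.
parses = 26.96
hard_parses = 1.49
executes = 4_925.89

# Assumed definitions (see lead-in); the report computes them from raw counters.
soft_parse_pct = 100.0 * (1.0 - hard_parses / parses)
execute_to_parse_pct = 100.0 * (1.0 - parses / executes)

print(f"Soft Parse %:       {soft_parse_pct:.2f}  (report shows 94.48)")
print(f"Execute to Parse %: {execute_to_parse_pct:.2f}  (report shows 99.45)")
```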

3. Top 5 Timed Events

Next move on to the top waits:

Top 5 Timed Events                                         Avg %Total
~~~~~~~~~~~~~~~~~~                                        wait   Call
Event                                 Waits    Time (s)   (ms)   Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
db file scattered read           10,152,564      81,327      8   29.6   User I/O
db file sequential read          10,327,231      75,878      7   27.6   User I/O
CPU time                                         56,207          20.5
read by other session             4,397,330      33,455      8   12.2   User I/O
PX Deq Credit: send blkd             31,398      26,576    846    9.7      Other
-------------------------------------------------------------


In this report, the top waits are I/O reads. The db file scattered read wait usually comes from full table scans or index fast full scans; it occurs when a session is waiting on multiblock I/O. The db file sequential read wait usually comes from index reads and other single-block reads.
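To make the table easier to read, the sketch below recomputes the average wait per event (time divided by waits) and adds up the share of call time per wait class. The figures are copied from the sample Top 5 Timed Events; the grouping label "CPU" for the row without a wait class is an assumption made for this illustration.

```python
# (event, waits, time_s, pct_of_call_time, wait_class) from the sample report.
events = [
    ("db file scattered read", 10_152_564, 81_327, 29.6, "User I/O"),
    ("db file sequential read", 10_327_231, 75_878, 27.6, "User I/O"),
    ("CPU time", None, 56_207, 20.5, None),
    ("read by other session", 4_397_330, 33_455, 12.2, "User I/O"),
    ("PX Deq Credit: send blkd", 31_398, 26_576, 9.7, "Other"),
]

def avg_wait_ms(waits: int, time_s: float) -> float:
    """Average wait in milliseconds: total wait time divided by number of waits."""
    return 1000.0 * time_s / waits

by_class = {}
for name, waits, time_s, pct, wait_class in events:
    key = wait_class or "CPU"  # label for the CPU row is an assumption
    by_class[key] = by_class.get(key, 0.0) + pct
    if waits is not None:
        print(f"{name:<26} avg wait {avg_wait_ms(waits, time_s):7.1f} ms")

for key, pct in by_class.items():
    print(f"{key:<10} {pct:5.1f}% of call time")
```

Grouping this way shows that the three User I/O events together account for roughly 69% of call time in the sample.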

If the top waits are I/O related, check for slow I/O in the 'Tablespace IO Stats' section:

Tablespace IO Stats                       DB/Inst: VMWREP/VMWREP  Snaps: 1-15
-> ordered by IOs (Reads + Writes) desc

Tablespace
------------------------------
                    Av     Av      Av                    Av     Buffer Av Buf
         Reads Reads/s Rd(ms) Blks/Rd       Writes Writes/s      Waits Wt(ms)
-------------- ------- ------ ------- ------------ -------- ---------- ------
TS_TX_DATA
    14,246,367     283    7.6     4.6  145,263,880    2,883  3,844,161    8.3
USER
       204,834       4   10.7     1.0   17,849,021      354     15,249    9.8
UNDOTS1
        19,725       0    3.0     1.0   10,064,086      200      1,964    4.9
AE_TS
     4,287,567      85    5.4     6.7          932        0    465,793    3.7
TEMP
     2,022,883      40    0.0     5.8      878,049       17          0    0.0
UNDOTS3
     1,310,493      26    4.6     1.0      941,675       19         43    0.0
TS_TX_IDX
     1,884,478      37    7.3     1.0       23,695        0     73,703    8.3
SYSAUX
       346,094       7    5.6     3.9      112,744        2          0    0.0
SYSTEM
       101,771       2    7.9     3.5       25,098        0        653    2.7


Specifically, look at the timing under Av Rd(ms). If it is higher than 20 milliseconds per read and reads are high on a busy tablespace, start investigating a potential I/O bottleneck at the OS level. For further investigation, the following note may be helpful:

Note:223117.1 Troubleshooting I/O-related waits (https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=223117.1)
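The 20 ms rule of thumb can be applied mechanically to the Tablespace IO Stats rows. In the sketch below, the read counts and Av Rd(ms) values are copied from the sample report; the `min_reads` cut-off used to decide what counts as a busy tablespace is a hypothetical choice, not something the report defines.

```python
# (tablespace, reads, av_rd_ms) copied from the sample Tablespace IO Stats.
tablespaces = [
    ("TS_TX_DATA", 14_246_367, 7.6),
    ("USER", 204_834, 10.7),
    ("UNDOTS1", 19_725, 3.0),
    ("AE_TS", 4_287_567, 5.4),
    ("TEMP", 2_022_883, 0.0),
    ("UNDOTS3", 1_310_493, 4.6),
    ("TS_TX_IDX", 1_884_478, 7.3),
    ("SYSAUX", 346_094, 5.6),
    ("SYSTEM", 101_771, 7.9),
]

def slow_tablespaces(rows, max_rd_ms=20.0, min_reads=100_000):
    """Names of busy tablespaces whose average read time exceeds max_rd_ms.

    min_reads is a hypothetical threshold for "busy"; tune it per system.
    """
    return [name for name, reads, rd_ms in rows
            if reads >= min_reads and rd_ms > max_rd_ms]

print(slow_tablespaces(tablespaces))
```

For the sample data the list is empty: no tablespace averages more than 20 ms per read, which suggests the I/O subsystem itself is not obviously slow and that the volume of reads deserves attention instead.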

Although db file scattered read and db file sequential read are I/O-related waits, remember that they may also indicate that the top SQL statements are not using optimal access paths. If db file scattered read is high, the SQL statements may lack the optimizer statistics needed to choose indexes, or indexes may be missing or suboptimal. Furthermore, high db file sequential read may indicate that SQL statements are using unselective indexes. So these waits may point to poor execution plans. The next step is to check the top resource-consuming SQL statements in the AWR report.

4. SQL Statistics

AWR shows a list of SQL statistics sections:

* SQL ordered by Elapsed Time
* SQL ordered by CPU Time
* SQL ordered by Gets
* SQL ordered by Reads
* SQL ordered by Executions
* SQL ordered by Parse Calls
* SQL ordered by Sharable Memory
* SQL ordered by Version Count
* SQL ordered by Cluster Wait Time
* Complete List of SQL Text


The SQL statistics may be related to the top wait events. With db file scattered read and db file sequential read as the top waits, 'SQL ordered by Reads' and 'SQL ordered by Gets' are the most relevant sections, as high physical reads and buffer gets indicate which SQL statements may need further investigation and tuning.

Going to 'SQL ordered by Gets', the following SQL statements are shown to have high buffer gets:

SQL ordered by Gets                       DB/Inst: VMWREP/VMWREP  Snaps: 1-15
-> Resources reported for PL/SQL code includes the resources used by all SQL
statements called by the code.
-> Total Buffer Gets:   4,745,943,815
-> Captured SQL account for     122.2% of Total

                                    Gets             CPU   Elapsed
   Buffer Gets   Executions     per Exec %Total Time (s)  Time (s) SQL Id
-------------- ------------ ------------ ------ -------- --------- -------------
 1,228,753,877          168  7,314,011.2   25.9  8022.46   8404.73 5t1y1nvmwp2

SELECT ADDRESSID",CURRENT$."ADDRESSTYPEID",CURRENT$URRENT$."ADDRESS3",CURRENT$."CITY",CURRENT$."ZIP",CURRENT$."STATE",CURRENT$."PHO
NECOUNTRYCODE",CURRENT$."PHONENUMBER",CURRENT$."PHONEEXTENSION",CURRENT$."FAXCOU

 1,039,875,759   62,959,363         16.5   21.9  5320.27   5618.96 grr4mg7ms81
Module: DBMS_SCHEDULER
INSERT INTO "ADDRESS_RDONLY" ("ADDRESSID","ADDRESSTYPEID","CUSTOMERID","
ADDRESS1","ADDRESS2","ADDRESS3","CITY","ZIP","STATE","PHONECOUNTRYCODE","PHONENU

   854,035,223          168  5,083,543.0   18.0  5713.50   7458.95 4at7cbx8hnz
SELECT "CUSTOMERID",CURRENT$."ISACTIVE",CURRENT$."FIRSTNAME",CURRENT$."LASTNAME",CU
RRENT$."ORGANIZATION",CURRENT$."DATEREGISTERED",CURRENT$."CUSTOMERSTATUSID",CURR
ENT$."LASTMODIFIEDDATE",CURRENT$."SOURCE",CURRENT$."EMPLOYEEDEPT",CURRENT$.


Notice that the buffer gets are quite high, totalling 4.7 billion. It is worth investigating the top SQL statements to make sure they are taking an optimal path, so check the explain plans to see whether a plan has changed. Remember that this may be 'normal', as some environments are very busy. So again, compare the baseline report with the report from the problem period to see if there is a difference.
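The "Gets per Exec" and "%Total" columns are simple ratios, and recomputing them helps separate statements that are expensive per execution from statements that are merely executed very often. The sketch below uses the three rows from the sample 'SQL ordered by Gets' section; the SQL Ids are copied exactly as they appear there.

```python
total_buffer_gets = 4_745_943_815  # "Total Buffer Gets" from the sample section

# (sql_id, buffer_gets, executions) copied from the sample rows.
top_sql = [
    ("5t1y1nvmwp2", 1_228_753_877, 168),
    ("grr4mg7ms81", 1_039_875_759, 62_959_363),
    ("4at7cbx8hnz", 854_035_223, 168),
]

def gets_profile(gets: int, execs: int, total: int):
    """Return (gets per execution, percent of total buffer gets)."""
    return gets / execs, 100.0 * gets / total

for sql_id, gets, execs in top_sql:
    per_exec, pct_total = gets_profile(gets, execs, total_buffer_gets)
    print(f"{sql_id}: {per_exec:>14,.1f} gets/exec, {pct_total:4.1f}% of total")
```

In the sample, the two SELECT statements run only 168 times but read millions of buffers per execution, which makes their execution plans natural first candidates for review, while the INSERT is cheap per execution and high only because of its execution count.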

If there are mutex waits such as 'cursor: pin S wait on X', look for high parse calls and high version counts under 'SQL ordered by Parse Calls' and 'SQL ordered by Version Count'.

The following note will further assist in troubleshooting such waits: Note:1349387.1 Troubleshooting 'cursor: pin S wait on X' waits (https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=1349387.1).

If latches come up as the top wait, look under Latch Activity to review the latch statistics.

5. Latch Activity

Look for high latch sleeps under Latch Sleep Breakdown for latch free waits:

Latch Sleep Breakdown                     DB/Inst: VMWREP/VMWREP  Snaps: 1-15
-> ordered by misses desc

Latch Name
----------------------------------------
  Get Requests      Misses      Sleeps  Spin Gets   Sleep1   Sleep2   Sleep3
-------------- ----------- ----------- ---------- -------- -------- --------
cache buffers chains
 2,881,936,948   3,070,271      41,336  3,031,456        0        0        0
row cache objects
   941,375,571   1,215,395         852  1,214,606        0        0        0
object queue header operation
   763,607,977     949,376      30,484    919,782        0        0        0
cache buffers lru chain
   376,874,990     705,162       3,192    702,090        0        0        0


Here the top latch is cache buffers chains. Although the get requests are high at 2.8 billion, the sleeps are low at 41,336, so the average number of sleeps per miss (Avg Slps/Miss) is low.
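The "low sleeps" observation can be quantified with two ratios: misses as a percentage of get requests, and sleeps per miss. The sketch below computes both from the sample Latch Sleep Breakdown rows. It is only an illustration; the report itself does not state any threshold for these ratios.

```python
# (latch, get_requests, misses, sleeps) copied from the sample Latch Sleep Breakdown.
latches = [
    ("cache buffers chains", 2_881_936_948, 3_070_271, 41_336),
    ("row cache objects", 941_375_571, 1_215_395, 852),
    ("object queue header operation", 763_607_977, 949_376, 30_484),
    ("cache buffers lru chain", 376_874_990, 705_162, 3_192),
]

def latch_ratios(gets: int, misses: int, sleeps: int):
    """Return (miss percentage of gets, average sleeps per miss)."""
    return 100.0 * misses / gets, sleeps / misses

for name, gets, misses, sleeps in latches:
    miss_pct, sleeps_per_miss = latch_ratios(gets, misses, sleeps)
    print(f"{name:<30} miss {miss_pct:.3f}%  sleeps/miss {sleeps_per_miss:.4f}")
```

For cache buffers chains this gives a miss rate of about 0.11% and roughly 0.013 sleeps per miss, which is consistent with the conclusion that contention on this latch is low.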

For latch free waits, review the following note to identify what type of latches to investigate:

Note:413942.1 How to Identify Which Latch is Associated with a "latch free" wait (https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=413942.1)

5. ADDM reports can be reviewed along with AWR to assist in diagnosis.

Here is a sample ADDM report from Note:250655.1 How to use the Automatic Database Diagnostic Monitor (https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=250655.1):

DETAILED ADDM REPORT FOR TASK 'SCOTT_ADDM' WITH ID 5
----------------------------------------------------
Analysis Period: 17-NOV-2003 from 09:50:21 to 10:35:47
Database ID/Instance: 494687018/1
Snapshot Range: from 1 to 3
Database Time: 4215 seconds
Average Database Load: 1.5 active sessions

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

FINDING 1: 65% impact (2734 seconds)
------------------------------------
PL/SQL execution consumed significant database time.

   RECOMMENDATION 1: SQL Tuning, 65% benefit (2734 seconds)
      ACTION: Tune the PL/SQL block with SQL_ID fjxa1vp3yhtmr. Refer to
         the "Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL
         User's Guide and Reference"
      RELEVANT OBJECT: SQL statement with SQL_ID fjxa1vp3yhtmr
      BEGIN EMD_NOTIFICATION.QUEUE_READY(:1, :2, :3); END;

FINDING 2: 35% impact (1456 seconds)
------------------------------------
SQL statements consuming significant database time were found.

   RECOMMENDATION 1: SQL Tuning, 35% benefit (1456 seconds)
      ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID
         gt9ahqgd5fmm2.
      RELEVANT OBJECT: SQL statement with SQL_ID gt9ahqgd5fmm2 and
      PLAN_HASH 547793521
      UPDATE bigemp SET empno = ROWNUM

FINDING 3: 20% impact (836 seconds)
-----------------------------------
The throughput of the I/O subsystem was significantly lower than expected.

   RECOMMENDATION 1: Host Configuration, 20% benefit (836 seconds)
      ACTION: Consider increasing the throughput of the I/O subsystem.
         Oracle's recommended solution is to stripe all data file using
         the SAME methodology. You might also need to increase the
         number of disks for better performance.

   RECOMMENDATION 2: Host Configuration, 14% benefit (584 seconds)
      ACTION: The performance of file
         D:\ORACLE\ORADATA\V1010\UNDOTBS01.DBF was significantly worse
         than other files. If striping all files using the SAME
         methodology is not possible, consider striping this file over
         multiple disks.
      RELEVANT OBJECT: database file
      "D:\ORACLE\ORADATA\V1010\UNDOTBS01.DBF"

   SYMPTOMS THAT LED TO THE FINDING:
      Wait class "User I/O" was consuming significant database time.
      (34% impact [1450 seconds])

FINDING 4: 11% impact (447 seconds)
-----------------------------------
Undo I/O was a significant portion (33%) of the total database I/O.

   NO RECOMMENDATIONS AVAILABLE

   SYMPTOMS THAT LED TO THE FINDING:
      The throughput of the I/O subsystem was significantly lower than
      expected. (20% impact [836 seconds])
      Wait class "User I/O" was consuming significant database time.
      (34% impact [1450 seconds])

FINDING 5: 9.9% impact (416 seconds)
------------------------------------
Buffer cache writes due to small log files were consuming significant
database time.

   RECOMMENDATION 1: DB Configuration, 9.9% benefit (416 seconds)
      ACTION: Increase the size of the log files to 796 M to hold at
         least 20 minutes of redo information.


The ADDM report gives possible recommendations in a more readable format. However, ADDM should be interpreted along with the AWR statistics for an accurate diagnosis.

6. Some notable wait events:

CPU waits

Just because CPU time appears as a top wait in AWR does not necessarily indicate a problem. However, if performance is slow and CPU usage is high, start investigating. First, check whether a single SQL statement is consuming most of the CPU under 'SQL ordered by CPU Time' in AWR:

SQL ordered by CPU Time                   DB/Inst: VMWREP/VMWREP  Snaps: 1-15
-> Resources reported for PL/SQL code includes the resources used by all SQL
   statements called by the code.
-> % Total is the CPU Time divided into the Total CPU Time times 100
-> Total CPU Time (s): 56,207
-> Captured SQL account for 114.6% of Total

       CPU    Elapsed                  CPU per         % Total
  Time (s)   Time (s)   Executions    Exec (s) % Total DB Time SQL Id
---------- ---------- ------------ ----------- ------- ------- -------------
    20,349     24,884          168      121.12    36.2     9.1 7bbhgqykv3cm9
Module: DBMS_SCHEDULER
DECLARE job BINARY_INTEGER := :job; next_date TIMESTAMP WITH TIME ZONE := :myda
te; broken BOOLEAN := FALSE; job_name VARCHAR2(30) := :job_name; job_subname
VARCHAR2(30) := :job_subname; job_owner VARCHAR2(30) := :job_owner; job_start
TIMESTAMP WITH TIME ZONE := :job_start; job_scheduled_start TIMESTAMP WITH TIME


Although this statement consumed 20,349 seconds of CPU time, it accounts for only 9.1% of total DB time. Furthermore, the top waits in this AWR report were I/O waits, as shown above.
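The two percentages on this row measure different things, which is why one looks large and the other small. The sketch below recomputes "% Total" (statement CPU over total CPU) and, assuming "% Total DB Time" is the statement's elapsed time divided by total DB time, backs out an approximate total DB time. Since 9.1 is a rounded figure, the implied total is only a rough estimate.

```python
# Values copied from the sample 'SQL ordered by CPU Time' section.
total_cpu_s = 56_207
sql_cpu_s = 20_349
sql_elapsed_s = 24_884
executions = 168
pct_db_time = 9.1  # "% Total DB Time" as printed (rounded)

pct_of_total_cpu = 100.0 * sql_cpu_s / total_cpu_s
cpu_per_exec_s = sql_cpu_s / executions
# Assumption: % Total DB Time = elapsed time / total DB time * 100.
implied_db_time_s = sql_elapsed_s / (pct_db_time / 100.0)

print(f"% of total CPU:   {pct_of_total_cpu:.1f}")
print(f"CPU per exec (s): {cpu_per_exec_s:.2f}")
print(f"implied DB time:  {implied_db_time_s:,.0f} s (approximate)")
```

So the statement uses over a third of all CPU time, yet CPU is only about a fifth of call time in this report, and the statement's elapsed time is a small slice of the much larger total DB time dominated by I/O waits.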

Check whether other waits accompany the high CPU time. For example, cursor: pin S waits may cause high CPU usage because of the following known issue:

Note:6904068.8 Bug 6904068 - High CPU usage when there are "cursor: pin S" waits (https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=6904068.8)

If a process outside of the database is consuming high CPU, run OS Watcher or other OS diagnostic tools to find which process it is:

Note:433472.1 OS Watcher For Windows (OSWFW) User Guide (https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=433472.1)

Here is a note on how to further diagnose high CPU:

Note:164768.1 Troubleshooting: High CPU Utilization (https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=164768.1)

Log file sync waits

When a user session commits or rolls back, the log writer flushes the redo from the log buffer to the redo logs, and the session waits on log file sync until the flush completes.

Check whether the application commits too frequently and whether commits can be batched. Furthermore, make sure the redo logs are on fast disks and not on RAID 5, which is not appropriate for applications with frequent writes.

Reference: Note:34592.1 WAITEVENT: "log file sync" (https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=34592.1)
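To illustrate what batching commits means on the application side, here is a minimal sketch that counts how many commits a load issues at different batch sizes. It uses Python's built-in sqlite3 module purely as a stand-in so the example is runnable; the table `t`, the `load_rows` helper, and the batch sizes are hypothetical, and an Oracle client application would use its own driver and API.

```python
import sqlite3

def load_rows(conn, rows, batch_size):
    """Insert rows, committing once per `batch_size` rows; return the commit count."""
    commits = 0
    cur = conn.cursor()
    for i, row in enumerate(rows, start=1):
        cur.execute("INSERT INTO t (id, val) VALUES (?, ?)", row)
        if i % batch_size == 0:
            conn.commit()
            commits += 1
    if len(rows) % batch_size != 0:
        conn.commit()  # commit the final partial batch
        commits += 1
    return commits

conn = sqlite3.connect(":memory:")  # in-memory stand-in database
conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")
rows = [(i, f"row {i}") for i in range(1000)]

per_row = load_rows(conn, rows, batch_size=1)    # commit after every row
batched = load_rows(conn, rows, batch_size=100)  # commit every 100 rows
print(per_row, batched)
```

Each commit in a real Oracle session triggers a log file sync wait, so cutting the commit count by a factor of 100 reduces the number of those waits accordingly, provided the application's transaction semantics allow it.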

Buffer busy wait

This wait occurs when a session needs a buffer from the buffer cache, but the buffer is busy, either because another session is reading it in or because another session holds it in an incompatible mode. To find which block is busy and why, use the following note:

Reference: Note:34405.1 WAITEVENT: "buffer busy waits" (https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=34405.1)

If the block is a data block, eliminate the hot block, increase the freelists on the table, or use a locally managed tablespace. If the block is an index block, rebuild the index, partition the index, or use a reverse key index.

Cursor: pin S wait on X

Cursor waits are related to parsing. Check for high parse counts and for SQL statements with a high version count.

Also check for known bugs in the following note:

Reference: Note:1298015.1 WAITEVENT: "cursor: pin S wait on X" (https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=1298015.1)

Details on tuning and resolving this wait are in the following note:

Note:1349387.1 Troubleshooting 'cursor: pin S wait on X' waits. (https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=1349387.1)

Here are additional notes on how to read portions of an AWR report:

Note:786554.1 How to Read PGA Memory Advisory Section in AWR and Statspack Reports (https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=786554.1)

Note:754639.1 How to Read Buffer Cache Advisory Section in AWR and Statspack Reports. (https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=754639.1)

Here is a link on how to read a Statspack report:

http://www.oracle.com/technetwork/database/focus-areas/performance/statspack-opm4-134117.pdf

References

NOTE:228913.1 - Systemwide Tuning using STATSPACK Report (https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=228913.1)