FAQ: How to read an AWR report. [ID 1359094.1]
2011-12-15 17:50
Modified 14-NOV-2011 Type HOWTO Status PUBLISHED
Goal
Solution
1. Load Profile
2. Instance Efficiency
3. Top 5 Timed Events
4. SQL Statistics
5. Latch Activity
5. ADDM reports can be reviewed along with AWR to assist in diagnosis.
6. Some notable wait events:
References
Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 11.2.0.2 - Release: 10.2 to 11.2
Information in this document applies to any platform.
Goal
The goal is to help customers interpret an AWR report.
Solution
Gather AWR reports during periods of acceptable performance and keep them as a baseline. They provide statistics for comparison when an issue occurs: the load, top wait events, parsing statistics, and so on can be compared to give a better picture of what has changed. Also keep records of any other changes, such as OS, application, and database changes. This is especially important when opening service requests (SRs) with Support, because performance issues can be complex, with many variables, and the comparison can frequently narrow down the issue.
When creating snapshots, make sure the snapshot interval is not too long. By default, snapshots are taken at 1-hour intervals, which is fine. When comparing good performance to bad performance, make sure both reports cover the same length of time, for example 1 hour each.
First, find exactly what is slow or slower:
a. application
b. certain program or sql
c. os
This will expedite the solution process and guide the DBA in running the right diagnostics.
If the overall performance of the database is slow, AWR is a good diagnostic method.
To start reviewing AWR, check the 'Load Profile' section first:
1. Load Profile
Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:          4,585,414.80          3,165,883.14
              Logical reads:             94,185.63             65,028.07
              Block changes:             40,028.57             27,636.71
             Physical reads:              2,206.12              1,523.16
            Physical writes:              3,939.97              2,720.25
                 User calls:                 50.08                 34.58
                     Parses:                 26.96                 18.61
                Hard parses:                  1.49                  1.03
                      Sorts:                 18.36                 12.68
                     Logons:                  0.13                  0.09
                   Executes:              4,925.89              3,400.96
               Transactions:                  1.45

  % Blocks changed per Read:   42.50    Recursive Call %:    99.19
 Rollback per transaction %:   59.69       Rows per Sort:  1922.64
This load profile shows high redo activity together with high physical writes. Physical writes exceed physical reads for this load, and block changes amount to 42.5% of logical reads.
Furthermore, hard parses are a small share of total parses. If a mutex wait such as 'library cache: mutex X' appears as a top wait, it may be related to a high parse rate.
When comparing to the baseline, check whether the load has changed by comparing redo size, user calls, and parsing.
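The baseline comparison can be sketched in a few lines of Python. This is only an illustration: the "problem" values are the per-second figures from the Load Profile above, while the baseline values and the 50% threshold are hypothetical numbers invented for the example.

```python
# Per-second Load Profile values from the report above.
problem = {
    "Redo size": 4_585_414.80,
    "User calls": 50.08,
    "Parses": 26.96,
    "Hard parses": 1.49,
}

# HYPOTHETICAL baseline values, invented for illustration only.
baseline = {
    "Redo size": 1_200_000.00,
    "User calls": 48.00,
    "Parses": 25.00,
    "Hard parses": 1.40,
}


def percent_change(before, after):
    """Return the relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def compare_profiles(baseline, problem, threshold_pct=50.0):
    """Return the metrics whose rate moved by more than threshold_pct."""
    changed = {}
    for metric, after in problem.items():
        before = baseline.get(metric)
        if not before:  # skip metrics that are missing or zero in the baseline
            continue
        change = percent_change(before, after)
        if abs(change) > threshold_pct:
            changed[metric] = round(change, 1)
    return changed


changed = compare_profiles(baseline, problem)
print(changed)
```

With these made-up baseline numbers only the redo rate stands out, which would suggest that the write volume changed while the call and parse rates did not.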
2. Instance Efficiency
Next, review the Instance Efficiency section:

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Buffer Nowait %:   99.91       Redo NoWait %:  100.00
               Buffer Hit %:   98.14    In-memory Sort %:   99.98
              Library Hit %:   99.91        Soft Parse %:   94.48
         Execute to Parse %:   99.45         Latch Hit %:   99.97
Parse CPU to Parse Elapsd %:   71.23     % Non-Parse CPU:   99.00
The Soft Parse % of 94.48 shows that hard parsing is low. The high Execute to Parse % indicates good reuse of cursors. Generally, the statistics here should be close to 100%, but on some systems this may not be the case. For example, in a data warehouse environment, hard parsing may be higher due to the use of materialized views and/or histograms. So, again, comparing against a baseline report from a period of good performance is important.
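Several of these percentages can be cross-checked against the Load Profile. The sketch below assumes the commonly documented definitions (Soft Parse % = 1 - hard parses / parses, Execute to Parse % = 1 - parses / executes, % Blocks changed per Read = block changes / logical reads); the report itself works from unrounded counters, so the last digit may differ slightly.

```python
# Per-second values copied from the Load Profile section of this report.
parses = 26.96
hard_parses = 1.49
executes = 4925.89
logical_reads = 94185.63
block_changes = 40028.57


def pct(fraction):
    """Convert a fraction to a percentage rounded to two decimals."""
    return round(fraction * 100, 2)


# ASSUMPTION: these are the usual definitions, not taken from this note.
soft_parse_pct = pct(1 - hard_parses / parses)
execute_to_parse_pct = pct(1 - parses / executes)
blocks_changed_per_read_pct = pct(block_changes / logical_reads)

print(soft_parse_pct, execute_to_parse_pct, blocks_changed_per_read_pct)
```

The results land within rounding distance of the 94.48, 99.45, and 42.50 printed in the report, which is a quick way to confirm how a ratio is derived before comparing it with a baseline.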
3. Top 5 Timed Events
Next, move on to the top waits:

Top 5 Timed Events                                         Avg %Total
~~~~~~~~~~~~~~~~~~                                        wait   Call
Event                                 Waits    Time (s)   (ms)   Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
db file scattered read           10,152,564      81,327      8   29.6 User I/O
db file sequential read          10,327,231      75,878      7   27.6 User I/O
CPU time                                         56,207          20.5
read by other session             4,397,330      33,455      8   12.2 User I/O
PX Deq Credit: send blkd             31,398      26,576    846    9.7 Other
          -------------------------------------------------------------
In this report, the top waits are I/O reads. A db file scattered read wait usually comes from full table scans or index fast full scans; the session is waiting on multiblock I/O. A db file sequential read wait usually comes from index reads and other single-block reads.
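The "Avg wait (ms)" column is simply the total wait time divided by the number of waits. A minimal sketch using the figures from the table above:

```python
# (event, waits, total wait time in seconds) from the Top 5 Timed Events above.
events = [
    ("db file scattered read", 10_152_564, 81_327),
    ("db file sequential read", 10_327_231, 75_878),
    ("read by other session", 4_397_330, 33_455),
    ("PX Deq Credit: send blkd", 31_398, 26_576),
]


def avg_wait_ms(waits, time_s):
    """Average wait in milliseconds: total seconds * 1000 / number of waits."""
    return time_s * 1000.0 / waits


avg_ms = {name: round(avg_wait_ms(waits, time_s)) for name, waits, time_s in events}
for name, ms in avg_ms.items():
    print(f"{name:<28} {ms:>5} ms")
```

Looking at the average alongside the total helps separate events that are frequent but fast (the I/O reads at 7 to 8 ms) from events that are rare but long (the PX wait at 846 ms).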
If the top waits are I/O related, check for slow I/O in the 'Tablespace IO Stats' section:

Tablespace IO Stats                  DB/Inst: VMWREP/VMWREP  Snaps: 1-15
-> ordered by IOs (Reads + Writes) desc

                             Av     Av      Av                    Av     Buffer Av Buf
Tablespace        Reads Reads/s Rd(ms) Blks/Rd       Writes Writes/s      Waits Wt(ms)
---------- ------------ ------- ------ ------- ------------ -------- ---------- ------
TS_TX_DATA   14,246,367     283    7.6     4.6  145,263,880    2,883  3,844,161    8.3
USER            204,834       4   10.7     1.0   17,849,021      354     15,249    9.8
UNDOTS1          19,725       0    3.0     1.0   10,064,086      200      1,964    4.9
AE_TS         4,287,567      85    5.4     6.7          932        0    465,793    3.7
TEMP          2,022,883      40    0.0     5.8      878,049       17          0    0.0
UNDOTS3       1,310,493      26    4.6     1.0      941,675       19         43    0.0
TS_TX_IDX     1,884,478      37    7.3     1.0       23,695        0     73,703    8.3
SYSAUX          346,094       7    5.6     3.9      112,744        2          0    0.0
SYSTEM          101,771       2    7.9     3.5       25,098        0        653    2.7
Specifically, look at the timing under Av Rd(ms). If it is higher than 20 milliseconds per read and the read count is high on a busy tablespace, start investigating a potential I/O bottleneck at the OS level. For further investigation, the following note may be helpful:
Note:223117.1 Troubleshooting I/O-related waits
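The read-latency check described above can be sketched as a simple filter. The 20 ms limit comes from the text; the minimum read count used to decide what counts as a "busy" tablespace is an assumption made for this example and should be adapted to the system.

```python
READ_LATENCY_LIMIT_MS = 20.0  # threshold named in the text
MIN_READS = 1_000_000         # ASSUMPTION: cutoff for a "busy" tablespace

# (tablespace, reads, Av Rd(ms)) from the Tablespace IO Stats above.
tablespaces = [
    ("TS_TX_DATA", 14_246_367, 7.6),
    ("USER", 204_834, 10.7),
    ("UNDOTS1", 19_725, 3.0),
    ("AE_TS", 4_287_567, 5.4),
    ("TEMP", 2_022_883, 0.0),
    ("UNDOTS3", 1_310_493, 4.6),
    ("TS_TX_IDX", 1_884_478, 7.3),
    ("SYSAUX", 346_094, 5.6),
    ("SYSTEM", 101_771, 7.9),
]


def slow_busy_tablespaces(rows, limit_ms=READ_LATENCY_LIMIT_MS, min_reads=MIN_READS):
    """Return names of tablespaces that are both heavily read and slow to read."""
    return [name for name, reads, rd_ms in rows
            if reads >= min_reads and rd_ms > limit_ms]


flagged = slow_busy_tablespaces(tablespaces)
print(flagged)
```

For this report the list comes back empty: no tablespace exceeds 20 ms per read, so slow storage is unlikely to be the main cause and the SQL driving the reads deserves attention next.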
Although db file scattered read and db file sequential read are I/O related, remember that these waits may also indicate that the top SQL statements are not using optimal access paths. If db file scattered read is high, the statements may lack the optimizer statistics needed to choose indexes, or indexes may be missing or suboptimal. Likewise, high db file sequential read may indicate that statements are using unselective indexes. These waits may therefore point to poor execution plans, so the next step is to check the top resource-consuming SQL in the AWR report.
4. SQL Statistics
AWR shows a list of SQL statistics:
* SQL ordered by Elapsed Time
* SQL ordered by CPU Time
* SQL ordered by Gets
* SQL ordered by Reads
* SQL ordered by Executions
* SQL ordered by Parse Calls
* SQL ordered by Sharable Memory
* SQL ordered by Version Count
* SQL ordered by Cluster Wait Time
* Complete List of SQL Text
The SQL statistics may be related to the top wait events. With db file scattered read and db file sequential read as the top waits, 'SQL ordered by Reads' and 'SQL ordered by Gets' would be the most relevant sections, since high physical reads and buffer gets indicate which SQL statements may need further investigation and tuning.
Going to 'SQL ordered by Gets', the following statements are shown to have high buffer gets:
SQL ordered by Gets                  DB/Inst: VMWREP/VMWREP  Snaps: 1-15
-> Resources reported for PL/SQL code includes the resources used by all SQL
   statements called by the code.
-> Total Buffer Gets:   4,745,943,815
-> Captured SQL account for  122.2% of Total

                                Gets              CPU     Elapsed
  Buffer Gets   Executions    per Exec   %Total Time (s)  Time (s)    SQL Id
-------------- ------------ ------------ ------ -------- --------- -------------
 1,228,753,877          168  7,314,011.2   25.9  8022.46   8404.73 5t1y1nvmwp2
SELECT ADDRESSID",CURRENT$."ADDRESSTYPEID",CURRENT$URRENT$."ADDRESS3",CURRENT$."CITY",CURRENT$."ZIP",CURRENT$."STATE",CURRENT$."PHO
NECOUNTRYCODE",CURRENT$."PHONENUMBER",CURRENT$."PHONEEXTENSION",CURRENT$."FAXCOU

 1,039,875,759   62,959,363         16.5   21.9  5320.27   5618.96 grr4mg7ms81
Module: DBMS_SCHEDULER
INSERT INTO "ADDRESS_RDONLY" ("ADDRESSID","ADDRESSTYPEID","CUSTOMERID","
ADDRESS1","ADDRESS2","ADDRESS3","CITY","ZIP","STATE","PHONECOUNTRYCODE","PHONENU

   854,035,223          168  5,083,543.0   18.0  5713.50   7458.95 4at7cbx8hnz
SELECT "CUSTOMERID",CURRENT$."ISACTIVE",CURRENT$."FIRSTNAME",CURRENT$."LASTNAME",CU
RRENT$."ORGANIZATION",CURRENT$."DATEREGISTERED",CURRENT$."CUSTOMERSTATUSID",CURR
ENT$."LASTMODIFIEDDATE",CURRENT$."SOURCE",CURRENT$."EMPLOYEEDEPT",CURRENT$.
Notice that the buffer gets are high, totalling about 4.7 billion. It is worth investigating the top SQL statements to make sure they are taking optimal access paths, so check the execution plans to see whether a plan has changed. Remember that this level of activity may be 'normal', as some environments are very busy. So, again, compare the baseline report with the report from the problem period to see if there is a difference.
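The "Gets per Exec" and "%Total" columns follow directly from the buffer gets, the executions, and the total buffer gets in the section header. A small sketch that recomputes them from the rows above (SQL Ids are used exactly as printed in this report):

```python
TOTAL_BUFFER_GETS = 4_745_943_815  # "Total Buffer Gets" from the section header

# (SQL Id as printed in the report, buffer gets, executions)
top_sql = [
    ("5t1y1nvmwp2", 1_228_753_877, 168),
    ("grr4mg7ms81", 1_039_875_759, 62_959_363),
    ("4at7cbx8hnz", 854_035_223, 168),
]


def summarize(sql_id, gets, executions, total=TOTAL_BUFFER_GETS):
    """Recompute gets per execution and the share of total buffer gets."""
    return {
        "sql_id": sql_id,
        "gets_per_exec": round(gets / executions, 1),
        "pct_total": round(gets / total * 100, 1),
    }


summary = [summarize(*row) for row in top_sql]
for row in summary:
    print(row)
```

Read this way, the two SELECT statements ran only 168 times but cost millions of gets per execution, which makes them natural candidates for an execution plan review, while the INSERT is cheap per execution and is high mainly because it ran about 63 million times.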
If there are mutex waits such as 'cursor: pin S wait on X', look for high parse counts and high version counts under 'SQL ordered by Parse Calls' and 'SQL ordered by Version Count'.
The following note will further assist in troubleshooting such waits:
Note:1349387.1 Troubleshooting 'cursor: pin S wait on X' waits.
If latches come up as the top wait, then look under Latch Activity to review the statistics of the latches.
5. Latch Activity
Look for high latch sleeps under Latch Sleep Breakdown for latch free waits:

Latch Sleep Breakdown                DB/Inst: VMWREP/VMWREP  Snaps: 1-15
-> ordered by misses desc

                                        Get                        Spin
Latch Name                         Requests    Misses Sleeps       Gets Sleep1 Sleep2 Sleep3
----------------------------- ------------- --------- ------ ---------- ------ ------ ------
cache buffers chains          2,881,936,948 3,070,271 41,336  3,031,456      0      0      0
row cache objects               941,375,571 1,215,395    852  1,214,606      0      0      0
object queue header operation   763,607,977   949,376 30,484    919,782      0      0      0
cache buffers lru chain         376,874,990   705,162  3,192    702,090      0      0      0
Here the top latch is cache buffers chains. Although get requests are high at about 2.88 billion, the 41,336 sleeps are low, so the average number of sleeps per miss (Avg Slps/Miss) is also low.
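To put numbers on "low", the miss rate and the sleeps per miss can be derived from the table above. This is a minimal sketch; no particular threshold is implied by the source, so the values are best compared against a baseline report.

```python
# (latch name, get requests, misses, sleeps) from the Latch Sleep Breakdown above.
latches = [
    ("cache buffers chains", 2_881_936_948, 3_070_271, 41_336),
    ("row cache objects", 941_375_571, 1_215_395, 852),
    ("object queue header operation", 763_607_977, 949_376, 30_484),
    ("cache buffers lru chain", 376_874_990, 705_162, 3_192),
]


def latch_ratios(gets, misses, sleeps):
    """Return (miss percentage of gets, average sleeps per miss)."""
    miss_pct = misses / gets * 100.0
    sleeps_per_miss = sleeps / misses if misses else 0.0
    return miss_pct, sleeps_per_miss


ratios = {name: latch_ratios(g, m, s) for name, g, m, s in latches}
for name, (miss_pct, slps) in ratios.items():
    print(f"{name:<30} miss% {miss_pct:.3f}  sleeps/miss {slps:.4f}")
```

For cache buffers chains this gives a miss rate of roughly 0.1% and about 0.013 sleeps per miss, consistent with the reading that latch contention is not the main problem in this report.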
For latch free waits, review the following note to identify what type of latches to investigate:
Note:413942.1 How to Identify Which Latch is Associated with a "latch free" wait
5. ADDM reports can be reviewed along with AWR to assist in diagnosis.
Here is a sample ADDM report from Note:250655.1 How to use the Automatic Database Diagnostic Monitor:

DETAILED ADDM REPORT FOR TASK 'SCOTT_ADDM' WITH ID 5
----------------------------------------------------
Analysis Period: 17-NOV-2003 from 09:50:21 to 10:35:47
Database ID/Instance: 494687018/1
Snapshot Range: from 1 to 3
Database Time: 4215 seconds
Average Database Load: 1.5 active sessions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

FINDING 1: 65% impact (2734 seconds)
------------------------------------
PL/SQL execution consumed significant database time.

   RECOMMENDATION 1: SQL Tuning, 65% benefit (2734 seconds)
      ACTION: Tune the PL/SQL block with SQL_ID fjxa1vp3yhtmr. Refer to
         the "Tuning PL/SQL Applications" chapter of Oracle's "PL/SQL
         User's Guide and Reference"
         RELEVANT OBJECT: SQL statement with SQL_ID fjxa1vp3yhtmr
         BEGIN EMD_NOTIFICATION.QUEUE_READY(:1, :2, :3); END;

FINDING 2: 35% impact (1456 seconds)
------------------------------------
SQL statements consuming significant database time were found.

   RECOMMENDATION 1: SQL Tuning, 35% benefit (1456 seconds)
      ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID
         gt9ahqgd5fmm2.
         RELEVANT OBJECT: SQL statement with SQL_ID gt9ahqgd5fmm2 and
         PLAN_HASH 547793521
         UPDATE bigemp SET empno = ROWNUM

FINDING 3: 20% impact (836 seconds)
-----------------------------------
The throughput of the I/O subsystem was significantly lower than expected.

   RECOMMENDATION 1: Host Configuration, 20% benefit (836 seconds)
      ACTION: Consider increasing the throughput of the I/O subsystem.
         Oracle's recommended solution is to stripe all data file using
         the SAME methodology. You might also need to increase the
         number of disks for better performance.

   RECOMMENDATION 2: Host Configuration, 14% benefit (584 seconds)
      ACTION: The performance of file
         D:\ORACLE\ORADATA\V1010\UNDOTBS01.DBF was significantly worse
         than other files. If striping all files using the SAME
         methodology is not possible, consider striping this file over
         multiple disks.
         RELEVANT OBJECT: database file
         "D:\ORACLE\ORADATA\V1010\UNDOTBS01.DBF"

   SYMPTOMS THAT LED TO THE FINDING:
      Wait class "User I/O" was consuming significant database time.
      (34% impact [1450 seconds])

FINDING 4: 11% impact (447 seconds)
-----------------------------------
Undo I/O was a significant portion (33%) of the total database I/O.

   NO RECOMMENDATIONS AVAILABLE

   SYMPTOMS THAT LED TO THE FINDING:
      The throughput of the I/O subsystem was significantly lower than
      expected. (20% impact [836 seconds])
      Wait class "User I/O" was consuming significant database time.
      (34% impact [1450 seconds])

FINDING 5: 9.9% impact (416 seconds)
------------------------------------
Buffer cache writes due to small log files were consuming significant
database time.

   RECOMMENDATION 1: DB Configuration, 9.9% benefit (416 seconds)
      ACTION: Increase the size of the log files to 796 M to hold at
         least 20 minutes of redo information.
The ADDM report presents possible recommendations in a more readable format. However, ADDM should be interpreted together with the AWR statistics for an accurate diagnosis.
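When several ADDM reports need to be compared, the finding headers can be pulled out with a short script. The sketch below is an illustration only: the regular expression is written against the header format shown in the sample report above and is not an official ADDM interface.

```python
import re

DB_TIME_S = 4215  # "Database Time" from the sample report header

# Finding headers copied from the sample ADDM report above.
report_lines = [
    "FINDING 1: 65% impact (2734 seconds)",
    "FINDING 2: 35% impact (1456 seconds)",
    "FINDING 3: 20% impact (836 seconds)",
    "FINDING 4: 11% impact (447 seconds)",
    "FINDING 5: 9.9% impact (416 seconds)",
]

# ASSUMPTION: header layout as printed in the sample; other versions may differ.
FINDING_RE = re.compile(r"FINDING (\d+): ([\d.]+)% impact \((\d+) seconds\)")


def parse_findings(lines):
    """Return (finding number, impact %, impact seconds), largest impact first."""
    findings = []
    for line in lines:
        match = FINDING_RE.search(line)
        if match:
            number, pct, seconds = match.groups()
            findings.append((int(number), float(pct), int(seconds)))
    return sorted(findings, key=lambda f: f[2], reverse=True)


findings = parse_findings(report_lines)
for number, pct, seconds in findings:
    # Cross-check: impact seconds as a share of total database time.
    print(number, pct, round(seconds / DB_TIME_S * 100, 1))
```

The recomputed shares match the printed impact percentages to within rounding. Note that the impacts add up to more than 100% because findings can overlap, as the SYMPTOMS entries in findings 3 and 4 show.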
6. Some notable wait events:
CPU waits
CPU time appearing as a top wait in AWR does not necessarily indicate a problem. However, if performance is slow and CPU usage is high, start investigating. First, check under 'SQL ordered by CPU Time' in AWR to see whether a single SQL statement is taking most of the CPU:
SQL ordered by CPU Time              DB/Inst: VMWREP/VMWREP  Snaps: 1-15
-> Resources reported for PL/SQL code includes the resources used by all SQL
   statements called by the code.
-> % Total is the CPU Time divided into the Total CPU Time times 100
-> Total CPU Time (s):          56,207
-> Captured SQL account for  114.6% of Total

    CPU      Elapsed                  CPU per          % Total
  Time (s)   Time (s)  Executions     Exec (s) % Total DB Time    SQL Id
---------- ---------- ------------ ----------- ------- ------- -------------
    20,349     24,884          168      121.12    36.2     9.1 7bbhgqykv3cm9
Module: DBMS_SCHEDULER
DECLARE job BINARY_INTEGER := :job; next_date TIMESTAMP WITH TIME ZONE := :myda
te; broken BOOLEAN := FALSE; job_name VARCHAR2(30) := :job_name; job_subname
VARCHAR2(30) := :job_subname; job_owner VARCHAR2(30) := :job_owner; job_start
TIMESTAMP WITH TIME ZONE := :job_start; job_scheduled_start TIMESTAMP WITH TIME
Although this statement used 20,349 seconds of CPU time (36.2% of total CPU time), it accounts for only 9.1% of total DB time. Furthermore, the top waits in this AWR report were I/O waits, as shown above.
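The three percentages in this row can be reproduced from figures already shown. The excerpts do not print total DB time, so the sketch below estimates it from the Top 5 Timed Events, assuming that the 20.5% shown for CPU time is CPU time as a share of DB time and that "% Total DB Time" is the statement's elapsed time over DB time. Both are assumptions for illustration.

```python
TOTAL_CPU_TIME_S = 56_207    # "Total CPU Time (s)" from the section header
CPU_PCT_OF_CALL_TIME = 20.5  # "CPU time" row in Top 5 Timed Events

# Row for SQL Id 7bbhgqykv3cm9.
sql_cpu_s = 20_349
sql_elapsed_s = 24_884
executions = 168

# ASSUMPTION: DB time is estimated, not read from the report.
estimated_db_time_s = TOTAL_CPU_TIME_S / (CPU_PCT_OF_CALL_TIME / 100.0)

cpu_per_exec_s = sql_cpu_s / executions
pct_of_total_cpu = sql_cpu_s / TOTAL_CPU_TIME_S * 100.0
pct_of_db_time = sql_elapsed_s / estimated_db_time_s * 100.0

print(f"CPU per exec (s): {cpu_per_exec_s:.2f}")
print(f"% of total CPU:   {pct_of_total_cpu:.1f}")
print(f"% of DB time:     {pct_of_db_time:.1f}")
```

The results agree with the report to within rounding: the job is the largest single CPU consumer, yet most of the database time in this period is spent elsewhere, mainly in the I/O waits.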
Check whether other waits accompany the high CPU usage. For example, cursor: pin S waits may cause high CPU usage because of the following known issue:
Note:6904068.8 Bug 6904068 - High CPU usage when there are "cursor: pin S" waits
If a process outside the database is consuming high CPU, run OSWatcher or other OS diagnostic tools to find which process it is:
Note:433472.1 OS Watcher For Windows (OSWFW) User Guide
Here is a note on how to further diagnose high CPU:
Note:164768.1 Troubleshooting: High CPU Utilization
Log file sync waits
When a user session commits or rolls back, the log writer flushes the redo from the log buffer to the redo logs.
Check whether the application commits too frequently and whether commits can be batched. Furthermore, make sure the redo logs are on fast disks and not on RAID 5, which is not appropriate for applications with frequent writes.
Reference: Note:34592.1 WAITEVENT: "log file sync"
Buffer busy waits
This wait occurs when a session needs a buffer in the buffer cache but the buffer is busy, either because another session is reading it or because another session holds it in an incompatible mode. To find which block is busy and why, use the following note:
Reference: Note:34405.1 WAITEVENT: "buffer busy waits"
If the block is a data block, eliminate the hot block, increase the freelists on the table, or use a locally managed tablespace. If the block is an index block, rebuild the index, partition the index, or use a reverse key index.
Cursor: pin S wait on X
This wait is related to parsing, so check for high parse counts and SQL statements with high version counts.
Also check for known bugs in the following note:
Reference: Note:1298015.1 WAITEVENT: "cursor: pin S wait on X"
Details on tuning and resolving this wait are in the following note:
Note:1349387.1 Troubleshooting 'cursor: pin S wait on X' waits.
Here are additional notes on how to read portions of an AWR report:
Note:786554.1 How to Read PGA Memory Advisory Section in AWR and Statspack Reports
Note:754639.1 How to Read Buffer Cache Advisory Section in AWR and Statspack Reports.
Here is a link on how to read a Statspack report:
http://www.oracle.com/technetwork/database/focus-areas/performance/statspack-opm4-134117.pdf
References
NOTE:228913.1 - Systemwide Tuning using STATSPACK Report