您的位置:首页 > Web前端

11g New Feature dbms_stats Pending and Published Statistics

2012-01-13 16:52 405 查看
 Prior to Oracle 11g, the default behaviour was to make statistics available for the optimizer to use as soon as they were gathered.
While statistics are required to enable the optimizer to generate optimal execution plans, sometimes just by gathering fresh statistics tried and trusted execution plans can abruptly change and thereby adversely affect application performance.

In Oracle 11g we can now ‘defer’ the publication of statistics until they have been tried and tested and once we have confirmed that the execution plans are correct and optimal. Statistics remain in the pending state until they are published and the parameter
optimizer_use_pending_statistics (default value FALSE) which when set at the session level will enables us to test the pending statistics independently of other database sessions.

Let us look at a test case using the SALES table in the SH schema.

If we see the query below, it appears that the data for the column CHANNEL_ID is skewed where majority of the rows have the value 3 and a very small majority have the value 9.

SQL> conn sh/sh

Connected.

SQL> select channel_id,count(*) from sales group by channel_id order by 2;

CHANNEL_ID   COUNT(*)

---------- ----------

         9       2074

         4     118416

         2     258025

         3     540328

But both the queries below are performing a full table scan of the SALES table when ideally it should be performing an index scan when the value 9 is used in the WHERE clause considering it accounts for a very small proportion of the rows in the SALES table.

SQL> set autot trace exp stat

SQL> set linesize 130

SQL> SELECT

S.cust_id

,S.prod_id

,SUM(S.amount_sold)

FROM sh.sales S

WHERE

channel_id=9

GROUP BY S.cust_id, S.prod_id

ORDER BY S.cust_id, S.prod_id

/

  2    3    4    5    6    7    8    9   10 

2074 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 4109827725

----------------------------------------------------------------------------------------------

| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |

----------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT     |       |  1746 | 29682 |   490   (2)| 00:00:06 |       |       |

|   1 |  SORT GROUP BY       |       |  1746 | 29682 |   490   (2)| 00:00:06 |       |       |

|   2 |   PARTITION RANGE ALL|       |  1746 | 29682 |   489   (2)| 00:00:06 |     1 |    28 |

|*  3 |    TABLE ACCESS FULL | SALES |  1746 | 29682 |   489   (2)| 00:00:06 |     1 |    28 |

----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   3 - filter("CHANNEL_ID"=9)

Note

-----

   - SQL plan baseline "SQL_PLAN_6k40mr9y0bzzyebe69af4" used for this statement

Statistics

----------------------------------------------------------

        559  recursive calls

         24  db block gets

       1827  consistent gets

       1716  physical reads

      11328  redo size

      54398  bytes sent via SQL*Net to client

       2041  bytes received via SQL*Net from client

        140  SQL*Net roundtrips to/from client

         18  sorts (memory)

          0  sorts (disk)

       2074  rows processed

SQL> SELECT

S.cust_id

,S.prod_id

,SUM(S.amount_sold)

FROM sh.sales S

WHERE

channel_id=3

GROUP BY S.cust_id, S.prod_id

ORDER BY S.cust_id, S.prod_id

/

  2    3    4    5    6    7    8    9   10 

258968 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 4109827725

------------------------------------------------------------------------------------------------------

| Id  | Operation            | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |

------------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT     |       |   359K|  5966K|       |  2831   (1)| 00:00:34 |       |       |

|   1 |  SORT GROUP BY       |       |   359K|  5966K|    16M|  2831   (1)| 00:00:34 |       |       |

|   2 |   PARTITION RANGE ALL|       |   543K|  9029K|       |   490   (2)| 00:00:06 |     1 |    28 |

|*  3 |    TABLE ACCESS FULL | SALES |   543K|  9029K|       |   490   (2)| 00:00:06 |     1 |    28 |

------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   3 - filter("CHANNEL_ID"=3)

Note

-----

   - SQL plan baseline "SQL_PLAN_gyqybj20pgvzcebe69af4" used for this statement

Statistics

----------------------------------------------------------

         10  recursive calls

          0  db block gets

       1722  consistent gets

       1647  physical reads

          0  redo size

    6582470  bytes sent via SQL*Net to client

     190427  bytes received via SQL*Net from client

      17266  SQL*Net roundtrips to/from client

          1  sorts (memory)

          0  sorts (disk)

     258968  rows processed

As earlier mentioned, statistics for a table are published immediately by default, so we use the DBMS_STATS package to change this default behaviour.

SQL> select sysdate from dual;

SYSDATE

-------------------

2012-01-13 14:27:05

SQL> exec dbms_stats.set_table_prefs('SH','SALES','PUBLISH','FALSE');

PL/SQL procedure successfully completed.

SQL> select dbms_stats.get_prefs('PUBLISH', 'SH', 'SALES' ) FROM DUAL;

DBMS_STATS.GET_PREFS('PUBLISH','SH','SALES')

----------------------------------------------------------------------------------------------------------------------------------

FALSE

Since we observed that the data in the table is skewed, we will gather histograms which may help the optimizer make more informed decisions when generating an execution plan.

Note that since we have disabled the publishing of statistics for the table SALES, even though we have analyzed the table, the LAST_ANALYZED column shows that the table has not been recently analyzed.

SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','SALES',METHOD_OPT => 'FOR ALL COLUMNS SIZE AUTO');

PL/SQL procedure successfully completed.

SQL> SELECT LAST_ANALYZED FROM DBA_TABLES WHERE TABLE_NAME='SALES';

LAST_ANALYZED

-------------------

2012-01-13 13:59:32

However, we query the DBA_TAB_PENDING_STATS view, it shows that the SALES table which is partitioned has been analyzed now.

SQL> SELECT TABLE_NAME,PARTITION_NAME ,LAST_ANALYZED  FROM DBA_TAB_PENDING_STATS;

TABLE_NAME                     PARTITION_NAME                 LAST_ANALYZED

------------------------------ ------------------------------ -------------------

SALES                          SALES_Q4_2003                  2012-01-13 14:29:05

SALES                          SALES_Q4_2002                  2012-01-13 14:29:05

SALES                          SALES_Q4_2001                  2012-01-13 14:29:05

SALES                          SALES_Q4_2000                  2012-01-13 14:29:04

SALES                          SALES_Q4_1999                  2012-01-13 14:29:04

SALES                          SALES_Q4_1998                  2012-01-13 14:29:04

SALES                          SALES_Q3_2003                  2012-01-13 14:29:04

SALES                          SALES_Q3_2002                  2012-01-13 14:29:04

SALES                          SALES_Q3_2001                  2012-01-13 14:29:03

SALES                          SALES_Q3_2000                  2012-01-13 14:29:03

SALES                          SALES_Q3_1999                  2012-01-13 14:29:03

TABLE_NAME         
c954
            PARTITION_NAME                 LAST_ANALYZED

------------------------------ ------------------------------ -------------------

SALES                          SALES_Q3_1998                  2012-01-13 14:29:03

SALES                          SALES_Q2_2003                  2012-01-13 14:29:03

SALES                          SALES_Q2_2002                  2012-01-13 14:29:03

SALES                          SALES_Q2_2001                  2012-01-13 14:29:03

SALES                          SALES_Q2_2000                  2012-01-13 14:29:03

SALES                          SALES_Q2_1999                  2012-01-13 14:29:01

SALES                          SALES_Q2_1998                  2012-01-13 14:29:01

SALES                          SALES_Q1_2003                  2012-01-13 14:29:01

SALES                          SALES_Q1_2002                  2012-01-13 14:29:01

SALES                          SALES_Q1_2001                  2012-01-13 14:29:01

SALES                          SALES_Q1_2000                  2012-01-13 14:29:01

TABLE_NAME                     PARTITION_NAME                 LAST_ANALYZED

------------------------------ ------------------------------ -------------------

SALES                          SALES_Q1_1999                  2012-01-13 14:29:01

SALES                          SALES_Q1_1998                  2012-01-13 14:29:01

SALES                          SALES_H2_1997                  2012-01-13 14:29:00

SALES                          SALES_H1_1997                  2012-01-13 14:29:00

SALES                          SALES_1996                     2012-01-13 14:29:00

SALES                          SALES_1995                     2012-01-13 14:29:00

SALES                                                         2012-01-13 14:29:05

29 rows selected.

We would like to see if the query execution plan has changed or improved after the recent gathering of statistics. We can use the parameter optimizer_use_pending_statistics for this by setting the value to TRUE (default is FALSE)

SQL> alter session set optimizer_use_pending_statistics=TRUE;

Session altered.

We now find that the optimizer is performing an index scan when the value 9 is used as a predicate value for CHANNEL_ID while it performs a full table scan when the value 3 is used which is a much more optimal plan than the original one before gathering
fresh statistics.

SQL> explain plan

for

SELECT

     S.cust_id

    ,S.prod_id

    ,SUM(S.amount_sold)

  FROM sh.sales S

 WHERE

channel_id=3

 GROUP BY S.cust_id, S.prod_id

 ORDER BY S.cust_id, S.prod_id

/  2    3    4    5    6    7    8    9   10   11   12 

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT

----------------------------------------------------------------------------------------------------------------------------------

Plan hash value: 4109827725

------------------------------------------------------------------------------------------------------

| Id  | Operation            | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |

------------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT     |       |   359K|  5966K|       |  2823   (1)| 00:00:34 |       |       |

|   1 |  SORT GROUP BY       |       |   359K|  5966K|    16M|  2823   (1)| 00:00:34 |       |       |

|   2 |   PARTITION RANGE ALL|       |   540K|  8965K|       |   490   (2)| 00:00:06 |     1 |    28 |

|*  3 |    TABLE ACCESS FULL | SALES |   540K|  8965K|       |   490   (2)| 00:00:06 |     1 |    28 |

------------------------------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT

----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   3 - filter("CHANNEL_ID"=3)

Note

-----

   - SQL plan baseline "SQL_PLAN_gyqybj20pgvzcebe69af4" used for this statement

19 rows selected.

SQL> explain plan

for

SELECT

     S.cust_id

    ,S.prod_id

    ,SUM(S.amount_sold)

  FROM sh.sales S

 WHERE

channel_id=9

 GROUP BY S.cust_id, S.prod_id

 ORDER BY S.cust_id, S.prod_id

/

  2    3    4    5    6    7    8    9   10   11   12 

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT

----------------------------------------------------------------------------------------------------------------------------------

Plan hash value: 1760806683

-------------------------------------------------------------------------------------------------------------------------

| Id  | Operation                           | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |

-------------------------------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT                    |                   |   921 |  2420 |    17   (6)| 00:00:01 |       |       |

|   1 |  SORT GROUP BY                      |                   |   921 |  2420 |    17   (6)| 00:00:01 |       |       |

|   2 |   PARTITION RANGE SINGLE            |                   |   921 |  2420 |    16   (0)| 00:00:01 |     5 |     5 |

|   3 |    TABLE ACCESS BY LOCAL INDEX ROWID| SALES             |   921 |  2420 |    16   (0)| 00:00:01 |     5 |     5 |

|   4 |     BITMAP CONVERSION TO ROWIDS     |                   |       |       |            |          |       |       |

|*  5 |       BITMAP INDEX SINGLE VALUE     | SALES_CHANNEL_BIX |       |       |            |          |     5 |     5 |

-------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   5 - access("CHANNEL_ID"=9)

16 rows selected.

Since we have now confirmed that the plan is acceptable and the new statistics can be used by the optimizer, we PUBLISH the statistics now for the SALES table. Note that the LAST_ANALYZED column is also updated and the DBA_TAB_PENDING_STATS shows that there
are no more pending statistics which need to be published.

SQL> EXEC DBMS_STATS.PUBLISH_PENDING_STATS ('SH','SALES');

PL/SQL procedure successfully completed.

SQL> SELECT LAST_ANALYZED FROM DBA_TABLES WHERE TABLE_NAME='SALES';

LAST_ANALYZED

-------------------

2012-01-13 14:29:05

SQL> SELECT COUNT(*) FROM DBA_TAB_PENDING_STATS;

  COUNT(*)

----------

         0

SQL> alter session set optimizer_use_pending_statistics=FALSE;

Session altered.

 

 
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息