您的位置:首页 > 数据库

高效SQL语句必杀技

2012-04-01 14:32 288 查看
No SQL,No cost. SQL语句是造成数据库开销最大的部分。而不良SQL写法直接导致数据库系统性能下降的情形比比皆是。那么如何才能称得

上高效的SQL语句呢?一是查询优化器为当前的SQL语句生成最佳的执行计划,保证数据读写使用最佳路径;二是设置合理的物理存储结构,如表

的类型,字段的顺序,字段的数据类型等。本文主要描述如何编写高效的SQL语句并给出示例。下面的描述主要分为三个部分,一是编写高效SQL

语句,二是使用索引提高查询性能的部分,三是总结部分。

一、编写高效SQL语句

[sql]
view plaincopyprint?

1) 选择最有效的表名顺序(仅适用于RBO模式)

ORACLE的解析器总是按照从右到左的顺序处理FROM子句中的表名,因此FROM子句中最后的一个表将作为驱动表被优先处理。当FROM子句

存在多个表的时候,应当考虑将表上记录最少的那个表置于FROM的最右端作为基表。Oracle会首先扫描基表(FROM子句中最后的那个表)并对

记录进行排序,然后扫描第二个表(FROM子句中最后第二个表),最后将所有从第二个表中检索出的记录与第一个表中合适记录进行合并。如

果有3个以上的表连接查询, 那就需要选择交叉表(intersection
table)作为基础表,交叉表是指那个被其他表所引用的表。

下面的例子使用最常见的scott或hr模式下的表进行演示

表 EMP 有14条记录

表 DEPT 有4条记录
SELECT /*+ rule */
COUNT( * ) FROM emp, dept;
--高效的写法

scott@CNMMBO> set autotrace traceonly stat;

scott@CNMMBO> SELECT /*+ rule */
COUNT( * ) FROM emp, dept;

Elapsed: 00:00:00.14

Statistics

----------------------------------------------------------

1 recursive calls

0 db block gets

35 consistent gets

0 physical reads

0 redo size

515 bytes sent via SQL*Net to client

492 bytes received via SQL*Net from client

2 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

1 rows processed

SELECT /*+ rule */
COUNT( * ) FROM dept, emp;
--低效的写法
scott@CNMMBO> SELECT /*+
rule */ COUNT( * )
FROM dept, emp;

Elapsed: 00:00:00.02

Statistics

----------------------------------------------------------

1 recursive calls

0 db block gets

105 consistent gets

0 physical reads

0 redo size

515 bytes sent via SQL*Net to client

492 bytes received via SQL*Net
from client
2 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

1 rows processed

2) select 查询中避免使用'*'

当你想在SELECT子句中列出所有的COLUMN时,使用动态SQL列引用
'*' 是一个方便的方法.不幸的是,这是一个非常低效的方法.实际

上,ORACLE在解析的过程中, 会将 '*' 依次转换成所有的列名, 这个工作是通过查询数据字典完成的, 这意味着将耗费更多的时间。

注:本文中的例子出于简化演示而使用了select * ,生产环境应避免使用.

3) 减少访问数据库的次数

每当执行一条SQL语句,Oracle 需要完成大量的内部操作,象解析SQL语句,估算索引的利用率,绑定变量, 读数据块等等.由此可

见,减少访问数据库的次数,实际上是降低了数据库系统开销
-->下面通过3种方式来获得雇员编号为7788与7902的相关信息

-->方式 1 (最低效):

select ename,job,sal from emp
where empno=7788;

select ename,job,sal from emp
where empno=7902;

-->方式 2 (次低效):

-->下面使用了参数游标来完成,每传递一次参数则需要对表emp访问一次,增加了I/O

DECLARE

CURSOR C1(E_NO NUMBER)
IS

SELECT ename, job, sal

FROM emp

WHERE empno = E_NO;

BEGIN

OPEN C1 (7788);

FETCH C1 INTO …, …, …;

..

OPEN C1 (7902);

FETCH C1 INTO …, …, …;

CLOSE C1;

END;

-->方式 3 (最高效)

SELECT a.ename

, a.job

, a.sal

, b.ename

, b.job

, b.sal

FROM emp a, emp b

WHERE a.empno = 7788 AND b.empno = 7902;

注意:在SQL*Plus,SQL*Forms和Pro*C中重新设置ARRAYSIZE参数,可以增加每次数据库访问的检索数据量,建议值为200.

4) 使用DECODE函数来减少处理时间
-->使用decode函数可以避免重复扫描相同的行或重复连接相同的表

select count(*),sum(sal)
from emp where deptno=20
and ename like
'SMITH%';

select count(*),sum(sal)
from emp where deptno=30
and ename like
'SMITH%';

-->通过使用decode函数一次扫描即可完成所有满足条件记录的处理

SELECT COUNT( DECODE( deptno, 20,
'x', NULL ) ) d20_count

, COUNT( DECODE( deptno, 30,
'x', NULL ) ) d30_count

, SUM( DECODE( deptno, 20, sal,
NULL ) ) d20_sal

, SUM( DECODE( deptno, 30, sal,
NULL ) ) d30_sal

FROM emp

WHERE ename LIKE
'SMITH%';

类似的,DECODE函数也可以运用于GROUP BY 和ORDER
BY子句中。

5) 整合简单,无关联的数据库访问
-->如果你有几个简单的数据库查询语句,你可以把它们整合到一个查询中以提高性能(即使它们之间没有关系)

-->整合前

SELECT name

FROM emp

WHERE empno = 1234;

SELECT name

FROM dept

WHERE deptno = 10;

SELECT name

FROM cat

WHERE cat_type =
'RD';

-->整合后

SELECT e.name, d.name, c.name

FROM cat c

, dpt d

, emp e

, dual x

WHERE NVL( 'X', x.dummy ) = NVL(
'X', e.ROWID(+) )

AND NVL( 'X', x.dummy ) = NVL(
'X', d.ROWID(+) )
AND NVL( 'X', x.dummy ) = NVL(
'X', c.ROWID(+) )
AND e.emp_no(+) = 1234

AND d.dept_no(+) = 10

AND c.cat_type(+) = 'RD';

-->从上面的SQL语句可以看出,尽管三条语句被整合为一条,性能得以提高,然可读性差,此时应权衡性能与代价

6) 删除重复记录

-->通过使用rowid来作为过滤条件,性能高效

DELETE FROM emp e

WHERE e.ROWID > (SELECT
MIN( x.ROWID )
FROM emp x

WHERE x.empno = e.empno);

7) 使用truncate 代替
delete

-->通常情况下,任意记录的删除需要在回滚段构造删除前镜像以实现回滚(rollback).对于未提交的数据在执行rollback之后,Oracle会生成

-->等价SQL语句去恢复记录(如delete,则生成对应的insert语句;如insert则生成对应的delete;如update,则是同时生成delete和insert

-->使用truncate命令则是执行DDL命令,不产生任何回滚信息,直接格式化并释放高水位线.故该语句性能高效.由于不能rollback,因此慎用.

8) 尽量多使用COMMIT(COMMIT应确保事务的完整性)

-->只要有可能,在程序中尽量多使用COMMIT,这样程序的性能得到提高,需求也会因为COMMIT所释放的资源而减少

-->COMMIT所释放的资源:

-->1.回滚段上用于恢复数据的信息

-->2.释放语句处理期间所持有的锁

-->3.释放redo log buffer占用的空间(commit将redo log buffer中的entries 写入到联机重做日志文件)

-->4.ORACLE为管理上述3种资源中的内部开销

9) 计算记录条数

-->一般的情况下,count(*)比count(1)稍快.如果可以通过索引检索,对索引列的计数是最快的,因为直接扫描索引即可,例如COUNT(EMPNO)

-->实际情况是经测试上述三种情况并无明显差异.

10) 用Where子句替换HAVING子句

-->尽可能的避免having子句,因为HAVING 子句是对检索出所有记录之后再对结果集进行过滤。这个处理需要排序,总计等操作

-->通过WHERE子句则在分组之前即可过滤不必要的记录数目,从而减少聚合的开销

-->低效:

SELECT deptno, AVG( sal )

FROM emp

GROUP BY deptno

HAVING deptno = 20;

scott@CNMMBO> SELECT deptno,
AVG( sal )

2 FROM emp

3 GROUP BY deptno

4 HAVING deptno= 20;

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

7 consistent gets

0 physical reads

0 redo size

583 bytes sent via SQL*Net to client

492 bytes received via SQL*Net
from client
2 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

1 rows processed

-->高效:

SELECT deptno, AVG( sal )

FROM emp

WHERE deptno = 20

GROUP BY deptno;

scott@CNMMBO> SELECT deptno,
AVG( sal )
2 FROM emp

3 WHERE deptno = 20

4 GROUP BY deptno;

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

2 consistent gets

0 physical reads

0 redo size

583 bytes sent via SQL*Net to client

492 bytes received via SQL*Net
from client
2 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

1 rows processed

11) 最小化表查询次数

-->在含有子查询的SQL语句中,要特别注意减少对表的查询

-->低效:

SELECT *

FROM employees

WHERE department_id = (SELECT department_id

FROM departments

WHERE department_name =
'Marketing')

AND manager_id = (SELECT manager_id

FROM departments

WHERE department_name =
'Marketing');

-->高效:

SELECT *

FROM employees

WHERE ( department_id, manager_id ) = (SELECT department_id, manager_id

FROM departments

WHERE department_name = 'Marketing')

-->类似更新多列的情形

-->低效:
UPDATE employees

SET job_id = ( SELECT
MAX( job_id ) FROM jobs ), salary = (
SELECT AVG( min_salary )
FROM jobs )
WHERE department_id = 10;

-->高效:

UPDATE employees
SET ( job_id, salary ) = (
SELECT MAX( job_id ),
AVG( min_salary ) FROM jobs )

WHERE department_id = 10;

12) 使用表别名

-->在多表查询时,为所返回列使用表别名作为前缀以减少解析时间以及那些相同列歧义引起的语法错误

13) 用EXISTS替代IN

在一些基于基础表的查询中,为了满足一个条件,往往需要对另一个表进行联接.在这种情况下,使用EXISTS(或NOT EXISTS)通常

将提高查询的效率.

-->低效:

SELECT *

FROM emp

WHERE sal > 1000

AND deptno IN (SELECT deptno

FROM dept

WHERE loc =
'DALLAS')

-->高效:

SELECT *

FROM emp

WHERE empno > 1000

AND EXISTS

(SELECT 1

FROM dept

WHERE deptno = emp.deptno
AND loc = 'DALLAS')

14) 用NOT EXISTS替代NOT
IN
在子查询中,NOT IN子句引起一个内部的排序与合并.因此,无论何时NOT
IN子句都是最低效的,因为它对子查询中的表执行了一个全表
遍历.为避免该情形,应当将其改写成外部连接(OUTTER JOIN)或适用NOT EXISTS

-->低效:

SELECT *

FROM emp

WHERE deptno NOT
IN (SELECT deptno

FROM dept

WHERE loc =
'DALLAS');

-->高效:

SELECT e.*

FROM emp e

WHERE NOT EXISTS

(SELECT 1

FROM dept

WHERE deptno = e.deptno
AND loc = 'DALLAS');

-->最高效(尽管下面的查询最高效,并不推荐使用,因为列loc使用了不等运算,当表dept数据量较大,且loc列存在索引的话,则此时索引失效)

SELECT e.*

FROM emp e LEFT
JOIN dept d ON e.deptno = d.deptno

WHERE d.loc <> 'DALLAS'

15) 使用表连接替换EXISTS

一般情况下,使用表连接比EXISTS更高效

-->低效:

SELECT *

FROM employees e

WHERE EXISTS

(SELECT 1

FROM departments

WHERE department_id = e.department_id
AND department_name = 'IT');

-->高效:
SELECT *
-->经测试此写法SQLplus下比上面的写法多一次逻辑读,而在Toad下两者结果一致

FROM employees e INNER
JOIN departments d ON d.department_id = e.department_id

WHERE d.department_name =
'IT';

16) 用EXISTS替换DISTINCT

对于一对多关系表信息查询时(如部门表和雇员表),应避免在select 子句中使用distinct,而使用exists来替换

-->低效:

SELECT DISTINCT e.department_id, department_name

FROM departments d INNER
JOIN employees e ON d.department_id = e.department_id;

-->高效:

SELECT d.department_id,department_name

from departments d

WHERE EXISTS

(SELECT 1

FROM employees e

WHERE d.department_id=e.department_id);

EXISTS 使查询更为迅速,因为RDBMS核心模块将在子查询的条件一旦满足后,立刻返回结果

-->经测试此写法SQLplus下比上面的写法多一次逻辑读,而在Toad下两者结果一致

17) 使用 UNION ALL 替换
UNION(如果有可能的话)

当SQL语句需要UNION两个查询结果集时,这两个结果集合会以UNION-ALL的方式被合并, 然后在输出最终结果前进行排序。

如果用UNION ALL替代UNION, 这样排序就不是必要了。 效率就会因此得到提高。

注意:
UNION ALL会输出所有的结果集,而UNION则过滤掉重复记录并对其进行排序.因此在使用时应考虑业务逻辑是否允许当前的结果集存在重复现象

寻找低效的SQL语句

-->下面的语句主要适用于从视图v$sqlarea中获得当前运行下且耗用buffer_gets较多的SQL语句

SELECT executions

, disk_reads

, buffer_gets

, ROUND( ( buffer_gets
- disk_reads )
/ buffer_gets, 2 )
hit_ratio
, ROUND( disk_reads / executions, 2 ) reads_per_run

, sql_text
FROM v$sqlarea

WHERE executions > 0

AND buffer_gets > 0

AND ( buffer_gets

- disk_reads )

/ buffer_gets < 0.80

ORDER BY 4
DESC;

18) 尽可能避免使用函数,函数会导致更多的 recursive calls

[sql]
view plaincopyprint?

1) 避免基于索引列的计算

where 子句中的谓词上存在索引,而此时基于该列的计算将使得索引失效

-->低效:
SELECT employee_id, first_name

FROM   employees

WHERE  employee_id + 10 > 150;
-->索引列上使用了计算,因此索引失效,走全表扫描方式

-->高效:

SELECT employee_id, first_name

FROM   employees

WHERE  employee_id > 160;
-->走索引范围扫描方式

例外情形
上述规则不适用于SQL中的MIN和MAX函数

hr@CNMMBO> SELECT MAX( employee_id ) max_id

2  FROM   employees

3  WHERE  employee_id

4         + 10 > 150;

1 row selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 1481384439

---------------------------------------------------------------------------------------------

| Id  | Operation                   |
Name          | Rows  | Bytes | Cost (%CPU)|
Time     |
---------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |               |     1 |     4 |     1   (0)| 00:00:01 |

|   1 |  SORT AGGREGATE             |               |     1 |     4 |            |          |

|   2 |   FIRST ROW                 |               |     5 |    20 |     1   (0)| 00:00:01 |

|*  3 |    INDEX FULL SCAN (MIN/MAX)| EMP_EMP_ID_PK |     5 |    20 |     1   (0)| 00:00:01
|
---------------------------------------------------------------------------------------------

2) 避免在索引列上使用NOT运算或不等于运算(<>,!=)

通常,我们要避免在索引列上使用NOT或<>,两者会产生在和在索引列上使用函数相同的影响。 当ORACLE遇到NOT或不等运算时,他就会停止

使用索引转而执行全表扫描。

-->低效:

SELECT *

FROM   emp

WHERE  NOT ( deptno = 20 );
-->实际上NOT ( deptno = 20 )等同于deptno <> 20,即deptno <>同样会限制索引

-->高效:

SELECT *

FROM   emp

WHERE  deptno > 20
OR deptno < 20;

-->尽管此方式可以替换且实现上述结果,但依然走全表扫描,如果是单纯的 > 或 < 运算,则此时为索引范围扫描

需要注意的是,在某些时候, ORACLE优化器会自动将NOT转化成相对应的关系操作符

其次如果是下列运算符进行NOT运算,依然有可能选择走索引, 仅仅除了NOT = 之外,因为
NOT = 等价于 <>

“NOT >”   to <=

“NOT >=”  to <

“NOT <”   to >=

“NOT <=”  to >

来看一个实际的例子

hr@CNMMBO> SELECT *

2  FROM   employees

3  where not employee_id<100;
-->索引列上使用了not,但是该查询返回了所有的记录,即107条,因此此时选择走全表扫描

107 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 1445457117

-------------------------------------------------------------------------------

| Id  | Operation         | Name      |
Rows  | Bytes | Cost (%CPU)| Time     |

-------------------------------------------------------------------------------

|   0 | SELECT STATEMENT  |           |   107 |  7276 |     3   (0)| 00:00:01 |

|*  1 |  TABLE ACCESS FULL| EMPLOYEES |   107 |  7276 |     3   (0)| 00:00:01 |
-->执行计划中使用了走全表扫描方式
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("EMPLOYEE_ID">=100)
-->查看这里的谓词信息被自动转换为 >= 运算符

hr@CNMMBO> SELECT *

2  FROM   employees

3  where not employee_id<140;
-->此例与上面的语句相同,仅仅是查询范围不同返回67条记录,而此时选择了索引范围扫描

67 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 603312277

---------------------------------------------------------------------------------------------

| Id  | Operation                   | Name          |
Rows  | Bytes | Cost (%CPU)| Time     |

---------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |               |    68 |  4624 |     3   (0)| 00:00:01 |

|   1 |  TABLE ACCESS
BY INDEX ROWID| EMPLOYEES     |    68 |  4624 |     3   (0)| 00:00:01 |

|*  2 |   INDEX RANGE SCAN          | EMP_EMP_ID_PK |    68 |       |     1   (0)| 00:00:01 |
-->索引范围扫描方式
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("EMPLOYEE_ID">=140)

3) 用UNION 替换OR(适用于索引列)

通常情况下,使用UNION 替换WHERE子句中的OR将会起到较好的效果.基于索引列使用OR使得优化器倾向于使用全表扫描,而不是扫描索引.

注意,以上规则仅适用于多个索引列有效。 如果有column没有被索引, 查询效率可能会因为你没有选择OR而降低。

-->低效:

SELECT deptno, dname

FROM   dept

WHERE  loc = 'DALLAS'
OR deptno = 20;

-->高效:

SELECT deptno, dname

FROM   dept

WHERE  loc = 'DALLAS'

UNION

SELECT deptno, dname

FROM   dept

WHERE  deptno = 30

-->经测试,由于数据量较少,此时where子句中的谓词上都存在索引列时,两者性能相当.

-->假定where子句中存在两列
scott@CNMMBO> create
table t6 as
select object_id,owner,object_name
from dba_objects where owner='SYS'
and rownum<1001;

scott@CNMMBO> insert
into t6 select object_id,owner,object_name
from dba_objects where owner='SCOTT'
and rownum<6;

scott@CNMMBO> create
index i_t6_object_id on t6(object_id);

scott@CNMMBO> create
index i_t6_owner on t6(owner);

scott@CNMMBO> insert
into t6 select object_id,owner,object_name
from dba_objects where owner='SYSTEM'
and rownum<=300;

scott@CNMMBO> commit;

scott@CNMMBO> exec dbms_stats.gather_table_stats('SCOTT','T6',cascade=>true);

scott@CNMMBO> select owner,count(*)
from t6 group
by owner;

OWNER                  COUNT(*)

-------------------- ----------

SCOTT                         5

SYSTEM                      300

SYS                        1000

scott@CNMMBO> select *
from t6 where owner='SCOTT'
and rownum<2;

OBJECT_ID OWNER                OBJECT_NAME

---------- -------------------- --------------------

69450 SCOTT                T_TEST

scott@CNMMBO> select *
from t6 where object_id=69450
or owner='SYSTEM';

301 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 238853296

-----------------------------------------------------------------------------------------------

| Id  | Operation                    |
Name           | Rows  | Bytes | Cost (%CPU)|
Time     |
-----------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT             |                |   300 |  7200 |     5   (0)| 00:00:01 |

|   1 |  CONCATENATION               |                |       |       |            |          |

|   2 |   TABLE ACCESS
BY INDEX ROWID| T6             |     1 |    24 |     2   (0)| 00:00:01 |

|*  3 |    INDEX RANGE SCAN          | I_T6_OBJECT_ID |     1 |       |     1   (0)| 00:00:01 |

|*  4 |   TABLE ACCESS
BY INDEX ROWID| T6             |   299 |  7176 |     3   (0)| 00:00:01 |

|*  5 |    INDEX RANGE SCAN          | I_T6_OWNER     |   300 |       |     1   (0)| 00:00:01 |

-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

3 - access("OBJECT_ID"=69450)

4 - filter(LNNVL("OBJECT_ID"=69450))

5 - access("OWNER"='SYSTEM')

Statistics

----------------------------------------------------------

0  recursive calls

0  db block gets

46  consistent gets

0  physical reads

0  redo size

11383  bytes sent via SQL*Net to client

712  bytes received via SQL*Net
from client

22  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

301  rows processed

scott@CNMMBO> select *
from t6 where owner='SYSTEM'
or object_id=69450;

301 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 238853296

-----------------------------------------------------------------------------------------------

| Id  | Operation                    |
Name           | Rows  | Bytes | Cost (%CPU)|
Time     |
-----------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT             |                |   300 |  7200 |     5   (0)| 00:00:01 |

|   1 |  CONCATENATION               |                |       |       |            |          |

|   2 |   TABLE ACCESS
BY INDEX ROWID| T6             |     1 |    24 |     2   (0)| 00:00:01 |

|*  3 |    INDEX RANGE SCAN          | I_T6_OBJECT_ID |     1 |       |     1   (0)| 00:00:01 |

|*  4 |   TABLE ACCESS
BY INDEX ROWID| T6             |   299 |  7176 |     3   (0)| 00:00:01 |

|*  5 |    INDEX RANGE SCAN          | I_T6_OWNER     |   300 |       |     1   (0)| 00:00:01 |

-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

3 - access("OBJECT_ID"=69450)

4 - filter(LNNVL("OBJECT_ID"=69450))

5 - access("OWNER"='SYSTEM')

Statistics

----------------------------------------------------------

1  recursive calls

0  db block gets

46  consistent gets

0  physical reads

0  redo size

11383  bytes sent via SQL*Net to client

712  bytes received via SQL*Net
from client

22  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

301  rows processed

scott@CNMMBO> select *
from t6

2  where object_id=69450

3  union

4  select * from t6

5  where owner='SYSTEM';

301 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 370530636

------------------------------------------------------------------------------------------------

| Id  | Operation                     |
Name           | Rows  | Bytes | Cost (%CPU)|
Time     |
------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT              |                |   301 |  7224 |     7  (72)| 00:00:01 |

|   1 |  SORT UNIQUE                  |                |   301 |  7224 |     7  (72)| 00:00:01 |

|   2 |   UNION-ALL                   |                |       |       |            |          |

|   3 |    TABLE ACCESS BY
INDEX ROWID| T6             |     1 |    24 |     2   (0)| 00:00:01 |

|*  4 |     INDEX RANGE SCAN          | I_T6_OBJECT_ID |     1 |       |     1   (0)| 00:00:01 |

|   5 |    TABLE ACCESS BY
INDEX ROWID| T6             |   300 |  7200 |     3   (0)| 00:00:01 |

|*  6 |     INDEX RANGE SCAN          | I_T6_OWNER     |   300 |       |     1   (0)| 00:00:01 |

------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

4 - access("OBJECT_ID"=69450)

6 - access("OWNER"='SYSTEM')

Statistics

----------------------------------------------------------

1  recursive calls

0  db block gets

7  consistent gets

0  physical reads

0  redo size

11383  bytes sent via SQL*Net to client

712  bytes received via SQL*Net
from client

22  SQL*Net roundtrips to/from client

1  sorts (memory)

0  sorts (disk)

301  rows processed

-->从上面的统计信息可知,consistent gets由46下降为7,故当where子句中谓词上存在索引时,使用union替换or更高效

-->即使当列object_id与owner上不存在索引时,使用union仍然比or更高效(在Oracle 10g R2与Oracle 11g R2测试)

4) 避免索引列上使用函数

-->下面是一个来自实际生产环境的例子

-->表acc_pos_int_tbl上business_date列存在索引,由于使用了SUBSTR函数,此时索引失效,使用全表扫描

SELECT acc_num

, curr_cd

, DECODE( '20110728'

, ( SELECT TO_CHAR( LAST_DAY( TO_DATE(
'20110728', 'YYYYMMDD' ) ),
'YYYYMMDD' ) FROM dual ), 0

,   adj_credit_int_lv1_amt

+ adj_credit_int_lv2_amt

- adj_debit_int_lv1_amt

- adj_debit_int_lv2_amt )

AS interest

FROM   acc_pos_int_tbl

WHERE  SUBSTR( business_date, 1, 6 ) = SUBSTR(
'20110728', 1, 6 ) AND business_date <=
'20110728';

-->改进的办法

SELECT acc_num

, curr_cd

, DECODE( '20110728'

, ( SELECT TO_CHAR( LAST_DAY( TO_DATE(
'20110728', 'YYYYMMDD' ) ),
'YYYYMMDD' ) FROM dual ), 0

,   adj_credit_int_lv1_amt
+ adj_credit_int_lv2_amt

- adj_debit_int_lv1_amt

- adj_debit_int_lv2_amt )

AS interest

FROM   acc_pos_int_tbl acc_pos_int_tbl

WHERE  business_date >= TO_CHAR( LAST_DAY( ADD_MONTHS( TO_DATE(
'20110728', 'yyyymmdd' ), -1 ) )

+ 1, 'yyyymmdd' )

AND business_date <= '20110728';

-->下面的例子虽然没有使用函数,但字符串连接同样导致索引失效

-->低效:

SELECT account_name, amount

FROM   transaction

WHERE  account_name

|| account_type = 'AMEXA';

-->高效:

SELECT account_name, amount

FROM   transaction

WHERE  account_name = 'AMEX'
AND account_type = 'A';

5) 比较不匹配的数据类型

-->下面的查询中business_date列上存在索引,且为字符型,这种

-->低效:

SELECT *

FROM   acc_pos_int_tbl

WHERE  business_date = 20090201;

Execution Plan

----------------------------------------------------------

Plan hash value: 2335235465

-------------------------------------------------------------------------------------

| Id  | Operation         | Name            |
Rows  | Bytes | Cost (%CPU)| Time     |

-------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT  |                 | 37516 |  2857K|   106K  (1)| 00:21:17 |

|*  1 |  TABLE ACCESS
FULL| ACC_POS_INT_TBL | 37516 |  2857K|   106K  (1)| 00:21:17 |

-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter(TO_NUMBER("BUSINESS_DATE")=20090201)
-->这里可以看到产生了类型转换

-->高效:

SELECT *

FROM   acc_pos_int_tbl

WHERE  business_date =
'20090201'

6) 索引列上使用 NULL 值

IS NULL和IS
NOT NULL会限制索引的使用,因为数据中没有值等于NULL值,即便是NULL值也不等于NULL值.且NULL值不存储在于索引之中

因此应尽可能避免在索引类上使用NULL值

SELECT acc_num

, pl_cd

, order_qty

, trade_date

FROM   trade_client_tbl

WHERE  input_date IS
NOT NULL;

Execution Plan
----------------------------------------------------------

Plan hash value: 901462645

--------------------------------------------------------------------------------------

| Id  | Operation         | Name             |
Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT  |                  |     1 |    44 |    15   (0)| 00:00:01 |

|*  1 |  TABLE ACCESS
FULL| TRADE_CLIENT_TBL |     1 |    44 |    15   (0)| 00:00:01 |

--------------------------------------------------------------------------------------

alter table trade_client_tbl
modify (input_date not
null);

不推荐使用的查询方式
SELECT * FROM table_name
WHERE col IS
NOT NULL

SELECT * FROM table_name
WHERE col IS
NULL

推荐使用的方式
SELECT * FROM table_name
WHERE col >= 0 --尽可能的使用 =, >=, <=, like 等运算符

-->Author: Robinson Cheng

-->Blog: http://blog.csdn.net/robinson_0612 
1) 避免基于索引列的计算
where 子句中的谓词上存在索引,而此时基于该列的计算将使得索引失效

-->低效:
SELECT employee_id, first_name
FROM   employees
WHERE  employee_id + 10 > 150;        -->索引列上使用了计算,因此索引失效,走全表扫描方式

-->高效:
SELECT employee_id, first_name
FROM   employees
WHERE  employee_id > 160;    -->走索引范围扫描方式

例外情形
上述规则不适用于SQL中的MIN和MAX函数
hr@CNMMBO> SELECT MAX( employee_id ) max_id
2  FROM   employees
3  WHERE  employee_id
4         + 10 > 150;

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1481384439
---------------------------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |     1 |     4 |     1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |               |     1 |     4 |            |          |
|   2 |   FIRST ROW                 |               |     5 |    20 |     1   (0)| 00:00:01 |
|*  3 |    INDEX FULL SCAN (MIN/MAX)| EMP_EMP_ID_PK |     5 |    20 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

2) 避免在索引列上使用NOT运算或不等于运算(<>,!=)
通常,我们要避免在索引列上使用NOT或<>,两者会产生在和在索引列上使用函数相同的影响。 当ORACLE遇到NOT或不等运算时,他就会停止
使用索引转而执行全表扫描。

-->低效:
SELECT *
FROM   emp
WHERE  NOT ( deptno = 20 );   -->实际上NOT ( deptno = 20 )等同于deptno <> 20,即deptno <>同样会限制索引

-->高效:
SELECT *
FROM   emp
WHERE  deptno > 20 OR deptno < 20;
-->尽管此方式可以替换且实现上述结果,但依然走全表扫描,如果是单纯的 > 或 < 运算,则此时为索引范围扫描

需要注意的是,在某些时候, ORACLE优化器会自动将NOT转化成相对应的关系操作符
其次如果是下列运算符进行NOT运算,依然有可能选择走索引, 仅仅除了NOT = 之外,因为 NOT = 等价于 <>

“NOT >”   to <=
“NOT >=”  to <
“NOT <”   to >=
“NOT <=”  to >

来看一个实际的例子
hr@CNMMBO> SELECT *
2  FROM   employees
3  where not employee_id<100; -->索引列上使用了not,但是该查询返回了所有的记录,即107条,因此此时选择走全表扫描

107 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1445457117
-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |   107 |  7276 |     3   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| EMPLOYEES |   107 |  7276 |     3   (0)| 00:00:01 | -->执行计划中使用了走全表扫描方式
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("EMPLOYEE_ID">=100)           -->查看这里的谓词信息被自动转换为 >= 运算符

hr@CNMMBO> SELECT *
2  FROM   employees
3  where not employee_id<140; -->此例与上面的语句相同,仅仅是查询范围不同返回67条记录,而此时选择了索引范围扫描

67 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 603312277

---------------------------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |    68 |  4624 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| EMPLOYEES     |    68 |  4624 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | EMP_EMP_ID_PK |    68 |       |     1   (0)| 00:00:01 | -->索引范围扫描方式
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("EMPLOYEE_ID">=140)

3) 用UNION 替换OR(适用于索引列)
通常情况下,使用UNION 替换WHERE子句中的OR将会起到较好的效果.基于索引列使用OR使得优化器倾向于使用全表扫描,而不是扫描索引.
注意,以上规则仅适用于多个索引列有效。 如果有column没有被索引, 查询效率可能会因为你没有选择OR而降低。
-->低效:
SELECT deptno, dname
FROM   dept
WHERE  loc = 'DALLAS' OR deptno = 20;

-->高效:
SELECT deptno, dname
FROM   dept
WHERE  loc = 'DALLAS'
UNION
SELECT deptno, dname
FROM   dept
WHERE  deptno = 30

-->经测试,由于数据量较少,此时where子句中的谓词上都存在索引列时,两者性能相当.
-->假定where子句中存在两列
scott@CNMMBO> create table t6 as select object_id,owner,object_name from dba_objects where owner='SYS' and rownum<1001;

scott@CNMMBO> insert into t6 select object_id,owner,object_name from dba_objects where owner='SCOTT' and rownum<6;

scott@CNMMBO> create index i_t6_object_id on t6(object_id);

scott@CNMMBO> create index i_t6_owner on t6(owner);

scott@CNMMBO> insert into t6 select object_id,owner,object_name from dba_objects where owner='SYSTEM' and rownum<=300;

scott@CNMMBO> commit;

scott@CNMMBO> exec dbms_stats.gather_table_stats('SCOTT','T6',cascade=>true);

scott@CNMMBO> select owner,count(*) from t6 group by owner;

OWNER                  COUNT(*)
-------------------- ----------
SCOTT                         5
SYSTEM                      300
SYS                        1000

scott@CNMMBO> select * from t6 where owner='SCOTT' and rownum<2;

OBJECT_ID OWNER                OBJECT_NAME
---------- -------------------- --------------------
69450 SCOTT                T_TEST

scott@CNMMBO> select * from t6 where object_id=69450 or owner='SYSTEM';

301 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 238853296
-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                |   300 |  7200 |     5   (0)| 00:00:01 |
|   1 |  CONCATENATION               |                |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T6             |     1 |    24 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | I_T6_OBJECT_ID |     1 |       |     1   (0)| 00:00:01 |
|*  4 |   TABLE ACCESS BY INDEX ROWID| T6             |   299 |  7176 |     3   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | I_T6_OWNER     |   300 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("OBJECT_ID"=69450)
4 - filter(LNNVL("OBJECT_ID"=69450))
5 - access("OWNER"='SYSTEM')

Statistics
----------------------------------------------------------
0  recursive calls
0  db block gets
46  consistent gets
0  physical reads
0  redo size
11383  bytes sent via SQL*Net to client
712  bytes received via SQL*Net from client
22  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
301  rows processed

scott@CNMMBO> select * from t6 where owner='SYSTEM' or object_id=69450;

301 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 238853296
-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                |   300 |  7200 |     5   (0)| 00:00:01 |
|   1 |  CONCATENATION               |                |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T6             |     1 |    24 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | I_T6_OBJECT_ID |     1 |       |     1   (0)| 00:00:01 |
|*  4 |   TABLE ACCESS BY INDEX ROWID| T6             |   299 |  7176 |     3   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | I_T6_OWNER     |   300 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("OBJECT_ID"=69450)
4 - filter(LNNVL("OBJECT_ID"=69450))
5 - access("OWNER"='SYSTEM')

Statistics
----------------------------------------------------------
1  recursive calls
0  db block gets
46  consistent gets
0  physical reads
0  redo size
11383  bytes sent via SQL*Net to client
712  bytes received via SQL*Net from client
22  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
301  rows processed

scott@CNMMBO> select * from t6
2  where object_id=69450
3  union
4  select * from t6
5  where owner='SYSTEM';

301 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 370530636
------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                |   301 |  7224 |     7  (72)| 00:00:01 |
|   1 |  SORT UNIQUE                  |                |   301 |  7224 |     7  (72)| 00:00:01 |
|   2 |   UNION-ALL                   |                |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID| T6             |     1 |    24 |     2   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | I_T6_OBJECT_ID |     1 |       |     1   (0)| 00:00:01 |
|   5 |    TABLE ACCESS BY INDEX ROWID| T6             |   300 |  7200 |     3   (0)| 00:00:01 |
|*  6 |     INDEX RANGE SCAN          | I_T6_OWNER     |   300 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("OBJECT_ID"=69450)
6 - access("OWNER"='SYSTEM')

Statistics
----------------------------------------------------------
1  recursive calls
0  db block gets
7  consistent gets
0  physical reads
0  redo size
11383  bytes sent via SQL*Net to client
712  bytes received via SQL*Net from client
22  SQL*Net roundtrips to/from client
1  sorts (memory)
0  sorts (disk)
301  rows processed

-->从上面的统计信息可知,consistent gets由46下降为7,故当where子句中谓词上存在索引时,使用union替换or更高效
-->即使当列object_id与owner上不存在索引时,使用union仍然比or更高效(在Oracle 10g R2与Oracle 11g R2测试)

4) 避免索引列上使用函数
-->下面是一个来自实际生产环境的例子
-->表acc_pos_int_tbl上business_date列存在索引,由于使用了SUBSTR函数,此时索引失效,使用全表扫描
SELECT acc_num
, curr_cd
, DECODE( '20110728'
, ( SELECT TO_CHAR( LAST_DAY( TO_DATE( '20110728', 'YYYYMMDD' ) ), 'YYYYMMDD' ) FROM dual ), 0
,   adj_credit_int_lv1_amt
+ adj_credit_int_lv2_amt
- adj_debit_int_lv1_amt
- adj_debit_int_lv2_amt )
AS interest
FROM   acc_pos_int_tbl
WHERE  SUBSTR( business_date, 1, 6 ) = SUBSTR( '20110728', 1, 6 ) AND business_date <= '20110728';

-->改进的办法
SELECT acc_num
, curr_cd
, DECODE( '20110728'
, ( SELECT TO_CHAR( LAST_DAY( TO_DATE( '20110728', 'YYYYMMDD' ) ), 'YYYYMMDD' ) FROM dual ), 0
,   adj_credit_int_lv1_amt
+ adj_credit_int_lv2_amt
- adj_debit_int_lv1_amt
- adj_debit_int_lv2_amt )
AS interest
FROM   acc_pos_int_tbl acc_pos_int_tbl
WHERE  business_date >= TO_CHAR( LAST_DAY( ADD_MONTHS( TO_DATE( '20110728', 'yyyymmdd' ), -1 ) )
+ 1, 'yyyymmdd' )
AND business_date <= '20110728';

-->下面的例子虽然没有使用函数,但字符串连接同样导致索引失效
-->低效:
SELECT account_name, amount
FROM   transaction
WHERE  account_name
|| account_type = 'AMEXA';

-->高效:
SELECT account_name, amount
FROM   transaction
WHERE  account_name = 'AMEX' AND account_type = 'A';

5) 比较不匹配的数据类型
-->下面的查询中business_date列上存在索引,且为字符型,这种
-->低效:
SELECT *
FROM   acc_pos_int_tbl
WHERE  business_date = 20090201;

Execution Plan
----------------------------------------------------------
Plan hash value: 2335235465

-------------------------------------------------------------------------------------
| Id  | Operation         | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                 | 37516 |  2857K|   106K  (1)| 00:21:17 |
|*  1 |  TABLE ACCESS FULL| ACC_POS_INT_TBL | 37516 |  2857K|   106K  (1)| 00:21:17 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_NUMBER("BUSINESS_DATE")=20090201)    -->这里可以看到产生了类型转换

-->高效:
SELECT *
FROM   acc_pos_int_tbl
WHERE  business_date = '20090201'

6) 索引列上使用 NULL 值
IS NULL和IS NOT NULL会限制索引的使用,因为数据中没有值等于NULL值,即便是NULL值也不等于NULL值.且NULL值不存储在于索引之中
因此应尽可能避免在索引类上使用NULL值

SELECT acc_num
, pl_cd
, order_qty
, trade_date
FROM   trade_client_tbl
WHERE  input_date IS NOT NULL;

Execution Plan
----------------------------------------------------------
Plan hash value: 901462645
--------------------------------------------------------------------------------------
| Id  | Operation         | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                  |     1 |    44 |    15   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TRADE_CLIENT_TBL |     1 |    44 |    15   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

alter table trade_client_tbl modify (input_date not null);

不推荐使用的查询方式
SELECT * FROM table_name WHERE col IS NOT NULL

SELECT * FROM table_name WHERE col IS NULL

推荐使用的方式
SELECT * FROM table_name WHERE col >= 0 --尽可能的使用 =, >=, <=, like 等运算符
-->Author: Robinson Cheng
-->Blog: http://blog.csdn.net/robinson_0612

三、总结

1、尽可能最小化基表数据以及中间结果集(通过过滤条件避免后续产生不必要的计算与聚合)

2、为where子句中的谓词信息提供最佳的访问路径(rowid访问,索引访问)

3、使用合理的SQL写法来避免过多的Oracle内部开销以提高性能

4、合理的使用提示以提高表之间的连接来提高连接效率(如避免迪卡尔集,将不合理的嵌套连接改为hash连接等)
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: