
About NOT EXISTS, NOT IN, EXISTS, and IN (with reference to Tom's article)

2010-02-23 08:37
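EXISTS: generally used with correlated subqueries (it also works with other subqueries, but there it serves little purpose). Characteristics: no column name is written in front of EXISTS, and the subquery that follows is mostly ...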

Find the department numbers of the departments that have employees:

SQL> select deptno from dept d
2  where exists (
3    select 1 from emp e
4    where e.deptno = d.deptno
5  );

DEPTNO
------
10
20
30

3 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2505850981

-------------------------------------------------------------------------------
| Id  | Operation           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |         |     3 |    18 |     4  (25)| 00:00:01 |
|   1 |  NESTED LOOPS       |         |     3 |    18 |     4  (25)| 00:00:01 |
|   2 |   SORT UNIQUE       |         |    14 |    42 |     3   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| EMP     |    14 |    42 |     3   (0)| 00:00:01 |
|*  4 |   INDEX UNIQUE SCAN | PK_DEPT |     1 |     3 |     0   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("E"."DEPTNO"="D"."DEPTNO")


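The statement above behaves roughly like the following operation, which is a loop over the outer table. Each value taken from the outer table is compared against the join column of the inner table, and the row is output if a match exists in the inner table.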

for d in ( select deptno from dept )
loop
    if ( exists ( select 1 from emp e where e.deptno = d.deptno ) )
    then
        output the record
    end if
end loop

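Because the outer table has no filter condition, it will usually be read with a full scan, so it is advisable to have an index on the join column. In the plan above, the lookup on DEPT goes through an INDEX UNIQUE SCAN on PK_DEPT, which speeds up the query.

EXISTS works best when the outer table dept is fairly small and the inner table emp has a suitable index on deptno.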
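As a minimal sketch of the indexing advice, assuming emp.deptno is not indexed yet, the following creates such an index and displays the plan for the EXISTS query again. The index name emp_deptno_idx is hypothetical, and whether the optimizer actually picks the index depends on the data and statistics.

```sql
-- Hypothetical index name; assumes emp.deptno has no index yet.
create index emp_deptno_idx on emp (deptno);

-- SQL*Plus: show the execution plan only, without running the query.
set autotrace traceonly explain

select deptno from dept d
where exists (select 1 from emp e where e.deptno = d.deptno);

set autotrace off
```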

IN:
Equal-to-any-member-of test. Equivalent to =ANY.

SQL> select e.ename,e.deptno from emp e
2  where e.deptno in ( select d.deptno from dept d);

ENAME      DEPTNO
---------- ------
SMITH          20
ALLEN          30
WARD           30
JONES          20
MARTIN         30
BLAKE          30
CLARK          10
SCOTT          20
KING           10
TURNER         30
ADAMS          20
JAMES          30
FORD           20
MILLER         10

14 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3074306753

------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |    14 |   168 |     3   (0)| 00:00:01 |
|   1 |  NESTED LOOPS      |         |    14 |   168 |     3   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| EMP     |    14 |   126 |     3   (0)| 00:00:01 |
|*  3 |   INDEX UNIQUE SCAN| PK_DEPT |     1 |     3 |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("E"."DEPTNO"="D"."DEPTNO")


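This is similar to the query below. IN turns this kind of subquery into a hash or sort join, so it is generally recommended when the subquery result is small, which keeps the range of comparisons small.

select e.ename, e.deptno
from emp e, (select distinct d.deptno from dept d) d
where e.deptno = d.deptno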

The subquery is evaluated, distinct'ed, indexed (or hashed or sorted) and then joined to
the original table -- typically

Let's say the result of the subquery is small -- then IN is typically more appropriate.

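So EXISTS is not always better than IN; it depends on the situation.

In terms of performance:
EXISTS works as a loop, so the number of iterations matters most. The outer table should have few rows, while the size of the inner table matters much less.

With IN, you generally want the inner table (the subquery result) to be small.

If both the inner and the outer table are large, it depends. That takes testing against the actual data volumes and tuning other factors, such as indexes.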
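One way to run such a test is to write the same question both ways and compare the plans in SQL*Plus. This is only a sketch against the emp and dept tables used above. The optimizer may well produce the same plan for both forms, so the comparison is worth repeating on realistic data volumes.

```sql
-- SQL*Plus: show execution plans only, without running the queries.
set autotrace traceonly explain

-- IN form
select e.ename, e.deptno from emp e
where e.deptno in (select d.deptno from dept d);

-- EXISTS form of the same question
select e.ename, e.deptno from emp e
where exists (select 1 from dept d where d.deptno = e.deptno);

set autotrace off
```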

Note that in general, NOT IN and NOT EXISTS are NOT the same!!!

SQL> select count(*) from emp where empno not in ( select mgr from emp );

COUNT(*)
----------
0

apparently there are NO rows such that an employee is not a mgr -- everyone is a mgr
(or are they)


SQL> select count(*) from emp T1
2 where not exists ( select null from emp T2 where t2.mgr = t1.empno );

COUNT(*)
----------
9

Ahh, but now there are 9 people who are not managers. Beware the NULL value and NOT IN!!
(also the reason why NOT IN is sometimes avoided).


NOT IN can be just as efficient as NOT EXISTS -- many orders of magnitude BETTER even --
if an "anti-join" can be used (if the subquery is known to not return nulls)

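The first query looks for employee numbers that are not in the mgr list. A result of 0 would mean that no employee number is missing from the mgr list, in other words that everyone is a manager. That is not actually the case: running the same search with NOT EXISTS, which counts the employees whose number does not appear in the mgr column, returns 9, so there are still 9 people who are not managers.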

A query shows that the mgr column contains a NULL.
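The check itself is not shown in the post. One plausible form, assuming the same emp table, is simply to count the rows whose mgr is NULL:

```sql
-- Any count above zero means NOT IN (select mgr from emp) can never return rows.
select count(*) from emp where mgr is null;
```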

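NOT IN can be read as xx <> x1 and xx <> x2 and xx <> x3. If the NOT IN list contains a NULL, the comparison xx <> NULL evaluates to unknown, so the whole condition can never be true and no rows are returned.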
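The effect can be reproduced without any table data. This small sketch uses literal lists against dual:

```sql
-- No NULL in the list: 1 <> 2 and 1 <> 3 is true, so one row comes back.
select 'row returned' from dual where 1 not in (2, 3);

-- NULL in the list: 1 <> NULL is unknown, so the condition is never true
-- and no rows are returned.
select 'row returned' from dual where 1 not in (2, null);
```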

So NOT IN must be used with care because of the NULL issue: make sure the column returned by the subquery cannot be NULL.
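A minimal way to guarantee this for the example above is to filter NULLs out inside the subquery. Assuming empno itself is never NULL (it is normally the primary key of emp), this should give the same count as the NOT EXISTS form:

```sql
-- Exclude NULL managers so NOT IN compares only against real values.
select count(*) from emp
where empno not in (select mgr from emp where mgr is not null);
```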

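When no anti-join is used, NOT IN full-scans both the inner and the outer table and uses no index,
while the subquery of NOT EXISTS can use an index on the table.
So in that case NOT EXISTS is the recommended replacement for NOT IN.
For EXISTS versus IN, though, it depends on the situation.
When I have time I will illustrate this with concrete examples and execution plans.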