IC90024: Why was NLJOIN not chosen?

2015-04-27 23:05
IC90024: OPTIMIZER MIGHT FAVOUR HSJOIN OVER AN ORDERED NLJOIN

The optimizer might favour a hash join (HSJOIN) over a nested
loop join (NLJOIN) alternative under the following conditions:
- the join is on two or more columns
- both tables in the join have an index with leading, non-bound
key columns that participate in the join
- one or more leading columns match in order of the join
columns, but not all

For example, consider the following query:

SELECT *
FROM T1,T2
WHERE T1.A=T2.A and T1.B=T2.B and T1.C=T2.C and T1.X=1;

where index IX1 is defined on T1(X,A,C,B) and index IX2 is
defined on T2(A,B,C). The column T1.X is bound to the constant
1 as a result of the predicate "T1.X=1" so for the NLJOIN with
IX1 access on the outer and IX2 access on the inner, the leading
non-bound columns for both indexes are referenced in the join
predicate T1.A=T2.A, but the subsequent columns are not ordered
in join column order. Under these conditions, the optimizer
might overestimate the cost of the NLJOIN alternative,
favouring a possibly worse-performing HSJOIN alternative.
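
The three conditions above can be checked mechanically for a given pair of indexes. The following is a minimal sketch in Python of one simplified reading of the APAR text; it is not the optimizer's actual logic, and the function names `unbound_keys` and `matches_apar_pattern` are hypothetical helpers introduced only for illustration.

```python
def unbound_keys(index_cols, bound_cols):
    """Return the index key columns left after dropping leading
    columns that are bound to a constant (e.g. X in T1.X=1)."""
    cols = list(index_cols)
    while cols and cols[0] in bound_cols:
        cols.pop(0)
    return cols


def matches_apar_pattern(outer_index, inner_index, join_pairs,
                         bound_outer=(), bound_inner=()):
    """Simplified reading of the IC90024 conditions (an assumption,
    not DB2 internals).

    join_pairs is a list of (outer_column, inner_column) equality
    join predicates.
    """
    # Condition 1: the join is on two or more columns.
    if len(join_pairs) < 2:
        return False

    outer = unbound_keys(outer_index, set(bound_outer))
    inner = unbound_keys(inner_index, set(bound_inner))
    pairs = set(join_pairs)

    # Count leading key positions where both indexes line up on
    # columns that are paired by a join predicate.
    matched = 0
    for outer_col, inner_col in zip(outer, inner):
        if (outer_col, inner_col) in pairs:
            matched += 1
        else:
            break

    # Conditions 2 and 3: at least one leading column matches,
    # but not all join columns are in matching order.
    return 1 <= matched < len(join_pairs)


joins = [("A", "A"), ("B", "B"), ("C", "C")]

# IX1 on T1(X,A,C,B) with X bound, IX2 on T2(A,B,C): only A lines up.
print(matches_apar_pattern(["X", "A", "C", "B"], ["A", "B", "C"],
                           joins, bound_outer=["X"]))

# With an index on T2(A,C,B), as the local fix suggests, all columns line up.
print(matches_apar_pattern(["X", "A", "C", "B"], ["A", "C", "B"],
                           joins, bound_outer=["X"]))
```

Under this reading, the example indexes match the problem pattern, while an index on T2(A,C,B) does not.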

Local fix

Create an index on either table, ensuring that the key columns
are in join order. Referring to the example in the Error
Description, this could be achieved by creating an index on
T2(A,C,B).

APAR verification and analysis:

Environment:

DB2 V9.7 FP9

Test data:

db2 "create table t1(x integer,a varchar(128),b varchar(128),c varchar(128),d date)"
db2 "create table t2(a varchar(128),b varchar(128),c varchar(128),d date)"
db2 "insert into t1 select row_number() over(),tabschema,tabname,colname,current date from syscat.columns"
db2 "insert into t2 select a,b,c,d from t1"
db2 "insert into t1 select * from t1"
db2 "insert into t1 select * from t1"
db2 "insert into t1 select * from t1"
db2 "insert into t1 select * from t1"
db2 "insert into t2 select * from t2"
db2 "insert into t2 select * from t2"
db2 "insert into t2 select * from t2"
db2 "insert into t2 select * from t2"
db2 "insert into t1 values(10000,'a','b','c',current date)"
db2 "insert into t2 values('a','b','c',current date)"
db2 "create index idx1_t1 on t1(x,a,c,b)"
db2 "create index idx1_t2 on t2(a,b,c)"
db2 "runstats on table e97q9a.t1 with distribution and indexes all"
db2 "runstats on table e97q9a.t2 with distribution and indexes all"

$ db2 "select count(1) from t1"

1
-----------
99873

1 record(s) selected.

$ db2 "select count(1) from t2"

1
-----------
99873

1 record(s) selected.

$ db2 "select count(1) from t1,t2 where t1.a=t2.a and t1.b=t2.b and t1.c=t2.c and t1.x=10000"

1
-----------
1

1 record(s) selected.

db2expln -d sample -g -q "select * from t1,t2 where t1.a=t2.a and t1.b=t2.b and t1.c=t2.c and t1.x=10000" -t

Statement:

select *
from t1, t2
where t1.a=t2.a and t1.b=t2.b and t1.c=t2.c and t1.x=10000

Optimizer Plan:

                  Rows
                Operator
                  (ID)
                  Cost

                 255.923
                   n/a
                 RETURN
                  ( 1)
                 921.938
                    |
                 255.923
                   n/a
                 HSJOIN
                  ( 2)
                 921.938
               /         \
         99873            15.9976
          n/a               n/a
        TBSCAN             FETCH
         ( 3)               ( 4)
        811.538           93.0733
           |              /      \
         99873      15.9976      99873
          n/a         n/a         n/a
        Table:      IXSCAN      Table:
        E97Q9A       ( 5)       E97Q9A
        T2          15.1502     T1
                       |
                      6243
                     Index:
                     E97Q9A
                     IDX1_T1

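Only one row satisfies t1.x=10000, so the optimal plan should be an NLJOIN
with T1 as the outer table (index scan filtering on x=10000) and T2 as the inner table (index scan).
Instead, the optimizer chose an inefficient hash join with a full table scan of T2, which has a high I/O cost.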

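Following the Local fix, the index on t2 was rebuilt as (a,c,b), but the problem was not resolved.
Moreover, the APAR states that the problem is fixed in v97fp9, yet this experiment was run on exactly v97fp9.
The APAR itself appears to be flawed.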

The correct local fix:

Method 1: runstats on table e97q9a.t2 with distribution and detailed indexes all (using the DETAILED option)

The DETAILED option collects CLUSTERFACTOR and PAGE_FETCH_PAIRS information.

$ db2 "select substr(tabname,1,32) tabname,substr(indname,1,32) indname,clusterfactor,clusterratio from syscat.indexes where tabname in('T1','T2')"

TABNAME                          INDNAME                          CLUSTERFACTOR            CLUSTERRATIO
-------------------------------- -------------------------------- ------------------------ ------------
T1                               IDX1_T1                            -1.00000000000000E+000            0
T2                               IDX1_T2                            +2.01785804368663E-004           -1
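
The two clustering columns are easy to misread, because each one holds -1 when the other is in use: CLUSTERRATIO is -1 once detailed index statistics have been collected, and CLUSTERFACTOR is -1 when they have not. The following is a minimal sketch in Python that normalizes a row from this query to a single value between 0 and 1; the function name `clustering_degree` is a hypothetical helper, and the input values are the ones shown in the query outputs in this article.

```python
def clustering_degree(clusterfactor, clusterratio):
    """Return the degree of clustering as a fraction in [0, 1],
    or None if neither statistic is available.

    Assumes the SYSCAT.INDEXES convention that -1 means
    "not collected" for either column.
    """
    if clusterfactor >= 0:
        # Detailed statistics were collected: use the finer measure.
        return clusterfactor
    if clusterratio >= 0:
        # Only basic statistics: CLUSTERRATIO is a percentage.
        return clusterratio / 100.0
    return None


# Values taken from the query outputs in this article.
print(clustering_degree(-1.0, 0))                    # T1.IDX1_T1
print(clustering_degree(2.01785804368663e-4, -1))    # T2.IDX1_T2 with DETAILED
print(clustering_degree(-1.0, 100))                  # T2.IDX1_T2 after REORG
```

Read this way, IDX1_T2 is reported as almost completely unclustered once detailed statistics are collected, and as fully clustered after the table is reorganized on that index.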

Optimizer Plan:

                   Rows
                 Operator
                   (ID)
                   Cost

                 255.923
                   n/a
                  RETURN
                   ( 1)
                 515.543
                    |
                 255.923
                   n/a
                  NLJOIN
                   ( 2)
                 515.543
                /        \----\
         15.9976               *
           n/a                / \
          FETCH        15.9976   99873
           ( 3)          n/a      n/a
         93.0733        IXSCAN   Table:
         /     \         ( 6)    E97Q9A
   15.9976     99873   15.1511   T2
     n/a        n/a       |
   IXSCAN     Table:     6243
    ( 4)      E97Q9A    Index:
   15.1502    T1        E97Q9A
      |                 IDX1_T2
     6243
    Index:
    E97Q9A
    IDX1_T1

Method 2: reorg table e97q9a.t2 index idx1_t2;
runstats on table e97q9a.t2 with distribution and indexes all (DETAILED is not required)

$ db2 "select substr(tabname,1,32) tabname,substr(indname,1,32) indname,clusterfactor,clusterratio from syscat.indexes where tabname in('T1','T2')"

TABNAME                          INDNAME                          CLUSTERFACTOR            CLUSTERRATIO
-------------------------------- -------------------------------- ------------------------ ------------
T1                               IDX1_T1                            -1.00000000000000E+000            0
T2                               IDX1_T2                            -1.00000000000000E+000          100

Optimizer Plan:

                   Rows
                 Operator
                   (ID)
                   Cost

                 255.923
                   n/a
                  RETURN
                   ( 1)
                 456.329
                    |
                 255.923
                   n/a
                  NLJOIN
                   ( 2)
                 456.329
                /        \----\
         15.9976               *
           n/a                / \
          FETCH        15.9976   99873
           ( 3)          n/a      n/a
         93.0733        IXSCAN   Table:
         /     \         ( 6)    E97Q9A
   15.9976     99873   15.1511   T2
     n/a        n/a       |
   IXSCAN     Table:     6243
    ( 4)      E97Q9A    Index:
   15.1502    T1        E97Q9A
      |                 IDX1_T2
     6243
    Index:
    E97Q9A
    IDX1_T1
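
To put the three plans side by side, the short Python sketch below compares the total estimated costs reported at the RETURN operator of each plan in this article. These are optimizer estimates (timerons), not measured run times, and the dictionary labels and the helper `cost_reduction_pct` are hypothetical names used only for this illustration.

```python
# Total estimated cost at the RETURN operator of each plan above.
plan_costs = {
    "HSJOIN (original statistics)": 921.938,
    "NLJOIN (method 1, DETAILED runstats)": 515.543,
    "NLJOIN (method 2, REORG + runstats)": 456.329,
}


def cost_reduction_pct(baseline, new):
    """Percentage by which `new` is cheaper than `baseline`."""
    return (baseline - new) / baseline * 100.0


baseline = plan_costs["HSJOIN (original statistics)"]
for label, cost in plan_costs.items():
    print(f"{label}: cost={cost}, "
          f"reduction={cost_reduction_pct(baseline, cost):.1f}%")
```

By the optimizer's own estimates, the NLJOIN plans are roughly 44% (method 1) and 50.5% (method 2) cheaper than the original HSJOIN plan.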

In addition, the problem is fixed in V10.5 FP4.