您的位置:首页 > 其它

mycat ShareJoin后少了2条记录的问题剖析详细过程

2016-05-04 19:58 423 查看

--<>问题描述

Mycat 在进行join的时候,本来是应该join出4条记录,但是结果却只出来2条记录,问题在哪里?

mycat版本是1.5 mysql版本是5.6.12-log

1,准备环境,先去原始mysql库上执行建库录入数据操作

# 在db1库和db2库分别建2个测试表
mysql> use db1
Database changed
mysql> CREATE TABLE `t1` (
-> `id` bigint(20) NOT NULL AUTO_INCREMENT,
-> `name` varchar(200) DEFAULT NULL,
-> PRIMARY KEY (`id`)
-> ) ENGINE=InnoDB AUTO_INCREMENT=2001 DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE `t2` (
-> `id` bigint(20) NOT NULL AUTO_INCREMENT,
-> `name` varchar(200) DEFAULT NULL,
-> `parent_id` bigint(20) DEFAULT NULL,
-> PRIMARY KEY (`id`)
-> ) ENGINE=InnoDB AUTO_INCREMENT=2001 DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> use db2
Database changed
mysql> CREATE TABLE `t1` (
-> `id` bigint(20) NOT NULL AUTO_INCREMENT,
-> `name` varchar(200) DEFAULT NULL,
-> PRIMARY KEY (`id`)
-> ) ENGINE=InnoDB AUTO_INCREMENT=2001 DEFAULT CHARSET=utf8;
r(200) DEFAULT NULL,
`parent_id` bigint(20) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2001 DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TABLE `t2` (
-> `id` bigint(20) NOT NULL AUTO_INCREMENT,
-> `name` varchar(200) DEFAULT NULL,
-> `parent_id` bigint(20) DEFAULT NULL,
-> PRIMARY KEY (`id`)
-> ) ENGINE=InnoDB AUTO_INCREMENT=2001 DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.01 sec)

mysql>
# 也是在mysql中执行,开始录入数据
mysql> Insert into db2.t1(id,name) values(1,'db2');
Query OK, 1 row affected (0.01 sec)

mysql> Insert into db1.t1(id,name) values(2,'db1');
db1.2',1);
Insert into db2.t2(id,name,parent_id) values(3,'db2.2',2);Query OK, 1 row affected (0.00 sec)

mysql> Insert into db1.t2(id,name,parent_id) values(2,'db1.1',1);
Query OK, 1 row affected (0.00 sec)

mysql> Insert into db1.t2(id,name,parent_id) values(4,'db2.2',2);
Query OK, 1 row affected (0.00 sec)

mysql> Insert into db2.t2(id,name,parent_id) values(1,'db1.2',1);
Query OK, 1 row affected (0.00 sec)

mysql> Insert into db2.t2(id,name,parent_id) values(3,'db2.2',2);
Query OK, 1 row affected (0.01 sec)

mysql>
请注意,重点来了,按照正常的理解,2个节点的t1表有2条记录,2个节点的t2表有4条记录,那么表关联后应该返回4条记录,对不对?

那么mycat到底给我们返回了几条记录呢是不是4条呢?

答案是否定的!!!!!!!接下来配置mycat吧,实际验证一下。

2,mycat中schema.xml配置

Schema.xml配置:
# table表配置
<table name="t1" dataNode="dn1,dn2" rule="mod-long" primaryKey="id" />
<table name="t2" dataNode="dn2,dn1" rule="mod-long" primaryKey="id" />

# datanode配置
<dataNode name="dn1" dataHost="dh1" database="db1" />
<dataNode name="dn2" dataHost="dh1" database="db2" />

# datahost配置
<dataHost name="dh1" maxCon="1000" minCon="10" balance="0"
writeType="0" dbType="mysql" dbDriver="native" switchType="1" slaveThreshold="100">
<heartbeat>select user()</heartbeat>
<writeHost host="hostM1" url="192.168.121.61:3306" user="root" password=""/>
</dataHost>

3,重启mycat

[root@oracle_standby ~]# mycat restart
Stopping Mycat-server...
Stopped Mycat-server.
Starting Mycat-server...
[root@oracle_standby ~]#
文章源地址:/article/7549630.html,谢绝转载

4,查看mycat执行结果

# 单表查询t1有2条记录,正常
mysql> select * from t1;
+----+------+
| id | name |
+----+------+
| 2 | db1 |
| 1 | db2 |
+----+------+
2 rows in set (0.01 sec)

# 单表查询t2有4条记录,正常
mysql> select * from t2;
+----+-------+-----------+
| id | name | parent_id |
+----+-------+-----------+
| 1 | db1.2 | 1 |
| 3 | db2.2 | 2 |
| 2 | db1.1 | 1 |
| 4 | db2.2 | 2 |
+----+-------+-----------+
4 rows in set (0.01 sec)

# 那么t1 通过parent_id关联t2表,应该有4条记录,对不对?但是只有2条,而且加了ShareJoin前后的查询出来的数据还不一样,如下所示:
mysql> select * from t1,t2 where t1.id = t2.parent_id;
+----+------+----+-------+-----------+
| id | name | id | name | parent_id |
+----+------+----+-------+-----------+
| 2 | db1 | 4 | db2.2 | 2 |
| 1 | db2 | 1 | db1.2 | 1 |
+----+------+----+-------+-----------+
2 rows in set (0.01 sec)

mysql> /*!mycat:catlet=demo.catlets.ShareJoin */ select t1.*,t2.* from t1,t2 where t1.id = t2.parent_id;
+----+------+-------+-----------+
| id | name | name | parent_id |
+----+------+-------+-----------+
| 1 | db2 | db1.1 | 1 |
| 2 | db1 | db2.2 | 2 |
+----+------+-------+-----------+
2 rows in set (0.02 sec)

mysql>
# 看到加了注解走了mycat后,结果集不一样了,问题在哪里呢?

5,Mycat.log分析

05/04 17:21:11.443 DEBUG [$_NIOREACTOR-0-RW] (ServerQueryHandler.java:56) -ServerConnection [id=1, schema=TESTDB, host=127.0.0.1, user=test,txIsolation=3, autocommit=true, schema=TESTDB]/*!mycat:catlet=demo.catlets.ShareJoin */ select t1.id,t1.name,t2.id id2,t2.name,t2.parent_id from t1,t2 where t1.id = t2.parent_id

05/04 17:21:11.447 DEBUG [$_NIOREACTOR-0-RW] (EnchachePool.java:76) -SQLRouteCache miss cache ,key:TESTDB/*!mycat:catlet=demo.catlets.ShareJoin */ select t1.id,t1.name,t2.id id2,t2.name,t2.parent_id from t1,t2 where t1.id = t2.parent_id
05/04 17:21:11.469 DEBUG [$_NIOREACTOR-0-RW] (HintCatletHandler.java:46) -load catelet class:demo.catlets.ShareJoin to run sql select t1.id,t1.name,t2.id id2,t2.name,t2.parent_id from t1,t2 where t1.id = t2.parent_id
05/04 17:21:11.573 INFO [$_NIOREACTOR-0-RW] (JoinParser.java:70) -SQL: select t1.id,t1.name,t2.id id2,t2.name,t2.parent_id from t1,t2 where t1.id = t2.parent_id
# 这里路由正确,catlet路由到了dn1、dn2了。
05/04 17:21:11.677 INFO [$_NIOREACTOR-0-RW] (ShareJoin.java:150) -Catlet exec:dn1,dn2, sql:select id , name from t1
05/04 17:21:11.677 DEBUG [BusinessExecutor2] (PhysicalDBPool.java:452) -select read source hostM1 for dataHost:dh1
05/04 17:21:11.681 DEBUG [BusinessExecutor2] (SQLJob.java:85) -con query sql:select id , name from t1 to con:MySQLConnection [id=2, lastTime=1462353671681, user=root, schema=db1, old shema=db1, borrowed=true, fromSlaveDB=false, threadId=103614, charset=utf8, txIsolation=0, autocommit=true, attachment=dn1{select id , name from t1}, respHandler=null, host=192.168.121.61, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]
05/04 17:21:11.685 DEBUG [$_NIOREACTOR-0-RW] (PhysicalDatasource.java:403) -release channel MySQLConnection [id=2, lastTime=1462353671666, user=root, schema=db1, old shema=db1, borrowed=true, fromSlaveDB=false, threadId=103614, charset=utf8, txIsolation=0, autocommit=true, attachment=null, respHandler=null, host=192.168.121.61, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]
05/04 17:21:11.685 INFO [$_NIOREACTOR-0-RW] (BatchSQLJob.java:41) -job finished SQLJob [ id=1,dataNodeOrDatabase=dn1,sql=select id , name from t1, jobHandler=demo.catlets.ShareDBJoinHandler@6fb40e2d]
05/04 17:21:11.685 DEBUG [BusinessExecutor0] (PhysicalDBPool.java:452) -select read source hostM1 for dataHost:dh1
05/04 17:21:11.686 DEBUG [BusinessExecutor0] (SQLJob.java:85) -con query sql:select id , name from t1 to con:MySQLConnection [id=8, lastTime=1462353671686, user=root, schema=db2, old shema=db2, borrowed=true, fromSlaveDB=false, threadId=103617, charset=utf8, txIsolation=0, autocommit=true, attachment=dn2{select id , name from t1}, respHandler=null, host=192.168.121.61, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]
05/04 17:21:11.687 DEBUG [$_NIOREACTOR-0-RW] (PhysicalDatasource.java:403) -release channel MySQLConnection [id=8, lastTime=1462353671666, user=root, schema=db2, old shema=db2, borrowed=true, fromSlaveDB=false, threadId=103617, charset=utf8, txIsolation=0, autocommit=true, attachment=null, respHandler=null, host=192.168.121.61, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]
05/04 17:21:11.724 DEBUG [BusinessExecutor1] (PhysicalDBPool.java:452) -select read source hostM1 for dataHost:dh1
# 这里面有用到SQLJob.java,通过将t1表的id传给t2表以where in(1,2)的方式来获取t2表的数据
05/04 17:21:11.724 DEBUG [BusinessExecutor1] (SQLJob.java:85) -con query sql:select parent_id, id as id2 , name , parent_id from t2 where parent_id in (1,2) to con:MySQLConnection [id=2, lastTime=1462353671724, user=root, schema=db1, old shema=db1, borrowed=true, fromSlaveDB=false, threadId=103614, charset=utf8, txIsolation=0, autocommit=true, attachment=dn1{select parent_id, id as id2 , name , parent_id from t2 where parent_id in (1,2) }, respHandler=null, host=192.168.121.61, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]

而在mycat中,执行这个sql语句是能获取到4条记录的,如下所示:
mysql> select parent_id, id as id2 , name , parent_id from t2 where parent_id in (1,2);
+-----------+-----+-------+-----------+
| parent_id | id2 | name | parent_id |
+-----------+-----+-------+-----------+
| 1 | 1 | db1.2 | 1 |
| 2 | 3 | db2.2 | 2 |
| 1 | 2 | db1.1 | 1 |
| 2 | 4 | db2.2 | 2 |
+-----------+-----+-------+-----------+
4 rows in set (0.03 sec)

mysql>
那为什么最后在mycat窗口中显示为2条呢?继续看下面日志:
05/04 17:21:11.724 DEBUG [BusinessExecutor3] (PhysicalDBPool.java:452) -select read source hostM1 for dataHost:dh1
05/04 17:21:11.725 DEBUG [BusinessExecutor3] (SQLJob.java:85) -con query sql:select parent_id, id as id2 , name , parent_id from t2 where parent_id in (1,2) to con:MySQLConnection [id=8, lastTime=1462353671725, user=root, schema=db2, old shema=db2, borrowed=true, fromSlaveDB=false, threadId=103617, charset=utf8, txIsolation=0, autocommit=true, attachment=dn2{select parent_id, id as id2 , name , parent_id from t2 where parent_id in (1,2) }, respHandler=null, host=192.168.121.61, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]
05/04 17:21:11.725 INFO [$_NIOREACTOR-0-RW] (ShareJoin.java:229) -SQLParallJob:dn1,dn2, sql:select parent_id, id as id2 , name , parent_id from t2 where parent_id in (1,2)
05/04 17:21:11.726 INFO [$_NIOREACTOR-0-RW] (BatchSQLJob.java:41) -job finished SQLJob [ id=2,dataNodeOrDatabase=dn2,sql=select id , name from t1, jobHandler=demo.catlets.ShareDBJoinHandler@6fb40e2d]
05/04 17:21:11.726 DEBUG [$_NIOREACTOR-0-RW] (EngineCtx.java:82) -packge id 1
05/04 17:21:11.727 DEBUG [$_NIOREACTOR-0-RW] (PhysicalDatasource.java:403) -release channel MySQLConnection [id=2, lastTime=1462353671707, user=root, schema=db1, old shema=db1, borrowed=true, fromSlaveDB=false, threadId=103614, charset=utf8, txIsolation=0, autocommit=true, attachment=null, respHandler=null, host=192.168.121.61, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]
05/04 17:21:11.728 INFO [$_NIOREACTOR-0-RW] (BatchSQLJob.java:41) -job finished SQLJob [ id=3,dataNodeOrDatabase=dn1,sql=select parent_id, id as id2 , name , parent_id from t2 where parent_id in (1,2) , jobHandler=demo.catlets.ShareRowOutPutDataHandler@13037844]
05/04 17:21:11.728 DEBUG [$_NIOREACTOR-0-RW] (PhysicalDatasource.java:403) -release channel MySQLConnection [id=8, lastTime=1462353671707, user=root, schema=db2, old shema=db2, borrowed=true, fromSlaveDB=false, threadId=103617, charset=utf8, txIsolation=0, autocommit=true, attachment=null, respHandler=null, host=192.168.121.61, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]
05/04 17:21:11.728 INFO [$_NIOREACTOR-0-RW] (BatchSQLJob.java:41) -job finished SQLJob [ id=4,dataNodeOrDatabase=dn2,sql=select parent_id, id as id2 , name , parent_id from t2 where parent_id in (1,2) , jobHandler=demo.catlets.ShareRowOutPutDataHandler@13037844]
05/04 17:21:11.729 INFO [$_NIOREACTOR-0-RW] (EngineCtx.java:171) -all job finished for front connection: ServerConnection [id=1, schema=TESTDB, host=127.0.0.1, user=test,txIsolation=3, autocommit=true, schema=TESTDB]
05/04 17:21:11.729 INFO [$_NIOREACTOR-0-RW] (EngineCtx.java:159) -write eof ,packgId:10
05/04 17:21:11.729 INFO [$_NIOREACTOR-0-RW] (ShareJoin.java:159) -发送数据OK

PS:看到最后,一切流程正常了,没有看到问题所在,看来只有去看源代码了。

6,源码分析join

01.png:



02.png:



如果说。两次都获取到了,就正常了:03.png:



那为啥2次获取不到呢?看着啊,我输出了点东西,04.png:



这就是原因:
private byte[] getRow(String value,int index){
for(Map.Entry<String,byte[]> e: arows.entrySet() ){
String key=e.getKey();
RowDataPacket rowDataPkg = ResultSetUtil.parseRowData(e.getValue(), afields);
String id = ByteUtil.getString(rowDataPkg.fieldValues.get(index));
if (id.equals(value)){
System.out.println("getRow方法,结果集将A表join列值为【"+value+"】的集合remove掉了,再相查就查不到了");
return arows.remove(key);
}
}
return null;
}
这里面的逻辑,处理比较麻烦,如果想改,还没想好怎么改……
再次描述下,不知道是否能描述清楚,大家都懂的,it人在讲述“专业高深技能”方面经常是一剑西来天外飞仙,场面恢宏绚丽无比,口如悬河滔滔不绝,就是太高大上了,普通人叹为观止一时无法领悟其中精髓……
这个还是接地气点,说的是他第一查询后,把对应key的值remove掉了,所以第二次查不到了,他查B表是查到4条,根据B表查询结果,取原先A表查询结果去组合返回结果,一条一条返回,可以第一次查的时候把A表对应值的记录去掉了,第二条(关联列值相同)查不到A表记录,他认为不符合join结果,就去掉了。个人理解应该i是判断是否符合join条件,和removekey那个值,分开处理才是的。个人建议。

--<>--这个sharejoin处理的很复杂,如果不稳定的话,建议抛弃,ER也建议抛弃

Bty:此处问题有akey提出,我做了数据准备工作日志分析,帅哥做了源码分析工作,victor大神积极参与,静哥等密切关注推进,最后我整理交流记录而成本篇笔记,准备提交git,谢谢大家的帮助。
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: