MySQL MHA master-slave switchover: problems encountered during testing
2016-05-28 14:39
Problem 1:
Fri May 27 10:01:05 2016 - [error][/apps/lib/mha/mha_manager/MHA/MasterRotate.pm, ln161] We should not start online master switch when one of connections are running long updates on the current master(10.16.24.108(10.16.24.108:3307)). Currently
1 update thread(s) are running.
Details:
{'Time' => '88270','Command' => 'Daemon','db' => undef,'Id' => '2','Info' => undef,'User' => 'event_scheduler','Progress' => '0.000','State' => 'Waiting on empty queue','Host' => 'localhost'}
Fri May 27 10:01:05 2016 - [error][/apps/lib/mha/mha_manager/MHA/ManagerUtil.pm, ln177] Got ERROR: at /apps/sh/mha/mha_manager/bin/masterha_master_switch line 53.
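Solution:
Turn off event_scheduler on the current master. MHA counts the event_scheduler daemon thread as a running update thread and refuses to start the online switch: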
(product)mha@10.16.24.108 [(none)]> SET GLOBAL event_scheduler =off;
Query OK, 0 rows affected (0.00 sec)
(product)mha@10.16.24.108 [(none)]> Select @@event_scheduler;
+-------------------+
| @@event_scheduler |
+-------------------+
| OFF |
+-------------------+
1 row in set (0.00 sec)
(product)mha@10.16.24.108 [(none)]> show processlist\G
*************************** 1. row ***************************
Id: 140
User: repl
Host: 10.16.24.107:44449
db: NULL
Command: Binlog Dump
Time: 13262
State: Master has sent all binlog to slave; waiting for binlog to be updated
Info: NULL
Progress: 0.000
*************************** 2. row ***************************
Id: 141
User: repl
Host: 10.16.24.109:23490
db: NULL
Command: Binlog Dump
Time: 13254
State: Master has sent all binlog to slave; waiting for binlog to be updated
Info: NULL
Progress: 0.000
*************************** 3. row ***************************
Id: 147
User: mha
Host: 10.16.24.108:59213
db: NULL
Command: Query
Time: 0
State: init
Info: show processlist
Progress: 0.000
3 rows in set (0.00 sec)
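
Before running masterha_master_switch, the processlist can be scanned for threads that are neither idle nor replication dump threads. The sketch below is only a rough approximation of that idea, not MHA's actual filtering logic; the function name find_long_threads and the default threshold of 1 second are assumptions made for illustration.

```shell
#!/bin/sh
# Print "Id User Command Time" for threads that look long-running.
# Reads tab-separated SHOW PROCESSLIST rows on stdin (no header line):
#   Id  User  Host  db  Command  Time  State  Info ...
# $1 is the time threshold in seconds (hypothetical default: 1).
find_long_threads() {
    threshold="${1:-1}"
    awk -F '\t' -v t="$threshold" '
        # Skip idle clients and replication dump threads.
        $5 == "Sleep" || $5 == "Binlog Dump" { next }
        # Report everything else that has been running for >= threshold.
        $6 + 0 >= t + 0 { print $1, $2, $5, $6 }
    '
}
```

It could be fed from the client in batch mode, for example `mysql -N -B -e 'SHOW PROCESSLIST' | find_long_threads 1`. With the event scheduler still on, the Daemon thread owned by event_scheduler from the log above would be reported; after SET GLOBAL event_scheduler = off it no longer appears in the processlist.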
Problem 2:
It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 10.16.24.108(10.16.24.108:3307)? (YES/no): yes
Fri May 27 15:19:14 2016 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Fri May 27 15:19:14 2016 - [info] ok.
Fri May 27 15:19:14 2016 - [info] Checking MHA is not monitoring or doing failover..
Fri May 27 15:19:14 2016 - [info] Checking replication health on 10.16.24.107..
Fri May 27 15:19:14 2016 - [info] ok.
Fri May 27 15:19:14 2016 - [info] Checking replication health on 10.16.24.109..
Fri May 27 15:19:14 2016 - [info] ok.
Fri May 27 15:19:14 2016 - [error][/apps/lib/mha/mha_manager/MHA/ServerManager.pm, ln1218] 10.16.24.109 is not alive!
Fri May 27 15:19:14 2016 - [error][/apps/lib/mha/mha_manager/MHA/MasterRotate.pm, ln232] Failed to get new master!
Fri May 27 15:19:14 2016 - [error][/apps/lib/mha/mha_manager/MHA/ManagerUtil.pm, ln177] Got ERROR: at /apps/sh/mha/mha_manager/bin/masterha_master_switch line 53.
Solution:
The entry for 10.16.24.109 in /apps/conf/mha/app1.cnf had no_master=1, which prevented it from becoming the new master. After commenting out no_master=1, the online switch succeeded.
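
Commenting out the setting can be sketched with sed. This is a minimal illustration, not the author's procedure: the function name comment_out_no_master is hypothetical, it assumes no_master=1 appears on a line of its own, and it writes the edited config to standard output instead of modifying the file in place. If more than one server section sets no_master=1, every such line is commented out, so the result should be reviewed before use.

```shell
#!/bin/sh
# Print the MHA config given as $1 with every "no_master=1" line
# turned into a comment. The original file is left untouched.
comment_out_no_master() {
    sed 's/^[[:space:]]*no_master[[:space:]]*=[[:space:]]*1[[:space:]]*$/#no_master=1/' "$1"
}
```

For example, `comment_out_no_master /apps/conf/mha/app1.cnf > /tmp/app1.cnf.new` produces a candidate file that can be compared with the original before replacing it.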
Problem 3:
Sat May 28 09:35:06 2016 - [info] Master configurations are as below:
Master 10.16.24.109(10.16.24.109:3307), replicating from 10.16.24.108(10.16.24.108:3307)
Master 10.16.24.108(10.16.24.108:3307), replicating from 10.16.24.109(10.16.24.109:3307), read-only
Sat May 28 09:35:06 2016 - [warning] SQL Thread is stopped(no error) on 10.16.24.108(10.16.24.108:3307)
Sat May 28 09:35:06 2016 - [error][/apps/lib/mha/mha_manager/MHA/ServerManager.pm, ln726] Slave 10.16.24.107(10.16.24.107:3307) replicates from 10.16.24.108:3307, but real master is 10.16.24.109(10.16.24.109:3307)!
Sat May 28 09:35:06 2016 - [error][/apps/lib/mha/mha_manager/MHA/ManagerUtil.pm, ln177] Got ERROR: at /apps/lib/mha/mha_manager/MHA/MasterRotate.pm line 85.
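Solution:
Make 10.16.24.108 the only writable server, so that MHA recognizes it as the real master:
On 10.16.24.108 run: set global read_only=off;
On 10.16.24.109 run: set global read_only=on;
On 10.16.24.107 run: set global read_only=on;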
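
The rule behind these three statements (exactly one writable server, everything else read-only) can be written down as a small helper. This is a sketch under assumptions: the function name read_only_plan is hypothetical, and it only prints the statements instead of connecting to any server.

```shell
#!/bin/sh
# Print one "host<TAB>statement" line per server: the intended master
# ($1) gets read_only=OFF, every other listed server gets read_only=ON.
read_only_plan() {
    master="$1"
    shift
    for host in "$@"; do
        if [ "$host" = "$master" ]; then
            printf '%s\tSET GLOBAL read_only=OFF;\n' "$host"
        else
            printf '%s\tSET GLOBAL read_only=ON;\n' "$host"
        fi
    done
}
```

Calling `read_only_plan 10.16.24.108 10.16.24.108 10.16.24.109 10.16.24.107` prints the same three statements as above, one per host, which can then be run on each server with the mysql client.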
Problem 4:
Sat May 28 10:00:32 2016 831853 Set read_only=0 on the new master.
Sat May 28 10:00:32 2016 832417Add vip 10.16.24.58 on eth1..
RTNETLINK answers: Operation not permitted
Solution:
As root, run the following on every node so that the non-root MHA user can add and remove the VIP with ip:
chmod u+s /sbin/ip
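
Whether the setuid bit is actually in place on a node can be checked with the standard `-u` file test. The helper name has_setuid below is an assumption for illustration.

```shell
#!/bin/sh
# Succeed (exit status 0) if the file given as $1 has the setuid bit set.
has_setuid() {
    [ -u "$1" ]
}
```

On each node, `has_setuid /sbin/ip && echo "setuid set" || echo "setuid missing"` shows whether the chmod has been applied there.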
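Problem 5:
After a manual online switch with MHA, the VIP moved to the new master as expected, but connections made through the VIP from the other hosts still reached the slave running on that same host.
What is the cause?
Solution:
Run drop_vip.sh on all slaves.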
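
The contents of drop_vip.sh are not shown in this post. Assuming the symptom comes from the VIP still being bound on the slave hosts, a quick way to confirm it is to look for the address in the output of `ip -o -4 addr show`. The function name has_vip is hypothetical; the sketch only detects the address and does not remove it.

```shell
#!/bin/sh
# Succeed if the IPv4 address $1 appears in `ip -o -4 addr show`
# output supplied on stdin. The trailing "/" stops 10.16.24.58 from
# matching longer addresses such as 10.16.24.580-style prefixes.
has_vip() {
    grep -F -q " inet $1/"
}
```

On a slave, `ip -o -4 addr show | has_vip 10.16.24.58 && echo "VIP still bound here"` indicates that the local host still answers for the VIP, which would explain why connections to it never leave the machine.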