replication-manager: a replication cluster management component

replication-manager is an open-source database high-availability tool from signal18, written in Go. It supports MySQL, MariaDB, and Percona, and provides the following features:

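  • Replication monitoring
  • Replication topology detection
  • Master/slave switchover
  • Master failover
  • Zero data loss in most scenarios
  • Multi-cluster management
  • Proxy support (ProxySQL, HAProxy, etc.)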

Download the installation package

Install the package

[root@t-luhx01-v-szzb mysql]# rpm -ivh replication-manager-osc-2.0.2-1.x86_64.rpm

Create the data directory

[root@t-luhx01-v-szzb mysql]# mkdir manager
[root@t-luhx01-v-szzb mysql]# chown -R mysql.mysql manager

Edit the configuration file

[root@t-luhx01-v-szzb mysql]# cat /etc/replication-manager/config.toml
[mysql-luhx]
title = "mysql-luhx"
db-servers-hosts = "10.0.139.161:33006,10.0.139.162:33006,10.0.139.163:33006"
db-servers-prefered-master = "10.0.139.161:33006"
db-servers-ignored-hosts = "10.0.139.162:33006"
db-servers-credential = "dba:Abcd123#"
replication-credential = "repl:Abcd123#"
failover-mode = "automatic"
[Default]
monitoring-datadir = "/service/mysql/manager"
monitoring-sharedir = "/service/mysql/manager"
log-level=7
log-file = "/service/mysql/manager/replication-manager.log"
replication-multi-master = false
replication-multi-tier-slave = false
failover-readonly-state = true
http-server = true
http-bind-address = "0.0.0.0"
http-port = "8000"
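
The host options in the cluster section have to agree with each other, so a quick consistency check can catch typos before the service is started. The following is only a sketch and not part of replication-manager: the function name check_cluster is hypothetical, the section is passed in as a plain dictionary rather than read from config.toml, and it assumes that hosts listed in db-servers-ignored-hosts are excluded from master election.

```python
def check_cluster(section):
    """Return a list of human-readable problems found in one cluster section."""

    def split_hosts(value):
        # The options hold comma-separated "host:port" entries.
        return [h.strip() for h in value.split(",") if h.strip()]

    hosts = split_hosts(section.get("db-servers-hosts", ""))
    preferred = split_hosts(section.get("db-servers-prefered-master", ""))
    ignored = split_hosts(section.get("db-servers-ignored-hosts", ""))

    problems = []
    for host in preferred:
        if host not in hosts:
            problems.append(f"preferred master {host} is not in db-servers-hosts")
        # Assumption: an ignored host can never be elected as master.
        if host in ignored:
            problems.append(f"preferred master {host} is also in db-servers-ignored-hosts")
    for host in ignored:
        if host not in hosts:
            problems.append(f"ignored host {host} is not in db-servers-hosts")
    return problems


# Values copied from the [mysql-luhx] section above.
cluster = {
    "db-servers-hosts": "10.0.139.161:33006,10.0.139.162:33006,10.0.139.163:33006",
    "db-servers-prefered-master": "10.0.139.161:33006",
    "db-servers-ignored-hosts": "10.0.139.162:33006",
}
print(check_cluster(cluster))
```

An empty list means the three options are consistent with each other; it says nothing about whether the hosts are reachable or the credentials are valid.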

Start replication-manager

[root@t-luhx01-v-szzb mysql]# service replication-manager restart
Restarting replication-manager (via systemctl):            [  OK  ]

Access the dashboard (HTTP port 8000 in the configuration above)

switchover

A switchover can be triggered directly from the web console by clicking Switchover.

failover

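When the master is shut down, failover starts as soon as the fail count exceeds 5. When the old master recovers, it automatically rejoins the new cluster and becomes the master.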
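
The fail-count rule can be illustrated with a small counter. This is a minimal sketch of the idea, not replication-manager's actual implementation: the class name FailCounter and its methods are hypothetical, and resetting the counter after a successful check is an assumption.

```python
class FailCounter:
    """Counts consecutive failed master checks against a threshold."""

    def __init__(self, max_fail=5):
        self.max_fail = max_fail
        self.fail_count = 0

    def record(self, master_alive):
        """Record one health check; return True when failover should start."""
        if master_alive:
            # Assumption: a successful check resets the counter.
            self.fail_count = 0
            return False
        self.fail_count += 1
        # "Exceeds 5" is read strictly, so the sixth failure triggers.
        return self.fail_count > self.max_fail


counter = FailCounter(max_fail=5)
checks = [True] + [False] * 6  # one healthy check, then the master stays down
for number, alive in enumerate(checks, start=1):
    if counter.record(alive):
        print(f"check {number}: fail count {counter.fail_count} "
              f"exceeds {counter.max_fail}, start failover")
```

Counting consecutive failures instead of reacting to the first one avoids failing over on a single dropped connection, at the cost of a few monitoring intervals of delay.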

Open issues

After two consecutive switchovers, the cluster is very likely to end up with two masters. The relevant logs are as follows:

2020/08/04 09:56:53 [mysql-luhx] INFO  - --------------------------
2020/08/04 09:56:53 [mysql-luhx] INFO  - Starting master switchover
2020/08/04 09:56:53 [mysql-luhx] INFO  - --------------------------
2020/08/04 09:56:53 [mysql-luhx] INFO  - Checking long running updates on master 10
2020/08/04 09:56:53 [mysql-luhx] INFO  - Flushing tables on master 10.0.139.163:33006
2020/08/04 09:56:53 [mysql-luhx] INFO  - Electing a new master
2020/08/04 09:56:53 [mysql-luhx] DEBUG - Election rig: 10.0.139.161:33006 elected as preferred master
2020/08/04 09:56:53 [mysql-luhx] INFO  - Slave 10.0.139.161:33006 has been elected as a new master
2020/08/04 09:56:53 [mysql-luhx] INFO  - Terminating all threads on 10.0.139.163:33006
2020/08/04 09:56:53 [mysql-luhx] INFO  - Rejecting updates on 10.0.139.163:33006 (old master)
2020/08/04 09:56:53 [mysql-luhx] INFO  - Waiting for candidate master to apply relay log
2020/08/04 09:56:53 [mysql-luhx] INFO  - Reading all relay logs on 10.0.139.161:33006
2020/08/04 09:56:53 [mysql-luhx] INFO  - Waiting sync IO_Pos:mysql-bin.000002/794, Slave_Pos:mysql-bin.000002 794
2020/08/04 09:56:53 [mysql-luhx] DEBUG - Save replication status before electing
2020/08/04 09:56:53 [mysql-luhx] DEBUG - master_log_file=mysql-bin.000002
2020/08/04 09:56:53 [mysql-luhx] DEBUG - master_log_pos=794
2020/08/04 09:56:53 [mysql-luhx] DEBUG - Candidate was in sync=false
2020/08/04 09:56:53 [mysql-luhx] INFO  - Stopping slave thread on new master
2020/08/04 09:56:53 [mysql-luhx] INFO  - Resetting slave on new master and set read/write mode on
2020/08/04 09:56:53 [mysql-luhx] ERROR - Reset slave failed on new master, reason:Error 1290: The MySQL server is running with the --super-read-only option so it cannot execute this statement 
2020/08/04 09:56:53 [mysql-luhx] INFO  - Inject fake transaction on new master 10.0.139.161:33006 
2020/08/04 09:56:53 [mysql-luhx] INFO  - Switching old master as a slave
2020/08/04 09:56:53 [mysql-luhx] INFO  - Doing MySQL GTID switch of the old master
2020/08/04 09:56:53 [mysql-luhx] ERROR - Change master failed on old master Change master statement CHANGE MASTER TO master_host='10.0.139.161', master_port=33006, master_user='repl', master_password='Abcd123#', master_connect_retry=5, master_heartbeat_period=3, MASTER_AUTO_POSITION = 1 failed, reason: Error 1794: Slave is not configured or failed to initialize properly. You must at least set --server-id to enable either a master or a slave. Additional error messages can be found in the MySQL error log.
2020/08/04 09:56:53 [mysql-luhx] ERROR - Start slave failed on old master Error 1794: Slave is not configured or failed to initialize properly. You must at least set --server-id to enable either a master or a slave. Additional error messages can be found in the MySQL error log.
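
With log-level=7 the ERROR entries are buried among INFO and DEBUG lines, so filtering them out makes the failing steps easier to spot. The sketch below assumes every entry follows the "date time [cluster] LEVEL - message" layout seen above; the names LOG_LINE and errors_in are hypothetical.

```python
import re

# Matches entries such as:
# 2020/08/04 09:56:53 [mysql-luhx] INFO  - Electing a new master
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<cluster>[^\]]+)\] "
    r"(?P<level>[A-Z]+)\s+- "
    r"(?P<message>.*)$"
)


def errors_in(lines):
    """Yield (timestamp, cluster, message) for every ERROR entry."""
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        # Lines that do not match the layout (for example wrapped
        # continuation lines) are skipped.
        if match and match.group("level") == "ERROR":
            yield match.group("ts"), match.group("cluster"), match.group("message")


# Three entries copied from the log above.
sample = [
    "2020/08/04 09:56:53 [mysql-luhx] INFO  - Electing a new master",
    "2020/08/04 09:56:53 [mysql-luhx] DEBUG - Election rig: 10.0.139.161:33006 elected as preferred master",
    "2020/08/04 09:56:53 [mysql-luhx] ERROR - Reset slave failed on new master, reason:Error 1290: "
    "The MySQL server is running with the --super-read-only option so it cannot execute this statement",
]
for ts, cluster_name, message in errors_in(sample):
    print(ts, cluster_name, message)
```

To scan the real file, the same function can be fed an open file object, for example the log-file path set in the configuration.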


[root@t-luhx01-v-szzb manager]# cat mysql-error.log
2020-08-03T08:44:48.501273Z 1115 [ERROR] Error reading packet from server for channel '': Lost connection to MySQL server during query (server_errno=2013)
2020-08-03T08:44:48.501354Z 1115 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2020-08-03T08:44:48.595808Z 1387 [Warning] Info table is not ready to be used. Table 'mysql.slave_master_info' cannot be opened.
2020-08-03T08:44:48.595841Z 1387 [ERROR] Error in checking mysql.slave_master_info repository info type of TABLE.
2020-08-03T08:44:48.595854Z 1387 [ERROR] Error creating master info: Error checking repositories.
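
One way to confirm a suspected dual-master state is to compare the read_only setting of every node, for example the result of SELECT @@global.read_only collected by hand on each server. The helper below is a hypothetical sketch that only evaluates such collected values and does not connect to MySQL; it also assumes that a node with read_only off is acting as a master, which is a simplification.

```python
def writable_nodes(read_only_by_host):
    """Return the hosts whose read_only flag is off, in sorted order."""
    return sorted(host for host, read_only in read_only_by_host.items() if not read_only)


def has_dual_master(read_only_by_host):
    """True when more than one node accepts writes."""
    return len(writable_nodes(read_only_by_host)) > 1


# Illustrative values only, not measurements from the cluster above.
observed = {
    "10.0.139.161:33006": False,
    "10.0.139.162:33006": True,
    "10.0.139.163:33006": False,
}
if has_dual_master(observed):
    print("more than one writable node:", ", ".join(writable_nodes(observed)))
```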