MySQL Installation and Deployment: MHA

Introduction to MHA

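MHA (Master High Availability) is currently a relatively mature solution for MySQL high availability. During a MySQL failover, MHA can complete the switch automatically within 0 to 30 seconds while preserving data consistency as far as possible.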

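The software consists of MHA Manager and MHA Node. MHA Manager can be deployed on a dedicated machine to manage multiple master-slave clusters, or on one of the slave nodes. MHA Node is deployed on every database node. MHA Manager periodically probes the master of each cluster; when the master fails, it automatically promotes the slave with the most recent data to master and repoints the other surviving slaves at it.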

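During an automatic failover, MHA tries to copy the binary logs from the crashed master so that as little data as possible is lost. If the master machine is powered off or cannot be reached over SSH, however, MHA cannot fetch the binary logs, and the failover loses any events that had not yet reached a slave. Combining MHA with MySQL semi-synchronous replication greatly reduces the risk of data loss.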

How it works

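  1. Save the binary log events (binlog events) from the crashed master;
  2. Identify the slave that holds the most recent updates;
  3. Apply the differential relay log events to the other slaves;
  4. Apply the binary log events saved from the master;
  5. Promote the most up-to-date slave to be the new master;
  6. Point the other surviving slaves in the cluster at the new master.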
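
Step 2 can be illustrated with a small Python sketch. It assumes position-based replication, where the slave that has received the most from the master is the one with the highest Master_Log_File / Read_Master_Log_Pos pair in SHOW SLAVE STATUS. The SlaveState class and the function names are hypothetical and are not part of MHA; this only illustrates the comparison, not MHA's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class SlaveState:
    """Hypothetical snapshot of the fields we need from SHOW SLAVE STATUS."""
    host: str
    master_log_file: str       # e.g. "mysql-bin.000001"
    read_master_log_pos: int   # bytes of that file received by the I/O thread


def binlog_sequence(file_name: str) -> int:
    """Return the numeric suffix of a binlog file name (mysql-bin.000002 -> 2)."""
    return int(file_name.rsplit(".", 1)[1])


def latest_slave(slaves):
    """Pick the slave that has received the most binlog data from the master."""
    return max(
        slaves,
        key=lambda s: (binlog_sequence(s.master_log_file), s.read_master_log_pos),
    )
```

The file sequence number is compared first and the byte offset second, because a small offset in a newer binlog file is still ahead of any offset in an older file.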

MHA components

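  • masterha_check_ssh: checks the SSH setup for MHA; MHA requires passwordless SSH trust between the data nodes
  • masterha_check_repl: checks the MySQL master-slave replication status
  • masterha_manager: starts the MHA service
  • masterha_check_status: reports the current MHA run state, including the service status and the current master
  • masterha_master_monitor: monitors whether the master is down
  • masterha_master_switch: controls failover (automatic or manual)
  • masterha_conf_host: adds or removes server entries in the configuration
  • save_binary_logs: saves and copies the master's binary logs
  • apply_diff_relay_logs: identifies differential relay log events and applies them to the other slaves
  • purge_relay_logs: purges relay logs
  • secondary_check_script: checks the master's availability over multiple network routes
  • master_ip_failover_script: moves the cluster VIP during failover
  • shutdown_script: forcibly shuts down the master node
  • report_script: sends a report
  • init_conf_load_script: loads initial configuration parameters
  • master_ip_online_change_script: updates the master's IP address during an online switch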

Installing and deploying MHA

MySQL master-slave configuration

For installing the MySQL instances, see MySQL Installation and Deployment (Part 1).

Adjust kernel parameters

$ echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf
$ sysctl -p

Create the replication user on the master

root@(none) 15:16> GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'repl'@'10.240.204.%' identified by 'Abcd123#';
root@(none) 15:16> flush privileges;

Point the slaves at the master

root@(none) 15:16> change master to master_host='10.240.204.157', MASTER_PORT=3306,master_user='repl',master_password='Abcd123#',master_auto_position=1;
root@(none) 15:16> start slave;

Check the replication status

root@(none) 15:23> show slave status\G
*************************** 1. row ***************************
            Slave_IO_State: Waiting for master to send event
            Master_Host: 10.240.204.157
            Master_User: repl
            Master_Port: 33006
            Connect_Retry: 60
            Master_Log_File: mysql-bin.000001
            Read_Master_Log_Pos: 1012
            Relay_Log_File: mysql-relay-bin.000002
            Relay_Log_Pos: 1225
            Relay_Master_Log_File: mysql-bin.000001
            Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
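
The two fields that matter most in this output are Slave_IO_Running and Slave_SQL_Running; both should be Yes. As a rough sketch, the check could be automated in Python by parsing the \G-style text. The function names here are hypothetical, and the parser only assumes the "Key: value" layout shown above.

```python
def parse_slave_status(text: str) -> dict:
    """Turn the vertical (\\G) output of SHOW SLAVE STATUS into a dict."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Lines without a colon (the row separator, blank lines) are skipped.
        if sep:
            status[key.strip()] = value.strip()
    return status


def replication_healthy(status: dict) -> bool:
    """Replication is considered healthy only if both threads report Yes."""
    return (
        status.get("Slave_IO_Running") == "Yes"
        and status.get("Slave_SQL_Running") == "Yes"
    )
```

A monitoring script might feed it the output of `mysql -e 'show slave status\G'` and alert whenever replication_healthy returns False.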

MHA configuration

Create the MHA user on the master

root@(none) 15:23> GRANT ALL PRIVILEGES ON *.* TO 'mha'@'10.240.204.%' identified by 'Abcd123#';

Create the MHA directories

$ mkdir -p /usr/local/mha
$ mkdir -p /etc/mha

Install the MHA packages in order (manager node)

$ yum install mha4mysql-node-0.56-0.el6.noarch.rpm -y
$ yum install perl-Config-Tiny-2.14-7.el7.noarch.rpm -y
$ yum install perl-Email-Date-Format-1.002-15.el7.noarch.rpm -y
$ yum install perl-Mail-Sender-0.8.23-1.el7.noarch.rpm -y
$ yum install perl-Mail-Sendmail-0.79-21.el7.noarch.rpm -y
$ yum install perl-MIME-Types-1.38-2.el7.noarch.rpm -y
$ yum install perl-MIME-Lite-3.030-1.el7.noarch.rpm -y
$ yum install perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm -y
$ yum install perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm -y
$ yum install mha4mysql-manager-0.56-0.el6.noarch.rpm -y
$ yum install perl-DBD-MySQL -y

Install MHA Node on the other nodes

$ yum install mha4mysql-node-0.56-0.el6.noarch.rpm

Configure SSH trust (run once on every node)

$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
ed:e3:6c:1f:8c:95:2e:82:5a:5c:e5:80:04:1c:83:ab root@nvvmysql1-v-szzb
The key's randomart image is:
+--[ RSA 2048]----+
|   o+o.          |
|  . .o .         |
|   .  . . .      |
|  .      =   .   |
| .      S o o    |
|E    . o . =     |
|      + . = +    |
|     o   +.o .   |
|    .    .o..    |
+-----------------+

$ ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.240.204.157
$ ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.240.204.165
$ ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.240.204.149

Configure the MHA parameters (/etc/mha/app1.conf)

[server default]
user=mha
password=Abcd123#
manager_workdir=/usr/local/mha
manager_log=/usr/local/mha/manager.log
remote_workdir=/usr/local/mha
master_ip_failover_script= /usr/local/mha/master_ip_failover
ssh_user=root
repl_user=repl
repl_password=Abcd123#
ping_interval=1
[server1]
hostname=10.240.204.157
port=3306
master_binlog_dir=/service/binlog
candidate_master=1
[server2]
hostname=10.240.204.149
port=3306
master_binlog_dir=/service/binlog
no_master=1
[server3]
hostname=10.240.204.165
port=3306
master_binlog_dir=/service/binlog
candidate_master=1

Tips: a server with no_master=1 will never be promoted to master.
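
To double-check which servers can be promoted before starting the manager, the configuration can be read with Python's standard configparser module, since the file uses an INI-style layout. This is only a sketch: promotion_roles and the role labels are hypothetical names, and it looks at nothing but the no_master and candidate_master flags shown above.

```python
import configparser


def promotion_roles(conf_text: str) -> dict:
    """Map each server's hostname to its promotion role in an MHA config."""
    # interpolation=None keeps characters such as % in passwords literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    roles = {}
    for section in parser.sections():
        if section == "server default":
            continue  # global settings, not a server entry
        server = parser[section]
        if server.get("no_master") == "1":
            role = "never promoted"
        elif server.get("candidate_master") == "1":
            role = "preferred candidate"
        else:
            role = "ordinary slave"
        roles[server["hostname"]] = role
    return roles
```

With the configuration above, 10.240.204.157 and 10.240.204.165 would be reported as preferred candidates and 10.240.204.149 as never promoted.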

Configure the failover script (/usr/local/mha/master_ip_failover)

	#!/usr/bin/env perl  
	use strict;  
	use warnings FATAL =>'all';  
	
	use Getopt::Long;  
	
	my (  
	$command,          $ssh_user,        $orig_master_host, $orig_master_ip,  
	$orig_master_port, $new_master_host, $new_master_ip,    $new_master_port  
	);  
	
	my $vip = '10.240.204.175/32';  
	my $key = "1";  
	my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";  
	my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down";  
	my $exit_code = 0;  
	
	GetOptions(  
	'command=s'          => \$command,  
	'ssh_user=s'         => \$ssh_user,  
	'orig_master_host=s' => \$orig_master_host,  
	'orig_master_ip=s'   => \$orig_master_ip,  
	'orig_master_port=i' => \$orig_master_port,  
	'new_master_host=s'  => \$new_master_host,  
	'new_master_ip=s'    => \$new_master_ip,  
	'new_master_port=i'  => \$new_master_port,  
	);  
	
	exit &main();  
	
	sub main {  
	
	#print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";  
	
	if ( $command eq "stop" || $command eq "stopssh" ) {  
	
	        # $orig_master_host, $orig_master_ip, $orig_master_port are passed.  
	        # If you manage master ip address at global catalog database,  
	        # invalidate orig_master_ip here.  
	        my $exit_code = 1;  
	        eval {  
	            print "\n\n\n***************************************************************\n";  
	            print "Disabling the VIP - $vip on old master: $orig_master_host\n";  
	            print "***************************************************************\n\n\n\n";  
	&stop_vip();  
	            $exit_code = 0;  
	        };  
	        if ($@) {  
	            warn "Got Error: $@\n";  
	            exit $exit_code;  
	        }  
	        exit $exit_code;  
	}  
	elsif ( $command eq "start" ) {  
	
	        # all arguments are passed.  
	        # If you manage master ip address at global catalog database,  
	        # activate new_master_ip here.  
	        # You can also grant write access (create user, set read_only=0, etc) here.  
	my $exit_code = 10;  
	        eval {  
	            print "\n\n\n***************************************************************\n";  
	            print "Enabling the VIP - $vip on new master: $new_master_host \n";  
	            print "***************************************************************\n\n\n\n";  
	&start_vip();  
	            $exit_code = 0;  
	        };  
	        if ($@) {  
	            warn $@;  
	            exit $exit_code;  
	        }  
	        exit $exit_code;  
	}  
	elsif ( $command eq "status" ) {  
	        print "Checking the Status of the script.. OK \n";  
	        `ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;  
	        exit 0;  
	}  
	else {  
	&usage();  
	        exit 1;  
	}  
	}  
	
	# A simple system call that enable the VIP on the new master  
	sub start_vip() {  
	`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;  
	}  
	# A simple system call that disable the VIP on the old_master  
	sub stop_vip() {  
	`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;  
	}  
	
	sub usage {  
	print  
	"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";  
	}

Add execute permission

$ chmod +x /usr/local/mha/master_ip_failover

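Tips: $vip is the virtual IP address that applications connect to; during a switch the VIP is moved to the new master. The eth0 in ssh_start_vip and ssh_stop_vip is the network interface name; change it to match your environment.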
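
Because the interface name is the part most likely to differ between machines, it helps to see how the two commands are put together. The Python sketch below rebuilds the same strings as the Perl variables $ssh_start_vip and $ssh_stop_vip with the interface as a parameter; vip_commands is a hypothetical helper and is not part of MHA.

```python
def vip_commands(vip: str, interface: str = "eth0", key: str = "1") -> dict:
    """Build the ifconfig commands that add and remove the VIP alias."""
    alias = f"{interface}:{key}"  # e.g. eth0:1, an alias on the physical NIC
    return {
        "start": f"/sbin/ifconfig {alias} {vip}",
        "stop": f"/sbin/ifconfig {alias} down",
    }
```

For a host whose interface is named ens192 (an assumed name for illustration), vip_commands("10.240.204.175/32", interface="ens192") yields the strings to put into the script in place of the eth0 versions.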

Verify SSH connectivity

$ masterha_check_ssh --conf=/etc/mha/app1.conf 
Wed Aug 23 15:48:27 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Aug 23 15:48:27 2017 - [info] Reading application default configuration from /etc/mha/app1.conf..
Wed Aug 23 15:48:27 2017 - [info] Reading server configuration from /etc/mha/app1.conf..
Wed Aug 23 15:48:27 2017 - [info] Starting SSH connection tests..
Wed Aug 23 15:48:27 2017 - [debug] 
Wed Aug 23 15:48:27 2017 - [debug]  Connecting via SSH from root@10.240.204.157(10.240.204.157:22) to root@10.240.204.165(10.240.204.165:22)..
Wed Aug 23 15:48:27 2017 - [debug]   ok.
Wed Aug 23 15:48:27 2017 - [debug]  Connecting via SSH from root@10.240.204.157(10.240.204.157:22) to root@10.240.204.149(10.240.204.149:22)..
Wed Aug 23 15:48:27 2017 - [debug]   ok.
Wed Aug 23 15:48:28 2017 - [debug] 
Wed Aug 23 15:48:27 2017 - [debug]  Connecting via SSH from root@10.240.204.165(10.240.204.165:22) to root@10.240.204.157(10.240.204.157:22)..
Wed Aug 23 15:48:27 2017 - [debug]   ok.
Wed Aug 23 15:48:27 2017 - [debug]  Connecting via SSH from root@10.240.204.165(10.240.204.165:22) to root@10.240.204.149(10.240.204.149:22)..
Wed Aug 23 15:48:27 2017 - [debug]   ok.
Wed Aug 23 15:48:28 2017 - [debug] 
Wed Aug 23 15:48:28 2017 - [debug]  Connecting via SSH from root@10.240.204.149(10.240.204.149:22) to root@10.240.204.157(10.240.204.157:22)..
Wed Aug 23 15:48:28 2017 - [debug]   ok.
Wed Aug 23 15:48:28 2017 - [debug]  Connecting via SSH from root@10.240.204.149(10.240.204.149:22) to root@10.240.204.165(10.240.204.165:22)..
Wed Aug 23 15:48:28 2017 - [debug]   ok.
Wed Aug 23 15:48:28 2017 - [info] All SSH connection tests passed successfully.
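
The log shows that every node is tested against every other node in both directions, which is why the key has to be distributed from each node. A small Python sketch of that pairing, using itertools.permutations (ssh_check_pairs is a hypothetical name, and this mirrors only what the log above shows, not MHA's code):

```python
from itertools import permutations


def ssh_check_pairs(hosts):
    """Return every ordered (source, destination) pair of distinct hosts."""
    return list(permutations(hosts, 2))
```

For the three nodes in this setup that gives 3 x 2 = 6 connections, matching the six "ok." lines in the output.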

Verify replication status

$ masterha_check_repl --conf=/etc/mha/app1.conf 
Wed Aug 23 15:48:37 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Aug 23 15:48:37 2017 - [info] Reading application default configuration from /etc/mha/app1.conf..
Wed Aug 23 15:48:37 2017 - [info] Reading server configuration from /etc/mha/app1.conf..
Wed Aug 23 15:48:37 2017 - [info] MHA::MasterMonitor version 0.56.
Wed Aug 23 15:48:37 2017 - [info] GTID failover mode = 1
Wed Aug 23 15:48:37 2017 - [info] Dead Servers:
Wed Aug 23 15:48:37 2017 - [info] Alive Servers:
Wed Aug 23 15:48:37 2017 - [info]   10.240.204.157(10.240.204.157:33006)
Wed Aug 23 15:48:37 2017 - [info]   10.240.204.165(10.240.204.165:33006)
Wed Aug 23 15:48:37 2017 - [info]   10.240.204.149(10.240.204.149:33006)
…………………………
10.240.204.157(10.240.204.157:33006) (current master)
  +--10.240.204.165(10.240.204.165:33006)
  +--10.240.204.149(10.240.204.149:33006)

Wed Aug 23 15:48:37 2017 - [info] Checking replication health on 10.240.204.165..
Wed Aug 23 15:48:37 2017 - [info]  ok.
Wed Aug 23 15:48:37 2017 - [info] Checking replication health on 10.240.204.149..
Wed Aug 23 15:48:37 2017 - [info]  ok.
Wed Aug 23 15:48:37 2017 - [info] Checking master_ip_failover_script status:
Wed Aug 23 15:48:37 2017 - [info]   /usr/local/mha/master_ip_failover --command=status --ssh_user=root --orig_master_host=10.240.204.157 --orig_master_ip=10.240.204.157 --orig_master_port=33006 
Checking the Status of the script.. OK 
Wed Aug 23 15:48:37 2017 - [info]  OK.
Wed Aug 23 15:48:37 2017 - [warning] shutdown_script is not defined.
Wed Aug 23 15:48:37 2017 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

Disable automatic relay log purging

root> set global relay_log_purge = 0;

Note: my.cnf must be changed as well (relay_log_purge = 0) so that the setting survives a restart.

Create the relay log purge script

$ cat /root/purge_relay_log.sh
#!/bin/bash
user=root
passwd=Abcd123#
port=3306
log_dir='/etc/mha/log'
work_dir='/service/data'
purge='/usr/bin/purge_relay_logs'

if [ ! -d $log_dir ]
then
   mkdir $log_dir -p
fi

$purge --user=$user --password=$passwd --disable_relay_log_purge --port=$port --workdir=$work_dir >> $log_dir/purge_relay_logs.log 2>&1

Add a cron job

$ crontab -e
0 2 * * * /bin/bash /root/purge_relay_log.sh

Start MHA

nohup /usr/bin/masterha_manager --conf=/etc/mha/app1.conf  &

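Manual switch: when a database upgrade, bug fix, or similar operation requires it, the master of the MHA-managed cluster has to be switched manually.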

masterha_master_switch --conf=/etc/mha/app1.conf --master_state=dead --dead_master_host=10.240.204.157
masterha_master_switch --conf=/etc/mha/app1.conf 

Appendix

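MHA service stops after a failover: once the master failover completes, the manager monitoring process exits automatically. The official explanation is as follows: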

Currently MHA Manager process does not run as a daemon. If failover completed successfully or the master process was killed by accident, the manager stops working. To run as a daemon, daemontools or any external daemon program can be used. Here is an example to run from daemontools.

If the old master comes back up, GTID makes it possible to repoint it at the new master with CHANGE MASTER; after that, the manager process can be started again.
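
As a sketch of that rejoin step, the Python helper below builds the statements to run on the restarted old master. They mirror the CHANGE MASTER command used earlier for the slaves, with master_auto_position=1 so that GTID works out the starting point. rejoin_statements is a hypothetical helper, and which host became the new master is an assumption for illustration; take the real value from the manager log.

```python
def rejoin_statements(new_master_host: str, port: int,
                      repl_user: str, repl_password: str) -> list:
    """Build the SQL that makes a restarted old master replicate from the new one."""
    change_master = (
        f"CHANGE MASTER TO master_host='{new_master_host}', "
        f"master_port={port}, "
        f"master_user='{repl_user}', "
        f"master_password='{repl_password}', "
        "master_auto_position=1;"
    )
    return [change_master, "START SLAVE;"]
```

After running the statements and confirming with show slave status\G that both replication threads are running, the manager can be started again with the nohup command shown earlier.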

Manual switch: when a database upgrade, bug fix, or similar operation requires a manual master switch, masterha_master_switch can perform it online.

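Create the master_ip_online_change script (MHA calls it only if the configuration points to it through the master_ip_online_change_script parameter)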

$ cat master_ip_online_change
#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;
use MHA::NodeUtil;
use Time::HiRes qw( sleep gettimeofday tv_interval );
use Data::Dumper;

my $_tstart;
my $_running_interval = 0.1;
my (
  $command,          $orig_master_host, $orig_master_ip,
  $orig_master_port, $orig_master_user, 
  $new_master_host,  $new_master_ip,    $new_master_port,
  $new_master_user,  
);


my $vip = '10.240.204.175/32';  # Virtual IP 
my $key = "1"; 
my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down";
my $ssh_user = "root";
my $new_master_password='Abcd123#';
my $orig_master_password='Abcd123#';
GetOptions(
  'command=s'              => \$command,
  #'ssh_user=s'             => \$ssh_user,  
  'orig_master_host=s'     => \$orig_master_host,
  'orig_master_ip=s'       => \$orig_master_ip,
  'orig_master_port=i'     => \$orig_master_port,
  'orig_master_user=s'     => \$orig_master_user,
  #'orig_master_password=s' => \$orig_master_password,
  'new_master_host=s'      => \$new_master_host,
  'new_master_ip=s'        => \$new_master_ip,
  'new_master_port=i'      => \$new_master_port,
  'new_master_user=s'      => \$new_master_user,
  #'new_master_password=s'  => \$new_master_password,
);

exit &main();

sub current_time_us {
  my ( $sec, $microsec ) = gettimeofday();
  my $curdate = localtime($sec);
  return $curdate . " " . sprintf( "%06d", $microsec );
}

sub sleep_until {
  my $elapsed = tv_interval($_tstart);
  if ( $_running_interval > $elapsed ) {
    sleep( $_running_interval - $elapsed );
  }
}

sub get_threads_util {
  my $dbh                    = shift;
  my $my_connection_id       = shift;
  my $running_time_threshold = shift;
  my $type                   = shift;
  $running_time_threshold = 0 unless ($running_time_threshold);
  $type                   = 0 unless ($type);
  my @threads;

  my $sth = $dbh->prepare("SHOW PROCESSLIST");
  $sth->execute();

  while ( my $ref = $sth->fetchrow_hashref() ) {
    my $id         = $ref->{Id};
    my $user       = $ref->{User};
    my $host       = $ref->{Host};
    my $command    = $ref->{Command};
    my $state      = $ref->{State};
    my $query_time = $ref->{Time};
    my $info       = $ref->{Info};
    $info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);
    next if ( $my_connection_id == $id );
    next if ( defined($query_time) && $query_time < $running_time_threshold );
    next if ( defined($command)    && $command eq "Binlog Dump" );
    next if ( defined($user)       && $user eq "system user" );
    next
      if ( defined($command)
      && $command eq "Sleep"
      && defined($query_time)
      && $query_time >= 1 );

    if ( $type >= 1 ) {
      next if ( defined($command) && $command eq "Sleep" );
      next if ( defined($command) && $command eq "Connect" );
    }

    if ( $type >= 2 ) {
      next if ( defined($info) && $info =~ m/^select/i );
      next if ( defined($info) && $info =~ m/^show/i );
    }

    push @threads, $ref;
  }
  return @threads;
}

sub main {
  if ( $command eq "stop" ) {
    ## Gracefully killing connections on the current master
    # 1. Set read_only= 1 on the new master
    # 2. DROP USER so that no app user can establish new connections
    # 3. Set read_only= 1 on the current master
    # 4. Kill current queries
    # * Any database access failure will result in script die.
    my $exit_code = 1;
    eval {
      ## Setting read_only=1 on the new master (to avoid accident)
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error(die_on_error)_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );
      print current_time_us() . " Set read_only on the new master.. ";
      $new_master_handler->enable_read_only();
      if ( $new_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }
      $new_master_handler->disconnect();

      # Connecting to the orig master, die if any database error happens
      my $orig_master_handler = new MHA::DBHelper();
      $orig_master_handler->connect( $orig_master_ip, $orig_master_port,
        $orig_master_user, $orig_master_password, 1 );

      ## Drop application user so that nobody can connect. Disabling per-session binlog beforehand
      #$orig_master_handler->disable_log_bin_local();
      #print current_time_us() . " Dropping app user on the orig master..\n";
      #FIXME_xxx_drop_app_user($orig_master_handler);

      ## Waiting for N * 100 milliseconds so that current connections can exit
      my $time_until_read_only = 15;
      $_tstart = [gettimeofday];
      my @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_read_only > 0 && $#threads >= 0 ) {
        if ( $time_until_read_only % 5 == 0 ) {
          printf
"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_read_only * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_read_only--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Setting read_only=1 on the current master so that nobody(except SUPER) can write
      print current_time_us() . " Set read_only=1 on the orig master.. ";
      $orig_master_handler->enable_read_only();
      if ( $orig_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }

      ## Waiting for M * 100 milliseconds so that current update queries can complete
      my $time_until_kill_threads = 5;
      @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {
        if ( $time_until_kill_threads % 5 == 0 ) {
          printf
"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_kill_threads * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_kill_threads--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }



                print "Disabling the VIP on old master: $orig_master_host \n";
                &stop_vip();     


      ## Terminating all threads
      print current_time_us() . " Killing all application threads..\n";
      $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
      print current_time_us() . " done.\n";
      #$orig_master_handler->enable_log_bin_local();
      $orig_master_handler->disconnect();

      ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {
    ## Activating master ip on the new master
    # 1. Create app user with write privileges
    # 2. Moving backup script if needed
    # 3. Register new master's ip to the catalog database

# We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.
# If exit code is 0 or 10, MHA does not abort
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      #$new_master_handler->disable_log_bin_local();
      print current_time_us() . " Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master
      #print current_time_us() . " Creating app user on the new master..\n";
      #FIXME_xxx_create_app_user($new_master_handler);
      #$new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc
                print "Enabling the VIP - $vip on the new master - $new_master_host \n";
                &start_vip();
                $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {

    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

# A simple system call that enable the VIP on the new master 
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
  print
"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
  die;
}

Run the switch

masterha_master_switch --conf=/etc/mha/app1.conf --master_state=alive --new_master_host=t-luhx01 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=10000
Licensed under CC BY-NC-SA 4.0