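Introduction to Codis
Besides the official Redis Cluster, Redis also has community clustering solutions such as Codis and Twemproxy. Codis was developed by the Wandoujia team in China.
Differences between the three solutions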
| | Codis | Twemproxy | Redis Cluster |
|---|---|---|---|
| Resharding without restarting cluster | YES | NO | YES |
| Pipeline | YES | YES | NO |
| Hash tags for multi-key operations | YES | YES | YES |
| Multi-key operations while resharding | YES | - | NO |
| Redis clients supporting | Any clients | Any clients | Clients have to support cluster protocol |
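Codis is written in Go and consists of the following components:
- Codis Server: a fork of Redis that adds extra data structures and slot-related data migration commands.
- Codis Proxy: the Redis proxy service. Apart from a few unsupported commands, it behaves essentially the same as native Redis. A Codis cluster can run multiple Codis Proxy instances, which Codis Dashboard keeps in sync.
- Codis Dashboard: the cluster management tool. It supports adding and removing Codis Proxy and Codis Server instances and migrating data. Codis Dashboard keeps the state of all Codis Proxy instances in the cluster consistent. Every change to the cluster must go through Codis Dashboard, and a cluster can have only one Codis Dashboard.
- Codis Admin: the command-line tool for cluster management. It can control the state of Codis Proxy and Codis Dashboard and access the external storage.
- Codis FE: the cluster management web UI. Multiple clusters can share one front end.
- Storage: external storage for the cluster state.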
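How Codis shards data
Codis divides all keys into 1024 slots, and these slots are mapped onto the Redis instances of the cluster. Codis keeps the mapping between the 1024 slots and the Redis instances in memory, and the mapping can be set manually. To place a key, Codis first computes its CRC32, which yields a 32-bit number, and then takes hash % 1024. The remainder is the slot of the key.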
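The slot calculation described above can be sketched in shell. This is a minimal sketch, not the code Codis runs: the function name codis_slot is hypothetical, it assumes the CRC32 in question is the common IEEE variant, that gzip, tail and od are installed, and that the host is little-endian. It also ignores any hash-tag handling. It relies on the fact that a gzip stream ends with the CRC32 of its uncompressed input.

```shell
#!/usr/bin/env bash
# Sketch: map a key to a slot as crc32(key) % 1024.
# codis_slot is a hypothetical helper name, not a Codis command.
codis_slot() {
    local key=$1 crc
    # The last 8 bytes of a gzip stream are the CRC32 of the input followed
    # by the input size, both little-endian. Read the first 4 as an unsigned int.
    crc=$(printf '%s' "$key" | gzip -c | tail -c 8 | od -An -N4 -tu4 | tr -d ' ')
    echo $(( crc % 1024 ))
}

# "123456789" is the standard CRC32 check string (CRC32 = 0xCBF43926).
codis_slot "123456789"   # prints 294
```

Running the function for a few real keys gives a quick way to see which slot, and therefore which group, a key would land in.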
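Scaling out Codis
Because Codis is only an intermediate proxy, scaling out is a matter of adding Redis instances. Slots can then be assigned to them manually through Codis Dashboard. Codis implements the slotsscan command, which scans the keys in a slot so that they can be migrated to the new Redis instance. During a migration, both the current node and the new node hold the information about the slot being migrated. When a key is added to that slot, Codis forces it to be migrated to the new node and directs subsequent new keys to the new node.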
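Codis features
- Existing deployments can migrate to Codis seamlessly
- Capacity can be scaled out or in dynamically
- Completely transparent to the application
- Uses multiple CPU cores, whereas Twemproxy is limited to a single core
- Some Redis commands are not supported, for example keys *
- Supports groups. Each group can have a master and replicas and can be monitored by Sentinel
Unsupported Redis commands
Codis does not support some Redis commands. For the full list, see the unsupported commands document in the Codis documentation.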
Deploying Codis
Resource list
| IP | Service | Port |
|---|---|---|
| 10.240.204.157 | Codis Server | 10001/10002 |
| | Codis Sentinel | 10003 |
| | Codis Proxy | 11000/19000 |
| | Codis Dashboard (157) | 18080 |
| | Codis FE (157) | 18090 |
| | ZooKeeper client port | 21000 |
| | ZooKeeper internal communication port | 21001 |
| | ZooKeeper election port | 21002 |
Installing Go
Extract the installation package
    $ tar -xvf go1.8.5.linux-amd64.tar.gz -C /usr/local
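Add the environment variables (the single quotes keep $PATH and $GOROOT from being expanded before they are written to /etc/profile)
    $ echo 'export GOROOT=/usr/local/go' >> /etc/profile
    $ echo 'export PATH=$PATH:$GOROOT/bin' >> /etc/profile
    $ source /etc/profile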
Check the Go version
    $ go version
    go version go1.8.5 linux/amd64
Installing ZooKeeper
Install the JDK and verify the version
    $ java -version
    openjdk version "1.8.0_171"
    OpenJDK Runtime Environment (build 1.8.0_171-b10)
    OpenJDK 64-Bit Server VM (build 25.171-b10, mixed mode)
Extract ZooKeeper
    $ tar -xvf zookeeper-3.4.14.tar.gz
    $ mv zookeeper-3.4.14 /usr/local/zookeeper
Create the required directories
    $ mkdir -p /service/zookeeper/data
    $ mkdir -p /service/zookeeper/log
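Set the environment variables (single-quoted so that the variables are expanded when /etc/profile is sourced, not when the line is written)
    $ echo 'export ZOOKEEPER_HOME=/usr/local/zookeeper' >> /etc/profile
    $ echo 'export PATH=$PATH:$ZOOKEEPER_HOME/bin' >> /etc/profile
    $ source /etc/profile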
Edit the configuration file (/usr/local/zookeeper/conf/zoo.cfg)
    tickTime=2000
    initLimit=10
    syncLimit=5
    dataDir=/service/zookeeper/data
    dataLogDir=/service/zookeeper/log
    clientPort=21000
    #maxClientCnxns=60
    #autopurge.snapRetainCount=3
    #autopurge.purgeInterval=1
    server.1=10.240.204.157:21001:21002
    server.2=10.240.204.165:21001:21002
    server.3=10.240.204.149:21001:21002
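Configuration parameters
- tickTime: the basic time unit used by ZooKeeper, in milliseconds. 2000 by default
- initLimit: 10 by default, that is, 10 times tickTime. The maximum time followers are allowed to connect to and sync with the leader
- syncLimit: 5 by default, that is, 5 times tickTime. The maximum delay allowed for heartbeats between the leader and followers
- dataDir: the directory where ZooKeeper stores snapshots of its in-memory database. Unless another directory is specified, the transaction logs of database updates are stored here as well
- dataLogDir: the transaction log directory. Defaults to dataDir when not set
- clientPort: the port the service listens on for clients. 2181 by default
- maxClientCnxns: limits, at the socket level, the number of concurrent connections a single client (identified by IP address) may open to a single server. 60 by default, and 0 removes the limit entirely
- autopurge.snapRetainCount: the number of most recent snapshots, and their corresponding transaction logs, that ZooKeeper keeps when it purges automatically. 3 by default
- autopurge.purgeInterval: how often, in hours, ZooKeeper runs the automatic purge. The default is 0 (disabled), and a value of 1 or greater enables it
- server.id: the cluster member definition. Here 21001 is the internal communication port and 21002 is the election port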
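Because initLimit and syncLimit are expressed in ticks, the effective time windows for the zoo.cfg shown above follow from a simple multiplication. A small shell sketch (the function name zk_window_ms is made up for illustration and is not a ZooKeeper tool):

```shell
#!/usr/bin/env bash
# Convert a ZooKeeper limit expressed in ticks into milliseconds.
# zk_window_ms is a hypothetical helper name.
zk_window_ms() {
    local tick_time_ms=$1 limit_ticks=$2
    echo $(( tick_time_ms * limit_ticks ))
}

# Values taken from the zoo.cfg shown above.
echo "initLimit window: $(zk_window_ms 2000 10) ms"   # 20000 ms
echo "syncLimit window: $(zk_window_ms 2000 5) ms"    # 10000 ms
```

With these settings a follower has 20 seconds to connect and sync, and a heartbeat may lag by at most 10 seconds.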
Create the myid file on each node. Its value corresponds to the server.id entry for that node in zoo.cfg
    [root@t-luhxdb01-p-szzb media]# echo "1" >/service/zookeeper/data/myid
    [root@t-luhxdb02-p-szzb media]# echo "2" >/service/zookeeper/data/myid
    [root@t-luhxdb03-p-szzb media]# echo "3" >/service/zookeeper/data/myid
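Instead of typing the id by hand on each host, it can be looked up from the server.N lines of zoo.cfg. This is a minimal sketch, assuming the server.N=IP:port:port layout used above. zk_myid_for is a hypothetical helper, and the sample file is a throwaway copy of the three server lines rather than the real configuration.

```shell
#!/usr/bin/env bash
# Print the id N of the "server.N=<ip>:..." line that matches the given IP.
zk_myid_for() {
    local cfg=$1 ip=$2 ip_re
    # Escape the dots so they match literally in the sed pattern.
    ip_re=$(printf '%s' "$ip" | sed 's/\./\\./g')
    sed -n "s/^server\.\([0-9][0-9]*\)=${ip_re}:.*/\1/p" "$cfg"
}

# Demo against a temporary file holding the server lines from zoo.cfg.
sample=$(mktemp)
cat > "$sample" <<'EOF'
server.1=10.240.204.157:21001:21002
server.2=10.240.204.165:21001:21002
server.3=10.240.204.149:21001:21002
EOF

zk_myid_for "$sample" 10.240.204.165   # prints 2
rm -f "$sample"
```

On a real host the output could be redirected to /service/zookeeper/data/myid, with the host's own address as the second argument.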
Start ZooKeeper
    $ zkServer.sh start
    ZooKeeper JMX enabled by default
    Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
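Installing Codis Server
Create the directory
    $ mkdir -p /service/codis/
Extract the installation package
    $ tar -xvf codis3.2.2-go1.8.5-linux.tar.gz
    $ mv codis3.2.2-go1.8.5-linux /usr/local/codis/
Configure the environment variable (single-quoted so that $PATH is expanded when /etc/profile is sourced)
    $ echo 'export PATH=$PATH:/usr/local/codis' >> /etc/profile
    $ source /etc/profile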
Create the configuration file
    $ cat /etc/codis/codis-server/codis.conf
    daemonize yes
    port 10001
    timeout 0
    tcp-keepalive 60
    loglevel notice
    databases 6
    protected-mode no
    stop-writes-on-bgsave-error yes
    rdbcompression yes
    rdbchecksum yes
    dbfilename "codis1.rdb"
    dir "/service/codis"
    pidfile "/service/codis/pid.file"
    logfile "/service/codis/codis-server.log"
    requirepass "Abcd123#"
    maxmemory 24576000000
    bind 10.240.204.157
    repl-timeout 3600
    slave-serve-stale-data yes
    slave-read-only yes
    repl-disable-tcp-nodelay no
    appendonly no
    appendfsync everysec
    no-appendfsync-on-rewrite no
    auto-aof-rewrite-percentage 100
    auto-aof-rewrite-min-size 64mb
    lua-time-limit 5000
    slowlog-log-slower-than 10000
    slowlog-max-len 128
    hash-max-ziplist-entries 512
    hash-max-ziplist-value 64
    list-max-ziplist-entries 512
    list-max-ziplist-value 64
    set-max-intset-entries 512
    zset-max-ziplist-entries 128
    zset-max-ziplist-value 64
    activerehashing yes
    client-output-buffer-limit normal 0 0 0
    client-output-buffer-limit slave 0 0 0
    client-output-buffer-limit pubsub 32mb 8mb 60
    hz 10
    aof-rewrite-incremental-fsync yes
    slave-priority 50
Tip: on a replica, also add the slaveof and masterauth parameters.
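For illustration, the two extra directives for a replica can be generated as follows. This is a sketch under assumptions: replica_conf is a hypothetical helper, and it assumes the master is the instance at 10.240.204.157:10001 with the requirepass value shown in codis.conf above.

```shell
#!/usr/bin/env bash
# Emit the replica-only directives to append to a replica's codis.conf.
# replica_conf is a hypothetical helper name.
replica_conf() {
    local master_ip=$1 master_port=$2 master_password=$3
    printf 'slaveof %s %s\n' "$master_ip" "$master_port"
    printf 'masterauth "%s"\n' "$master_password"
}

replica_conf 10.240.204.157 10001 'Abcd123#'
# slaveof 10.240.204.157 10001
# masterauth "Abcd123#"
```

The masterauth value has to match the requirepass of the master, otherwise the replica cannot authenticate for replication.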
Start Codis Server
    $ codis-server /etc/codis/codis-server/codis.conf
Codis Dashboard deployment (157)
Create the configuration file
    ##################################################
    #                                                #
    #                 Codis-Dashboard                #
    #                                                #
    ##################################################
    # Set Coordinator, only accept "zookeeper" & "etcd" & "filesystem".
    # for zookeeper/etcd, coordinator_auth accept "user:password"
    # Quick Start
    #coordinator_name = "filesystem"
    #coordinator_addr = "/tmp/codis"
    coordinator_name = "zookeeper"
    coordinator_addr = "10.240.204.157:21000,10.240.204.165:21000,10.240.204.149:21000"
    #coordinator_auth = ""
    # Set Codis Product Name/Auth.
    product_name = "codis-test"
    product_auth = "Abcd123#"
    # Set bind address for admin(rpc), tcp only.
    admin_addr = "10.240.204.157:18080"
    # Set arguments for data migration (only accept 'sync' & 'semi-async').
    migration_method = "semi-async"
    migration_parallel_slots = 100
    migration_async_maxbulks = 200
    migration_async_maxbytes = "32mb"
    migration_async_numkeys = 500
    migration_timeout = "30s"
    # Set configs for redis sentinel.
    sentinel_client_timeout = "10s"
    sentinel_quorum = 2
    sentinel_parallel_syncs = 1
    sentinel_down_after = "30s"
    sentinel_failover_timeout = "5m"
    sentinel_notification_script = ""
    sentinel_client_reconfig_script = ""
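- coordinator_name: the type of external storage. Accepts zookeeper, etcd, or filesystem
- coordinator_addr: the address of the external storage
- product_name: the cluster name
- product_auth: the cluster password, empty by default
- admin_addr: the address of the RESTful API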
Start the Dashboard
    $ codis-dashboard --ncpu=2 --config=/etc/codis/codis-dashboard/dashboard.toml --log=/service/codis/dashboard.log --log-level=warn &
Shut down the Dashboard
    $ codis-admin --dashboard=10.240.204.157:18080 --shutdown
Codis Proxy deployment
Create the configuration file
    $ cat /etc/codis/codis-proxy/proxy.toml
    ##################################################
    #                                                #
    #                  Codis-Proxy                   #
    #                                                #
    ##################################################
    # Set Codis Product Name/Auth.
    product_name = "codis-test"
    product_auth = "Abcd123#"
    # Set auth for client session
    #   1. product_auth is used for auth validation among codis-dashboard,
    #      codis-proxy and codis-server.
    #   2. session_auth is different from product_auth, it requires clients
    #      to issue AUTH <PASSWORD> before processing any other commands.
    session_auth = ""
    # Set bind address for admin(rpc), tcp only.
    admin_addr = "10.240.204.157:11000"
    # Set bind address for proxy, proto_type can be "tcp", "tcp4", "tcp6", "unix" or "unixpacket".
    proto_type = "tcp4"
    proxy_addr = "10.240.204.157:19000"
    # Set jodis address & session timeout
    #   1. jodis_name is short for jodis_coordinator_name, only accept "zookeeper" & "etcd".
    #   2. jodis_addr is short for jodis_coordinator_addr
    #   3. jodis_auth is short for jodis_coordinator_auth, for zookeeper/etcd, "user:password" is accepted.
    #   4. proxy will be registered as node:
    #        if jodis_compatible = true (not suggested):
    #          /zk/codis/db_{PRODUCT_NAME}/proxy-{HASHID} (compatible with Codis2.0)
    #        or else
    #          /jodis/{PRODUCT_NAME}/proxy-{HASHID}
    jodis_name = "zookeeper"
    jodis_addr = "10.240.204.157:21000,10.240.204.165:21000,10.240.204.149:21000"
    jodis_timeout = "20s"
    jodis_compatible = false
    backend_ping_period = 5
    session_max_timeout = 1800
    session_max_bufsize = 131072
    session_max_pipeline = 1024
    session_keepalive_period = 60
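- product_name: the cluster name, the same as in the dashboard configuration
- product_auth: the cluster password, empty by default
- admin_addr: the address of the RESTful API
- proto_type: the protocol type of the Redis port
- proxy_addr: the address or path of the Redis port
- jodis_timeout: the session timeout used when registering with Jodis, in seconds
- jodis_compatible: whether the proxy registers under the Codis 2.0 compatible ZooKeeper path
- backend_ping_period: the liveness probe period toward codis-server, in seconds. 0 disables it
- session_max_timeout: the maximum timeout for client connections, in seconds. 0 disables it
- session_max_bufsize: the read/write buffer size for each client connection, in bytes
- session_max_pipeline: the maximum pipeline size for each client connection. The official recommendation is not to exceed 1M, otherwise migrations may stall noticeably
- session_keepalive_period: the TCP keepalive period for client connections, TCP only. 0 disables it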
Start the proxy
    $ codis-proxy --ncpu=2 --config=/etc/codis/codis-proxy/proxy.toml --log=/service/codis/proxy.log --log-level=warn &
Shut down the proxy
    $ codis-admin --proxy=10.240.204.157:11000 --shutdown
Codis FE deployment
Generate the configuration file. codis-fe is started below with --dashboard-list=/etc/codis/codis-fe/codis.json, so place the generated codis.json at that path.
    $ codis-admin --dashboard-list --zookeeper=10.240.204.157:21000 | tee codis.json
    2019/08/09 16:38:34 zkclient.go:23: [INFO] zookeeper - zkclient setup new connection to 10.240.204.157:21000
    2019/08/09 16:38:34 zkclient.go:23: [INFO] zookeeper - Connected to 10.240.204.157:21000
    2019/08/09 16:38:34 zkclient.go:23: [INFO] zookeeper - Authenticated: id=73835832729600004, timeout=40000
    2019/08/09 16:38:34 zkclient.go:23: [INFO] zookeeper - Re-submitting `0` credentials after reconnect
    [
        {
            "name": "codis-test",
            "dashboard": "10.240.204.157:18080"
        }
    ]
    2019/08/09 16:38:34 zkclient.go:23: [INFO] zookeeper - Recv loop terminated: err=EOF
    2019/08/09 16:38:34 zkclient.go:23: [INFO] zookeeper - Send loop terminated: err=<nil>
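The listing above shows log lines around the JSON array. Whether those lines reach codis.json depends on whether codis-admin writes them to stdout or stderr, which the output alone does not show. If they do end up in the file, a filter like the following sketch keeps only the JSON array. The helper name extract_dashboard_json is hypothetical, and it assumes the opening and closing brackets of the array each start a line.

```shell
#!/usr/bin/env bash
# Keep only the lines from the one starting with "[" to the one starting with "]".
# extract_dashboard_json is a hypothetical helper name.
extract_dashboard_json() {
    sed -n '/^\[/,/^]/p'
}

# Demo with a shortened copy of the output shown above.
extract_dashboard_json <<'EOF'
2019/08/09 16:38:34 zkclient.go:23: [INFO] zookeeper - Connected to 10.240.204.157:21000
[
    {
        "name": "codis-test",
        "dashboard": "10.240.204.157:18080"
    }
]
2019/08/09 16:38:34 zkclient.go:23: [INFO] zookeeper - Recv loop terminated: err=EOF
EOF
```

The log lines start with a date, not a bracket, so they fall outside the selected range.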
Start the FE
    $ codis-fe --ncpu=1 --dashboard-list=/etc/codis/codis-fe/codis.json --listen=10.240.204.157:18090 --log=/service/codis/fe.log --log-level=warn --assets-dir=/usr/local/codis/assets/ &
Shut down the FE
    $ ps -ef | grep codis-fe | grep -v grep | awk '{print $2}' | xargs kill
Sentinel configuration
Create the configuration file (/etc/codis/codis-server/sentinel.conf)
    port 10003
    dir "/service/codis"
    logfile "/service/codis/sentinel.log"
    daemonize yes
    protected-mode no
Start the Sentinel service
    $ codis-server /etc/codis/codis-server/sentinel.conf --sentinel
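Configuring the Codis cluster
Add a proxy
Open the web console at http://10.240.204.157:18090
Click New Proxy to add the proxy
Add Codis Server
First click New Group to create a group, then click Add Server to add members to the group
Add Sentinel
Add the Sentinel service and click Sync. The master/replica relationships are then detected automatically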
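The same steps can also be scripted with codis-admin instead of the web console. The sketch below only prints the commands (a dry run) so they can be reviewed first. The flag names are written from memory of the codis-admin usage text and should be treated as assumptions to check against codis-admin --help. The group id 1 and the choice of addresses from the resource list are likewise illustrative.

```shell
#!/usr/bin/env bash
DASHBOARD=10.240.204.157:18080

# Dry run: print the codis-admin invocation instead of executing it.
# codis_admin is a hypothetical wrapper name.
codis_admin() {
    echo codis-admin --dashboard="$DASHBOARD" "$@"
}

codis_admin --create-proxy --addr=10.240.204.157:11000   # proxy admin address
codis_admin --create-group --gid=1
codis_admin --group-add --gid=1 --addr=10.240.204.157:10001
codis_admin --sentinel-add --addr=10.240.204.157:10003
codis_admin --sentinel-resync
```

Replacing the echo with a real invocation turns the dry run into an actual change, which is why printing first is the safer default.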
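Assign slots
There are 1024 slots by default, numbered 0 to 1023, and they can be split across groups: for example 0-300 for group1, 301-800 for group2, and 801-1023 for group3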
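A slot plan is easy to get subtly wrong, for example by leaving a slot unassigned. The sketch below checks that a list of ranges is contiguous and covers slots 0 to 1023 exactly. check_slot_ranges is a hypothetical helper, and the input format (first slot, last slot, group id per line) is made up for this example.

```shell
#!/usr/bin/env bash
# Read "<first> <last> <gid>" lines from stdin and verify that the ranges
# are contiguous and cover slots 0..1023 with no gap or overlap.
check_slot_ranges() {
    local next=0 first last gid
    while read -r first last gid; do
        if [ "$first" -ne "$next" ] || [ "$last" -lt "$first" ]; then
            echo "unexpected range $first-$last (expected to start at $next)" >&2
            return 1
        fi
        next=$(( last + 1 ))
    done
    if [ "$next" -ne 1024 ]; then
        echo "ranges end at slot $(( next - 1 )), expected 1023" >&2
        return 1
    fi
}

printf '%s\n' '0 300 1' '301 800 2' '801 1023 3' | check_slot_ranges \
    && echo "plan covers all 1024 slots"
```

A plan that starts at slot 1 instead of slot 0 is rejected, since slot 0 would stay unassigned.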
Migrate slots
If group1 runs short of memory, add group4 and move a chosen number of slots from group1 to group4. The migration does not affect service
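From the command line, such a migration could be requested roughly as follows. This is a dry-run sketch that only prints the command. The --slot-action --create-some flags are quoted from memory of the codis-admin usage text and are an assumption to verify with codis-admin --help. The helper name migrate_slots_cmd is hypothetical, and the slot count of 100 is an arbitrary example.

```shell
#!/usr/bin/env bash
DASHBOARD=10.240.204.157:18080

# Print (do not run) the command that moves num_slots slots between groups.
migrate_slots_cmd() {
    local gid_from=$1 gid_to=$2 num_slots=$3
    case $num_slots in
        ''|*[!0-9]*)
            echo "num_slots must be a non-negative integer" >&2
            return 1
            ;;
    esac
    echo "codis-admin --dashboard=$DASHBOARD --slot-action --create-some" \
         "--gid-from=$gid_from --gid-to=$gid_to --num-slots=$num_slots"
}

migrate_slots_cmd 1 4 100
```

Moving slots in modest batches and watching memory on both groups between batches keeps the change easy to follow.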
Appendix
Dashboard shut down abnormally
After the dashboard has shut down abnormally, it fails to acquire its lock on the next start. The error log looks like this:
    2019/02/11 13:31:43 topom.go:189: [ERROR] store: acquire lock of codis-testX failed
    [error]: zk: node already exists
        6   /opt/gowork/src/github.com/CodisLabs/codis/pkg/models/zk/zkclient.go:247
                github.com/CodisLabs/codis/pkg/models/zk.(*Client).create
        5   /opt/gowork/src/github.com/CodisLabs/codis/pkg/models/zk/zkclient.go:196
                github.com/CodisLabs/codis/pkg/models/zk.(*Client).Create.func1
        4   /opt/gowork/src/github.com/CodisLabs/codis/pkg/models/zk/zkclient.go:129
                github.com/CodisLabs/codis/pkg/models/zk.(*Client).shell
        3   /opt/gowork/src/github.com/CodisLabs/codis/pkg/models/zk/zkclient.go:195
                github.com/CodisLabs/codis/pkg/models/zk.(*Client).Create
        2   /opt/gowork/src/github.com/CodisLabs/codis/pkg/models/store.go:119
                github.com/CodisLabs/codis/pkg/models.(*Store).Acquire
        1   /opt/gowork/src/github.com/CodisLabs/codis/pkg/topom/topom.go:188
                github.com/CodisLabs/codis/pkg/topom.(*Topom).Start
        0   /opt/gowork/src/github.com/CodisLabs/codis/cmd/dashboard/main.go:169
                main.main
            ... ...
    [stack]:
        1   /opt/gowork/src/github.com/CodisLabs/codis/pkg/topom/topom.go:189
                github.com/CodisLabs/codis/pkg/topom.(*Topom).Start
        0   /opt/gowork/src/github.com/CodisLabs/codis/cmd/dashboard/main.go:169
                main.main
Shut down the dashboard
    $ codis-admin --dashboard=10.240.204.157:18080 --shutdown
Remove the lock
    $ codis-admin --remove-lock --product=codis-test --zookeeper=10.240.204.157:21000
Start the dashboard
    $ codis-dashboard --ncpu=2 --config=/etc/codis/codis-dashboard/dashboard.toml --log=/service/codis/dashboard.log --log-level=warn &
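Before removing the lock it is worth confirming that the log really shows the lock failure, since the lock exists to keep a second dashboard from running against the same product. A minimal sketch, with the hypothetical helper has_stale_lock_error and a throwaway log file standing in for /service/codis/dashboard.log:

```shell
#!/usr/bin/env bash
# Succeed when the given log contains the lock-acquisition failure.
# has_stale_lock_error is a hypothetical helper name.
has_stale_lock_error() {
    grep -q 'acquire lock of .* failed' "$1"
}

# Demo with a temporary file holding the first line of the error log above.
log=$(mktemp)
echo '2019/02/11 13:31:43 topom.go:189: [ERROR] store: acquire lock of codis-testX failed' > "$log"
if has_stale_lock_error "$log"; then
    echo "lock failure found: stop the dashboard, remove the lock, then start it again"
fi
rm -f "$log"
```

It is also worth checking that no codis-dashboard process is still running before the lock is removed.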