percona-toolkit : pt-stalk

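pt-stalk is written as a shell script. Its main purpose is to collect OS and MySQL diagnostic information at the moment a problem occurs, covering resources such as CPU, memory, and disk as well as database lock waits, replication, and status information.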

Triggering

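pt-stalk can run as a background service that monitors MySQL against a specified trigger condition. When the condition is met, it collects system and database information for that point in time. The relevant options are: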

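  • function: defaults to status, which watches the output of SHOW GLOBAL STATUS; processlist watches the output of SHOW PROCESSLIST; a custom trigger script can also be supplied
  • variable: defaults to Threads_running; the item to watch, chosen from whatever the monitored output provides
  • threshold: defaults to 25; the trigger condition is met when the watched value exceeds this threshold. When the watched item is not numeric (for example a processlist column), use it together with match, in which case the threshold applies to the number of matching rows
  • cycles: defaults to 5; the trigger condition must be met 5 times in a row before collection starts
  • iterations: the number of collections to perform before exiting; by default pt-stalk runs indefinitely
  • run-time: how long each collection gathers data; defaults to 30 seconds
  • sleep: how long to wait after a collection before monitoring resumes; defaults to 300 seconds
  • interval: how often the status is checked; defaults to 1 second
  • dest: directory where the collected data is stored; defaults to /var/lib/pt-stalk
  • retention-time: how long collected data is kept; defaults to 30 days
  • daemonize: run in the background
  • log: the run log; defaults to /var/log/pt-stalk.log
  • collect: collect diagnostic data when the condition triggers. collect-gdb adds GDB stack traces; collect-strace adds strace data; collect-tcpdump adds tcpdump data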
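
The interaction between threshold and cycles can be sketched in a few lines of shell. This is a simplified illustration and not pt-stalk's own source: the function name should_trigger is hypothetical, the samples are read from stdin instead of being polled every interval seconds, and only integer values are handled.

```shell
#!/bin/sh
# Simplified illustration of the threshold/cycles rule (not pt-stalk's code).
# Usage: should_trigger THRESHOLD CYCLES < samples
# Reads one sampled value per line and prints "trigger" each time the value
# has been above THRESHOLD for CYCLES consecutive samples.
should_trigger() {
  threshold=$1
  cycles=$2
  streak=0
  while read -r value; do
    if [ "$value" -gt "$threshold" ]; then
      streak=$((streak + 1))
    else
      streak=0        # a sample at or below the threshold resets the count
    fi
    if [ "$streak" -ge "$cycles" ]; then
      echo "trigger"
      streak=0        # start counting again after a collection
    fi
  done
}
```

With a threshold of 25 and 3 cycles, the sequence 30, 30, 10, 30, 30, 30 triggers only once, on the last sample, because the 10 interrupts the first run.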

Example

Create a script that checks CPU usage

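$ cat /root/highcpu.sh
# pt-stalk sources this file and calls the function named trg_plugin;
# the value it prints is compared against --threshold.
trg_plugin() {
  # %idle is the 8th field of sar's "Average:" line; usage = 100 - %idle
  idle=$(sar 1 1 | grep -i "Average:" | awk '{print $8}')
  echo "100 - $idle" | bc
}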
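
The usage calculation can also be isolated into a small filter, which makes it possible to check the arithmetic without running sar. The sketch below is one way to do that under a few assumptions: the helper name cpu_usage_from_sar is hypothetical, %idle is taken to be the last column of the "Average:" line (as in the default sar CPU report), and the result is truncated to an integer as a conservative choice in case the installed pt-stalk version compares values as integers.

```shell
#!/bin/sh
# Hypothetical helper: read sar output on stdin and print CPU usage as an
# integer percentage, computed as 100 minus %idle, where %idle is assumed
# to be the last column of the "Average:" line.
cpu_usage_from_sar() {
  awk '/^Average:/ { printf "%d\n", 100 - $NF }'
}
```

Inside the file passed to --function, the trigger function could then run `LC_ALL=C sar 1 1 | cpu_usage_from_sar`. Setting LC_ALL=C is an assumption that keeps the "Average:" label and the decimal point in the form the filter expects.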

Start the daemon; collection is triggered when CPU usage exceeds 50%

pt-stalk --daemonize --dest=/tmp/pt-stalk --user=root --password=Abcd123# --port=33006 --function=/root/highcpu.sh --variable highcpu --cycles=3 --interval=1 --threshold 50 --sleep=60 --log=/var/log/pt-stalk.log

View the collected files

[root@t-luhx03-v-szzb pt-stalk]# ls -lrt
total 4800
-rw-r----- 1 root root     395 Jan 18 16:56 2021_01_18_16_56_59-trigger
-rw-r----- 1 root root   14937 Jan 18 16:56 2021_01_18_16_56_59-pmap
-rw-r----- 1 root root   24865 Jan 18 16:57 2021_01_18_16_56_59-variables
-rw-r----- 1 root root    1258 Jan 18 16:57 2021_01_18_16_56_59-log_error
-rw-r----- 1 root root    9370 Jan 18 16:57 2021_01_18_16_56_59-innodbstatus1
-rw-r----- 1 root root   31995 Jan 18 16:57 2021_01_18_16_56_59-ps
-rw-r----- 1 root root      55 Jan 18 16:57 2021_01_18_16_56_59-mutex-status1
-rw-r----- 1 root root    8927 Jan 18 16:57 2021_01_18_16_56_59-opentables1
-rw-r----- 1 root root   39172 Jan 18 16:57 2021_01_18_16_56_59-sysctl
-rw-r----- 1 root root    9393 Jan 18 16:57 2021_01_18_16_56_59-lsof
-rw-r----- 1 root root     137 Jan 18 16:57 2021_01_18_16_56_59-disk-space
-rw-r----- 1 root root   29387 Jan 18 16:57 2021_01_18_16_56_59-iostat
-rw-r----- 1 root root    2784 Jan 18 16:57 2021_01_18_16_56_59-vmstat
-rw-r----- 1 root root 1081739 Jan 18 16:57 2021_01_18_16_56_59-mysqladmin
-rw-r----- 1 root root  104546 Jan 18 16:57 2021_01_18_16_56_59-procstat
-rw-r----- 1 root root   38100 Jan 18 16:57 2021_01_18_16_56_59-meminfo
-rw-r----- 1 root root   32462 Jan 18 16:57 2021_01_18_16_56_59-diskstats
-rw-r----- 1 root root   77935 Jan 18 16:57 2021_01_18_16_56_59-procvmstat
-rw-r----- 1 root root   89640 Jan 18 16:57 2021_01_18_16_56_59-netstat_s
-rw-r----- 1 root root  165210 Jan 18 16:57 2021_01_18_16_56_59-interrupts
-rw-r----- 1 root root   28080 Jan 18 16:57 2021_01_18_16_56_59-df
-rw-r----- 1 root root  375240 Jan 18 16:57 2021_01_18_16_56_59-slabinfo
-rw-r----- 1 root root   24018 Jan 18 16:57 2021_01_18_16_56_59-processlist
-rw-r----- 1 root root  735190 Jan 18 16:57 2021_01_18_16_56_59-netstat
-rw-r----- 1 root root   11730 Jan 18 16:57 2021_01_18_16_56_59-slave-status
-rw-r----- 1 root root    9423 Jan 18 16:57 2021_01_18_16_56_59-innodbstatus2
-rw-r----- 1 root root      16 Jan 18 16:57 2021_01_18_16_56_59-hostname
-rw-r----- 1 root root    8927 Jan 18 16:57 2021_01_18_16_56_59-opentables2
-rw-r----- 1 root root      55 Jan 18 16:57 2021_01_18_16_56_59-mutex-status2
-rw-r----- 1 root root    1242 Jan 18 16:57 2021_01_18_16_56_59-mpstat-overall
-rw-r----- 1 root root   18149 Jan 18 16:57 2021_01_18_16_56_59-mpstat
-rw-r----- 1 root root    2031 Jan 18 16:57 2021_01_18_16_56_59-iostat-overall
-rw-r----- 1 root root     326 Jan 18 16:57 2021_01_18_16_56_59-vmstat-overall
-rw-r----- 1 root root  379534 Jan 18 16:58 2021_01_18_16_56_59-top
-rw-r----- 1 root root   36862 Jan 18 16:58 2021_01_18_16_56_59-output
-rw-r----- 1 root root     396 Jan 18 16:58 2021_01_18_16_58_33-trigger
-rw-r----- 1 root root   15691 Jan 18 16:58 2021_01_18_16_58_33-pmap
-rw-r----- 1 root root   21675 Jan 18 16:58 2021_01_18_16_58_33-variables
-rw-r----- 1 root root    1258 Jan 18 16:58 2021_01_18_16_58_33-log_error
-rw-r----- 1 root root    9372 Jan 18 16:58 2021_01_18_16_58_33-innodbstatus1
-rw-r----- 1 root root   31379 Jan 18 16:58 2021_01_18_16_58_33-ps
-rw-r----- 1 root root      55 Jan 18 16:58 2021_01_18_16_58_33-mutex-status1
-rw-r----- 1 root root    8927 Jan 18 16:58 2021_01_18_16_58_33-opentables1
-rw-r----- 1 root root   39173 Jan 18 16:58 2021_01_18_16_58_33-sysctl
-rw-r----- 1 root root       0 Jan 18 16:58 2021_01_18_16_58_33-dmesg
-rw-r----- 1 root root     244 Jan 18 16:58 2021_01_18_16_58_33-vmstat-overall
-rw-r----- 1 root root    1054 Jan 18 16:58 2021_01_18_16_58_33-iostat-overall
-rw-r----- 1 root root      76 Jan 18 16:58 2021_01_18_16_58_33-mpstat-overall
-rw-r----- 1 root root    9393 Jan 18 16:58 2021_01_18_16_58_33-lsof
-rw-r----- 1 root root   76856 Jan 18 16:58 2021_01_18_16_58_33-top
-rw-r----- 1 root root    6489 Jan 18 16:58 2021_01_18_16_58_33-mpstat
-rw-r----- 1 root root   11801 Jan 18 16:58 2021_01_18_16_58_33-iostat
-rw-r----- 1 root root    1146 Jan 18 16:58 2021_01_18_16_58_33-vmstat
-rw-r----- 1 root root  430080 Jan 18 16:58 2021_01_18_16_58_33-mysqladmin
-rw-r----- 1 root root   31173 Jan 18 16:58 2021_01_18_16_58_33-procvmstat
-rw-r----- 1 root root   41820 Jan 18 16:58 2021_01_18_16_58_33-procstat
-rw-r----- 1 root root     528 Jan 18 16:58 2021_01_18_16_58_33-lock-waits
-rw-r----- 1 root root   66084 Jan 18 16:58 2021_01_18_16_58_33-interrupts
-rw-r----- 1 root root     528 Jan 18 16:58 2021_01_18_16_58_33-transactions
-rw-r----- 1 root root   12984 Jan 18 16:58 2021_01_18_16_58_33-diskstats
-rw-r----- 1 root root     528 Jan 18 16:58 2021_01_18_16_58_33-prepared-statements
-rw-r----- 1 root root   11232 Jan 18 16:58 2021_01_18_16_58_33-df
-rw-r----- 1 root root   35856 Jan 18 16:58 2021_01_18_16_58_33-netstat_s
-rw-r----- 1 root root   15240 Jan 18 16:58 2021_01_18_16_58_33-meminfo
-rw-r----- 1 root root  150096 Jan 18 16:58 2021_01_18_16_58_33-slabinfo
-rw-r----- 1 root root   11118 Jan 18 16:58 2021_01_18_16_58_33-processlist
-rw-r----- 1 root root  289026 Jan 18 16:58 2021_01_18_16_58_33-netstat
-rw-r----- 1 root root    4692 Jan 18 16:58 2021_01_18_16_58_33-slave-status
-rw-r----- 1 root root   19495 Jan 18 16:58 2021_01_18_16_58_33-output
-rw-r----- 1 root root     137 Jan 18 16:58 2021_01_18_16_58_33-disk-space

Note: the files that deserve the most attention include innodbstatus, iostat, lock-waits, and transactions.
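
In the listing above, every file from one collection shares a timestamp prefix and each collection has a -trigger file. Based only on that naming pattern, the following sketch lists the collections in a destination directory and picks out the files mentioned in the note. Both function names are hypothetical helpers and not part of percona-toolkit.

```shell
#!/bin/sh
# Hypothetical helper: print the timestamp prefix of every collection in a
# pt-stalk destination directory, one per line, oldest first.
list_collections() {
  dest=$1
  for f in "$dest"/*-trigger; do
    [ -e "$f" ] || continue       # nothing collected in this directory
    name=${f##*/}                 # strip the directory part
    echo "${name%-trigger}"       # strip the "-trigger" suffix
  done | sort
}

# Hypothetical helper: print the key files of one collection, skipping any
# that were not produced (lock-waits, for example, is absent from the first
# collection in the listing).
key_files() {
  dest=$1
  prefix=$2
  for name in innodbstatus1 innodbstatus2 iostat lock-waits transactions; do
    [ -e "$dest/$prefix-$name" ] && echo "$dest/$prefix-$name"
  done
  return 0
}
```

For example, `list_collections /tmp/pt-stalk` would print one prefix per collection, and `key_files /tmp/pt-stalk 2021_01_18_16_58_33` would print the paths worth opening first.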

Analysis

percona-toolkit also includes pt-sift, a tool that analyzes the data collected by pt-stalk and presents it as a summary.

[root@t-luhx03-v-szzb ~]# pt-sift /tmp/log/pt-stalk/2021_01_18_17_12_00
======== t-luhx03-v-szzb at 2021_01_18_17_12_00 DEFAULT (15 of 15) ========
--diskstats--
  #ts device    rd_s rd_avkb rd_mb_s rd_mrg rd_cnc   rd_rt    wr_s wr_avkb wr_mb_s wr_mrg wr_cnc   wr_rt busy in_prg    io_s  qtime stime
 {29} dm-0       0.1    39.0     0.0     0%    0.0    13.0    13.1     6.8     0.1     0%    0.0     1.3   0%      0    13.2    1.2   0.3
 dm-0  0% . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
--vmstat--
 r b   swpd   free   buff   cache si so bi  bo   in   cs us sy id wa st
14 0 133756 134628 277768 3901324  0  0  1  38    1    0  2  1 97  0  0
 0 0 133756 133356 277784 3904740  0  0  5 150 2538 2345  2  2 95  0  0
wa 0% . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
--innodb--
    txns: 1xnot (0s)
    0 queries inside InnoDB, 0 queries in queue
    Main thread: sleeping, pending reads 0, writes 0, flush 0
    Log: lsn = 17603194, chkp = 17603185, chkp age = 9
    Threads are waiting at:
    Threads are waiting on:
--processlist--
    State
      2  
      1  Waiting on empty queue
      1  starting
      1  logging slow query
    Command
      2  Sleep
      2  Query
      1  Daemon
--stack traces--
    No stack trace file exists
--oprofile--
    No opreport file exists
Licensed under CC BY-NC-SA 4.0