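tcpdump is one of the most effective tools for capturing network packets on Unix and Linux systems; Wireshark is a comparable tool commonly used on Windows. tcpdump captures the headers of packets travelling over the network so that they can be analyzed. It supports filtering by network layer, protocol, host, network, or port, and provides the logical operators and, or, and not to help you discard irrelevant traffic. tcpdump can also write the captured packets to a file, which can then be filtered and analyzed further with Wireshark or with your own code (for example in Java). The command requires root privileges, and by default it puts the network interface into promiscuous mode.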
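To illustrate how host, port, and protocol conditions combine with and / not, here is a minimal Python sketch that assembles a filter string. The function name build_filter and its parameters are hypothetical helpers for this note, not part of tcpdump; the resulting string is simply passed to tcpdump as its filter expression.

```python
def build_filter(src=None, dst=None, port=None, proto=None, exclude_ports=()):
    """Join optional conditions into one tcpdump filter expression with 'and'.

    Hypothetical helper: it only builds the text of the expression.
    """
    parts = []
    if proto:
        parts.append(proto)                 # e.g. "tcp" or "udp"
    if src:
        parts.append(f"src host {src}")     # packets sent by this host
    if dst:
        parts.append(f"dst host {dst}")     # packets sent to this host
    if port:
        parts.append(f"port {port}")        # number or service name
    for p in exclude_ports:
        parts.append(f"not port {p}")       # drop traffic you do not care about
    return " and ".join(parts)


print(build_filter(src="192.168.1.100", dst="192.168.1.2", port="ftp"))
print(build_filter(proto="tcp", dst="192.168.0.1", exclude_ports=[22]))
```

The second call shows a typical use of not: excluding your own SSH session (port 22) from the capture.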

1. Common tcpdump commands:

Listen on a specific network interface

tcpdump -i bond0

Show packets exchanged with host 192.168.0.1

tcpdump host 192.168.0.1

Packets with a given source address, destination address, and port

tcpdump src 192.168.1.100 and dst 192.168.1.2 and port ftp

Show UDP packets

tcpdump udp

Show packet contents (printed as ASCII)

tcpdump -A

Write the captured packets to a file

tcpdump -w /tmp/tcpdump.cap
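A file written with -w can be processed by your own code as well as by Wireshark. The following is a minimal Python sketch that counts the packet records in such a file. It assumes the classic pcap layout (24-byte global header, 16-byte per-packet header, microsecond timestamps) and does not handle pcapng or nanosecond-resolution files; the function name count_pcap_packets is made up for this example.

```python
import struct


def count_pcap_packets(data: bytes):
    """Return (packet_count, captured_bytes) for a classic pcap byte string."""
    if len(data) < 24:
        raise ValueError("too short for a pcap global header")
    magic = struct.unpack("<I", data[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"          # file written on a little-endian host
    elif magic == 0xD4C3B2A1:
        endian = ">"          # file written on a big-endian host
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")

    offset = 24               # skip the global header
    count = 0
    captured = 0
    while offset + 16 <= len(data):
        # per-packet header: ts_sec, ts_usec, incl_len, orig_len
        _, _, incl_len, _ = struct.unpack(endian + "IIII", data[offset:offset + 16])
        offset += 16 + incl_len
        if offset > len(data):
            break             # last record is truncated, ignore it
        count += 1
        captured += incl_len
    return count, captured
```

Usage, assuming the capture file from the command above exists: read it with open("/tmp/tcpdump.cap", "rb").read() and pass the bytes to count_pcap_packets.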

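2. tcpdump use cases

tcpdump is useful not only for everyday network troubleshooting but also for analyzing database problems and tuning database performance.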

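Case 1: a client (192.168.1.219) suddenly cannot access the SQL Server database (192.168.15.14)

1. tcpdump on the client shows the same SYN being retransmitted with no answer. The Wireshark capture on the Windows server shows that SQL Server received the SYN and answered it with SYN, ACK (Ack=1), but that reply never reached the client.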

10:51:21.102439 IP (tos 0x10, ttl  60, id 45670, offset 0, flags [DF], length: 44) yytlc.50162 > 192.168.15.14.ms-sql-s: S [tcp sum ok] 616881461:616881461(0) win 65535 <mss 1460>

10:51:23.750271 IP (tos 0x10, ttl  60, id 45768, offset 0, flags [DF], length: 44) yytlc.50162 > 192.168.15.14.ms-sql-s: S [tcp sum ok] 616881461:616881461(0) win 65535 <mss 1460>

10:51:29.943904 IP (tos 0x10, ttl  60, id 45971, offset 0, flags [none], length: 44) yytlc.50162 > 192.168.15.14.ms-sql-s: S [tcp sum ok] 616881461:616881461(0) win 65535 <mss 1460>

10:51:42.045897 IP (tos 0x10, ttl  60, id 46849, offset 0, flags [none], length: 44) yytlc.50162 > 192.168.15.14.ms-sql-s: S [tcp sum ok] 616881461:616881461(0) win 65535 <mss 1460>

 

14309       23.459236000 192.168.1.219 192.168.15.14 TCP  60     50162 > ms-sql-s [SYN] Seq=0 Win=65535 Len=0 MSS=1460

14310       23.459330000 192.168.15.14 192.168.1.219 TCP  58     ms-sql-s > 50162 [SYN, ACK] Seq=0 Ack=1 Win=8192 Len=0 MSS=1460
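The client-side tcpdump lines above all carry the same sequence number, which is the signature of a SYN being retransmitted because no SYN, ACK came back. As a minimal sketch, the Python below groups SYN lines by source, destination, and sequence number and reports the gaps between retransmissions. The regular expression is an assumption tailored to the older tcpdump text format shown in this case; it will not match the newer "Flags [S], seq ..." format.

```python
import re
from datetime import datetime

# Matches old-style tcpdump SYN lines: "<time> IP ... src > dst: S ... seq:seq(0)"
SYN_LINE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d{6}) IP .*?"
    r"(?P<src>\S+) > (?P<dst>\S+): S .*?(?P<seq>\d+):\d+\(0\)"
)


def syn_retransmit_gaps(lines):
    """Return {(src, dst, seq): [gap_seconds, ...]} for SYNs seen more than once."""
    seen = {}
    for line in lines:
        m = SYN_LINE.search(line.strip())
        if not m:
            continue
        ts = datetime.strptime(m["ts"], "%H:%M:%S.%f")
        seen.setdefault((m["src"], m["dst"], m["seq"]), []).append(ts)
    gaps = {}
    for key, stamps in seen.items():
        if len(stamps) > 1:
            gaps[key] = [
                round((b - a).total_seconds(), 3)
                for a, b in zip(stamps, stamps[1:])
            ]
    return gaps


SAMPLE = """
10:51:21.102439 IP (tos 0x10, ttl  60, id 45670, offset 0, flags [DF], length: 44) yytlc.50162 > 192.168.15.14.ms-sql-s: S [tcp sum ok] 616881461:616881461(0) win 65535 <mss 1460>
10:51:23.750271 IP (tos 0x10, ttl  60, id 45768, offset 0, flags [DF], length: 44) yytlc.50162 > 192.168.15.14.ms-sql-s: S [tcp sum ok] 616881461:616881461(0) win 65535 <mss 1460>
10:51:29.943904 IP (tos 0x10, ttl  60, id 45971, offset 0, flags [none], length: 44) yytlc.50162 > 192.168.15.14.ms-sql-s: S [tcp sum ok] 616881461:616881461(0) win 65535 <mss 1460>
10:51:42.045897 IP (tos 0x10, ttl  60, id 46849, offset 0, flags [none], length: 44) yytlc.50162 > 192.168.15.14.ms-sql-s: S [tcp sum ok] 616881461:616881461(0) win 65535 <mss 1460>
""".strip().splitlines()

for key, gaps in syn_retransmit_gaps(SAMPLE).items():
    print(key, gaps)   # gaps: [2.648, 6.194, 12.102]
```

The gaps roughly double each time, which is the usual retransmission backoff of a client that never receives an answer to its SYN.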
 

2. Why did the reply never arrive? Check the route back to the client with tracert on the Windows server:

 

C:\Users\Administrator>tracert 192.168.1.219

Tracing route to 192.168.1.219 over a maximum of 30 hops

  1     1 ms     1 ms     1 ms  192.168.15.30

  2    <1 ms    <1 ms    <1 ms  192.168.15.36

  3     1 ms     1 ms     1 ms  192.168.208.106

  4     1 ms     1 ms     1 ms  192.168.215.137

  5     1 ms     1 ms     1 ms  192.168.212.245

  6     1 ms    <1 ms    <1 ms  192.168.212.246

  7     1 ms     1 ms     1 ms  192.168.212.241

  8     1 ms     1 ms     1 ms  192.168.248.241

  9     1 ms     1 ms     1 ms  192.168.249.98

 10     2 ms     5 ms     1 ms  192.168.1.219

Trace complete.
 

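3. A traceroute from the Linux side does not get through. The database server receives the request packets and sends replies, but the client never receives them, so the replies are being lost on the way back. This points to a routing problem.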

yytlc:/#>traceroute 192.168.15.14

trying to get source for 192.168.15.14

source should be 192.168.1.219

traceroute to 192.168.15.14 (192.168.15.14) from 192.168.1.219 (192.168.1.219), 30 hops max

outgoing MTU = 1500

 1 192.168.1.217 (192.168.1.217)  4ms  2 ms 6 ms
 2 192.168.47.220 (192.168.47.220)  0ms  1 ms 6 ms
 3 192.168.253.41 (192.168.253.41)  8ms  8 ms 8 ms
 4  * * *
 5  * * *
 6  * * *

........
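When a traceroute ends in rows of asterisks, the useful piece of information is the last hop that still answered. A minimal Python sketch for pulling that out of Unix-style traceroute text follows; the function name last_responding_hop is hypothetical, and the parser assumes the "hop-number host ..." row layout shown above (it is not meant for Windows tracert output).

```python
import re

# A hop row starts with the hop number followed by a host name/address or "*"
HOP_ROW = re.compile(r"^\s*(\d+)\s+(\S+)")


def last_responding_hop(output: str):
    """Return (hop_number, host) of the last hop that answered, or None."""
    last = None
    for line in output.splitlines():
        m = HOP_ROW.match(line)
        if not m:
            continue                      # banner lines such as "traceroute to ..."
        hop, first_field = int(m.group(1)), m.group(2)
        if first_field != "*":            # "*" means no reply for this probe
            last = (hop, first_field)
    return last


SAMPLE = """traceroute to 192.168.15.14 (192.168.15.14) from 192.168.1.219 (192.168.1.219), 30 hops max
outgoing MTU = 1500
 1 192.168.1.217 (192.168.1.217)  4ms  2 ms 6 ms
 2 192.168.47.220 (192.168.47.220)  0ms  1 ms 6 ms
 3 192.168.253.41 (192.168.253.41)  8ms  8 ms 8 ms
 4  * * *
 5  * * *
"""

print(last_responding_hop(SAMPLE))   # (3, '192.168.253.41')
```

In this case the last answering hop is 192.168.253.41, which is where to start looking. Note that silent hops by themselves only show that no ICMP replies came back; they do not say in which direction the packets were lost.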

tcpdump capture on the client while the traceroute was running:

12:08:49.834285 IP yytlc.61860 > 192.168.15.14.33456: udp 1472

12:08:55.834091 IP yytlc.61860 > 192.168.15.14.33457: udp 1472

12:09:00.835624 IP yytlc.61860 > 192.168.15.14.33458: udp 1472

Meanwhile, the Wireshark capture on the Windows side shows that the UDP probes did arrive:

 11539       47.422984000 192.168.1.219 192.168.15.14 UDP 1514         Source port: 61860  Destination port: 33457

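4. With help from the network specialists, the cause was traced to a faulty route on the Juniper router, which prevented the reply packets from being delivered.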

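Case 2: the sqlplus client cannot connect to the Oracle database; the connection fails with ORA-12537

Symptom: the connection attempt fails with an error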

[oracle@localhost ~]$ sqlplus u/p@SMPDB

SQL*Plus: Release 11.2.0.2.0 Production on Mon Nov 25 14:32:45 2013
 
Copyright (c) 1982, 2010, Oracle.  All rights reserved.

ERROR:

ORA-12537: TNS:connection closed

Client-side capture: reply packets did arrive, but the connection was then closed

 

[root@localhost ~]# tcpdump -i eth0 host 192.168.3.220

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes

16:48:07.048525 IP 192.168.1.45.38405 > 192.168.3.220.ncube-lm: S 2870102332:2870102332(0) win 5840 <mss 1460,sackOK,timestamp 443389148 0,nop,wscale 7>

16:48:07.048872 IP 192.168.3.220.ncube-lm > 192.168.1.45.38405: S 2343325666:2343325666(0) ack 2870102333 win 65535 <mss 1460,nop,wscale 3,sackOK,timestamp 32985 443389148>

16:48:07.048882 IP 192.168.1.45.38405 > 192.168.3.220.ncube-lm: . ack 1 win 46 <nop,nop,timestamp 443389149 32985>

16:48:07.049044 IP 192.168.1.45.38405 > 192.168.3.220.ncube-lm: P 1:225(224) ack 1 win 46 <nop,nop,timestamp 443389149 32985>

16:48:07.049145 IP 192.168.3.220.ncube-lm > 192.168.1.45.38405: . ack 225 win 8298 <nop,nop,timestamp 32986 443389149>

16:49:07.370802 IP 192.168.3.220.ncube-lm > 192.168.1.45.38405: F 1:1(0) ack 225 win 8298 <nop,nop,timestamp 92987 443389149>

16:49:07.370888 IP 192.168.1.45.38405 > 192.168.3.220.ncube-lm: . ack 2 win 46 <nop,nop,timestamp 443449471 92987>

16:49:07.371014 IP 192.168.1.45.38405 > 192.168.3.220.ncube-lm: F 225:225(0) ack 2 win 46 <nop,nop,timestamp 443449471 92987>

16:49:07.371121 IP 192.168.3.220.ncube-lm > 192.168.1.45.38405: . ack 226 win 8297 <nop,nop,timestamp 92987 443449471>

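Capture on the database server: only the incoming request packets appear, with no reply packets at all. (Note that this contradicts the client side, which did receive replies; I still have not worked out the exact reason.)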

 

16:53:57.176963 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 1170139240, win 65535, options [mss 1380,nop,wscale 3,sackOK,TS val 32986 ecr 0], length 0

16:54:00.185469 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 1170139240, win 65535, options [mss 1380,nop,wscale 3,sackOK,TS val 35986 ecr 0], length 0

16:54:03.396744 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 1170139240, win 65535, options [mss 1380,nop,wscale 3,sackOK,TS val 39186 ecr 0], length 0

16:54:06.618718 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 1170139240, win 65535, options [mss 1380,sackOK,eol], length 0

16:54:09.846067 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 1170139240, win 65535, options [mss 1380,sackOK,eol], length 0

16:54:13.073922 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 1170139240, win 65535, options [mss 1380,sackOK,eol], length 0

16:54:19.326237 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 936514366, win 65535, options [mss 1380,sackOK,eol], length 0

16:54:31.603109 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 936514366, win 65535, options [mss 1380,sackOK,eol], length 0

16:54:55.892606 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 802356553, win 65535, options [mss 1380,sackOK,eol], length 0
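A quick way to confirm "requests arrive but nothing goes back" in a longer capture is to count packets per direction. The Python below is a minimal sketch of that idea; the function name direction_counts is hypothetical, and the pattern assumes tcpdump's "IP src > dst:" text layout with IPv4 or host-name endpoints.

```python
import re

# Captures the "src > dst:" endpoints of a tcpdump text line
ENDPOINTS = re.compile(r"IP (\S+) > (\S+): ")


def direction_counts(lines, server_endpoint):
    """Count packets going to and coming from server_endpoint (host.port)."""
    counts = {"to_server": 0, "from_server": 0}
    for line in lines:
        m = ENDPOINTS.search(line)
        if not m:
            continue
        src, dst = m.groups()
        if dst == server_endpoint:
            counts["to_server"] += 1
        elif src == server_endpoint:
            counts["from_server"] += 1
    return counts


SAMPLE = [
    "16:53:57.176963 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 1170139240, win 65535, options [mss 1380,nop,wscale 3,sackOK,TS val 32986 ecr 0], length 0",
    "16:54:00.185469 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 1170139240, win 65535, options [mss 1380,nop,wscale 3,sackOK,TS val 35986 ecr 0], length 0",
    "16:54:03.396744 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 1170139240, win 65535, options [mss 1380,nop,wscale 3,sackOK,TS val 39186 ecr 0], length 0",
]

print(direction_counts(SAMPLE, "DSAPP2.ncube-lm"))
# {'to_server': 3, 'from_server': 0}
```

A from_server count of zero on the server's own interface means the server never answered the SYNs, which narrows the search to the server host itself.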

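Initial diagnosis

Since the server received the packets, port 1521 (shown as ncube-lm in the captures) must already be open on the network firewall, so the problem lies on the database server itself. The server's listener.log also contains no connection requests from the client.

Final diagnosis:

An iptables firewall policy was enabled on the database server and blocked the client's connections. After the corresponding rule was added to iptables, access returned to normal.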

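Case 3: FTP transfers blocked after enabling Linux iptables

Symptom: the FTP connection is established normally, but no data can be transferred

Capture taken while FTP was failing; the data transfer uses the ftp-data port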

[root@stylog1 ~]# tcpdump -i bond0 host 192.168.9.37
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond0, link-type EN10MB (Ethernet), capture size 65535 bytes
10:48:10.171437 IP 192.168.9.37.55460 > 192.168.5.5.ftp: Flags [P.], seq 2473112340:2473112365, ack 2946208393, win 8064, length 25
10:48:10.171486 IP 192.168.5.5.ftp > 192.168.9.37.55460: Flags [.], ack 25, win 115, length 0

10:51:38.397111 IP 192.168.5.5.ftp-data > 192.168.9.37.55516: Flags [S], seq 2207620674, win 14600, options [mss 1460,sackOK,TS val 1965825832 ecr 0,nop,wscale 7], length 0
10:51:54.397107 IP 192.168.5.5.ftp-data > 192.168.9.37.55516: Flags [S], seq 2207620674, win 14600, options [mss 1460,sackOK,TS val 1965841832 ecr 0,nop,wscale 7], length 0
ftp-data uses port 20, and no firewall rule had been opened for this port
[root@stylog1 ~]# cat /etc/services |grep ftp-data
ftp-data        20/tcp
ftp-data        20/udp
ftp-data        20/sctp                 # FTP
kftp-data       6620/tcp                # Kerberos V5 FTP Data
kftp-data       6620/udp                # Kerberos V5 FTP Data
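The capture shows the server opening a connection from its port 20 (ftp-data) to a high port on the client, which is how FTP active mode works: the client announces an address and port with a PORT command on the control connection, and the server then connects back to it from port 20. The PORT command itself does not appear in the capture above, so the argument used below is constructed purely for illustration so that it decodes to the endpoint seen in the capture. A minimal Python sketch of the decoding (the function name parse_port_argument is hypothetical):

```python
def parse_port_argument(arg: str):
    """Decode the 'h1,h2,h3,h4,p1,p2' argument of an FTP PORT command.

    The port is transmitted as two bytes: port = p1 * 256 + p2.
    """
    fields = [int(x) for x in arg.strip().split(",")]
    if len(fields) != 6 or not all(0 <= f <= 255 for f in fields):
        raise ValueError("expected six comma-separated values in 0..255")
    ip = ".".join(str(f) for f in fields[:4])
    port = fields[4] * 256 + fields[5]
    return ip, port


# Illustrative argument, not taken from the capture
print(parse_port_argument("192,168,9,37,216,220"))   # ('192.168.9.37', 55516)
```

Because the client's data port changes with every transfer, a firewall rule for active-mode FTP generally has to cover connections originating from port 20 rather than a fixed destination port.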

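Case 4: slow business transactions after migrating the middleware server to a cloud platform (updated 2014-03-31)

Problem: after the server was migrated to the cloud platform, business tests were noticeably slower. A transaction took 1.2 seconds before the migration and 2.7 seconds afterwards.

Because this was a test environment, tcpdump was used to capture and analyze the traffic. It showed that a single transaction executes about 600 SQL statements.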

tcpdump -A -i eth0 -nn port 15701 and dst host 10.4.1.1|grep -i select
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
...................@select companyimp0_.compa0_.virtualHost as virtualH5_@6_0_, companyimp0_.mx as mx6_0_, companyimp0_.logoId as logoId6_=0_ from Company companyimp0_ where companyimp0_.companyId=:1 ............................T.......`
...................@select groupimpl0_.grou10_0_, groupimpl0_.name a@s name10_0_, groupimpl0_.description as descript9_10_0_, groupim@pl0_.type_ as type10_10_0_, groupimpl0_.typeSettings as typeSet1@1_10_0_, groupimpl0_.friendlyURL as friendl12_10_0_, groupimpl0_@.active_ as active13_10_0_ from Group_ groupimpl0_ where groupim.pl0_.groupId=:1 ............................T........
...
-- 580+ SQL statements in total
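The grep above only shows the matching lines; to get a count per statement type, the captured text can be tallied with a few lines of code. This is a minimal Python sketch that counts the first SQL verb found on each line of tcpdump -A output. The function name count_sql_verbs is hypothetical, and the count is only approximate because a statement split across several packets, or a line carrying several statements, is counted once per line.

```python
import re
from collections import Counter

# First SQL verb on a line, case-insensitive, as a whole word
SQL_VERB = re.compile(r"\b(select|insert|update|delete)\b", re.IGNORECASE)


def count_sql_verbs(lines):
    """Tally SQL statement types seen in ASCII packet dumps (one per line)."""
    counts = Counter()
    for line in lines:
        m = SQL_VERB.search(line)
        if m:
            counts[m.group(1).lower()] += 1
    return counts


sample = [
    "...................@select companyimp0_.mx as mx6_0_ from Company companyimp0_",
    "...................@SELECT groupimpl0_.name from Group_ groupimpl0_",
    "....update Company set mx = :1 where companyId = :2",
    "listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes",
]
print(count_sql_verbs(sample))
```

To use it on a live capture, pipe the tcpdump -A output into a script that passes sys.stdin to count_sql_verbs.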
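Before the migration both systems sat on the same network segment, and a single SQL round trip took about 1 ms. After the migration a firewall sits in between, and a round trip takes about 3 ms. With roughly 600 statements per transaction, the SQL round trips alone now cost about 600 × 3 ms = 1.8 s per transaction, compared with about 0.6 s before, which is about 1.2 s of added time. That accounts for most of the slowdown.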
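The estimate can be checked with simple arithmetic. The sketch below assumes the statements run one after another with one network round trip each, and uses the figures quoted in this case (600 statements, 1 ms before, 3 ms after); the function name added_latency_ms is made up for the example.

```python
def added_latency_ms(statements, rtt_before_ms, rtt_after_ms):
    """Extra time per transaction when every statement costs one round trip."""
    return statements * (rtt_after_ms - rtt_before_ms)


statements = 600
total_before = statements * 1      # 600 ms spent on round trips before
total_after = statements * 3       # 1800 ms spent on round trips after
extra = added_latency_ms(statements, 1, 3)

print(total_before, total_after, extra)   # 600 1800 1200
```

So the round trips cost about 1.8 s per transaction after the migration, of which about 1.2 s is new. The measured increase was 1.5 s (from 1.2 s to 2.7 s), so the extra per-statement latency explains most of it. Reducing the number of round trips, for example by batching statements, would attack the same term from the other side.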

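Note: because this was a test environment, the SQL could also have been captured with Oracle's own trace facility. tcpdump is sometimes a quick and convenient alternative, and it can be run on the client side.

Oracle SQL trace gives almost the same result: 583 user SQL statements in session.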

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       19      0.00       0.00          0          0          0           0
Execute     90      0.00       0.02          0          0          0           0
Fetch      116      0.01       0.06         21        338          0         127
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      225      0.01       0.09         21        338          0         127

Misses in library cache during parse: 11
Misses in library cache during execute: 11

  583  user  SQL statements in session.
   90  internal SQL statements in session.
  673  SQL statements in session.

