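tcpdump usage and case studies
tcpdump is one of the most effective tools for capturing network packets on Unix and Linux systems; Wireshark is a comparable tool that is commonly used on Windows. tcpdump can capture the headers of the packets crossing the network and present them for analysis. It supports filtering by network layer, protocol, host, network, or port, and provides the logical operators and, or, and not to discard traffic you do not need. tcpdump can also write captured packets to a file, which can then be filtered and analyzed further with Wireshark or with Java code. The command requires root privileges and by default puts the network interface into promiscuous mode.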
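1. Common tcpdump commands:
Listen on a specific interface
tcpdump -i bond0
Show packets exchanged with host 192.168.0.1
tcpdump host 192.168.0.1
Show packets with a given source address, destination address, and port
tcpdump src 192.168.1.100 and dst 192.168.1.2 and port ftp
Show UDP packets
tcpdump udp
Show packet contents (printed as ASCII)
tcpdump -A
Write the captured packets to a file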
tcpdump -w /tmp/tcpdump.cap
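The options above can be combined in one invocation. The following is a minimal sketch, assuming a POSIX shell; build_filter is a hypothetical helper name (it is not part of tcpdump) that only assembles the filter expression, so it can be checked without root. The -nn flag (no name resolution) and -r (read a saved capture) are standard tcpdump options.

```shell
# Build a tcpdump filter expression from a source host, a destination host,
# and a port. build_filter is a hypothetical helper, not a tcpdump feature.
build_filter() {
    printf 'src host %s and dst host %s and port %s\n' "$1" "$2" "$3"
}

# Usage (needs root; bond0 and the file path are taken from the examples above):
#   tcpdump -nn -i bond0 -w /tmp/tcpdump.cap "$(build_filter 192.168.1.100 192.168.1.2 ftp)"
# Read the saved capture back later, on this host or another one:
#   tcpdump -nn -r /tmp/tcpdump.cap
```

Passing the whole filter as one quoted argument keeps the shell from interpreting any part of the expression.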
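2. tcpdump case studies
tcpdump is useful not only for everyday network troubleshooting but also for analyzing database problems and tuning database access.
Case 1: a client (192.168.1.219) suddenly cannot reach a SQL Server database (192.168.15.14)
1. tcpdump on the client shows the same SYN being retransmitted without an answer. The packets captured with Wireshark on the Windows side show that the SQL Server host received the SYN and acknowledged it with a SYN, ACK (Ack=1), so the client never received the server's reply.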
10:51:21.102439 IP (tos 0x10, ttl 60, id 45670, offset 0, flags [DF], length:44) yytlc.50162 > 192.168.15.14.ms-sql-s: S [tcp sum ok] 616881461:616881461(0) win 65535 <mss 1460>
10:51:23.750271 IP (tos 0x10, ttl 60, id 45768, offset 0, flags [DF], length:44) yytlc.50162 > 192.168.15.14.ms-sql-s: S [tcp sum ok] 616881461:616881461(0) win 65535 <mss 1460>
10:51:29.943904 IP (tos 0x10, ttl 60, id 45971, offset 0, flags [none], length:44) yytlc.50162 > 192.168.15.14.ms-sql-s: S [tcp sum ok] 616881461:616881461(0) win 65535 <mss 1460>
10:51:42.045897 IP (tos 0x10, ttl 60, id 46849, offset 0, flags [none], length:44) yytlc.50162 > 192.168.15.14.ms-sql-s: S [tcp sum ok] 616881461:616881461(0) win 65535 <mss 1460>
14309 23.459236000 192.168.1.219 192.168.15.14 TCP 60 50162 > ms-sql-s [SYN] Seq=0 Win=65535 Len=0 MSS=1460
14310 23.459330000 192.168.15.14 192.168.1.219 TCP 58 ms-sql-s > 50162 [SYN, ACK] Seq=0 Ack=1 Win=8192 Len=0 MSS=1460
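One way to confirm that the four client-side SYN lines are retransmissions is to look at the spacing between them. The sketch below assumes tcpdump's default text output, where the first field is an HH:MM:SS.ffffff timestamp; packet_gaps is a hypothetical helper name, and it does not handle captures that cross midnight.

```shell
# Print the gap in seconds between consecutive packets, reading tcpdump
# text output on stdin. Only the first field (the timestamp) is used.
packet_gaps() {
    awk '{
        split($1, t, ":")
        now = t[1] * 3600 + t[2] * 60 + t[3]
        if (NR > 1) printf "%.3f\n", now - prev
        prev = now
    }'
}

# Usage: save the four SYN lines to a file (the name is an example), then
#   packet_gaps < syn_lines.txt
```

For the timestamps above the gaps are roughly 2.6 s, 6.2 s, and 12.1 s. Gaps that grow by about a factor of two each time are what a TCP stack typically produces when it keeps retrying an unanswered SYN.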
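2. Why did the reply never arrive? Check the path with tracert from the Windows side.
C:\Users\Administrator>tracert 192.168.1.219
Tracing route to 192.168.1.219 over a maximum of 30 hops
1 1 ms 1 ms 1 ms 192.168.15.30
2 <1 ms <1 ms <1 ms 192.168.15.36
3 1 ms 1 ms 1 ms 192.168.208.106
4 1 ms 1 ms 1 ms 192.168.215.137
5 1 ms 1 ms 1 ms 192.168.212.245
6 1 ms <1 ms <1 ms 192.168.212.246
7 1 ms 1 ms 1 ms 192.168.212.241
8 1 ms 1 ms 1 ms 192.168.248.241
9 1 ms 1 ms 1 ms 192.168.249.98
10 2 ms 5 ms 1 ms 192.168.1.219
Trace complete.
3. A traceroute from the Linux side does not get through. The database server received the request packets and sent replies, but the client never received them, which means the return packets were lost on the way back. The preliminary diagnosis is a routing problem.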
yytlc:/#>traceroute 192.168.15.14
trying to get source for 192.168.15.14
source should be 192.168.1.219
traceroute to 192.168.15.14 (192.168.15.14) from 192.168.1.219 (192.168.1.219), 30 hops max
outgoing MTU = 1500
1 192.168.1.217 (192.168.1.217) 4 ms 2 ms 6 ms
2 192.168.47.220 (192.168.47.220) 0 ms 1 ms 6 ms
3 192.168.253.41 (192.168.253.41) 8 ms 8 ms 8 ms
4 * * *
5 * * *
6 * * *
........
Capture on the client while traceroute was running:
12:08:49.834285 IP yytlc.61860 > 192.168.15.14.33456: udp 1472
12:08:55.834091 IP yytlc.61860 > 192.168.15.14.33457: udp 1472
12:09:00.835624 IP yytlc.61860 > 192.168.15.14.33458: udp 1472
At the same time, the Wireshark capture on the Windows side shows that the UDP probes did arrive:
11539 47.422984000 192.168.1.219 192.168.15.14 UDP 1514 Source port: 61860 Destination port: 33457
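When comparing captures from both ends like this, it helps to restrict the capture to connection-setup packets. The sketch below only builds the filter string; handshake_filter is a hypothetical helper name, port 1433 is the number registered for ms-sql-s, and the tcp[tcpflags] test on the SYN bit is standard pcap filter syntax. It matches both SYN and SYN, ACK packets.

```shell
# Build a filter that keeps only packets with the SYN bit set (SYN and
# SYN, ACK) for one host and TCP port. handshake_filter is a hypothetical
# helper, not a tcpdump feature.
handshake_filter() {
    printf 'host %s and tcp port %s and tcp[tcpflags] & tcp-syn != 0\n' "$1" "$2"
}

# Usage on the client (needs root; replace IFACE with the real interface):
#   tcpdump -nn -i IFACE "$(handshake_filter 192.168.15.14 1433)"
```

If the client sees only its own SYNs while the server side sees SYN, ACK packets leaving, the loss is on the return path.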
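4. With help from the network engineers, the cause was traced to a faulty route on a Juniper router, which kept the return packets from being delivered.
Case 2: a sqlplus client cannot connect to an Oracle database and fails with ORA-12537
Symptom: the connection attempt fails with an error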
[oracle@localhost ~]$ sqlplus u/p@SMPDB
SQL*Plus: Release 11.2.0.2.0 Production on Mon Nov 25 14:32:45 2013
Copyright (c) 1982, 2010, Oracle. All rights reserved.
ERROR:
ORA-12537: TNS:connection closed
Capture on the client: reply packets were received, but the connection was then closed
[root@localhost ~]# tcpdump -i eth0 host 192.168.3.220
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
16:48:07.048525 IP 192.168.1.45.38405 > 192.168.3.220.ncube-lm: S 2870102332:2870102332(0) win 5840 <mss 1460,sackOK,timestamp 443389148 0,nop,wscale 7>
16:48:07.048872 IP 192.168.3.220.ncube-lm > 192.168.1.45.38405: S 2343325666:2343325666(0) ack 2870102333 win 65535 <mss 1460,nop,wscale 3,sackOK,timestamp 32985 443389148>
16:48:07.048882 IP 192.168.1.45.38405 > 192.168.3.220.ncube-lm: . ack 1 win 46 <nop,nop,timestamp 443389149 32985>
16:48:07.049044 IP 192.168.1.45.38405 > 192.168.3.220.ncube-lm: P 1:225(224) ack 1 win 46 <nop,nop,timestamp 443389149 32985>
16:48:07.049145 IP 192.168.3.220.ncube-lm > 192.168.1.45.38405: . ack 225 win 8298 <nop,nop,timestamp 32986 443389149>
16:49:07.370802 IP 192.168.3.220.ncube-lm > 192.168.1.45.38405: F 1:1(0) ack 225 win 8298 <nop,nop,timestamp 92987 443389149>
16:49:07.370888 IP 192.168.1.45.38405 > 192.168.3.220.ncube-lm: . ack 2 win 46 <nop,nop,timestamp 443449471 92987>
16:49:07.371014 IP 192.168.1.45.38405 > 192.168.3.220.ncube-lm: F 225:225(0) ack 2 win 46 <nop,nop,timestamp 443449471 92987>
16:49:07.371121 IP 192.168.3.220.ncube-lm > 192.168.1.45.38405: . ack 226 win 8297 <nop,nop,timestamp 92987 443449471>
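Capture on the database server: only the incoming requests are seen, with no reply packets. (This contradicts the client-side capture, where replies were received; the exact reason is still not understood.)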
16:53:57.176963 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 1170139240, win 65535, options [mss 1380,nop,wscale 3,sackOK,TS val 32986 ecr 0], length 0
16:54:00.185469 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 1170139240, win 65535, options [mss 1380,nop,wscale 3,sackOK,TS val 35986 ecr 0], length 0
16:54:03.396744 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 1170139240, win 65535, options [mss 1380,nop,wscale 3,sackOK,TS val 39186 ecr 0], length 0
16:54:06.618718 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 1170139240, win 65535, options [mss 1380,sackOK,eol], length 0
16:54:09.846067 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 1170139240, win 65535, options [mss 1380,sackOK,eol], length 0
16:54:13.073922 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 1170139240, win 65535, options [mss 1380,sackOK,eol], length 0
16:54:19.326237 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 936514366, win 65535, options [mss 1380,sackOK,eol], length 0
16:54:31.603109 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 936514366, win 65535, options [mss 1380,sackOK,eol], length 0
16:54:55.892606 IP 192.168.1.45.38405 > DSAPP2.ncube-lm: Flags [S], seq 802356553, win 65535, options [mss 1380,sackOK,eol], length 0
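Initial diagnosis:
Since the server received the packets, port 1521 is already open on the network firewall, so the problem is on the database server itself. The server's listener.log shows no connection requests from the client either.
Final diagnosis:
The iptables firewall was enabled on the database server and was blocking the client. After a rule allowing the traffic was added to iptables, connections worked normally.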
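The exact rule that was added is not recorded here. As a hedged illustration only, a rule that accepts inbound TCP on the listener port could look like the one produced below; allow_tcp_port_rule is a hypothetical helper that prints the command for review instead of running it, and a real deployment would probably restrict the source address as well.

```shell
# Print (do not execute) an iptables command that accepts inbound TCP
# traffic to the given port. The rule is an assumption, not the rule
# actually used in this case.
allow_tcp_port_rule() {
    printf 'iptables -I INPUT -p tcp --dport %s -j ACCEPT\n' "$1"
}

# Inspect the current INPUT chain first (needs root):
#   iptables -L INPUT -n --line-numbers
# Then review the generated command before running it by hand:
#   allow_tcp_port_rule 1521
```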
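Case 3: FTP stops working after Linux iptables is enabled
Symptom: the FTP connection is established normally, but no data can be transferred
Capture while FTP is failing; the data transfer uses the ftp-data port
[root@stylog1 ~]# tcpdump -i bond0 host 192.168.9.37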
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond0, link-type EN10MB (Ethernet), capture size 65535 bytes
10:48:10.171437 IP 192.168.9.37.55460 > 192.168.5.5.ftp: Flags [P.], seq 2473112340:2473112365, ack 2946208393, win 8064, length 25
10:48:10.171486 IP 192.168.5.5.ftp > 192.168.9.37.55460: Flags [.], ack 25, win 115, length 0
10:51:38.397111 IP 192.168.5.5.ftp-data > 192.168.9.37.55516: Flags [S], seq 2207620674, win 14600, options [mss 1460,sackOK,TS val 1965825832 ecr 0,nop,wscale 7], length 0
10:51:54.397107 IP 192.168.5.5.ftp-data > 192.168.9.37.55516: Flags [S], seq 2207620674, win 14600, options [mss 1460,sackOK,TS val 1965841832 ecr 0,nop,wscale 7], length 0
ftp-data uses port 20, and no firewall rule had been opened for that port
[root@stylog1 ~]# cat /etc/services |grep ftp-data
ftp-data 20/tcp
ftp-data 20/udp
ftp-data 20/sctp # FTP
kftp-data 6620/tcp # Kerberos V5 FTP Data
kftp-data 6620/udp # Kerberos V5 FTP Data
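The pattern in this case, and in case 2, is a SYN that is repeated and never answered. The sketch below scans tcpdump text output in the newer "Flags [S]" format and lists connection attempts for which no SYN, ACK appears in the same capture. unanswered_syns is a hypothetical helper name, and the field positions assume the default one-line output without -v or -e.

```shell
# List "src > dst count" for SYNs that have no matching SYN, ACK in the
# capture text read from stdin.
unanswered_syns() {
    awk '$6 == "Flags" {
        src = $3
        dst = $5; sub(/:$/, "", dst)
        if ($7 == "[S],")  syn[src " > " dst]++
        if ($7 == "[S.],") answered[dst " > " src] = 1
    }
    END {
        for (k in syn) if (!(k in answered)) printf "%s %d\n", k, syn[k]
    }'
}

# Usage (needs root; /tmp/ftp.cap is an example file name):
#   tcpdump -nn -i bond0 -w /tmp/ftp.cap host 192.168.9.37
#   tcpdump -r /tmp/ftp.cap | unanswered_syns
```

Run against the capture above, this reports the two unanswered SYNs from 192.168.5.5.ftp-data to 192.168.9.37.55516, which points at the data port rather than the control connection.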
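Case 4: business transactions become slow after a middleware server is migrated to a cloud platform (updated 2014-03-31)
Problem: after the server was migrated to the cloud platform, business transactions were clearly slower in testing: 1.2 seconds before the migration, 2.7 seconds after it
Because this was a test environment, tcpdump was used to capture and analyze the traffic. It showed that a single business transaction executes about 600 SQL statements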
tcpdump -A -i eth0 -nn port 15701 and dst host 10.4.1.1|grep -i select
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
...................@select companyimp0_.compa0_.virtualHost as virtualH5_@6_0_, companyimp0_.mx as mx6_0_, companyimp0_.logoId as logoId6_=0_ from Company companyimp0_ where companyimp0_.companyId=:1 ............................T.......`
...................@select groupimpl0_.grou10_0_, groupimpl0_.name a@s name10_0_, groupimpl0_.description as descript9_10_0_, groupim@pl0_.type_ as type10_10_0_, groupimpl0_.typeSettings as typeSet1@1_10_0_, groupimpl0_.friendlyURL as friendl12_10_0_, groupimpl0_@.active_ as active13_10_0_ from Group_ groupimpl0_ where groupim.pl0_.groupId=:1 ............................T........
...
-- more than 580 SQL statements in total
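The statement count can be obtained without reading the output by eye. The sketch below counts output lines that contain "select", case-insensitively; count_select_lines is a hypothetical helper name. It counts lines rather than statements, so a statement split across packets, or two statements on one line, will skew the number slightly.

```shell
# Count lines on stdin that contain "select", ignoring case.
count_select_lines() {
    grep -ci 'select'
}

# Usage: capture one business transaction to a file, then count offline
# (needs root for the capture; /tmp/app.cap is an example file name):
#   tcpdump -i eth0 -nn -w /tmp/app.cap port 15701 and dst host 10.4.1.1
#   tcpdump -A -nn -r /tmp/app.cap | count_select_lines
```

Reading from a saved file lets the pipeline finish on its own, which a live capture piped into grep -c would not do until tcpdump exits.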
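Before the migration the systems were on the same network segment and a single SQL round trip took about 1 ms. After the migration there is an extra firewall in the path and a round trip takes about 3 ms. With about 600 SQL statements per transaction, that is roughly 1.8 s of round trips instead of about 0.6 s, which accounts for most of the slowdown.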
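The arithmetic can be checked directly. This is a rough sketch that assumes the statements run strictly one after another and that the 1 ms and 3 ms figures are averages; total_rtt_ms is a hypothetical helper name.

```shell
# Total round-trip time in milliseconds for a number of sequential
# statements: total_rtt_ms STATEMENTS RTT_MS
total_rtt_ms() {
    echo $(( $1 * $2 ))
}

before=$(total_rtt_ms 600 1)   # same segment, about 1 ms per statement
after=$(total_rtt_ms 600 3)    # through the firewall, about 3 ms per statement
echo "before: ${before} ms, after: ${after} ms, added: $(( after - before )) ms"
```

The 1200 ms of added round-trip time is close to the measured increase from 1.2 s to 2.7 s, so reducing the number of statements per transaction matters more here than tuning any single query.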
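Note: because this was a test environment, the SQL could also have been captured with the trace facility that Oracle provides. tcpdump is sometimes a quick and convenient alternative, and it can be run on the client side.
Oracle SQL trace gives almost the same result: 583 user SQL statements in session.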
OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 19 0.00 0.00 0 0 0 0
Execute 90 0.00 0.02 0 0 0 0
Fetch 116 0.01 0.06 21 338 0 127
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 225 0.01 0.09 21 338 0 127
Misses in library cache during parse: 11
Misses in library cache during execute: 11
583 user SQL statements in session.
90 internal SQL statements in session.
673 SQL statements in session.