JVM工具之Arthas使用
写在前面
源码 。
Arthas是淘宝开源的JVM问题定位工具,本文一起来看下。
1:准备
arthas是一个可执行的jar包,可以从这里 下载,运行jar包后会列出所有的java进程,用户可以选择附加到哪个java进程,进而就可以使用arthas提供的各种命令来定位问题和辅助性能优化了。如下开始使用操作步骤:
在arthas的交互命令行中就可以执行其提供的命令来进行操作了,下面一起来看下。
2:命令的使用
2.1:thread
thread命令用来查看线程信息,如线程的名称,所属组,优先级,占用CPU,运行状态等,如下可能结果:
[arthas@24632]$ thread
thread
Threads Total: 34, NEW: 0, RUNNABLE: 10, BLOCKED: 0, WAITING: 4, TIMED_WAITING:
3, TERMINATED: 0, Internal threads: 17
ID NAME GROUP PRIORI STATE %CPU DELTA TIME INTER DAEMON
12 useManyCpuThread main 5 RUNNA 101.28 0.203 0:33.1 false false
2 Reference Handler system 10 WAITI 0.0 0.000 0:0.00 false true
3 Finalizer system 8 WAITI 0.0 0.000 0:0.01 false true
4 Signal Dispatcher system 9 RUNNA 0.0 0.000 0:0.00 false true
5 Attach Listener system 5 RUNNA 0.0 0.000 0:0.01 false true
15 arthas-timer system 5 WAITI 0.0 0.000 0:0.00 false true
17 Keep-Alive-Timer system 8 TIMED 0.0 0.000 0:0.00 false true
18 arthas-NettyHttpTel system 5 RUNNA 0.0 0.000 0:0.01 false true
19 arthas-NettyWebsock system 5 RUNNA 0.0 0.000 0:0.00 false true
20 arthas-NettyWebsock system 5 RUNNA 0.0 0.000 0:0.00 false true
21 arthas-shell-server system 5 TIMED 0.0 0.000 0:0.00 false true
22 arthas-session-mana system 5 TIMED 0.0 0.000 0:0.00 false true
23 arthas-UserStat system 5 WAITI 0.0 0.000 0:0.00 false true
下面我们来模拟下不同场景下使用该命令来定位问题。
2.1.1:定位CPU彪高问题
CPU彪高一般就是有大量的计算工作在进行,而通过定位到具体的进程就能辅助找到问题代码,测试代码如下:
public class ManyCPUInThread {
public static void main(String[] args) {
new Thread(() -> {
while (true) {
Md5Utils.getMD5(UUID.randomUUID().toString().getBytes());
Md5Utils.getMD5(UUID.randomUUID().toString().getBytes());
Md5Utils.getMD5(UUID.randomUUID().toString().getBytes());
Md5Utils.getMD5(UUID.randomUUID().toString().getBytes());
Md5Utils.getMD5(UUID.randomUUID().toString().getBytes());
}
}, "useManyCpuThread").start();
}
}
其中线程名称是useManyCpuThread
,这将有助于我们定位导致问题的线程,根据阿里巴巴java规范手册 也应该这么做,如下图:
运行测试程序,突然发现CPU彪高,如下图:
我们通过如下步骤使用thread
命令来定位问题:
- 依附到Java进程
$ java -jar arthas-boot.jar
[INFO] arthas-boot version: 3.6.0
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 11104 dongshi.daddy.ManyCPUInThread
...
1
[INFO] arthas home: C:\Users\Administrator\.arthas\lib\3.6.0\arthas
[INFO] Try to attach process 11104
[INFO] Attach process 11104 success.
[INFO] arthas-client connect 127.0.0.1 3658
,---. ,------. ,--------.,--. ,--. ,---. ,---.
/ O \ | .--. ''--. .--'| '--' | / O \ ' .-'
| .-. || '--'.' | | | .--. || .-. |`. `-.
| | | || |\ \ | | | | | || | | |.-' |
`--' `--'`--' '--' `--' `--' `--'`--' `--'`-----'
...
time 2022-04-21 13:50:48
[arthas@11104]$
- thread
[arthas@11104]$ thread
thread
Threads Total: 33, NEW: 0, RUNNABLE: 10, BLOCKED: 0, WAITING: 4, TIMED_WAITING:
2, TERMINATED: 0, Internal threads: 17
ID NAME GROUP PRIORI STATE %CPU DELTA TIME INTER DAEMON
12 useManyCpuThread main 5 RUNNA 107.63 0.218 6:54.7 false false
26 arthas-command-exec system 5 RUNNA 7.69 0.015 0:0.01 false true
2 Reference Handler system 10 WAITI 0.0 0.000 0:0.01 false true
3 Finalizer system 8 WAITI 0.0 0.000 0:0.01 false true
4 Signal Dispatcher system 9 RUNNA 0.0 0.000 0:0.00 false true
...
可以看到行12 useManyCpuThread main 5 RUNNA 107.63 0.218 6:54.7 false false
,名称为useManyCpuThread
的线程占用了107.63的CPU,然后就可以定位到具体代码,查找CPU占用多的原因了。
2.2:jvm
查看jvm相关信息,如进程号,内存大小,GC耗时,总加载类个数,总线程个数,等,如下:
jvm
RUNTIME
-------------------------------------------------------------------------------
MACHINE-NAME 11104@jhp
JVM-START-TIME 2022-04-21 13:45:26
MANAGEMENT-SPEC-VERSI 1.2
ON
SPEC-NAME Java Virtual Machine Specification
SPEC-VENDOR Oracle Corporation
SPEC-VERSION 1.8
VM-NAME Java HotSpot(TM) 64-Bit Server VM
VM-VENDOR Oracle Corporation
2.3:trace
在日常工作中经常会有用户反馈某个接口慢
,但是一般情况下我们是没有在方法中添加对应的耗时日志的,此时定位到底因为什么慢就不太好办了,arthas的trace命令针对这种场景给出了解决方案,允许我们动态的追踪某个方法的执行耗时,从而辅助定位性能问题,下面我们一起来看下。
2.3.1:命令格式
trace 类名称 方法名称
,如trace dongshi.daddy.MathGame run
,就是追踪dongshi.daddy.MathGame类的run方法。
2.3.2:测试程序
public class MathGame {
private static Random random = new Random();
private int illegalArgumentCount = 0;
public static void main(String[] args) throws InterruptedException {
MathGame game = new MathGame();
while (true) {
game.run();
TimeUnit.SECONDS.sleep(1L);
}
}
public void run() throws InterruptedException {
try {
int number = random.nextInt() / 10000;
List<Integer> primeFactors = primeFactors(number);
print(number, primeFactors);
} catch (Exception e) {
System.out.println(String.format("illegalArgumentCount:%3d, ", new Object[]{Integer.valueOf(this.illegalArgumentCount)}) + e.getMessage());
}
}
public static void print(int number, List<Integer> primeFactors) {
StringBuffer sb = new StringBuffer(number + "=");
for (Iterator<Integer> iterator = primeFactors.iterator(); iterator.hasNext(); ) {
int factor = ((Integer) iterator.next()).intValue();
sb.append(factor).append('*');
}
if (sb.charAt(sb.length() - 1) == '*')
sb.deleteCharAt(sb.length() - 1);
System.out.println(sb);
}
public List<Integer> primeFactors(int number) {
if (number < 2) {
this.illegalArgumentCount++;
throw new IllegalArgumentException("number is: " + number + ", need >= 2");
}
List<Integer> result = new ArrayList<Integer>();
int i = 2;
while (i <= number) {
if (number % i == 0) {
result.add(Integer.valueOf(i));
number /= i;
i = 2;
continue;
}
i++;
}
return result;
}
private void primeFactors111() {
System.out.println("xxxxxxxxxxxxxxx");
}
}
该类生成一个正整数,然后打印正整数的因子乘积的形式。运行后,继续看2.3.3:trace测试
。
2.3.3:trace测试
[arthas@20784]$ trace dongshi.daddy.MathGame run -n 2
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 70 ms, listenerId: 1
`---ts=2022-04-21 17:54:23;thread_name=main;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
`---[0.5033ms] dongshi.daddy.MathGame:run()
+---[0.0348ms] dongshi.daddy.MathGame:primeFactors() #25
`---[0.1011ms] dongshi.daddy.MathGame:print() #26
`---ts=2022-04-21 17:54:24;thread_name=main;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
`---[0.2798ms] dongshi.daddy.MathGame:run()
+---[0.1041ms] dongshi.daddy.MathGame:primeFactors() #25
`---[0.0942ms] dongshi.daddy.MathGame:print() #26
Command execution times exceed limit: 2, so command will exit. You can set it with -n option.
打印2次,实际情况可自行调整,默认是打印100次,以第二次输出为例---[0.2798ms] dongshi.daddy.MathGame:run()
,run方法总耗时0.2798ms
,+---[0.0348ms] dongshi.daddy.MathGame:primeFactors() #25
,primeFactors方法耗时0.0348ms
,代码行数为25
,---[0.0942ms] dongshi.daddy.MathGame:print() #26
,print方法耗时0.0942ms
,代码行数为26
。
总耗时大于每个方法的耗时总和,这是因为总耗时包含了arthas添加逻辑的耗时,但是对我们大概定位耗时程序影响不大。
- 显示jdk相关方法调用
默认是不显示jdk的方法调用信息的,可以设置--skipJDKMethod false
,如下:
[arthas@20784]$ trace --skipJDKMethod false dongshi.daddy.MathGame run -n1
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 24 ms, listenerId: 3
`---ts=2022-04-21 18:18:46;thread_name=main;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
`---[0.5441ms] dongshi.daddy.MathGame:run()
+---[0.032ms] java.util.Random:nextInt() #24
+---[0.2704ms] dongshi.daddy.MathGame:primeFactors() #25
`---[0.1162ms] dongshi.daddy.MathGame:print() #26
Command execution times exceed limit: 1, so command will exit. You can set it with -n option.
- 根据耗时过滤
如果我们只关心耗时到达一定时长的方法调用,则可以添加#cost>毫秒数
,如#cost>20
,过滤耗时大于20ms的方法调用,如下测试:
[arthas@19320]$ trace dongshi.daddy.MathGame run -n2 '#cost>20'
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 73 ms, listenerId: 1
`---ts=2022-04-21 18:33:53;thread_name=main;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
`---[1000.4337ms] dongshi.daddy.MathGame:run()
+---[999.895ms] dongshi.daddy.MathGame:primeFactors() #25
`---[0.122ms] dongshi.daddy.MathGame:print() #26
`---ts=2022-04-21 18:33:55;thread_name=main;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
`---[1015.0411ms] dongshi.daddy.MathGame:run()
+---[1014.9108ms] dongshi.daddy.MathGame:primeFactors() #25
`---[0.0678ms] dongshi.daddy.MathGame:print() #26
Command execution times exceed limit: 2, so command will exit. You can set it with -n option.
注意是必须其中有一个方法耗时大于20ms才会显示该次方法调用链路,是显示完整的调用链路,而不是只显示满足耗时时长的方法调用。
2.3.4:真实场景实战
现在有个接口dongshi.daddy.action.JFormAction#canMoveToApplication
,执行时长超过100ms
,不符合接口要求,需要排查到底哪个方法执行慢,按照如下步骤操作:
[arthas@12784]$ trace dongshi.daddy.action.JFormAction canMoveToApplication -n5
Press Q or Ctrl+C to abort.
Affect(class count: 2 , method count: 2) cost in 222 ms, listenerId: 2
`---ts=2022-04-21 18:50:17;thread_name=http-bio-446-exec-11;id=31;is_daemon=true;priority=5;TCCL=org.apache.catalina.loader.WebappClassLoader@4d62a1fd
`---[115.7377ms] dongshi.daddy.action.JFormAction$$EnhancerByCGLIB$$4d98ee8e:canMoveToApplication()
`---[115.5969ms] org.springframework.cglib.proxy.MethodInterceptor:intercept()
`---[115.5403ms] dongshi.daddy.action.JFormAction:canMoveToApplication()
+---[0.0239ms] javax.servlet.http.HttpServletRequest:getParameter() #1006
+---[0.7267ms] org.slf4j.Logger:info() #1007
+---[14.1981ms] dongshi.daddy.action.JFormAction:getSessionUserInfo2() #1010
+---[0.0147ms] org.apache.commons.lang3.StringUtils:isBlank() #1018
+---[0.0131ms] javax.servlet.http.HttpServletRequest:getParameter() #1019
+---[0.0048ms] org.apache.commons.lang3.StringUtils:isBlank() #1020
+---[1.0612ms] dongshi.daddy.useformextract.factory.FormExtractorFactory:produceFormExtractor() #1021
+---[0.0531ms] dongshi.daddy.framework.web.bean.UserInfo:getDeptId() #1021
+---[0.0266ms] dongshi.daddy.action.JFormAction$1:<init>() #1021
+---[95.032ms] dongshi.daddy.useformextract.UseFormExtractor:extract() #1021
+---[1.6599ms] org.slf4j.Logger:info() #1028
+---[0.1656ms] com.alibaba.fastjson.JSON:toJSONString() #1029
`---[2.1009ms] dongshi.daddy.action.JFormAction:renderText() #1029
通过+---[95.032ms] dongshi.daddy.useformextract.UseFormExtractor:extract() #1021
可以看到,是1021行的UseFormExtractor:extract()耗时多,继续看:
[arthas@12784]$ trace dongshi.daddy.useformextract.UseFormExtractor extract
Press Q or Ctrl+C to abort.
Affect(class count: 5 , method count: 1) cost in 282 ms, listenerId: 4
`---ts=2022-04-21 18:56:54;thread_name=http-bio-446-exec-14;id=34;is_daemon=true;priority=5;TCCL=org.apache.catalina.loader.WebappClassLoader@4d62a1fd
`---[24.1424ms] dongshi.daddy.useformextract.BaseUseFormExtractor:extract()
+---[0.2228ms] org.slf4j.Logger:info() #21
+---[0.0169ms] dongshi.daddy.useformextract.BaseUseFormExtractor:beforeExtract() #22
+---[83.2874ms] dongshi.daddy.useformextract.BaseUseFormExtractor:doExtract() #23
+---[0.2633ms] org.slf4j.Logger:info() #24
+---[0.015ms] dongshi.daddy.useformextract.BaseUseFormExtractor:afterExtract() #25
`---[0.2138ms] org.slf4j.Logger:info() #26
+---[83.2874ms] dongshi.daddy.useformextract.BaseUseFormExtractor:doExtract() #23
可以看出是23行的BaseUseFormExtractor:doExtract(),继续看:
[arthas@12784]$ trace dongshi.daddy.useformextract.BaseUseFormExtractor doExtract -n1
Press Q or Ctrl+C to abort.
Affect(class count: 4 , method count: 4) cost in 400 ms, listenerId: 6
`---ts=2022-04-22 09:57:31;thread_name=http-bio-446-exec-19;id=39;is_daemon=true;priority=5;TCCL=org.apache.catalina.loader.WebappClassLoader@4d62a1fd
`---[45.2156ms] dongshi.daddy.useformextract.PhoneUseFormExtractor:doExtract()
+---[0.2457ms] org.slf4j.Logger:info() #26
+---[56.2093ms] dongshi.daddy.useformextract.BaseUseFormExtractor:queryDesignDetailByFormId() #28
+---[0.0828ms] org.apache.commons.lang.StringUtils:isBlank() #29
+---[5.2509ms] dongshi.daddy.useformextract.BaseUseFormExtractor:queryFieldsInMblConfig() #31
+---[0.046ms] dongshi.daddy.useformextract.BaseUseFormExtractor:queryTargetField() #32
+---[4.3191ms] dongshi.daddy.useformextract.BaseUseFormExtractor:queryFieldsInConfig() #36
+---[0.0421ms] dongshi.daddy.useformextract.BaseUseFormExtractor:queryByJfType() #39
+---[0.2355ms] dongshi.daddy.useformextract.BaseUseFormExtractor:queryByJfType() #61
+---[2.4586ms] dongshi.daddy.useformextract.PhoneUseFormExtractor:installJumpSetting() #62
+---[0.6199ms] dongshi.daddy.useformextract.PhoneUseFormExtractor:installDatasource() #64
+---[0.0262ms] dongshi.daddy.useformextract.BaseUseFormExtractor:queryByJfType() #66
+---[0.0103ms] dongshi.daddy.useformextract.PhoneUseFormExtractor:installDatasource() #67
+---[0.0207ms] dongshi.daddy.useformextract.BaseUseFormExtractor:queryByJfType() #69
`---[0.7863ms] org.slf4j.Logger:info() #99
+---[56.2093ms] dongshi.daddy.useformextract.BaseUseFormExtractor:queryDesignDetailByFormId() #28
处可以看出调用的是BaseUseFormExtractor:queryDesignDetailByFormId()
方法,该方法源码如下:
class FakeCls {
protected String queryDesignDetailByFormId(String formId) {
IJFormDao jFormDao = SpringContextUtil.getBean("jFormDao");
String designDetail = jFormDao.getDesignDetailById(formId);
return designDetail;
}
}
继续看jFormDao.getDesignDetailById(formId)
:
class FakeCls {
@Override
public String getDesignDetailById(String formId) {
String sql = "select name from user where id = '"+ xxx +"' and version = '1'";
return this.findSqlUnique(sql);
}
}
就能知道是因为sql语句select name from user where id = '"+ xxx +"' and version = '1'
,接下来就是查看该sql语句为什么慢了。
2.4:sc
通过sc -d 类名
可以查看类的信息,如类名称,修饰符,是否接口,是否枚举,父类等信息,如下:
[arthas@1548]$ sc -d dongshi.daddy.MathGame
class-info dongshi.daddy.MathGame
code-source /E:/workspace-idea/dongshidaddy-labs-new/myjvm/arthas/target/classes/
name dongshi.daddy.MathGame
isInterface false
isAnnotation false
isEnum false
isAnonymousClass false
isArray false
isLocalClass false
isMemberClass false
isPrimitive false
isSynthetic false
simple-name MathGame
modifier public
annotation
interfaces
super-class +-java.lang.Object
class-loader +-sun.misc.Launcher$AppClassLoader@18b4aac2
+-sun.misc.Launcher$ExtClassLoader@14ae5a5
classLoaderHash 18b4aac2
Affect(row-cnt:1) cost in 3 ms.
查看加载类:
[arthas@1548]$ sc dongshi.*
dongshi.daddy.MathGame
Affect(row-cnt:1) cost in 6 ms.
查看类字段:
[arthas@1548]$ sc -d -f dongshi.daddy.MathGame
class-info dongshi.daddy.MathGame
...
classLoaderHash 18b4aac2
fields name random
type java.util.Random
modifier private,static
value java.util.Random@73c46669
name illegalArgumentCount
type int
modifier private
2.5:jad
直接将jvm正在运行的class的字节码byte code反编译为Java 代码,这在运行环境出现问题,而又怀疑是因为代码没有部署成功造成时很有用,在知道arthas该工具之前我的做法如下一般是首先将线上的class下载到本地,然后使用jd-gui 工具反编译查看,但是如果是使用jad就简单的多了,只需要一条命令即可:
2.5.1:查看源码
[arthas@1548]$ jad --source-only dongshi.daddy.MathGame
/*
* Decompiled with CFR.
*/
package dongshi.daddy;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.TimeUnit;
public class MathGame {
private static Random random = new Random();
private int illegalArgumentCount = 0;
public static void main(String[] args) throws InterruptedException {
MathGame game = new MathGame();
while (true) {
/*17*/ game.run();
/*18*/ TimeUnit.SECONDS.sleep(1L);
}
}
// 省略部分
public List<Integer> primeFactors(int number) {
try {
/*45*/ Thread.sleep(1000L);
}
// 省略部分
/*64*/ return result;
}
}
需要注意,反编译的代码和源码在行号上并非完全一致,但是会在前面通过/*源码行号*/来标记实际的行号。
2.5.2:反编译指定的方法
[arthas@1548]$ jad --source-only dongshi.daddy.MathGame print
public static void print(int number, List<Integer> primeFactors) {
StringBuffer sb = new StringBuffer(number + "=");
/*34*/ for (int factor : primeFactors) {
/*36*/ sb.append(factor).append('*');
}
/*38*/ if (sb.charAt(sb.length() - 1) == '*') {
/*39*/ sb.deleteCharAt(sb.length() - 1);
}
/*40*/ System.out.println(sb);
}
2.6:watch
方法执行数据观测,让你能方便的观察到指定方法的调用情况。能观察到的范围为:返回值、抛出异常、入参,通过编写 OGNL 表达式进行对应变量的查看。
2.6.1:方法执行之前打印信息
[arthas@1548]$ watch dongshi.daddy.MathGame primeFactors -b -n1
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 16 ms, listenerId: 4
method=dongshi.daddy.MathGame.primeFactors location=AtEnter
ts=2022-04-22 13:56:46; [cost=0.0579ms] result=@ArrayList[
@Object[][isEmpty=false;size=1],
@MathGame[dongshi.daddy.MathGame@49c2faae],
null,
]
-b:表示方法执行之前打印信息。
-n1:表示打印一次。
location=AtEnter:表示位置是方法进入前。
result=@ArrayList[:表示方法的返回值是ArrayList。
@Object[][isEmpty=false;size=1]:表示有1个不为空的入参。
@MathGame[dongshi.daddy.MathGame@49c2faae],:表示this对象。
null,:表示返回值,因为是方法调用前,所以是null。
看到入参和this信息展示的信息不够详细,可以通过设置-x 2
来设置属性遍历深度,如下:
[arthas@1548]$ watch dongshi.daddy.MathGame primeFactors -b -n1 -x2
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 18 ms, listenerId: 5
method=dongshi.daddy.MathGame.primeFactors location=AtEnter
ts=2022-04-22 14:09:27; [cost=0.1004ms] result=@ArrayList[
@Object[][
@Integer[-52359],
],
@MathGame[
random=@Random[java.util.Random@73c46669],
illegalArgumentCount=@Integer[3317],
],
null,
]
可以看到入参,this展示的更加详细了,其中入参是一个Integer,this有两个属性random和illegalArgumentCount。如果是我们不关心this,即不想要返回this信息,可以增加参数"{params,returnObj}"
,如下:
[arthas@1548]$ watch dongshi.daddy.MathGame primeFactors "{params,returnObj}" -b -n1
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 17 ms, listenerId: 11
method=dongshi.daddy.MathGame.primeFactors location=AtEnter
ts=2022-04-22 14:53:21; [cost=0.0826ms] result=@ArrayList[
@Object[][isEmpty=false;size=1],
null,
]
2.6.2:方法执行之后打印信息
使用的是-f
在正常返回和throw异常都会打印。
[arthas@1548]$ watch dongshi.daddy.MathGame primeFactors -f -n2 -x2
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 16 ms, listenerId: 9
method=dongshi.daddy.MathGame.primeFactors location=AtExit
ts=2022-04-22 14:36:19; [cost=4.162564302241E8ms] result=@ArrayList[
@Object[][
@Integer[1],
],
@MathGame[
random=@Random[java.util.Random@73c46669],
illegalArgumentCount=@Integer[3722],
],
@ArrayList[
@Integer[2],
@Integer[41149],
],
]
method=dongshi.daddy.MathGame.primeFactors location=AtExceptionExit
ts=2022-04-22 14:36:21; [cost=1002.2582ms] result=@ArrayList[
@Object[][
@Integer[-180175],
],
@MathGame[
random=@Random[java.util.Random@73c46669],
illegalArgumentCount=@Integer[3723],
],
null,
]
第一个输出location=AtExit
代表是方法正常返回了,返回的值是[2,41149]
的ArrayList(返回值都能看到,真心强大!)
,第二个输出location=AtExceptionExit
表示发生了异常,方法是以throw异常结束的。
2.6.3:方法返回前和返回后
这种方式通过方法返回前输出可以看到具体入参的值,通过方法返回后输出可以看到具体的返回值,如下:
[arthas@1548]$ watch dongshi.daddy.MathGame primeFactors "{params[0],returnObj}" -s -b -n5 -x2
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 19 ms, listenerId: 17
method=dongshi.daddy.MathGame.primeFactors location=AtEnter
ts=2022-04-22 15:20:55; [cost=0.0547ms] result=@ArrayList[
@Integer[194389],
null,
]
method=dongshi.daddy.MathGame.primeFactors location=AtExit
ts=2022-04-22 15:20:56; [cost=4.189333472393E8ms] result=@ArrayList[
@Integer[1],
@ArrayList[
@Integer[13],
@Integer[19],
@Integer[787],
],
]
通过@Integer[194389]
可看到方法的入参值是194389
,返回值是[13,19,787]
。
2.6.4:过滤
- 根据参数值过滤
过滤第一个参数小于0的方法调用:
[arthas@1548]$ watch dongshi.daddy.MathGame primeFactors "{params[0],returnObj}" "params[0]<0" -b -f -x2 -n1
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 20 ms, listenerId: 21
method=dongshi.daddy.MathGame.primeFactors location=AtEnter
ts=2022-04-22 15:43:58; [cost=0.0131ms] result=@ArrayList[
@Integer[-50140],
null,
]
Command execution times exceed limit: 1, so command will exit. You can set it with -n option.
- 根据参数类型过滤
需要修改严格模式为false。
[arthas@1548]$ options strict false
NAME BEFORE-VALUE AFTER-VALUE
-----------------------------------
strict true false
[arthas@1548]$ watch dongshi.daddy.MathGame primeFactors "{params, params[0].class.name}" 'params[0].class.name == "java.lang.Integer"' -b -f -x2 -n2
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 19 ms, listenerId: 27
method=dongshi.daddy.MathGame.primeFactors location=AtEnter
ts=2022-04-22 15:53:58; [cost=0.0907ms] result=@ArrayList[
@Object[][
@Integer[-125336],
],
@String[java.lang.Integer],
]
method=dongshi.daddy.MathGame.primeFactors location=AtExceptionExit
ts=2022-04-22 15:53:59; [cost=4.209165192229E8ms] result=@ArrayList[
@Object[][
@Integer[-125336],
],
@String[java.lang.Integer],
]
Command execution times exceed limit: 2, so command will exit. You can set it with -n option.
- 根据参数个数过滤
[arthas@1548]$ watch dongshi.daddy.MathGame primeFactors -b '{params,returnObj}' 'params.length == 1'
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 24 ms, listenerId: 36
method=dongshi.daddy.MathGame.primeFactors location=AtEnter
ts=2022-04-22 18:13:48; [cost=0.0611ms] result=@ArrayList[
@Object[][isEmpty=false;size=1],
null,
]
2.6.5:实际场景使用
当我在愉快的划水时,测试突然反馈一个问题,说某个接口不返回数据了,数据最终是通过如下方法返回的:
class FakeCls {
public ResultData selectData(String id) {
ResultData entity = formDao.queryEntryFormId(id);
return entity;
}
}
因为参数ID是前端传过来的,所以怀疑是这个参数的问题,便使用如下命令获取传入的参数值:
[arthas@13736]$ watch com.jh.jcs.alignForm.service.impl.FormApiService$$EnhancerByCGLIB$$5220dc21 selectFormDataByFormId '{params[0]}' -b -x2
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 93 ms, listenerId: 7
method=com.jh.jcs.alignForm.service.impl.FormApiService$$EnhancerByCGLIB$$5220dc21.selectFormDataByFormId location=AtEnter
ts=2022-04-22 18:36:18; [cost=0.0356ms] result=@ArrayList[
@String[2c91c81a7033d2d2ddd2c91c81a70],
]
可以看到获取的参数值是2c91c81a7033d2d2ddd2c91c81a70
,结果证明我的怀疑是错误的,但至少排除了一种错误的可能性,那没办法,继续找问题吧!
3:火焰图
这部分看下如何通过arthas的火焰图来定位相关问题。
3.1:火焰图定位CPU问题
如下测试代码:
public class B {
public static void main(String[] args) {
aaaaaaaa();
}
private static void aaaaaaaa() {
bbbbbbbbbbbbb();
}
private static void bbbbbbbbbbbbb() {
cccccccccccccc();
}
private static void cccccccccccccc() {
try {
while (true) {
TimeUnit.SECONDS.sleep(2);
System.out.println("cccccccccc");
new Thread(() -> {
while (true) {
System.out.println((UUID.randomUUID().toString().getBytes()));
System.out.println((UUID.randomUUID().toString().getBytes()));
System.out.println((UUID.randomUUID().toString().getBytes()));
System.out.println((UUID.randomUUID().toString().getBytes()));
System.out.println((UUID.randomUUID().toString().getBytes()));
}
}, "useManyCpuThread").start();
}
} catch (InterruptedException e) {
}
}
}
运行程序后使用arthas执行如下操作:
[root@localhost arthas]# java -jar arthas-boot-1.jar
[INFO] arthas-boot version: 3.6.0
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 2200 dongshi/daddy/huohuo/B
1 # 1:选择对应的PID
[INFO] arthas home: /root/.arthas/lib/3.7.0/arthas
[INFO] Try to attach process 2200
[INFO] Attach process 2200 success.
[INFO] arthas-client connect 127.0.0.1 3658
,---. ,------. ,--------.,--. ,--. ,---. ,---.
/ O \ | .--. ''--. .--'| '--' | / O \ ' .-'
| .-. || '--'.' | | | .--. || .-. |`. `-.
| | | || |\ \ | | | | | || | | |.-' |
`--' `--'`--' '--' `--' `--' `--'`--' `--'`-----'
...
[arthas@2200]$ profiler start # 2:开始采样CPU信息
Profiling started
[arthas@2200]$ profiler getSamples # 3:获取采样数
450
[arthas@2200]$ profiler getSamples
523
[arthas@2200]$ profiler stop # 4:停止采样,会直接生一个html,即下边的结果
OK
profiler output file: /root/study/arthas/arthas-output/20230807-114216.html
接下来我们可以打开这个html,如下图只需要看蓝色的,其他色的都不是Java代码
:
其中大红色对勾(注意是我自己打的)
,就是我们程序中消耗CPU的代码,为了更清楚的看到哪些操作分别消耗了多少CPU,可以单击对勾处,使其变为
100%,如下图:
其中sout占用了72.12%
,String.getBytes占用了1.96%
,UUDI.randomUUID占用了12.82%
,UUID.toString占用了8.2%
,对应到程序如下图:
3.2:火焰图定位内存问题
如下测试代码:
public class B {
public static void main(String[] args) {
aaaaaaaa();
}
private static void aaaaaaaa() {
bbbbbbbbbbbbb();
}
private static void bbbbbbbbbbbbb() {
cccccccccccccc();
}
private static void cccccccccccccc() {
// 一个EmptyObject大小
// 标记字 8字节 , class指针 4字节 ,int num 4字节,short shortNum 2字节,5个double 共40字节 此时共58字节,非8字节整数倍,alignment对齐填充6字节 共64字节
// 每秒生成5000个对象,即每秒堆内存占用64*1000*5=320k
List<EmptyObject> XYZZZList = new ArrayList<EmptyObject>();
try {
while (true) {
TimeUnit.MILLISECONDS.sleep(1);
System.out.println("cccccccccc");
XYZZZList.add(new EmptyObject());
XYZZZList.add(new EmptyObject());
XYZZZList.add(new EmptyObject());
XYZZZList.add(new EmptyObject());
XYZZZList.add(new EmptyObject());
}
} catch (InterruptedException e) {
}
}
}
class EmptyObject {
int num;
short shortNum;
double d1;
double d2;
double d3;
double d4;
double d5;
}
运行程序后使用arthas执行如下操作:
[root@localhost arthas]# java -jar arthas-boot-1.jar
[INFO] arthas-boot version: 3.6.0
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 2019 dongshi/daddy/huohuo/B
1
[INFO] arthas home: /root/.arthas/lib/3.7.0/arthas
[INFO] Try to attach process 2019
[INFO] Attach process 2019 success.
[INFO] arthas-client connect 127.0.0.1 3658
,---. ,------. ,--------.,--. ,--. ,---. ,---.
/ O \ | .--. ''--. .--'| '--' | / O \ ' .-'
| .-. || '--'.' | | | .--. || .-. |`. `-.
| | | || |\ \ | | | | | || | | |.-' |
`--' `--'`--' '--' `--' `--' `--'`--' `--'`-----'
wiki https://arthas.aliyun.com/doc
tutorials https://arthas.aliyun.com/doc/arthas-tutorials.html
version 3.7.0
main_class
pid 2019
time 2023-08-08 13:29:59
[arthas@2019]$ profiler start --event=alloc
Profiling started
[arthas@2019]$ profiler getSamples
4
[arthas@2019]$ profiler status
Profiling is running for 32 seconds
[arthas@2019]$ profiler stop
OK
profiler output file: /root/study/arthas/arthas-output/20230808-133051.html
--event=alloc
设置对内存占用采样。打开生成的html报告,如下图:
可以清除的看到main方法调用aaaaaa方法最终调用到ccccc方法,并在ccccc方法中产生了大量的EmptyObject对象,占用了总内存的53.85%
,酱!!!
写在后面
参考文章列表
更多推荐
所有评论(0)