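Sometimes a task implemented in pure Java does not perform as well as we would like. A recent example from my own work: merging a large number of small files into one large file.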

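My first idea was to do the file operations in Java: iterate over all the small files and write each one into a single output file (the writes could even be made concurrent). In practice this turned out to be slow. Writing a little over a thousand small files took tens of seconds on my local Windows machine, and close to ten seconds even on a well-provisioned Linux machine, which clearly hurts the response time of the interface. How can this be optimized?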
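For reference, the pure-Java approach described above might look roughly like the following. The original Java code is not shown in this post, so the class name `JavaFileMerger`, the sequential loop, and the use of `FileChannel.transferTo` are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical baseline, not the author's code: append each part to the target in order.
final class JavaFileMerger {

    /** Merges the parts into target and returns the number of bytes written. */
    static long merge(List<Path> parts, Path target) throws IOException {
        long total = 0;
        try (FileChannel out = FileChannel.open(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            for (Path part : parts) {
                try (FileChannel in = FileChannel.open(part, StandardOpenOption.READ)) {
                    long size = in.size();
                    long position = 0;
                    // transferTo may copy fewer bytes than requested, so loop until done
                    while (position < size) {
                        position += in.transferTo(position, size - position, out);
                    }
                    total += size;
                }
            }
        }
        return total;
    }
}
```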

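On Linux, large files can be split and merged with the split and cat commands respectively. So I ran an experiment: on Linux I split a 1 GB file into about 1000 small files and then merged those files back together. The result was striking: the merge finished almost instantly. This convinced me that the bulk merge of small files should be done with system commands.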

Here we wrap this in a class that invokes a system command and returns the result of the call.

First, a class that holds the result:

package com.majing.learning.fileupload.common.process;

public class ProcessResult {
	private boolean success = false;
	private String errorMessage;
	private String outputMessage;
	public boolean isSuccess() {
		return success;
	}
	public void setSuccess(boolean success) {
		this.success = success;
	}
	public String getErrorMessage() {
		return errorMessage;
	}
	public void setErrorMessage(String errorMessage) {
		this.errorMessage = errorMessage;
	}
	public String getOutputMessage() {
		return outputMessage;
	}
	public void setOutputMessage(String outputMessage) {
		this.outputMessage = outputMessage;
	}
	
}

Next, the class that actually invokes the system command:

package com.majing.learning.fileupload.common.process;

import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;

import org.apache.commons.lang3.StringUtils;

public class CommandUtils {
	
	public static ProcessResult runCmdTest(ExecutorService executorService, String command) throws IOException, InterruptedException {
		StringBuilder queryInputResult = new StringBuilder();
		StringBuilder queryErroInputResult = new StringBuilder();
		ProcessResult processResult = new ProcessResult();
		String[] cmd = { "/bin/sh", "-c", command};
		Process pro = Runtime.getRuntime().exec(cmd);
		CountDownLatch lock = new CountDownLatch(2);
		executorService.submit(new ProcessCheckTask(queryInputResult, lock, pro.getInputStream()));
		executorService.submit(new ProcessCheckTask(queryErroInputResult, lock, pro.getErrorStream()));
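		// Wait until both reader tasks have drained stdout and stderr.
		lock.await();
		// Then wait for the process itself and collect its exit code.
		int exitCode = pro.waitFor();
		processResult.setOutputMessage(queryInputResult.toString());
		processResult.setErrorMessage(queryErroInputResult.toString());
		// Success requires a zero exit code and an empty stderr.
		processResult.setSuccess(exitCode == 0 && StringUtils.isBlank(processResult.getErrorMessage()));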
		return processResult;
	}
}

The ProcessCheckTask class is as follows:

package com.majing.learning.fileupload.common.process;

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.concurrent.CountDownLatch;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.majing.learning.fileupload.common.ConstValues;

public class ProcessCheckTask implements Runnable {
	
	private static Logger logger = LoggerFactory.getLogger(ProcessCheckTask.class);

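	/** Latch that is counted down when this task finishes */
	private CountDownLatch lock;

	/** Stream carrying the process output (its stdout or stderr) */
	private InputStream inputStream;

	/** Buffer that collects the lines read from the stream */
	private StringBuilder queryInputResult;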

	public ProcessCheckTask(StringBuilder queryInputResult, CountDownLatch lock, InputStream inputStream) {
		super();
		this.lock = lock;
		this.inputStream = inputStream;
		this.queryInputResult = queryInputResult;
	}

	@Override
	public void run() {
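		// try-with-resources closes the reader (and the underlying stream) when done
		try (BufferedReader bf = new BufferedReader(new InputStreamReader(inputStream))) {
			String line;
			// Read to the end of the stream; an empty line must not stop the loop
			while ((line = bf.readLine()) != null) {
				queryInputResult.append(line).append("\n");
			}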
		} catch (Exception e) {
			logger.error(ConstValues.EXCEPTION_OCCURED, e);
		} finally {
			lock.countDown();
		}
	}
}

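The above is a simple implementation, but it has a potential problem: the system command itself may take a long time. If we do not want to wait until it finishes, and would rather treat the call as failed when nothing has come back within a certain period, we need a timeout. Here I rely on Future and wrap the method above in a Callable: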

package com.majing.learning.fileupload.common.process;

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;

public class CommandTask implements Callable<ProcessResult>{
	
	private ExecutorService executorService;
	
	private String command;
	
	public CommandTask(ExecutorService executorService, String command){
		this.executorService = executorService;
		this.command = command;
	}

	@Override
	public ProcessResult call() throws Exception {
		return CommandUtils.runCmdTest(executorService, command);
	}

}

Then, on top of CommandUtils, we add one more layer, CommandHelper, implemented as follows:

package com.majing.learning.fileupload.common.process;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.majing.learning.fileupload.common.ConstValues;

public class CommandHelper {
	
	private static Logger logger = LoggerFactory.getLogger(CommandHelper.class);
	
	private static ExecutorService executorService=Executors.newFixedThreadPool(50);
	
	private static long default_timeout = 8000;
	
	public static ProcessResult process(String command){
		return process(command, default_timeout, TimeUnit.MILLISECONDS);
	}
	
	public static ProcessResult process(String command, long timeout, TimeUnit unit){
		CommandTask commandTask = new CommandTask(executorService, command);
		Future<ProcessResult> processResult = executorService.submit(commandTask);
		ProcessResult result = null;
		try{
			result = processResult.get(timeout, unit);
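		}catch(Exception e){
			// Timeout or failure: stop waiting and interrupt the worker thread
			// (this does not kill the child process); the caller receives null
			processResult.cancel(true);
			logger.error(ConstValues.EXCEPTION_OCCURED, e);
		}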
		return result;
	}

}
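As an alternative to the Future-based timeout above, the JDK (Java 8 and later) offers `Process.waitFor(timeout, unit)` and `Process.destroyForcibly()`, which also let us stop the child process when the deadline passes. The following is only a minimal sketch under assumptions: the class `TimedCommand` and its `Result` type are hypothetical names, and writing the output to a temporary file is one possible way to avoid blocking on the output pipes.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the classes in this post.
final class TimedCommand {

    /** Outcome of one run; exitCode is -1 when the command timed out. */
    static final class Result {
        final boolean timedOut;
        final int exitCode;
        final String output;

        Result(boolean timedOut, int exitCode, String output) {
            this.timedOut = timedOut;
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    static Result run(String command, long timeout, TimeUnit unit)
            throws IOException, InterruptedException {
        // stdout and stderr both go to a temp file, so no reader threads are needed
        File log = File.createTempFile("cmd-", ".log");
        try {
            Process process = new ProcessBuilder("/bin/sh", "-c", command)
                    .redirectErrorStream(true)
                    .redirectOutput(log)
                    .start();
            boolean finished = process.waitFor(timeout, unit);
            if (!finished) {
                // Deadline passed: kill the shell process instead of leaving it running
                process.destroyForcibly();
            }
            String output = new String(Files.readAllBytes(log.toPath()), StandardCharsets.UTF_8);
            return new Result(!finished, finished ? process.exitValue() : -1, output);
        } finally {
            log.delete();
        }
    }
}
```

A call such as `TimedCommand.run("cat part* > merged", 8, TimeUnit.SECONDS)` would then report a timeout through `timedOut` rather than through a null return value.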

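At this point, whenever we need to run a system command we can simply call CommandHelper.process(command) and get back a ProcessResult. Note that process returns null when the call times out or throws. I wrote this up mainly as a note for myself; anyone who needs it is welcome to use it.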
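For the merge scenario, the command passed to CommandHelper.process could be assembled as sketched below. `CatCommandBuilder` is a hypothetical helper that is not part of the original code; it single-quotes every path so that spaces or quotes in file names survive the `/bin/sh -c` invocation used by CommandUtils.

```java
import java.util.List;

// Hypothetical helper: builds "cat part1 part2 ... > target" for /bin/sh -c.
final class CatCommandBuilder {

    /** Wraps a path in single quotes, escaping any embedded single quote. */
    static String quote(String path) {
        return "'" + path.replace("'", "'\\''") + "'";
    }

    /** Returns a shell command that concatenates parts, in order, into target. */
    static String build(List<String> parts, String target) {
        StringBuilder command = new StringBuilder("cat");
        for (String part : parts) {
            command.append(' ').append(quote(part));
        }
        // The redirect is interpreted by the shell, which is why the command runs via sh -c
        command.append(" > ").append(quote(target));
        return command.toString();
    }
}
```

The result can be handed over directly, for example `CommandHelper.process(CatCommandBuilder.build(parts, target))`. With a very large number of parts the operating system's limit on command-line length can be reached, in which case the merge would have to be issued in several batches.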

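Incidentally, with this wrapper the same task on the same machine dropped from about 10 s to under 200 ms, which shows how much calling a system command can help in the right scenario.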
