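LightGBM is an open-source boosting tool released by Microsoft, and it has become a staple of machine learning competitions. However, LightGBM is written in C++, and its prediction functionality is used mainly through the command line to process batch data, which makes it hard to use for online real-time prediction. lightgbm_predict4j is a small tool that reimplements LightGBM's prediction code in Java: after training a model offline with LightGBM, you can load that model with lightgbm_predict4j and use it for online real-time prediction inside a Java application. Project address: https://github.com/lyg5623/lightgbm_predict4j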

Usage is simple. For example, if the generated model file is LightGBM_model.txt, the prediction code looks like this:

import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.junit.Test;
import org.lightgbm.predict4j.v2.Boosting;
import org.lightgbm.predict4j.v2.OverallConfig;
import org.lightgbm.predict4j.v2.Predictor;
import org.lightgbm.predict4j.SparseVector;

/**
 * @author lyg5623
 */
public class UseageTest {
    // name of the model file; it is looked up on the classpath below
    private static String modelPath = "LightGBM_model.txt";

    @Test
    public void test() throws FileNotFoundException, IOException {
        String path = UseageTest.class.getClassLoader().getResource(modelPath).getPath();
        // getPath() returns a URL-encoded string, so decode it (e.g. "%20" back to a space)
        path = URLDecoder.decode(path, "UTF-8");

        Boosting boosting = Boosting.createBoosting(path);
        // predict config, just like predict.conf in LightGBM;
        // put key/value options into the map, or leave it empty as done here
        Map<String, String> map = new HashMap<String, String>();
        OverallConfig config = new OverallConfig();
        config.set(map);
        Predictor predictor =
                new Predictor(boosting, config.io_config.num_iteration_predict, config.io_config.is_predict_raw_score,
                        config.io_config.is_predict_leaf_index, config.io_config.pred_early_stop,
                        config.io_config.pred_early_stop_freq, config.io_config.pred_early_stop_margin);

        // your data to predict
        int[] indices = {2, 6, 9};
        double[] values = {0.2, 0.4, 0.7};

        SparseVector v = new SparseVector(values, indices);
        List<Double> predicts = predictor.predict(v);
        System.out.println("predict values " + predicts.toString());

    }

}
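
In an online service the features often arrive as a dense array rather than as the index/value pairs used above. The following is a minimal sketch of one way to build those two arrays. `SparseFeatures` and `fromDense` are hypothetical names that are not part of lightgbm_predict4j, and the sketch assumes that feature indices are 0-based and that omitting an entry is equivalent to a zero value, which is worth checking against your own model.

```java
/**
 * Hypothetical helper (not part of lightgbm_predict4j): holds the
 * index/value arrays extracted from a dense feature array.
 */
public class SparseFeatures {
    public final int[] indices;
    public final double[] values;

    private SparseFeatures(int[] indices, double[] values) {
        this.indices = indices;
        this.values = values;
    }

    /**
     * Keeps only the non-zero entries of a dense feature array.
     * Assumption: indices are 0-based positions in the dense array.
     */
    public static SparseFeatures fromDense(double[] dense) {
        // first pass: count non-zero entries so the arrays can be sized exactly
        int count = 0;
        for (double d : dense) {
            if (d != 0.0) {
                count++;
            }
        }

        // second pass: copy each non-zero value together with its position
        int[] indices = new int[count];
        double[] values = new double[count];
        int k = 0;
        for (int i = 0; i < dense.length; i++) {
            if (dense[i] != 0.0) {
                indices[k] = i;
                values[k] = dense[i];
                k++;
            }
        }
        return new SparseFeatures(indices, values);
    }
}
```

With `SparseFeatures f = SparseFeatures.fromDense(dense)`, the result can be passed on as `new SparseVector(f.values, f.indices)`, matching the constructor call in the example above.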


