Writing a YOLOv8-Seg Instance Segmentation Inference Program with the OpenVINO C++ API



Intel Developer Zone · Published 2023-06-20 11:26:28

Author: Zhan Pengzhou (战鹏州), Intel Innovation Ambassador

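1.1 Introduction

This article describes how to use the OpenVINO™ 2023.0 C++ API to develop an AI inference program for the YOLOv8-Seg instance segmentation model. The C++ sample program was developed on Windows with Visual Studio Community 2022; readers should first set up a Visual Studio-based OpenVINO C++ development environment.

Clone the code repository for this article: git clone https://gitee.com/ppov-nuc/yolov8_openvino_cpp.git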

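1.2 Exporting the YOLOv8-Seg OpenVINO IR Model

YOLOv8 is a SOTA model toolkit released by Ultralytics, built on the YOLO framework, for object detection and tracking, instance segmentation, image classification, and pose estimation.

First, install ultralytics and openvino-dev with the command: pip install -r requirements.txt

Then export an OpenVINO IR model in FP16 precision with the command: yolo export model=yolov8n-seg.pt format=openvino half=True, as shown in the figure below.

Next, run the command: benchmark_app -m yolov8n-seg.xml -d GPU.1 to measure the asynchronous inference performance of the yolov8n-seg.xml model on the A770m discrete GPU, as shown in the figure below.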

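1.3 Writing the YOLOv8-Seg Instance Segmentation Inference Program with the OpenVINO C++ API

Writing a YOLOv8-Seg instance segmentation inference program with the OpenVINO C++ API involves five typical steps:

Image acquisition and decoding
Image data preprocessing
AI inference (based on the OpenVINO C++ API)
Post-processing of the inference results
Visualization of the processed results

The image preprocessing and AI inference parts of the YOLOv8-Seg instance segmentation program are almost identical to those of the YOLOv8 object detection program and can be reused directly.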

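1.3.1 Image Data Preprocessing

Open yolov8n-seg.onnx in Netron, as shown in the figure below. You can see:

Input node name: "images"; data: float32[1,3,640,640]
Output node 1 name: "output0"; data: float32[1,116,8400]. The first 84 of the 116 fields are defined exactly as in the output of the YOLOv8 object detection model, namely cx, cy, w, h and the scores of the 80 classes; the last 32 fields are mask confidences (coefficients), used to compute the mask data.
Output node 2 name: "output1"; data: float32[1,32,160,160]. Matrix-multiplying the last 32 fields of output0 with the data of output1 yields the mask data of the corresponding object.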
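The matrix multiplication described above can be sketched in plain C++ as follows. This is a minimal illustration, not the code from the article's repository: the helper name compute_mask is hypothetical, and the sigmoid followed by a 0.5 threshold is an assumption based on how YOLOv8-Seg masks are commonly binarized, since the article does not spell out that step. The prototype buffer is assumed to be row-major with shape [32, 160*160].

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the article's repository): combines the mask
// coefficients of one detection with the prototype data from "output1".
// proto is assumed row-major with shape [coeffs.size(), num_pixels],
// e.g. [32, 25600] for a 160x160 prototype map.
std::vector<unsigned char> compute_mask(const std::vector<float>& coeffs,
                                        const std::vector<float>& proto,
                                        std::size_t num_pixels,
                                        float threshold = 0.5f) {
    const std::size_t num_protos = coeffs.size();
    std::vector<unsigned char> mask(num_pixels, 0);
    for (std::size_t p = 0; p < num_pixels; ++p) {
        // Dot product of the coefficient vector with column p of proto.
        float acc = 0.0f;
        for (std::size_t k = 0; k < num_protos; ++k) {
            acc += coeffs[k] * proto[k * num_pixels + p];
        }
        // Assumption: sigmoid plus a fixed threshold binarizes the mask.
        const float prob = 1.0f / (1.0f + std::exp(-acc));
        mask[p] = prob > threshold ? 1 : 0;
    }
    return mask;
}
```

For a real detection, coeffs would hold the 32 values from columns 84 to 115 of one candidate, and the resulting 25600 values would be reshaped to 160x160 and then cropped to the bounding box.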

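The goal of image preprocessing is to convert image data of any size into a tensor of shape [1,3,640,640] with FP32 precision. The YOLOv8-Seg model takes a square input. To avoid the distortion caused by scaling an arbitrarily sized image directly to a square, the letterbox algorithm first pads the image to a square so that its aspect ratio is preserved, as shown in the figure below; the cv::dnn::blobFromImage function then scales the image.

A sample image preprocessing program is shown below: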

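#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>
using namespace cv;
using namespace cv::dnn;

// Pad the source image to a square canvas (black background, image at the top-left)
Mat letterbox(const Mat& source)
{
    int col = source.cols;
    int row = source.rows;
    int _max = MAX(col, row);
    Mat result = Mat::zeros(_max, _max, CV_8UC3);
    source.copyTo(result(Rect(0, 0, col, row)));
    return result;
}

// Read the image, pad it to a square, then scale to 640x640, normalize to [0,1] and swap B/R channels
Mat img = cv::imread("bus.jpg");
Mat letterbox_img = letterbox(img);
Mat blob = blobFromImage(letterbox_img, 1.0/255.0, Size(640,640), Scalar(), true);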
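Because the letterbox function above places the source image at the top-left corner of the square canvas, mapping a prediction back to the source image needs only a single scale factor and no offset. The sketch below illustrates that arithmetic in plain C++ without OpenCV; the struct BoxF and the function names are hypothetical and introduced only for illustration.

```cpp
#include <algorithm>

// Hypothetical struct for illustration; not part of the article's code.
struct BoxF { float left, top, width, height; };

// Side length of the square canvas produced by top-left letterbox padding.
int letterbox_side(int cols, int rows) { return std::max(cols, rows); }

// Ratio between the padded square and the square network input (640 by default).
float letterbox_scale(int cols, int rows, int input_size = 640) {
    return static_cast<float>(letterbox_side(cols, rows)) /
           static_cast<float>(input_size);
}

// Maps a center-format box (cx, cy, w, h) given in network input coordinates
// back to source-image coordinates as (left, top, width, height).
BoxF to_source(float cx, float cy, float w, float h, float scale) {
    return BoxF{(cx - 0.5f * w) * scale, (cy - 0.5f * h) * scale,
                w * scale, h * scale};
}
```

For example, a hypothetical 1280x960 image is padded to 1280x1280, so the scale is 2.0 and a box centered at (320, 240) with size 100x50 in network coordinates maps to left 540, top 430, width 200, height 100 in the source image.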

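1.3.2 Synchronous AI Inference

Synchronous inference with the OpenVINO C++ API takes seven main steps:

Instantiate the Core object: ov::Core core;
Compile and load the model: core.compile_model();
Create an inference request: infer_request = compiled_model.create_infer_request();
Read the image data and preprocess it;
Pass the input data to the model: infer_request.set_input_tensor(input_tensor);
Start inference: infer_request.infer();
Get the inference results: output0 = infer_request.get_output_tensor(0); output1 = infer_request.get_output_tensor(1);

The sample code is shown below: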

    // -------- Step 1. Initialize OpenVINO Runtime Core --------
    ov::Core core;

    // -------- Step 2. Compile the Model --------
    auto compiled_model = core.compile_model("yolov8n-seg.xml", "CPU");

    // -------- Step 3. Create an Inference Request --------
    ov::InferRequest infer_request = compiled_model.create_infer_request();

    // -------- Step 4. Read a picture file and do the preprocess --------
    Mat img = cv::imread("bus.jpg");
    // Preprocess the image
    Mat letterbox_img = letterbox(img);
    // Ratio between the padded square and the 640x640 model input
    float scale = letterbox_img.size[0] / 640.0;
    Mat blob = blobFromImage(letterbox_img, 1.0 / 255.0, Size(640, 640), Scalar(), true);

    // -------- Step 5. Feed the blob into the input node of the Model -------
    // Get input port for model with one input
    auto input_port = compiled_model.input();
    // Create tensor from external memory
    ov::Tensor input_tensor(input_port.get_element_type(), input_port.get_shape(), blob.ptr(0));
    // Set input tensor for model with one input
    infer_request.set_input_tensor(input_tensor);

    // -------- Step 6. Start inference --------
    infer_request.infer();

    // -------- Step 7. Get the inference result --------
    auto output0 = infer_request.get_output_tensor(0); // output0
    auto output1 = infer_request.get_output_tensor(1); // output1
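The output0 tensor has shape [1,116,8400], so in a flat row-major buffer the value of channel c for candidate i sits at offset c * 8400 + i. The sketch below shows how such a buffer could be indexed directly to find the best class of one candidate. It uses a plain std::vector in place of the tensor data, and the names channel_at, BestClass and best_class are hypothetical helpers for illustration only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: reads channel c of candidate i from a flat,
// row-major [channels, candidates] buffer such as output0.
inline float channel_at(const std::vector<float>& data,
                        std::size_t num_candidates,
                        std::size_t c, std::size_t i) {
    return data[c * num_candidates + i];
}

struct BestClass { int id; float score; };

// Scans the class-score channels [first_class, first_class + num_classes)
// of candidate i and returns the highest-scoring class.
BestClass best_class(const std::vector<float>& data, std::size_t num_candidates,
                     std::size_t i, std::size_t first_class,
                     std::size_t num_classes) {
    BestClass best{0, channel_at(data, num_candidates, first_class, i)};
    for (std::size_t k = 1; k < num_classes; ++k) {
        const float s = channel_at(data, num_candidates, first_class + k, i);
        if (s > best.score) {
            best = BestClass{static_cast<int>(k), s};
        }
    }
    return best;
}
```

With the real model, num_candidates would be 8400, first_class would be 4, and num_classes would be 80. Transposing the buffer to [8400,116] first is an alternative that makes each candidate a contiguous row.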

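1.3.3 Post-processing the Inference Results

Post-processing in an instance segmentation inference program extracts the predicted class (class_id), class score (class_score), bounding box (box), and mask (mask) from the results. The sample code is shown below: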

    // -------- Step 8. Postprocess the result --------
    auto output0_shape = output0.get_shape(); // [1,116,8400]
    Mat output_buffer(output0_shape[1], output0_shape[2], CV_32F, output0.data<float>());
    Mat proto(32, 25600, CV_32F, output1.data<float>()); //[32,25600]
    transpose(output_buffer, output_buffer); //[8400,116]
    float score_threshold = 0.25;
    float nms_threshold = 0.5;
    std::vector<int> class_ids;
    std::vector<float> class_scores;
    std::vector<Rect> boxes;
    std::vector<Mat> mask_confs;

    // Figure out the bbox, class_id and class_score
    for (int i = 0; i < output_buffer.rows; i++) {
        Mat classes_scores = output_buffer.row(i).colRange(4, 84);
        Point class_id;
        double maxClassScore;
        minMaxLoc(classes_scores, 0, &maxClassScore, 0, &class_id);

        if (maxClassScore > score_threshold) {
            class_scores.push_back(maxClassScore);
            class_ids.push_back(class_id.x);
            float cx = output_buffer.at<float>(i, 0);
            float cy = output_buffer.at<float>(i, 1);
            float w = output_buffer.at<float>(i, 2);
            float h = output_buffer.at<float>(i, 3);
            // Convert the center-format box to top-left format in source-image coordinates
            int left = int((cx - 0.5 * w) * scale);
            int top = int((cy - 0.5 * h) * scale);
            int width = int(w * scale);
            int height = int(h * scale);
            // Keep the 32 mask confidences of this candidate for the mask computation
            cv::Mat mask_conf = output_buffer.row(i).colRange(84, 116);
            mask_confs.push_back(mask_conf);
            boxes.push_back(Rect(left, top, width, height));
        }
    }

    //NMS
    std::vector<int> indices;
    NMSBoxes(boxes, class_scores, score_threshold, nms_threshold, indices);
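To make the NMS step easier to follow, here is a simplified sketch of greedy non-maximum suppression in plain C++. It is not OpenCV's implementation of NMSBoxes; the Box struct stands in for cv::Rect, and the names iou and greedy_nms are hypothetical. Like the call above, it treats all boxes together regardless of class.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for cv::Rect: top-left corner plus size.
struct Box { int x, y, w, h; };

// Intersection over union of two axis-aligned boxes.
float iou(const Box& a, const Box& b) {
    const int x1 = std::max(a.x, b.x);
    const int y1 = std::max(a.y, b.y);
    const int x2 = std::min(a.x + a.w, b.x + b.w);
    const int y2 = std::min(a.y + a.h, b.y + b.h);
    const int inter = std::max(0, x2 - x1) * std::max(0, y2 - y1);
    const int uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0 ? static_cast<float>(inter) / static_cast<float>(uni) : 0.0f;
}

// Greedy NMS: drop low scores, visit boxes from highest to lowest score,
// and keep a box only if it does not overlap an already kept box too much.
std::vector<int> greedy_nms(const std::vector<Box>& boxes,
                            const std::vector<float>& scores,
                            float score_threshold, float iou_threshold) {
    std::vector<int> order;
    for (int i = 0; i < static_cast<int>(boxes.size()); ++i) {
        if (scores[i] > score_threshold) order.push_back(i);
    }
    std::stable_sort(order.begin(), order.end(),
                     [&scores](int a, int b) { return scores[a] > scores[b]; });
    std::vector<int> keep;
    for (int idx : order) {
        bool suppressed = false;
        for (int k : keep) {
            if (iou(boxes[idx], boxes[k]) > iou_threshold) {
                suppressed = true;
                break;
            }
        }
        if (!suppressed) keep.push_back(idx);
    }
    return keep;
}
```

The returned indices play the same role as the indices vector above: they select which entries of boxes, class_ids, class_scores and mask_confs survive for mask computation and visualization.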

For the complete sample, see yolov8_seg_ov_infer.cpp. The output is shown in the figure below: