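Description

Recognize a table in an image using OpenCV and Tesseract from Java.

Environment: OpenCV and Tesseract are installed on Linux, and the Spring Boot service runs in Docker.

For installing OpenCV and Tesseract and loading them in Docker, see the previous articles.

Process

Preprocess the image to filter out color and other distracting elements

Extract the horizontal and vertical lines in the image and filter out overlapping ones

Compute the intersections of the horizontal and vertical lines, then build the cells from those intersections

Run recognition on each cell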

1. Conversion

Convert the image to a Mat

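import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;
import org.opencv.core.Mat;
import org.opencv.core.MatOfByte;
import org.opencv.imgcodecs.Imgcodecs;

private Mat bufferedImageToMat(BufferedImage bufferedImage) {
    Mat mat = new Mat();
    // try-with-resources closes the stream even if encoding fails
    try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
        // Encode the BufferedImage as PNG bytes
        ImageIO.write(bufferedImage, "png", out);
        // Decode the bytes into a Mat. IMREAD_COLOR always yields 3-channel BGR,
        // which the COLOR_BGR2GRAY conversion in step 2 expects; IMREAD_UNCHANGED
        // would return a single channel for a grayscale input and break that call.
        mat = Imgcodecs.imdecode(new MatOfByte(out.toByteArray()), Imgcodecs.IMREAD_COLOR);
    } catch (IOException e) {
        e.printStackTrace();
    }
    // An empty Mat is returned if the conversion failed
    return mat;
}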

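2. Image preprocessing

Original image:

Convert the image to grayscale and run edge detection

Grayscale conversion

 // image is the loaded picture
 Mat imread = bufferedImageToMat(image);
 Mat gray = new Mat();
 Imgproc.cvtColor(imread, gray, Imgproc.COLOR_BGR2GRAY);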

Edge detection

Mat edges = new Mat();
Imgproc.Canny(gray, edges, 50, 150);

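3. Detect horizontal and vertical lines

Identify the horizontal and vertical lines

            // Assumption: `lines` comes from a probabilistic Hough transform on `edges`,
            // so each row holds one segment as {x1, y1, x2, y2}. The parameter values
            // below are illustrative and need tuning for real images.
            Mat lines = new Mat();
            Imgproc.HoughLinesP(edges, lines, 1, Math.PI / 180, 50, 50, 10);

            List<MatOfPoint> verticalLines = new ArrayList<>();
            List<MatOfPoint> horizontalLines = new ArrayList<>();

            for (int i = 0; i < lines.rows(); i++) {
                double[] val = lines.get(i, 0);
                // Store each segment as a two-point MatOfPoint: (x1, y1) and (x2, y2)
                if (isVertical(val)) {
                    verticalLines.add(new MatOfPoint(new Point(val[0], val[1]), new Point(val[2], val[3])));
                } else if (isHorizontal(val)) {
                    horizontalLines.add(new MatOfPoint(new Point(val[0], val[1]), new Point(val[2], val[3])));
                }
                // Segments that are neither vertical nor horizontal are ignored
            }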

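The thresholds for horizontal and vertical lines can be tuned to suit the actual images

    private boolean isVertical(double[] line) {
        // A line is vertical when its two endpoints have (almost) the same x
        return Math.abs(line[0] - line[2]) < 1.0; // tune this threshold to the actual images
    }

    private boolean isHorizontal(double[] line) {
        // A line is horizontal when its two endpoints have (almost) the same y
        return Math.abs(line[1] - line[3]) < 1.0; // tune this threshold to the actual images
    }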
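
A fixed pixel difference below 1.0 rejects a long line that is tilted by only a few pixels, which is common in scanned or photographed tables. One possible alternative, sketched below, is to classify by angle instead. This is not the method used above; the class name `LineOrientation` and the `toleranceDeg` parameter are hypothetical, and segments are assumed to be `{x1, y1, x2, y2}` arrays as in the loop above.

```java
/** Hypothetical helper: classifies a segment {x1, y1, x2, y2} by its angle. */
final class LineOrientation {
    private LineOrientation() {}

    /** Angle between the segment and the x-axis, folded into [0, 90] degrees. */
    static double angleFromHorizontal(double[] line) {
        double dx = Math.abs(line[2] - line[0]);
        double dy = Math.abs(line[3] - line[1]);
        // A zero-length segment gives atan2(0, 0) = 0 and so counts as horizontal
        return Math.toDegrees(Math.atan2(dy, dx));
    }

    /** True when the segment deviates from horizontal by at most toleranceDeg. */
    static boolean isHorizontal(double[] line, double toleranceDeg) {
        return angleFromHorizontal(line) <= toleranceDeg;
    }

    /** True when the segment deviates from vertical by at most toleranceDeg. */
    static boolean isVertical(double[] line, double toleranceDeg) {
        return angleFromHorizontal(line) >= 90.0 - toleranceDeg;
    }

    public static void main(String[] args) {
        double[] slightlyTilted = {0, 100, 400, 103}; // rises 3 px over 400 px
        System.out.println(isHorizontal(slightlyTilted, 1.0));
        System.out.println(isVertical(slightlyTilted, 1.0));
    }
}
```

With an angular tolerance the accepted skew no longer depends on the length of the segment, which the pixel-difference test cannot offer.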

4. Overlap filtering

Filter out segments that lie so close together that they should be treated as the same line

    private List<MatOfPoint> overlappingFilter(List<MatOfPoint> lines, int sortingIndex) {
        List<MatOfPoint> uniqueLines = new ArrayList<>();

        // Sort according to sortingIndex (note: this sorts the caller's list in place)
        if (sortingIndex == 0) {
            // Rows: compare by y coordinate
            lines.sort(Comparator.comparingDouble(line -> calculateLineCenter(line).y));
        } else {
            // Columns: compare by x coordinate
            lines.sort(Comparator.comparingDouble(line -> calculateLineCenter(line).x));
        }

        double distanceThreshold = 5;
        for (int i = 0; i < lines.size(); i++) {
            MatOfPoint line1 = lines.get(i);
            Point[] pts1 = line1.toArray();

            // Keep the line if uniqueLines is empty or it does not duplicate the last kept line.
            // Assumption: isDuplicate(pts1, pts2, threshold) is a helper defined elsewhere that
            // returns true when the two lines lie within the threshold distance of each other.
            if (uniqueLines.isEmpty() || !isDuplicate(pts1, uniqueLines.get(uniqueLines.size() - 1).toArray(), distanceThreshold)) {
                uniqueLines.add(line1);
            }
        }

        return uniqueLines;
    }

    // Midpoint of a two-point line
    private Point calculateLineCenter(MatOfPoint line) {
        Point[] pts = line.toArray();
        return new Point((pts[0].x + pts[1].x) / 2, (pts[0].y + pts[1].y) / 2);
    }
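
The filtering idea can be tried in isolation by reducing each line to a single coordinate: the center y for a horizontal line, the center x for a vertical one. The following is a minimal sketch under that assumption. `LineDeduper` and `mergeClose` are hypothetical names, and treating "closer than the threshold" as a duplicate is an assumed reading of the duplicate check, which the code above does not spell out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: collapses line positions that lie too close together. */
final class LineDeduper {
    private LineDeduper() {}

    /**
     * Sorts a copy of the positions and keeps a position only when it is at least
     * distanceThreshold away from the last position that was kept.
     */
    static List<Double> mergeClose(List<Double> positions, double distanceThreshold) {
        List<Double> sorted = new ArrayList<>(positions); // leave the caller's list untouched
        Collections.sort(sorted);
        List<Double> unique = new ArrayList<>();
        for (double p : sorted) {
            if (unique.isEmpty() || p - unique.get(unique.size() - 1) >= distanceThreshold) {
                unique.add(p);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        // Two detections near y=10 and two near y=100 collapse to one line each
        System.out.println(mergeClose(Arrays.asList(100.0, 12.0, 10.0, 103.0, 50.0), 5));
    }
}
```

Because each position is compared with the last kept one rather than with its immediate neighbor, a run of closely spaced detections cannot chain together and swallow a line that is genuinely far from the first.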

5. Intersections of the horizontal and vertical lines

Compute the intersection of every horizontal line with every vertical line

            // One row of intersection points per horizontal line
            List<List<Point>> intersectionList = new ArrayList<>();
            for (MatOfPoint hLine : horizontalLines) {
                List<Point> intersectionRow = new ArrayList<>();
                for (MatOfPoint vLine : verticalLines) {
                    Point intersection = getIntersection(hLine, vLine);
                    intersectionRow.add(intersection);
                }
                intersectionList.add(intersectionRow);
            }

Get the intersection of two lines

    private Point getIntersection(MatOfPoint line1, MatOfPoint line2) {
        Point[] points1 = line1.toArray();
        Point[] points2 = line2.toArray();

        double x1 = points1[0].x, y1 = points1[0].y, x2 = points1[1].x, y2 = points1[1].y;
        double x3 = points2[0].x, y3 = points2[0].y, x4 = points2[1].x, y4 = points2[1].y;

        // Both segments are treated as infinite lines, so the result may lie beyond their endpoints.
        // det is 0 for parallel lines; here one line is horizontal and the other vertical, so it is non-zero.
        double det = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4);
        double x = ((x1 * y2 - y1 * x2) * (x3 - x4) - (x1 - x2) * (x3 * y4 - y3 * x4)) / det;
        double y = ((x1 * y2 - y1 * x2) * (y3 - y4) - (y1 - y2) * (x3 * y4 - y3 * x4)) / det;

        return new Point(x, y);
    }
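
The determinant formula can be checked without OpenCV by working on plain coordinate arrays. The sketch below uses the same formula and adds a guard for (nearly) parallel lines, for which the division would otherwise produce NaN or infinite coordinates. `SegmentIntersection`, `intersect` and the `1e-9` epsilon are hypothetical choices for illustration.

```java
/** Hypothetical helper: intersection of two infinite lines given as {x1, y1, x2, y2}. */
final class SegmentIntersection {
    private SegmentIntersection() {}

    /** Returns {x, y}, or null when the two lines are (nearly) parallel. */
    static double[] intersect(double[] a, double[] b) {
        double x1 = a[0], y1 = a[1], x2 = a[2], y2 = a[3];
        double x3 = b[0], y3 = b[1], x4 = b[2], y4 = b[3];

        double det = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4);
        if (Math.abs(det) < 1e-9) {
            return null; // parallel or coincident: no single intersection point
        }
        double c1 = x1 * y2 - y1 * x2; // cross term of the first line
        double c2 = x3 * y4 - y3 * x4; // cross term of the second line
        double x = (c1 * (x3 - x4) - (x1 - x2) * c2) / det;
        double y = (c1 * (y3 - y4) - (y1 - y2) * c2) / det;
        return new double[] {x, y};
    }

    public static void main(String[] args) {
        double[] horizontal = {0, 10, 100, 10}; // y = 10
        double[] vertical = {30, 0, 30, 50};    // x = 30
        double[] p = intersect(horizontal, vertical);
        System.out.println(p[0] + ", " + p[1]);
    }
}
```

For the horizontal line y = 10 and the vertical line x = 30, det is 5000 and the formula evaluates to the point (30, 10), as expected.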

6. Build the cells

            List<List<Rect>> cells = new ArrayList<>();

            // Build the cells: each cell spans two adjacent intersections in a row
            // and the matching intersection in the next row
            for (int i = 0; i < intersectionList.size() - 1; i++) {
                List<Rect> rowCells = new ArrayList<>();
                for (int j = 0; j < intersectionList.get(i).size() - 1; j++) {
                    Point p1 = intersectionList.get(i).get(j);         // top-left corner
                    Point p2 = intersectionList.get(i).get(j + 1);     // top-right corner
                    Point p3 = intersectionList.get(i + 1).get(j);     // bottom-left corner
                    // Rect(x, y, width, height)
                    Rect cell = new Rect((int) p1.x, (int) p1.y, (int) (p2.x - p1.x), (int) (p3.y - p1.y));
                    rowCells.add(cell);
                }
                cells.add(rowCells);
            }
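
Because the intersections are computed on infinite lines and then truncated to integers, a cell rectangle can end up slightly outside the image or with a non-positive size, and cropping such a rectangle is likely to fail in the next step. A minimal sketch of a bounds check follows. It works on plain integers rather than on `Rect`, and `CellBounds` and `clamp` are hypothetical names.

```java
/** Hypothetical helper: clips a cell rectangle to the image area. */
final class CellBounds {
    private CellBounds() {}

    /**
     * Returns {x, y, width, height} clipped to [0, imageWidth) x [0, imageHeight),
     * or null when nothing of the rectangle remains inside the image.
     */
    static int[] clamp(int x, int y, int width, int height, int imageWidth, int imageHeight) {
        int left = Math.max(0, x);
        int top = Math.max(0, y);
        int right = Math.min(imageWidth, x + width);
        int bottom = Math.min(imageHeight, y + height);
        if (right <= left || bottom <= top) {
            return null; // empty or inverted rectangle
        }
        return new int[] {left, top, right - left, bottom - top};
    }

    public static void main(String[] args) {
        // A cell that starts 2 px left of the image and runs past its right and bottom edges
        int[] r = clamp(-2, 5, 50, 20, 40, 20);
        System.out.println(r[0] + ", " + r[1] + ", " + r[2] + ", " + r[3]);
    }
}
```

A caller could build the `Rect` from the clipped values and skip the cell when `clamp` returns null.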

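7. Recognize each cell

            // Assumption: `tess` is a configured Tess4J Tesseract instance.
            List<List<String>> table = new ArrayList<>(); // recognized text, row by row
            for (int i = 0; i < cells.size(); i++) {
                List<String> row = new ArrayList<>();
                for (int j = 0; j < cells.get(i).size(); j++) {
                    Rect cell = cells.get(i).get(j);
                    // Crop the cell out of the grayscale image
                    Mat cellImage = new Mat(gray, cell);
                    BufferedImage bufferedImage = matToBufferedImage(cellImage);
                    if (bufferedImage == null) {
                        row.add(""); // keep the columns aligned
                        continue;
                    }
                    // doOCR throws TesseractException; the enclosing method must handle or declare it
                    String text = tess.doOCR(bufferedImage);
                    row.add(text);
                }
                table.add(row); // keep the finished row
            }

    // Converts an 8-bit Mat with 1 or 3 channels to a BufferedImage
    private BufferedImage matToBufferedImage(Mat mat) {
        int type = BufferedImage.TYPE_BYTE_GRAY;
        if (mat.channels() > 1) {
            type = BufferedImage.TYPE_3BYTE_BGR;
        }
        int bufferSize = mat.channels() * mat.cols() * mat.rows();
        byte[] buffer = new byte[bufferSize];
        mat.get(0, 0, buffer); // read all pixel values
        BufferedImage image = new BufferedImage(mat.cols(), mat.rows(), type);
        final byte[] targetPixels = ((DataBufferByte) image.getRaster().getDataBuffer()).getData();
        System.arraycopy(buffer, 0, targetPixels, 0, buffer.length);
        return image;
    }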
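
The text returned for a cell usually carries surrounding whitespace and line breaks, so it may be worth normalizing it before storing it in the row. The following is a small sketch of one way to do that. `CellText` and `normalize` are hypothetical names, and collapsing every run of whitespace into a single space is an assumption that suits single-line cells but discards the line structure of multi-line ones.

```java
/** Hypothetical helper: tidies the raw text recognized for one table cell. */
final class CellText {
    private CellText() {}

    /** Collapses runs of whitespace (including line breaks) to one space and trims the ends. */
    static String normalize(String raw) {
        if (raw == null) {
            return "";
        }
        return raw.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        System.out.println("[" + normalize("  Total\n amount \n") + "]");
    }
}
```

In the recognition loop this would replace `row.add(text)` with `row.add(CellText.normalize(text))`.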
