A simple OSD implementation (text color inversion with OpenCV, font switching with freetype2, including Chinese characters and spaces)

Iflyinsky2013 · Published 2019-04-28 17:08:59 · 4952 views

#PS: This is only my own understanding. If it conflicts with your principles, please bear with me and go easy on the criticism.

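Background:

In machine vision projects, algorithm results or other text are often drawn onto images to make the algorithm easier to visualize or to improve a demo. Put simply, text needs to be displayed at a given position on an image.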

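OSD overview

OSD here is short for on-screen display, that is, display on a "screen". The "screen" in this case is a single image frame, so OSD can be understood as overlaying information on a frame.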

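Text color inversion and font switching

Text color inversion: as the name suggests, the text switches to the opposite color depending on some condition (here, the state of the background image). For example: black and white.

Font switching: a font determines what a character looks like when it is displayed, for example KaiTi (regular script), cursive script, or SongTi.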

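Mean gray level and freetype2

Mean gray level: use OpenCV to compute the mean gray level of the image data that lies under the glyph's bitmap. The point is to judge how bright that block is: if it is too bright (close to white), draw the text in black; if it is too dark, draw it in white.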
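
The brightness test described above can be sketched as follows. This is a minimal illustration rather than the project code: the function name `pick_text_color` and the default threshold of 128 are assumptions (128 matches the value used in the listing further down), and the two test images are synthetic.

```python
import numpy as np

def pick_text_color(gray_img, x, y, w, h, threshold=128):
    """Return black for a bright block and white for a dark one.

    gray_img is a 2D uint8 array indexed as [row (y), column (x)].
    """
    region = gray_img[y:y + h, x:x + w]
    if region.size == 0:
        # Nothing under the glyph (for example a zero-sized glyph): default to white.
        return (255, 255, 255)
    return (0, 0, 0) if region.mean() > threshold else (255, 255, 255)

# Synthetic backgrounds: one bright, one dark.
bright = np.full((20, 20), 200, dtype=np.uint8)
dark = np.full((20, 20), 30, dtype=np.uint8)
print(pick_text_color(bright, 2, 2, 8, 8))
print(pick_text_color(dark, 2, 2, 8, 8))
```

A single fixed threshold is the simplest choice; backgrounds whose mean sits close to the threshold will still give low contrast.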

freetype2: an open-source library that loads a wide range of standard font formats. Given a character, it returns that character's bitmap.
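
FreeType describes a rendered glyph bitmap as a flat byte buffer together with `rows`, `width` and `pitch` (the number of bytes per row). The sketch below shows one way to turn such a buffer into rows of coverage values. It assumes an 8-bit gray bitmap with a positive pitch; the helper name `bitmap_rows` and the sample buffer are made up for illustration and do not come from a real font.

```python
def bitmap_rows(buffer, rows, width, pitch):
    """Split a flat 8-bit glyph buffer into `rows` lists of `width` values.

    Each row occupies `pitch` bytes in the buffer; bytes past `width` are padding.
    """
    return [list(buffer[r * pitch:r * pitch + width]) for r in range(rows)]

# Hypothetical 2x3 glyph stored with a pitch of 4 (one padding byte per row).
fake_buffer = [0, 255, 0, 0,
               255, 255, 255, 0]
glyph = bitmap_rows(fake_buffer, rows=2, width=3, pitch=4)
for row in glyph:
    print("".join("#" if v else "." for v in row))
```

Reshaping the buffer by `width` alone only works when `pitch` equals `width`, which is why the helper takes both.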

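Python example (the C++ version went into a project, so I won't post it; it is very similar)

I won't analyze the code in depth here; a few comments and an outline of the approach are enough. This is roughly all it takes to get the behavior I wanted. A strong recommendation: for proof-of-concept code, Python is quicker to work with and very convenient.
Approach:
1 Get the bitmap of the desired character in the desired font.
2 From the start position and the index of the current character, work out which block of the image the character should be drawn in.
3 Crop that block into a small image and compute its mean gray level.
4 Choose black or white according to the gray level. This relies on contrast to make the text stand out.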
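
The four steps can be tried without a font file by substituting hand-made masks for the glyph bitmaps. The sketch below makes several simplifying assumptions: `draw_glyphs` is a hypothetical name, the glyphs are top-aligned at the start position rather than aligned on a common bottom line, and every glyph is assumed to lie fully inside the image.

```python
import numpy as np

def draw_glyphs(img, gray, glyphs, start, interval=0, threshold=128):
    """Overlay glyph masks left to right: black on bright areas, white on dark ones.

    img:    H x W x 3 uint8 image, modified in place
    gray:   H x W uint8 grayscale copy of img
    glyphs: list of 2D uint8 masks (nonzero = ink), stand-ins for font bitmaps (step 1)
    start:  (x, y) of the top-left corner of the first glyph
    """
    x, y = start
    for mask in glyphs:
        h, w = mask.shape
        if h and w:
            block = gray[y:y + h, x:x + w]                   # step 3: crop the block
            value = 0 if block.mean() > threshold else 255   # step 4: pick the color
            img[y:y + h, x:x + w][mask > 0] = value          # paint only the ink pixels
        x += w + interval                                    # step 2: next glyph position
    return img

# Synthetic image: left half bright, right half dark.
img = np.zeros((10, 20, 3), dtype=np.uint8)
img[:, :10] = 220
gray = img[:, :, 0].copy()
box = np.full((4, 3), 255, dtype=np.uint8)
draw_glyphs(img, gray, [box, box], start=(2, 2), interval=9)
print(img[3, 3], img[3, 15])
```

Indexing with the boolean mask replaces the per-pixel double loop and is usually much faster in NumPy.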

from freetype import *

import math
import numpy as np
import cv2

# Return the BGR image and its grayscale version.
def GetBGRAndGrayImg(filename):
    # cv2.imread returns pixels in BGR order, or None if the file cannot be read
    img = cv2.imread(filename)
    if img is None:
        raise IOError("cannot read image: " + filename)
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return img, img_gray

# Initialize the freetype2 font face.
freetype_face = None
def InitFreeType(path):
    global freetype_face
    freetype_face = Face(path)

# Set the character size to request from the font. The size is approximate
# (the closest size the font provides); a font may contain each glyph in several sizes.
def SetFreeTypeCharPixelSize(pixel_w, pixel_h):
    freetype_face.set_pixel_sizes(pixel_w, pixel_h)

# Set the glyph rotation. angle is in radians; the matrix entries are 16.16 fixed point.
def SetFreeTypeCharRotate(angle):
    # Rotation matrix [[cos, -sin], [sin, cos]]
    matrix = Matrix(int(math.cos(angle) * 0x10000), int(-math.sin(angle) * 0x10000),
                    int(math.sin(angle) * 0x10000), int(math.cos(angle) * 0x10000))
    freetype_face.set_transform(matrix, Vector(0, 0))

# Return the glyph as a matrix plus its width and height.
# Nonzero pixels belong to the glyph, zero pixels are background.
def GetCharMatrixFromFont(char):
    freetype_face.load_char(char)
    bitmap = freetype_face.glyph.bitmap
    # Each row takes `pitch` bytes in the buffer; keep only the first `width` of them.
    buf = np.array(bitmap.buffer, dtype=np.uint8).reshape(bitmap.rows, bitmap.pitch)
    return buf[:, :bitmap.width], bitmap.width, bitmap.rows

# For each character: find the image region its bitmap covers, compute that region's
# mean gray level, choose the color, then replace the pixels.
# Note: the bottom edge of the first character is used as the alignment line.
# start_pos is (x, y); the text is assumed to fit inside the image.
def GetOSDImg(img, img_g, text, start_pos, interval=0):
    cur_pos_x = start_pos[0]
    baseline_y = start_pos[1]
    for text_i, text_e in enumerate(text):
        char_array, char_width, char_height = GetCharMatrixFromFont(text_e)

        if text_i == 0:
            baseline_y += char_height

        cur_pos_y = baseline_y - char_height

        # Crop the block under the glyph. NumPy indexes as [row (y), column (x)].
        gray_matrix = img_g[cur_pos_y:cur_pos_y + char_height, cur_pos_x:cur_pos_x + char_width]
        # Mean gray level; an empty glyph (such as a space) has no pixels to average.
        gray_matrix_mean = gray_matrix.mean() if gray_matrix.size else 0
        # Black on a bright background, white on a dark one (same value in BGR and RGB).
        color = [0, 0, 0] if gray_matrix_mean > 128 else [255, 255, 255]

        for h, h_e in enumerate(char_array):
            for w, w_e in enumerate(h_e):
                if w_e == 0:
                    continue
                img[cur_pos_y + h, cur_pos_x + w] = color

        # Calculate the next character position.
        cur_pos_x += char_width + interval
    return img

if __name__ == "__main__":

    img_t, img_g_t = GetBGRAndGrayImg("test.jpg")
    img = cv2.resize(img_t, (352, 288), interpolation=cv2.INTER_AREA)
    img_g = cv2.resize(img_g_t, (352, 288), interpolation=cv2.INTER_AREA)

    InitFreeType('mmm.ttf')
    # SetFreeTypeCharPixelSize(10, 10)
    freetype_face.set_char_size(5*64, 0, 300, 0)
    SetFreeTypeCharRotate(0)

    osd_img = GetOSDImg(img, img_g, "km/habcdefg你好吗？", np.array([50, 50]), 3)

    cv2.imshow('osd', osd_img)
    # Wait for a key press so the window stays open.
    cv2.waitKey(0)


    #
    # # First pass to compute bbox
    # width, height, baseline = 0, 0, 0
    # previous = 0
    # for i, c in enumerate(text):
    #     face.load_char(c)
    #     bitmap = slot.bitmap
    #     height = max(height,
    #                  bitmap.rows + max(0,-(slot.bitmap_top-bitmap.rows)))
    #     baseline = max(baseline, max(0,-(slot.bitmap_top-bitmap.rows)))
    #     kerning = face.get_kerning(previous, c)
    #     width += (slot.advance.x >> 6) + (kerning.x >> 6)
    #     previous = c
    #
    # Z = numpy.zeros((height,37), dtype=numpy.ubyte)
    # print(Z.shape)
    # # Second pass for actual rendering
    # x, y = 0, 0
    # previous = 0
    # for c in text:
    #     face.load_char(c)
    #     bitmap = slot.bitmap
    #     top = slot.bitmap_top
    #     left = slot.bitmap_left
    #     w,h = bitmap.width, bitmap.rows
    #     y = height-baseline-top
    #     kerning = face.get_kerning(previous, c)
    #     x += (kerning.x >> 6)
    #     print(x, y ,h, w)
    #     Z[y:y+h,x:x+w] += numpy.array(bitmap.buffer, dtype='ubyte').reshape(h,w)
    #     x += (slot.advance.x >> 6)
    #     previous = c
    # print(Z.shape)
    # img = cv2.imread("test.jpg")
    # img_g = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    #
    # array = np.array(img)
    # #RGB
    # array = array.reshape(520, 520, 3)[:, :, (2, 1, 0)]
    #
    # print(array)
    #
    # array_g = np.array(img_g)
    # array_g = array_g.reshape(520, 520)
    #
    # print(array_g.shape)
    # x = 250
    # y = 250
    #
    # front_matrix = array_g[x: x + Z.shape[0], y: y+Z.shape[1] ]
    # front_matrix_mean = front_matrix.mean()
    # print(front_matrix_mean)
    # for h, h_e in enumerate(Z):
    #     for w, w_e in enumerate(h_e):
    #         if w_e == 0:
    #             continue
    #         if front_matrix_mean > 128:
    #             #R
    #             array[ x + w, y + h , 0] = 0
    #             #G
    #             array[ x + w, y + h, 1 ] = 0
    #             #B
    #             array[ x + w, y + h , 2] = 0
    #         else:
    #             #R
    #             array[ x + w, y + h , 0] = 255
    #             #G
    #             array[ x + w, y + h, 1 ] = 255
    #             #B
    #             array[ x + w, y + h , 2] = 255
    #
    # # plt.figure(figsize=(10, 10*Z.shape[0]/float(Z.shape[1])))
    # plt.imshow(array)
    # plt.xticks([]), plt.yticks([])
    # plt.show()

Results

Actual test result:
[image]

The font file when opened:
[image]

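Update 2019/7/1: Chinese support and spaces in the C++ version

Chinese characters

Set freetype to use the Unicode charmap:

FT_Select_Charmap(ft_face,FT_ENCODING_UNICODE);

Use wchar_t instead of char, and std::wstring instead of std::string. Each wchar_t is then used as the character code to look up the glyph in freetype.

std::wstring tmp_str = L"Fu*k You!!!    \x20星星";
const wchar_t * tmp = L"Fu*k You!!!    \x20星星";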
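
For comparison with the Python version: a Python `str` is already a sequence of Unicode code points, so no wide-string type is needed there. The sketch below only prints the code points that a Unicode charmap lookup would be keyed on; `code_points` is a hypothetical helper and no FreeType call is made.

```python
def code_points(text):
    """Return the Unicode code point of every character in `text`."""
    return [ord(ch) for ch in text]

sample = "A 你"
for ch, cp in zip(sample, code_points(sample)):
    print(repr(ch), hex(cp))
```

In C++ a plain `char` string holds encoded bytes rather than code points, which is the reason for switching to `wchar_t` above.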

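Spaces

I found that looking up a space in freetype returns an empty glyph whose width and height are both 0, so spaces need special handling. Adding a suitable x-axis offset in place of the space is enough.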
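
The space workaround amounts to one extra branch when advancing the drawing position. The sketch below is written in Python for consistency with the listing above, not taken from the C++ project; `next_pen_x` and the default `space_width` of 8 pixels are assumptions chosen for illustration.

```python
def next_pen_x(pen_x, glyph_width, interval=0, space_width=8):
    """Advance the pen; an empty glyph (width 0, e.g. a space) uses `space_width`."""
    step = glyph_width if glyph_width > 0 else space_width
    return pen_x + step + interval

# Glyph widths for a character, a space, and another character.
pen = 50
positions = []
for width in [6, 0, 7]:
    positions.append(pen)
    pen = next_pen_x(pen, width, interval=3)
print(positions, pen)
```

An alternative would be to use the advance width that FreeType reports for each glyph (the commented-out code in the listing reads `slot.advance.x >> 6`), which also covers spaces without a hard-coded width.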

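Result after fixing the issues above

[image]
#PS: Please respect original work; if you don't like it, please don't flame.

If you have questions, leave a comment and I will reply as soon as I see it.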
