A YCbCr2RGB tool with a custom conversion matrix, built on numpy matrix operations

Posted on 2018-01-19, updated on 2026-01-07, filed under linux, image

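The matrix that OpenCV's built-in cvtColor uses for color-space conversion does not seem easy to replace, and so far I have not found a proper way to do it. In the end I worked it out myself: with numpy's matrix operations, a YCbCr image can be converted to RGB about as fast as cvtColor does it.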

This relies on numpy's dot product for multi-dimensional arrays. The numpy documentation spells out how dot is computed in that case:

dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])
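As a quick sanity check of this rule, the sketch below compares np.dot against the explicit sum for one element. The array shapes are arbitrary values chosen for illustration, not anything taken from the tool itself.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((2, 3, 4))   # indexed by i, j, and the summed axis
b = rng.random((5, 4, 6))   # indexed by k, the summed axis, and m

# dot contracts the last axis of a with the second-to-last axis of b.
out = np.dot(a, b)
print(out.shape)  # (2, 3, 5, 6)

# Spot-check one element against the formula from the numpy documentation.
i, j, k, m = 1, 2, 3, 4
manual = np.sum(a[i, j, :] * b[k, :, m])
assert np.isclose(out[i, j, k, m], manual)
```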

First, the NV12/NV16 YCbCr data has to be converted to YCbCr 4:4:4.
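A minimal sketch of that expansion for NV12, using only numpy. Note the assumption: it upsamples chroma by plain nearest-neighbour repetition, which is simpler than an interpolating resize, and the helper name `nv12_to_yuv444` is made up for this illustration. For NV16 the chroma plane already has full height, so only the horizontal repeat would be needed.

```python
import numpy as np

def nv12_to_yuv444(buf, width, height):
    """Expand an NV12 buffer (Y plane + interleaved CbCr plane) to HxWx3."""
    y_size = width * height
    y = buf[:y_size].reshape(height, width)
    # Interleaved chroma: Cb at even offsets, Cr at odd offsets,
    # subsampled by 2 in both directions.
    cbcr = buf[y_size:].reshape(height // 2, width // 2, 2)
    # Nearest-neighbour upsampling: repeat every chroma sample 2x2.
    cbcr_full = cbcr.repeat(2, axis=0).repeat(2, axis=1)
    return np.dstack([y, cbcr_full[..., 0], cbcr_full[..., 1]])

# Tiny synthetic 4x2 frame: 8 luma bytes, then Cb0 Cr0 Cb1 Cr1.
width, height = 4, 2
y_plane = np.arange(8, dtype=np.uint8)
chroma = np.array([100, 200, 110, 210], dtype=np.uint8)
yuv444 = nv12_to_yuv444(np.concatenate([y_plane, chroma]), width, height)
print(yuv444.shape)  # (2, 4, 3)
```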

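Next, the goal is to turn the three YCbCr channels into RGB channels: left-multiplying [Y, Cb, Cr].T by a 3x3 matrix yields [R, G, B].T. The YCbCr data used here is the BT.709 full-swing variant, meaning Y, Cb and Cr all span 0-255. To get the math right, first divide the 0 ~ 255 integers by 256 to turn them into floats, then subtract 0.5 from Cb and Cr so that they fall in approximately the -0.5 ~ 0.5 range. After the matrix multiplication, the result is RGB in the 0 ~ 1 range.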
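To make the arithmetic concrete, here is a small sketch that pushes a single pixel through these steps. It uses the same rounded BT.709 coefficients as the program in this post; the helper name `pixel_to_rgb` is only for illustration.

```python
import numpy as np

# RGB -> YCbCr coefficients (BT.709, rounded); the inverse maps YCbCr -> RGB.
rgb2ycbcr = np.array([[0.213, 0.715, 0.072],
                      [-0.115, -0.385, 0.500],
                      [0.500, -0.454, -0.046]])
ycbcr2rgb = np.linalg.inv(rgb2ycbcr)

def pixel_to_rgb(y, cb, cr):
    v = np.array([y, cb, cr], dtype=np.float64) / 256  # scale to roughly 0 ~ 1
    v[1:] -= 0.5                                       # center Cb and Cr on zero
    rgb = ycbcr2rgb @ v                                # 3x3 matrix times column vector
    return np.clip(np.around(rgb * 256), 0, 255).astype(np.uint8)

print(pixel_to_rgb(128, 128, 128))  # neutral chroma, mid grey: [128 128 128]
print(pixel_to_rgb(255, 128, 128))  # neutral chroma, full luma: [255 255 255]
```

With Cb = Cr = 128 the chroma terms vanish after the 0.5 offset, so R, G and B all equal Y, which is a handy check that the offset and the matrix agree.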

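The three YCbCr channels are stored as [height x width x 3]. For this array to be left-multiplied by a [3 x 3] matrix, the axes have to be reordered to [height x 3 x width], which numpy.swapaxes can do (the program below uses numpy.transpose for the same purpose). From the dot formula above it follows that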

[3 x 3] dot [height x 3 x width] --> [3 x height x width]
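The shape bookkeeping can be checked with random data. The sketch below also compares the result with `yuv @ mat.T`, an alternative formulation (not the one used in this post) that contracts the channel axis directly and needs no transposes.

```python
import numpy as np

height, width = 4, 6
rng = np.random.default_rng(0)
mat = rng.random((3, 3))             # stand-in for the conversion matrix
yuv = rng.random((height, width, 3))

# Route described in the text: move channels to the middle, dot, move them back.
chw = np.dot(mat, np.transpose(yuv, (0, 2, 1)))  # [3 x height x width]
rgb_a = np.transpose(chw, (1, 2, 0))             # [height x width x 3]

# Alternative: matmul over the last axis gives the same numbers.
rgb_b = yuv @ mat.T

print(chw.shape, rgb_a.shape)  # (3, 4, 6) (4, 6, 3)
assert np.allclose(rgb_a, rgb_b)
```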

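Because the result comes out as [3 x height x width], the computed RGB array needs its axes reordered back as well. Finally, clip the RGB values to the valid range to prevent overflow. The final program is as follows: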

#!/usr/bin/env python3

import fire
import matplotlib.pyplot as plt
import numpy as np
import os
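# NOTE: scipy.misc.imsave/imresize were deprecated in SciPy 1.0 and later removed,
# so running this script as-is needs an old SciPy (with Pillow installed).
import scipy.misc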


class ShowYuv:
    # Rows are the BT.709 RGB -> YCbCr coefficients; inverting gives YCbCr -> RGB.
    _bt709_mat = np.linalg.inv(np.array([[0.213, 0.715, 0.072],
                                         [-0.115, -0.385, 0.500],
                                         [0.500, -0.454, -0.046]]))

    @staticmethod
    def _show_rgb(rgb):
        plt.imshow(rgb)
        plt.show()

    @staticmethod
    def _save_rgb(rgb, filename):
        if filename is not None:
            print(f'saving {filename}...')
            scipy.misc.imsave(filename, rgb)

    @staticmethod
    def _process_yuv444(yuv444, filename, bmp):
        rgb = ShowYuv._yuv2rgb(yuv444)
        ShowYuv._show_rgb(rgb)
        if bmp:
            ShowYuv._save_rgb(rgb, os.path.splitext(filename)[0] + '.bmp')

    @staticmethod
    def _yuv2rgb(yuv444):
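        # Scale to roughly 0 ~ 1 (this also converts uint8 to float).
        yuv444 = yuv444 / 256
        # Center Cb and Cr around zero.
        yuv444[:, :, 1:] -= 0.5
        # [height x width x 3] -> [height x 3 x width] so the 3x3 matrix can left-multiply.
        yuv444 = np.transpose(yuv444, (0, 2, 1))
        # The result is [3 x height x width].
        rgb = np.dot(ShowYuv._bt709_mat, yuv444)
        # Back to [height x width x 3].
        rgb = np.transpose(rgb, (1, 2, 0))
        # Scale back to 8 bits, round, and clip to avoid overflow.
        rgb = np.around(rgb * 256)
        rgb = np.clip(rgb, 0, 255).astype(np.uint8)
        return rgb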

    @staticmethod
    def yuv422sp(filename, width, height, bmp=0):
        yuv422 = np.fromfile(filename, dtype=np.uint8)
        # NV16: a full-size Y plane followed by an interleaved CbCr plane of the same size.
        if 2 * width * height != yuv422.shape[0]:
            raise ValueError('width or height error')
        yuv444 = np.empty([height, width, 3], dtype=np.uint8)
        yuv444[:, :, 0] = yuv422[:width * height].reshape(height, width)
        # Cb sits at even offsets of the chroma plane, Cr at odd offsets.
        # Both are half width, so stretch them back to full width.
        u = yuv422[width * height::2].reshape(height, width // 2)
        yuv444[:, :, 1] = scipy.misc.imresize(u, (height, width))
        v = yuv422[width * height + 1::2].reshape(height, width // 2)
        yuv444[:, :, 2] = scipy.misc.imresize(v, (height, width))
        ShowYuv._process_yuv444(yuv444, filename, bmp)

    @staticmethod
    def yuv420sp(filename, width, height, bmp=0):
        yuv420 = np.fromfile(filename, dtype=np.uint8)
        # NV12: a full-size Y plane followed by an interleaved CbCr plane of half that size.
        if 3 * width * height // 2 != yuv420.shape[0]:
            raise ValueError('width or height error')
        yuv444 = np.empty([height, width, 3], dtype=np.uint8)
        yuv444[:, :, 0] = yuv420[:width * height].reshape(height, width)
        # Cb sits at even offsets of the chroma plane, Cr at odd offsets.
        # Both are half width and half height, so stretch them back to full size.
        u = yuv420[width * height::2].reshape(height // 2, width // 2)
        yuv444[:, :, 1] = scipy.misc.imresize(u, (height, width))
        v = yuv420[width * height + 1::2].reshape(height // 2, width // 2)
        yuv444[:, :, 2] = scipy.misc.imresize(v, (height, width))
        ShowYuv._process_yuv444(yuv444, filename, bmp)


if __name__ == '__main__':
    fire.Fire(ShowYuv)
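
On a recent SciPy, where scipy.misc.imresize and scipy.misc.imsave are no longer available, Pillow can stand in for both. The following is only a sketch under that assumption: the helper names `resize_plane` and `save_rgb` are invented here, and bilinear resampling is chosen as a reasonable match for a smooth chroma upscale rather than a guaranteed bit-exact replacement.

```python
import numpy as np
from PIL import Image

def resize_plane(plane, height, width):
    """Bilinearly resize a 2-D uint8 plane to (height, width)."""
    # Pillow's resize takes (width, height), the opposite order of numpy shapes.
    img = Image.fromarray(plane).resize((width, height), Image.BILINEAR)
    return np.asarray(img)

def save_rgb(rgb, filename):
    """Write an HxWx3 uint8 array to disk; the format follows the file extension."""
    Image.fromarray(rgb).save(filename)

# A constant 2x2 chroma plane stays constant after upscaling to 4x4.
u = np.full((2, 2), 128, dtype=np.uint8)
print(resize_plane(u, 4, 4).shape)  # (4, 4)
```

With these helpers, a call such as `scipy.misc.imresize(u, (height, width))` would become `resize_plane(u, height, width)`.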