RGB to YUV conversion with libav (ffmpeg) triplicates image

【Question title】: RGB to YUV conversion with libav (ffmpeg) triplicates image
【Posted】: 2021-07-10 23:35:24
【Question】:

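I'm building a small program that captures the screen to video (using the X11 MIT-SHM extension). It works well if I create a separate PNG file for each captured frame, but now I'm trying to integrate libav (ffmpeg) to create a video, and I'm getting... interesting results.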

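This is the furthest I have been able to get. The expected result (a PNG created directly from the RGB data of the XImage) is this:

However, the result I'm getting is this:

As you can see, the colors are off and the image is cropped and repeated three times. I have a loop that captures the screen. At first I generated individual PNG files in it (that part is currently commented out in the code below); now I'm trying to use libswscale to convert from RGB24 to YUV420: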

while (gRunning) {
        printf("Processing frame framecnt=%i \n", framecnt);

        if (!XShmGetImage(display, RootWindow(display, DefaultScreen(display)), img, 0, 0, AllPlanes)) {
            printf("\n Ooops.. Something is wrong.");
            break;
        }

        // PNG generation
        // snprintf(imageName, sizeof(imageName), "salida_%i.png", framecnt);
        // writePngForImage(img, width, height, imageName);

        unsigned long red_mask = img->red_mask;
        unsigned long green_mask = img->green_mask;
        unsigned long blue_mask = img->blue_mask;

        // Write image data
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                unsigned long pixel = XGetPixel(img, x, y);

                unsigned char blue = pixel & blue_mask;
                unsigned char green = (pixel & green_mask) >> 8;
                unsigned char red = (pixel & red_mask) >> 16;

                pixel_rgb_data[y * width + x * 3] = red;
                pixel_rgb_data[y * width + x * 3 + 1] = green;
                pixel_rgb_data[y * width + x * 3 + 2] = blue;
            }
        }

        uint8_t* inData[1] = { pixel_rgb_data };
        int inLinesize[1] = { in_w };

        printf("Scaling frame... \n");
        int sliceHeight = sws_scale(sws_context, inData, inLinesize, 0, height, pFrame->data, pFrame->linesize);

        printf("Obtained slice height: %i \n", sliceHeight);
        pFrame->pts = framecnt * (pVideoStream->time_base.den) / ((pVideoStream->time_base.num) * 25);

        printf("Frame pts: %li \n", pFrame->pts);
        int got_picture = 0;

        printf("Encoding frame... \n");
        int ret = avcodec_encode_video2(pCodecCtx, &pkt, pFrame, &got_picture);

//                int ret = avcodec_send_frame(pCodecCtx, pFrame);

        if (ret != 0) {
            printf("Failed to encode! Error: %i\n", ret);
            return -1;
        }

        printf("Succeed to encode frame: %5d - size: %5d\n", framecnt, pkt.size);

        framecnt++;

        pkt.stream_index = pVideoStream->index;
        ret = av_write_frame(pFormatCtx, &pkt);

        if (ret != 0) {
            printf("Error writing frame! Error: %i \n", ret);
            return -1;
        }

        av_packet_unref(&pkt);
    }

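I have put the entire code at this gist. This question right here looks similar to mine, but it is not quite the same and its solution did not work for me, although I think the problem has something to do with how the line stride is calculated.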
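On the stride point: `sws_scale` takes the source stride in bytes per row, not in pixels. The following is a minimal sketch of that arithmetic for a tightly packed RGB24 buffer. The helper names are hypothetical, and the sketch assumes there is no padding at the end of each row. Whether `in_w` in the loop above already holds a byte count is not visible in the excerpt.

```c
#include <assert.h>
#include <stddef.h>

enum { RGB24_BYTES_PER_PIXEL = 3 };

/* Bytes per row of a tightly packed RGB24 image (assumption: no row padding). */
static int rgb24_stride(int width_px) {
    assert(width_px > 0);
    return width_px * RGB24_BYTES_PER_PIXEL;
}

/* Total buffer size in bytes for the same layout. */
static size_t rgb24_buffer_size(int width_px, int height_px) {
    assert(height_px > 0);
    return (size_t)rgb24_stride(width_px) * (size_t)height_px;
}
```

For a 1920x1080 capture this gives a stride of 5760 bytes and a buffer of 6220800 bytes. Under these assumptions, the stride is the value that belongs in `inLinesize[0]`.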

【Comments】:

Tags: c video ffmpeg yuv libav

【Solution 1】:

Don't use av_image_alloc; use av_frame_get_buffer.

(Unrelated to your problem, but using avcodec_encode_video2 is now considered bad practice; it should be replaced with avcodec_send_frame and avcodec_receive_packet.)

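【Comments】:

Thanks for the suggestion. I'm using the old style because I have an old version of libav (I'm on CentOS 7), but I'll try compiling a newer version of libav so I can use the new style.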

【Solution 2】:

In the end, the error was not in my use of libav but in the code that fills the RGB buffer with pixel data from the XImage. Instead of using:

                pixel_rgb_data[y * width + x * 3    ] = red;
                pixel_rgb_data[y * width + x * 3 + 1] = green;
                pixel_rgb_data[y * width + x * 3 + 2] = blue;

I should have used this:

                pixel_rgb_data[3 * (y * width + x)    ] = red;
                pixel_rgb_data[3 * (y * width + x) + 1] = green;
                pixel_rgb_data[3 * (y * width + x) + 2] = blue;

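Somehow I was multiplying only the horizontal offset within the matrix by 3, and not the vertical one. The moment I changed that, it worked perfectly.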
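The difference between the two index expressions can be checked in isolation. The following is a minimal sketch, not the original program: the helper names `buggy_offset`, `rgb24_offset`, and `count_written` are hypothetical, and the buffer is assumed to be tightly packed RGB24 (3 bytes per pixel, no row padding).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset used in the question: only x is scaled by 3, so row y starts at
 * byte y * width instead of byte 3 * y * width. */
static size_t buggy_offset(int x, int y, int width) {
    return (size_t)y * width + (size_t)x * 3;
}

/* Corrected offset: pixel number (y * width + x), 3 bytes per pixel. */
static size_t rgb24_offset(int x, int y, int width) {
    return 3 * ((size_t)y * width + (size_t)x);
}

/* Simulates one full pass over the image and returns how many distinct
 * bytes of the buffer get written. `touched` must hold 3 * width * height
 * bytes; it is used as scratch space (1 = byte was written). */
static size_t count_written(int width, int height, int use_buggy,
                            uint8_t *touched) {
    size_t total = (size_t)3 * width * height;
    size_t count = 0;

    assert(touched != NULL && width > 0 && height > 0);
    memset(touched, 0, total);

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            size_t base = use_buggy ? buggy_offset(x, y, width)
                                    : rgb24_offset(x, y, width);
            for (int c = 0; c < 3; c++) {   /* red, green, blue */
                if (!touched[base + c]) {
                    touched[base + c] = 1;
                    count++;
                }
            }
        }
    }
    return count;
}
```

For a 4x3 image with a 36-byte buffer, `count_written(4, 3, 1, buf)` reports 20 bytes and `count_written(4, 3, 0, buf)` reports all 36. With the original expression each row starts only `width` bytes after the previous one, so it overwrites two thirds of that row, and for tall images roughly two thirds of the buffer is never written at all.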

【Comments】: