【Question title】: Add cover art to an Ogg file containing an Opus audio stream with ffmpeg, without re-encoding the audio stream
【Posted】: 2020-08-04 06:46:01
【Question】:

I am trying to add cover art to an ogg file using ffmpeg.

Here are my source.ogg and source.jpg files:

$ ffprobe -hide_banner source.ogg 
Input #0, ogg, from 'source.ogg':
  Duration: 00:03:02.45, start: 0.007500, bitrate: 73 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
$ identify source.jpg 
source.jpg JPEG 480x360 480x360+0+0 8-bit DirectClass 15.1KB 0.000u 0:00.000

I tried this:

$ ffmpeg -hide_banner -i source.ogg -i source.jpg -map 0 -map 1 -c:a copy -c copy -map_metadata 0 dest.ogg -y && echo && ffprobe -hide_banner dest.ogg 
Input #0, ogg, from 'source.ogg':
  Duration: 00:03:02.45, start: 0.007500, bitrate: 73 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
Input #1, image2, from 'source.jpg':
  Duration: 00:00:00.04, start: 0.000000, bitrate: 3023 kb/s
    Stream #1:0: Video: mjpeg, yuvj420p(pc, bt470bg/unknown/unknown), 480x360 [SAR 1:1 DAR 4:3], 25 tbr, 25 tbn, 25 tbc
[ogg @ 0x5655578064c0] Unsupported codec id in stream 1
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
  Stream #1:0 -> #0:1 (copy)
    Last message repeated 1 times
[ogg @ 0x5655577e8540] Format ogg detected only with low score of 1, misdetection possible!
dest.ogg: End of file

I also found this answer, but it does not explain how to do it with ffmpeg.

I read that Ogg files can carry a "METADATA_BLOCK_PICTURE" metadata field that holds a picture in base64, so I tried this:

$ ffmpeg -hide_banner -i source.ogg -map 0 -c:a copy -c copy -metadata METADATA_BLOCK_PICTURE="$(base64 source.jpg)" dest.ogg
Input #0, ogg, from 'source.ogg':
  Duration: 00:03:02.45, start: 0.007500, bitrate: 73 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
File 'dest.ogg' already exists. Overwrite ? [y/N] y
Output #0, ogg, to 'dest.ogg':
  Metadata:
    METADATA_BLOCK_PICTURE: /9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkz
                    : ODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2MBERISGBUYLxoaL2NCOEJjY2NjY2NjY2Nj
                    ..............................................................................
                    : nVmaS2E/urUWVbH6ORI9z2l8zyRfFpkLooIHSBuk9lFFoC6OBnP1SON8rEooqM2WOVHDdRRAAUVK
                    : KiiCWRRRRBJ//9k=
    encoder         : Lavf58.20.100
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
      METADATA_BLOCK_PICTURE: /9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkz
                      : ODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2MBERISGBUYLxoaL2NCOEJjY2NjY2NjY2Nj
                      : Y2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY//AABEIAWgB4AMBIgACEQED
                      ..............................................................................
                      : nVmaS2E/urUWVbH6ORI9z2l8zyRfFpkLooIHSBuk9lFFoC6OBnP1SON8rEooqM2WOVHDdRRAAUVK
                      : KiiCWRRRRBJ//9k=
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
size=    1658kB time=00:03:02.41 bitrate=  74.5kbits/s speed=1.01e+03x    
video:0kB audio:1624kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.100392%

It sort of "works", but neither ffplay nor mpv can parse the cover art:

$ ffplay -hide_banner dest.ogg
[ogg @ 0x5655577e8540] Failed to parse cover art block.
Input #0, ogg, from 'dest.ogg':
  Duration: 00:03:02.44, start: 0.000000, bitrate: 74 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
   3.95 M-A: -0.000 fd=   0 aq=   14KB vq=    0KB sq=    0B f=0/0    
$ mpv dest.ogg 
Playing: dest.ogg
[ffmpeg/demuxer] ogg: Failed to parse cover art block.
 (+) Audio --aid=1 (opus 2ch 48000Hz)
AO: [pulse] 48000Hz stereo 2ch float
A: 00:00:03 / 00:03:02 (2%)


Exiting... (Quit)

I also tried -metadata:s:a together with base64 --wrap 0 (which I had forgotten to specify, oops :)):

$ ffmpeg -i source.ogg -map 0 -c:a copy -c copy -metadata:s:a METADATA_BLOCK_PICTURE="$(base64 --wrap 0 source.jpg)" dest.ogg
Input #0, ogg, from 'source.ogg':
  Duration: 00:03:02.45, start: 0.007500, bitrate: 73 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
File 'dest.ogg' already exists. Overwrite ? [y/N] y
Output #0, ogg, to 'dest.ogg':
  Metadata:
    encoder         : Lavf58.20.100
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
      METADATA_BLOCK_PICTURE: /9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkzODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2MBERISGBUYLxoaL2NCOEJjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY//AABEIAWgB4AMBIgACEQEDEQH/xAAaAAACAwEBAAAAAAAAAAA
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
size=    1658kB time=00:03:02.41 bitrate=  74.5kbits/s speed=1.22e+03x    
video:0kB audio:1624kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.084397%

The JPEG cover in dest.ogg still cannot be read correctly:

$ ffprobe -hide_banner dest.ogg 
[ogg @ 0x5655577e8540] Invalid picture type: -2555936.
[ogg @ 0x5655577e8540] Could not read mimetype from an attached picture.
Input #0, ogg, from 'dest.ogg':
  Duration: 00:03:02.44, start: 0.000000, bitrate: 74 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
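
For what it's worth, the number in the first error line can be reproduced from the tag value itself. The sketch below (Python, not part of the original post) decodes the first eight base64 characters shown in the log above and reads the first four bytes as a signed big-endian 32-bit integer, which is where a FLAC-style picture block carries its picture type. That the demuxer reads the field exactly this way is an assumption, inferred only from the matching number.

```python
import base64
import struct

# First eight base64 characters of the METADATA_BLOCK_PICTURE value in the log.
tag_prefix = "/9j/4AAQ"

raw = base64.b64decode(tag_prefix)

# Interpret the first four bytes as a signed big-endian 32-bit integer,
# i.e. the position where a picture block would store its picture type.
picture_type = struct.unpack(">i", raw[:4])[0]

print(raw[:4].hex(), picture_type)  # ffd8ffe0 -2555936
```

Those four bytes are the start of the JPEG file itself, which suggests the tag value is missing a picture header in front of the image data.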

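Can you help me?

【Comments】:

  • Any luck finding a solution so far?
  • @cregox I just updated the title of my question because it did not reflect what I need. Could you take a look?
  • Yes. My similar question was closed without an answer. opusenc apparently cannot compress as well as ffmpeg. In fact, at a 16k bitrate I only get files twice the size I want. I was hoping you already had a solution... Have you tried that script or other similar tools?

Tags: ffmpeg command-line-interface ogg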


【Solution 1】:

This worked for me:

ffmpeg -i mysong.ogg -i coverart.jpg song_with_art.ogg

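【Comments】:

  • Could you run ffprobe on the output file (song_with_art.ogg) and paste the output?
  • In my example, the ogg file contains an Opus audio stream rather than a Vorbis audio stream. Can you do this with an Opus stream without re-encoding the audio?
【Solution 2】: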

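FFmpeg version 4.4 automatically supports embedding album art in an Ogg container by using the Theora video codec (see "Ogg codecs" on Wikipedia for a list of codecs the container supports, although FFmpeg may not support all of them).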

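This differs from MP3 files, which store album art as a binary-encoded string in a special-purpose tag. That lets media players correctly detect the file as audio (for example, with mpv's --audio-display option) and avoids redrawing frames during playback. FFmpeg does not write album art into an Ogg file this way (the METADATA_BLOCK_PICTURE comment field mentioned in the question is the closest equivalent), so it simply adds a regular video stream to the file. The frame rate of this video stream is set to 90000 (at least for JPEG input), which produces a harmless warning.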
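
For comparison, the tag-based route that the question experiments with can be sketched as follows. Vorbis-comment-style tags have a METADATA_BLOCK_PICTURE field whose value is the base64 of a FLAC picture block (a small binary header followed by the image), not the base64 of the image file alone. The Python helper below is only a sketch: the function name and its defaults are made up for illustration, the image dimensions have to be supplied by the caller, and whether a given FFmpeg version passes the resulting tag through intact with -metadata has not been verified here.

```python
import base64
import struct


def build_picture_block(image: bytes, mime: str, width: int, height: int,
                        depth: int = 24, picture_type: int = 3,
                        description: str = "") -> str:
    """Return a base64 METADATA_BLOCK_PICTURE value (hypothetical helper).

    Layout follows the FLAC picture block: every integer is an unsigned
    big-endian 32-bit value. picture_type 3 means "front cover".
    """
    mime_bytes = mime.encode("ascii")
    desc_bytes = description.encode("utf-8")
    block = b"".join([
        struct.pack(">I", picture_type),
        struct.pack(">I", len(mime_bytes)), mime_bytes,
        struct.pack(">I", len(desc_bytes)), desc_bytes,
        # width, height, colour depth, number of indexed colours (0 = none)
        struct.pack(">IIII", width, height, depth, 0),
        struct.pack(">I", len(image)), image,
    ])
    # No line wrapping: the tag value must be a single line.
    return base64.b64encode(block).decode("ascii")
```

With the 480x360 source.jpg from the question, the value would come from something like build_picture_block(open("source.jpg", "rb").read(), "image/jpeg", 480, 360). For large images the resulting string may be too long to pass on a command line.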

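At least in mpv this does not degrade performance, because it only redraws as often as the screen refresh rate allows. Only one frame is encoded in the video stream, which can be verified manually by running ffprobe -v error -select_streams v:0 -count_packets -show_entries stream=nb_read_packets -of csv=p=0 input.ogg, as suggested in this answer. If desired, the frame rate can be set to 1 manually with the -r:v 1 option. See the comments for further discussion.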
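
If the single-frame check needs to be repeated over several files, the ffprobe call above can be wrapped in a few lines of Python. This is a minimal sketch, not something from the original answer: the function names are invented, ffprobe is assumed to be on PATH, and the parser assumes the csv=p=0 output is a single number, possibly followed by a trailing comma.

```python
import subprocess


def ffprobe_count_cmd(path: str) -> list[str]:
    """Build the packet-counting ffprobe command quoted in the answer."""
    return ["ffprobe", "-v", "error", "-select_streams", "v:0",
            "-count_packets", "-show_entries", "stream=nb_read_packets",
            "-of", "csv=p=0", path]


def parse_packet_count(output: str) -> int:
    """Extract the packet count from ffprobe's csv output."""
    for line in output.splitlines():
        field = line.strip().rstrip(",")
        if field:
            return int(field)
    raise ValueError("no packet count in ffprobe output")


def count_video_packets(path: str) -> int:
    """Run ffprobe and return the number of packets in the first video stream."""
    result = subprocess.run(ffprobe_count_cmd(path), capture_output=True,
                            text=True, check=True)
    return parse_packet_count(result.stdout)
```

A cover-art stream produced as described here should make count_video_packets("input.ogg") return 1.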

Below is an example that converts an MP3 file, which has a video track containing the album art, into an Ogg file with Opus-encoded audio and Theora-encoded video:

$ ffprobe -hide_banner '01 - State of Grace.mp3' 
[mp3 @ 0x5594cbafe320] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from '01 - State of Grace.mp3':
  Metadata:
    lyrics-eng      :  
    copyright       : š 2012 Big Machine Records, LLC.
    title           : State of Grace
    album_artist    : Taylor Swift
    album           : Red (Deluxe Version)
    date            : 2012
    track           : 01/22
    genre           : Country
    composer        : Taylor Swift
    disc            : 1/1
    comment         : Taylor Swift
  Duration: 00:04:55.81, start: 0.000000, bitrate: 321 kb/s
  Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 320 kb/s
  Stream #0:1: Video: mjpeg (Baseline), yuvj444p(pc, bt470bg/unknown/unknown), 600x600 [SAR 72:72 DAR 1:1], 90k tbr, 90k tbn, 90k tbc (attached pic)
    Metadata:
      title           : Cover
      comment         : Cover (front)
$ ffmpeg -hide_banner -i '01 - State of Grace.mp3' -c:a libopus -b:a 128000 -c:v libtheora -q:v 10 '01 - State of Grace.ogg'
[mp3 @ 0x55ebe6d3cc40] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from '01 - State of Grace.mp3':
  Metadata:
    lyrics-eng      :  
    copyright       : š 2012 Big Machine Records, LLC.
    title           : State of Grace
    album_artist    : Taylor Swift
    album           : Red (Deluxe Version)
    date            : 2012
    track           : 01/22
    genre           : Country
    composer        : Taylor Swift
    disc            : 1/1
    comment         : Taylor Swift
  Duration: 00:04:55.81, start: 0.000000, bitrate: 321 kb/s
  Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 320 kb/s
  Stream #0:1: Video: mjpeg (Baseline), yuvj444p(pc, bt470bg/unknown/unknown), 600x600 [SAR 72:72 DAR 1:1], 90k tbr, 90k tbn, 90k tbc (attached pic)
    Metadata:
      title           : Cover
      comment         : Cover (front)
Stream mapping:
  Stream #0:1 -> #0:0 (mjpeg (native) -> theora (libtheora))
  Stream #0:0 -> #0:1 (mp3 (mp3float) -> opus (libopus))
Press [q] to stop, [?] for help
[swscaler @ 0x55ebe6db69e0] deprecated pixel format used, make sure you did set range correctly
[ogg @ 0x55ebe6d44c80] Frame rate very high for a muxer not efficiently supporting it.
Please consider specifying a lower framerate, a different muxer or -vsync 2
Output #0, ogg, to '01 - State of Grace.ogg':
  Metadata:
    lyrics-eng      :  
    copyright       : š 2012 Big Machine Records, LLC.
    title           : State of Grace
    album_artist    : Taylor Swift
    album           : Red (Deluxe Version)
    date            : 2012
    track           : 01/22
    genre           : Country
    composer        : Taylor Swift
    disc            : 1/1
    comment         : Taylor Swift
    encoder         : Lavf58.76.100
  Stream #0:0: Video: theora, yuv444p(tv, bt470bg/unknown/unknown, progressive), 600x600 [SAR 1:1 DAR 1:1], q=2-31, 200 kb/s, 90k fps, 90k tbn (attached pic)
    Metadata:
      title           : Cover
      DESCRIPTION     : Cover (front)
      encoder         : Lavc58.134.100 libtheora
      lyrics-eng      :  
      copyright       : š 2012 Big Machine Records, LLC.
      ALBUMARTIST     : Taylor Swift
      album           : Red (Deluxe Version)
      date            : 2012
      TRACKNUMBER     : 01/22
      genre           : Country
      composer        : Taylor Swift
      DISCNUMBER      : 1/1
  Stream #0:1: Audio: opus, 48000 Hz, stereo, flt, 128 kb/s
    Metadata:
      encoder         : Lavc58.134.100 libopus
      lyrics-eng      :  
      copyright       : š 2012 Big Machine Records, LLC.
      title           : State of Grace
      ALBUMARTIST     : Taylor Swift
      album           : Red (Deluxe Version)
      date            : 2012
      TRACKNUMBER     : 01/22
      genre           : Country
      composer        : Taylor Swift
      DISCNUMBER      : 1/1
      DESCRIPTION     : Taylor Swift
[mp3float @ 0x55ebe6d96360] Header missing time=00:04:31.63 bitrate=   0.1kbits/s speed=59.8x    64x    
Error while decoding stream #0:0: Invalid data found when processing input
frame=    1 fps=0.2 q=-0.0 Lsize=    4929kB time=00:04:55.79 bitrate= 136.5kbits/s speed=59.8x    
video:58kB audio:4830kB subtitle:0kB other streams:0kB global headers:3kB muxing overhead: 0.845459%
$ mpv '01 - State of Grace.ogg'
 (+) Video --vid=1 'Cover' (theora 600x600)
 (+) Audio --aid=1 'State of Grace' (opus 2ch 48000Hz)
AO: [alsa] 48000Hz stereo 2ch float
VO: [gpu] 600x600 yuv444p
(Paused) AV: -00:00:00 / 00:04:55 (0%)

Exiting... (Quit)
$ 

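Note that the Theora video codec option -q:v 10 is used to get the highest possible video quality. Without this option the album art is encoded at very low quality by default, and the size difference at the highest quality is negligible because only a single frame is encoded.

This requires FFmpeg to be built with libtheora (and with libopus for Opus-encoded audio). Below is the output of ffmpeg -codecs, with irrelevant codecs removed and the formatting improved: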

$ ffmpeg -codecs
ffmpeg version 4.4.1 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 11.1.0
  configuration: --prefix=/usr --libdir=/usr/lib64 --shlibdir=/usr/lib64
  --docdir=/usr/share/doc/ffmpeg-4.4.1-r1/html --mandir=/usr/share/man
  --enable-shared --cc=x86_64-pc-linux-gnu-gcc
  --cxx=x86_64-pc-linux-gnu-g++ --ar=x86_64-pc-linux-gnu-ar
  --nm=x86_64-pc-linux-gnu-nm --ranlib=x86_64-pc-linux-gnu-ranlib
  --pkg-config=x86_64-pc-linux-gnu-pkg-config --optflags='-O2 -pipe
  -march=native -ggdb3' --extra-libs= --enable-static --enable-avfilter
  --enable-avresample --disable-stripping --disable-optimizations
  --disable-libcelt --enable-nonfree --disable-indev=v4l2
  --disable-outdev=v4l2 --disable-indev=oss --disable-indev=jack
  --disable-indev=sndio --disable-outdev=oss --disable-outdev=sndio
  --enable-bzlib --enable-runtime-cpudetect --disable-debug
  --disable-gcrypt --enable-gnutls --disable-gmp --enable-gpl
  --disable-hardcoded-tables --enable-iconv --disable-libxml2 --enable-lzma
  --enable-network --disable-opencl --enable-openssl --enable-postproc
  --disable-libsmbclient --disable-ffplay --disable-sdl2 --disable-vaapi
  --disable-vdpau --disable-vulkan --enable-xlib --enable-libxcb
  --enable-libxcb-shm --enable-libxcb-xfixes --enable-zlib
  --disable-libcdio --disable-libiec61883 --disable-libdc1394
  --disable-libcaca --enable-openal --enable-opengl --disable-libv4l2
  --disable-libpulse --disable-libdrm --disable-libjack
  --disable-libopencore-amrwb --disable-libopencore-amrnb
  --disable-libcodec2 --enable-libdav1d --disable-libfdk-aac
  --disable-libopenjpeg --disable-libbluray --disable-libgme
  --disable-libgsm --disable-libaribb24 --disable-mmal --disable-libmodplug
  --enable-libopus --disable-libilbc --disable-librtmp --disable-libssh
  --disable-libspeex --disable-libsrt --disable-librsvg --disable-ffnvcodec
  --disable-libvorbis --disable-libvpx --disable-libzvbi --disable-appkit
  --disable-libbs2b --disable-chromaprint --disable-cuda-llvm
  --disable-libflite --disable-frei0r --disable-libfribidi
  --enable-fontconfig --disable-ladspa --disable-libass
  --disable-libtesseract --disable-lv2 --disable-libfreetype
  --disable-libvidstab --disable-librubberband --disable-libzmq
  --disable-libzimg --disable-libsoxr --enable-pthreads
  --disable-libvo-amrwbenc --disable-libmp3lame --disable-libkvazaar
  --enable-libaom --disable-libopenh264 --disable-librav1e
  --disable-libsnappy --enable-libtheora --disable-libtwolame
  --disable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid
  --disable-gnutls --disable-armv5te --disable-armv6 --disable-armv6t2
  --disable-neon --disable-vfp --disable-vfpv3 --disable-armv8
  --disable-mipsdsp --disable-mipsdspr2 --disable-mipsfpu --disable-altivec
  --disable-vsx --disable-power8 --disable-amd3dnow --disable-amd3dnowext
  --disable-aesni --disable-avx --disable-avx2 --disable-fma3
  --disable-fma4 --disable-sse3 --disable-ssse3 --disable-sse4
  --disable-sse42 --disable-xop --cpu=host --disable-doc
  --disable-htmlpages --enable-manpages
  libavutil      56. 70.100 / 56. 70.100
  libavcodec     58.134.100 / 58.134.100
  libavformat    58. 76.100 / 58. 76.100
  libavdevice    58. 13.100 / 58. 13.100
  libavfilter     7.110.100 /  7.110.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  9.100 /  5.  9.100
  libswresample   3.  9.100 /  3.  9.100
  libpostproc    55.  9.100 / 55.  9.100
Codecs:
 D..... = Decoding supported
 .E.... = Encoding supported
 ..V... = Video codec
 ..A... = Audio codec
 ..S... = Subtitle codec
 ...I.. = Intra frame-only codec
 ....L. = Lossy compression
 .....S = Lossless compression
 -------
 [...]
 DEV.L. theora               Theora (encoders: libtheora )
 [...]
 DEAIL. opus                 Opus (Opus Interactive Audio Codec)
                             (decoders: opus libopus ) (encoders: opus libopus )
 [...]
$ 

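FFmpeg can also add album art (or any video track) from a separate file instead of mapping the original album art directly to the output. Here is an example that extracts the original MJPEG album art to a separate file, then passes it back in and uses the -map option to take only the audio track from the MP3 and the video track from the MJPEG (I removed most of the command output because it is essentially the same):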

$ ffmpeg -i '01 - State of Grace.mp3' -map 0:v -c:v copy '01 - State of Grace.jpg'
[...]
$ ffmpeg -i '01 - State of Grace.mp3' -i '01 - State of Grace.jpg' -map 0:a -map 1:v '01 - State of Grace.ogg'
[...]
Stream mapping:
Stream #0:0 -> #0:0 (mp3 (mp3float) -> flac (native))
Stream #1:0 -> #0:1 (mjpeg (native) -> theora (libtheora))
[...]

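I also omitted the audio and video codecs and their options (which I do not recommend), so FFmpeg used FLAC as the default audio codec and Theora as the default video codec for the Ogg container.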
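
To avoid relying on those defaults, the codecs can be spelled out. The Python sketch below only assembles an argument list from flags that already appear in this thread (-map 0:a, -map 1:v, -c:a copy, -c:v libtheora, -q:v 10, -r:v 1). The function name is hypothetical, and this exact combination, in particular stream-copying the Opus audio from the question's source.ogg, has not been tested here.

```python
import shlex


def cover_as_theora_cmd(audio_in: str, image_in: str, out: str,
                        quality: int = 10, fps: int = 1) -> list[str]:
    """Build an ffmpeg command that copies the audio and adds the image
    as a Theora video stream (hypothetical helper)."""
    return [
        "ffmpeg", "-i", audio_in, "-i", image_in,
        "-map", "0:a", "-map", "1:v",      # audio from input 0, image from input 1
        "-c:a", "copy",                    # do not re-encode the audio
        "-c:v", "libtheora", "-q:v", str(quality),
        "-r:v", str(fps),                  # avoid the 90000 fps warning
        out,
    ]


cmd = cover_as_theora_cmd("source.ogg", "source.jpg", "dest.ogg")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be handed to subprocess.run directly.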

Hope this helps!

【Comments】:

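  • Also worth noting: no FPS is shown in the ffprobe output, which gives Stream #0:0: Video: theora, yuv444p, 600x600 [SAR 1:1 DAR 1:1], 90k tbr, 90k tbn, 90k tbc. Maybe this is an internal inconsistency between libtheora and ffmpeg, where libtheora encodes correctly but ffmpeg treats it as a video stream and uses some placeholder FPS value.
  • I checked with the mediainfo tool (which does not depend on ffmpeg). It shows no FPS for the mp3, but it shows Frame rate : 90 000.000 FPS for the output file.
  • If you play the ogg output file with mpv and press the i key (for "live" info), you will notice that Redraw Frame Timings keeps changing in the video stream section. If you play the original mp3 input file, Redraw Frame Timings stays at 00000.
  • It seems that MP3 files store album art as a binary string in a tag, which media players recognize through special handling, so it does not need to be played. The Ogg container does not have this feature, so FFmpeg just adds an ordinary video stream. A Theora video stream has a frame rate. I believe 90000 is chosen here, and 90000 is commonly used as a timebase. However, only one frame is encoded, and the FPS does not affect performance.
  • You can use FFmpeg to decode every frame into a separate image with ffmpeg -i input.ogg frame%d.png, which saves only a single image named frame1.png. There may be a better way, but this works for manual verification. Also, mpv does not recognize Ogg files with a video stream as audio files, so it does not print their tags, and --audio-display=no has no effect. Otherwise it seems to work as expected. Please let me know if there is a problem with this encoding.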