How can I read an audio file using streams? (Answers)

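【Question title】: How can I read an audio file using streams?
【Posted】: 2019-05-28 11:59:41
【Question description】:

I am currently looking into streaming an audio file. I want to read x seconds from a given .wav file, run my analysis task on them, and repeat until the file is exhausted.

Here is some code that shows what I am after: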

`import librosa

read_x_seconds = 30
file_length_in_min = 15
# Integer division, because range() does not accept a float
for x in range(file_length_in_min * 60 // read_x_seconds):
    y, fs = librosa.core.load(FILENAME, offset=x * read_x_seconds,
                              duration=read_x_seconds)
    do_analysis(y, fs)`

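【Question comments】:

Is the audio actually a stream? Or do you just have offline files that you want to process in chunks?

Tags: python audio stream

【Solution 1】:

Assuming we are talking about reading part of a large local WAV file: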

import wave
import numpy as np

def read_wav_part_from_local(path: str, start_s: float, duration_s: float):
    with wave.open(path, mode='rb') as wavread:
        fs = wavread.getframerate()
        start = int(start_s * fs)
        duration = int(duration_s * fs)
        wavread.setpos(start)
        wav_bytes = wavread.readframes(duration)

        if wavread.getsampwidth() == 2:
            dtype = 'int16'
        elif wavread.getsampwidth() == 4:
            dtype = 'int32'
        else:
            # NotImplemented is a constant, not an exception class
            raise NotImplementedError('I give up!')

        wav_array = np.frombuffer(wav_bytes, dtype=dtype)
        return wav_array, fs

Usage:

audio_chunk, fs = read_wav_part_from_local('your.wav', offset_in_s, duration_in_s)
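To walk through the whole file as in the question, one option is to keep the file open and read consecutive windows rather than reopening it for every offset. The following is only a minimal sketch using the standard-library wave module; iter_wav_chunks is a hypothetical helper name, and it yields raw bytes that could then be passed to np.frombuffer as in the answer above.

```python
import wave


def iter_wav_chunks(path, chunk_s):
    """Yield (raw_bytes, n_frames) for consecutive windows of chunk_s seconds.

    Hypothetical helper: the last window may be shorter than the others.
    """
    with wave.open(path, 'rb') as wav:
        frames_per_chunk = int(chunk_s * wav.getframerate())
        if frames_per_chunk <= 0:
            raise ValueError('chunk_s is too small for this sample rate')
        # One frame holds one sample for every channel
        frame_size = wav.getsampwidth() * wav.getnchannels()
        while True:
            data = wav.readframes(frames_per_chunk)
            if not data:
                break
            yield data, len(data) // frame_size
```

Usage would look like: for data, n_frames in iter_wav_chunks('your.wav', 30): ... with the analysis step applied to each window.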

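【Comments】:

The problem with your solution is that the resulting values are not between -1 and 1.
@FelixHohnstein You did not specify that you wanted values between -1 and 1, so there is no scaling. You can scale the vector by dividing it by its absolute maximum, i.e. wav_array = wav_array / np.max(np.abs(wav_array)).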
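The scaling discussed in these comments can be done in two different ways, and they are not equivalent: dividing by the format's full scale keeps relative loudness between chunks, while dividing by the chunk's own peak does not. A rough sketch of both, assuming signed integer PCM as produced by the answer above; pcm_to_float and peak_normalize are hypothetical names.

```python
import numpy as np


def pcm_to_float(samples: np.ndarray) -> np.ndarray:
    """Scale signed integer PCM to floats in [-1.0, 1.0) using the format's full scale."""
    full_scale = float(np.iinfo(samples.dtype).max) + 1.0  # 32768.0 for int16
    return samples.astype(np.float64) / full_scale


def peak_normalize(samples: np.ndarray) -> np.ndarray:
    """Scale so the largest absolute sample becomes 1.0; silence is returned as zeros."""
    # Convert first: np.abs on int16 would overflow for the value -32768
    as_float = samples.astype(np.float64)
    peak = np.max(np.abs(as_float)) if as_float.size else 0.0
    return as_float / peak if peak > 0 else as_float
```

Peak normalization applied per chunk changes the gain from one chunk to the next, so for chunked analysis the full-scale variant is probably the safer default.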

【Solution 2】:

with open(stream_file, 'rb') as audio_file:
    content = audio_file.read(BYTES_PER_SECOND)

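【Comments】:

The documentation for io.BytesIO says that io.BytesIO() takes initial bytes, not a file path.
See the edited answer: if you have a local file, you can simply open it in 'rb' mode.
I don't think this is what the OP intends. For one thing, it does not account for the 44- or 46-byte header, depending on the file. There is no offset either.
This does not provide an answer to the question. To critique or request clarification from an author, leave a comment below their post. - From Review
@L.F. - Yes, it does. It may not be a good answer, but that is not what the VLQ queue is for.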
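On the header-size point raised in these comments: rather than assuming 44 or 46 bytes, the start of the audio data can be located by walking the RIFF chunks until the data chunk is found. This is only a sketch under the assumption of a plain little-endian RIFF/WAVE file; find_wav_data_chunk is a hypothetical helper, and the standard-library wave module already does this internally.

```python
import struct


def find_wav_data_chunk(path):
    """Return (data_offset, data_size) of the 'data' chunk in a RIFF/WAVE file."""
    with open(path, 'rb') as f:
        preamble = f.read(12)
        if len(preamble) < 12:
            raise ValueError('file too short to be a WAV file')
        riff, _, wave_id = struct.unpack('<4sI4s', preamble)
        if riff != b'RIFF' or wave_id != b'WAVE':
            raise ValueError('not a RIFF/WAVE file')
        while True:
            header = f.read(8)
            if len(header) < 8:
                raise ValueError('no data chunk found')
            chunk_id, chunk_size = struct.unpack('<4sI', header)
            if chunk_id == b'data':
                # The audio bytes start right after this 8-byte chunk header
                return f.tell(), chunk_size
            # Chunks are word-aligned: odd-sized chunks carry one pad byte
            f.seek(chunk_size + (chunk_size & 1), 1)
```

With the offset known, a raw read in 'rb' mode can seek to data_offset plus the byte position of the wanted start time before reading.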

【Solution 3】:

I have two solutions for reading/streaming a wav file in chunks.

Here is the first. I wrote it myself; please do not repost it.

import numpy as np

def stream_gen(path: str):
    WINDOW_s = 10   # seconds of audio per yielded block
    HEADER = 44     # assumes a canonical 44-byte PCM WAV header

    with open(path, 'rb') as stream:
        data = stream.read(HEADER)
        # Sample rate and bit depth sit at fixed offsets in the canonical header
        samplerate = int.from_bytes(data[24:28], byteorder='little')
        bits_per_sampling = int.from_bytes(data[34:36], byteorder='little')

        if bits_per_sampling == 16:
            dtype = 'int16'
        elif bits_per_sampling == 32:
            dtype = 'int32'
        else:
            raise IOError(f'unsupported bit depth: {bits_per_sampling}')

        # Bytes per block; the channel count is not taken into account here
        CHUNK = WINDOW_s * samplerate * (bits_per_sampling // 8)

        while True:
            data = stream.read(CHUNK)
            if data == b'':
                break
            yield np.frombuffer(data, dtype=dtype)

The second is the obvious choice. It was written by professionals.

import soundfile as sf

def soundfile_gen(path):
    window_s = 10  # seconds of audio per block
    samplerate = sf.info(path).samplerate
    blocksize = samplerate * window_s  # block length in frames
    block_gen = sf.blocks(path, blocksize=blocksize)
    return block_gen

【Comments】: