Python Wave Library String to Bytes: Answers

【Question title】: Python Wave Library String to Bytes
【Posted】: 2013-05-20 18:26:39
【Question】:

So basically I'm trying to read the data from a wave file so that I can take the byte information and build an array of time -> amplitude points.

import wave

class WaveFile:

    # `filename` is the name of the wav file to open
    def __init__(self, fileName):
        self.wf = wave.open(fileName, 'r')
        self.soundBytes = self.wf.readframes(-1)
        self.timeAmplitudeArray = self.__calcTimeAmplitudeArray()


    def __calcTimeAmplitudeArray(self):
        self.internalTimeAmpList = [] # zero out the internal representation

        byteList = self.soundBytes
        if((byteList[i+1] & 0x080) == 0):
            amp = (byteList[i] & 0x0FF) + byteList[i+1] << 8
            #more code continues.....

Error:

if((int(byteList[i+1]) & 0x080) == 0):
TypeError: unsupported operand type(s) for &: 'str' and 'int'

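I have tried converting to an integer type with int(), but to no avail. I come from a Java background, where this would be done with the byte type, but that doesn't seem to be a language feature of Python. Any direction would be appreciated.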

【Comments】:

Tags: python python-2.7 wav

【Answer 1】:

Your problem comes from the fact that the wave library just gives you the raw binary data (in the form of a string).
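To make that concrete, here is a minimal sketch (not part of the original answer) of two ways to turn pairs of raw bytes into signed 16-bit samples. It assumes little-endian 16-bit data; the sample bytes and the helper name `sample_from_bytes` are made up for illustration.

```python
import struct

def sample_from_bytes(raw, i):
    """Combine raw[i] (low byte) and raw[i+1] (high byte) into a signed 16-bit int."""
    b = bytearray(raw)              # bytearray items are ints on Python 2 and 3
    value = b[i] | (b[i + 1] << 8)  # parentheses matter: + and | bind differently than <<
    if value & 0x8000:              # sign bit set: convert from two's complement
        value -= 0x10000
    return value

raw = b"\x01\x00\xff\xff\x00\x80"   # hypothetical data encoding 1, -1, -32768

# Manual bit manipulation, two bytes at a time.
manual = [sample_from_bytes(raw, i) for i in range(0, len(raw), 2)]

# The same decoding done by the struct module: three little-endian shorts.
unpacked = list(struct.unpack("<3h", raw))

print(manual)
print(unpacked)
```

Wrapping the string in `bytearray` is what avoids the `TypeError` from the question, since indexing it yields integers rather than one-character strings.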

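You may need to check the form of the data with self.wf.getparams(). This returns (nchannels, sampwidth, framerate, nframes, comptype, compname). If you do have 1 channel, a sample width of 2, and no compression (a fairly common kind of wav), you can use the following (with numpy imported as np) to get the data: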

byteList = np.fromstring(self.soundBytes,'<h')

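This returns a numpy array containing the data. You don't need a loop. If you have a different sample width, you will need to change the second argument accordingly. I tested this with a simple .wav file and plot(byteList); show() (pylab mode in iPython).

See Reading *.wav files in Python for other ways to do this.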
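The getparams() check described above could be sketched roughly as follows. The function name `is_plain_mono_16bit` and the throwaway file are assumptions made for illustration; only the `wave` calls themselves come from the standard library.

```python
import os
import tempfile
import wave

def is_plain_mono_16bit(path):
    """Return True if the wav file is 1 channel, 2 bytes per sample, uncompressed."""
    wf = wave.open(path, "rb")
    try:
        params = wf.getparams()
        # Tuple order: nchannels, sampwidth, framerate, nframes, comptype, compname
        nchannels, sampwidth, comptype = params[0], params[1], params[4]
    finally:
        wf.close()
    return nchannels == 1 and sampwidth == 2 and comptype == "NONE"

# Build a tiny throwaway file so the sketch runs on its own.
path = os.path.join(tempfile.mkdtemp(), "example.wav")
out = wave.open(path, "wb")
out.setnchannels(1)
out.setsampwidth(2)
out.setframerate(8000)
out.writeframes(b"\x00\x00" * 4)   # four silent 16-bit frames
out.close()

print(is_plain_mono_16bit(path))
```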

Numpyless version

If you need to avoid numpy, you can do this:

import array
byteList = array.array('h')
byteList.fromstring(self.soundBytes)  # on Python 3, use frombytes() instead

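This works the same as before (tested with plot(byteList); show()). 'h' means signed short. len and so on work. This does read the whole wav file in at once, but .wav files are usually small, though not always.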
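Putting the array approach together with the frame rate gives the time -> amplitude list the question asks for. The following is only a sketch under stated assumptions: a mono, 16-bit, uncompressed file, a platform where typecode 'h' is 2 bytes, and a hypothetical function name `time_amplitude_pairs`. The demo file is generated on the spot.

```python
import array
import os
import struct
import sys
import tempfile
import wave

def time_amplitude_pairs(path):
    """Return [(seconds, amplitude), ...] for a mono, 16-bit, uncompressed wav."""
    wf = wave.open(path, "rb")
    try:
        framerate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    finally:
        wf.close()
    samples = array.array("h")          # 'h' = signed short
    if hasattr(samples, "frombytes"):   # Python 3 name
        samples.frombytes(raw)
    else:                               # Python 2 only has fromstring
        samples.fromstring(raw)
    if sys.byteorder == "big":
        samples.byteswap()              # wav sample data is little-endian
    return [(i / float(framerate), amp) for i, amp in enumerate(samples)]

# Demo file: three samples at a (deliberately tiny) frame rate of 4 Hz.
path = os.path.join(tempfile.mkdtemp(), "demo.wav")
out = wave.open(path, "wb")
out.setnchannels(1)
out.setsampwidth(2)
out.setframerate(4)
out.writeframes(struct.pack("<3h", 0, 1000, -1000))
out.close()

pairs = time_amplitude_pairs(path)
print(pairs)
```

For a stereo file the samples are interleaved, so this sketch would need to split the channels before pairing them with times.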

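【Comments】:

Unfortunately I can't use numpy, because our developers work on 64-bit Windows machines.
Either way, use self.wf.getnframes() to get the length of the array.
By the way, I usually use 32-bit Python on 64-bit Windows, so I use the official builds of numpy and so on. If you run 64-bit Python, this very useful site should help: [link]. That is, if you're interested in going that route.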

【Answer 2】:

I usually use the array module and its fromstring method for this.

My standard pattern for operating on chunks of data looks like this:

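import array

def bytesfromfile(f):
    while True:
        raw = array.array('B')
        # Python 2 API; on Python 3 use raw.frombytes(...) instead
        raw.fromstring(f.read(8192))
        if not raw:
            break
        yield raw

with open(f_in, 'rb') as fd_in:  # f_in: path of the input file
    for chunk in bytesfromfile(fd_in):
        pass  # do stuff with each chunk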

'B' above means unsigned char, i.e. 1 byte.
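As a quick illustration of what the typecode changes, the sketch below reads the same four bytes as unsigned ('B') and as signed ('b') chars. The byte values are made up for the example.

```python
import array

raw = b"\x00\x7f\x80\xff"          # hypothetical four bytes

unsigned = array.array("B", raw)   # 'B': unsigned char, values 0..255
signed = array.array("b", raw)     # 'b': signed char, values -128..127

print(unsigned.tolist())
print(signed.tolist())
print(unsigned.itemsize)           # bytes per item
```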

If the file isn't very large, you can just slurp it in one go:

In [8]: f = open('foreman_cif_frame_0.yuv', 'rb')

In [9]: raw = array.array('B')

In [10]: raw.fromstring(f.read())

In [11]: raw[0:10]
Out[11]: array('B', [10, 40, 201, 255, 247, 254, 254, 254, 254, 254])

In [12]: len(raw)
Out[12]: 152064

Guido can't be wrong...

If you prefer numpy, I tend to use:

    fd_i = open('file.bin', 'rb')
    fd_o = open('out.bin', 'wb')

    while True:
        # Read as uint8
        chunk = np.fromfile(fd_i, dtype=np.uint8, count=8192)
        # stop on an empty read (end of file); chunk.any() would also stop on an all-zero chunk
        if chunk.size == 0:
            break
        # use int for calculations since uint wraps
        chunk = chunk.astype(int)
        # do some calculations
        data = ...

        # convert back to uint8 prior to writing.
        data = data.astype(np.uint8)
        data.tofile(fd_o)

    fd_i.close()
    fd_o.close()

Or, to read the whole file:

In [18]: import numpy as np

In [19]: f = open('foreman_cif_frame_0.yuv', 'rb')

In [20]: data = np.fromfile(f, dtype=np.uint8)

In [21]: data[0:10]
Out[21]: array([ 10,  40, 201, 255, 247, 254, 254, 254, 254, 254], dtype=uint8)

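【Comments】:

Unfortunately, I can't use numpy because I'm developing on 64-bit Windows. The final environment will be a Linux box, but I'm developing on a Windows machine. :( Could you explain the 8192 magic number in there?
Then go with the array approach. I added an example above... 8192 is just the chunk size; in this case I read 8192 bytes per iteration...
What's a good way to get the number of bytes from the returned array? len() doesn't work on a "generator". Sorry, I'm very new to Python, so I don't know the standard library or even basic language syntax well.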
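On the last point, one possible approach (a sketch, not something stated in the thread) is to call len() on each chunk the generator yields and add the results up, since each chunk is an ordinary array. The example assumes Python 3 and uses an in-memory stream of 20000 zero bytes in place of a real file; `bytes_from_file` mirrors the generator from the answer.

```python
import array
import io

def bytes_from_file(f, chunk_size=8192):
    """Yield array('B') chunks of at most chunk_size bytes until end of file."""
    while True:
        data = f.read(chunk_size)
        if not data:
            break
        yield array.array("B", data)

stream = io.BytesIO(b"\x00" * 20000)   # stand-in for an open binary file

total = 0
sizes = []
for chunk in bytes_from_file(stream):
    sizes.append(len(chunk))           # len() works on each yielded array
    total += len(chunk)

print(sizes)
print(total)
```

The generator itself has no length because it produces chunks lazily; only the last chunk is normally shorter than the chunk size.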