What is the easiest way to read wav-files using Python [summary]? Answers

【Question Title】: What is the easiest way to read wav-files using Python [summary]?
【Posted】: 2011-01-05 00:25:56
【Question】:

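I want to use Python to access a wav file and write its content in a form that allows me to analyze it (let's say arrays).

I heard that "audiolab" is a suitable tool for that (it transforms numpy arrays into wav and vice versa).
I have installed "audiolab", but I had a problem with the version of numpy (I could not "from numpy.testing import Tester"). I had version 1.1.1 of numpy.
I then installed a newer version of numpy (1.4.0), but that gave me a new set of errors: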

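Traceback (most recent call last):
  File "test.py", line 7, in <module>
    import scikits.audiolab
  File "/usr/lib/python2.5/site-packages/scikits/audiolab/__init__.py", line 25, in <module>
    from pysndfile import formatinfo, sndfile
  File "/usr/lib/python2.5/site-packages/scikits/audiolab/pysndfile/__init__.py", line 1, in <module>
    from _sndfile import Sndfile, Format, available_file_formats, available_encodings
  File "numpy.pxd", line 30, in scikits.audiolab.pysndfile._sndfile (scikits/audiolab/pysndfile/_sndfile.c:9632)
ValueError: numpy.dtype does not appear to be the correct type object

I gave up on audiolab and thought I could use the "wave" package to read in a wav file. I asked a question about that, but people recommended using scipy instead. Okay, I decided to focus on scipy (I have version 0.6.0).
But when I tried to do the following: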

from scipy.io import wavfile
x = wavfile.read('/usr/share/sounds/purple/receive.wav')

I got the following:

Traceback (most recent call last):
  File "test3.py", line 4, in <module>
    from scipy.io import wavfile
  File "/usr/lib/python2.5/site-packages/scipy/io/__init__.py", line 23, in <module>
    from numpy.testing import NumpyTest
ImportError: cannot import name NumpyTest

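So, I gave up on scipy. Can I use just the wave package? I do not need much. I just need the content of the wav file in a human-readable format, and after that I will figure out what to do with it.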

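【Comments】:

How exactly did you install audiolab?
How is this different from your previous question on exactly the same topic?
audiolab is great. Try to get it working. Make sure you have the packages libsndfile and setuptools installed. Did you follow sec. 2.4 in the manual?
Did you get a newer version of Scipy when you upgraded Numpy? I use wave to read wav files, as James Roth suggests below, but if you want to use Scipy you should check that your Scipy version is up to date. Based on the error message you are getting, I would guess it is not.
Have you seen this? stackoverflow.com/questions/2060628/… The timestamps are (more) recent, March 2011.

Tags: python audio wav scipy wave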

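【Solution 1】:

pydub provides an even easier solution, with no dependencies that need to be installed (for wav files). I am currently using this method in production without any issues.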

from pydub import AudioSegment
awesome_song = AudioSegment.from_wav('awesome_song.wav')
print('Duration in seconds is {}'.format(awesome_song.duration_seconds))

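【Comments】:

【Solution 2】:

audiolab does not seem to be maintained anymore; you should try PySoundFile.

Installation is simple:

pip install PySoundFile --user

Reading a sound file is just as simple: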

import soundfile as sf
x, fs = sf.read('/usr/share/sounds/purple/receive.wav')

Have a look at this overview about different Python libraries for handling sound files.

【Comments】:

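【Solution 3】:

You can also use the wave module together with numpy.frombuffer() to convert the data to an array (older code used numpy.fromstring(), which is deprecated for binary data).

import wave
import numpy

# Assumes 16-bit PCM samples
fp = wave.open('test.wav', 'rb')
nchan = fp.getnchannels()
N = fp.getnframes()
dstr = fp.readframes(N)  # readframes() takes a frame count, not a sample count
fp.close()
data = numpy.frombuffer(dstr, dtype=numpy.int16)
data = numpy.reshape(data, (-1, nchan))  # one column per channel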
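
The same idea can be sketched with the standard library alone, checking the sample width instead of assuming it. This is only an illustration: the helper name `read_wav` and the temporary demo file are hypothetical, and anything other than 16-bit PCM is rejected rather than handled.

```python
import array
import os
import sys
import tempfile
import wave


def read_wav(path):
    """Return (framerate, channels); channels is a list of per-channel sample lists.

    Sketch only: supports 16-bit PCM and raises ValueError otherwise.
    """
    with wave.open(path, "rb") as wav:
        nchan = wav.getnchannels()
        if wav.getsampwidth() != 2:
            raise ValueError("only 16-bit PCM is handled in this sketch")
        framerate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())  # argument is a frame count

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    # De-interleave: sample i of channel c sits at index i * nchan + c.
    return framerate, [samples[c::nchan].tolist() for c in range(nchan)]


# Write a tiny stereo file so the example is self-contained.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as out:
    out.setnchannels(2)
    out.setsampwidth(2)
    out.setframerate(8000)
    frames = array.array("h", [1, -1, 2, -2, 3, -3])  # L, R, L, R, ...
    if sys.byteorder == "big":
        frames.byteswap()
    out.writeframes(frames.tobytes())

rate, (left, right) = read_wav(demo_path)
print(rate, left, right)
```

Returning plain lists keeps the sketch dependency-free; the lists can be passed to numpy.array() afterwards if array operations are needed.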

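【Comments】:

【Solution 4】:

This was good enough for me:

import numpy as np
x = np.fromfile('song.wav', dtype=np.int16)[24:]

It skips the first 24 values because they are not audio but the header. (A canonical PCM WAV header is 44 bytes, i.e. 22 int16 values, so this also drops the first two samples.)

Also, if the file is stereo, your channels will have alternating indexes, so I usually just reduce it to mono with Audacity first.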
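
The interleaving mentioned above can also be undone in code rather than in Audacity. A minimal sketch, assuming the samples are already decoded into a flat sequence of left/right pairs; the function name `stereo_to_mono` is hypothetical, and averaging is only one possible way to downmix.

```python
def stereo_to_mono(interleaved):
    """Average the left/right pairs of an interleaved stereo sample sequence."""
    left = interleaved[0::2]   # even indices: left channel
    right = interleaved[1::2]  # odd indices: right channel
    # Floor division keeps the results as integers within the original range.
    return [(l + r) // 2 for l, r in zip(left, right)]


interleaved = [100, 300, -200, -400, 0, 50]  # made-up samples: L, R, L, R, ...
mono = stereo_to_mono(interleaved)
print(mono)
```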

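【Comments】:

This works if you know the file format (number of channels, sample rate) and you know there is nothing odd in the file (such as multiple data chunks; see ccrma.stanford.edu/courses/422/projects/WaveFormat).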
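
One way to avoid relying on a fixed header size is to walk the RIFF chunks until the data chunk is found. The following is a rough sketch using only the standard library; `find_data_chunk` is a hypothetical helper, it stops at the first data chunk, and it does not parse the fmt chunk at all.

```python
import io
import struct
import wave


def find_data_chunk(stream):
    """Return (offset, size) of the first 'data' chunk in a RIFF/WAVE stream."""
    riff, _, wave_id = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    while True:
        header = stream.read(8)
        if len(header) < 8:
            raise ValueError("no data chunk found")
        chunk_id, size = struct.unpack("<4sI", header)
        if chunk_id == b"data":
            return stream.tell(), size
        # Skip this chunk; chunks are padded to an even number of bytes.
        stream.seek(size + (size & 1), io.SEEK_CUR)


# Build a small mono 16-bit WAV file in memory to try it on.
buf = io.BytesIO()
with wave.open(buf, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(struct.pack("<4h", 10, 20, 30, 40))

data = buf.getvalue()
offset, size = find_data_chunk(io.BytesIO(data))
samples = struct.unpack("<%dh" % (size // 2), data[offset:offset + size])
print(offset, size, samples)
```

For a file written by the wave module the data starts at byte 44, but files produced by other tools may carry extra chunks before it, which is the case the loop is meant to cover.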

【Solution 5】:

I wrote a simple wrapper over the wave module in the standard library. It is called pydub, and it has a method for reading samples from the audio data as integers.

>>> from pydub import AudioSegment
>>> song = AudioSegment.from_wav("your_song.wav")
>>> song
<pydub.audio_segment.AudioSegment at 0x1068868d0>

>>> # This song is stereo
>>> song.channels
2

>>> # get the 5000th "frame" in the song
>>> frame = song.get_frame(5000)

>>> sample_left, sample_right = frame[:2], frame[2:]
>>> def sample_to_int(sample):
...     # Python 3 port of sample.encode("hex"); reads the bytes as a big-endian unsigned integer
...     return int(sample.hex(), 16)
...
>>> sample_to_int(sample_left)
8448

>>> sample_to_int(sample_right)
9984

Hope this helps
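
For reference, 16-bit WAV PCM samples are stored as little-endian signed integers, so the actual sample values differ from a big-endian reading of the same bytes. A minimal sketch of decoding one raw frame with the standard library only; `decode_frame` is a hypothetical helper, not part of pydub, the frame bytes are made up for illustration, and 8-bit audio (which is unsigned) is not covered.

```python
import struct


def decode_frame(frame, channels, sample_width=2):
    """Split one raw PCM frame into signed integer samples, one per channel."""
    if len(frame) != channels * sample_width:
        raise ValueError("frame length does not match channels * sample_width")
    return [
        int.from_bytes(frame[i:i + sample_width], byteorder="little", signed=True)
        for i in range(0, len(frame), sample_width)
    ]


frame = struct.pack("<2h", 33, -39)  # made-up 16-bit stereo frame
left, right = decode_frame(frame, channels=2)
print(left, right)
```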

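【Comments】:

【Solution 6】:

After trying so many things that did not work, I used the decoding library from Use (Python) Gstreamer to decode audio (to PCM data) and built a function to parse the raw PCM data into a scipy array.

It is nice and can open any audio file that gstreamer can open: http://gist.github.com/592776 (see the test and the end of the file for usage info)

【Comments】:

【Solution 7】:

Have you tried the wave module? It has fewer dependencies: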

http://docs.python.org/library/wave.html

import struct
import wave
from numpy import array

def everyOther(v, offset=0):
    return [v[i] for i in range(offset, len(v), 2)]

def wavLoad(fname):
    # Assumes 16-bit PCM samples
    wav = wave.open(fname, "r")
    (nchannels, sampwidth, framerate, nframes, comptype, compname) = wav.getparams()
    frames = wav.readframes(nframes)  # readframes() takes a frame count
    wav.close()
    # Parentheses matter: "%dh" % nframes * nchannels would repeat the format string
    out = struct.unpack_from("<%dh" % (nframes * nchannels), frames)

    # Convert 2 channels to numpy arrays
    if nchannels == 2:
        left = array(list(everyOther(out, 0)))
        right = array(list(everyOther(out, 1)))
    else:
        left = array(out)
        right = left
    return left, right

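【Comments】:

Use out[0::2] and out[1::2] instead of everyOther.
Combine with an external conversion tool to open other formats. assembla.com/code/freesound/git/nodes/freesound/utils/…

【Solution 8】:

audiolab is the best way, but it does not work in every environment and the developer is not working on it. I am still using Python 2.5, so I can use it.

Have you installed libsndfile?

【Comments】: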