Converting .wav audio files to .h5 (HDF) files using SciPy and PyTables

【Question title】: Converting .wav audio files to .h5 (hdf) files using SciPy and PyTables
【Posted】: 2020-11-02 03:35:58
【Question】:

I need to convert .wav audio files to .h5 or .npz format, because those are the formats supported for training a speech translation system with FBK-Fairseq-ST (https://github.com/mattiadg/FBK-Fairseq-ST). The following script is meant to be run from the terminal as python script.py /path/file.wav and to write a new HDF file, storing the contents of the .wav file, in the same folder.

from scipy.io import wavfile
import tables
import numpy
import sys

#read data from wav
#fs, data = wavfile.read('/home/vittoria/Documents/corpus-test/01.wav')
fs, data = wavfile.read(sys.argv[1])

# output
folder=sys.argv[1][:-6]
name= sys.argv[1][-6:-3]+"h5"

# save to Acoular h5 format
acoularh5 = tables.open_file(folder+name, mode = "w", title = name)
acoularh5.create_earray('/','time_data', atom=None, title='', filters=None, \
                         expectedrows=100000, chunkshape=[256,64], \
                         byteorder=None, createparents=False, obj=data)
acoularh5.set_node_attr('/time_data','sample_freq', fs)
acoularh5.close()

However, it raises a ValueError: the shape ((0,)) and chunkshape ((256, 64)) ranks must be equal.

Terminal input: python 2hf.py 01_83.wav (that is, python script.py followed by a relative file path)

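The traceback follows. Note that "hdf" in "environments/hdf/lib/python3.6/" is the root folder of the virtual environment, and "/tables/" is the folder of the tables package, version 3.6.1 (https://pypi.org/project/tables/), installed into that virtual environment with pip.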

Traceback (most recent call last):
  File "2hf.py", line 18, in <module>
    byteorder=None, createparents=False, obj=data)
  File "/home/giuseppe/environments/hdf/lib/python3.6/site-packages/tables/file.py", line 1384, in create_earray
    track_times=track_times)
  File "/home/giuseppe/environments/hdf/lib/python3.6/site-packages/tables/earray.py", line 160, in __init__
    track_times)
  File "/home/giuseppe/environments/hdf/lib/python3.6/site-packages/tables/carray.py", line 212, in __init__
    (shape, chunkshape))
ValueError: the shape ((0,)) and chunkshape ((256, 64)) ranks must be equal.
Closing remaining open files:01_83.h5...done

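【Comments on the question】:

Since I am also trying to teach myself Acoular (github.com/acoular/acoular), the script above tries to reproduce the conversion to Acoular's h5 format shown at github.com/acoular/acoular/issues/25
What is data.shape? Are there more error messages that you did not include? If so, please add the complete error message (i.e. the full traceback) to the question.
Thanks, I have edited the question to include the complete error message.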
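
The question about data.shape matters because scipy.io.wavfile.read returns a 1-D array for a single-channel file and a 2-D array of shape (samples, channels) otherwise. The following is a minimal sketch for checking this; the helper name wav_shape and the synthetic file names are hypothetical, and the silent test signals stand in for real recordings.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile


def wav_shape(path):
    """Return (sample_rate, shape) of the array that wavfile.read yields."""
    fs, data = wavfile.read(path)
    return fs, data.shape


# Synthetic files stand in for real recordings (hypothetical names).
tmp_dir = tempfile.mkdtemp()
mono_path = os.path.join(tmp_dir, "mono.wav")
stereo_path = os.path.join(tmp_dir, "stereo.wav")
wavfile.write(mono_path, 16000, np.zeros(800, dtype=np.int16))
wavfile.write(stereo_path, 16000, np.zeros((800, 2), dtype=np.int16))

print(wav_shape(mono_path))    # 1-D array for a single channel
print(wav_shape(stereo_path))  # 2-D array: (samples, channels)
```

A 1-D result here would be consistent with the rank-1 shape ((0,)) reported in the traceback.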

Tags: python-3.x scipy speech-recognition wav hdf

【Solution 1】:

I ran into the same error and solved it by changing the script as follows:

from scipy.io import wavfile
import tables
import numpy
import sys

#read data from wav
#fs, data = wavfile.read('/home/vittoria/Documents/corpus-test/01.wav')
fs, data = wavfile.read(sys.argv[1])

# output
folder=sys.argv[1][:-6]
name= sys.argv[1][-6:-3]+"h5"

# save to Acoular h5 format
acoularh5 = tables.open_file(folder+name, mode = "w", title = name)
acoularh5.create_earray('/','time_data', atom=None, title='', filters=None, \
                         expectedrows=100000, \
                         byteorder=None, createparents=False, obj=data)
acoularh5.set_node_attr('/time_data','sample_freq', fs)
acoularh5.close()

I basically just removed this part: chunkshape=[256,64] :-)

Hope this helps.
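
A likely reason this works: the traceback reports shape ((0,)), which has rank 1, as one would expect when wavfile.read returns a 1-D array for a mono file, while chunkshape=[256,64] has rank 2. Without an explicit chunkshape, PyTables chooses one that matches the data. As an alternative sketch (not the answerer's method), mono data can first be reshaped to (samples, 1) so that the stored array is always 2-D. The helper name write_time_data is hypothetical, and the (samples, channels) layout is an assumption about what Acoular expects for time_data.

```python
import os

import numpy as np
import tables


def write_time_data(h5_path, data, sample_freq):
    """Store audio samples as /time_data with a sample_freq attribute."""
    data = np.asarray(data)
    if data.ndim == 1:
        # Mono input: (samples,) -> (samples, 1), so the array is 2-D.
        data = data.reshape(-1, 1)
    with tables.open_file(h5_path, mode="w",
                          title=os.path.basename(h5_path)) as h5:
        # chunkshape is omitted on purpose: PyTables picks one whose
        # rank matches the data, which avoids the ValueError above.
        h5.create_earray("/", "time_data", obj=data)
        h5.set_node_attr("/time_data", "sample_freq", sample_freq)
```

It could be called as write_time_data("01_83.h5", data, fs) with the fs and data returned by wavfile.read.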

【Comments】: