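【Posted】:2020-01-07 14:45:03
【Question】:
Sorry if this is a really obvious question. I'm using matplotlib to generate some spectrograms to use as training data in a machine learning model. The spectrograms are of short clips of music, and I want to simulate randomly speeding up or slowing down the song to create variation in the data. I've shown the code I use to generate each spectrogram below. I've temporarily modified it to produce 2 images starting from the same point in the song, one with the variation and one without, so I can compare them and check that it works as intended.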
from pydub import AudioSegment
import matplotlib.pyplot as plt
import numpy as np

BPM_VARIATION_AMOUNT = 0.2
FRAME_RATE = 22050
CHUNK_SIZE = 2
BUFFER = FRAME_RATE * 5


def generate_random_specgram(track):
    # Read audio data from file
    audio = AudioSegment.from_file(track.location)
    audio = audio.set_channels(1).set_frame_rate(FRAME_RATE)
    samples = audio.get_array_of_samples()
    start = np.random.randint(BUFFER, len(samples) - BUFFER)
    chunk = samples[start:start + int(CHUNK_SIZE * FRAME_RATE)]

    # Plot specgram and save to file
    filename = ('specgrams/%s-%s-%s.png' % (track.trackid, start, track.bpm))
    plt.figure(figsize=(2.56, 0.64), frameon=False).add_axes([0, 0, 1, 1])
    plt.axis('off')
    plt.specgram(chunk, Fs=FRAME_RATE)
    plt.savefig(filename)
    plt.close()

    # Perform random variations to the BPM
    frame_rate = FRAME_RATE
    bpm = track.bpm
    variation = 1 - BPM_VARIATION_AMOUNT + (
        np.random.random() * BPM_VARIATION_AMOUNT * 2)
    bpm *= variation
    bpm = round(bpm, 2)
    # I thought this next line should have been /= but that stretched the wrong way?
    frame_rate *= (bpm / track.bpm)

    # Read audio data from file
    chunk = samples[start:start + int(CHUNK_SIZE * frame_rate)]

    # Plot specgram and save to file
    filename = ('specgrams/%s-%s-%s.png' % (track.trackid, start, bpm))
    plt.figure(figsize=(2.56, 0.64), frameon=False).add_axes([0, 0, 1, 1])
    plt.axis('off')
    plt.specgram(chunk, Fs=frame_rate)
    plt.savefig(filename)
    plt.close()
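I thought that changing the Fs parameter passed to the specgram function would stretch the data along the x-axis, but instead it seems to resize the whole plot and introduce white space at the top of the image in strange and unpredictable ways. I'm sure I'm missing something, but I can't see what it is. Below is an image to illustrate what I'm getting.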
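For context, Fs only tells specgram what sampling rate to assume when it labels the time and frequency axes (the frequency axis runs up to Fs / 2); it does not resample the samples themselves. One possible way to get the intended stretch is to resample the chunk to a fixed length and keep Fs constant. The following is a minimal sketch under that assumption, not a tested fix for the code above: `stretch_chunk` is a hypothetical helper, it uses plain linear interpolation with no anti-aliasing filter, and like any frame-rate change it shifts pitch along with tempo.

```python
import numpy as np

FRAME_RATE = 22050
CHUNK_SIZE = 2  # seconds


def stretch_chunk(samples, start, variation,
                  frame_rate=FRAME_RATE, chunk_size=CHUNK_SIZE):
    """Hypothetical helper: return a fixed-length chunk sped up by `variation`.

    Reads `variation` times as many source samples as the nominal chunk
    and linearly resamples them down (or up) to the nominal chunk length.
    """
    out_len = int(chunk_size * frame_rate)
    src_len = int(round(out_len * variation))
    src = np.asarray(samples[start:start + src_len], dtype=float)
    # Fractional positions in the source at which each output sample is taken.
    positions = np.linspace(0, len(src) - 1, num=out_len)
    return np.interp(positions, np.arange(len(src)), src)
```

Because every chunk then has the same number of samples, both images could be plotted with `plt.specgram(chunk, Fs=FRAME_RATE)`, which keeps the axes identical between the varied and unvaried versions.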
【Comments】:
Tags: python numpy matplotlib pydub