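You need to inspect the WAV file to work out when sound is present. The simplest way to do this is to look for loud and quiet periods. Because sound works with waves, the values in the WAV file won't change much when it's quiet, and they'll change a lot when it's loud.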
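One way of estimating loudness is the variance. As you can see in the article, this can be defined as E[(X - mu)^2], which can be written as average((X - average(X))^2). Here, X is the value of the signal at a given point (the values stored in the WAV file, called sample in the code). If the values change a lot, the variance will be large.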
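As a rough illustration of that formula, here is a minimal sketch that computes the variance of a short list of sample values. The function name `variance` and the two example lists are made up for this illustration; they are not taken from any library.

```python
def variance(samples):
    """Return average((x - average(x)) ** 2) over a sequence of samples."""
    mean = sum(samples) / len(samples)
    return sum((x - mean) ** 2 for x in samples) / len(samples)


# Hypothetical sample values: same shape, different amplitude.
quiet = [10, -10, 10, -10]
loud = [1000, -1000, 1000, -1000]

print(variance(quiet))  # 100.0
print(variance(loud))   # 1000000.0
```

The louder signal swings further from its average, so its variance is much larger.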
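This lets you calculate the loudness of the whole file. However, you want to track how loud the file is at any given time, which means you need a form of moving average. A simple way of getting this is to use a first-order low-pass filter.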
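To show what a first-order low-pass filter does on its own, here is a small sketch. The name `low_pass` and the input values are just for illustration, and `alpha` plays the same role as in the full code: smaller values react more slowly but give a smoother result.

```python
def low_pass(values, alpha):
    """Smooth a sequence with a first-order low-pass filter (exponential moving average)."""
    state = 0.0
    smoothed = []
    for value in values:
        # Move the running state a fraction alpha of the way towards the new value.
        state = (1 - alpha) * state + alpha * value
        smoothed.append(state)
    return smoothed


# A step from 0 to 100: the filtered output approaches 100 gradually.
smoothed = low_pass([0, 0, 100, 100, 100, 100], alpha=0.5)
print(smoothed)  # [0.0, 0.0, 50.0, 75.0, 87.5, 93.75]
```

Only one number of state is kept per tracked quantity, which is why this is cheaper than averaging over a sliding window of samples.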
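I haven't tested the code below, so it may well not work as-is, but it should get you started. It loads the WAV file, uses low-pass filters to track the mean and variance, and works out when the variance goes above and below a certain threshold. Then, while playing the WAV file, it keeps track of the time since playback started and prints out whether the WAV file is loud or quiet.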
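Here's what you might still need to do:
- Fix any deliberate mistakes I've left in the code
- Add something useful to react to the loud/quiet changes
- Change the threshold and reaction time to get good results with your audio
- Add some hysteresis (a variable threshold) to stop the light flickering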
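For the hysteresis item, one possible approach is to use two thresholds: a higher one to switch to loud and a lower one to switch back to quiet. This is only a sketch under that assumption; the function name `apply_hysteresis`, the threshold values, and the variance readings are all hypothetical.

```python
def apply_hysteresis(variances, on_threshold, off_threshold):
    """Return a loud/quiet flag per value, using separate on and off thresholds."""
    is_loud = False
    states = []
    for value in variances:
        if is_loud and value < off_threshold:
            # Only drop back to quiet once we are well below the on threshold.
            is_loud = False
        elif not is_loud and value > on_threshold:
            is_loud = True
        states.append(is_loud)
    return states


# Hypothetical variance readings that hover around 10000.
readings = [5000, 12000, 9000, 11000, 9500, 4000, 9000]
states = apply_hysteresis(readings, on_threshold=10000, off_threshold=5000)
print(states)  # [False, True, True, True, True, False, False]
```

With a single threshold of 10000, the same readings would flip between loud and quiet on almost every value, which is the flicker the list item refers to.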
I hope this helps!
import struct
import time
import wave

import pygame


def get_loud_times(wav_path, threshold=10000, time_constant=0.1):
    '''Work out which parts of a WAV file are loud.

    - threshold: the variance threshold that is considered loud
    - time_constant: the approximate reaction time in seconds

    Returns a list of (time, is_loud) pairs.'''
    with wave.open(wav_path, 'rb') as wav:
        length = wav.getnframes()
        samplerate = wav.getframerate()
        assert wav.getnchannels() == 1, 'wav must be mono'
        assert wav.getsampwidth() == 2, 'wav must be 16-bit'
        # Read all frames at once; each mono 16-bit frame is 2 bytes.
        frames = wav.readframes(length)

    # Decode the raw bytes as little-endian signed 16-bit integers.
    samples = struct.unpack('<%dh' % (len(frames) // 2), frames)

    # Our result will be a list of (time, is_loud) giving the times
    # when the audio switches from loud to quiet and back.
    is_loud = False
    result = [(0., is_loud)]

    # The following values track the mean and variance of the signal.
    # When the variance is large, the audio is loud.
    mean = 0.0
    variance = 0.0

    # If alpha is small, mean and variance change slower but are less noisy.
    alpha = 1 / (time_constant * float(samplerate))

    for i, sample in enumerate(samples):
        sample_time = float(i) / samplerate

        # mean is the moving average value of sample
        mean = (1 - alpha) * mean + alpha * sample
        # variance is the moving average value of (sample - mean) ** 2
        variance = (1 - alpha) * variance + alpha * (sample - mean) ** 2

        # check if we're loud, and record the time if this changes
        new_is_loud = variance > threshold
        if is_loud != new_is_loud:
            result.append((sample_time, new_is_loud))
        is_loud = new_is_loud

    return result


def play_sentence(wav_path):
    loud_times = get_loud_times(wav_path)

    # The mixer has to be initialised before music can be loaded.
    if not pygame.mixer.get_init():
        pygame.mixer.init()
    pygame.mixer.music.load(wav_path)

    start_time = time.time()
    pygame.mixer.music.play()

    for (t, is_loud) in loud_times:
        # wait until the time described by this entry
        sleep_time = start_time + t - time.time()
        if sleep_time > 0:
            time.sleep(sleep_time)

        # do whatever
        print('loud' if is_loud else 'quiet')
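If you want something to experiment with before using a real recording, a synthetic file can be handy for tuning the threshold and time constant. The sketch below writes a mono 16-bit WAV file with one second of silence followed by one second of a 440 Hz tone. The helper name `write_test_wav`, the file name, the sample rate, and the amplitude are all arbitrary choices for illustration.

```python
import math
import os
import struct
import tempfile
import wave


def write_test_wav(path, samplerate=8000):
    """Write one second of silence followed by one second of a 440 Hz tone."""
    quiet = [0] * samplerate
    loud = [int(10000 * math.sin(2 * math.pi * 440 * i / samplerate))
            for i in range(samplerate)]
    samples = quiet + loud
    with wave.open(path, 'wb') as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(2)           # 16-bit samples
        wav.setframerate(samplerate)
        # Pack the samples as little-endian signed 16-bit integers.
        wav.writeframes(struct.pack('<%dh' % len(samples), *samples))
    return len(samples)


# Hypothetical output location in the system temp directory.
test_path = os.path.join(tempfile.gettempdir(), 'loudness_test.wav')
frame_count = write_test_wav(test_path)
print(frame_count)  # 16000
```

A file like this meets the mono and 16-bit requirements that `get_loud_times` asserts, so it can be passed straight to that function.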