Truncated speech-to-text output from a wav file with SpeechRecognition recognize_google()

【Question title】: Truncated speech-to-text output from wav file with SpeechRecognition recognize_google()
【Posted】: 2021-05-03 13:18:31
【Question】:

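I have speech audio files in wav format, each 60 seconds long. However, the output is truncated and captures only about 15% of the length. I have tried this both in a local Jupyter Notebook and in Google Colab. According to the documentation, this request is below the API's threshold. What am I doing wrong, or how can I get around this limit?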

import speech_recognition as sr

# select a recognizer session
# recognize_google() : Google Web Speech API
r = sr.Recognizer()

interview = sr.AudioFile('sample.wav')
with interview as source:
  print('Ready...')
  r.pause_threshold = 2
  audio = r.record(source, duration=60)

type(audio)
transcription = r.recognize_google(audio, language='en_CA')
print(transcription)

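【Comments】:

Wrapping r.recognize_google() in a try/except block, or removing the language argument, might work!
@BhavyaParikh I should clarify that I am not getting an error as such; rather, the recognizer is not processing the entire audio file. Could you clarify where I should put the try and except blocks? The output currently looks like this: Ready... information we want a baller or persons also for personal safety like LeBron houses his bases the same thing when we contact the other team like, but the speech audio contains much more content after the point where the text output ends.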

Tags: python speech-recognition speech-to-text

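【Solution 1】:

Try this code. If the output is still the same as with the old code, you can rely on the except blocks to surface any error, or change the pause_threshold value.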

import speech_recognition as sr
r = sr.Recognizer()

with sr.AudioFile("sample.wav") as source:
    print("Ready")
    r.pause_threshold = 0.6 
    audio = r.record(source)
try:
    s = r.recognize_google(audio)
    print("Text: "+s)
except sr.UnknownValueError:
    print("Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Error {0}".format(e))
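
If the output is still cut short, one workaround that is sometimes suggested is to send the audio in shorter pieces and join the results. The sketch below is not from this thread and rests on an assumption: that the web API returns text only for roughly the first part of a long clip. The function names split_wav and transcribe_chunks, and the 15-second chunk length, are hypothetical choices for illustration. It assumes an uncompressed PCM wav file, since it uses the standard-library wave module to do the splitting.

```python
import io
import wave


def split_wav(path, chunk_seconds=15):
    """Split a PCM WAV file into consecutive in-memory WAV chunks (bytes)."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    chunks = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(params.framerate * chunk_seconds)
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # end of file reached
            buf = io.BytesIO()
            with wave.open(buf, "wb") as dst:
                # Copy the source format so each chunk is a valid WAV on its own.
                dst.setnchannels(params.nchannels)
                dst.setsampwidth(params.sampwidth)
                dst.setframerate(params.framerate)
                dst.writeframes(frames)
            chunks.append(buf.getvalue())
    return chunks


def transcribe_chunks(chunks, language="en-CA"):
    """Transcribe each chunk with recognize_google() and join the text."""
    # Imported here so split_wav() works without SpeechRecognition installed.
    import speech_recognition as sr

    r = sr.Recognizer()
    texts = []
    for blob in chunks:
        # AudioFile also accepts a file-like object, not only a path.
        with sr.AudioFile(io.BytesIO(blob)) as source:
            audio = r.record(source)
        try:
            texts.append(r.recognize_google(audio, language=language))
        except sr.UnknownValueError:
            texts.append("")  # nothing recognizable in this chunk
    return " ".join(t for t in texts if t)
```

Usage would look like print(transcribe_chunks(split_wav("sample.wav", 15))). Fixed-length cuts can split a word at a chunk boundary, so the joined text may contain small errors at the seams. A RequestError is deliberately left uncaught so that network or quota problems stay visible.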

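【Comments】:

Thanks for providing this code. Unfortunately, I ran it and it did not output any errors. I adjusted the following parameters hoping to capture anything more in the recording, but no luck: energy_threshold = 50, pause_threshold = 0.1, phrase_threshold = 0.1, operation_timeout = None, non_speaking_duration = 0.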
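
As far as I can tell from the library, the threshold settings listed above are consulted by listen() (microphone-style phrase detection) and not by record(), which would explain why changing them made no difference. A quick way to narrow the problem down is to check whether record() actually captured the full 60 seconds. The sketch below is a hypothetical diagnostic, not something from this thread: the helper names are made up, and it assumes an uncompressed PCM wav file that the standard-library wave module can read.

```python
import wave


def wav_duration_seconds(path):
    """Length of a PCM WAV file in seconds, read from its header."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / float(wf.getframerate())


def audio_data_duration_seconds(frame_data, sample_rate, sample_width):
    """Length in seconds of mono raw audio, e.g. the fields of an AudioData."""
    return len(frame_data) / float(sample_rate * sample_width)
```

With the audio object from the code above, compare wav_duration_seconds("sample.wav") against audio_data_duration_seconds(audio.frame_data, audio.sample_rate, audio.sample_width). If both come out near 60, the whole file was read and the truncation happens on the recognition side rather than in record().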