【Question title】:Truncated speech-to-text output from wav file with SpeechRecognition recognize_google()
【Posted】:2021-05-03 13:18:31
【Question】:

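I have speech audio files in wav format, each 60 seconds long. However, the output is truncated and captures only about 15% of the length. I have tried this both in a local Jupyter Notebook and in Google Colab. According to the documentation, this request is below the API's threshold. What am I doing wrong, or how can I get around this limit?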

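import speech_recognition as sr

# select a recognizer session
# recognize_google() : Google Web Speech API
r = sr.Recognizer()

interview = sr.AudioFile('sample.wav')
with interview as source:
  print('Ready...')
  r.pause_threshold = 2  # used by listen() to detect the end of a phrase; record() does not use it
  audio = r.record(source, duration=60)  # read up to 60 seconds from the file

type(audio)
# note: the library documentation gives language tags in hyphenated form, e.g. 'en-US'
transcription = r.recognize_google(audio, language='en_CA')
print(transcription)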

【Comments】:

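  • Wrapping r.recognize_google() in try and except blocks, or removing the language argument, might work!
  • @BhavyaParikh I should clarify that I am not getting an error as such; rather, the recognizer does not process the whole audio file. Could you clarify where I should put the try and except blocks? The output currently looks like this: Ready... information we want a baller or persons also for personal safety like LeBron houses his bases the same thing when we contact the other team like, but the audio contains much more speech after the point where the text output ends.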
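
Since no error is raised, one way to narrow the problem down is to check how much audio r.record() actually captured before it is sent to the API. The sketch below is not from the thread; captured_seconds is a hypothetical helper, and it assumes the object passed in exposes frame_data, sample_rate and sample_width the way the library's AudioData (mono PCM) does.

```python
def captured_seconds(audio):
    """Return the length, in seconds, of the mono PCM audio held by `audio`.

    `audio` is assumed to expose frame_data (bytes), sample_rate (Hz) and
    sample_width (bytes per sample), as speech_recognition.AudioData does.
    """
    bytes_per_second = audio.sample_rate * audio.sample_width
    return len(audio.frame_data) / bytes_per_second
```

Calling print(captured_seconds(audio)) right after audio = r.record(source, duration=60) should print a value close to 60 if the whole file was read. In that case the truncation happens on the recognition side rather than while loading the file.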

Tags: python speech-recognition speech-to-text


【Solution 1】:

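Try this code. If the output is still the same as with the old code, you can rely on the except blocks to surface any error, or change pause_threshold.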

import speech_recognition as sr
r = sr.Recognizer()

with sr.AudioFile("sample.wav") as source:
    print("Ready")
    r.pause_threshold = 0.6 
    audio = r.record(source)
try:
    s = r.recognize_google(audio)
    print("Text: "+s)
except sr.UnknownValueError:
    print("Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Error {0}".format(e))

【Comments】:

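  • Thanks for providing this code. Unfortunately, I ran it and it did not output any errors. I adjusted the following parameters hoping to capture anything more from the recording, but had no luck: energy_threshold = 50, pause_threshold = 0.1, phrase_threshold = 0.1, operation_timeout = None, non_speaking_duration = 0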
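
A workaround that is not confirmed anywhere in this thread, but may be worth trying, is to send the recording to recognize_google() in shorter pieces and join the results. The following is a minimal sketch under stated assumptions: chunk_spans, wav_duration_seconds and transcribe_in_chunks are hypothetical helper names, the 15-second chunk length and the 'en-CA' language tag are arbitrary illustrative choices, and whether chunking actually avoids the truncation is an assumption rather than a tested result.

```python
import wave


def chunk_spans(total_seconds, chunk_seconds=15.0):
    """Return (offset, duration) pairs that cover total_seconds without gaps."""
    spans = []
    offset = 0.0
    while offset < total_seconds:
        duration = min(chunk_seconds, total_seconds - offset)
        spans.append((offset, duration))
        offset += duration
    return spans


def wav_duration_seconds(path):
    """Length of a PCM WAV file in seconds, read from its header."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def transcribe_in_chunks(path, chunk_seconds=15.0, language="en-CA"):
    """Transcribe a WAV file span by span and join the partial results."""
    # Imported lazily so the two helpers above work without the package.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    pieces = []
    for offset, duration in chunk_spans(wav_duration_seconds(path), chunk_seconds):
        # Re-open the file for each span so that offset counts from the start.
        with sr.AudioFile(path) as source:
            audio = recognizer.record(source, offset=offset, duration=duration)
        try:
            pieces.append(recognizer.recognize_google(audio, language=language))
        except sr.UnknownValueError:
            # Nothing intelligible was recognized in this span; skip it.
            continue
    return " ".join(pieces)
```

With the file from the question, print(transcribe_in_chunks('sample.wav')) would issue four requests for a 60-second recording. Fixed-length cuts can split a word across two chunks, so the joined text may contain small errors at the boundaries; sr.RequestError is deliberately left uncaught so that network or API failures remain visible.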