【Question title】: Accuracy of IBM Watson speech recognition is low
【Posted】: 2016-08-15 17:09:34
【Question description】:

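I developed an application that uses Speech to Text to transcribe audio into text. The accuracy is low, and some of the resulting sentences make no sense. Is there a way to improve the accuracy of speech to text?

Here is an example: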

http://book.vidalab.co/books/alice-in-wonderland

Alice in Wonderland, part 2:

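"at home, walk white when you look at the ad this way" should be "at home, White walks this way, you see, Alice"

"white mouse" should be "red and white"

"the White side tries to win, while the Red side is on the Trice twins" should be "the White side tries to win and the Red side tries to win"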

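【Question comments】:

  • It is not artificial intelligence. Look at how it handles this poem: waylink-english.co.uk/?page=16100
  • I did not expect it to parse poetry. But it does not do well on literature either. Maybe literature is out of bounds too?

Tags: machine-learning artificial-intelligence speech-to-text ibm-watson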


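【Solution 1】:

You could try a different service, such as Speechmatics. It is not very good at identifying speakers, but its word accuracy is much better than Watson's. This is the result it gives: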

Credits of Alice in Wonderland by Alice girs Timberg this is a box recording all of her vocal recordings are in the public domain for more information or volunteer. Please visit libber Vox dot org.
I just listed stage directions read by McKayla Curtis Lewis Carroll.
Read by Shannon Brown Alice read by Amanda Friday the Red Queen read by Shauna canat White Queen read by Elizabeth Klatt White Rabbit read by Todd Humpty Dumpty read by Jeff Machado written read by Brett Hirsch.
The Mock Turtle read by Ted the alarm Mad Hatter read by Elliot gage the March Hare by Charlotte Duckett's dormouse read by Kimberly Krauss frog read by Larry Wilson Duchess read by L.A. Cheshire Cat read by Sarah Herschell Tweedle-Dee read.
By Charlotte Brown.
Do you do do I read by the sea a solo the King of Hearts read by Ted alarm the Queen of Hearts read by eating Ray Headrick knave by glorious Joe Carter pillar back at 2 loss to spot read by Dave Harris.
Five Spot read by Dave Harith. Seven of spades read by Dave Hereth end of credits.

Recognizing surnames is a very hard task, and few companies do it well.

【Comments】:

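    【Solution 2】:

    Every STT system has two main parts: an acoustic model and a language model. The acoustic model is about the audio and the speaker, and deals with things such as noise, pronunciation, and accent. The language model is about the structure of the given language and the words used in the speech.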
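
    To make the language-model half concrete, here is a minimal sketch of how such a model can prefer one of two similar-sounding transcripts. It is a toy add-one-smoothed bigram model, not what Watson or any real service uses; the training sentences and both candidate transcripts are made up for illustration, and a real system would also weigh in the acoustic score.

```python
import math
from collections import Counter

def train_bigram_counts(sentences):
    """Count unigrams and bigrams, with <s> marking the sentence start."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams

def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a sentence."""
    vocab_size = len(unigrams)
    words = ["<s>"] + sentence.lower().split()
    total = 0.0
    for prev, word in zip(words, words[1:]):
        # Unseen words and bigrams count as 0, so smoothing keeps this finite.
        total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
    return total

# Hypothetical in-domain text (assumption: not taken from the original post).
corpus = [
    "the white side tries to win",
    "the red side tries to win",
    "white moves first in chess",
]
unigrams, bigrams = train_bigram_counts(corpus)

# Two same-length candidate transcripts that might sound alike.
candidates = ["the red side tries to win", "the red side tries two wins"]
best = max(candidates, key=lambda s: log_prob(s, unigrams, bigrams))
print(best)
```

    The candidates have the same number of words on purpose: a raw product of probabilities favours shorter sentences, so comparing transcripts of different lengths would need some form of length normalization.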

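    If you want to test an STT system, use recordings that are as close as possible to your target speech. A system that performs very well on general speech or on medical transcription may not handle speech about archaeology or poetry well.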
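
    One common way to put a number on such a test is word error rate (WER): the word-level edit distance between a reference transcript and the STT output, divided by the number of reference words. The answer does not name a metric, so treat this as one possible sketch; the two sample strings are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the output
                curr[j - 1] + 1,     # extra word in the output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)

# Hypothetical reference and STT output (assumption: not from the original post).
reference = "the white side tries to win"
hypothesis = "the white side tries twins"
print(f"WER: {word_error_rate(reference, hypothesis):.2f}")
```

    Running the same recordings through each candidate service and comparing WER on in-domain audio gives a fairer picture than judging from a single sample.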

    【Comments】:
