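Handling the response from Google Speech

【Question title】: Handling the response from Google Speech
【Posted】: 2020-08-18 20:33:44
【Question description】:

I have a speech-to-text application, and I'm a bit lost on how to handle the response efficiently and organize it into a transcript. I feed the transcriber function 45-second chunks of data like this: all_text = pool.map(transcribe, enumerate(files)). This is the response I get: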

all text:  [{'idx': 0, 'text': ['users outnumber', ' future'], 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs', 'file_index': 0, 'words': [{'word': 'users', 'start_time': 0, 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs'}, {'word': 'outnumber', 'start_time': 0, 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs'}, {'word': 'future', 'start_time': 4, 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs'}]}, 
{'idx': 1, 'text': ['and the sustainable energy'], 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs', 'file_index': 1, 'words': [{'word': 'and', 'start_time': 45, 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs'}, {'word': 'the', 'start_time': 45, 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs'}, {'word': 'sustainable', 'start_time': 45, 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs'}, {'word': 'energy', 'start_time': 52, 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs'}]}]

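So here I have two 45-second chunks from a speech by Elon Musk. I cut most of the response to make it shorter, but as you can see there are two chunks, with indexes 0 and 1. How can I build a transcript from this response based on the words' start_time values? Here I only took the seconds, but of course I can get the nanos too. Would it work to create another list, push all the words into it, and then sort that list by start_time? That brings me to my second question: how efficient is this? If I end up with a mile-long list of words and other information from multiple users, could that cause problems? Is there a better way?

Edit: here's what I tried. It works for short sessions, but the application crashes on longer ones. I wonder whether that's because the list gets too big?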

words = []
clean_transcript = ''

for word in alternative.words:
    words.append({'word': word.word, 'start_time': word.start_time.seconds, 'participant': participant})

words.sort(key=lambda x: x['start_time'])
print('ALL WORDS: ', words)

for w in words:
    clean_transcript += w['word'] + ' '

print(clean_transcript)

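Are there any obvious "don't do this" mistakes here?

【Comments】:

First, try using a normal for-loop (or even nested for-loops).
You could use a list, clean_transcript = [] with clean_transcript.append(w['word']), and then convert it to a string with clean_transcript = " ".join(clean_transcript).
It comes back sorted from the transcriber, but when I run, say, ten audio files through the function, the words are sorted only within each file. I want them sorted by time across everything, no matter which file the data came from. So in the final list, I want the data from all the different files in chronological order.
Now sort() makes sense. The code looks fine, and it's more readable than the list comprehension in my answer.
The code " ".join(clean_transcript) should be faster than clean_transcript += w['word'] + ' ', but you'd only see the difference with very large data. Usually print() inside a loop causes bigger problems, because displaying text takes a long time; people first remove the print() calls or print less text (e.g. only a dot, ., to check that the code is still running) to make the code faster. I would expect problems with sending data to and receiving data from Google Speech rather than with this part of the code. With big data you could eventually keep it in a pandas.DataFrame, which uses code written in C/C++ and can work faster.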
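
If each file's word list really does come back already sorted, as the comments above suggest, one option is to merge the lists instead of re-sorting everything. The following is only a sketch under that assumption; merge_sorted_words and the sample data are hypothetical names made up for illustration, and the word dicts are trimmed to the two fields used here.

```python
import heapq

def merge_sorted_words(per_file_words):
    """Merge word lists that are each already sorted by start_time."""
    # heapq.merge walks the inputs lazily and never re-sorts a single list.
    return list(heapq.merge(*per_file_words, key=lambda w: w['start_time']))

# Hypothetical sample: two files whose time ranges overlap.
file_a = [{'word': 'hello', 'start_time': 0}, {'word': 'there', 'start_time': 3}]
file_b = [{'word': 'hi', 'start_time': 1}, {'word': 'everyone', 'start_time': 5}]

merged = merge_sorted_words([file_a, file_b])
transcript = ' '.join(w['word'] for w in merged)
print(transcript)
```

For a few thousand words a plain sort() is likely just as good; merging mostly pays off when there are many long, pre-sorted lists.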

Tags: python python-3.x speech-recognition speech-to-text

【Solution 1】:

First you should try using a normal for-loop, or nested for-loops.

text = [
    {'idx': 0, 'text': ['users outnumber', ' future'], 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs', 'file_index': 0, 'words': [{'word': 'users', 'start_time': 0, 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs'}, {'word': 'outnumber', 'start_time': 0, 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs'}, {'word': 'future', 'start_time': 4, 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs'}]}, 
    {'idx': 1, 'text': ['and the sustainable energy'], 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs', 'file_index': 1, 'words': [{'word': 'and', 'start_time': 45, 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs'}, {'word': 'the', 'start_time': 45, 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs'}, {'word': 'sustainable', 'start_time': 45, 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs'}, {'word': 'energy', 'start_time': 52, 'participant': 'str_MIC_Ct3G_con_O6qn4m00bs'}]}
]

for item in text:
    print('---', item['idx'], '---')
    for word in item['words']:
        if word['start_time'] >= 45:
            print(word['start_time'], word['word'])

Result:

--- 0 ---
--- 1 ---
45 and
45 the
45 sustainable
52 energy

Later you can try converting it to a list comprehension.

result = [[(word['start_time'], word['word'])  for word in item['words'] if word['start_time'] >= 45] for item in text]
print(result)

Result:

[[], [(45, 'and'), (45, 'the'), (45, 'sustainable'), (52, 'energy')]]

Or without start_time:

result = [[word['word'] for word in item['words'] if word['start_time'] >= 45] for item in text]
print(result)

Result:

[[], ['and', 'the', 'sustainable', 'energy']]

Or, if you want to create a flat list instead of sublists:

result = [word['word'] for item in text for word in item['words'] if word['start_time'] >= 45]
print(result)

Result:

['and', 'the', 'sustainable', 'energy']
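
To get an actual transcript ordered by start_time across all chunks, the flat-list idea can be combined with a sort and a join. This is a minimal sketch, assuming the response has the shape shown in the question; build_transcript is a name made up for this example, and the sample data is trimmed to the fields the function uses.

```python
# Sample data in the same shape as the response shown in the question
# (trimmed to the fields used here).
text = [
    {'idx': 0, 'words': [
        {'word': 'users', 'start_time': 0},
        {'word': 'outnumber', 'start_time': 0},
        {'word': 'future', 'start_time': 4},
    ]},
    {'idx': 1, 'words': [
        {'word': 'and', 'start_time': 45},
        {'word': 'the', 'start_time': 45},
        {'word': 'sustainable', 'start_time': 45},
        {'word': 'energy', 'start_time': 52},
    ]},
]

def build_transcript(chunks):
    """Flatten the words of all chunks, sort them by start_time, and join them."""
    all_words = [word for item in chunks for word in item['words']]
    # sorted() is stable: words with equal start_time keep their original order.
    all_words = sorted(all_words, key=lambda w: w['start_time'])
    # Join once at the end instead of growing a string inside a loop.
    return ' '.join(w['word'] for w in all_words)

transcript = build_transcript(text)
print(transcript)
```

Result:

users outnumber future and the sustainable energy

Because several words share the same whole-second start_time, their relative order depends only on the order in which they arrive; keeping the sub-second part of the timestamp would make the ordering more reliable.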

【Comments】:

Thanks, I'll try it on my code. Could you also check my edit? It has an example of how I tried to solve my problem.
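
Regarding the edit: it keeps only word.start_time.seconds, so every word spoken within the same second gets the same sort key. A possible way to keep the sub-second part is sketched below. It assumes start_time is either a datetime.timedelta or a Duration-like object with seconds and nanos fields, which depends on the client library version in use; to_seconds is a hypothetical helper, and the two sample values are stand-ins rather than real API output.

```python
from datetime import timedelta
from types import SimpleNamespace

def to_seconds(start_time):
    """Return a start time as float seconds, keeping the sub-second part."""
    if isinstance(start_time, timedelta):
        return start_time.total_seconds()
    # Assumed Duration-like object with integer `seconds` and `nanos` fields.
    return start_time.seconds + start_time.nanos / 1e9

# Stand-ins for illustration only; real values would come from the API response.
as_timedelta = timedelta(seconds=45, milliseconds=300)
as_duration = SimpleNamespace(seconds=45, nanos=300_000_000)

print(to_seconds(as_timedelta), to_seconds(as_duration))
```

Storing to_seconds(word.start_time) as 'start_time' in each dict would leave the rest of the sorting code unchanged.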