【Question title】: Google Cloud Speech to Text Audio Timeout Error when used with Twilio "Stream" verb and Websocket
【Posted】: 2020-06-09 22:36:31
【Question】:

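I'm currently trying to build a system that transcribes a phone call in real time and displays the conversation in my command line. To do this, I'm using a Twilio phone number that sends an HTTP request when it is called. My server code uses Flask, ngrok, and WebSockets to handle the request, expose my local port, and transfer the data, and the TwiML verb "Stream" streams the audio data to the Google Cloud Speech-to-Text API. So far I have been using Twilio's Python demo on GitHub (https://github.com/twilio/media-streams/tree/master/python/realtime-transcriptions).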

My server code:

from flask import Flask, render_template
from flask_sockets import Sockets

from SpeechClientBridge import SpeechClientBridge
from google.cloud.speech_v1 import enums
from google.cloud.speech_v1 import types

import json
import base64
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "./<KEY>.json"
HTTP_SERVER_PORT = 8080

config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.MULAW,
    sample_rate_hertz=8000,
    language_code='en-US')
streaming_config = types.StreamingRecognitionConfig(
    config=config,
    interim_results=True)

app = Flask(__name__)
sockets = Sockets(app)

@app.route('/home')
def home():
    return render_template("index.html")

@app.route('/twiml', methods=['POST'])
def return_twiml():
    print("POST TwiML")
    return render_template('streams.xml')

def on_transcription_response(response):
    if not response.results:
        return

    result = response.results[0]
    if not result.alternatives:
        return

    transcription = result.alternatives[0].transcript
    print("Transcription: " + transcription)

@sockets.route('/')
def transcript(ws):
    print("WS connection opened")
    bridge = SpeechClientBridge(
        streaming_config, 
        on_transcription_response
    )
    while not ws.closed:
        message = ws.receive()
        if message is None:
            bridge.terminate()
            break

        data = json.loads(message)
        if data["event"] in ("connected", "start"):
            print(f"Media WS: Received event '{data['event']}': {message}")
            continue
        if data["event"] == "media":
            media = data["media"]
            chunk = base64.b64decode(media["payload"])
            bridge.add_request(chunk)
        if data["event"] == "stop":
            print(f"Media WS: Received event 'stop': {message}")
            print("Stopping...")
            break

    bridge.terminate()
    print("WS connection closed")

if __name__ == '__main__':
    from gevent import pywsgi
    from geventwebsocket.handler import WebSocketHandler

    server = pywsgi.WSGIServer(('', HTTP_SERVER_PORT), app, handler_class=WebSocketHandler)
    print("Server listening on: http://localhost:" + str(HTTP_SERVER_PORT))
    server.serve_forever()

streams.xml:

<?xml version="1.0" encoding="UTF-8"?>
<Response>
     <Say> Thanks for calling!</Say>
     <Start>
        <Stream url="wss://<ngrok-URL>.ngrok.io/"/>
     </Start>
     <Pause length="40"/>
</Response>

Twilio WebHook:

http://<ngrok-URL>.ngrok.io/twiml

When I run the server code and then call the Twilio number, I get the following error:

C:\Users\Max\Python\Twilio>python server.py
Server listening on: http://localhost:8080
POST TwiML
WS connection opened
Media WS: Received event 'connected': {"event":"connected","protocol":"Call","version":"0.2.0"}
Media WS: Received event 'start': {"event":"start","sequenceNumber":"1","start":{"accountSid":"AC8abc5aa74496a227d3eb489","streamSid":"MZe6245f23e2385aa2ea7b397","callSid":"CA5864313b4992607d3fe46","tracks":["inbound"],"mediaFormat":{"encoding":"audio/x-mulaw","sampleRate":8000,"channels":1}},"streamSid":"MZe6245f2397c1285aa2ea7b397"}
Exception in thread Thread-4:
Traceback (most recent call last):
  File "C:\Users\Max\AppData\Local\Programs\Python\Python37\lib\site-packages\google\api_core\grpc_helpers.py", line 96, in next
    return six.next(self._wrapped)
  File "C:\Users\Max\AppData\Local\Programs\Python\Python37\lib\site-packages\grpc\_channel.py", line 416, in __next__
    return self._next()
  File "C:\Users\Max\AppData\Local\Programs\Python\Python37\lib\site-packages\grpc\_channel.py", line 689, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
        status = StatusCode.OUT_OF_RANGE
        details = "Audio Timeout Error: Long duration elapsed without audio. Audio should be sent close to real time."
        debug_error_string = "{"created":"@1591738676.565000000","description":"Error received from peer ipv6:[2a00:1450:4009:807::200a]:443","file":"src/core/lib/surface/call.cc","file_line":1056,"grpc_message":"Audio Timeout Error: Long duration elapsed without audio. Audio should be sent close to real time.","grpc_status":11}"
>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Max\AppData\Local\Programs\Python\Python37\lib\threading.py", line 917, in _bootstrap_inner
    self.run()
  File "C:\Users\Max\AppData\Local\Programs\Python\Python37\lib\threading.py", line 865, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Max\Python\Twilio\SpeechClientBridge.py", line 37, in process_responses_loop
    for response in responses:
  File "C:\Users\Max\AppData\Local\Programs\Python\Python37\lib\site-packages\google\api_core\grpc_helpers.py", line 99, in next
    six.raise_from(exceptions.from_grpc_error(exc), exc)
  File "<string>", line 3, in raise_from
google.api_core.exceptions.OutOfRange: 400 Audio Timeout Error: Long duration elapsed without audio. Audio should be sent close to real time.

Media WS: Received event 'stop': {"event":"stop","sequenceNumber":"752","streamSid":"MZe6245f2397c125aa2ea7b397","stop":{"accountSid":"AC8abc5aa74496a60227d3eb489","callSid":"CA5842bc6431314d502607d3fe46"}}
Stopping...
WS connection closed

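I can't figure out why I'm getting the Audio Timeout Error. Is it a firewall issue between Twilio and Google? An encoding issue?

Any help would be greatly appreciated.

System: Windows 10, Python 3.7.1, ngrok 2.3.35, Flask 1.1.2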

【Comments】:

    Tags: python flask websocket twilio google-speech-to-text-api


    【Answer 1】:

    Because your streams.xml returns the socket URL "wss://

    If your socket route is '/', then you should rewrite streams.xml as in the example below.

    <?xml version="1.0" encoding="UTF-8"?>
    <Response>
         <Say> Thanks for calling!</Say>
         <Start>
            <Stream url="wss://YOUR_NGROK_ID.ngrok.io/"/>
         </Start>
         <Pause length="40"/>
    </Response>
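
One quick way to check the point this answer raises is to compare the path component of the `<Stream>` URL with the path registered through `@sockets.route`. The following is a minimal sketch using only the standard library; `stream_url_matches_route` is a hypothetical helper name introduced here for illustration, not part of Twilio or flask_sockets.

```python
from urllib.parse import urlparse


def stream_url_matches_route(stream_url: str, route: str) -> bool:
    """Return True when the path of stream_url equals the WebSocket route."""
    # A URL without an explicit path ("wss://host") is treated as "/".
    path = urlparse(stream_url).path or "/"
    return path == route


# The placeholder host below is the one used in the answer's streams.xml.
print(stream_url_matches_route("wss://YOUR_NGROK_ID.ngrok.io/", "/"))
print(stream_url_matches_route("wss://YOUR_NGROK_ID.ngrok.io/media", "/"))
```

If the check fails, the WebSocket handshake would not reach the handler at all, so the "WS connection opened" line would never appear in the server output.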
    

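    【Comments】:

    • Thanks for your comment, Ryan. The URL works fine, since I get a good connection, as you can see in the first few lines of the command-line output. Apologies if I didn't make the URL routing very clear. The problem is that the Google Cloud Speech-to-Text API times out after the connection has been established.
    【Answer 2】: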

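    I ran some tests on this to try to work out what was happening. I put a timer around the

    bridge = SpeechClientBridge(streaming_config, on_transcription_response)

    section of the code and found that initialization took about 10.9 seconds. I believe the Google API has a timeout of 10 seconds. I then tried running it on my Google Cloud instance, which is more powerful than my laptop, and it worked perfectly. It is either that, or the GCP instance has different versions of some libraries/code installed, which I still need to check.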
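
The timing measurement described in this answer could be reproduced with something like the sketch below. It is not the answerer's actual code: `timed_call` is a hypothetical helper, and `FakeBridge` is a stand-in for `SpeechClientBridge` (with an artificial delay) so that the example runs without Google credentials.

```python
import time


def timed_call(factory, *args, **kwargs):
    """Call factory(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = factory(*args, **kwargs)
    return result, time.perf_counter() - start


class FakeBridge:
    """Hypothetical stand-in for SpeechClientBridge, used only for this sketch."""

    def __init__(self, streaming_config, on_response):
        time.sleep(0.05)  # simulate a slow client initialization
        self.streaming_config = streaming_config
        self.on_response = on_response


bridge, elapsed = timed_call(FakeBridge, {"interim_results": True}, print)
print(f"Bridge initialization took {elapsed:.2f} s")
```

In the real server, `FakeBridge` would be replaced by `SpeechClientBridge` with the original `streaming_config` and `on_transcription_response` arguments. An elapsed time close to or above the suspected timeout would support the explanation given in this answer.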

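    【Comments】:

      【Answer 3】:

      This is related to a conflict between gevent (used by flask_sockets) and grpc (used by Google Cloud Speech), described in this issue: https://github.com/grpc/grpc/issues/4629. The workaround is to add the following code: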

      import grpc.experimental.gevent as grpc_gevent
      grpc_gevent.init_gevent()
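
The answer does not say where the call belongs. As far as the gRPC documentation for `init_gevent` describes it, the call should run before any gRPC objects are created, which here would mean before the first `SpeechClientBridge` is constructed. A minimal sketch under that assumption follows; `enable_grpc_gevent` is a hypothetical wrapper name, and the `ImportError` guard is only there so the snippet also runs where grpcio or gevent is not installed.

```python
def enable_grpc_gevent() -> bool:
    """Try to make gRPC cooperate with gevent; return True on success."""
    try:
        # Same two lines as in the answer, wrapped so a missing
        # dependency is reported instead of crashing at import time.
        import grpc.experimental.gevent as grpc_gevent

        grpc_gevent.init_gevent()
    except ImportError:
        # grpcio (or gevent) does not appear to be installed here.
        return False
    return True
```

In server.py this would presumably be called once near the top of the module, for example `enable_grpc_gevent()` right after the imports and before `sockets = Sockets(app)`, so that it runs before the first WebSocket connection creates a speech client.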
      

      【Comments】:
