【Question title】: How to solve a Google Speech-to-Text long_running_recognize error in Cloud Run?
【Posted】: 2021-08-28 13:07:33
【Question】:

I am using the Google Speech-to-Text API. I run this code in Google Cloud Run:

operation = self.client.long_running_recognize(config=self.config, audio=audio)

I get the following error. I searched for the error message on Google, but could not find a good answer.

"/code/app/./moji/speech_api.py", line 105, in _long_recognize
    operation = self.client.long_running_recognize(config=self.config, audio=audio)
File "/usr/local/lib/python3.8/site-packages/google/cloud/speech_v1p1beta1/services/speech/client.py", line 457, in long_running_recognize
    response = rpc(request, retry=retry, timeout=timeout, metadata=metadata,)
File "/usr/local/lib/python3.8/site-packages/google/api_core/gapic_v1/method.py", line 145, in __call__
    return wrapped_func(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/google/api_core/grpc_helpers.py", line 69, in error_remapped_callable
    six.raise_from(exceptions.from_grpc_error(exc), exc)
File "<string>", line 3, in raise_from
google.api_core.exceptions.ServiceUnavailable: 503 Getting metadata from plugin failed with error: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
[pid: 11|app: 0|req: 56/2613] 169.254.8.129 () {44 vars in 746 bytes} [Sat Aug 28 18:16:17 2021] GET / => generated 0 bytes in 19 msecs (HTTP/1.1 302) 4 headers in 141 bytes (1 switches on core 0)

Here is my code.

def speech_to_text(request: WSGIRequest):
    moji_response = MojiResponse()
    threads = []

    def transcript_audio_thread(description):
        if description.length == 0: description.status = 'done'
        if description.status != 'done':
            description.status = 'processing'
            description.save()
            if description.api_name == 'amivoice' and description.lang == 'ja-JP':
                description.transcribe_ami()
            else:
                description.transcribe_gcs()
        description.consume_audio_limit()
        description.update_at = timezone.now()
        description.status = 'done'
        description.save()
        description.save_words(description.words)
        moji_response.append(description.get_result_dict())
        send_mail_for_over_10_min(description, description.user)

    def transcribe_gcs(self, gcs_uri, is_long_recognize):
        audio = speech.RecognitionAudio(uri=gcs_uri)
        logger.info(f'gcs_uri: {gcs_uri} speech.config: {self.config}')
        try:
            if is_long_recognize:
                self._long_recognize(audio)
            else:
                self._recognize(audio)
        except Exception as e:
            logger.error(f'recognized file gcs_uri: {gcs_uri} {e}')
            raise e

    # noinspection PyTypeChecker
    def set_config(self, language_code, sample_rate_hertz, phrases, channels=1, ):
        logger.info(f'language_code: {language_code} sample_rate_hertz: {sample_rate_hertz} channels: {channels}')
        encoding = speech.RecognitionConfig.AudioEncoding.FLAC
        # encoding = speech.RecognitionConfig.AudioEncoding.LINEAR16
        self.config = {
            'encoding': encoding,
            'sample_rate_hertz': sample_rate_hertz,
            'language_code': language_code,
            'enable_automatic_punctuation': True,
            'enable_word_time_offsets': True,
            'audio_channel_count': channels,
        }
        if len(phrases) > 0: self.config['speech_contexts'] = phrases

    @stop_watch
    def _long_recognize(self, audio):
        operation = self.client.long_running_recognize(config=self.config, audio=audio)
        logger.info("Waiting for speech_to_text to complete...")
        self.response = operation.result(timeout=60 * 90)

    def transcript_audio(file_path: str, user: CustomUser, language_code: str, api_name: str,
                         is_trial: bool) -> Description:
        t = threading.Thread(target=transcript_audio_thread, args=(description,))
        t.start()

        if description.length > 600 and user.plan != 'anonymous':
            moji_response.append({'id': description.pk, 'file_name': description.file.name,
                                  'text': MyMessage.OVER_10_MIN})
            return
        threads.append(t)

This error only occurs for a few files. Most files can be transcribed.

Details of a failing file

Type: flac
Duration: 1181 seconds
Language: ja-JP
Sample rate: 16000 Hz
Size: 33.7 MB
Channels: 1

When I transcribe the same file with the same code in my local development environment, it works fine.

Authentication

Service account

I downloaded the auth JSON file and set its path in GOOGLE_APPLICATION_CREDENTIALS.

import os

from google.oauth2 import service_account

GS_CREDENTIALS = service_account.Credentials.from_service_account_file(
    os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
)

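This authentication works correctly, because I can transcribe audio files with it. The error only occurs for a few files.

Can anyone help me solve this error?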

【Comments】:

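Could you provide more details about your setup (did you create a ServiceAccount, how do you authenticate with it, and so on) and about which files work and which fail? How long are the audio files, and how many MB are they? Can you share your code? Is there anything specific about these files (the audio format?)? Is that the complete error output? Could you list all the steps needed to reproduce this issue on GCP?
Thank you for your comment. I have added some information.
To summarize, all the files that have the issue are in FLAC format? Your files are in GCS, and it works if you run the code from your machine, but it does not work if you run it in Google Cloud Run. Which plugin are you using? Could you specify how you use it on Cloud Run? Are you using the CLI?
Sorry, I could not explain the problem well. The code I posted was not enough to answer this question. In the end, however, I solved the problem. I have added my answer. Thank you for your support.
Great! Please accept your answer; it won't give you points, but it will make it more visible to other users.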

Tags: google-speech-to-text-api

【Solution 1】:

I solved this error. It is not a Speech-to-Text error. It is a threading error. I forgot to append the Thread to the threads list before the return.

        t = threading.Thread(target=transcript_audio_thread, args=(description,))
        t.start()

        if description.length > 600 and user.plan != 'anonymous':
            moji_response.append({'id': description.pk, 'file_name': description.file.name,
                                  'text': MyMessage.OVER_10_MIN})
            return
        threads.append(t)  # <-- point: never reached when the early return above fires

I changed the code above as follows.

        t = threading.Thread(target=transcript_audio_thread, args=(description,))
        t.start()
        threads.append(t)  # <-- point: appended before the length check

        if description.length > 600 and user.plan != 'anonymous':
            moji_response.append({'id': description.pk, 'file_name': description.file.name,
                                  'text': MyMessage.OVER_10_MIN})

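In a local environment, a Thread can keep running after the HttpResponse has been returned. Cloud Run, however, stops the server once the HttpResponse has been sent, so a thread that is still running at that point gets cut off. That is why I only got this error in Cloud Run.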
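The pattern behind the fix can be sketched in isolation. This is a minimal illustration, not the original application code: `handle_request`, `worker`, and the sleep that stands in for the transcription call are all hypothetical. It assumes the request handler is expected to wait for its workers, so every thread is registered right after `start()` and joined before the handler returns.

```python
import threading
import time


def handle_request(durations):
    """Start one worker per job and wait for all of them before responding."""
    results = []
    results_lock = threading.Lock()
    threads = []

    def worker(job_id, seconds):
        # Hypothetical stand-in for the long-running transcription call.
        time.sleep(seconds)
        with results_lock:
            results.append(job_id)

    for job_id, seconds in enumerate(durations):
        t = threading.Thread(target=worker, args=(job_id, seconds))
        t.start()
        # Register the thread unconditionally, before any early return.
        threads.append(t)

    # Block until every worker has finished, so no work is left running
    # after the response has been sent.
    for t in threads:
        t.join()

    return sorted(results)
```

With this shape, a long job makes the request slower, but no thread is still running when the response goes out.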

GET / => generated 0 bytes in 19 msecs (HTTP/1.1 302) 4 headers in 141 bytes (1 switches on core 0)
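The log line above shows the request completing in 19 msecs, which suggests the response went out while the transcription thread was still working. One alternative way to make it hard to forget a worker is a `ThreadPoolExecutor` used as a context manager, since leaving the `with` block waits for all submitted tasks. The sketch below is only an illustration under that assumption; `transcribe_stub` and `handle_request_with_pool` are hypothetical names, not part of the original code.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def transcribe_stub(file_name):
    # Hypothetical stand-in for the real transcription call.
    time.sleep(0.01)
    return f"{file_name}: done"


def handle_request_with_pool(file_names):
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(transcribe_stub, name) for name in file_names]
    # Leaving the with-block waits for every submitted task to finish,
    # so all results are available here.
    return [future.result() for future in futures]
```

Because the pool owns the threads, there is no separate list to keep in sync with early returns.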
