【Question title】: How to send an audio file to Google Speech-to-Text from Firebase Storage?
【Posted】: 2020-03-17 21:36:20
【Question description】:

I am trying to send a small audio file (a few seconds long) from Firebase Storage to Google Cloud Speech-to-Text using Firebase Cloud Functions. The documentation says to use this synchronous code for short audio files:

// Imports the Google Cloud client library
const speech = require('@google-cloud/speech');

// Creates a client
const client = new speech.SpeechClient();

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// const gcsUri = 'gs://my-bucket/audio.raw';
// const encoding = 'Encoding of the audio file, e.g. LINEAR16';
// const sampleRateHertz = 16000;
// const languageCode = 'BCP-47 language code, e.g. en-US';

const config = {
  encoding: encoding,
  sampleRateHertz: sampleRateHertz,
  languageCode: languageCode,
};
const audio = {
  uri: gcsUri,
};

const request = {
  config: config,
  audio: audio,
};

// Detects speech in the audio file
const [response] = await client.recognize(request);
const transcription = response.results
  .map(result => result.alternatives[0].transcript)
  .join('\n');
console.log(`Transcription: `, transcription);

This code won't run as written because it uses await outside an async function.
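One minimal way to make the `await` legal, sketched here under assumptions, is to move the call into an `async` function. The names `transcribe` and `fakeClient` are hypothetical and exist only so the snippet runs without credentials; in a real function, `client` would be the `speech.SpeechClient` instance from the sample above. Error handling is deliberately left out at this step.

```javascript
// Hypothetical helper: `await` is only valid inside an async function
// (or at the top level of an ES module), so the call is wrapped here.
async function transcribe(client, request) {
  const [response] = await client.recognize(request);
  return response.results
    .map(result => result.alternatives[0].transcript)
    .join('\n');
}

// Stand-in for speech.SpeechClient so the sketch runs offline (assumption:
// recognize() resolves to an array whose first element is the response).
const fakeClient = {
  recognize: async () => [
    { results: [{ alternatives: [{ transcript: 'hello world' }] }] },
  ],
};

transcribe(fakeClient, { config: {}, audio: {} })
  .then(text => console.log('Transcription:', text));
```

Because `transcribe` returns a promise, the caller still has to return or await it; otherwise the surrounding function can finish before the request completes.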

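Another problem with this code is that it doesn't catch errors. After fixing both issues and putting the code in a Firebase Cloud Functions trigger, I have the following: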

exports.Google_Speech_to_Text = functions.firestore.document('Users/{userID}/Pronunciation_Test/downloadURL').onUpdate((change, context) => {
    return async function syncRecognizeGCS() {
      // [START speech_transcribe_sync_gcs]
      // Imports the Google Cloud client library
      const speech = require('@google-cloud/speech');

      // Creates a client
      const client = new speech.SpeechClient();

      const gcsUri = 'gs://my-app.appspot.com/my-file';
      const encoding = 'Opus';
      const sampleRateHertz = 48000;
      const languageCode = 'en-US';

      const config = {
        encoding: encoding,
        sampleRateHertz: sampleRateHertz,
        languageCode: languageCode,
      };
      const audio = {
        uri: gcsUri,
      };

      const request = {
        config: config,
        audio: audio,
      };

      // Detects speech in the audio file
      const [response] = await client.recognize(request)
      .catch((err) => { console.error(err); });

      const transcription = response.results
      .map(result => result.alternatives[0].transcript)
      .join('\n');
      console.log(`Transcription: `, transcription);
      // [END speech_transcribe_sync_gcs]
    }

  }); // close Google_Speech_to_Text

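The function executes, returns ok, and logs nothing else.

There is no error message. I can't find anything wrong with the file in Storage.

I tried a different file, this time an mp3. The result was the same, except that the function finished in 17 milliseconds because the file is smaller.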

I can't determine which audio encoding and sample rate mediaDevices.getUserMedia() uses in Chrome. This blog post says the audio encoding is Opus and the sample rate is 48000. Sometimes I get the error INVALID_ARGUMENT: Invalid recognition 'config': bad encoding.. The documentation says: "Your audio data might not be encoded correctly or is encoded with a codec different than what you've declared in the RecognitionConfig." Can encoding and sampleRateHertz be left blank so that Google Speech-to-Text works them out itself?

Any suggestions?

【Comments】:

    Tags: google-cloud-functions firebase-storage google-cloud-speech


    【Solution 1】:

    The problem was that the code Google provides doesn't catch errors. When I refactored the code to use promises instead of await, I got an error message.

    exports.Google_Speech_to_Text = functions.firestore.document('Users/{userID}/Pronunciation_Test/downloadURL').onUpdate((change, context) => {
            // Imports the Google Cloud client library
            const speech = require('@google-cloud/speech');
    
            // Creates a client
            const client = new speech.SpeechClient();
    
            const gcsUri = 'gs://my-app.appspot.com/my-file';
            const encoding = 'Opus';
            const sampleRateHertz = 48000;
            const languageCode = 'en-US';
    
            const config = {
              encoding: encoding,
              sampleRateHertz: sampleRateHertz,
              languageCode: languageCode,
            };
            const audio = {
              uri: gcsUri,
            };
    
            const request = {
              config: config,
              audio: audio,
            };
    
            // Detects speech in the audio file
            // Return the promise so Cloud Functions waits for the request to finish
            return client.recognize(request)
            .then(function(response) {
              console.log(response);    
            })
            .catch((err) => { console.error(err); });
        });
    

    The error is INVALID_ARGUMENT: Invalid recognition 'config': bad encoding.. In other words, the audio encoder is not Opus.
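A side note on why the async/await version in the question logged nothing at all: its handler returned the `async function` itself rather than calling it, so the body, including `client.recognize`, never ran. The sketch below reproduces that behaviour with a hypothetical `failingRecognize` stub (not part of any library) that rejects the way the real call did.

```javascript
// Hypothetical stub standing in for client.recognize(); it always rejects.
const failingRecognize = async () => {
  throw new Error("INVALID_ARGUMENT: Invalid recognition 'config': bad encoding..");
};

let calls = 0;

// Pattern from the question: the handler returns a function object.
// Nothing invokes it, so the body never executes.
function handlerNotInvoked() {
  return async function syncRecognizeGCS() {
    calls += 1;
    await failingRecognize();
  };
}

// Variant that actually runs the work and surfaces the rejection.
async function handlerInvoked() {
  calls += 1;
  try {
    await failingRecognize();
    return 'ok';
  } catch (err) {
    console.error(err.message);
    return err.message;
  }
}

const returned = handlerNotInvoked();
console.log(typeof returned, calls); // a function came back; no work was started

handlerInvoked().then(result => console.log(result, calls));
```

With the async function invoked and the call wrapped in try/catch, the await style surfaces the same error as the promise version.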

    Removing the line const encoding = 'Opus'; results in the error message encoding is not defined.

    Using const encoding = ''; results in the error message INVALID_ARGUMENT: Invalid recognition 'config': bad encoding..
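The bad encoding message is consistent with `'Opus'` and `''` not being members of the API's `AudioEncoding` enum. A minimal sketch of a local pre-flight check follows. `KNOWN_ENCODINGS` and `checkEncoding` are hypothetical names, and the list reflects the v1 `RecognitionConfig.AudioEncoding` reference as recalled here, so it may be incomplete or out of date and should be checked against the current documentation. That reference also describes `encoding` and `sampleRateHertz` as optional only for FLAC and WAV files, whose headers carry the information.

```javascript
// Assumption: enum names as listed in the v1 RecognitionConfig.AudioEncoding
// reference; the list may be incomplete or out of date.
const KNOWN_ENCODINGS = new Set([
  'ENCODING_UNSPECIFIED', 'LINEAR16', 'FLAC', 'MULAW', 'AMR', 'AMR_WB',
  'OGG_OPUS', 'SPEEX_WITH_HEADER_BYTE', 'WEBM_OPUS',
]);

// Hypothetical helper: fail fast locally instead of waiting for the API error.
function checkEncoding(encoding) {
  if (!KNOWN_ENCODINGS.has(encoding)) {
    throw new Error(`Unknown encoding "${encoding}"; expected one of: ` +
      [...KNOWN_ENCODINGS].join(', '));
  }
  return encoding;
}

for (const candidate of ['Opus', '', 'OGG_OPUS']) {
  try {
    console.log('accepted:', checkEncoding(candidate));
  } catch (err) {
    console.log('rejected:', JSON.stringify(candidate));
  }
}
```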

    I need to figure out which audio encoder Chrome uses now. It's too bad Google Speech-to-Text can't work this out by itself.
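In the browser, the `mimeType` property of a `MediaRecorder` instance reports the container and codec actually in use, for example `audio/webm;codecs=opus`. The sketch below, under that assumption, maps such a string to a candidate Speech-to-Text encoding name. `guessSpeechEncoding` and its mapping table are hypothetical; the enum names `WEBM_OPUS` and `OGG_OPUS` are taken from the `RecognitionConfig.AudioEncoding` reference and should be verified against the API version in use.

```javascript
// Hypothetical mapping from "container/codec" to an AudioEncoding name.
const ENCODING_BY_FORMAT = {
  'webm/opus': 'WEBM_OPUS',
  'ogg/opus': 'OGG_OPUS',
};

function guessSpeechEncoding(mimeType) {
  // Split "audio/webm;codecs=opus" into the media type and its parameters.
  const [type, ...params] = mimeType.toLowerCase().split(';').map(s => s.trim());
  const container = type.split('/')[1] || '';
  const codecParam = params.find(p => p.startsWith('codecs=')) || '';
  const codec = codecParam.slice('codecs='.length).replace(/"/g, '');
  // Unknown combinations yield null so the caller can decide what to do.
  return ENCODING_BY_FORMAT[`${container}/${codec}`] || null;
}

console.log(guessSpeechEncoding('audio/webm;codecs=opus'));
console.log(guessSpeechEncoding('audio/mpeg'));
```

On the client, the string would come from something like `recorder.mimeType` after constructing the recorder, and could be stored alongside the uploaded file so the Cloud Function does not have to guess.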
