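【Posted】: 2020-03-17 21:36:20
【Question】:
I'm trying to use Firebase Cloud Functions to send a small audio file (a few seconds long) from Firebase Storage to Google Cloud Speech-to-Text. The documentation says to use this synchronous code for short audio files: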
// Imports the Google Cloud client library
const speech = require('@google-cloud/speech');
// Creates a client
const client = new speech.SpeechClient();
/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// const gcsUri = 'gs://my-bucket/audio.raw';
// const encoding = 'Encoding of the audio file, e.g. LINEAR16';
// const sampleRateHertz = 16000;
// const languageCode = 'BCP-47 language code, e.g. en-US';
const config = {
  encoding: encoding,
  sampleRateHertz: sampleRateHertz,
  languageCode: languageCode,
};
const audio = {
  uri: gcsUri,
};
const request = {
  config: config,
  audio: audio,
};
// Detects speech in the audio file
const [response] = await client.recognize(request);
const transcription = response.results
  .map(result => result.alternatives[0].transcript)
  .join('\n');
console.log(`Transcription: `, transcription);
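This code doesn't run as written because it uses await outside of an async function.
Another problem with this code is that it doesn't catch errors. After fixing both problems and putting the code inside a Firebase Cloud Functions trigger, I have the following: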
exports.Google_Speech_to_Text = functions.firestore.document('Users/{userID}/Pronunciation_Test/downloadURL').onUpdate((change, context) => {
  return async function syncRecognizeGCS() {
    // [START speech_transcribe_sync_gcs]
    // Imports the Google Cloud client library
    const speech = require('@google-cloud/speech');
    // Creates a client
    const client = new speech.SpeechClient();
    const gcsUri = 'gs://my-app.appspot.com/my-file';
    const encoding = 'Opus';
    const sampleRateHertz = 48000;
    const languageCode = 'en-US';
    const config = {
      encoding: encoding,
      sampleRateHertz: sampleRateHertz,
      languageCode: languageCode,
    };
    const audio = {
      uri: gcsUri,
    };
    const request = {
      config: config,
      audio: audio,
    };
    // Detects speech in the audio file
    const [response] = await client.recognize(request)
      .catch((err) => { console.error(err); });
    const transcription = response.results
      .map(result => result.alternatives[0].transcript)
      .join('\n');
    console.log(`Transcription: `, transcription);
    // [END speech_transcribe_sync_gcs]
  }
}); // close Google_Speech_to_Text
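One thing worth checking in the error handling above: a `.catch()` that only logs resolves to `undefined`, so the `const [response] = await ...` destructuring would then throw a `TypeError` of its own. The sketch below illustrates this with a hypothetical `failingRecognize` stub standing in for `client.recognize` (the stub and both function names are assumptions for illustration, not part of the Speech library), and contrasts it with a plain `try`/`catch`:

```javascript
// Hypothetical stand-in for client.recognize() that always rejects.
function failingRecognize() {
  return Promise.reject(new Error('INVALID_ARGUMENT'));
}

// Pattern from the trigger: the logging-only .catch() resolves to undefined,
// so the array destructuring throws "undefined is not iterable".
async function withChainedCatch() {
  const [response] = await failingRecognize().catch(err => {
    console.error('chained catch:', err.message);
  });
  return response;
}

// Alternative: wrap the await in try/catch and return an explicit fallback.
async function withTryCatch() {
  try {
    const [response] = await failingRecognize();
    return response;
  } catch (err) {
    console.error('try/catch:', err.message);
    return null;
  }
}

withChainedCatch().catch(err => console.log(err instanceof TypeError)); // true
withTryCatch().then(value => console.log(value)); // null
```

With `try`/`catch`, the original API error is the only one raised, and the code after the `await` never runs on a missing response.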
There are no error messages. I don't see anything wrong with the file in Storage.
I tried a different file, this time an mp3. The result was the same, except that the function executed in 17 ms because the file is smaller.
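A possible explanation for the very short run time and the missing logs, offered as a guess rather than a confirmed diagnosis: the trigger callback returns the function `syncRecognizeGCS` itself instead of calling it, so its body may never execute. The minimal sketch below reproduces only that shape; the two handler names are hypothetical and no Speech or Firebase code is involved:

```javascript
// Shaped like the trigger: returns the async function object without calling it.
function handlerReturningFunction() {
  return async function syncRecognizeGCS() {
    return 'transcribed';
  };
}

// Same inner function, but it is invoked and its promise is returned,
// which gives the caller something to wait on.
function handlerReturningPromise() {
  async function syncRecognizeGCS() {
    return 'transcribed';
  }
  return syncRecognizeGCS();
}

console.log(typeof handlerReturningFunction()); // "function"
console.log(handlerReturningPromise() instanceof Promise); // true
```

If this is the cause, declaring the `onUpdate` callback itself as `async` and awaiting the recognize call inside it would be one way to restructure the trigger.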
I can't determine which audio encoding and sample rate mediaDevices.getUserMedia() uses in Chrome. This blog post says the audio encoding is Opus and the sample rate is 48000 Hz. Sometimes I get the error INVALID_ARGUMENT: Invalid recognition 'config': bad encoding.. The documentation says: "Your audio data might not be encoded correctly or is encoded with a codec different than what you've declared in the RecognitionConfig." Can I leave encoding and sampleRateHertz unset and let Google Speech-to-Text work them out?
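As far as I can tell from the RecognitionConfig reference, `encoding` and `sampleRateHertz` are optional only for FLAC and WAV files, whose headers carry that information, and the encoding must be one of the enum names. `Opus` is not one of them; the closest name I know of is `OGG_OPUS`. Whether a Chrome recording, which may use a WebM rather than an Ogg container, is accepted under that name is something I have not verified. The sketch below uses a hypothetical `buildConfig` helper and a list of v1 encoding names from memory (an assumption to check against the reference) to reject an unknown name before any request is sent:

```javascript
// AudioEncoding names as I understand the v1 API; this list is an assumption
// and should be verified against the current reference.
const KNOWN_ENCODINGS = new Set([
  'ENCODING_UNSPECIFIED',
  'LINEAR16',
  'FLAC',
  'MULAW',
  'AMR',
  'AMR_WB',
  'OGG_OPUS',
  'SPEEX_WITH_HEADER_BYTE',
]);

// Hypothetical helper: builds a config object and fails fast on unknown names.
function buildConfig(encoding, sampleRateHertz, languageCode) {
  if (!KNOWN_ENCODINGS.has(encoding)) {
    throw new Error(`Unknown encoding name: ${encoding}`);
  }
  return { encoding, sampleRateHertz, languageCode };
}

console.log(buildConfig('OGG_OPUS', 48000, 'en-US'));
try {
  buildConfig('Opus', 48000, 'en-US');
} catch (err) {
  console.error(err.message); // Unknown encoding name: Opus
}
```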
Any suggestions?
【Comments】:
Tags: google-cloud-functions firebase-storage google-cloud-speech