Google Dialogflow CX | StreamingDetectIntent doesn't process audio after matching first intent

【Question title】: Google Dialogflow CX | StreamingDetectIntent doesn't process audio after matching first intent
【Posted】: 2021-11-12 09:38:18
【Question description】:

Environment details

OS: Windows 10, Windows 11, Debian 9 (Stretch)
Node.js version: 12.18.3, 12.22.1
npm version: 7.19.0, 7.15.0
@google-cloud/dialogflow-cx version: 2.13.0

Issue

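StreamingDetectIntent doesn't process audio after matching the first intent. I can see the transcription and the first intent is matched, but after that match the audio keeps streaming while I receive no further transcription, and the on('data') callback is never triggered again. In short, nothing happens after the first intent is matched.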

The one thing that resolves it is ending detectStream and then re-initializing it. After that it works as expected.

Steps to reproduce

I have tried both const {SessionsClient} = require("@google-cloud/dialogflow-cx"); and const {SessionsClient} = require("@google-cloud/dialogflow-cx").v3;

// Create a stream for the streaming request.
const detectStream = client
    .streamingDetectIntent()
    .on('error', console.error)
    .on('end', () => {
        // The 'end' event carries no payload; it only signals that the stream has finished.
        console.log('streamingDetectIntent: -----End-----');
    })
    .on('data', data => {
        console.log(`streamingDetectIntent: Data: ----------`);
        if (data.recognitionResult) {
            console.log(`Intermediate Transcript: ${data.recognitionResult.transcript}`);
        } else {
            console.log('Detected Intent:');
            if(!data.detectIntentResponse) return
            const result = data.detectIntentResponse.queryResult;

            console.log(`User Query: ${result.transcript}`);
            for (const message of result.responseMessages) {
                if (message.text) {
                    console.log(`Agent Response: ${message.text.text}`);
                }
            }
            if (result.match.intent) {
                console.log(`Matched Intent: ${result.match.intent.displayName}`);
            }
            console.log(`Current Page: ${result.currentPage.displayName}`);
        }
    });

const initialStreamRequest = {
    session: sessionPath,
    queryInput: {
        audio: {
            config: {
                audioEncoding: encoding,
                sampleRateHertz: sampleRateHertz,
                singleUtterance: true,
            },
        },
        languageCode: languageCode,
    },
};
detectStream.write(initialStreamRequest);

I tried streaming the audio both from a file (.wav) and from the microphone, but the result is the same.

// `pump` is assumed to be util.promisify(pipeline), with pipeline taken from the 'stream' module.
await pump(
    recordingStream, // microphone stream, or fs.createReadStream(audioFileName)
    // Format the audio stream into the request format.
    new Transform({
        objectMode: true,
        transform: (obj, _, next) => {
            next(null, {queryInput: {audio: {audio: obj}}});
        },
    }),
    detectStream
);

I also referred to this implementation and this RPC-based doc, but I couldn't find any reason why this doesn't work.

Thanks!

【Discussion】:

Tags: node.js google-cloud-platform grpc audio-streaming dialogflow-cx

【Solution 1】:

According to the documentation, this appears to be the expected behavior:

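When Dialogflow detects that the audio's voice has stopped or paused, it ceases speech recognition and sends a StreamingDetectIntentResponse with a recognition result of END_OF_SINGLE_UTTERANCE to your client. Any audio sent to Dialogflow on the stream after receipt of END_OF_SINGLE_UTTERANCE is ignored by Dialogflow.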

This appears to be why StreamingDetectIntent doesn't process audio after matching the first intent. According to the same documentation:

After closing a stream, your client should start a new request with a new stream as needed.

You should start another stream. You can also check the other GitHub issue on the same topic.
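
To make "start another stream" concrete, here is a minimal sketch of one way to do it, not an official pattern. The helper names isEndOfSingleUtterance and createRestartingDetector are hypothetical and are not part of @google-cloud/dialogflow-cx. The sketch assumes the Node.js client reports recognitionResult.messageType as the enum's string name (the numeric proto value 2 is accepted as a fallback), and it simply drops audio that arrives while the previous stream is still finishing, which may or may not suit your application.

```javascript
// Returns true for the response that marks the end of the utterance.
// Assumption: the enum surfaces as its string name; 2 is its numeric proto value.
function isEndOfSingleUtterance(response) {
  const result = response && response.recognitionResult;
  if (!result) return false;
  return result.messageType === 'END_OF_SINGLE_UTTERANCE' || result.messageType === 2;
}

// Hypothetical helper: uses one detect stream per utterance.
//   openStream     - function returning a new duplex stream,
//                    e.g. () => client.streamingDetectIntent()
//   initialRequest - the config-only first request (session, audio config, languageCode)
//   onResponse     - called with every response from every stream
function createRestartingDetector({openStream, initialRequest, onResponse}) {
  let active = null;   // stream that currently accepts audio
  let draining = null; // half-closed stream that may still deliver responses

  function release(stream) {
    if (active === stream) active = null;
    if (draining === stream) draining = null;
  }

  function start() {
    const stream = openStream();
    stream.on('error', err => {
      console.error(err);
      release(stream);
    });
    stream.on('end', () => release(stream));
    stream.on('data', response => {
      onResponse(response);
      if (stream === active && isEndOfSingleUtterance(response)) {
        // Audio written to this stream from now on would be ignored,
        // so stop writing to it and half-close it.
        active = null;
        draining = stream;
        stream.end();
      }
    });
    stream.write(initialRequest); // the first message carries only the configuration
    active = stream;
  }

  return {
    // Forwards one raw audio chunk. Returns false if the chunk was dropped
    // because the previous stream has not ended yet.
    writeAudio(chunk) {
      if (draining) return false;
      if (!active) start();
      active.write({queryInput: {audio: {audio: chunk}}});
      return true;
    },
    // Half-closes the stream in use, for example when the microphone stops.
    stop() {
      if (active) {
        const stream = active;
        active = null;
        draining = stream;
        stream.end();
      }
    },
  };
}
```

With the question's setup this would be wired roughly as createRestartingDetector({openStream: () => client.streamingDetectIntent(), initialRequest: initialStreamRequest, onResponse: handleResponse}), where handleResponse is a placeholder for the existing on('data') logic, and detector.writeAudio(chunk) is called from the microphone stream's 'data' handler instead of piping everything into a single detectStream.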
