Posted: 2020-05-14 05:49:57
Question:
I'm still new to web development. I'm building a chatbot, and I want to run its responses through Google's Text-to-Speech and then play the sound on the client. So the client sends a message to the server -> the server creates a response -> the server sends it to Google -> gets the audio data back -> sends it to the client -> the client plays it. I've made it all the way to the last step, but now I'm out of my depth.
I've done some googling, and there seems to be plenty of information about playing audio from binary data, audio contexts, and so on. I wrote a function, but it doesn't work. Here's what I have:
export const SendMessage: Client.Common.Footer.API.SendMessage = async message => {
    const baseRoute = process.env.REACT_APP_BASE_ROUTE;
    const port = process.env.REACT_APP_SERVER_PORT;
    const audioContext = new AudioContext();
    let audio: any;
    const url = baseRoute + ":" + port + "/ChatBot";
    console.log("%c Sending post request...", "background: #1fa67f; color: white", url, JSON.stringify(message));
    let responseJson = await fetch(url, {
        method: "POST",
        mode: "cors",
        headers: {
            Accept: "application/json",
            "Content-Type": "application/json"
        },
        body: JSON.stringify(message)
    });
    let response = await responseJson.json();
    await audioContext.decodeAudioData(
        new ArrayBuffer(response.data.audio.data),
        buffer => {
            audio = buffer;
        },
        error => console.log("===ERROR===\n", error)
    );
    const source = audioContext.createBufferSource();
    source.buffer = audio;
    source.connect(audioContext.destination);
    source.start(0);
    console.log("%c Post response:", "background: #1fa67f; color: white", url, response);
};
This function sends the message to the server and gets back a response message plus the audio data. My response.data.audio.data does contain some kind of binary data, but I get an error saying the audio data can't be decoded (the error callback of decodeAudioData is firing). I know the data is valid, because on my server I turn it into an mp3 file that plays fine, using this code:
const writeFile = util.promisify(fs.writeFile);
await writeFile("output/TTS.mp3", response.audioContent, "binary");
I have very little idea how binary data is handled here or what could be going wrong. Do I need to specify more parameters to decode the binary data correctly? How would I find out which ones? I'd like to understand what is actually happening here, not just copy-paste a solution.
Edit:
So it seems the array buffer isn't being created correctly. If I run this code:
console.log(response);
const audioBuffer = new ArrayBuffer(response.data.audio.data);
console.log("===audioBuffer===", audioBuffer);
audio = await audioContext.decodeAudioData(audioBuffer);
the response looks like this:
{message: "Message successfully sent.", status: 1, data: {…}}
  message: "Message successfully sent."
  status: 1
  data:
    message: "Sorry, I didn't understand your question, try rephrasing."
    audio:
      type: "Buffer"
      data: Array(14304)
        [0 … 9999]
        [10000 … 14303]
        length: 14304
        __proto__: Array(0)
      __proto__: Object
    __proto__: Object
  __proto__: Object
but the buffer logs as:
===audioBuffer===
ArrayBuffer(0) {}
  [[Int8Array]]: Int8Array []
  [[Uint8Array]]: Uint8Array []
  [[Int16Array]]: Int16Array []
  [[Int32Array]]: Int32Array []
  byteLength: 0
  __proto__: ArrayBuffer
Apparently JS doesn't understand the format in my response object, but that's what I get back from Google's Text-to-Speech API. Maybe I'm sending it wrong from my server? As I said before, on my server the following code turns that array into an mp3 file:
const writeFile = util.promisify(fs.writeFile);
await writeFile("output/TTS.mp3", response.audioContent, "binary");
return response.audioContent;
where response.audioContent is also what gets sent to the client, like this:
// in index.ts
...
const app = express();
app.use(bodyParser.json());
app.use(cors(corsOptions));
app.post("/TextToSpeech", TextToSpeechController);
...

// textToSpeech.ts
export const TextToSpeechController = async (req: Req<Server.API.TextToSpeech.RequestQuery>, res: Response) => {
    let response: Server.API.TextToSpeech.ResponseBody = {
        message: null,
        status: CONSTANTS.STATUS.ERROR,
        data: undefined
    };
    try {
        console.log("===req.body===", req.body);
        if (!req.body) throw new Error("No message received");
        const audio = await TextToSpeech({ message: req.body.message });
        response = {
            message: "Audio file successfully created!",
            status: CONSTANTS.STATUS.SUCCESS,
            data: audio
        };
        res.send(response);
    } catch (error) {
        response = {
            message: "Error converting text to speech: " + error.message,
            status: CONSTANTS.STATUS.ERROR,
            data: undefined
        };
        res.json(response);
    }
};
...
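For what it's worth, one alternative I've seen suggested (not what my code above does) is to encode the audio as base64 before putting it in the JSON body, and decode it back to bytes on the client; a sketch under that assumption:

```javascript
// Server side: base64 survives JSON as an ordinary string.
const audioContent = Buffer.from([0xff, 0xf3, 0x44, 0xc4]); // pretend TTS output
const body = JSON.stringify({ audio: audioContent.toString("base64") });

// Client side: turn the base64 string back into raw bytes for decodeAudioData.
// atob is a browser global (and a Node global since v16).
const raw = atob(JSON.parse(body).audio);
const bytes = Uint8Array.from(raw, ch => ch.charCodeAt(0));
console.log(bytes.length); // 4
console.log(bytes[0]);     // 255
```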
What I find strange is that on my server, response.audioContent logs as:
===response.audioContent=== <Buffer ff f3 44 c4 00 00 00 03 48 01 40 00 00 f0
a3 0f fc 1a 00 11 e1 48 7f e0 e0 87 fc b8 88 40 1c 7f e0 4c 03 c1 d9 ef ff ec
3e 4c 02 c7 88 7f ff f9 ff ff ... >
but on the client it is:
audio:
  type: "Buffer"
  data: Array(14304)
    [0 … 9999]
    [10000 … 14303]
    length: 14304
    __proto__: Array(0)
  __proto__: Object
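That {type: "Buffer", data: [...]} shape is exactly what a Node.js Buffer turns into when it goes through JSON, which is what res.send()/res.json() do to the response body. A minimal sketch of the round trip (assuming a Node environment):

```javascript
// A Node Buffer serialized through JSON becomes { type: "Buffer", data: [...] }.
const buf = Buffer.from([0xff, 0xf3, 0x44, 0xc4]); // first bytes from the server log above
const wire = JSON.stringify({ audio: buf });       // what res.send(response) effectively does
const parsed = JSON.parse(wire);
console.log(parsed.audio.type); // "Buffer"
console.log(parsed.audio.data); // [255, 243, 68, 196] -- a plain number array, no longer a Buffer
```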
I've tried passing response.data, response.data.audio, and response.data.audio.data to new ArrayBuffer(), but they all result in the same empty buffer.
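If I read MDN correctly, that emptiness follows from the ArrayBuffer constructor itself: it takes a byte length, not contents, so passing an array coerces to NaN and then to 0. Wrapping the number array in a Uint8Array instead does copy the bytes (a sketch with a four-byte stand-in for the real 14304-element array):

```javascript
const bytes = [255, 243, 68, 196];          // stand-in for response.data.audio.data
const wrong = new ArrayBuffer(bytes);       // the array is coerced to a length; NaN becomes 0
console.log(wrong.byteLength);              // 0 -- the empty buffer from the log above
const right = new Uint8Array(bytes).buffer; // copies the bytes into a fresh ArrayBuffer
console.log(right.byteLength);              // 4
console.log(new Uint8Array(right)[0]);      // 255
```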
Discussion:
Tags: javascript audio