How can I use the Google Text-to-Speech API in Windows Forms?

【Question title】: How can I use google text to speech api in windows form?
【Posted】: 2012-03-03 20:29:24
【Question】:

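I want to use Google text-to-speech in my Windows Forms application to read a label aloud. I have added a reference to System.Speech. How can it read the label from a button click event? http://translate.google.com/translate_tts?q=testing+google+speech is the Google text-to-speech API. Alternatively, how can I use Microsoft's native text-to-speech?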

【Comments】:

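You have to decide which company's text-to-speech API you want to use. The link you provided is dead; you will have a better chance with the System.Speech.Synthesis.SpeechSynthesizer class. Use its SpeakAsync() method to get the most bang for the buck in the .NET world, rather than in the "tomorrow will be better/different" internet world.
@HansPassant - The link works for me. I wonder why it doesn't work for you.
Hmm, I wonder why too. Having zero ways to debug it is reason enough for me.
@HansPassant - Point taken. It did end up being a really great Saturday afternoon project, though.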

Tags: c# winforms desktop-application

【Solution 1】:

Update: Google's TTS API is no longer publicly available. The notes at the bottom about Microsoft's TTS are still relevant and provide equivalent functionality.

You can use Google's TTS API from a WinForms application by playing the response with a variation of the answer to this question (it took me a while, but I have a real solution):

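using System;
using System.IO;
using System.Linq;
using System.Net;
using System.Threading;
using System.Web;            // HttpUtility; requires a reference to System.Web
using System.Windows.Forms;
using NAudio.Wave;           // requires the NAudio library

public partial class Form1 : Form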
{
    public Form1()
    {
        InitializeComponent();
        this.FormClosing += (sender, e) =>
            {
                if (waiting)
                    stop.Set();
            };
    }

    private void ButtonClick(object sender, EventArgs e)
    {
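        var clicked = sender as Button;
        if (clicked == null || clicked.Tag == null)
            return;

        // The button's Tag holds the name of the label it should read aloud.
        var relatedLabel = this.Controls.Find(clicked.Tag.ToString(), true).FirstOrDefault() as Label;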

        if (relatedLabel == null)
            return;

        // Build the URL on the UI thread so the worker thread never touches the control.
        var url = "http://translate.google.com/translate_tts?q=" + HttpUtility.UrlEncode(relatedLabel.Text);
        var playThread = new Thread(() => PlayMp3FromUrl(url));
        playThread.IsBackground = true;
        playThread.Start();
    }

    // Written by the playback thread and read by the UI thread, hence volatile.
    volatile bool waiting = false;
    AutoResetEvent stop = new AutoResetEvent(false);
    public void PlayMp3FromUrl(string url)
    {
        using (Stream ms = new MemoryStream())
        {
            using (Stream stream = WebRequest.Create(url)
                .GetResponse().GetResponseStream())
            {
                byte[] buffer = new byte[32768];
                int read;
                while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    ms.Write(buffer, 0, read);
                }
            }

            ms.Position = 0;
            using (WaveStream blockAlignedStream =
                new BlockAlignReductionStream(
                    WaveFormatConversionStream.CreatePcmStream(
                        new Mp3FileReader(ms))))
            {
                using (WaveOut waveOut = new WaveOut(WaveCallbackInfo.FunctionCallback()))
                {
                    waveOut.Init(blockAlignedStream);
                    waveOut.PlaybackStopped += (sender, e) =>
                    {
                        waveOut.Stop();
                    };

                    waveOut.Play();
                    waiting = true;
                    stop.WaitOne(10000);
                    waiting = false;
                }
            }
        }
    }
}

Note: the code above requires NAudio (free/open source) and using statements for System.Web, System.Threading, and NAudio.Wave.

My Form1 has 2 controls:

A Label named label1
A Button named button1 whose Tag is label1 (used to bind the button to its label)

If you use a separate event handler for each button/label pair, the code above can be simplified slightly (untested):

    private void ButtonClick(object sender, EventArgs e)
    {
        // Build the URL on the UI thread so the worker thread never touches label1.
        var url = "http://translate.google.com/translate_tts?q=" + HttpUtility.UrlEncode(label1.Text);
        var playThread = new Thread(() => PlayMp3FromUrl(url));
        playThread.IsBackground = true;
        playThread.Start();
    }

This solution does have problems, though (this list is probably incomplete; I'm sure comments and real-world use will turn up others):

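Note the stop.WaitOne(10000); in the first code snippet. The 10000 means that at most 10 seconds of audio will be played, so you will need to adjust it if your label takes longer to read. This is necessary because the current version of NAudio (v1.5.4.0) seems to have trouble determining when a stream has finished playing. It may be fixed in a later version, or there may be a workaround that I didn't take the time to find. One temporary workaround is to use ParameterizedThreadStart, which takes the timeout as a parameter to the thread. That allows a variable timeout, but it doesn't technically fix the problem.
More importantly, the Google TTS API is unofficial (meaning it is not meant to be consumed by non-Google applications) and may change at any time without notice. If you need something that can be used in a commercial setting, I suggest either the MS TTS solution (as your question suggests) or one of the many commercial alternatives. None of those is quite this simple, however.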
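The ParameterizedThreadStart workaround mentioned in the first point could look roughly like the sketch below. It is a standalone console program rather than the form code: PlayRequest, PlayWithTimeout and EstimateTimeoutMs are hypothetical names introduced only for illustration, the per-character timeout heuristic is an assumption that has not been tuned against real audio, and the actual download and NAudio playback are replaced by a placeholder.

```csharp
using System;
using System.Threading;

// Hypothetical container for the two values the worker thread needs.
sealed class PlayRequest
{
    public string Url;
    public int TimeoutMs;
}

static class Program
{
    static readonly AutoResetEvent stop = new AutoResetEvent(false);

    // Assumed heuristic: 2 seconds of slack plus 100 ms per character of text.
    internal static int EstimateTimeoutMs(string text)
    {
        return 2000 + 100 * text.Length;
    }

    // Matches the ParameterizedThreadStart signature: void Method(object).
    static void PlayWithTimeout(object state)
    {
        var request = (PlayRequest)state;

        // Placeholder: in the form, the download and waveOut.Play() calls
        // from PlayMp3FromUrl would go here.
        Console.WriteLine("Playing " + request.Url);

        // Wait for the caller-supplied timeout instead of a fixed 10 seconds.
        stop.WaitOne(request.TimeoutMs);
    }

    static void Main()
    {
        string text = "testing google speech";
        var request = new PlayRequest
        {
            Url = "http://translate.google.com/translate_tts?q=" + Uri.EscapeDataString(text),
            TimeoutMs = EstimateTimeoutMs(text)
        };

        var playThread = new Thread(new ParameterizedThreadStart(PlayWithTimeout));
        playThread.IsBackground = true;
        playThread.Start(request);   // the argument arrives as 'state' in PlayWithTimeout
        playThread.Join();           // only so this console demo does not exit early
    }
}
```

As the answer says, this only makes the timeout variable; it does not detect when playback has actually finished.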

To answer the other side of your question:

The System.Speech.Synthesis.SpeechSynthesizer class is much easier to use, and you can count on it to work reliably (whereas the Google API could be gone tomorrow).

It really is as simple as adding a reference to System.Speech and:

public void SaySomething(string somethingToSay)
{
    var synth = new System.Speech.Synthesis.SpeechSynthesizer();

    synth.SpeakAsync(somethingToSay);
}

This just works.
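Since the question asks specifically about reading a label from a button click, here is a minimal sketch of how the Microsoft synthesizer might be wired into a form. The class name SpeakLabelForm, the control layout and the sample label text are assumptions made for illustration; the synthesizer is held in a field so that it stays alive while speaking asynchronously and can be disposed with the form.

```csharp
using System;
using System.Speech.Synthesis;   // requires a reference to System.Speech
using System.Windows.Forms;

public class SpeakLabelForm : Form
{
    private readonly Label label1 = new Label { Text = "testing speech", AutoSize = true, Left = 10, Top = 10 };
    private readonly Button button1 = new Button { Text = "Read", Left = 10, Top = 40 };

    // Kept as a field so the synthesizer outlives the click handler while it speaks.
    private readonly SpeechSynthesizer synth = new SpeechSynthesizer();

    public SpeakLabelForm()
    {
        Controls.Add(label1);
        Controls.Add(button1);
        synth.SetOutputToDefaultAudioDevice();

        button1.Click += (sender, e) =>
        {
            // Drop anything still queued so repeated clicks do not pile up.
            synth.SpeakAsyncCancelAll();
            // SpeakAsync returns immediately, so the UI thread is not blocked.
            synth.SpeakAsync(label1.Text);
        };
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
            synth.Dispose();
        base.Dispose(disposing);
    }

    [STAThread]
    static void Main()
    {
        Application.EnableVisualStyles();
        Application.Run(new SpeakLabelForm());
    }
}
```

In a designer-generated Form1, the same two calls would go in the existing button1_Click handler, with the synthesizer declared as a field of the form.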

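Trying to use the Google TTS API was a fun experiment, but I would be hard pressed to recommend it for production use. If you don't want to pay for a commercial alternative, Microsoft's solution is about as good as it gets.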

【Comments】:

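How do I use it in a button click event?
@user1136403 - See my update for how to use it in a button click event.
@user1136403 - I have updated the code in the answer to reflect the changes that testing showed were needed.
@user1136403 - I completely revised my answer to reflect fairly in-depth testing and to include a list of problems/concerns. I spent a fair amount of time on this because it sounded interesting, so please provide feedback when time permits.
System.Speech is simpler, but it doesn't support multiple languages.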

【Solution 2】:

I know this question is a bit dated, but Google has recently released the Google Cloud Text-to-Speech API.

A .NET client for Google.Cloud.TextToSpeech can be found here: https://github.com/jhabjan/Google.Cloud.TextToSpeech.V1

Here is a short example of how to use the client:

GoogleCredential credentials =
    GoogleCredential.FromFile(Path.Combine(Program.AppPath, "jhabjan-test-47a56894d458.json"));

TextToSpeechClient client = TextToSpeechClient.Create(credentials);

SynthesizeSpeechResponse response = client.SynthesizeSpeech(
    new SynthesisInput()
    {
        Text = "Google Cloud Text-to-Speech enables developers to synthesize natural-sounding speech with 32 voices"
    },
    new VoiceSelectionParams()
    {
        LanguageCode = "en-US",
        Name = "en-US-Wavenet-C"
    },
    new AudioConfig()
    {
        AudioEncoding = AudioEncoding.Mp3
    }
);

string speechFile = Path.Combine(Directory.GetCurrentDirectory(), "sample.mp3");

File.WriteAllBytes(speechFile, response.AudioContent);

【Comments】: