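Using System.Speech to convert an mp3 file to text: answers

【Question title】: Using System.Speech to convert mp3 file to text
【Posted】: 2013-07-27 08:41:27
【Question】:

I am trying to use speech recognition in .NET to recognize the speech of a podcast in an mp3 file and get the result as a string. All the examples I have seen use a microphone, but I do not want to use a microphone; I want to supply a sample mp3 file as my audio source. Can anyone point me to a resource or post an example?

Edit -

I converted the audio file to a wav file and tried this code on it, but it only extracts the first 68 words.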

public class MyRecognizer {
    public string ReadAudio() {
        SpeechRecognitionEngine sre = new SpeechRecognitionEngine();
        Grammar gr = new DictationGrammar();
        sre.LoadGrammar(gr);
        sre.SetInputToWaveFile("C:\\Users\\Soham Dasgupta\\Downloads\\Podcasts\\Engadget_Podcast_353.wav");
        sre.BabbleTimeout = new TimeSpan(Int32.MaxValue);
        sre.InitialSilenceTimeout = new TimeSpan(Int32.MaxValue);
        sre.EndSilenceTimeout = new TimeSpan(100000000);
        sre.EndSilenceTimeoutAmbiguous = new TimeSpan(100000000);
        RecognitionResult result = sre.Recognize(new TimeSpan(Int32.MaxValue));
        return result.Text;
    }
}
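
A side note on the timeout values in the code above: the TimeSpan(long) constructor counts 100-nanosecond ticks, so Int32.MaxValue is a finite interval of a few minutes, not an effectively unlimited timeout. The minimal sketch below prints what the two values amount to; the class name TimeoutTicksDemo is made up for illustration and is not part of the original post.

```csharp
using System;

public static class TimeoutTicksDemo
{
    public static void Main()
    {
        // TimeSpan(long) counts ticks, and one tick is 100 nanoseconds.
        TimeSpan fromMaxInt = new TimeSpan(Int32.MaxValue);
        TimeSpan fromLiteral = new TimeSpan(100000000);

        Console.WriteLine(fromMaxInt.TotalSeconds);  // roughly 214.7 seconds
        Console.WriteLine(fromLiteral.TotalSeconds); // 10 seconds

        // The factory methods state the unit explicitly.
        TimeSpan tenSeconds = TimeSpan.FromSeconds(10);
        Console.WriteLine(tenSeconds == fromLiteral); // True
    }
}
```

Writing the timeouts with TimeSpan.FromSeconds or TimeSpan.FromMinutes makes the intended duration visible at the call site.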

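【Question comments】:

Tags: c# .net speech-recognition speech-to-text

【Solution 1】:

Try calling Recognize() in a loop. Each call performs a single recognition operation, so repeat it until it returns null.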

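using System;
using System.Speech.Recognition;
using System.Text;

public class MyRecognizer
{
    public string ReadAudio()
    {
        // SpeechRecognitionEngine is IDisposable, so release it when done.
        using (SpeechRecognitionEngine sre = new SpeechRecognitionEngine())
        {
            Grammar gr = new DictationGrammar();
            sre.LoadGrammar(gr);
            sre.SetInputToWaveFile("C:\\Users\\Soham Dasgupta\\Downloads\\Podcasts\\Engadget_Podcast_353.wav");
            sre.BabbleTimeout = new TimeSpan(Int32.MaxValue);
            sre.InitialSilenceTimeout = new TimeSpan(Int32.MaxValue);
            sre.EndSilenceTimeout = new TimeSpan(100000000);
            sre.EndSilenceTimeoutAmbiguous = new TimeSpan(100000000);

            StringBuilder sb = new StringBuilder();
            while (true)
            {
                try
                {
                    // Each call recognizes one phrase; null means nothing more was recognized.
                    RecognitionResult recText = sre.Recognize();
                    if (recText == null)
                    {
                        break;
                    }

                    // Separate phrases with a space so their words do not run together.
                    sb.Append(recText.Text).Append(' ');
                }
                catch (Exception ex)
                {
                    //handle exception
                    //...

                    break;
                }
            }
            return sb.ToString().TrimEnd();
        }
    }
}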

If you have a Windows Forms or WPF application, run this code on a separate thread; otherwise it will block the UI thread.
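
One way to keep the work off the UI thread is Task.Run with async/await (this assumes .NET 4.5 or later, which the answer does not state). The sketch below is self-contained, so it uses a hypothetical stand-in, TranscribeBlocking, in place of the recognition loop; the class and method names are illustrative only.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BackgroundTranscriptionDemo
{
    // Hypothetical stand-in for the blocking recognition loop; returns a fixed string.
    private static string TranscribeBlocking()
    {
        Thread.Sleep(500); // simulate long-running recognition
        return "transcribed text";
    }

    // Task.Run queues the blocking call on the thread pool and returns immediately.
    public static async Task<string> TranscribeInBackgroundAsync()
    {
        return await Task.Run(() => TranscribeBlocking());
    }

    public static void Main()
    {
        Task<string> pending = TranscribeInBackgroundAsync();
        Console.WriteLine("Caller is free while recognition runs.");

        // Blocking on Result is acceptable in a console demo only.
        Console.WriteLine(pending.Result);
    }
}
```

In a WinForms or WPF event handler, await the task instead of reading Result, since blocking on Result would stall the UI thread again.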

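【Comments】:

Yes, this works. I also edited your answer to add that, if the OP is using WinForms/WPF, he should run the code on a separate thread, otherwise it will block the UI thread.
When I use the code above I get this error: MyProgram.vshost.exe Information: 0 : SAPI does not implement phonetic alphabet selection.
@MicroR - Try setting the culture to your locale: stackoverflow.com/questions/27198683/…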
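
As a rough illustration of the culture suggestion in the last comment, the sketch below lists the installed recognizers and then constructs the engine for an explicit culture. The "en-US" value and the class name RecognizerCultureDemo are assumptions for the example, and whether this removes the message quoted above is not confirmed by the thread.

```csharp
using System;
using System.Globalization;
using System.Speech.Recognition;

public static class RecognizerCultureDemo
{
    public static void Main()
    {
        // List the recognizers installed on this machine and their cultures.
        foreach (RecognizerInfo info in SpeechRecognitionEngine.InstalledRecognizers())
        {
            Console.WriteLine("{0}: {1}", info.Culture.Name, info.Name);
        }

        // "en-US" is an assumed culture; substitute one printed by the loop above.
        // The constructor throws if no installed recognizer supports the culture.
        using (SpeechRecognitionEngine sre = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            sre.LoadGrammar(new DictationGrammar());
            Console.WriteLine(sre.RecognizerInfo.Culture.Name);
        }
    }
}
```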

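【Solution 2】:

I would start by looking at the method documented here: http://msdn.microsoft.com/en-us/library/system.speech.recognition.speechrecognitionengine.setinputtowavefile.aspx

I think you should be able to work it out from there.

【Comments】:

An MP3 file is not a Wave (.wav) file, and SetInputToWaveFile() only works with Wave files, so your solution will not work.
@Soham: Why should I read my own article? Did I write something wrong in it?
I said that I read your article. It is good. But can you offer any solution to my problem?
@Soham: I first read it as "Read your article", then realized you meant "I read your article"; the "I" was missing. Unfortunately, I have not found a solution to your problem. I only found solutions for converting .wav files to text.
I have already converted my audio file to wav and tried to extract some text from it. I tried it with an Engadget podcast. The problem, however, is that I cannot transcribe more than 68 words.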
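
On the MP3 point raised in these comments: the thread does not say how the file was converted to wav. One possible approach, assuming the third-party NAudio library is referenced (it is not mentioned anywhere in the thread), is sketched below. The class name Mp3ToWav and both path parameters are hypothetical.

```csharp
using NAudio.Wave;

public static class Mp3ToWav
{
    // Decodes the MP3 at mp3Path and writes it as PCM audio to wavPath.
    public static void Convert(string mp3Path, string wavPath)
    {
        using (Mp3FileReader reader = new Mp3FileReader(mp3Path))
        {
            // CreateWaveFile reads the provider to the end and writes a .wav file.
            WaveFileWriter.CreateWaveFile(wavPath, reader);
        }
    }
}
```

The resulting file can then be passed to SetInputToWaveFile() as in the code from the question.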