【Question title】: PocketSphinx for an Android dictation app
【Posted】: 2016-02-01 05:08:58
【Question】:

I am trying to implement a "dictation" feature using PocketSphinx on Android together with one of Keith Vertanen's language models. I have modified the sample so that it looks like this:

private void setupRecognizer(File assetsDir) throws IOException {
    // Configure the acoustic model and dictionary shipped in the app assets.
    recognizer = defaultSetup()
        .setAcousticModel(new File(assetsDir, "en-us-ptm"))
        .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
        .setRawLogDir(assetsDir)
        .setKeywordThreshold(1e-45f)
        .setBoolean("-allphone_ci", true)
        .getRecognizer();
    recognizer.addListener(this);

    // Register the ARPA n-gram model as a named search.
    File ngramModel = new File(assetsDir, "lm_csr_5k_nvp_2gram.arpa");
    recognizer.addNgramSearch(NGRAM_SEARCH, ngramModel);
}

lm_csr_5k_nvp_2gram.arpa comes from the 5K NVP 2-gram download on Keith Vertanen's website.

I get this error:

01-31 18:04:29.861 2837-2863/? I/SpeechRecognizer: Load N-gram model /storage/emulated/0/Android/data/edu.cmu.sphinx.pocketsphinx/files/sync/lm_csr_5k_nvp_2gram.arpa
01-31 18:04:29.861 2837-2863/? I/cmusphinx: INFO: ngram_model_trie.c(399): Trying to read LM in trie binary format
01-31 18:04:29.861 2837-2863/? I/cmusphinx: INFO: ngram_model_trie.c(410): Header doesn't match
01-31 18:04:29.861 2837-2863/? I/cmusphinx: INFO: ngram_model_trie.c(177): Trying to read LM in arpa format
01-31 18:04:29.862 2837-2863/? E/cmusphinx: ERROR: "ngram_model_trie.c", line 103: Bad ngram count
01-31 18:04:29.862 2837-2863/? I/cmusphinx: INFO: ngram_model_trie.c(489): Trying to read LM in DMP format
01-31 18:04:29.862 2837-2863/? E/cmusphinx: ERROR: "ngram_model_trie.c", line 500: Wrong magic header size number a5c6461: /storage/emulated/0/Android/data/edu.cmu.sphinx.pocketsphinx/files/sync/lm_csr_5k_nvp_2gram.arpa is not a dump file
01-31 18:04:29.864 2837-2863/? E/AndroidRuntime: FATAL EXCEPTION: AsyncTask #1
                                                 Process: edu.cmu.sphinx.pocketsphinx, PID: 2837
                                                 java.lang.RuntimeException: An error occurred while executing doInBackground()
                                                     at android.os.AsyncTask$3.done(AsyncTask.java:309)
                                                     at java.util.concurrent.FutureTask.finishCompletion(FutureTask.java:354)
                                                     at java.util.concurrent.FutureTask.setException(FutureTask.java:223)
                                                     at java.util.concurrent.FutureTask.run(FutureTask.java:242)
                                                     at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:234)
                                                     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1113)
                                                     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:588)
                                                     at java.lang.Thread.run(Thread.java:818)
                                                  Caused by: java.lang.RuntimeException: Decoder_setLmFile returned -1
                                                     at edu.cmu.pocketsphinx.PocketSphinxJNI.Decoder_setLmFile(Native Method)
                                                     at edu.cmu.pocketsphinx.Decoder.setLmFile(Decoder.java:172)
                                                     at edu.cmu.pocketsphinx.SpeechRecognizer.addNgramSearch(SpeechRecognizer.java:247)
                                                     at edu.cmu.pocketsphinx.demo.PocketSphinxActivity.setupRecognizer(PocketSphinxActivity.java:161)
                                                     at edu.cmu.pocketsphinx.demo.PocketSphinxActivity.access$000(PocketSphinxActivity.java:50)
                                                     at edu.cmu.pocketsphinx.demo.PocketSphinxActivity$1.doInBackground(PocketSphinxActivity.java:72)
                                                     at edu.cmu.pocketsphinx.demo.PocketSphinxActivity$1.doInBackground(PocketSphinxActivity.java:66)
                                                     at android.os.AsyncTask$2.call(AsyncTask.java:295)
                                                     at java.util.concurrent.FutureTask.run(FutureTask.java:237)
                                                     at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:234) 
                                                     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1113) 
                                                     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:588) 
                                                     at java.lang.Thread.run(Thread.java:818) 

The lines

01-31 18:04:29.861 2837-2863/? I/cmusphinx: INFO: ngram_model_trie.c(177): Trying to read LM in arpa format
01-31 18:04:29.862 2837-2863/? E/cmusphinx: ERROR: "ngram_model_trie.c", line 103: Bad ngram count

make me think that the lm_csr_5k_nvp_2gram.arpa file is malformed in some way. The file looks like this:

\data\
ngram 1=5000
ngram 2=4331397
ngram 3=0

\1-grams:
-2.11154    </s>    0
-99 <s> -3.13167
-0.3954594  <unk>   -0.4365645
-2.271447   a   -2.953606
-3.384721   a.  -1.85196
-5.788997   a.'s    -0.8137056
-4.139672   abandoned   -0.9728376
-3.904189   ability -1.838658
-4.360272   able    -2.161723
...

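At least it looks like the example file here.

My only other idea is that the file extension might be wrong, because of this:

Language models can be stored and loaded in three different formats: text ARPA format, binary BIN format, and binary DMP format. The ARPA format takes more space, but it can be edited. ARPA files have the .lm extension. The binary format takes significantly less space and loads faster. Binary files have the .lm.bin extension. It is also possible to convert between these formats. The DMP format is obsolete and not recommended.

That sounds as if the file should be named lm_csr_5k_nvp_2gram.lm rather than lm_csr_5k_nvp_2gram.arpa. However, I did try renaming the file, and the exception did not change at all.

What is the correct way to do this?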

【Comments】:

    Tags: android speech-recognition pocketsphinx pocketsphinx-android


    【Solution 1】:

    Well, this is a model format issue. This line in the n-gram model causes the problem:

    ngram 3=0
    

    You can either delete the offending line or update pocketsphinx-android-demo; I have just pushed a new version that fixes this.
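For the first option, editing a file with several million bigram lines by hand can be awkward, so here is a minimal Java sketch that copies an ARPA file and skips zero-count entries such as `ngram 3=0` in the `\data\` header. The class and method names (`ArpaHeaderFix`, `dropZeroCountOrders`) are hypothetical and not part of PocketSphinx. The sketch assumes the header layout shown in the question, and it has not been checked against every pocketsphinx version that this change alone is sufficient.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PocketSphinx.
public class ArpaHeaderFix {
    // Matches a header entry that declares zero n-grams, e.g. "ngram 3=0".
    private static final Pattern ZERO_COUNT =
            Pattern.compile("ngram\\s+\\d+\\s*=\\s*0");

    /**
     * Streams ARPA text from in to out, dropping zero-count entries that
     * appear inside the \data\ header. All other lines are copied as-is,
     * with line endings normalized to '\n'.
     */
    public static void dropZeroCountOrders(Reader in, Writer out) {
        try {
            BufferedReader reader = new BufferedReader(in);
            boolean inHeader = false;
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.equals("\\data\\")) {
                    inHeader = true;            // header starts here
                } else if (trimmed.startsWith("\\")) {
                    inHeader = false;           // e.g. "\1-grams:" ends the header
                }
                if (inHeader && ZERO_COUNT.matcher(trimmed).matches()) {
                    continue;                   // skip "ngram N=0"
                }
                out.write(line);
                out.write('\n');
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On a desktop machine this could be called with a reader and writer from `java.nio.file.Files.newBufferedReader` and `Files.newBufferedWriter`, and the cleaned file then copied into the app's assets. Streaming line by line keeps memory use low even for a model with millions of entries.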

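    Overall, dictation on a phone is not easy, because phones are really slow. I would not recommend a 2-gram model; a heavily pruned 3-gram model is a better choice. You can do the pruning with SRILM.

    You can also read the optimization doc to see what else can be tuned.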

    【Comments】:

      【Solution 2】:

      Use the following Sphinx command to convert your ARPA file into a language model (lm):

      sphinx_lm_convert -i lm_csr_5k_nvp_2gram.arpa -o lm_csr_5k_nvp_2gram.lm.dmp
      

      Then use the generated language model in your Android program:

      recognizer.addNgramSearch(DIGITS_SEARCH, new File(assetsDir, "lm_csr_5k_nvp_2gram.lm.dmp"));
      

      【Comments】:
