【Question Title】: pyttsx3 prints the current word being uttered
【Posted】: 2021-04-23 16:56:01
【Question】:

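I basically want the TTS engine to speak while printing what it is saying. I pretty much copied and pasted the example from the pyttsx3 documentation to do this, but it doesn't work.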

import pyttsx3
def onStart(name):
   print ('starting', name)
def onWord(name, location, length):
   print ('word', name, location, length)
def onEnd(name, completed):
   print ('finishing', name, completed)
engine = pyttsx3.init()
engine.connect('started-utterance', onStart)
engine.connect('started-word', onWord)
engine.connect('finished-utterance', onEnd)
engine.say('The quick brown fox jumped over the lazy dog.')
engine.runAndWait()

The result is below. The word event fires only after speaking has finished, and none of the actual words are printed.

starting None
word None 1 0
finishing None True

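I have been working on this for days. I have tried other libraries, such as win32com.client.Dispatch('SAPI.Spvoice') and gtts, but none of them seems able to do what I want. SAPI.SpVoice appears to have an event that would do what I want, but I can't get that to work either, although I'm also not sure I'm using it correctly. https://docs.microsoft.com/en-us/previous-versions/windows/desktop/ms723593(v=vs.85)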

from win32com.client import Dispatch
import win32com.client

class ContextEvents():
    def onWord():
        print("the word event occured")
        
        # Work with Result
        
s = Dispatch('SAPI.Spvoice')
e = win32com.client.WithEvents(s, ContextEvents)
s.Speak('The quick brown fox jumped over the lazy dog.')

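As I understand it, the events need a class, and each event must appear in that class as a method named On(event), or something along those lines. I also tried installing espeak, but that didn't work either. Note that I'm new to Python, so a thorough explanation would be much appreciated.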
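A minimal sketch of the handler shape described above, assuming pywin32 is installed and that the SpVoice `Word` event passes `StreamNumber`, `StreamPosition`, `CharacterPosition`, and `Length`, as Microsoft documents for that event. With win32com, handler methods are named `On` plus the event name and must accept `self`. `make_word_handler` and `speak_with_word_events` are hypothetical names used only for illustration, and the speaking part is Windows-only and has not been verified here.

```python
def make_word_handler(text, on_word=print):
    """Build an event class whose OnWord reports the word being spoken."""
    class SpVoiceEvents:
        # win32com looks for methods named "On" + event name; SAPI's event is "Word".
        def OnWord(self, StreamNumber, StreamPosition, CharacterPosition, Length):
            # CharacterPosition and Length index into the text passed to Speak().
            on_word(text[CharacterPosition:CharacterPosition + Length])
    return SpVoiceEvents


def speak_with_word_events(text):
    """Speak text through SAPI and print each word as it starts (Windows only)."""
    # Imported here so the handler above can be exercised without pywin32.
    import pythoncom
    import win32com.client

    voice = win32com.client.Dispatch("SAPI.SpVoice")
    # Keep a reference to the event connection for as long as speech is running.
    events = win32com.client.WithEvents(voice, make_word_handler(text))
    voice.Speak(text, 1)  # 1 = SVSFlagsAsync: return immediately
    # Assumption: events are only delivered while this thread pumps COM messages.
    while not voice.WaitUntilDone(50):
        pythoncom.PumpWaitingMessages()
    pythoncom.PumpWaitingMessages()
    return events
```

On Windows, calling `speak_with_word_events('The quick brown fox jumped over the lazy dog.')` is intended to print one word per `Word` event. The handler class can also be checked on its own by calling `OnWord` by hand.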

【Comments】:

    Tags: python text-to-speech pyttsx3


    【Answer 1】:

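    I'm not familiar with that library, but most likely the audio stream is generated and played before the events can be passed to the wrapping library. I can say that AWS's Polly will output word-level timing information if you want to use it. You need two calls: one to get the audio stream and another to get the SSML metadata.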

    The Windows .NET System.Speech.Synthesis library does have progress events you can listen for, but I don't know whether there is a Python library that wraps it.

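    However, if you're willing to run PowerShell commands from Python, you can try this gist I wrote, which wraps the Windows synthesis functionality and outputs word timings. Here is an example that should do what you need: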

    $text = "hello world! this is a long sentence with many words";
    $sampleRate = 24000;
    
    # generate tts and save bytes to memory (powershell variable)
    # events holds event timings
    # NOTE: assumes out-ssml-winrt.ps1 is in current directory, change as needed...
    $events = .\out-ssml-winrt.ps1 $text -Variable 'soundstream' -SampleRate $sampleRate -Channels 1 -SpeechMarkTypes 'words';
    
    # estimate duration based on samplerate (rough)
    $estimatedDurationMilliseconds = $global:soundstream.Length / $sampleRate * 1000;
    
    $global:e = $events;
    
    # add a final event at the end of the loop to wait for audio to complete
    $events += @([pscustomobject]@{ type = 'end'; time = $estimatedDurationMilliseconds; value = '' });
    # create background player
    $memstream = [System.IO.MemoryStream]::new($global:soundstream);
    $player = [System.Media.SoundPlayer]::new($memstream)
    $player.Play();
    
    # loop through word events
    $now = 0;
    $events | % {
        $word = $_;
        # milliseconds into wav file event happens
        $when = $word.time;
        # distance from last timestamp to this event
        $delta = $when - $now;
        # wait until right time to display
        if ($delta -gt 0) {
            Start-sleep -Milliseconds $delta;
        }
        $now = $when;
        # output word
        Write-Output $word.value;
    }
    # just to let you know - audio should be finished
    Write-Output "Playback Complete";
    $player.Stop(); $player.Dispose(); $memstream.Dispose();
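
The display loop in the PowerShell above can also be sketched in Python. This is only an illustration, assuming the word events have already been loaded as dictionaries with the same `time` (milliseconds into the audio) and `value` fields used above. `display_words` and its `sleep` and `emit` parameters are hypothetical names, and audio playback is not handled here.

```python
import time


def display_words(events, sleep=time.sleep, emit=print):
    """Emit each event's value when its timestamp (ms into the audio) is reached."""
    now_ms = 0
    for event in events:
        # Distance from the previous timestamp to this event.
        delta_ms = event["time"] - now_ms
        # Wait until the right moment to display the word.
        if delta_ms > 0:
            sleep(delta_ms / 1000.0)
        now_ms = event["time"]
        emit(event["value"])
```

For example, `display_words([{"time": 0, "value": "hello"}, {"time": 500, "value": "world"}])` prints `hello` immediately and `world` about half a second later. Passing a different `sleep` or `emit` makes the loop easy to check without waiting on real time.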
    

    【Comments】:
