pyttsx3 prints the current word being uttered

【Question Title】: pyttsx3 prints the current word being uttered
【Posted】: 2021-04-23 16:56:01
【Question】:

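I basically want the TTS engine to print what it is saying while it speaks. I pretty much copied and pasted the example from the pyttsx3 documentation to do this, but it does not work.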

import pyttsx3
def onStart(name):
   print ('starting', name)
def onWord(name, location, length):
   print ('word', name, location, length)
def onEnd(name, completed):
   print ('finishing', name, completed)
engine = pyttsx3.init()
engine.connect('started-utterance', onStart)
engine.connect('started-word', onWord)
engine.connect('finished-utterance', onEnd)
engine.say('The quick brown fox jumped over the lazy dog.')
engine.runAndWait()

This is the result. The word event only fires after the utterance has finished, and none of the actual words are printed.

starting None
word None 1 0
finishing None True

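I have been working on this for days. I tried other libraries such as win32com.client.Dispatch('SAPI.Spvoice') and gtts, but none of them seems able to do what I want. SAPI.SpVoice appears to have an event that does what I need, but I cannot get that to work either, and I am not sure I am using it correctly. https://docs.microsoft.com/en-us/previous-versions/windows/desktop/ms723593(v=vs.85)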

from win32com.client import Dispatch
import win32com.client

class ContextEvents():
    def onWord():
        print("the word event occured")
        
        # Work with Result
        
s = Dispatch('SAPI.Spvoice')
e = win32com.client.WithEvents(s, ContextEvents)
s.Speak('The quick brown fox jumped over the lazy dog.')

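As I understand it, events need a class, and each event has to appear in that class as a method named On(event), or something along those lines. I also tried installing espeak, but that did not work either. Note that I am new to Python, so a thorough explanation would be much appreciated.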

【Question Comments】:

Tags: python text-to-speech pyttsx3

【Solution 1】:

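I am not familiar with that library, but most likely the audio stream is generated and played before the events can make it through the wrapping library. If you are willing to use AWS Polly, it will output word-level timing information. You would need two calls: one to get the audio stream and another to get the SSML metadata (Polly calls these speech marks).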
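
As a rough illustration of the two-call Polly approach, here is a minimal Python sketch. It assumes boto3 is installed and AWS credentials are configured; the voice name "Joanna" and the helper names `parse_speech_marks` and `fetch_audio_and_marks` are choices made for this example, not something from the answer. Polly returns speech marks as one JSON object per line, so the parser below can be tried without any AWS setup.

```python
import json


def parse_speech_marks(raw):
    """Parse line-delimited JSON speech marks into a list of dicts."""
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def fetch_audio_and_marks(text, voice_id="Joanna"):
    """Make the two Polly calls: one for audio, one for word timings.

    Assumption: boto3 is installed and credentials/region are configured.
    """
    import boto3  # imported lazily so the parser works without boto3

    polly = boto3.client("polly")
    # Call 1: the audio itself.
    audio = polly.synthesize_speech(
        Text=text, OutputFormat="mp3", VoiceId=voice_id
    )
    # Call 2: word-level speech marks for the same text and voice.
    marks = polly.synthesize_speech(
        Text=text,
        OutputFormat="json",
        SpeechMarkTypes=["word"],
        VoiceId=voice_id,
    )
    return audio["AudioStream"].read(), parse_speech_marks(
        marks["AudioStream"].read()
    )
```

Each parsed mark carries a `time` offset in milliseconds and the word in `value`, which is enough to print words in step with playback.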

The Windows .NET System.Speech.Synthesis library does have progress events you can listen to, but I do not know whether there is a Python library that wraps it.
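
For comparison, the SAPI COM voice that the question tried also raises a per-word event. The following is an untested sketch, Windows only, that assumes pywin32 is installed. The class name `VoiceEvents`, the module-level `TEXT` and `spoken`, and the 50 ms polling interval are all choices made for this example. The handler signature follows the documented SpVoice Word event (StreamNumber, StreamPosition, CharacterPosition, Length), and speaking asynchronously while pumping COM messages is an assumption about what lets the events arrive during playback.

```python
TEXT = "The quick brown fox jumped over the lazy dog."
spoken = []  # words seen so far, kept at module level for simplicity


class VoiceEvents:
    """Handler class; pywin32 looks up methods named On<EventName>."""

    def OnWord(self, StreamNumber, StreamPosition, CharacterPosition, Length):
        # CharacterPosition/Length index into the text that was spoken.
        word = TEXT[CharacterPosition:CharacterPosition + Length]
        spoken.append(word)
        print(word)


def speak_with_word_events():
    """Speak TEXT and print each word as SAPI reports it (Windows only)."""
    import pythoncom
    import win32com.client

    voice = win32com.client.DispatchWithEvents("SAPI.SpVoice", VoiceEvents)
    voice.Speak(TEXT, 1)  # 1 = SVSFlagsAsync: return at once, speak in background
    # Pump COM messages until speech is done so the events can be delivered.
    while not voice.WaitUntilDone(50):
        pythoncom.PumpWaitingMessages()
    pythoncom.PumpWaitingMessages()
```

Calling `speak_with_word_events()` on a Windows machine should print the words one at a time. Note that the handler needs a `self` parameter and the event arguments, both of which are missing from the `onWord` method in the question.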

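However, if you are willing to run PowerShell commands from Python, you can try this gist I wrote, which wraps the Windows synthesis functionality and outputs word timings. Here is an example that should do what you need: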

$text = "hello world! this is a long sentence with many words";
$sampleRate = 24000;

# generate tts and save bytes to memory (powershell variable)
# events holds event timings
# NOTE: assumes out-ssml-winrt.ps1 is in current directory, change as needed...
$events = .\out-ssml-winrt.ps1 $text -Variable 'soundstream' -SampleRate $sampleRate -Channels 1 -SpeechMarkTypes 'words';

# estimate duration based on samplerate (rough)
$estimatedDurationMilliseconds = $global:soundstream.Length / $sampleRate * 1000;

$global:e = $events;

# add a final event at the end of the loop to wait for audio to complete
$events += @([pscustomobject]@{ type = 'end'; time = $estimatedDurationMilliseconds; value = '' });
# create background player
$memstream = [System.IO.MemoryStream]::new($global:soundstream);
$player = [System.Media.SoundPlayer]::new($memstream)
$player.Play();

# loop through word events
$now = 0;
$events | % {
    $word = $_;
    # milliseconds into wav file event happens
    $when = $word.time;
    # distance from last timestamp to this event
    $delta = $when - $now;
    # wait until right time to display
    if ($delta -gt 0) {
        Start-sleep -Milliseconds $delta;
    }
    $now = $when;
    # output word
    Write-Output $word.value;
}
# just to let you know - audio should be finished
Write-Output "Playback Complete";
$player.Stop(); $player.Dispose(); $memstream.Dispose();
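
The timing loop in the script above can also be sketched in Python, for the case where the word events have already been obtained some other way. This is only an illustration: it assumes each event is a dict with a `time` in milliseconds and a `value`, mirroring the properties used in the PowerShell loop, and the function name `replay_words` and its `sleep` and `emit` parameters are made up for this example. Audio playback is not handled here.

```python
import time


def replay_words(events, sleep=time.sleep, emit=print):
    """Emit each event's value once its timestamp (in ms) is reached."""
    now_ms = 0
    for event in sorted(events, key=lambda e: e["time"]):
        # Distance from the previous timestamp to this event.
        delta_ms = event["time"] - now_ms
        if delta_ms > 0:
            sleep(delta_ms / 1000.0)  # time.sleep expects seconds
        now_ms = event["time"]
        emit(event["value"])


if __name__ == "__main__":
    demo = [
        {"time": 0, "value": "hello"},
        {"time": 300, "value": "world"},
    ]
    replay_words(demo)
```

Like the PowerShell version, this sleeps for the gap between consecutive timestamps, so small delays can accumulate over a long sentence. Comparing against a clock reading taken when playback starts would reduce that drift.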

【Comments】: