【Question title】: Can the Google Speech API convert text to speech?
【Posted】: 2018-06-23 23:56:33
【Question】:

I used the Google Speech API to successfully convert speech to text with the following code.

import speech_recognition as sr

# Obtain audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)

# Recognize speech using Google Cloud Speech.
# Insert the contents of the Google Cloud Speech JSON credentials file in place of {KEY}.
GOOGLE_CLOUD_SPEECH_CREDENTIALS = r"""{KEY}
"""
try:
    speechOutput = r.recognize_google_cloud(audio, credentials_json=GOOGLE_CLOUD_SPEECH_CREDENTIALS, language="si-LK")
except sr.UnknownValueError:
    speechOutput = "Google Cloud Speech could not understand audio"
except sr.RequestError as e:
    speechOutput = "Could not request results from Google Cloud Speech service; {0}".format(e)
print(speechOutput)

I would like to know whether I can use the same API to convert text to speech. If not, which API should I use, and is there sample Python code for it? Thanks!

【Question discussion】:

    Tags: python-3.x google-api google-speech-api


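    【Solution 1】:

    For this you need to use the new Text-to-Speech API, which is in beta at the time of writing. You can find a Python quickstart in the Client Libraries section of the documentation. The sample is part of the python-docs-samples repo. I am adding the relevant part of the sample here for better visibility: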

    def synthesize_text(text):
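        """Synthesizes speech from the input string of text."""
        # Note: texttospeech.types and texttospeech.enums, used below, belong to
        # google-cloud-texttospeech releases before 2.0.
        from google.cloud import texttospeech
        client = texttospeech.TextToSpeechClient()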
    
        input_text = texttospeech.types.SynthesisInput(text=text)
    
        # Note: the voice can also be specified by name.
        # Names of voices can be retrieved with client.list_voices().
        voice = texttospeech.types.VoiceSelectionParams(
            language_code='en-US',
            ssml_gender=texttospeech.enums.SsmlVoiceGender.FEMALE)
    
        audio_config = texttospeech.types.AudioConfig(
            audio_encoding=texttospeech.enums.AudioEncoding.MP3)
    
        response = client.synthesize_speech(input_text, voice, audio_config)
    
        # The response's audio_content is binary.
        with open('output.mp3', 'wb') as out:
            out.write(response.audio_content)
            print('Audio content written to file "output.mp3"')
    

    Update: rate and pitch configuration

    You can wrap text elements in a <prosody> tag to modify the rate and pitch. For example:

    <prosody rate="slow" pitch="-2st">Can you hear me now?</prosody>
    

    The possible values follow the W3C SSML specification. The SSML docs for the Text-to-Speech API describe this in detail and also provide some examples.
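
As a rough illustration of the <prosody> markup above, here is a minimal sketch that builds such a string in Python using only the standard library. The helper name `build_prosody_ssml` and its defaults are made up for this example and are not part of any Google library; the sketch only assembles text and does not call the API.

```python
from xml.sax.saxutils import escape, quoteattr


def build_prosody_ssml(text, rate="medium", pitch="medium"):
    """Wrap text in <speak><prosody ...>, escaping XML special characters.

    Hypothetical helper: rate and pitch are passed through unchecked, so the
    caller is responsible for using values the SSML specification allows.
    """
    return "<speak><prosody rate={} pitch={}>{}</prosody></speak>".format(
        quoteattr(rate), quoteattr(pitch), escape(text)
    )


ssml = build_prosody_ssml("Can you hear me now?", rate="slow", pitch="-2st")
print(ssml)
```

In the sample from the answer, a string like this would presumably be passed as `texttospeech.types.SynthesisInput(ssml=ssml)` instead of `text=text`; treat that as an assumption to check against the client library version you have installed.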

    In addition, you can control the overall audio playback rate with the speed option of <audio>, which currently accepts values from 50% to 200% (in 1% increments).
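
The speed range described above can be checked before the markup is sent. The following is a small sketch under stated assumptions: `build_audio_ssml` is a hypothetical helper, the URL is a placeholder, and the 50 to 200 limits are taken from the sentence above rather than verified against the current service.

```python
from xml.sax.saxutils import escape, quoteattr


def build_audio_ssml(src, speed_percent=100, fallback_text=""):
    """Build an <audio> element with a validated speed attribute.

    Hypothetical helper: accepts whole percentages from 50 to 200, matching
    the range quoted in the answer.
    """
    if not isinstance(speed_percent, int) or not 50 <= speed_percent <= 200:
        raise ValueError("speed_percent must be an integer from 50 to 200")
    return "<speak><audio src={} speed={}>{}</audio></speak>".format(
        quoteattr(src),
        quoteattr("{}%".format(speed_percent)),
        escape(fallback_text),
    )


# Placeholder URL, used only for illustration.
print(build_audio_ssml("https://example.com/clip.mp3", 150, "clip"))
```

Rejecting out-of-range values locally gives a clearer error than waiting for the service to refuse the request.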

    【Discussion】:

    • Hi @Guillem, could you tell me the possible values for speed and pitch in this API, please?