1. IN202027050046 - SYNTHESIS OF SPEECH FROM TEXT IN A VOICE OF A TARGET SPEAKER USING NEURAL NETWORKS

Office
India
Application Number 202027050046
Application Date 17.11.2020
Publication Number 202027050046
Publication Date 12.02.2021
Publication Kind A
IPC G10L
G PHYSICS
G10 MUSICAL INSTRUMENTS; ACOUSTICS
G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
Applicants GOOGLE LLC
Inventors JIA, Ye
CHEN, Zhifeng
WU, Yonghui
SHEN, Jonathan
PANG, Ruoming
WEISS, Ron J.
MORENO, Ignacio Lopez
REN, Fei
ZHANG, Yu
WANG, Quan
NGUYEN, Patrick An Phu
Priority Data 62/672835 17.05.2018 US
Title
(EN) SYNTHESIS OF SPEECH FROM TEXT IN A VOICE OF A TARGET SPEAKER USING NEURAL NETWORKS
Abstract
(EN) Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech synthesis. The methods, systems, and apparatus include actions of obtaining an audio representation of speech of a target speaker, obtaining input text for which speech is to be synthesized in a voice of the target speaker, generating a speaker vector by providing the audio representation to a speaker encoder engine that is trained to distinguish speakers from one another, generating an audio representation of the input text spoken in the voice of the target speaker by providing the input text and the speaker vector to a spectrogram generation engine that is trained using voices of reference speakers to generate audio representations, and providing the audio representation of the input text spoken in the voice of the target speaker for output.
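The data flow in the abstract (reference audio to speaker vector, then text plus speaker vector to an output audio representation) can be illustrated with a minimal sketch. This is not the claimed implementation: the trained speaker encoder and spectrogram generation engine are replaced by toy stand-ins, and every name here (`speaker_encoder`, `spectrogram_generator`, `synthesize`, the `Frame` type) is a hypothetical chosen for illustration only.

```python
import math
from typing import List

Frame = List[float]  # hypothetical: one fixed-size audio feature frame


def speaker_encoder(reference_frames: List[Frame]) -> List[float]:
    """Toy stand-in for the trained speaker encoder.

    Mean-pools the reference frames and L2-normalizes the result
    to produce a fixed-size speaker vector.
    """
    if not reference_frames:
        raise ValueError("reference audio must contain at least one frame")
    dim = len(reference_frames[0])
    count = len(reference_frames)
    mean = [sum(f[i] for f in reference_frames) / count for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean)) or 1.0  # avoid division by zero
    return [x / norm for x in mean]


def spectrogram_generator(text: str, speaker_vector: List[float]) -> List[Frame]:
    """Toy stand-in for the trained spectrogram generation engine.

    Emits one frame per input character; each frame is conditioned
    on the speaker vector by simple addition.
    """
    frames: List[Frame] = []
    for ch in text:
        text_feature = (ord(ch) % 32) / 32.0  # toy text feature, not a real text encoder
        frames.append([text_feature + s for s in speaker_vector])
    return frames


def synthesize(reference_frames: List[Frame], text: str) -> List[Frame]:
    """Chain the two stages in the order the abstract describes."""
    speaker_vector = speaker_encoder(reference_frames)
    return spectrogram_generator(text, speaker_vector)
```

In this sketch the speaker vector is computed once from the target speaker's reference audio and reused for any input text, which mirrors the separation between the two engines described in the abstract. Converting the resulting frames into a waveform is outside what the sketch covers.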