1. WO2020141108 - METHOD, APPARATUS AND SYSTEM FOR HYBRID SPEECH SYNTHESIS

Publication Number WO/2020/141108
Publication Date 09.07.2020
International Application No. PCT/EP2019/086656
International Filing Date 20.12.2019
IPC
G10L 19/08 2013.1
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
19 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
04 using predictive techniques
08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
G06N 3/02 2006.1
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
CPC
G06N 20/10
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
20 Machine learning
10 using kernel methods, e.g. support vector machines [SVM]
G06N 3/0454
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
04 Architectures, e.g. interconnection topology
0454 using a combination of multiple neural nets
G06N 3/0472
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
04 Architectures, e.g. interconnection topology
0472 using probabilistic elements, e.g. p-rams, stochastic processors
G06N 3/088
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
08 Learning methods
088 Non-supervised learning, e.g. competitive learning
G10L 19/08
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
19 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
04 using predictive techniques
08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
Applicants
  • DOLBY INTERNATIONAL AB [SE]/[NL]
Inventors
  • MUSTAFA, Ahmed
  • BISWAS, Arijit
Agents
  • DOLBY INTERNATIONAL AB PATENT GROUP EUROPE
Priority Data
19150154.3 03.01.2019 EP
62/787,831 03.01.2019 US
Publication Language English (EN)
Filing Language English (EN)
Designated States
Title
(EN) METHOD, APPARATUS AND SYSTEM FOR HYBRID SPEECH SYNTHESIS
(FR) PROCÉDÉ, APPAREIL ET SYSTÈME DE SYNTHÈSE VOCALE HYBRIDE
Abstract
(EN)
A method of decoding an original speech signal for hybrid adversarial-parametric speech synthesis comprising: (a) receiving quantized original linear prediction coding parameters estimated by applying linear prediction coding analysis filtering to an original speech signal and a quantized compressed representation of a residual of the original speech signal; (b) dequantizing the original linear prediction coding parameters and the compressed representation of the residual; (c) inputting the dequantized compressed representation of the residual into a decoder part of a Generator for applying adversarial mapping from the compressed residual domain to a fake (first) signal domain; (d) outputting, by the decoder part of the Generator, a fake speech signal; (e) applying linear prediction coding analysis filtering to the fake speech signal for obtaining a corresponding fake residual; (f) reconstructing the original speech signal by applying linear prediction coding cross-synthesis filtering to the fake residual and the dequantized original linear prediction coding analysis parameters.
(FR)
L'invention concerne un procédé de décodage d'un signal vocal d'origine pour une synthèse vocale hybride paramétrique antagoniste, comprenant les étapes consistant à : (a) recevoir des paramètres de codage de prédiction linéaire d'origine quantifiés estimés en appliquant un filtrage d'analyse de codage de prédiction linéaire à un signal vocal d'origine et une représentation compressée quantifiée d'un résidu du signal vocal d'origine ; (b) déquantifier les paramètres de codage de prédiction linéaire d'origine et la représentation compressée du résidu ; (c) entrer la représentation compressée déquantifiée du résidu dans une partie de décodeur d'un générateur destinée à appliquer une mise en correspondance antagoniste du domaine résiduel compressé à un faux (premier) domaine de signal ; (d) délivrer, par la partie de décodeur du générateur, un faux signal vocal ; (e) appliquer un filtrage d'analyse de codage de prédiction linéaire au faux signal vocal pour obtenir un faux résidu correspondant ; (f) reconstruire le signal vocal d'origine en appliquant un filtrage de synthèse croisée de codage de prédiction linéaire au faux résidu et aux paramètres d'analyse de codage de prédiction linéaire d'origine déquantifiés.
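Steps (e) and (f) of the abstract can be illustrated with a minimal NumPy sketch. It is not the applicant's implementation. It assumes the autocorrelation method for estimating the linear prediction coefficients, processes the whole signal as a single frame, and omits quantization and the Generator altogether; the function names (lpc_coefficients, analysis_filter, synthesis_filter, cross_synthesis) and the order parameter are hypothetical.

```python
import numpy as np


def lpc_coefficients(x, order):
    """Estimate A(z) = 1 + a1*z^-1 + ... + ap*z^-p (autocorrelation method, an assumption)."""
    x = np.asarray(x, dtype=float)
    # Autocorrelation values r[0..order]
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = -r[1:], lightly regularized for stability
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)
    a = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], a))


def analysis_filter(x, a):
    """LPC analysis (FIR) filtering: e[n] = sum_k a[k] * x[n-k]."""
    return np.convolve(x, a)[: len(x)]


def synthesis_filter(e, a):
    """LPC synthesis (all-pole) filtering: y[n] = e[n] - sum_{k>=1} a[k] * y[n-k]."""
    y = np.zeros(len(e))
    for n in range(len(e)):
        acc = e[n]
        for k in range(1, min(len(a), n + 1)):
            acc -= a[k] * y[n - k]
        y[n] = acc
    return y


def cross_synthesis(fake_speech, original_lpc, order):
    """Step (e): residual of the fake speech; step (f): re-shape it with the original LPC envelope."""
    fake_lpc = lpc_coefficients(fake_speech, order)
    fake_residual = analysis_filter(fake_speech, fake_lpc)
    return synthesis_filter(fake_residual, original_lpc)
```

In this sketch the fake residual supplies the excitation while the dequantized original coefficients supply the spectral envelope. A practical codec would most likely operate frame by frame with windowing, which is left out here.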
Related patent documents
EP2019824350: This application is not viewable in PATENTSCOPE because the national phase entry has not been published yet or the national entry is issued from a country that does not share data with WIPO or there is a formatting issue or an unavailability of the application.