Search settings: Offices: All | Language: en | Stemming: true | Single Family Member: false | Include NPL: false

Stemming reduces inflected words to their stem or root form. For example, the words fishing, fished, fish, and fisher are reduced to the root word fish, so a search for fisher returns all the different variations. "Single Family Member" returns only one member of a family of patents; "Include NPL" includes non-patent literature in the results.

Full Query: AI functional applications Speech Processing Phonology

1. 10347241 - Speaker-invariant training via adversarial learning
US, 09.07.2019
Int. Class G10L 25/30
  G: PHYSICS
  10: MUSICAL INSTRUMENTS; ACOUSTICS
  L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
  25: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
  27: characterised by the analysis technique
  30: using neural networks
Appl. No. 15934566 | Applicant: Microsoft Technology Licensing, LLC | Inventor: Zhong Meng

Systems and methods can be implemented to conduct speaker-invariant training for speech recognition in a variety of applications. An adversarial multi-task learning scheme for speaker-invariant training can be implemented, aiming at actively curtailing the inter-talker feature variability, while maximizing its senone discriminability to enhance the performance of a deep neural network (DNN) based automatic speech recognition system. In speaker-invariant training, a DNN acoustic model and a speaker classifier network can be jointly optimized to minimize the senone (triphone state) classification loss, and simultaneously mini-maximize the speaker classification loss. A speaker invariant and senone-discriminative intermediate feature is learned through this adversarial multi-task learning, which can be applied to an automatic speech recognition system. Additional systems and methods are disclosed.
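The min-max objective described in this abstract combines two losses on a shared feature extractor: the senone classification loss is minimized while the speaker classification loss is simultaneously maximized. A minimal sketch in plain Python of that gradient-reversal-style loss combination (the function names, probability values, and the weight `lam` are illustrative, not taken from the patent):

```python
import math

def cross_entropy(probs, label):
    """Negative log-likelihood of the true label."""
    return -math.log(probs[label])

def speaker_invariant_loss(senone_probs, senone_label,
                           speaker_probs, speaker_label, lam=0.5):
    """Adversarial multi-task objective for the feature extractor:
    minimize the senone loss while *maximizing* the speaker loss
    (note the sign flip), so learned features stay senone-discriminative
    but carry little speaker information."""
    l_senone = cross_entropy(senone_probs, senone_label)
    l_speaker = cross_entropy(speaker_probs, speaker_label)
    return l_senone - lam * l_speaker  # gradient-reversal style combination

# Example: a confident senone prediction, an uncertain speaker prediction
loss = speaker_invariant_loss([0.8, 0.1, 0.1], 0, [0.5, 0.5], 0)
```

A feature extractor trained to minimize this quantity is pushed toward features that the speaker classifier cannot exploit, which is the speaker-invariance property the abstract targets.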

2. 09460711 - Multilingual, acoustic deep neural networks
US, 04.10.2016
Int. Class G10L 15/00
  G: PHYSICS
  10: MUSICAL INSTRUMENTS; ACOUSTICS
  L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
  15: Speech recognition
Appl. No. 13862541 | Applicant: Google Inc. | Inventor: Vincent Olivier Vanhoucke

Methods and systems for processing multilingual DNN acoustic models are described. An example method may include receiving training data that includes a respective training data set for each of two or more languages. A multilingual deep neural network (DNN) acoustic model may be processed based on the training data. The multilingual DNN acoustic model may include a feedforward neural network having multiple layers of one or more nodes. Each node of a given layer may connect with a respective weight to each node of a subsequent layer, and the multiple layers of one or more nodes may include one or more shared hidden layers of nodes and a language-specific output layer of nodes corresponding to each of the two or more languages. Additionally, weights associated with the multiple layers of one or more nodes of the processed multilingual DNN acoustic model may be stored in a database.
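The architecture in this abstract, shared hidden layers feeding one language-specific softmax output head per language, can be sketched as a toy forward pass in plain Python (dimensions, initialization, and class names are illustrative assumptions, not from the patent):

```python
import math, random

def dense(x, W, b):
    """One fully connected layer: y = W.x + b."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

class MultilingualDNN:
    """Feedforward net with a shared hidden layer and one
    language-specific softmax output head per language."""
    def __init__(self, in_dim, hidden_dim, out_dims, seed=0):
        rng = random.Random(seed)
        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]
        self.shared = (mat(hidden_dim, in_dim), [0.0] * hidden_dim)
        self.heads = {lang: (mat(d, hidden_dim), [0.0] * d)
                      for lang, d in out_dims.items()}

    def forward(self, x, lang):
        h = [max(0.0, v) for v in dense(x, *self.shared)]  # shared ReLU layer
        return softmax(dense(h, *self.heads[lang]))        # language head

net = MultilingualDNN(in_dim=3, hidden_dim=4, out_dims={"en": 5, "fr": 6})
p_en = net.forward([0.2, -0.1, 0.4], "en")
```

Because the hidden layers are shared, acoustic data from every language updates the same internal representation, while each language keeps its own output distribution over its own set of targets.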

3. 20100228548 - Enhanced automatic speech recognition using mapping between unsupervised and supervised speech model parameters trained on same acoustic training data
US, 09.09.2010
Int. Class G10L 15/06
  G: PHYSICS
  10: MUSICAL INSTRUMENTS; ACOUSTICS
  L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
  15: Speech recognition
  06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
Appl. No. 12400528 | Applicant: Microsoft Corporation | Inventor: Liu Chaojun

Techniques for enhanced automatic speech recognition are described. An enhanced ASR system may be operative to generate an error correction function. The error correction function may represent a mapping between a supervised set of parameters and an unsupervised training set of parameters generated using a same set of acoustic training data, and apply the error correction function to an unsupervised testing set of parameters to form a corrected set of parameters used to perform speaker adaptation. Other embodiments are described and claimed.
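The error-correction idea in this abstract, learn a mapping from unsupervised to supervised parameter estimates on the same acoustic training data, then apply it to unsupervised test-time parameters, can be illustrated with a simple affine fit. The patent does not specify the functional form of the mapping; the least-squares line below is an assumption for illustration only:

```python
def fit_affine(xs, ys):
    """Least-squares fit of y ~ a*x + b between paired parameter sets."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    return a, my - a * mx

# Paired unsupervised vs. supervised estimates from the same training data
unsup_train = [1.0, 2.0, 3.0, 4.0]
sup_train   = [1.1, 2.1, 3.1, 4.1]  # supervised estimates differ by an offset

a, b = fit_affine(unsup_train, sup_train)
corrected = [a * x + b for x in [5.0, 6.0]]  # correct unsupervised test params
```

The corrected parameters can then stand in for supervised estimates when performing speaker adaptation, which is the role the abstract assigns to the error correction function.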

4. WO/2018/141061 - SYSTEM AND METHOD FOR MEASURING PERCEPTUAL EXPERIENCES
WO, 09.08.2018
Int. Class A61B 5/16
  A: HUMAN NECESSITIES
  61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
  B: DIAGNOSIS; SURGERY; IDENTIFICATION
  5: Measuring for diagnostic purposes; Identification of persons
  16: Devices for psychotechnics; Testing reaction times
Appl. No. PCT/CA2018/050116 | Applicant: CEREBIAN INC. | Inventor: AYYAD, Karim

There is provided a method for determining perceptual experiences. The method comprises obtaining a plurality of signals acquired by a measurement device comprising a plurality of sensors positioned to measure brain activity of users being measured by the measurement device; providing the plurality of signals, without pre-processing, to a processing system comprising at least one deep learning module, the at least one deep learning module being configured to process the signals to generate at least one capability, wherein combinations of one or more of the at least one capability form the perceptual experiences; and providing an output corresponding to a combination of one or more of the at least one capability to an application utilizing the corresponding perceptual experience.

5. 2309487 - Automatic speech recognition system integrating multiple sequence alignment for model bootstrapping
EP, 13.04.2011
Int. Class G10L 15/14
  G: PHYSICS
  10: MUSICAL INSTRUMENTS; ACOUSTICS
  L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
  15: Speech recognition
  08: Speech classification or search
  14: using statistical models, e.g. Hidden Markov Models
Appl. No. 09171957 | Applicant: HONDA RES INST EUROPE GMBH | Inventor: AYLLON CLEMENTE IRENE

The invention proposes a method for training an automatic speech recognition system (which can be used e.g. for robots, hearing aids, speech-controlled machines, ...), the automatic speech recognition system using Hidden Markov Models (HMMs) trained by training data, the HMMs being initialized by applying a bootstrapping method on a reduced set of training data, wherein the bootstrapping method transforms an ergodic HMM topology to a Bakis topology for word-level HMM initialization, the method comprising the following steps: a. Calculation of a cost matrix assigning different costs to the permutation of state transitions as observed in the state sequences resulting from the decoding of the training data, b. Calculation of a similarity measure for all sequences based on the cost matrix, and c. Merging of all sequences based on their similarity measure until only one sequence is left.
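Step (b) of this abstract computes a similarity measure between decoded state sequences from a cost matrix over state substitutions. A standard way to do this is dynamic-programming alignment, sketched below in plain Python; the gap penalty and the toy cost matrix are illustrative assumptions, not values from the patent:

```python
def align_cost(seq_a, seq_b, cost, gap=1.0):
    """Dynamic-programming alignment cost between two state sequences,
    using a cost matrix for substituting one HMM state for another.
    Lower cost means more similar sequences."""
    n, m = len(seq_a), len(seq_b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * gap
    for j in range(1, m + 1):
        d[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(d[i - 1][j] + gap,            # delete a state
                          d[i][j - 1] + gap,            # insert a state
                          d[i - 1][j - 1]               # substitute a state
                          + cost[seq_a[i - 1]][seq_b[j - 1]])
    return d[n][m]

# Toy cost matrix over 3 HMM states: zero cost for identical states
cost = [[0.0, 1.0, 1.0],
        [1.0, 0.0, 1.0],
        [1.0, 1.0, 0.0]]
same = align_cost([0, 1, 2], [0, 1, 2], cost)      # identical sequences
reversed_ = align_cost([0, 1, 2], [2, 1, 0], cost)  # dissimilar ordering
```

Step (c)'s merging can then proceed greedily: repeatedly merge the pair of sequences with the lowest alignment cost until a single representative sequence remains for word-level HMM initialization.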

6. 3384488 - SYSTEM AND METHOD FOR IMPLEMENTING A VOCAL USER INTERFACE BY COMBINING A SPEECH TO TEXT SYSTEM AND A SPEECH TO INTENT SYSTEM
EP, 10.10.2018
Int. Class G10L 15/22
  G: PHYSICS
  10: MUSICAL INSTRUMENTS; ACOUSTICS
  L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
  15: Speech recognition
  22: Procedures used during a speech recognition process, e.g. man-machine dialog
Appl. No. 15909433 | Applicant: FLUENT AI INC | Inventor: TOMAR VIKRANT

The present disclosure relates to speech recognition systems and methods that enable personalized vocal user interfaces. More specifically, the present disclosure relates to combining a self-learning speech recognition system based on semantics with a speech-to-text system optionally integrated with a natural language processing system. The combined system has the advantage of automatically and continually training the semantics-based speech recognition system and increasing recognition accuracy.
7. 20180358005 - System and method for implementing a vocal user interface by combining a speech to text system and a speech to intent system
US, 13.12.2018
Int. Class G10L 15/18
  G: PHYSICS
  10: MUSICAL INSTRUMENTS; ACOUSTICS
  L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
  15: Speech recognition
  08: Speech classification or search
  18: using natural language modelling
Appl. No. 15780576 | Applicant: Fluent.AI Inc. | Inventor: Vikrant Tomar

The present disclosure relates to speech recognition systems and methods that enable personalized vocal user interfaces. More specifically, the present disclosure relates to combining a self-learning speech recognition system based on semantics with a speech-to-text system optionally integrated with a natural language processing system. The combined system has the advantage of automatically and continually training the semantics-based speech recognition system and increasing recognition accuracy.

8. 20200013391 - Acoustic information based language modeling system and method
US, 09.01.2020
Int. Class G10L 15/18
  G: PHYSICS
  10: MUSICAL INSTRUMENTS; ACOUSTICS
  L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
  15: Speech recognition
  08: Speech classification or search
  18: using natural language modelling
Appl. No. 16575317 | Applicant: LG ELECTRONICS INC. | Inventor: Seon Yeong Park

Disclosed are a speech data based language modeling system and method. The speech data based language modeling method includes transcription of text data, and generation of a regional dialect corpus based on the text data and regional dialect-containing speech data and generation of an acoustic model and a language model using the regional dialect corpus. The generation of an acoustic model and a language model is performed by machine learning of an artificial intelligence (AI) algorithm using speech data and marking of word spacing of a regional dialect sentence using a speech data tag. A user is able to use a regional dialect speech recognition service which is improved using 5G mobile communication technologies of eMBB, URLLC, or mMTC.
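Building a language model from a transcribed (here, regional-dialect) text corpus can be illustrated with maximum-likelihood bigram counts, the simplest form of statistical language model. This is a generic textbook sketch, not the patent's AI-based method; the corpus and tokens are invented for illustration:

```python
from collections import Counter

def bigram_counts(corpus):
    """Count bigrams over a tokenized corpus (list of sentence strings),
    with sentence-boundary markers."""
    counts = Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        counts.update(zip(tokens, tokens[1:]))
    return counts

def bigram_prob(counts, prev, word):
    """Maximum-likelihood bigram probability P(word | prev)."""
    total = sum(c for (a, _), c in counts.items() if a == prev)
    return counts[(prev, word)] / total if total else 0.0

corpus = ["the cat sat", "the cat ran"]
counts = bigram_counts(corpus)
p = bigram_prob(counts, "the", "cat")  # "the" is always followed by "cat" here
```

A dialect-specific corpus shifts these probabilities toward dialect word sequences, which is how a corpus-derived language model improves recognition of dialect speech.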

9. 3087780 - SYSTEM AND METHOD FOR MEASURING PERCEPTUAL EXPERIENCES
CA, 09.08.2018
Int. Class A61B 5/16
  A: HUMAN NECESSITIES
  61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
  B: DIAGNOSIS; SURGERY; IDENTIFICATION
  5: Measuring for diagnostic purposes; Identification of persons
  16: Devices for psychotechnics; Testing reaction times
Appl. No. 3087780 | Applicant: CEREBIAN INC. | Inventor: AYYAD, KARIM

There is provided a method for determining perceptual experiences. The method comprises obtaining a plurality of signals acquired by a measurement device comprising a plurality of sensors positioned to measure brain activity of users being measured by the measurement device; providing the plurality of signals, without pre-processing, to a processing system comprising at least one deep learning module, the at least one deep learning module being configured to process the signals to generate at least one capability, wherein combinations of one or more of the at least one capability form the perceptual experiences; and providing an output corresponding to a combination of one or more of the at least one capability to an application utilizing the corresponding perceptual experience.

10. 20200401938 - Machine learning based generation of ontology for structural and functional mapping
US, 24.12.2020
Int. Class G06K 9/00
  G: PHYSICS
  06: COMPUTING; CALCULATING OR COUNTING
  K: GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
  9: Methods or arrangements for recognising patterns
Appl. No. 16888530 | Applicant: The Board of Trustees of the Leland Stanford Junior University | Inventor: Amit Etkin

A method may include applying, to a corpus of data, a first machine learning technique to identify candidate domains of an ontology mapping brain structure to mental function. The corpus of data may include textual data describing a plurality of mental functions and spatial data corresponding to a plurality of brain structures. A second machine technique may be applied to optimize a quantity of domains included in the ontology and/or a quantity of mental function terms included in each domain. The ontology may be applied to phenotype an electronic medical record and predict a clinical outcome for a patient associated with the electronic medical record. Related systems and articles of manufacture, including computer program products, are also provided.