Search settings: Offices: All | Language: en | Stemming: true | Single Family Member: false | Include NPL: false

Stemming reduces inflected words to their stem or root form. For example, the words fishing, fished, fish, and fisher are reduced to the root word fish, so a search for fisher returns all the different variations. "Single Family Member" returns only one member of a family of patents; "Include NPL" includes non-patent literature in the results.

Full Query: AI functional applications Speech Processing Phonology

1. 10347241 - Speaker-invariant training via adversarial learning
US, 09.07.2019
Int. Class G10L 25/30
  G: PHYSICS
  10: MUSICAL INSTRUMENTS; ACOUSTICS
  L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
  25: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
  27: characterised by the analysis technique
  30: using neural networks
Appl. No. 15934566 | Applicant: Microsoft Technology Licensing, LLC | Inventor: Zhong Meng

Systems and methods can be implemented to conduct speaker-invariant training for speech recognition in a variety of applications. An adversarial multi-task learning scheme for speaker-invariant training can be implemented, aiming at actively curtailing the inter-talker feature variability, while maximizing its senone discriminability to enhance the performance of a deep neural network (DNN) based automatic speech recognition system. In speaker-invariant training, a DNN acoustic model and a speaker classifier network can be jointly optimized to minimize the senone (triphone state) classification loss, and simultaneously mini-maximize the speaker classification loss. A speaker invariant and senone-discriminative intermediate feature is learned through this adversarial multi-task learning, which can be applied to an automatic speech recognition system. Additional systems and methods are disclosed.
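The min-max objective described in this abstract combines two losses on a shared feature extractor: the senone classification loss is minimized while the speaker classification loss is simultaneously maximized. A minimal sketch in plain Python of that gradient-reversal-style loss combination (the function names, probability values, and the weight `lam` are illustrative, not taken from the patent):

```python
import math

def cross_entropy(probs, label):
    """Negative log-likelihood of the true label."""
    return -math.log(probs[label])

def speaker_invariant_loss(senone_probs, senone_label,
                           speaker_probs, speaker_label, lam=0.5):
    """Adversarial multi-task objective for the feature extractor:
    minimize the senone loss while *maximizing* the speaker loss
    (note the sign flip), so learned features stay senone-discriminative
    but carry little speaker information."""
    l_senone = cross_entropy(senone_probs, senone_label)
    l_speaker = cross_entropy(speaker_probs, speaker_label)
    return l_senone - lam * l_speaker  # gradient-reversal style combination

# Example: a confident senone prediction, an uncertain speaker prediction
loss = speaker_invariant_loss([0.8, 0.1, 0.1], 0, [0.5, 0.5], 0)
```

A feature extractor trained to minimize this quantity is pushed toward features that the speaker classifier cannot exploit, which is the speaker-invariance property the abstract targets.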

2. 09460711 - Multilingual, acoustic deep neural networks
US, 04.10.2016
Int. Class G10L 15/00
  G: PHYSICS
  10: MUSICAL INSTRUMENTS; ACOUSTICS
  L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
  15: Speech recognition
Appl. No. 13862541 | Applicant: Google Inc. | Inventor: Vincent Olivier Vanhoucke

Methods and systems for processing multilingual DNN acoustic models are described. An example method may include receiving training data that includes a respective training data set for each of two or more languages. A multilingual deep neural network (DNN) acoustic model may be processed based on the training data. The multilingual DNN acoustic model may include a feedforward neural network having multiple layers of one or more nodes. Each node of a given layer may connect with a respective weight to each node of a subsequent layer, and the multiple layers of one or more nodes may include one or more shared hidden layers of nodes and a language-specific output layer of nodes corresponding to each of the two or more languages. Additionally, weights associated with the multiple layers of one or more nodes of the processed multilingual DNN acoustic model may be stored in a database.
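The architecture in this abstract, shared hidden layers feeding one language-specific softmax output head per language, can be sketched as a toy forward pass in plain Python (dimensions, initialization, and class names are illustrative assumptions, not from the patent):

```python
import math, random

def dense(x, W, b):
    """One fully connected layer: y = W.x + b."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

class MultilingualDNN:
    """Feedforward net with a shared hidden layer and one
    language-specific softmax output head per language."""
    def __init__(self, in_dim, hidden_dim, out_dims, seed=0):
        rng = random.Random(seed)
        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]
        self.shared = (mat(hidden_dim, in_dim), [0.0] * hidden_dim)
        self.heads = {lang: (mat(d, hidden_dim), [0.0] * d)
                      for lang, d in out_dims.items()}

    def forward(self, x, lang):
        h = [max(0.0, v) for v in dense(x, *self.shared)]  # shared ReLU layer
        return softmax(dense(h, *self.heads[lang]))        # language head

net = MultilingualDNN(in_dim=3, hidden_dim=4, out_dims={"en": 5, "fr": 6})
p_en = net.forward([0.2, -0.1, 0.4], "en")
```

Because the hidden layers are shared, acoustic data from every language updates the same internal representation, while each language keeps its own output distribution over its own set of targets.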

3. 20100228548 - Enhanced automatic speech recognition using mapping between unsupervised and supervised speech model parameters trained on same acoustic training data
US, 09.09.2010
Int. Class G10L 15/06
  G: PHYSICS
  10: MUSICAL INSTRUMENTS; ACOUSTICS
  L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
  15: Speech recognition
  06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
Appl. No. 12400528 | Applicant: Microsoft Corporation | Inventor: Liu Chaojun

Techniques for enhanced automatic speech recognition are described. An enhanced ASR system may be operative to generate an error correction function. The error correction function may represent a mapping between a supervised set of parameters and an unsupervised training set of parameters generated using a same set of acoustic training data, and apply the error correction function to an unsupervised testing set of parameters to form a corrected set of parameters used to perform speaker adaptation. Other embodiments are described and claimed.
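The error-correction idea in this abstract, learn a mapping from unsupervised to supervised parameter estimates on the same acoustic training data, then apply it to unsupervised test-time parameters, can be illustrated with a simple affine fit. The patent does not specify the functional form of the mapping; the least-squares line below is an assumption for illustration only:

```python
def fit_affine(xs, ys):
    """Least-squares fit of y ~ a*x + b between paired parameter sets."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    return a, my - a * mx

# Paired unsupervised vs. supervised estimates from the same training data
unsup_train = [1.0, 2.0, 3.0, 4.0]
sup_train   = [1.1, 2.1, 3.1, 4.1]  # supervised estimates differ by an offset

a, b = fit_affine(unsup_train, sup_train)
corrected = [a * x + b for x in [5.0, 6.0]]  # correct unsupervised test params
```

The corrected parameters can then stand in for supervised estimates when performing speaker adaptation, which is the role the abstract assigns to the error correction function.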

4. WO/2018/141061 - SYSTEM AND METHOD FOR MEASURING PERCEPTUAL EXPERIENCES
WO, 09.08.2018
Int. Class A61B 5/16
  A: HUMAN NECESSITIES
  61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
  B: DIAGNOSIS; SURGERY; IDENTIFICATION
  5: Measuring for diagnostic purposes; Identification of persons
  16: Devices for psychotechnics; Testing reaction times
Appl. No. PCT/CA2018/050116 | Applicant: CEREBIAN INC. | Inventor: AYYAD, Karim

There is provided a method for determining perceptual experiences. The method comprises obtaining a plurality of signals acquired by a measurement device comprising a plurality of sensors positioned to measure brain activity of users being measured by the measurement device; providing the plurality of signals, without pre-processing, to a processing system comprising at least one deep learning module, the at least one deep learning module being configured to process the signals to generate at least one capability, wherein combinations of one or more of the at least one capability form the perceptual experiences; and providing an output corresponding to a combination of one or more of the at least one capability to an application utilizing the corresponding perceptual experience.

5. 2309487 - Automatic speech recognition system integrating multiple sequence alignment for model bootstrapping
EP, 13.04.2011
Int. Class G10L 15/14
  G: PHYSICS
  10: MUSICAL INSTRUMENTS; ACOUSTICS
  L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
  15: Speech recognition
  08: Speech classification or search
  14: using statistical models, e.g. Hidden Markov Models
Appl. No. 09171957 | Applicant: HONDA RES INST EUROPE GMBH | Inventor: AYLLON CLEMENTE IRENE

The invention proposes a method for training an automatic speech recognition system (which can be used e.g. for robots, hearing aids, speech-controlled machines, ...), the automatic speech recognition system using Hidden Markov Models (HMMs) trained by training data, the HMMs being initialized by applying a bootstrapping method on a reduced set of training data, wherein the bootstrapping method transforms an ergodic HMM topology to a Bakis topology for word-level HMM initialization, the method comprising the following steps: a. Calculation of a cost matrix assigning different costs to the permutation of state transitions as observed in the state sequences resulting from the decoding of the training data, b. Calculation of a similarity measure for all sequences based on the cost matrix, and c. Merging of all sequences based on their similarity measure until only one sequence is left.
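Step (b) of this abstract computes a similarity measure between decoded state sequences from a cost matrix over state substitutions. A standard way to do this is dynamic-programming alignment, sketched below in plain Python; the gap penalty and the toy cost matrix are illustrative assumptions, not values from the patent:

```python
def align_cost(seq_a, seq_b, cost, gap=1.0):
    """Dynamic-programming alignment cost between two state sequences,
    using a cost matrix for substituting one HMM state for another.
    Lower cost means more similar sequences."""
    n, m = len(seq_a), len(seq_b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * gap
    for j in range(1, m + 1):
        d[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(d[i - 1][j] + gap,            # delete a state
                          d[i][j - 1] + gap,            # insert a state
                          d[i - 1][j - 1]               # substitute a state
                          + cost[seq_a[i - 1]][seq_b[j - 1]])
    return d[n][m]

# Toy cost matrix over 3 HMM states: zero cost for identical states
cost = [[0.0, 1.0, 1.0],
        [1.0, 0.0, 1.0],
        [1.0, 1.0, 0.0]]
same = align_cost([0, 1, 2], [0, 1, 2], cost)      # identical sequences
reversed_ = align_cost([0, 1, 2], [2, 1, 0], cost)  # dissimilar ordering
```

Step (c)'s merging can then proceed greedily: repeatedly merge the pair of sequences with the lowest alignment cost until a single representative sequence remains for word-level HMM initialization.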

6. 3384488 - SYSTEM AND METHOD FOR IMPLEMENTING A VOCAL USER INTERFACE BY COMBINING A SPEECH TO TEXT SYSTEM AND A SPEECH TO INTENT SYSTEM
EP, 10.10.2018
Int. Class G10L 15/22
  G: PHYSICS
  10: MUSICAL INSTRUMENTS; ACOUSTICS
  L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
  15: Speech recognition
  22: Procedures used during a speech recognition process, e.g. man-machine dialog
Appl. No. 15909433 | Applicant: FLUENT AI INC | Inventor: TOMAR VIKRANT

The present disclosure relates to speech recognition systems and methods that enable personalized vocal user interfaces. More specifically, the present disclosure relates to combining a self-learning speech recognition system based on semantics with a speech-to-text system optionally integrated with a natural language processing system. The combined system has the advantage of automatically and continually training the semantics-based speech recognition system and increasing recognition accuracy.
7. 20180358005 - System and method for implementing a vocal user interface by combining a speech to text system and a speech to intent system
US, 13.12.2018
Int. Class G10L 15/18
  G: PHYSICS
  10: MUSICAL INSTRUMENTS; ACOUSTICS
  L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
  15: Speech recognition
  08: Speech classification or search
  18: using natural language modelling
Appl. No. 15780576 | Applicant: Fluent.AI Inc. | Inventor: Vikrant Tomar

The present disclosure relates to speech recognition systems and methods that enable personalized vocal user interfaces. More specifically, the present disclosure relates to combining a self-learning speech recognition system based on semantics with a speech-to-text system optionally integrated with a natural language processing system. The combined system has the advantage of automatically and continually training the semantics-based speech recognition system and increasing recognition accuracy.

8. 20200013391 - Acoustic information based language modeling system and method
US, 09.01.2020
Int. Class G10L 15/18
  G: PHYSICS
  10: MUSICAL INSTRUMENTS; ACOUSTICS
  L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
  15: Speech recognition
  08: Speech classification or search
  18: using natural language modelling
Appl. No. 16575317 | Applicant: LG ELECTRONICS INC. | Inventor: Seon Yeong Park

Disclosed are a speech data based language modeling system and method. The speech data based language modeling method includes transcription of text data, and generation of a regional dialect corpus based on the text data and regional dialect-containing speech data and generation of an acoustic model and a language model using the regional dialect corpus. The generation of an acoustic model and a language model is performed by machine learning of an artificial intelligence (AI) algorithm using speech data and marking of word spacing of a regional dialect sentence using a speech data tag. A user is able to use a regional dialect speech recognition service which is improved using 5G mobile communication technologies of eMBB, URLLC, or mMTC.
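Building a language model from a transcribed (here, regional-dialect) text corpus can be illustrated with maximum-likelihood bigram counts, the simplest form of statistical language model. This is a generic textbook sketch, not the patent's AI-based method; the corpus and tokens are invented for illustration:

```python
from collections import Counter

def bigram_counts(corpus):
    """Count bigrams over a tokenized corpus (list of sentence strings),
    with sentence-boundary markers."""
    counts = Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        counts.update(zip(tokens, tokens[1:]))
    return counts

def bigram_prob(counts, prev, word):
    """Maximum-likelihood bigram probability P(word | prev)."""
    total = sum(c for (a, _), c in counts.items() if a == prev)
    return counts[(prev, word)] / total if total else 0.0

corpus = ["the cat sat", "the cat ran"]
counts = bigram_counts(corpus)
p = bigram_prob(counts, "the", "cat")  # "the" is always followed by "cat" here
```

A dialect-specific corpus shifts these probabilities toward dialect word sequences, which is how a corpus-derived language model improves recognition of dialect speech.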

9. 3087780 - SYSTEM AND METHOD FOR MEASURING PERCEPTUAL EXPERIENCES
CA, 09.08.2018
Int. Class A61B 5/16
  A: HUMAN NECESSITIES
  61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
  B: DIAGNOSIS; SURGERY; IDENTIFICATION
  5: Measuring for diagnostic purposes; Identification of persons
  16: Devices for psychotechnics; Testing reaction times
Appl. No. 3087780 | Applicant: CEREBIAN INC. | Inventor: AYYAD, KARIM

There is provided a method for determining perceptual experiences. The method comprises obtaining a plurality of signals acquired by a measurement device comprising a plurality of sensors positioned to measure brain activity of users being measured by the measurement device; providing the plurality of signals, without pre-processing, to a processing system comprising at least one deep learning module, the at least one deep learning module being configured to process the signals to generate at least one capability, wherein combinations of one or more of the at least one capability form the perceptual experiences; and providing an output corresponding to a combination of one or more of the at least one capability to an application utilizing the corresponding perceptual experience.

10. 20200401938 - Machine learning based generation of ontology for structural and functional mapping
US, 24.12.2020
Int. Class G06K 9/00
  G: PHYSICS
  06: COMPUTING; CALCULATING OR COUNTING
  K: GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
  9: Methods or arrangements for recognising patterns
Appl. No. 16888530 | Applicant: The Board of Trustees of the Leland Stanford Junior University | Inventor: Amit Etkin

A method may include applying, to a corpus of data, a first machine learning technique to identify candidate domains of an ontology mapping brain structure to mental function. The corpus of data may include textual data describing a plurality of mental functions and spatial data corresponding to a plurality of brain structures. A second machine technique may be applied to optimize a quantity of domains included in the ontology and/or a quantity of mental function terms included in each domain. The ontology may be applied to phenotype an electronic medical record and predict a clinical outcome for a patient associated with the electronic medical record. Related systems and articles of manufacture, including computer program products, are also provided.