Settings

Offices: all
Language: en (the language of the search keywords)
Stemming: true (stemming reduces inflected words to their stem or root form; for example, the words fishing, fished, fish, and fisher are all reduced to the root word fish, so a search for fisher returns all the different variations)
Single Family Member: false (when true, only one member of each family of patents is returned)
Include NPL: false (when true, non-patent literature is included in the results)

Full Query

AI functional applications / Speech Processing / Speech To Speech

Results

1. 20230117224 Neural network-based message communication framework with summarization and on-demand audio output generation
US, published 20.04.2023. Int. Class G06F 40/58 (Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation). Appl. No 17505971. Applicant: Dell Products L.P. Inventor: Bijan Kumar Mohanty.

Methods, apparatus, and processor-readable storage media for a neural network-based message communication framework with summarization and on-demand audio output generation are provided herein. An example computer-implemented method includes obtaining message communication content; determining intent-related information and domain-related information in the obtained message communication content by processing at least a portion of the obtained message communication content using one or more machine learning-based natural language processing techniques; generating a summarization of the obtained message communication content by processing, using at least one neural network, at least a portion of the obtained message communication content in connection with at least a portion of the determined intent-related information and at least a portion of the domain-related information; converting the generated summarization from a text format to an audio format; and performing at least one automated action based at least in part on the generated summarization in the audio format.
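
The claimed pipeline has a clear shape: NLP analysis for intent and domain, a neural summarizer conditioned on that analysis, then text-to-audio conversion. A minimal Python sketch of that shape follows; the stub functions (detect_intent_and_domain, summarize, to_audio) are hypothetical stand-ins for the patent's trained models, not its actual implementation.

```python
# Hypothetical sketch of the claimed pipeline shape; the stubs stand in
# for the patent's ML components and are not from the patent itself.
from dataclasses import dataclass

@dataclass
class Analysis:
    intent: str   # e.g. "question"
    domain: str   # e.g. "support"

def detect_intent_and_domain(message: str) -> Analysis:
    # Stand-in for the NLP step; a real system would use trained classifiers.
    intent = "question" if message.rstrip().endswith("?") else "statement"
    domain = "support" if "ticket" in message.lower() else "general"
    return Analysis(intent, domain)

def summarize(message: str, analysis: Analysis) -> str:
    # Stand-in for the neural summarizer, conditioned on intent/domain.
    first_sentence = message.split(".")[0].strip()
    return f"[{analysis.domain}/{analysis.intent}] {first_sentence}."

def to_audio(summary: str) -> bytes:
    # Stand-in for the text-to-audio conversion step.
    return summary.encode("utf-8")  # placeholder for synthesized audio

def handle_message(message: str) -> bytes:
    analysis = detect_intent_and_domain(message)
    summary = summarize(message, analysis)
    # The claim's "automated action" (e.g. routing or playback) would go here.
    return to_audio(summary)

print(handle_message("The ticket 4711 is still open. Can you escalate it?"))
```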

2. WO/2024/254360 METHODS AND SYSTEMS FOR TRANSLATION OF NEURAL ACTIVITY INTO EMBODIED DIGITAL-AVATAR ANIMATION
WO, published 12.12.2024. Int. Class A61B 5/372 (Analysis of electroencephalograms). Appl. No PCT/US2024/032886. Applicant: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA. Inventor: CHANG, Edward.

Methods of assisting individuals with communication are provided. In the disclosed methods, cortical activity from a region of the brain associated with movement, speech production, and/or language perception is recorded while an individual attempts to perform an action (e.g., to say words, express an emotion, perform a movement, etc.). Deep learning computational models are used to detect and decode the attempted action from the recorded brain activity. Decoding of actions from brain activity may be aided by the use of self-supervised machine learning techniques, which discretize each action into one or more action representations that serve as an effective intermediary for decoding neural activity patterns and features into meaningful action outputs. Methods for synthesizing decoded actions into audio and/or visual stimuli are also provided, allowing for more naturalistic and expressive communication for individuals who are unable to speak or experience other mobility limitations that inhibit full embodied communication.
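
The abstract's "discretized action representations" suggest a vector-quantization step between neural features and action outputs. The sketch below illustrates that intermediary with a random codebook; the codebook, feature pooling, and action labels are all invented for illustration and do not reproduce the patent's models.

```python
# Illustrative only: vector-quantizing neural features into discrete
# "action representation" codes, the intermediary the abstract describes.
import numpy as np

rng = np.random.default_rng(0)
n_codes, feat_dim = 8, 16
codebook = rng.normal(size=(n_codes, feat_dim))   # learned in practice
code_to_action = {i: f"action_{i}" for i in range(n_codes)}

def quantize(features: np.ndarray) -> int:
    # Nearest-neighbour assignment to a discrete action code.
    dists = np.linalg.norm(codebook - features, axis=1)
    return int(np.argmin(dists))

def decode_trial(neural_window: np.ndarray) -> str:
    features = neural_window.mean(axis=0)          # crude feature pooling
    return code_to_action[quantize(features)]

trial = rng.normal(size=(100, feat_dim))           # 100 time steps of activity
print(decode_trial(trial))
```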

3. 20100312562 Hidden Markov model based text to speech systems employing rope-jumping algorithm
US, published 09.12.2010. Int. Class G10L 13/00 (Speech synthesis; Text to speech systems). Appl. No 12478342. Applicant: Microsoft Corporation. Inventor: Wang Wenlin.

A rope-jumping algorithm is employed in a Hidden Markov Model based text to speech system to determine start and end models and to modify them by setting small co-variances. The modification avoids disordered acoustic parameters caused by violations of parameter constraints and results in a stable line frequency spectrum for the generated speech.
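
The abstract gives one concrete detail: the start and end models are modified by setting small covariances. A toy numpy illustration of why that pins the trajectory boundaries follows; the state means, variances, and epsilon are invented, and real HMM-based synthesis generates trajectories from full state sequences rather than independent samples.

```python
# A guess at the named mechanism: shrinking the variances of the start and
# end states so the generated parameter trajectory cannot wander at the
# boundaries. All values are invented for illustration.
import numpy as np

means = np.array([0.0, 1.0, 2.0, 1.0, 0.0])       # per-state means
variances = np.array([1.0, 1.0, 1.0, 1.0, 1.0])   # per-state variances

def pin_boundaries(var: np.ndarray, eps: float = 1e-4) -> np.ndarray:
    # Small covariance => boundary states become almost deterministic.
    out = var.copy()
    out[0] = eps
    out[-1] = eps
    return out

variances = pin_boundaries(variances)
rng = np.random.default_rng(1)
trajectory = rng.normal(means, np.sqrt(variances))
print(trajectory)  # first and last values stay very close to their means
```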

4. 11856369 Methods and systems implementing phonologically-trained computer-assisted hearing aids
US, published 26.12.2023. Int. Class H04R 25/00 (Deaf-aid sets). Appl. No 17246673. Applicant: Abbas Rafii. Inventor: Abbas Rafii.

A hearing aid system presents a hearing impaired user with customized enhanced intelligibility sound in a preferred language. The system includes a model trained with a set of source speech data representing sampling from a speech population relevant to the user. The model is also trained with a set of corresponding alternative articulation of source data, pre-defined or algorithmically constructed during an interactive session with the user. The model creates a set of selected target speech training data from the set of alternative articulation data that is preferred by the user as being satisfactorily intelligible and clear. The system includes a machine learning model, trained to shift incoming source speech data to a preferred variant of the target data that the hearing aid system presents to the user.
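
One way to read the "shift incoming source speech data to a preferred variant of the target data" step is as a learned feature-to-feature mapping. The sketch below fits such a mapping with ordinary least squares on random stand-in data; the patent's actual model, features, and training procedure are not specified here.

```python
# Minimal sketch, assuming the "shift" can be cast as a feature mapping
# learned from (source, preferred-target) pairs. Data is random and
# purely illustrative; the patent's model is not reproduced here.
import numpy as np

rng = np.random.default_rng(2)
n_frames, n_bands = 200, 32
source = rng.normal(size=(n_frames, n_bands))            # source speech features
preferred_target = source @ rng.normal(size=(n_bands, n_bands)) * 0.1 + source

# Fit the source -> preferred-target mapping (the "shift").
W, *_ = np.linalg.lstsq(source, preferred_target, rcond=None)

def enhance(frame: np.ndarray) -> np.ndarray:
    # Shift an incoming frame toward the user-preferred variant.
    return frame @ W

print(np.allclose(enhance(source[0]), preferred_target[0], atol=1e-6))
```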

5. WO/2024/096908 ON-DEVICE MONITORING AND ANALYSIS OF ON-DEVICE MACHINE LEARNING MODELS
WO, published 10.05.2024. Int. Class G06N 20/00 (Machine learning). Appl. No PCT/US2022/079252. Applicant: GOOGLE LLC. Inventor: AGRAWAL, Akash.

A method (400) includes obtaining a pre-trained machine learning model (210T) from a remote system (150), receiving input data (221) captured by a user device (110), and processing, using an on-device machine learning model (210O) corresponding to the pre-trained machine learning model, the input data to generate a plurality of predicted outputs (222). The method also includes obtaining performance data (250) representing one or more performance characteristics of the on-device machine learning model, the one or more performance characteristics characterizing a performance of the on-device machine learning model based on the plurality of predicted outputs; generating, using the performance data, one or more performance metrics (302) for the on-device machine learning model without exposing content of the input data or the plurality of predicted outputs to the remote system; and transmitting the one or more performance metrics to the remote system.
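
The privacy boundary in the claim is simple to illustrate: inputs and predictions stay on the device, and only aggregate metrics leave it. Below is a minimal sketch under that assumption; the model, metric names, and payload format are invented for illustration, not taken from the patent.

```python
# Sketch of the claimed split: raw inputs and predictions never leave the
# device; only the computed metrics are transmitted. All names invented.
import json
import time
import numpy as np

rng = np.random.default_rng(3)

def on_device_model(x: np.ndarray) -> int:
    return int(x.sum() > 0)          # stand-in for the deployed model

inputs = rng.normal(size=(100, 8))   # captured on the device, never uploaded
labels = (inputs.sum(axis=1) > 0).astype(int)

start = time.perf_counter()
preds = np.array([on_device_model(x) for x in inputs])
elapsed_ms = (time.perf_counter() - start) * 1000

metrics = {
    "accuracy": float((preds == labels).mean()),
    "avg_latency_ms": elapsed_ms / len(inputs),
    "n_examples": len(inputs),
}
payload = json.dumps(metrics)        # only this summary is transmitted
print(payload)
```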

6. 20220012537 Augmentation of audiographic images for improved machine learning
US, published 13.01.2022. Int. Class G10L 15/06 (Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice). Appl. No 17487548. Applicant: Google LLC. Inventor: Daniel Sung-Joon Park.

Generally, the present disclosure is directed to systems and methods that generate augmented training data for machine-learned models via application of one or more augmentation techniques to audiographic images that visually represent audio signals. In particular, the present disclosure provides a number of novel augmentation operations which can be performed directly upon the audiographic image (e.g., as opposed to the raw audio data) to generate augmented training data that results in improved model performance. As an example, the audiographic images can be or include one or more spectrograms or filter bank sequences.
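
Masking is a natural example of an augmentation applied directly to the audiographic image rather than to the raw audio. The sketch below zeroes one frequency band and one time span of a random spectrogram; the mask widths are invented, and the patent covers a broader family of operations.

```python
# Time/frequency masking applied directly to a spectrogram image, in the
# spirit of the abstract; parameters are invented for illustration.
import numpy as np

rng = np.random.default_rng(4)

def mask_spectrogram(spec: np.ndarray, max_freq: int = 8, max_time: int = 20) -> np.ndarray:
    out = spec.copy()
    f = rng.integers(0, max_freq + 1)             # width of frequency mask
    f0 = rng.integers(0, out.shape[0] - f + 1)
    out[f0:f0 + f, :] = 0.0                       # zero a band of frequencies
    t = rng.integers(0, max_time + 1)             # width of time mask
    t0 = rng.integers(0, out.shape[1] - t + 1)
    out[:, t0:t0 + t] = 0.0                       # zero a span of time steps
    return out

spectrogram = rng.random((80, 400))               # mel bins x frames
augmented = mask_spectrogram(spectrogram)
print((augmented == 0).mean())                    # fraction of pixels masked
```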

7. 3753012 DIRECT SPEECH-TO-SPEECH TRANSLATION VIA MACHINE LEARNING
EP, published 23.12.2020. Int. Class G10L 13/033 (Voice editing, e.g. manipulating the voice of the synthesiser). Appl. No 20718959. Applicant: GOOGLE LLC. Inventor: JIA YE.

The present disclosure provides systems and methods that train and use machine-learned models such as, for example, sequence-to-sequence models, to perform direct and text-free speech-to-speech translation. In particular, aspects of the present disclosure provide an attention-based sequence-to-sequence neural network which can directly translate speech from one language into speech in another language, without relying on an intermediate text representation.
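
The attention mechanism at the heart of such a model can be shown in a few lines: a decoder state scores the encoder frames, and the softmax-weighted context is projected to the next target spectrogram frame, with no text representation anywhere in the loop. The weights and dimensions below are random stand-ins; a real system learns them end to end.

```python
# One dot-product attention step of an attention-based seq2seq model,
# reduced to numpy. All weights are random stand-ins for learned ones.
import numpy as np

rng = np.random.default_rng(5)
src_frames, d = 50, 64
encoder_states = rng.normal(size=(src_frames, d))   # encoded source speech
decoder_state = rng.normal(size=d)                  # current decoder state

scores = encoder_states @ decoder_state / np.sqrt(d)
weights = np.exp(scores - scores.max())
weights /= weights.sum()                            # softmax attention weights

context = weights @ encoder_states                  # attended source summary
out_proj = rng.normal(size=(d, 80))                 # projection to 80 mel bins
next_frame = context @ out_proj                     # next target spectrogram frame
print(next_frame.shape)                             # (80,)
```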

8. WO/2020/205233 DIRECT SPEECH-TO-SPEECH TRANSLATION VIA MACHINE LEARNING
WO, published 08.10.2020. Int. Class G10L 13/033 (Voice editing, e.g. manipulating the voice of the synthesiser). Appl. No PCT/US2020/023169. Applicant: GOOGLE LLC. Inventor: JIA, Ye.

The present disclosure provides systems and methods that train and use machine-learned models such as, for example, sequence-to-sequence models, to perform direct and text-free speech-to-speech translation. In particular, aspects of the present disclosure provide an attention-based sequence-to-sequence neural network which can directly translate speech from one language into speech in another language, without relying on an intermediate text representation.

9. 20240221718 SYSTEMS AND METHODS FOR PROVIDING LOW LATENCY USER FEEDBACK ASSOCIATED WITH A USER SPEAKING SILENTLY
US, published 04.07.2024. Int. Class G10L 13/027 (Concept to speech synthesisers; Generation of natural phrases from machine-based concepts). Appl. No 18526682. Applicant: Wispr AI, Inc. Inventor: Tanay Kothari.

Methods and systems are provided for detecting and synthesizing a user's speech, including, for example, vocalized, whispered, and silent speech for the purpose of providing an output substantially in parallel with the user speaking. Such information may be detected by one or more sensors such as, for example, electromyography (EMG) sensors used to monitor and record electrical activity produced by muscles that are activated, for example, speech muscles activated when the user is speaking. Other sensor types may be used, such as audio, optical, inertial measurement unit (IMU), or other types of sensors. The user's speech may be synthesized using one or more machine learning models or a machine learning model in conjunction with other suitable processing devices and methods.
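
A typical front end for EMG-driven silent speech is windowed energy features over the sensor signal. The sketch below computes RMS features over a simulated EMG channel and applies a crude activity gate; the sampling rate, window sizes, and threshold are assumptions for illustration, not values from the patent.

```python
# Illustrative only: windowing a simulated EMG channel into RMS features,
# the kind of front end that could feed downstream speech-synthesis models.
import numpy as np

rng = np.random.default_rng(6)
fs = 1000                                  # Hz, assumed EMG sampling rate
signal = rng.normal(scale=0.1, size=fs * 2)
signal[500:900] += rng.normal(scale=1.0, size=400)   # burst of muscle activity

def rms_windows(x: np.ndarray, win: int = 100, hop: int = 50) -> np.ndarray:
    starts = range(0, len(x) - win + 1, hop)
    return np.array([np.sqrt(np.mean(x[s:s + win] ** 2)) for s in starts])

features = rms_windows(signal)
speaking = features > 0.3                  # crude "user is articulating" gate
print(speaking.astype(int))
```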

10. 20230334249 USING MACHINE LEARNING FOR INDIVIDUAL CLASSIFICATION
US, published 19.10.2023. Int. Class G06F 40/30 (Semantic analysis). Appl. No 17722566. Applicant: Dell Products L.P. Inventor: Dhilip S. Kumar.

A method comprises analyzing a plurality of natural language inputs associated with at least one user, and determining a plurality of contexts for the plurality of natural language inputs based, at least in part, on the analysis. In the method, a plurality of relationships linked to the at least one user are identified based, at least in part, on the analysis, and the at least one user is classified in one or more categories based, at least in part, on the plurality of contexts and the plurality of relationships. At least one of the analyzing, determining, identifying and classifying is performed using one or more machine learning models.
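
The claim's three steps (derive per-input contexts, collect relationship links, classify from both) can be made concrete with a toy rule-based rendering. The rules, categories, and messages below are invented, and the patent uses machine learning models rather than the hand-written rules shown here.

```python
# Toy rendering of the claimed steps: contexts, relationships, classification.
# Everything here is invented for illustration.
from collections import Counter

messages = [
    "Meeting with Alice about the cloud migration",
    "Call Bob re: storage array order",
    "Lunch with Alice",
]

def context_of(text: str) -> str:
    # Stand-in for the context-determination model.
    t = text.lower()
    return "infrastructure" if "cloud" in t or "storage" in t else "social"

def relationships_of(text: str) -> list[str]:
    # Stand-in for relationship extraction: capitalized non-keyword tokens.
    skip = ("Meeting", "Call", "Lunch")
    return [w for w in text.replace(":", " ").split()
            if w.istitle() and w not in skip]

contexts = Counter(context_of(m) for m in messages)
links = Counter(r for m in messages for r in relationships_of(m))
category = "it_buyer" if contexts["infrastructure"] >= 2 else "general"
print(contexts, links, category)
```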