1. WO2020117586 - SPEECH INPUT PROCESSING

Publication Number WO/2020/117586
Publication Date 11.06.2020
International Application No. PCT/US2019/063555
International Filing Date 27.11.2019
IPC
G10L: G PHYSICS > 10 MUSICAL INSTRUMENTS; ACOUSTICS > L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
  • G10L 15/18 (2013.01): 15 Speech recognition > 08 Speech classification or search > 18 using natural language modelling
  • G10L 15/22 (2006.01): 15 Speech recognition > 22 Procedures used during a speech recognition process, e.g. man-machine dialog
  • G10L 15/16 (2006.01): 15 Speech recognition > 08 Speech classification or search > 16 using artificial neural networks
CPC
G10L: G PHYSICS > 10 MUSICAL INSTRUMENTS; ACOUSTICS > L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
  • G10L 15/16: 15 Speech recognition > 08 Speech classification or search > 16 using artificial neural networks
  • G10L 15/1815: 15 Speech recognition > 08 Speech classification or search > 18 using natural language modelling > 1815 Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
  • G10L 15/1822: 15 Speech recognition > 08 Speech classification or search > 18 using natural language modelling > 1822 Parsing for meaning understanding
  • G10L 15/187: 15 Speech recognition > 08 Speech classification or search > 18 using natural language modelling > 183 using context dependencies, e.g. language models > 187 Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
  • G10L 15/197: 15 Speech recognition > 08 Speech classification or search > 18 using natural language modelling > 183 using context dependencies, e.g. language models > 19 Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules > 197 Probabilistic grammars, e.g. word n-grams
  • G10L 15/22: 15 Speech recognition > 22 Procedures used during a speech recognition process, e.g. man-machine dialogue
Applicants
  • GOOGLE LLC [US]/[US]
Inventors
  • ALEKSIC, Petar
  • MORENO MENGIBAR, Pedro J.
  • VELIKOVICH, Leonid
Agents
  • KRUEGER, Brett A.
Priority Data
62/774,507 03.12.2018 US
Publication Language English (EN)
Filing Language English (EN)
Title
(EN) SPEECH INPUT PROCESSING
(FR) TRAITEMENT D’ENTRÉES VOCALES
Abstract
(EN)
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for implementing contextual grammar selection are disclosed. In one aspect, a method includes the actions of receiving audio data of an utterance. The actions further include generating a word lattice that includes multiple candidate transcriptions of the utterance and that includes transcription confidence scores. The actions further include determining a context of the computing device. The actions further include identifying, based on the context of the computing device, grammars that correspond to the multiple candidate transcriptions. The actions further include determining, for each of the multiple candidate transcriptions, grammar confidence scores that reflect a likelihood that a respective grammar is a match for a respective candidate transcription. The actions further include selecting, from among the candidate transcriptions, a candidate transcription. The actions further include providing, for output, the selected candidate transcription as a transcription of the utterance.
(FR)
La présente invention concerne des procédés, des systèmes et un appareil, incluant des programmes informatiques codés sur un support de stockage informatique, pour mettre en œuvre une sélection de grammaire contextuelle. Dans un aspect, un procédé comprend les actions consistant à recevoir des données audio d’un énoncé. Les actions comprennent en outre la génération d’un réseau de mots qui inclut de multiples transcriptions candidates de l’énoncé et qui comprend des scores de confiance de transcription. Les actions comprennent en outre la détermination d’un contexte du dispositif informatique. Les actions comprennent en outre, sur la base du contexte du dispositif informatique, l’identification de grammaires qui correspondent aux multiples transcriptions candidates. Les actions comprennent en outre la détermination, pour chacune des multiples transcriptions candidates, de scores de confiance de grammaire qui reflètent une vraisemblance qu’une grammaire respective soit une correspondance pour une transcription candidate respective. Les actions comprennent en outre la sélection, parmi les transcriptions candidates, d’une transcription candidate. Les actions comprennent en outre la fourniture, aux fins de sortie, de la transcription candidate sélectionnée en tant que transcription de l’énoncé.
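The sequence of actions in the abstract can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the word lattice is flattened into a plain list of candidates, grammars are stood in for by regular expressions, and the names Candidate, Grammar, grammar_confidence, select_transcription, the floor value 0.1, and the rule of multiplying the two scores are all assumptions made for illustration. The abstract does not state how the scores are combined.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    transcription_confidence: float  # score carried by the (flattened) word lattice


@dataclass(frozen=True)
class Grammar:
    name: str
    pattern: str          # hypothetical: a regex standing in for a real grammar
    contexts: frozenset   # device contexts in which this grammar is considered


NO_MATCH_FLOOR = 0.1  # assumed floor so non-matching candidates keep a nonzero score


def grammars_for_context(grammars, context):
    """Identify the grammars that apply in the given device context."""
    return [g for g in grammars if context in g.contexts]


def grammar_confidence(grammar, candidate):
    """Toy grammar confidence: 1.0 on a full match, the floor otherwise."""
    return 1.0 if re.fullmatch(grammar.pattern, candidate.text) else NO_MATCH_FLOOR


def select_transcription(candidates, grammars, context):
    """Pick the candidate with the best combined score (assumed: a product)."""
    active = grammars_for_context(grammars, context)

    def combined(candidate):
        best_grammar_score = max(
            (grammar_confidence(g, candidate) for g in active),
            default=NO_MATCH_FLOOR,
        )
        return candidate.transcription_confidence * best_grammar_score

    return max(candidates, key=combined)


# Illustrative data only; none of it comes from the patent record.
CANDIDATES = [
    Candidate("set a llama for ten", 0.55),
    Candidate("set alarm for ten", 0.45),
]
GRAMMARS = [
    Grammar("alarm", r"set (an )?alarm for \w+", frozenset({"clock_app"})),
    Grammar("play", r"play .+", frozenset({"music_app"})),
]

if __name__ == "__main__":
    for ctx in ("clock_app", "music_app"):
        print(ctx, "->", select_transcription(CANDIDATES, GRAMMARS, ctx).text)
```

In this sketch the clock context enables the alarm grammar, so the acoustically weaker candidate "set alarm for ten" wins. In the music context no grammar matches either candidate, and the choice falls back to the transcription confidence alone.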