1. WO2020112376 - AUDIO PIPELINE FOR SIMULTANEOUS KEYWORD SPOTTING, TRANSCRIPTION, AND REAL TIME COMMUNICATIONS

Publication Number WO/2020/112376
Publication Date 04.06.2020
International Application No. PCT/US2019/061566
International Filing Date 14.11.2019
IPC
G10L 15/22 2006.01
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
15 Speech recognition
22 Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 15/08 2006.01
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
15 Speech recognition
08 Speech classification or search
CPC
G10L 15/08
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
15 Speech recognition
08 Speech classification or search
G10L 15/22
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
15 Speech recognition
22 Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 15/26
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
15 Speech recognition
26 Speech to text systems
G10L 2015/088
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
15 Speech recognition
08 Speech classification or search
088 Word spotting
G10L 2015/223
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
15 Speech recognition
22 Procedures used during a speech recognition process, e.g. man-machine dialogue
223 Execution procedure of a spoken command
G10L 2021/02082
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
21 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
02 Speech enhancement, e.g. noise reduction or echo cancellation
0208 Noise filtering
02082 the noise being echo, reverberation of the speech
Applicants
  • MICROSOFT TECHNOLOGY LICENSING, LLC [US]/[US]
Inventors
  • VELAYUTHAM, Senthil
  • SRINIVASAN, Sriram
Agents
  • MINHAS, Sandip S.
  • ADJEMIAN, Monica
  • BARKER, Doug
  • CHATTERJEE, Aaron C.
  • CHEN, Wei-Chen Nicholas
  • CHOI, Daniel
  • CHURNA, Timothy
  • DINH, Phong
  • EVANS, Patrick
  • GABRYJELSKI, Henry
  • GOLDSMITH, Micah P.
  • GUPTA, Anand
  • HINOJOSA-SMITH, Brianna L.
  • HWANG, William C.
  • JARDINE, John S.
  • LEE, Sunah
  • LEMMON, Marcus
  • MARQUIS, Thomas
  • MEYERS, Jessica
  • ROPER, Brandon
  • SPELLMAN, Steven
  • SULLIVAN, Kevin
  • SWAIN, Cassandra T.
  • TABOR, Ben
  • WALKER, Matt
  • WIGHT, Stephen A.
  • WISDOM, Gregg
  • WONG, Ellen
  • WONG, Thomas S.
  • ZHANG, Hannah
  • HOLMES, Danielle J.
  • TRAN, Kimberly
Priority Data
16/203,963 29.11.2018 US
Publication Language English (EN)
Filing Language English (EN)
Title
(EN) AUDIO PIPELINE FOR SIMULTANEOUS KEYWORD SPOTTING, TRANSCRIPTION, AND REAL TIME COMMUNICATIONS
(FR) PIPELINE AUDIO POUR LE REPÉRAGE DE MOTS-CLÉS, LA TRANSCRIPTION ET LES COMMUNICATIONS EN TEMPS RÉEL SIMULTANÉS
Abstract
(EN)
Disclosed in some examples are methods, systems, and machine-readable mediums for preventing unintended activation of voice command processing of a voice-activated device. A first audio signal may be an audio signal that is to be output to a speaker communicatively coupled to the computing device. A second audio signal may be input from a microphone or other audio capture device. Both audio signals are input to a keyword detector to check for the presence of activation keywords. If the activation keyword(s) are detected in the second audio signal but not in the first audio signal, the voice command processing of the device is activated, as this is likely a command from the user and not feedback from the loudspeaker.
(FR)
Dans certains exemples, la présente invention concerne des procédés, des systèmes et des supports lisibles par machine pour éviter l’activation accidentelle d’un traitement de commande vocale d’un dispositif activé par la voix. Un premier signal audio peut être un signal audio qui doit être délivré à un haut-parleur couplé en communication au dispositif informatique. Un deuxième signal audio peut être entré depuis un microphone ou un autre dispositif de capture audio. Les deux signaux audio sont entrés dans un détecteur de mot-clé pour vérifier la présence de mots-clés d’activation. Si le(s) mot(s)-clé(s) d’activation est/sont détecté(s) dans le deuxième signal audio mais pas le premier signal audio, le traitement de commande vocale du dispositif est activé car il s’agit probablement d’une commande de l’utilisateur et non d’un retour provenant du haut-parleur.
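The decision rule described in the abstract can be illustrated with a minimal sketch. It is not the applicants' implementation: the names `make_text_detector` and `should_activate` are hypothetical, and the keyword detector is replaced by a simple substring match on text that stands in for the two audio signals. A real detector would operate on audio.

```python
from typing import Callable, Iterable

# Hypothetical detector type: returns True if an activation keyword is present.
# In this sketch a "signal" is plain text standing in for audio (an assumption).
Detector = Callable[[str], bool]


def make_text_detector(keywords: Iterable[str]) -> Detector:
    """Build a toy keyword detector that does case-insensitive substring matching."""
    lowered = [k.lower() for k in keywords]

    def detect(signal: str) -> bool:
        text = signal.lower()
        return any(k in text for k in lowered)

    return detect


def should_activate(playback_signal: str, mic_signal: str, detect: Detector) -> bool:
    """Apply the rule from the abstract.

    playback_signal: the first audio signal (to be output to the speaker).
    mic_signal: the second audio signal (captured by the microphone).
    Activation happens only if the keyword is found in the microphone signal
    and not in the playback signal.
    """
    in_playback = detect(playback_signal)
    in_mic = detect(mic_signal)
    return in_mic and not in_playback


if __name__ == "__main__":
    detect = make_text_detector(["hey device"])
    # Keyword spoken by the user only: activate.
    print(should_activate("background music", "Hey device, call Bob", detect))
    # Keyword present in the speaker output too: treat as loudspeaker feedback.
    print(should_activate("say hey device", "say hey device", detect))
```

Running both signals through the same detector keeps the comparison symmetric: a keyword that appears in the speaker-bound signal is attributed to loudspeaker feedback rather than to the user.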