1. WO2020111676 - VOICE RECOGNITION DEVICE AND METHOD

Publication Number WO/2020/111676
Publication Date 04.06.2020
International Application No. PCT/KR2019/016181
International Filing Date 22.11.2019
IPC
G10L 25/75 2013.01
  G PHYSICS
  10 MUSICAL INSTRUMENTS; ACOUSTICS
  L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
  25 Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00-G10L 21/00
  75 for modelling vocal tract parameters
G10L 15/06 2006.01
  G PHYSICS
  10 MUSICAL INSTRUMENTS; ACOUSTICS
  L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
  15 Speech recognition
  06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
CPC
G10L 15/06
  G PHYSICS
  10 MUSICAL INSTRUMENTS; ACOUSTICS
  L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
  15 Speech recognition
  06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G10L 25/75
  G PHYSICS
  10 MUSICAL INSTRUMENTS; ACOUSTICS
  L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
  25 Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00-G10L 21/00
  75 for modelling vocal tract parameters
Applicants
  • 삼성전자 주식회사 SAMSUNG ELECTRONICS CO., LTD. [KR]/[KR]
Inventors
  • 김찬우 KIM, Chanwoo
  • 고다다난자야 엔. GOWDA, Dhananjaya N.
  • 김성수 KIM, Sungsoo
  • 신민규 SHIN, Minkyu
  • 헥래리 폴 HECK, Larry Paul
  • 가르그아비나브 GARG, Abhinav
  • 김광윤 KIM, Kwangyoun
  • 쿠마르메훌 KUMAR, Mehul
Agents
  • 리앤목 특허법인 Y.P.LEE, MOCK & PARTNERS
Priority Data
  • 10-2019-0036376  28.03.2019  KR
  • 10-2019-0150494  21.11.2019  KR
  • 62/772,382  28.11.2018  US
  • 62/848,698  16.05.2019  US
Publication Language Korean (KO)
Filing Language Korean (KO)
Title
(EN) VOICE RECOGNITION DEVICE AND METHOD
(FR) DISPOSITIF ET PROCÉDÉ DE RECONNAISSANCE VOCALE
(KO) 음성 인식 장치 및 방법
Abstract
(EN)
The present disclosure relates to an electronic device for recognizing a user's voice and a method by which the electronic device recognizes the user's voice. According to an embodiment, the method may comprise the steps of: acquiring an audio signal divided into a plurality of frames; determining an energy component for each filter bank by applying a filter bank, distributed according to a preconfigured scale, to the frequency spectrum of the audio signal divided into the frames; smoothing the determined energy component for each filter bank; extracting a feature vector of the audio signal on the basis of the smoothed energy component for each filter bank; and recognizing the user's voice in the audio signal by inputting the extracted feature vector into a voice recognition model.
(FR)
La présente invention concerne un dispositif électronique permettant de reconnaître une voix d'utilisateur et un procédé permettant de reconnaître une voix d'utilisateur par le dispositif électronique. Selon un mode de réalisation, un procédé de reconnaissance d'une voix d'utilisateur peut comprendre les étapes consistant : à acquérir un signal audio divisé en une pluralité d'unités de trame ; par application d'une banque de filtres distribuée selon une échelle préconfigurée à un spectre de fréquences du signal audio divisé en unités des trames, à déterminer une composante d'énergie pour chaque banque de filtres ; à lisser la composante d'énergie déterminée pour chaque banque de filtres ; à extraire un vecteur de caractéristiques du signal audio sur la base de la composante d'énergie lissée pour chaque banque de filtres ; et à reconnaître la voix de l'utilisateur dans le signal audio en saisissant le vecteur de caractéristiques extrait dans un modèle de reconnaissance vocale.
(KO)
본 개시는 사용자 음성을 인식하는 전자 장치 및 상기 전자 장치가 사용자 음성을 인식하는 방법에 관한 것이다. 일 실시 예에 의하면, 사용자의 음성을 인식하는 방법은 복수의 프레임 단위로 구분되는 오디오 신호를 획득하는 단계; 상기 프레임 단위로 구분되는 상기 오디오 신호의 주파수 스펙트럼에 기 설정된 스케일에 따라 분포된 필터 뱅크를 적용함으로써 필터 뱅크 별 에너지 성분을 결정하는 단계; 상기 결정된 필터 뱅크 별 에너지 성분을 평탄화(smoothing) 하는 단계; 상기 평탄화된 필터 뱅크 별 에너지 성분에 기초하여 상기 오디오 신호의 특징 벡터를 추출하는 단계; 및 상기 추출된 특징 벡터를 음성 인식 모델에 입력함으로써 상기 오디오 신호 내 상기 사용자의 음성을 인식하는 단계; 를 포함할 수 있다.
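
The steps listed in the abstract (framing, per-filter-bank energy, smoothing, feature extraction) can be illustrated with a minimal sketch. It is not the method claimed in the application. The abstract does not name the scale, the smoothing technique, or the form of the feature vector, so the mel scale, the exponential smoothing over time, and the log-energy features used here are all assumptions. Every function name and parameter value is hypothetical, and the voice recognition model itself is left out.

```python
import cmath
import math


def split_frames(samples, frame_len, hop):
    """Divide the audio samples into overlapping frame units."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (slow, but dependency-free)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) ** 2 / n
            for k in range(n // 2 + 1)]


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filter_bank(num_filters, n_fft, sample_rate):
    """Triangular filters with edges evenly spaced on the mel scale.

    Assumption: the mel scale stands in for the "preconfigured scale".
    """
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (num_filters + 1))
             for i in range(num_filters + 2)]
    bin_hz = sample_rate / n_fft
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        weights = []
        for k in range(n_fft // 2 + 1):
            f = k * bin_hz
            if lo < f <= mid:
                weights.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                weights.append((hi - f) / (hi - mid))
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank


def filter_bank_energies(spectrum, bank):
    """Energy component for each filter: weighted sum of spectral power."""
    return [sum(w * p for w, p in zip(weights, spectrum)) for weights in bank]


def smooth_over_time(energies, alpha=0.5):
    """Assumed smoothing: first-order recursive average across frames."""
    smoothed = []
    for frame in energies:
        if smoothed:
            prev = smoothed[-1]
            frame = [alpha * p + (1.0 - alpha) * e
                     for p, e in zip(prev, frame)]
        smoothed.append(list(frame))
    return smoothed


def extract_features(samples, sample_rate, frame_len=64, hop=32, num_filters=6):
    """Return one log-energy feature vector per frame (assumed feature form)."""
    bank = mel_filter_bank(num_filters, frame_len, sample_rate)
    energies = [filter_bank_energies(power_spectrum(f), bank)
                for f in split_frames(samples, frame_len, hop)]
    # A small floor avoids log(0) for filters that capture no energy.
    return [[math.log(e + 1e-10) for e in frame]
            for frame in smooth_over_time(energies)]


# Usage: a 440 Hz test tone sampled at 8 kHz.
tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(256)]
features = extract_features(tone, 8000)
print(len(features), len(features[0]))
```

In a full system, each row of `features` would be passed to a voice recognition model. The naive DFT keeps the sketch free of third-party dependencies; a practical implementation would use an FFT routine.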