1. CN108538311 - Audio classification method, device and computer readable storage medium

Office China
Application Number 201810332491.8
Application Date 13.04.2018
Publication Number 108538311
Publication Date 14.09.2018
Grant Number 108538311
Grant Date 15.09.2020
Publication Kind B
IPC
G10L 25/24
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
25 Speech or voice analysis techniques not restricted to a single one of groups G10L15/-G10L21/129
03 characterised by the type of extracted parameters
24 the extracted parameters being the cepstrum
G10L 25/30
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
25 Speech or voice analysis techniques not restricted to a single one of groups G10L15/-G10L21/129
27 characterised by the analysis technique
30 using neural networks
CPC
G10L 25/24
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
25 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
03 characterised by the type of extracted parameters
24 the extracted parameters being the cepstrum
G10L 25/30
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
25 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
27 characterised by the analysis technique
30 using neural networks
Applicants TENCENT MUSIC ENTERTAINMENT TECHNOLOGY (SHENZHEN) CO., LTD.
腾讯音乐娱乐科技(深圳)有限公司
Inventors WANG ZHENGTAO
王征韬
ZHANG QING
张庆
Agents 北京三高永信知识产权代理有限责任公司 11138
Title
(EN) Audio classification method, device and computer readable storage medium
(ZH) 音频分类方法、装置及计算机可读存储介质
Abstract
(EN)
The invention discloses an audio classification method, an audio classification device, and a computer-readable storage medium, and belongs to the technical field of electronics. The method comprises: collecting an audio signal; truncating or padding the audio signal to adjust its duration to a preset duration; converting the audio signal into a target audio according to the frequency information of the audio signal; extracting audio features of the target audio through a convolutional network included in a preset classifier; extracting temporal features of the audio features through a gated recurrent network included in the preset classifier; determining, according to the temporal features and through a fully connected network included in the preset classifier, the probability that the category of the target audio is the preset category identified by each of multiple preset category identifiers; and determining the preset category identified by the preset category identifier with the highest probability as the category of the target audio. Because the method does not need to segment the target audio, the integrity of the target audio is preserved and the classification accuracy is relatively high.

(ZH)
本发明公开了一种音频分类方法、装置及计算机可读存储介质,属于电子技术领域。该方法包括:采集音频信号;对音频信号进行截取或补充,以将音频信号的时长调整为预设时长;根据音频信号的频率信息,将音频信号转换为目标音频;通过预设分类器中包括的卷积网络提取目标音频的音频特征;通过预设分类器中包括的门限循环网络提取音频特征的时序特征;根据时序特征,通过预设分类器中包括的全连接网络确定目标音频的类别为多个预设类别标识中每个预设类别标识所标识的预设类别的概率;将多个预设类别标识中概率最大的预设类别标识所标识的预设类别确定为目标音频的类别。本发明无需对目标音频进行分段,保留了目标音频的完整性,分类准确度较高。
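
The first and last steps of the abstract (adjusting the audio signal to a preset duration, and choosing the preset category identifier with the highest probability) can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `fix_duration`, `softmax`, and `pick_category` are hypothetical, and padding with zeros and using a softmax to turn scores into probabilities are assumptions, since the abstract only says the signal is adjusted and that probabilities are determined. The frequency conversion and the convolutional, recurrent, and fully connected networks are left out.

```python
import math


def fix_duration(samples, target_len, pad_value=0.0):
    """Truncate or pad a list of samples to exactly target_len items.

    Padding with a constant value is an assumption made for illustration.
    """
    if target_len < 0:
        raise ValueError("target_len must be non-negative")
    clipped = list(samples[:target_len])
    return clipped + [pad_value] * (target_len - len(clipped))


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_category(scores, category_ids):
    """Return the category identifier with the highest probability."""
    if len(scores) != len(category_ids):
        raise ValueError("scores and category_ids must have equal length")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return category_ids[best], probs[best]
```

Here `scores` stands in for the output of the fully connected network, one value per preset category identifier. Working on a fixed-length input is what lets the classifier process the whole clip at once instead of splitting it into segments.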