1. CN101101752 - Monosyllabic language lip-reading recognition system based on vision character

Office China
Application Number 200710052795.0
Application Date 19.07.2007
Publication Number 101101752
Publication Date 09.01.2008
Grant Number 101101752
Grant Date 01.12.2010
Publication Kind B
IPC
G10L 15/24
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
15 Speech recognition
24 Speech recognition using non-acoustical features
G06K 9/00
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
9 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
Applicants Huazhong University of Science & Technology
华中科技大学
Inventors Wang Tianjiang
王天江
Liu Fang
刘芳
Zhou Huihua
周慧华
Gong Liyu
龚立宇
Chen Gang
陈刚
Agents fangfang
Huazhong University of Science & Technology Patent Center (华中科技大学专利中心) 42201
Title
(EN) Monosyllabic language lip-reading recognition system based on vision character
(ZH) 基于视觉特征的单音节语言唇读识别系统
Abstract
(EN)
A monosyllabic-language lip-reading recognition system based on visual features, in the field of computer intelligent recognition. The system recognizes what a person in a video is saying from the changes in their lip movements. Its aim is to solve lip-reading recognition for monosyllabic languages such as Chinese using video information only. The invention comprises a video decoding module, a lip localization module, a lip-movement segmentation module, a feature extraction module, a corpus, a model building module, and a lip-reading recognition module. The corpus is rich in content and easy to extend. The invention processes only video images and needs no audio data to assist recognition. It can process video files such as avi, wmv, rmvb, and mpg, and so meets the requirement of recognizing spoken content under soundless conditions. The lip-movement segmentation part performs automatic machine segmentation with the single syllable as the recognition target. Compared with fixed-length time segmentation or manual segmentation, this approach is more practical and greatly improves recognition accuracy.
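
The abstract contrasts syllable-targeted segmentation with fixed-length time segmentation but does not say how the segmentation is computed. The sketch below is only an illustration of that contrast, not the patented method. It assumes a hypothetical per-frame lip-opening measure (`opening`, normalized to 0..1) and a simple threshold rule. The function names, the threshold, and the sample values are all assumptions made for this example.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (start_frame, end_frame), end exclusive


def segment_syllables(opening: List[float],
                      threshold: float = 0.2,
                      min_len: int = 2) -> List[Segment]:
    """Split a per-frame lip-opening signal into open-lip runs.

    Assumption for illustration: each run of frames whose opening
    exceeds `threshold` is treated as one syllable candidate.
    Runs shorter than `min_len` frames are discarded as noise.
    """
    segments: List[Segment] = []
    start = None
    for i, value in enumerate(opening):
        if value > threshold and start is None:
            start = i  # lips just opened: a candidate segment begins
        elif value <= threshold and start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None  # lips closed: the segment ends
    # Close a segment that is still open at the end of the signal.
    if start is not None and len(opening) - start >= min_len:
        segments.append((start, len(opening)))
    return segments


def fixed_length_segments(n_frames: int, length: int) -> List[Segment]:
    """Baseline: cut the video into windows of `length` frames."""
    return [(s, min(s + length, n_frames))
            for s in range(0, n_frames, length)]


# Hypothetical lip-opening values for 12 frames containing two syllables.
opening = [0.0, 0.1, 0.5, 0.8, 0.4, 0.1, 0.0, 0.3, 0.7, 0.6, 0.1, 0.0]
adaptive = segment_syllables(opening)
fixed = fixed_length_segments(len(opening), 4)
```

On this sample, the threshold rule isolates the two open-lip runs, while the fixed four-frame windows cut across both of them. That is the kind of boundary mismatch the abstract attributes to fixed-length segmentation.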

(ZH)

基于视觉特征的单音节语言唇读识别系统,属于计算机智能识别技术,根据视频中人物说话时的唇动变化,识别说话内容,目的在于仅利用视频信息,解决如汉语等单音节语言的唇读识别问题。本发明包括视频解码模块、唇部定位模块、唇动分割模块、特征提取模块、语料库、模型建立模块和唇语识别模块;本发明所采用的语料库内容丰富,易于扩充,本发明只需处理视频图像,不需要音频数据进行辅助识别,能够对avi、wmv、rmvb、mpg等视频文件进行处理,满足无声条件下说话内容识别的要求。本发明的唇动分割部分以单音节为识别目标进行机器智能分割,与定长时间分割和手工分割相比,实用性更强,识别准确率得到极大提高。