1. KR1020220013850 - Method and Apparatus for Generating Speech Video

Office Republic of Korea
Application Number 1020200093374
Application Date 27.07.2020
Publication Number 1020220013850
Publication Date 04.02.2022
Publication Kind A
IPC
G10L 21/10
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
21 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
10 Transforming into visible information
G06N 3/04
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
04 Architecture, e.g. interconnection topology
G06N 3/08
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
08 Learning methods
G10L 25/30
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
25 Speech or voice analysis techniques not restricted to a single one of groups G10L15/-G10L21/129
27 characterised by the analysis technique
30 using neural networks
CPC
G10L 21/10
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
21 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
10 Transforming into visible information
G06N 3/0454
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
04 Architectures, e.g. interconnection topology
0454 using a combination of multiple neural nets
G06N 3/08
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
08 Learning methods
G10L 25/30
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
25 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
27 characterised by the analysis technique
30 using neural networks
H04N 21/4307
H ELECTRICITY
04 ELECTRIC COMMUNICATION TECHNIQUE
N PICTORIAL COMMUNICATION, e.g. TELEVISION
21 Selective content distribution, e.g. interactive television or video on demand [VOD]
40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
4302 Content synchronisation processes, e.g. decoder synchronisation
4307 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
G10L 2021/105
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
21 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
10 Transforming into visible information
105 Synthesis of the lips movements from speech, e.g. for talking heads
Applicants DeepBrain AI (주식회사 딥브레인에이아이)
Inventors 채경수
황금별
Agents 두호특허법인
Title
(KO, translated) Method and Apparatus for Generating Speech Video
Abstract
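(KO, translated) A method and apparatus for generating a speech video are disclosed. A speech video generation apparatus according to one disclosed embodiment has one or more processors and a memory that stores one or more programs executed by the one or more processors. It includes a first machine learning model that takes a speech video of a person as input, extracts video features, and reconstructs the speech video from the extracted video features, and a second machine learning model that takes a speech audio signal of the person as input and predicts the video features.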
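
The two-model arrangement in the abstract can be pictured with a small sketch. This is illustrative only and is not the applicant's implementation: the class names, the toy encode/decode functions, and the list-of-floats data type are all hypothetical. The `generate_frame` step, which feeds the second model's predicted feature into the first model's decoder, is an assumption about how the two models would be combined; the abstract itself only states what each model takes as input and produces.

```python
from dataclasses import dataclass
from typing import Callable, List

# Toy stand-in for a video frame, a video feature, or an audio chunk.
Vector = List[float]


@dataclass
class FirstModel:
    """Hypothetical first model: video frame -> video feature -> reconstructed frame."""
    encode: Callable[[Vector], Vector]
    decode: Callable[[Vector], Vector]

    def reconstruct(self, frame: Vector) -> Vector:
        # Extract a feature from the frame, then restore a frame from that feature.
        return self.decode(self.encode(frame))


@dataclass
class SecondModel:
    """Hypothetical second model: speech audio -> predicted video feature."""
    predict_feature: Callable[[Vector], Vector]


def generate_frame(first: FirstModel, second: SecondModel, audio: Vector) -> Vector:
    # ASSUMPTION (not stated in the abstract): the feature predicted from audio
    # is passed to the first model's decoder to produce a video frame.
    return first.decode(second.predict_feature(audio))


# Toy placeholder functions; a real system would use trained neural networks.
def average_pairs(frame: Vector) -> Vector:
    return [(frame[i] + frame[i + 1]) / 2 for i in range(0, len(frame), 2)]


def repeat_twice(feature: Vector) -> Vector:
    return [value for value in feature for _ in range(2)]


def double(audio: Vector) -> Vector:
    return [2.0 * sample for sample in audio]


first = FirstModel(encode=average_pairs, decode=repeat_twice)
second = SecondModel(predict_feature=double)

reconstructed = first.reconstruct([1.0, 3.0, 2.0, 2.0])
generated = generate_frame(first, second, [0.5, 1.0])
```

In this toy setup the feature is half the length of the frame, which mirrors the general idea that the feature is a compact representation that both the video encoder and the audio-driven predictor target.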