1. WO2019194566 - APPARATUS AND METHOD FOR CONVERTING TEXT WITHIN IMAGE TO VOICE

Publication Number WO/2019/194566
Publication Date 10.10.2019
International Application No. PCT/KR2019/003926
International Filing Date 03.04.2019
IPC
G10L 13/08 2006.01
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
13 Speech synthesis; Text to speech systems
08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G06F 15/02 2006.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
15 Digital computers in general; Data processing equipment in general
02 manually operated with input through keyboard and computation using a built-in program, e.g. pocket calculators
CPC
G06F 15/02
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
15 Digital computers in general
02 manually operated with input through keyboard and computation using a built-in program, e.g. pocket calculators
G10L 13/08
G PHYSICS
10 MUSICAL INSTRUMENTS; ACOUSTICS
L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
13 Speech synthesis; Text to speech systems
08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
Applicants
  • 양진호 YANG, Jin Ho [KR]/[KR]
Inventors
  • 양진호 YANG, Jin Ho
Agents
  • 특허법인 메이저 MAJOR PATENT AND LAW FIRM
Priority Data
10-2018-0039789 05.04.2018 KR
10-2018-0076678 02.07.2018 KR
10-2019-0022688 26.02.2019 KR
Publication Language Korean (KO)
Filing Language Korean (KO)
Title
(EN) APPARATUS AND METHOD FOR CONVERTING TEXT WITHIN IMAGE TO VOICE
(FR) APPAREIL ET PROCÉDÉ DE CONVERSION EN VOIX D'UN TEXTE PRÉSENT DANS UNE IMAGE
(KO) 이미지 내의 텍스트 음성 변환 장치 및 방법
Abstract
(EN)
A method for converting text within an image to a voice, according to one embodiment of the present invention, comprises the steps of: allowing an image acquisition unit to acquire an image of a medium in which text is recorded; allowing a text detection unit to detect text in a set area, which is set by a user, within the image by using a maximally stable extremal regions (MSER) algorithm and an edge detection algorithm; allowing a text correction unit to correct a character string within the detected text; allowing a processing unit to process the corrected character string so as to obtain a text file in preset phoneme units; allowing a conversion unit to convert the text file in phoneme units to a voice format on the basis of input information of the user; and allowing an output unit to convert the voice format according to setting information of the user so as to output the same.
(FR)
La présente invention concerne, selon un mode de réalisation, un procédé de conversion en une voix d'un texte présent dans une image, le procédé comprenant les étapes consistant : à permettre à une unité d'acquisition d'image d'acquérir une image d'un support multimédia dans lequel un texte est enregistré; à permettre à une unité de détection de texte de détecter un texte dans une zone définie, qui est définie par un utilisateur, dans l'image au moyen d'un algorithme de régions externes les plus stables (MSER) et d'un algorithme de détection de bord; à permettre à une unité de correction de texte de corriger une chaîne de caractères dans le texte détecté; à permettre à une unité de traitement de traiter la chaîne de caractères corrigée de façon à obtenir un fichier de texte dans des unités phonèmes prédéfinies; à permettre à une unité de conversion de convertir le fichier de texte sous forme d'unités phonèmes en un format vocal sur la base d'informations d'entrée de l'utilisateur; et à permettre à une unité de sortie de convertir le format vocal en fonction des informations de réglage de l'utilisateur de façon à le délivrer en sortie.
(KO)
본 발명의 일 실시예에 따른 이미지 내의 텍스트를 음성으로 변환하는 방법은 이미지 획득부에서 텍스트가 기록된 매체의 이미지를 획득하는 단계; 텍스트 검출부에서 MSER(Maximally Stable External Region) 알고리즘 및 에지 검출 알고리즘을 이용하여 상기 이미지 내에서 사용자가 설정한 설정영역의 텍스트를 검출하는 단계; 텍스트 보정부에서 검출된 텍스트 내의 문자열을 보정하는 단계; 가공부에서 보정된 문자열을 기 설정된 음운 단위의 텍스트 파일로 가공하는 단계; 변환부에서 사용자의 입력정보에 기초하여 상기 음운 단위의 텍스트 파일을 음성 포맷으로 변환하는 단계; 및 출력부에서 상기 음성 포맷을 사용자의 설정정보에 따라 변환하여 출력하는 단계를 포함한다.
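The abstract's processing step turns the corrected character string into "preset phoneme units" but does not say how. As one possible reading only, the sketch below splits precomposed Hangul syllables into their initial, medial, and final jamo using the standard Unicode arithmetic. The function name `to_phoneme_units` and the choice of jamo as the unit are assumptions made for illustration, not details taken from the application.

```python
# Illustrative sketch only: the patent abstract does not define its
# "phoneme units". Here we ASSUME they correspond to Hangul jamo.

# Conjoining jamo tables, in Unicode order.
LEADS = [chr(0x1100 + i) for i in range(19)]          # initial consonants
VOWELS = [chr(0x1161 + i) for i in range(21)]         # medial vowels
TAILS = [""] + [chr(0x11A8 + i) for i in range(27)]   # finals; index 0 = none

HANGUL_BASE = 0xAC00  # first precomposed syllable
HANGUL_LAST = 0xD7A3  # last precomposed syllable


def to_phoneme_units(text):
    """Split each Hangul syllable into jamo; pass other characters through."""
    units = []
    for ch in text:
        code = ord(ch)
        if HANGUL_BASE <= code <= HANGUL_LAST:
            index = code - HANGUL_BASE
            # Each initial consonant spans 21 vowels x 28 final slots.
            lead, rest = divmod(index, 21 * 28)
            vowel, tail = divmod(rest, 28)
            units.append(LEADS[lead])
            units.append(VOWELS[vowel])
            if tail:  # tail index 0 means the syllable has no final consonant
                units.append(TAILS[tail])
        else:
            units.append(ch)
    return units
```

Calling `to_phoneme_units` on a corrected string yields a flat list that a later text-to-speech stage could consume. Non-Hangul characters such as digits or Latin letters are passed through unchanged, and a real system would need its own rules for them.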