PATENTSCOPE
World Intellectual Property Organization
(WO2018088664) DEVICE FOR AUTOMATICALLY DETECTING MORPHEME PART OF SPEECH TAGGING CORPUS ERROR BY USING ROUGH SETS, AND METHOD THEREFOR
Latest bibliographic data on file with the International Bureau

Pub. No.: WO/2018/088664
International Application No.: PCT/KR2017/006916
Publication Date: 17.05.2018
International Filing Date: 29.06.2017
IPC:
G06F 17/27 (2006.01)
Applicants: CHANGWON NATIONAL UNIVERSITY INDUSTRY UNIVERSITY COOPERATION FOUNDATION[KR/KR]; (Sarim-dong) 20, Changwondaehak-ro Uichang-gu, Changwon-si Gyeongsangnam-do 51140, KR
Inventors: CHA, Jeong Won; KR
PARK, Tae Ho; KR
SHIN, Chang Uk; KR
PARK, Da Sol; KR
PARK, Seong Jae; KR
Agent: KIM, Jung Su; KR
Priority Data:
10-2016-0149597 10.11.2016 KR
Title (EN) DEVICE FOR AUTOMATICALLY DETECTING MORPHEME PART OF SPEECH TAGGING CORPUS ERROR BY USING ROUGH SETS, AND METHOD THEREFOR
(FR) DISPOSITIF DE DÉTECTION AUTOMATIQUE D'ERREUR DE CORPUS D'ÉTIQUETAGE MORPHOSYNTAXIQUE AU MOYEN D'ENSEMBLES APPROXIMATIFS, ET PROCÉDÉ ASSOCIÉ
(KO) 러프 셋을 이용한 형태소 품사 태깅 코퍼스 오류 자동 검출 장치 및 그 방법
Abstract:
(EN) The device of the present invention for detecting errors in a morpheme part-of-speech tagging corpus comprises: an attribute generating unit (120) that generates attributes for the word phrases included in an input corpus by using a kernel to which rough set theory is applied; and an attribute statistics processing unit (130) that counts the attributes of identical word phrases, calculates the attributes and their frequencies for those word phrases, and thereby generates part-of-speech tagging corpus error data. The invention can thus detect, quantify, and correct errors in a corpus (training data) needed to train recognizers and classifiers for natural language processing.
(FR) Un dispositif de détection d'une erreur de corpus d'étiquetage morphosyntaxique, selon la présente invention, comprend : une unité de génération d'attributs (120) pour générer des attributs pour des phrases de mots incluses dans un corpus d'entrée, à l'aide d'un noyau auquel une théorie d'ensembles approximatifs est appliquée ; et une unité de traitement de statistiques d'attributs (130) pour générer des données d'erreur de corpus d'étiquetage morphosyntaxique par un calcul d'attributs et un comptage fréquentiel pour la même phrase de mots par comptage d'attributs pour la même phrase de mots parmi les phrases de mots. La présente invention permet ainsi de détecter, quantifier et modifier des erreurs incluses dans un corpus (données d'apprentissage) nécessaires à l'apprentissage pour la génération et la reconnaissance de classificateurs pour un traitement naturel de la langue.
(KO) 본 발명의 형태소 태깅 코퍼스 오류 검출 장치는 입력된 코퍼스에 포함된 어절들에 대하여 러프 셋 이론을 적용한 커널을 이용하여 자질을 생성하는 자질생성부(120); 및 상기 어절들 중 동일 어절에 대한 자질을 카운트하여 동일 어절들에 대한 자질들과 빈도수를 산출하는 것에 의해 품사 태깅 코퍼스 오류 데이터를 생성하는 자질통계부(130);을 포함하여 구성되어, 자연어 처리를 위한 인식 및 분류기 생성을 위해 학습에 필요한 코퍼스(corpus, 말뭉치, 학습데이터)에 포함되는 오류를 검출하여 정량화할 수 있도록 하고 수정할 수 있도록 한다.
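The counting step described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not specify the kernel, so the sketch assumes the simplest possible attribute (the surface form of the word phrase alone) and treats the tag analysis as the decision value. In rough set terms, word phrases that always receive the same analysis fall in a consistent class, while word phrases with conflicting analyses form a boundary region. The function name `find_tagging_candidates` and the `min_ratio` threshold are hypothetical.

```python
from collections import Counter, defaultdict


def find_tagging_candidates(corpus, min_ratio=0.2):
    """Flag rare tag analyses of word phrases that are tagged inconsistently.

    corpus: iterable of (word_phrase, tag_analysis) pairs.
    min_ratio: hypothetical threshold; analyses whose relative frequency
        for a word phrase is below it are reported as error candidates.
    Returns a list of (word_phrase, tag_analysis, frequency, total) tuples.
    """
    # Assumption: the only attribute is the surface form itself, so every
    # occurrence of the same word phrase lands in the same class.
    counts = defaultdict(Counter)
    for phrase, analysis in corpus:
        counts[phrase][analysis] += 1

    report = []
    for phrase, analyses in counts.items():
        if len(analyses) < 2:
            # Consistently tagged class: nothing to flag.
            continue
        total = sum(analyses.values())
        # Boundary region: the same word phrase carries conflicting analyses.
        for analysis, freq in analyses.items():
            if freq / total < min_ratio:
                report.append((phrase, analysis, freq, total))
    return report


if __name__ == "__main__":
    # Toy data invented for illustration only.
    toy_corpus = (
        [("학교에", "학교/NNG+에/JKB")] * 9
        + [("학교에", "학교/NNP+에/JKB")]
        + [("간다", "가/VV+ㄴ다/EF")] * 3
    )
    for row in find_tagging_candidates(toy_corpus):
        print(row)
```

A flagged entry is only a candidate for review: a rare analysis may be a genuine ambiguity rather than an annotation error, which is why the sketch reports frequencies instead of rewriting tags.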
Designated States: AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BN, BR, BW, BY, BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DJ, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IR, IS, JO, JP, KE, KG, KH, KN, KP, KW, KZ, LA, LC, LK, LR, LS, LU, LY, MA, MD, ME, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PA, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SA, SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW
African Regional Intellectual Property Organization (ARIPO) (BW, GH, GM, KE, LR, LS, MW, MZ, NA, RW, SD, SL, ST, SZ, TZ, UG, ZM, ZW)
Eurasian Patent Office (AM, AZ, BY, KG, KZ, RU, TJ, TM)
European Patent Office (EPO) (AL, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LT, LU, LV, MC, MK, MT, NL, NO, PL, PT, RO, RS, SE, SI, SK, SM, TR)
African Intellectual Property Organization (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, KM, ML, MR, NE, SN, TD, TG)
Publication Language: Korean (KO)
Filing Language: Korean (KO)