PATENTSCOPE

World Intellectual Property Organization
(WO2016093532) ASSOCIATED KEYWORD EXTRACTION METHOD BASED ON NORMALIZED KEYWORD WEIGHT
Latest bibliographic data on file with the International Bureau   

Pub. No.: WO/2016/093532
International Application No.: PCT/KR2015/012949
Publication Date: 16.06.2016
International Filing Date: 01.12.2015
IPC:
G06F 17/30 (2006.01)
Applicants: WISENUT, INC. [KR/KR]; (Sampyungdong, DTC Tower Floor 5&6) 49 Daewangpangyo-ro 644 gil Bundang-gu, Seongnam-si Gyeonggi-do 13493 (KR)
Inventors: HAN, Kyuyeol; (KR).
AHN, Youngmin; (KR)
Agent: LIM, Seungseop; (KR)
Priority Data:
10-2014-0177226 10.12.2014 KR
Title (EN) ASSOCIATED KEYWORD EXTRACTION METHOD BASED ON NORMALIZED KEYWORD WEIGHT
(FR) PROCÉDÉ D'EXTRACTION DE MOT-CLÉ ASSOCIÉ BASÉ SUR UN POIDS DE MOT-CLÉ NORMALISÉ
(KO) 정규화된 키워드 가중치에 기반한 연관 키워드 추출 방법
Abstract:
(EN)The present invention relates to a method for extracting associated keywords from a set of documents in a document database. The method comprises: (a) generating, by a computer device, keyword candidates from the set of documents; (b) assigning a first weight to each generated keyword and normalizing the keyword weight by the sum of weights across all documents; (c) calculating the degree of association of each pair of co-occurring keywords as the sum, for each keyword, of the weights in the documents in which the pair co-occurs, thereby obtaining, per keyword, the weight of its associated keywords (second weight); and (d) determining the ranking of the keyword pairs using the degree of association to which the second weight has been assigned.
(FR)La présente invention concerne un procédé pour extraire des mots-clés associés dans un ensemble de documents dans une base de données de documents. Le procédé de la présente invention consiste : a) à générer, par un dispositif informatique, un mot-clé candidat à partir de l'ensemble de documents; b) à fournir un premier poids par rapport au mot-clé généré et normaliser le poids de mot-clé à la somme des poids dans tous les documents; c) à calculer le degré d'association d'une paire de mots-clés qui apparaissent simultanément comme la somme des poids dans des documents qui apparaissent simultanément par rapport à chaque mot-clé, pour calculer ainsi, par mot-clé, le poids (second poids) de mots-clés associés; et d) à déterminer le classement de la paire de mots-clés à l'aide du degré d'association de la paire de mots-clés à laquelle le second poids est fourni.
(KO)본 발명은 문서 데이터베이스에 있는 문서 집합에서 연관 키워드를 추출하는 방법에 관한 것이다. 본 발명의 방법은, (a) 컴퓨터 디바이스가 상기 문서 집합에서 키워드 후보를 생성하는 단계와, (b) 생성된 키워드에 대해서 제 1 가중치를 부여하고 모든 문서에서의 가중치합으로 키워드 가중치를 정규화하는 단계와, (c) 동시 출현한 키워드쌍의 연관도를 각 키워드에 대해서 동시 출현한 문서에서의 가중치 합으로 계산함으로써 키워드별 연관 키워드의 가중치(제 2 가중치)를 계산하는 단계와, (d) 상기 제 2 가중치가 부여된 키워드쌍의 연관도를 이용하여 상기 키워드쌍의 랭킹을 결정하는 단계를 포함한다.
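Steps (a) to (d) of the abstract can be illustrated with a minimal Python sketch. It is one possible reading, not the claimed implementation. The abstract does not say how candidates are generated, what the first weight is, or which keyword's weight is summed in step (c), so the choices below are assumptions: every distinct token is a candidate, the first weight is raw term frequency, and the score of keyword b as an associate of keyword a is the sum of a's normalized weight over the documents where both occur. The function name `extract_associated_keywords` is hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def extract_associated_keywords(docs):
    """docs: list of token lists. Returns {keyword: [(associated, score), ...]}."""
    # (a) Keyword candidates: every distinct token (assumption).
    # (b) First weight: raw term frequency per document (assumption).
    first = [Counter(tokens) for tokens in docs]

    # Normalize each keyword's weight by its total weight over all documents,
    # so the normalized weights of one keyword sum to 1 across the corpus.
    total = Counter()
    for counts in first:
        total.update(counts)
    normalized = [{k: w / total[k] for k, w in counts.items()} for counts in first]

    # (c) Second weight (assumed reading): for keyword a, the score of b is the
    # sum of a's normalized weight over the documents where a and b co-occur.
    assoc = defaultdict(lambda: defaultdict(float))
    for weights in normalized:
        for a, b in permutations(weights, 2):
            assoc[a][b] += weights[a]

    # (d) Rank associated keywords per keyword by descending score,
    # breaking ties alphabetically.
    return {
        a: sorted(bs.items(), key=lambda kv: (-kv[1], kv[0]))
        for a, bs in assoc.items()
    }


if __name__ == "__main__":
    corpus = [["search", "engine", "search"], ["search", "index"], ["engine", "index"]]
    for keyword, ranked in extract_associated_keywords(corpus).items():
        print(keyword, ranked)
```

Under this reading each score lies between 0 and 1 and is asymmetric: it measures how much of keyword a's total weight falls in documents that also contain b.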
Designated States: AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BN, BR, BW, BY, BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IR, IS, JP, KE, KG, KN, KP, KZ, LA, LC, LK, LR, LS, LU, LY, MA, MD, ME, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PA, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SA, SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW.
African Regional Intellectual Property Organization (BW, GH, GM, KE, LR, LS, MW, MZ, NA, RW, SD, SL, ST, SZ, TZ, UG, ZM, ZW)
Eurasian Patent Organization (AM, AZ, BY, KG, KZ, RU, TJ, TM)
European Patent Office (AL, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LT, LU, LV, MC, MK, MT, NL, NO, PL, PT, RO, RS, SE, SI, SK, SM, TR)
African Intellectual Property Organization (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, KM, ML, MR, NE, SN, TD, TG).
Publication Language: Korean (KO)
Filing Language: Korean (KO)