PATENTSCOPE

Search International and National Patent Collections
World Intellectual Property Organization
1. (WO2003005233) METHOD AND SYSTEM FOR LEXICAL ACQUISITION AND IDENTIFYING WORD BOUNDARIES
Latest bibliographic data on file with the International Bureau   

Pub. No.: WO/2003/005233
International Application No.: PCT/CN2001/001149
Publication Date: 16.01.2003
International Filing Date: 02.07.2001
IPC: G06F 17/27 (2006.01)
Applicants: INTEL CORPORATION [US/US]; 2200 Mission College Boulevard, Santa Clara, CA 95052 (US) (AE, AG, AL, AM, AT, AU, AZ, BA, BB, BE, BF, BG, BJ, BR, BY, BZ, CA, CF, CG, CH, CI, CM, CN, CO, CR, CU, CY, CZ, DE, DK, DM, DZ, EC, EE, ES, FI, FR, GA, GB, GD, GE, GH, GM, GN, GR, GW, HR, HU, ID, IE, IL, IN, IS, IT, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MC, MD, MG, MK, ML, MN, MR, MW, MX, MZ, NE, NL, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, SN, SZ, TD, TG, TJ, TM, TR, TT, TZ, UA, UG, UZ, VN, YU, ZA, ZW only).
INTEL CHINA LTD. [CN/CN]; Beijing Kerry Center, 6/F North Tower, 1 Guanghua Road, Chaoyang District, Beijing 100020 (CN) (AE, AG, AL, AM, AT, AU, AZ, BA, BB, BE, BF, BG, BJ, BR, BY, BZ, CA, CF, CG, CH, CI, CM, CN, CO, CR, CU, CY, CZ, DE, DK, DM, DZ, EC, EE, ES, FI, FR, GA, GB, GD, GE, GH, GM, GN, GR, GW, HR, HU, ID, IE, IL, IN, IS, IT, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MC, MD, MG, MK, ML, MN, MR, MW, MX, MZ, NE, NL, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, SN, SZ, TD, TG, TJ, TM, TR, TT, TZ, UA, UG, UZ, VN, YU, ZA, ZW only).
HUANG, Shan [CN/CN]; (CN) (For US Only).
WENG, Fuliang [CN/CN]; (US) (For US Only).
JIN, Naiyong [CN/CN]; (CN) (For US Only)
Inventors: HUANG, Shan; (CN).
WENG, Fuliang; (US).
JIN, Naiyong; (CN)
Agent: CCPIT PATENT AND TRADEMARK LAW OFFICE; 10/F Ocean Plaza, 158 Fuxingmennei Street, Beijing 100031 (CN)
Priority Data:
Title (EN) METHOD AND SYSTEM FOR LEXICAL ACQUISITION AND IDENTIFYING WORD BOUNDARIES
(FR) PROCEDE ET SYSTEME SERVANT A L'ACQUISITION LEXICALE ET A L'IDENTIFICATION DE FRONTIERES LEXICALES
Abstract:
(EN)A system is described for lexical acquisition and for identifying word boundaries in input sentences. The system includes a training module and a segmenting module. The training module is configured to compute likelihood values associated with character combinations. Each likelihood value associated with a particular character combination is computed based on the number of occurrences of the corresponding character combination in training text data. After the training process is completed, the segmenting module is used to identify word boundaries in an input sentence based on the likelihood values associated with the character combinations.
(FR)L'invention concerne un système servant à l'acquisition lexicale et à l'identification de frontières lexicales dans des phrases entrées. Ce système comprend un module d'apprentissage et un module de segmentation. Le module d'apprentissage est configuré pour calculer des valeurs de probabilité associées à des combinaisons de caractères. Chaque valeur de probabilité associée à une combinaison de caractères particulière est calculée sur la base du nombre d'occurrences de la combinaison de caractères correspondante dans des données textuelles d'apprentissage. Une fois le processus d'apprentissage achevé, le module de segmentation sert à identifier les frontières lexicales dans une phrase entrée sur la base des valeurs de probabilité associées aux combinaisons de caractères.
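Illustrative sketch (not taken from the application): the abstract does not state how the likelihood values are computed from the occurrence counts, nor how the segmenting module searches for boundaries. The Python example below is one possible reading under stated assumptions: the likelihood of a character combination is taken to be its relative frequency among combinations of the same length, and boundaries are chosen by a dynamic-programming search that maximizes the sum of log-likelihoods. The names train, segment, max_len and floor are hypothetical.

```python
import math
from collections import Counter


def train(corpus, max_len=4):
    """Training step (assumed form): count every character combination of
    1..max_len characters and convert each count to a relative frequency
    among combinations of the same length."""
    counts = Counter()
    totals = Counter()
    for sentence in corpus:
        for n in range(1, max_len + 1):
            for i in range(len(sentence) - n + 1):
                counts[sentence[i:i + n]] += 1
                totals[n] += 1
    return {combo: c / totals[len(combo)] for combo, c in counts.items()}


def segment(sentence, likelihood, max_len=4, floor=1e-8):
    """Segmenting step (assumed form): pick the split whose pieces have the
    highest total log-likelihood. Unseen combinations get the small value
    `floor` so that every input can still be segmented."""
    n = len(sentence)
    best = [0.0] + [-math.inf] * n   # best[i]: best score for sentence[:i]
    back = [0] * (n + 1)             # back[i]: start of the last piece ending at i
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = sentence[start:end]
            score = best[start] + math.log(likelihood.get(piece, floor))
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Walk the back-pointers from the end of the sentence to recover the pieces.
    words = []
    end = n
    while end > 0:
        words.append(sentence[back[end]:end])
        end = back[end]
    return words[::-1]


# Toy data: "ab" and "cd" occur more often than "bc", so the boundary
# is placed between "ab" and "cd".
model = train(["abab", "cdcd", "abcd"], max_len=2)
print(segment("abcd", model, max_len=2))  # ['ab', 'cd']
```

In this toy run the combinations "ab" and "cd" each occur three times out of nine two-character combinations, while "bc" occurs once, which is why the split ab | cd scores highest. A real system would need a much larger training text and, very likely, a different scoring formula.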
Designated States: AE, AG, AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU, CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, MX, MZ, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR, TT, TZ, UA, UG, US, UZ, VN, YU, ZA, ZW.
African Regional Intellectual Property Organization (GH, GM, KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZW)
Eurasian Patent Organization (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM)
European Patent Office (AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE, TR)
African Intellectual Property Organization (BF, BJ, CF, CG, CI, CM, GA, GN, GW, ML, MR, NE, SN, TD, TG).
Publication Language: English (EN)
Filing Language: English (EN)