PATENTSCOPE

Search International and National Patent Collections
World Intellectual Property Organization
1. (WO2004010324) SYSTEM FOR EXTRACTING INFORMATION FROM A NATURAL LANGUAGE TEXT
Latest bibliographic data on file with the International Bureau   

Pub. No.:    WO/2004/010324    International Application No.:    PCT/CH2003/000490
Publication Date: 29.01.2004 International Filing Date: 18.07.2003
IPC:
G06F 17/27 (2006.01)
Applicants: GO ALBERT FRANCE SARL [FR/FR]; 12, rue Vivienne, F-75002 Paris (FR) (For All Designated States Except US).
GERMAIN, Nicolas [FR/FR]; (FR) (For US Only)
Inventors: GERMAIN, Nicolas; (FR)
Agent: GANGUILLET, Cyril; Abrema Agence Brevets et Marques, Ganguillet & Humphrey, 16, Avenue du Théâtre, Case postale 2065, CH-1002 Lausanne (CH)
Priority Data:
02405626.9 19.07.2002 EP
Title (EN) SYSTEM FOR EXTRACTING INFORMATION FROM A NATURAL LANGUAGE TEXT
(FR) SYSTEME D'EXTRACTION D'INFORMATIONS DANS UN TEXTE EN LANGAGE NATUREL
Abstract:
(EN) The invention relates to a system for extracting information from a natural language text. The extraction method first encodes the words of the text by comparing them with the contents of a lexicon of function words (essentially articles, prepositions, conjunctions and verbal auxiliaries). It then identifies noun phrases by searching, among subsets of the resulting sequence of encoded words, for groups of encoded words that satisfy predefined syntactic rules.
(FR)Le procédé d'extraction effectue un codage des mots du texte en les comparant avec le contenu d'un lexique de mots outils (essentiellement articles, prépositions, conjonctions et auxiliaires verbaux), puis identifie des groupes nominaux en recherchant, parmi des sous-ensembles de la suite des mots codés ainsi obtenue, des groupes de mots codés répondant à des règles syntaxiques prédéfinies.
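The two steps named in the abstract (encoding words against a function-word lexicon, then matching syntactic rules over the encoded sequence) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the tiny English lexicon, the one-letter codes (D, P, C, V, X), the single noun-phrase rule, and the names FUNCTION_WORDS, NOUN_PHRASE_RULE, encode and noun_phrases. The sketch also ignores punctuation, so it treats the whole input as one subset of the encoded sequence.

```python
import re

# Hypothetical mini-lexicon of function words; codes are illustrative only.
FUNCTION_WORDS = {
    "the": "D", "a": "D", "an": "D",                  # articles
    "of": "P", "in": "P", "from": "P", "for": "P",    # prepositions
    "and": "C", "or": "C",                            # conjunctions
    "is": "V", "are": "V", "has": "V", "have": "V",   # verbal auxiliaries
}
UNKNOWN = "X"  # code for any word that is not in the lexicon

# Assumed syntactic rule, written over the code string: an optional article,
# one or more unknown words, then any number of
# "preposition + optional article + unknown words" extensions.
NOUN_PHRASE_RULE = re.compile(r"D?X+(?:PD?X+)*")


def encode(text):
    """Return the lowercased words and a code string with one letter per word."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    codes = "".join(FUNCTION_WORDS.get(word, UNKNOWN) for word in words)
    return words, codes


def noun_phrases(text):
    """Return the word groups whose codes match the assumed noun-phrase rule."""
    words, codes = encode(text)
    # Each code letter stands for exactly one word, so match offsets in the
    # code string can be used directly as indices into the word list.
    return [
        " ".join(words[match.start():match.end()])
        for match in NOUN_PHRASE_RULE.finditer(codes)
    ]


if __name__ == "__main__":
    sentence = "The extraction of information is a task for the system"
    print(encode(sentence)[1])
    print(noun_phrases(sentence))
```

Because only function words are listed, every other word receives the same code, so a sentence with a main verb that is not an auxiliary would be absorbed into a single candidate group under this simplified rule. A realistic rule set would need to be considerably richer than the one assumed here.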
Designated States: AE, AG, AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU, CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, MX, MZ, NI, NO, NZ, OM, PG, PH, PL, PT, RO, RU, SC, SD, SE, SG, SK, SL, SY, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, YU, ZA, ZM, ZW.
African Regional Intellectual Property Organization (GH, GM, KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW)
Eurasian Patent Organization (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM)
European Patent Office (AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HU, IE, IT, LU, MC, NL, PT, RO, SE, SI, SK, TR)
African Intellectual Property Organization (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, ML, MR, NE, SN, TD, TG).
Publication Language: French (FR)
Filing Language: French (FR)