PATENTSCOPE

Search International and National Patent Collections
World Intellectual Property Organization
1. (WO2005069199) METHODS AND SYSTEMS FOR TEXT SEGMENTATION
Latest bibliographic data on file with the International Bureau   

Pub. No.: WO/2005/069199
International Application No.: PCT/US2003/041609
Publication Date: 28.07.2005
International Filing Date: 30.12.2003
IPC:
G06K 9/72 (2006.01)
Applicants: GOOGLE INC. [US/US]; 1600 Amphitheatre Parkway, Building 41, Mountain View, CA 94043 (US) (For All Designated States Except US).
WEISSMAN, Adam, J. [US/US]; (US) (For US Only)
Inventors: WEISSMAN, Adam, J.; (US)
Agent: GARDNER, Steven, J.; Kilpatrick Stockton LLP, 1001 West Fourth Street, Winston-Salem, NC 27101-2400 (US)
Priority Data:
Title (EN) METHODS AND SYSTEMS FOR TEXT SEGMENTATION
(FR) PROCEDES ET SYSTEMES DE SEGMENTATION DE TEXTE
Abstract:
(EN) Methods and systems for text segmentation are disclosed. In one such method and system, a string of characters is accessed (204), a long token is identified (206), contiguous characters in the long token are pinned down (208), tokens from the string of characters are determined by keeping the pinned down contiguous characters together, and a plurality of combinations of tokens are determined (210), wherein the number of combinations of tokens is reduced by the pinned down contiguous characters.
(FR) Cette invention concerne des procédés et des systèmes de segmentation de texte. Dans un procédé et un système de cette invention, le procédé consiste à accéder à une chaîne de caractères (204); à identifier un long jeton (206); à fixer les caractères contigus dans le long jeton (208); à déterminer les jetons de la chaîne de caractères en maintenant groupés les caractères contigus fixés; et à déterminer une pluralité de combinaisons de jetons (210), le nombre de combinaisons de jetons étant réduit par les caractères contigus fixés.
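The steps named in the abstract (access a string, identify a long token, pin down its characters, enumerate token combinations) can be illustrated with a minimal Python sketch. This is not the method claimed in the application, whose details are not given in this record. The vocabulary `VOCAB`, the `min_length` threshold, the choice of the longest vocabulary word as the "long token", and all function names are assumptions made for illustration only.

```python
# Hypothetical toy vocabulary (an assumption; not taken from the application).
VOCAB = {"us", "use", "used", "ed", "car", "cars", "son", "on", "line", "online"}


def segmentations(text, vocab):
    """Return every way to split `text` into tokens drawn from `vocab`."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in vocab:
            for rest in segmentations(text[end:], vocab):
                results.append([head] + rest)
    return results


def find_long_token(text, vocab, min_length=5):
    """Pick the longest vocabulary word of at least `min_length` characters
    that occurs in `text` (earliest occurrence wins ties).
    Returns (word, start) or None. The threshold is an assumed heuristic."""
    candidates = [
        (len(word), -text.find(word), word)
        for word in vocab
        if len(word) >= min_length and word in text
    ]
    if not candidates:
        return None
    _, neg_start, word = max(candidates)
    return word, -neg_start


def pinned_segmentations(text, vocab, min_length=5):
    """Keep the long token's characters together and segment only the
    text to its left and right, which shrinks the set of combinations."""
    found = find_long_token(text, vocab, min_length)
    if found is None:
        return segmentations(text, vocab)
    word, start = found
    left, right = text[:start], text[start + len(word):]
    return [
        l + [word] + r
        for l in segmentations(left, vocab)
        for r in segmentations(right, vocab)
    ]


if __name__ == "__main__":
    s = "usedcarsonline"
    print(len(segmentations(s, VOCAB)), "combinations without pinning")
    print(len(pinned_segmentations(s, VOCAB)), "combinations with pinning")
```

With this toy vocabulary, the string "usedcarsonline" has 6 unconstrained segmentations, while pinning "online" leaves 2, since splits such as "car son line" that cut through the pinned characters are never generated.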
Designated States: AE, AG, AL, AM, AT, AU, AZ, BA, BB, BG, BR, BW, BY, BZ, CA, CH, CN, CO, CR, CU, CZ, DE, DK, DM, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, MX, MZ, NI, NO, NZ, OM, PG, PH, PL, PT, RO, RU, SC, SD, SE, SG, SK, SL, SY, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, YU, ZA, ZM, ZW.
African Regional Intellectual Property Organization (BW, GH, GM, KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW)
Eurasian Patent Organization (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM)
European Patent Office (AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HU, IE, IT, LU, MC, NL, PT, RO, SE, SI, SK, TR)
African Intellectual Property Organization (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, ML, MR, NE, SN, TD, TG).
Publication Language: English (EN)
Filing Language: English (EN)