WO2001042986 - METHOD AND SYSTEM FOR INDEXING DOCUMENTS USING CONNECTIVITY AND SELECTIVE CONTENT ANALYSIS

Publication Number WO/2001/042986
Publication Date 14.06.2001
International Application No. PCT/US2000/033340
International Filing Date 08.12.2000
Chapter 2 Demand Filed 29.06.2001
IPC
G06F 17/30 2006.1
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
17 Digital computing or data processing equipment or methods, specially adapted for specific functions
30 Information retrieval; Database structures therefor
CPC
G06F 16/951
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
16 Information retrieval; Database structures therefor; File system structures therefor
90 Details of database functions independent of the retrieved data types
95 Retrieval from the web
951 Indexing; Web crawling techniques
Applicants
  • RENSSELAER POLYTECHNIC INSTITUTE [US]/[US] (AllExceptUS)
  • SZYMANSKI, Boleslaw, K. [--]/[--] (UsOnly)
Inventors
  • SZYMANSKI, Boleslaw, K.
Agents
  • GROSSMAN, Jon, D.
Priority Data
60/169,889 09.12.1999 US
Publication Language English (en)
Filing Language English (EN)
Title
(EN) METHOD AND SYSTEM FOR INDEXING DOCUMENTS USING CONNECTIVITY AND SELECTIVE CONTENT ANALYSIS
(FR) SYSTEME ET PROCEDE D'INDEXATION DE DOCUMENTS FONDE SUR LA CONNECTABILITE ET L'ANALYSE SELECTIVE DE LEUR TENEUR
Abstract
(EN) The present invention provides a method for indexing (141) documents that uses the contents of the documents linked to the indexed (141) page (121). In other words, the method uses the structure of the words used by the linked pages (121). All the stored information is parsed (201), and references containing links to the document to be indexed are captured. The captured references are further parsed (201) to collect their content.
(FR) La présente invention concerne un procédé permettant d'indexer (141) des documents. Plus particulièrement, l'invention a pour objet un procédé selon lequel on utilise la teneur des documents liés à la page (121) indexée (141). En d'autres termes, ce procédé se fonde sur la structure des mots utilisés par les pages liées (121). Toutes les informations mémorisées sont analysées (201) et les références contenant des liens avec les documents à annexer sont saisis. Les références ainsi saisies sont, en outre, analysées pour collecter (201) leur contenu.
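The steps named in the abstract (parse the stored pages, capture the ones that link to the document being indexed, then collect the content of those referring pages) can be illustrated with a minimal sketch. This is not the claimed method: the names LinkAndTextParser and index_by_referrers are hypothetical, the "index" is assumed to be a plain word count, and links are assumed to match the target URL as exact strings with no relative-URL resolution.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class LinkAndTextParser(HTMLParser):
    """Collects href targets and text data from one stored page."""

    def __init__(self):
        super().__init__()
        self.links = set()
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Record the target of every anchor that carries an href.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)

    def handle_data(self, data):
        # Keep all text data; a fuller version might skip script/style.
        self.text_parts.append(data)


def index_by_referrers(target_url, stored_pages):
    """Return word counts gathered from pages that link to target_url.

    stored_pages maps URL -> HTML text (an assumed storage format).
    The target page's own content is deliberately not used.
    """
    counts = Counter()
    for url, html in stored_pages.items():
        if url == target_url:
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        parser.close()
        # Capture only references that contain a link to the target.
        if target_url in parser.links:
            text = " ".join(parser.text_parts).lower()
            counts.update(re.findall(r"[a-z0-9]+", text))
    return counts
```

Calling index_by_referrers with a target URL and a dictionary of stored pages returns a Counter whose keys are words drawn only from the pages that link to the target, so the target is described by the vocabulary of its referrers rather than by its own text.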