1. (WO2001006408) CUT AND PASTE DOCUMENT SUMMARIZATION SYSTEM AND METHOD

Pub. No.: WO/2001/006408
International Application No.: PCT/US2000/004505
Publication Date: Fri Jan 26 00:59:59 CET 2001
International Filing Date: Wed Feb 23 00:59:59 CET 2000
IPC: G06F 17/27
Applicants: THE TRUSTEES OF COLUMBIA UNIVERSITY IN THE CITY OF NEW YORK
MCKEOWN, Kathleen, R.
JING, Hongyan
Inventors: MCKEOWN, Kathleen, R.
JING, Hongyan
Title: CUT AND PASTE DOCUMENT SUMMARIZATION SYSTEM AND METHOD
Abstract:
A summary of an input document is generated by extracting at least one sentence from the document and parsing the extracted sentences into components, such as in a parse tree (110). Sentence reduction processing is performed to mark components which can be removed from the parse trees (135). Sentence reduction can include context importance processing, probabilistic processing, and linguistic knowledge based processing. Sentence combination processing includes identifying sentence combination operations and establishing rules for applying the sentence combination operations to mark the parse trees to merge at least two sentences (140). Sentence combination processing also provides a paste operation to operate on the marked components to effect the indicated removal and combination of sentence components, thereby generating summary sentences for the input document.
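
The mark-then-paste flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The Node class, the mark_for_removal and paste functions, and the hand-built parse tree are all hypothetical. A fixed set of droppable labels stands in for the context importance, probabilistic, and linguistic knowledge based decisions, and sentence combination is left out.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """One component of a parsed sentence (hypothetical toy structure)."""
    label: str                      # constituent or part-of-speech tag, e.g. "NP"
    word: str = ""                  # set on leaves only
    children: List["Node"] = field(default_factory=list)
    removable: bool = False         # mark set by the reduction step


def mark_for_removal(node: Node, droppable_labels: Set[str]) -> None:
    """Reduction step: mark components as removable without deleting them.

    Assumption: a label lookup replaces the real decision procedure.
    """
    if node.label in droppable_labels:
        node.removable = True
        return
    for child in node.children:
        mark_for_removal(child, droppable_labels)


def paste(node: Node) -> List[str]:
    """Paste step: emit the words of every component not marked removable."""
    if node.removable:
        return []
    if not node.children:
        return [node.word] if node.word else []
    words: List[str] = []
    for child in node.children:
        words.extend(paste(child))
    return words


def leaf(label: str, word: str) -> Node:
    return Node(label, word)


# Hand-built parse tree for one extracted sentence (no parser is invoked).
sentence = Node("S", children=[
    Node("NP", children=[leaf("DT", "The"), leaf("NN", "committee")]),
    Node("VP", children=[
        leaf("VBD", "approved"),
        Node("NP", children=[leaf("DT", "the"), leaf("NN", "budget")]),
        Node("PP", children=[
            leaf("IN", "after"), leaf("JJ", "lengthy"), leaf("NN", "debate"),
        ]),
    ]),
])

mark_for_removal(sentence, {"PP"})
summary = " ".join(paste(sentence))
print(summary)
```

Keeping marking separate from pasting mirrors the abstract's ordering: the tree stays intact while components are marked, and text is produced only when the paste operation acts on those marks.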