CA2614177 - GRAMMATICAL PARSING OF DOCUMENT VISUAL STRUCTURES

Office Canada
Application Number 2614177
Application Date 30.06.2006
Publication Number 2614177
Publication Date 11.01.2007
Publication Kind A1
IPC G06K 9/72
G: PHYSICS
G06: COMPUTING; CALCULATING; COUNTING
G06K: RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
G06K 9/00: Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/62: Methods or arrangements for recognition using electronic means
G06K 9/72: using context analysis based on the provisionally recognised identity of a number of successive patterns, e.g. a word
CPC G06F 40/211; G06K 9/726; G06K 2209/01
Applicants MICROSOFT CORPORATION
Inventors VIOLA, PAUL A.; SHILMAN, MICHAEL
Priority Data 11173280 01.07.2005 US
Title
(EN) GRAMMATICAL PARSING OF DOCUMENT VISUAL STRUCTURES
(FR) ANALYSE GRAMMATICALE DE STRUCTURES VISUELLES DE DOCUMENT
Abstract
(EN)
A two-dimensional representation of a document is leveraged to extract a hierarchical structure that facilitates recognition of the document. The visual structure is grammatically parsed using two-dimensional adaptations of statistical parsing algorithms. This allows layout structures (e.g., columns, authors, titles, and footnotes) to be recognized so that the structural components of the document can be accurately interpreted. Additional techniques can also be employed to facilitate document layout recognition, for example grammatical parsing techniques that utilize machine learning, parse scoring based on image representations, boosting techniques, and/or "fast features".
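
The idea of scoring a grammatical parse over page geometry can be illustrated with a small sketch. It is not the method claimed in the application: the grammar (Page, Para, Line), the Box type, the hand-set costs, and every function name below are assumptions made for illustration. The abstract refers to scoring informed by machine learning, whereas this sketch uses fixed geometric penalties in a CYK-style chart over text lines sorted top to bottom.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of a text line (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float

def union(a, b):
    return Box(min(a.x0, b.x0), min(a.y0, b.y0), max(a.x1, b.x1), max(a.y1, b.y1))

def line_join_cost(para, line):
    """Hand-set stand-in for a learned score: vertical gap plus left-edge offset."""
    gap = max(0.0, line.y0 - para.y1)
    indent = abs(line.x0 - para.x0)
    return gap + indent

NEW_PARA_COST = 5.0  # assumed fixed penalty for each paragraph on the page

# Binary rules: (parent, left, right, cost function on the two bounding boxes).
BINARY = [
    ("Para", "Para", "Line", line_join_cost),
    ("Page", "Page", "Para", lambda a, b: NEW_PARA_COST),
]
# Unary rules, listed so that Para is derived before Page: (parent, child, cost).
UNARY = [("Para", "Line", 0.0), ("Page", "Para", NEW_PARA_COST)]

def apply_unary(cell):
    for parent, child, cost in UNARY:
        if child in cell:
            c, box, tree = cell[child]
            if parent not in cell or c + cost < cell[parent][0]:
                cell[parent] = (c + cost, box, (parent, tree))

def parse(lines):
    """Return (cost, tree) of the cheapest Page covering all lines."""
    n = len(lines)
    if n == 0:
        raise ValueError("need at least one line")
    chart = {}
    for i, box in enumerate(lines):
        cell = {"Line": (0.0, box, ("Line", i))}
        apply_unary(cell)
        chart[(i, i + 1)] = cell
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cell = {}
            for k in range(i + 1, j):  # try every split point
                for parent, left, right, cost_fn in BINARY:
                    if left in chart[(i, k)] and right in chart[(k, j)]:
                        lc, lb, lt = chart[(i, k)][left]
                        rc, rb, rt = chart[(k, j)][right]
                        cost = lc + rc + cost_fn(lb, rb)
                        if parent not in cell or cost < cell[parent][0]:
                            cell[parent] = (cost, union(lb, rb), (parent, lt, rt))
            apply_unary(cell)
            chart[(i, j)] = cell
    cost, _, tree = chart[(0, n)]["Page"]
    return cost, tree

def line_indices(tree):
    if tree[0] == "Line":
        return [tree[1]]
    return [i for child in tree[1:] for i in line_indices(child)]

def paragraphs(tree):
    """Flatten a Page tree into a list of paragraphs (lists of line indices)."""
    if tree[0] == "Para":
        return [line_indices(tree)]
    return [p for child in tree[1:] for p in paragraphs(child)]

# Four lines: two tight pairs separated by a large vertical gap.
lines = [Box(0, 0, 100, 10), Box(0, 12, 100, 22),
         Box(0, 40, 100, 50), Box(0, 52, 100, 62)]
cost, tree = parse(lines)
print(cost, paragraphs(tree))  # 14.0 [[0, 1], [2, 3]]
```

Because each chart cell keeps only the cheapest derivation per symbol, the parser weighs every grouping of lines into paragraphs without enumerating them one by one. Replacing the fixed penalties with scores from a trained model would be one way to move this toy closer to the learned scoring the abstract mentions.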