1. (CN101253514) Grammatical parsing of document visual structures

Office : China
Application Number: 200680031501.X Application Date: 30.06.2006
Publication Number: 101253514 Publication Date: 27.08.2008
Grant Number: 101253514 Grant Date: 13.06.2012
Publication Kind : B
Prior PCT appl.: Application Number: PCTUS2006026140; Publication Number: 2007005937
IPC:
G06K 9/72
G: PHYSICS
G06: COMPUTING; CALCULATING; COUNTING
G06K: RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
G06K 9: Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/62: Methods or arrangements for recognition using electronic means
G06K 9/72: using context analysis based on the provisionally recognised identity of a number of successive patterns, e.g. a word
CPC:
G06K 9/726
G06F 17/271
G06K 2209/01
Applicants: Microsoft Corp.
微软公司
Inventors: Viola Paul A.
P·A·沃拉
Shilman Michael
M·希尔曼
Agents: chenbin
上海专利商标事务所有限公司 31100
Priority Data: 11173280 01.07.2005 US
Title: (EN) Grammatical parsing of document visual structures
(ZH) 文档可视结构的语法剖析
Abstract:
(EN) A two-dimensional representation of a document is leveraged to extract a hierarchical structure that facilitates recognition of the document. The visual structure is grammatically parsed utilizing two-dimensional adaptations of statistical parsing algorithms. This allows recognition of layout structures (e.g., columns, authors, titles, and footnotes) so that structural components of the document can be accurately interpreted. Additional techniques can also be employed to facilitate document layout recognition. For example, grammatical parsing techniques that utilize machine learning, parse scoring based on image representations, boosting techniques, and/or "fast features" can be employed to facilitate document recognition.
(ZH)

利用文档的二维表示来提取有助于文档识别的分层结构。利用统计剖析算法的二维自适应来对该视觉结构进行语法剖析。这允许识别布局结构(例如,栏、作者、标题、脚注等)等,使得文档的结构组成部分能被准确地解释。还可采用其它技术来帮助文档布局识别。例如,可采用利用机器学习、基于图像表示的剖析评分、上推技术和/或“快速特征”等的语法剖析技术来帮助文档识别。
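The abstract does not spell out a specific algorithm, so the following is only an illustrative sketch of what a "two-dimensional adaptation" of a statistical parser could look like: a CKY-style chart parser over page blocks in reading order, where each binary rule also carries a spatial test on bounding boxes. The grammar, the labels (Page, Title, Body, Column), the block scores, and every function name here are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from math import log


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box; y grows downward as in page coordinates."""
    x0: float
    y0: float
    x1: float
    y1: float


def union(a, b):
    return Box(min(a.x0, b.x0), min(a.y0, b.y0), max(a.x1, b.x1), max(a.y1, b.y1))


def above(a, b):
    return a.y1 <= b.y0


def left_of(a, b):
    return a.x1 <= b.x0


# Hypothetical toy grammar: (parent, left child, right child, spatial test, log-prob).
RULES = [
    ("Page", "Title", "Body", above, log(1.0)),
    ("Body", "Column", "Column", left_of, log(1.0)),
]

# Hypothetical input: blocks in reading order, each with per-label log scores
# (in a real system these would come from a learned classifier).
BLOCKS = [
    (Box(0.0, 0.0, 100.0, 10.0), {"Title": log(0.9), "Column": log(0.1)}),
    (Box(0.0, 20.0, 45.0, 100.0), {"Column": log(0.8), "Title": log(0.2)}),
    (Box(55.0, 20.0, 100.0, 100.0), {"Column": log(0.8), "Title": log(0.2)}),
]


def parse(blocks, rules, root="Page"):
    """Return (log score, box, tree) of the best parse rooted at `root`, or None."""
    n = len(blocks)
    if n == 0:
        return None
    chart = {}
    # Width-1 spans: one entry per candidate label of each block.
    for i, (box, scores) in enumerate(blocks):
        chart[(i, i + 1)] = {lab: (s, box, (lab, i)) for lab, s in scores.items()}
    # Wider spans: try every split point and every rule, keep the best per label.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = {}
            for k in range(i + 1, j):
                left_cell, right_cell = chart[(i, k)], chart[(k, j)]
                for parent, lc, rc, test, rule_lp in rules:
                    if lc not in left_cell or rc not in right_cell:
                        continue
                    ls, lbox, ltree = left_cell[lc]
                    rs, rbox, rtree = right_cell[rc]
                    if not test(lbox, rbox):  # the two-dimensional constraint
                        continue
                    score = ls + rs + rule_lp
                    if parent not in cell or score > cell[parent][0]:
                        cell[parent] = (score, union(lbox, rbox), (parent, ltree, rtree))
            chart[(i, j)] = cell
    return chart[(0, n)].get(root)


result = parse(BLOCKS, RULES)
```

With these toy inputs, `result` holds the best-scoring tree in which a Title block sits above a Body made of two side-by-side Column blocks. The spatial tests are what distinguish this from ordinary one-dimensional CKY parsing of a token sequence.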


Also published as:
NO20080090, NZ565147, MXMX/a/2008/000180, KR1020080026128, EP1894144, ZA2008/00041, JP2009500755, RU0002421810, CA2614177, IN40/DELNP/2008, WO/2007/005937