1. (WO2010073540) BUSINESS DOCUMENT PROCESSOR
Latest bibliographic data on file with the International Bureau

Pub. No.: WO/2010/073540
International Application No.: PCT/JP2009/006889
Publication Date: 01.07.2010
International Filing Date: 15.12.2009
IPC:
G06K 9/34 (2006.01), G06F 17/30 (2006.01), G06K 9/20 (2006.01), G06K 9/72 (2006.01)
G06K 9/34: G PHYSICS > 06 COMPUTING; CALCULATING; COUNTING > K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS > 9 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints > 20 Image acquisition > 34 Segmentation of touching or overlapping patterns in the image field
G06F 17/30: G PHYSICS > 06 COMPUTING; CALCULATING; COUNTING > F ELECTRIC DIGITAL DATA PROCESSING > 17 Digital computing or data processing equipment or methods, specially adapted for specific functions > 30 Information retrieval; Database structures therefor
G06K 9/20: G PHYSICS > 06 COMPUTING; CALCULATING; COUNTING > K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS > 9 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints > 20 Image acquisition
G06K 9/72: G PHYSICS > 06 COMPUTING; CALCULATING; COUNTING > K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS > 9 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints > 62 Methods or arrangements for recognition using electronic means > 72 using context analysis based on the provisionally recognised identity of a number of successive patterns, e.g. a word
Applicants:
Hitachi Solutions, Ltd. [JP/JP]; 4-12-7, Higashishinagawa, Shinagawa-ku, Tokyo 1400002, JP (all designated States except US)
OBA, Mitsuharu [JP/JP]; JP (US only)
Inventors:
OBA, Mitsuharu; JP
Agent:
HIRAKI, Yusuke; Kamiya-cho MT Bldg. 19F 3-20, Toranomon 4-chome Minato-ku, Tokyo 1050001, JP
Priority Data:
2008-335216 26.12.2008 JP
Title (EN) BUSINESS DOCUMENT PROCESSOR
(FR) DISPOSITIF DE TRAITEMENT DE DOCUMENTS COMMERCIAUX
Abstract:
(EN) There is provided a technique for removing only a seal impression while keeping character string information when applying OCR to a business document stored in grayscale, even if the character string and the seal impression overlap with each other. The character string that overlaps with the seal impression is extrapolated by matching a character string present near the seal impression against a database. More specifically, first, a seal impression region in a business document inputted in grayscale is removed. Next, character information that is present near the removed seal impression region and of which a portion of the characters is unclear due to the seal impression region is extracted as seal impression related information. Then, an attribute of the extracted seal impression related information is identified, a customer database storing character string candidates containing customer information is referred to, and based on the seal impression related information classified by attribute, the character string that overlaps with the seal impression region and that is thus unclear is extrapolated.
(FR) On décrit une technique destinée à n'éliminer qu'une empreinte de cachet tout en conservant les informations de chaînes de caractères lorsque l'on applique une OCR à un document commercial enregistré en niveaux de gris, même si la chaîne de caractères et l'empreinte de cachet se chevauchent. La chaîne de caractères qui se chevauche avec l'empreinte de cachet est extrapolée en confrontant une chaîne de caractères présente près de l'empreinte de cachet à une base de données. Plus précisément, une région d'empreinte de cachet dans un document commercial saisi en niveaux de gris est éliminée. Ensuite, des informations de caractères proches de la région d'empreinte de cachet éliminée, et dont une partie des caractères n'est pas claire à cause de la région d'empreinte de cachet, sont extraites en tant qu'informations liées à l'empreinte de cachet. Ensuite, un attribut des informations liées à l'empreinte de cachet ainsi extraites est identifié, on se réfère à une base de données de clients emmagasinant des chaînes de caractères candidates contenant des informations sur des clients et, sur la base des informations liées à l'empreinte de cachet classifiées par attributs, la chaîne de caractères qui se chevauche avec la région d'empreinte de cachet et qui n'est donc pas claire est extrapolée.
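The database-matching step described in the abstract can be illustrated with a minimal sketch. It is not the method claimed in the application: the seal-removal and OCR stages are assumed to have already run, and their output is assumed to be a string in which each character hidden by the seal impression is replaced by `?`. The names `CUSTOMER_DB`, `identify_attribute`, `is_consistent` and `restore`, the two attributes, and all sample entries are hypothetical.

```python
import re

# Hypothetical customer database: attribute -> candidate character strings.
CUSTOMER_DB = {
    "company": ["Sakura Trading Co.", "Sakura Transport Co."],
    "phone": ["03-5555-0100", "03-5555-0199"],
}

UNKNOWN = "?"  # assumed placeholder for a character hidden by the seal impression


def identify_attribute(fragment: str) -> str:
    """Toy attribute classifier: only digits, hyphens and unknowns means 'phone'."""
    if re.fullmatch(r"[\d\-?]+", fragment):
        return "phone"
    return "company"


def is_consistent(fragment: str, candidate: str) -> bool:
    """True if the candidate agrees with every legible character of the fragment."""
    if len(fragment) != len(candidate):
        return False
    return all(f == UNKNOWN or f == c for f, c in zip(fragment, candidate))


def restore(fragment: str, db=CUSTOMER_DB):
    """Return the single database candidate consistent with the fragment.

    Returns None when no candidate or more than one candidate fits,
    because the fragment is then not enough to decide.
    """
    attribute = identify_attribute(fragment)
    hits = [c for c in db.get(attribute, []) if is_consistent(fragment, c)]
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    print(restore("Sakura Tr???ng Co."))  # company name partly covered by a seal
    print(restore("03-5555-010?"))        # phone number with one hidden digit
    print(restore("03-5555-01??"))        # ambiguous: two candidates fit
```

Classifying the fragment by attribute first narrows the candidate list before matching, which mirrors the order of steps given in the abstract. A real system would need a tolerant matcher, since OCR errors and uncertain character counts rule out the exact positional comparison used here.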
Designated States: AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BR, BW, BY, BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IS, KE, KG, KM, KN, KP, KR, KZ, LA, LC, LK, LR, LS, LT, LU, LY, MA, MD, ME, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PE, PG, PH, PL, PT, RO, RS, RU, SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW
African Regional Intellectual Property Organization (ARIPO) (BW, GH, GM, KE, LS, MW, MZ, NA, SD, SL, SZ, TZ, UG, ZM, ZW)
Eurasian Patent Organization (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM)
European Patent Office (AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LT, LU, LV, MC, MK, MT, NL, NO, PL, PT, RO, SE, SI, SK, SM, TR)
African Intellectual Property Organization (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, ML, MR, NE, SN, TD, TG)
Publication Language: English (EN)
Filing Language: English (EN)
Also published as:
EP2370933, US20110135209, CN102171708