Return to search

Context Dependent Thresholding and Filter Selection for Optical Character Recognition

Thresholding algorithms and filters are of great importance when utilizing OCR to extract information from text documents such as invoices. Invoice documents vary greatly and since the performance of image processing methods when applied to those documents will vary accordingly, selecting appropriate methods is critical if a high recognition rate is to be obtained. This paper aims to determine if a document recognition system that automatically selects optimal processing methods, based on the characteristics of input images, will yield a higher recognition rate than what can be achieved by a manual choice. Such a recognition system, including a learning framework for selecting optimal thresholding algorithms and filters, was developed and evaluated. It was established that an automatic selection will ensure a high recognition rate when applied to a set of arbitrary invoice images by successfully adapting and avoiding the methods that yield poor recognition rates.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-197460
Date January 2012
CreatorsKieri, Andreas
PublisherUppsala universitet, Institutionen för informationsteknologi
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess
RelationUPTEC F, 1401-5757 ; 12 036

Page generated in 0.0018 seconds