Automatic Invoice Data Extraction as a Constraint Satisfaction Problem

Invoice processing has traditionally depended heavily on manual labor, where the task is to identify certain information and move it from an origin to a destination. The task is time-consuming, and there is strong interest in automating it to reduce execution time, risk of errors, and cost.

With the ever-growing interest in automation and Artificial Intelligence (AI), this thesis explores the possibility of automating the extraction and mapping of information of interest by defining the problem as a Constraint Optimization Problem (COP) using numeric relations between the information present. The problem is then solved by extracting the numerical values in a document and using them as an input space in which each combination of numeric values is tested using a backend solver.

Several different models were defined, using different approaches and constraints on relations between fields that may be present. A solution to an invoice was considered correct if the total, tax, net and rounding amounts were estimated correctly. The best results achieved were 84.30% correct and 8.77% incorrect solutions on a set of 1400 invoices of various types.

The results show a promising alternative to proposed solutions that use, e.g., machine learning or other intelligent methods based on graphical or positional data. Because it considers only the numerical values present in each document, the proposed solution is decentralized and can therefore be implemented and run on any set of invoices without any pre-training phase.
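As a rough illustration of the idea in the abstract, the sketch below treats the numbers found on an invoice as candidate values and keeps only the combinations that satisfy simple numeric relations. It is not the thesis's model: the abstract mentions a backend solver, whereas this sketch uses plain enumeration, and the function name `find_amounts`, the tax rates in `ASSUMED_TAX_RATES`, the rounding bound, and the two relations themselves are assumptions made here for illustration.

```python
from decimal import Decimal
from itertools import permutations

# Hypothetical tax rates; the abstract does not say which rates were modelled.
ASSUMED_TAX_RATES = (Decimal("0.25"), Decimal("0.12"), Decimal("0.06"))
CENT = Decimal("0.01")


def find_amounts(values, max_rounding=Decimal("0.99")):
    """Return (total, net, tax, rounding) tuples consistent with the assumed relations.

    `values` are the numeric strings extracted from one invoice.
    """
    candidates = set(Decimal(v) for v in values)
    solutions = []
    # Try every ordered assignment of three distinct values to (total, net, tax).
    for total, net, tax in permutations(candidates, 3):
        if not (total > net > tax > 0):
            continue
        # Assumed relation 1: tax equals net times a known rate, to the nearest cent.
        if not any((net * rate).quantize(CENT) == tax for rate in ASSUMED_TAX_RATES):
            continue
        # Assumed relation 2: total = net + tax + rounding, with a small rounding term.
        rounding = total - net - tax
        if abs(rounding) <= max_rounding:
            solutions.append((total, net, tax, rounding))
    return solutions


if __name__ == "__main__":
    # Amounts mixed with unrelated numbers (e.g. a year, an ID, a quantity).
    extracted = ["2020", "1400", "1000.00", "250.00", "1250.00", "3"]
    for solution in find_amounts(extracted):
        print(solution)
```

Enumeration grows quickly with the number of extracted values, which is one reason a dedicated constraint solver, as the abstract describes, would be the more practical choice; the sketch only shows how numeric relations alone can single out the fields of interest.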

Identifier oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-411596
Date January 2020
Creators Andersson, Jakob
Publisher Uppsala universitet, Institutionen för informationsteknologi
Source Sets DiVA Archive at Uppsala University
Language English
Detected Language English
Type Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format application/pdf
Rights info:eu-repo/semantics/openAccess
Relation UPTEC IT, 1401-5749 ; 20009
