A Visual Focus on Form Understanding

Paper forms are commonly used to collect information, including information that will ultimately be added to a digital database. This work focuses on the automatic extraction of information from form images. It examines how well forms can be parsed without any textual information. The resulting model, FUDGE, shows that computer vision alone is reasonably successful at the problem. Drawing on the strengths and weaknesses of FUDGE, this work also introduces a novel model, Dessurt, for end-to-end document understanding. Dessurt performs text recognition implicitly and can output arbitrary text, making it a more flexible document processing model than prior methods. Dessurt can parse the entire contents of a form image directly into a structured format, achieving better performance than FUDGE at this task. Also included is a technique for generating synthetic handwriting, which provides synthetic training data for Dessurt.

Identifier: oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-10513
Date: 19 May 2022
Creators: Davis, Brian Lafayette
Publisher: BYU ScholarsArchive
Source Sets: Brigham Young University
Detected Language: English
Type: text
Format: application/pdf
Source: Theses and Dissertations
Rights: https://lib.byu.edu/about/copyright/
