A Visual Focus on Form Understanding

Paper forms are commonly used to collect information, including information that will ultimately be added to a digital database. This work focuses on the automatic extraction of information from form images. It examines how well forms can be parsed without any textual information. The resulting model, FUDGE, shows that computer vision alone is reasonably successful at the problem. Drawing on the strengths and weaknesses of FUDGE, this work also introduces a novel model, Dessurt, for end-to-end document understanding. Dessurt performs text recognition implicitly and can output arbitrary text, making it a more flexible document processing model than prior methods. Dessurt can parse the entire contents of a form image directly into a structured format, achieving better performance than FUDGE at this task. Also included is a technique for generating synthetic handwriting, which provides synthetic training data for Dessurt.

document understanding

Physical Sciences and Mathematics

Identifier	oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-10513
Date	19 May 2022
Creators	Davis, Brian Lafayette
Publisher	BYU ScholarsArchive
Source Sets	Brigham Young University
Detected Language	English
Type	text
Format	application/pdf
Source	Theses and Dissertations
Rights	https://lib.byu.edu/about/copyright/
