Return to search

Content Detection in Handwritten Documents

abstract: Handwritten documents have gained popularity in various domains including education and business. A key task in analyzing a complex document is to distinguish between various content types such as text, math, graphics, tables and so on. For example, one such aspect could be a region on the document with a mathematical expression; in this case, the label would be math. This differentiation facilitates the performance of specific recognition tasks depending on the content type. We hypothesize that the recognition accuracy of the subsequent tasks such as textual, math, and shape recognition will increase, further leading to a better analysis of the document.

Content detection on handwritten documents assigns a particular class to a homogeneous portion of the document. To complete this task, a set of handwritten solutions was digitally collected from middle school students located in two different geographical regions in 2017 and 2018. This research discusses the methods to collect, pre-process and detect content type in the collected handwritten documents. A total of 4049 documents were extracted in the form of image, and json format; and were labelled using an object labelling software with tags being text, math, diagram, cross out, table, graph, tick mark, arrow, and doodle. The labelled images were fed to the Tensorflow’s object detection API to learn a neural network model. We show our results from two neural networks models, Faster Region-based Convolutional Neural Network (Faster R-CNN) and Single Shot detection model (SSD). / Dissertation/Thesis / Masters Thesis Computer Science 2018

Identiferoai:union.ndltd.org:asu.edu/item:50452
Date January 2018
ContributorsFaizaan, Shaik Mohammed (Author), VanLehn, Kurt (Advisor), Cheema, Salman Shaukat (Advisor), Yang, Yezhou (Committee member), Arizona State University (Publisher)
Source SetsArizona State University
LanguageEnglish
Detected LanguageEnglish
TypeMasters Thesis
Format54 pages
Rightshttp://rightsstatements.org/vocab/InC/1.0/

Page generated in 0.0019 seconds