
End-to-End Tabular Information Extraction in Datasheets with Deep Learning

The advent of Industry 4.0 has been transforming information management for the specifications of electronic components.
This change affects many organizations, including global supply chains that optimize many product chains, such as raw materials or electronic components.
Supply chains consist of thousands of manufacturers, connect them to other organizations and end users, and include billions of distinct components.
Because there are millions of documents, the digitization of critical information has to be carried out automatically.
Although the documents vary greatly in shape and style, the essential information is usually presented in tables in a condensed format.
Extracting the structured information from tables is done by human operators, which costs human effort, time and corporate resources.
Motivated by the fact that AI-based solutions are automating many processes, this thesis proposes deep learning-based solutions for three main problems: (i) table detection, (ii) table internal structure detection and (iii) End-to-End (E2E) tabular structure detection.
To this end, deep learning models are trained mostly with public datasets, together with a private dataset provided by our industry partner, for which more than 2000 documents were labelled.
To achieve accurate table detection, we propose a method based on the successful Mask-Region-Based Convolutional Neural Network (Mask-RCNN) instance segmentation model.
With some modifications to the training set labels, we have achieved state-of-the-art detection rates with 99% AP and 100% recall.
We use the PASCAL Visual Object Classes (VOC) 11-point Average Precision (AP) metric to compare the evaluated deep learning-based methods.
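As an illustration of the metric named above, the following is a minimal sketch of the standard PASCAL VOC 11-point interpolated AP: the maximum precision at recall levels 0.0, 0.1, ..., 1.0 is averaged. It is not the thesis's evaluation code. The function name `voc_11_point_ap` and its inputs are hypothetical, and the sketch assumes each detection has already been marked as a true or false positive (for example by an IoU threshold, which is not specified here).

```python
from typing import Sequence


def voc_11_point_ap(
    scores: Sequence[float],
    is_true_positive: Sequence[bool],
    num_ground_truth: int,
) -> float:
    """11-point interpolated Average Precision (PASCAL VOC style).

    scores: confidence of each detection (hypothetical input format).
    is_true_positive: whether each detection matched a ground-truth table.
    num_ground_truth: total number of ground-truth tables.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")

    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    precisions, recalls = [], []
    true_positives = 0
    for rank, idx in enumerate(order, start=1):
        if is_true_positive[idx]:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    # Average the best precision achievable at each of 11 recall levels.
    total = 0.0
    for k in range(11):
        level = k / 10
        reachable = [p for p, r in zip(precisions, recalls) if r >= level]
        total += max(reachable, default=0.0)
    return total / 11
```

For example, `voc_11_point_ap([0.9, 0.8, 0.7], [True, False, True], 2)` ranks one false positive between two true positives and yields an AP of about 0.85.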
Detecting tables is only the initial step towards semantic modelling of e-components; the internal structure of each table must also be detected in order to extract information.
With this in mind, we introduce another method based on the Mask-RCNN model, which is able to detect table structure with around 96% AP.
For end-to-end extraction, it is necessary either to combine these two networks or to develop a new model.
To this end, inspired by the success of Mask-RCNN models, we introduce the following Mask-RCNN-based models to realize E2E tabular structure detection:
The stitched E2E model, achieved by bridging the output of the table detection model into the structure detection model, attained more than 77% AP on the difficult public UNLV dataset, with various post-processing steps applied when bridging the two networks. The single-pass E2E detection networks attained a higher AP of 86%, but with lower recall.
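The stitched arrangement can be sketched roughly as follows. This is only an illustrative outline under stated assumptions, not the thesis's implementation: the two detectors are passed in as hypothetical callables, the padding margin stands in for the unspecified post-processing steps, and the structure detector is assumed to return boxes relative to the cropped table region.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def pad_and_clip(box: Box, margin: int, width: int, height: int) -> Box:
    """Grow a table box by a margin, staying inside the page (assumed step)."""
    x0, y0, x1, y1 = box
    return (max(0, x0 - margin), max(0, y0 - margin),
            min(width, x1 + margin), min(height, y1 + margin))


def to_page_coords(cell: Box, crop: Box) -> Box:
    """Shift a box found inside a crop back into page coordinates."""
    ox, oy = crop[0], crop[1]
    return (cell[0] + ox, cell[1] + oy, cell[2] + ox, cell[3] + oy)


def stitched_e2e(
    page: object,
    page_size: Tuple[int, int],
    detect_tables: Callable[[object], List[Box]],
    detect_structure: Callable[[object, Box], List[Box]],
    margin: int = 10,
) -> List[Tuple[Box, List[Box]]]:
    """Run table detection, then structure detection on each table region.

    detect_tables and detect_structure are hypothetical stand-ins for the
    two trained networks; detect_structure receives the crop region and
    returns boxes relative to that crop.
    """
    width, height = page_size
    results = []
    for table in detect_tables(page):
        crop = pad_and_clip(table, margin, width, height)
        cells = [to_page_coords(c, crop) for c in detect_structure(page, crop)]
        results.append((table, cells))
    return results
```

Passing the detectors as callables keeps the sketch independent of any particular framework; a single-pass network would instead predict tables and their structure in one forward pass.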
This thesis concludes that deep learning-based object detection and instance segmentation networks can accomplish state-of-the-art performance.

Identifier: oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/39387
Date: 09 July 2019
Creators: Kara, Ertugrul
Contributors: Kantarci, Burak
Publisher: Université d'Ottawa / University of Ottawa
Source Sets: Université d’Ottawa
Language: English
Detected Language: English
Format: application/pdf
