Tabular Information Extraction from Datasheets with Deep Learning for Semantic Modeling

The growing popularity of artificial intelligence and machine learning has led many
institutions and organizations to adopt the vision of automation in industry.
Many corporations have made it their primary objective to deliver goods and services,
and to manufacture, more efficiently and with minimal human intervention. Automated
document processing and analysis is also a critical component of this cycle for
many organizations that contribute to the supply chain. The massive volume and diversity
of data created in this rapidly evolving environment make this a highly desired step.
Despite this diversity, important information in documents is provided in tables.
As a result, extracting tabular data is a crucial aspect of document processing.
This thesis applies deep learning methodologies to detect table structure elements for
the extraction of data and preparation for semantic modelling. To find the optimal
structure definition, we analyzed the performance of deep learning models with different
structure formats, such as row/column and cell. The combined row and column detection
models perform poorly compared to the other models because rows and columns overlap
heavily. Separate row and column detection models seem to achieve the best average
F1-scores, 78.5% and 79.1%, respectively. However, determining cell elements from the
row and column detections for semantic modelling is a complicated task due to spanning
rows and columns. Considering these facts, we propose a new method for setting the
ground-truth information, called content-focused annotation, to define table elements
better. Our content-focused method handles ambiguities caused by large white spaces and
the lack of boundary lines in table structures; hence, it provides higher accuracy.
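
To illustrate why deriving cells from row and column detections is harder than it sounds, the following minimal sketch intersects row boxes with column boxes to form a cell grid. It is not the method used in the thesis; the box format `(x_min, y_min, x_max, y_max)` and the function names are assumptions made for this example. The sketch treats every row/column crossing as one cell, which is exactly the assumption that spanning rows and columns violate.

```python
from typing import List, Optional, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Return the overlapping rectangle of two boxes, or None if they do not overlap."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    if x1 <= x0 or y1 <= y0:
        return None
    return (x0, y0, x1, y1)


def cells_from_rows_and_columns(
    rows: List[Box], cols: List[Box]
) -> List[List[Optional[Box]]]:
    """Build a naive cell grid: one cell per row/column crossing.

    A cell that spans several rows or columns is split into pieces here,
    so this grid is only correct for tables without spanning cells.
    """
    rows = sorted(rows, key=lambda b: b[1])  # top to bottom
    cols = sorted(cols, key=lambda b: b[0])  # left to right
    return [[intersect(r, c) for c in cols] for r in rows]


if __name__ == "__main__":
    rows = [(0, 0, 100, 10), (0, 10, 100, 20)]
    cols = [(0, 0, 50, 20), (50, 0, 100, 20)]
    for grid_row in cells_from_rows_and_columns(rows, cols):
        print(grid_row)
```

A header cell that spans both columns would be returned as two separate cells by this sketch, which is the kind of ambiguity that motivates annotating cells directly.
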
Prior works have addressed the table analysis problem as table detection and table
structure detection tasks. However, the impact of dataset structure on table structure
detection has not been investigated. We provide a comparison of table structure detection
performance on cropped and uncropped datasets. The cropped set consists only of
table images cropped from documents, assuming tables are detected perfectly.
The uncropped set consists of regular document images. Experiments show that deep
learning models can improve detection performance by up to 9% in average precision
and average recall on the cropped versions. Furthermore, the impact of cropped images is
negligible at Intersection over Union (IoU) thresholds of 50%-70% when compared to
the uncropped versions. However, beyond the 70% IoU threshold, cropped datasets provide
significantly higher detection performance.
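
As a rough illustration of how an IoU threshold affects detection scores, the sketch below computes IoU for axis-aligned boxes and derives precision, recall, and F1 at a chosen threshold using simple greedy matching. This is an assumption-laden simplification, not the evaluation protocol of the thesis: the box format, the function names, and the greedy one-to-one matching are all choices made for this example, and it reports plain precision and recall rather than the averaged metrics cited above.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(
    preds: List[Box], truths: List[Box], threshold: float
) -> Tuple[float, float, float]:
    """Greedily match each prediction to its best unmatched ground-truth box.

    A match counts as a true positive only if its IoU reaches the threshold.
    """
    unmatched = list(truths)
    true_positives = 0
    for pred in preds:
        best = max(unmatched, key=lambda t: iou(pred, t), default=None)
        if best is not None and iou(pred, best) >= threshold:
            true_positives += 1
            unmatched.remove(best)
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(truths) if truths else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    truths = [(0, 0, 10, 10), (20, 0, 30, 10)]
    preds = [(0, 0, 10, 8), (20, 0, 30, 5)]  # IoU 0.8 and 0.5 with their targets
    for thr in (0.5, 0.75):
        print(thr, precision_recall_f1(preds, truths, thr))
```

In this toy example both predictions count as correct at a 0.5 threshold, but only the tighter one survives at 0.75, which mirrors why loosely localized detections are penalized only at stricter IoU thresholds.
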

Deep Learning

Convolutional Neural Networks

Image Processing

Document Processing

Table Structure Detection

Table Detection

Tabular Data Extraction

Page Object Detection

Identifier	oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/43402
Date	22 March 2022
Creators	Akkaya, Yakup
Contributors	Kantarci, Burak
Publisher	Université d'Ottawa / University of Ottawa
Source Sets	Université d’Ottawa
Language	English
Detected Language	English
Type	Thesis
Format	application/pdf
Rights	CC0 1.0 Universal, http://creativecommons.org/publicdomain/zero/1.0/
