  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

End-to-End Tabular Information Extraction in Datasheets with Deep Learning

Kara, Ertugrul 09 July 2019
The advent of Industry 4.0 has been transforming how information about the specifications of electronic components is managed. This change affects many organizations, including global supply chains that optimize many product chains, such as raw materials or electronic components. Supply chains consist of thousands of manufacturers, connect them to other organizations and end users, and include billions of distinct components. The digitization of critical information has to be carried out automatically since there are millions of documents. Although the documents vary greatly in shape and style, the essential information is usually presented in tables in a condensed format. Extracting the structured information from tables is done by human operators, which costs human effort, time and corporate resources. Motivated by the fact that AI-based solutions are automating many processes, this thesis proposes deep learning-based solutions for three main problems: (i) table detection, (ii) table internal structure detection and (iii) End-to-End (E2E) tabular structure detection. To this end, deep learning models are trained mostly with public datasets, along with a private dataset (after labelling 2000+ documents) provided by our industry partner. To achieve accurate table detection, we propose a method based on the successful Mask Region-Based Convolutional Neural Network (Mask-RCNN) instance segmentation model. With some modifications to the training set labels, we have achieved state-of-the-art detection rates with 99% AP and 100% recall. We use the PASCAL Visual Object Classes (VOC) 11-point Average Precision (AP) metric to compare the evaluated deep learning-based methods. Detecting tables is the initial step towards semantic modelling of e-components. Therefore, the structure should also be detected in order to extract information.
With this in mind, we introduce another method based on the Mask-RCNN model, which is able to detect table structure with around 96% AP. Combining these two networks or developing a new model is therefore necessary. To this end, inspired by the success of Mask-RCNN models, we introduce the following Mask-RCNN based models to realize E2E tabular structure detection. The stitched E2E model, achieved by bridging the output of the table detection model into the structure detection model, attained more than 77% AP on the difficult public UNLV dataset, with various post-processing steps applied when bridging the two networks. Single-pass E2E detection networks were able to attain a higher AP of 86% but with lower recall. This thesis concludes that deep learning-based object detection and instance segmentation networks can accomplish state-of-the-art performance.
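
The abstract above names the PASCAL VOC 11-point AP metric without defining it. The following is a minimal sketch of the standard 11-point interpolation, not the thesis's own evaluation code: precision is interpolated at the recall levels 0.0, 0.1, ..., 1.0 by taking the highest precision observed at any recall at or above each level, and the eleven values are averaged. The function name `voc_11_point_ap` and the input format (parallel lists of recall and precision values) are assumptions made for illustration.

```python
def voc_11_point_ap(recalls, precisions):
    """Average the interpolated precision at recall levels 0.0, 0.1, ..., 1.0.

    `recalls` and `precisions` are parallel lists describing points on a
    precision-recall curve (hypothetical input format for this sketch).
    """
    total = 0.0
    for i in range(11):
        level = i / 10
        # Interpolated precision: best precision at any recall >= level.
        reachable = [p for r, p in zip(recalls, precisions) if r >= level]
        total += max(reachable, default=0.0)
    return total / 11


# A detector that keeps precision 1.0 but only reaches recall 0.5 scores
# at the six levels 0.0 to 0.5 and contributes zero at the remaining five.
half_recall_ap = voc_11_point_ap([0.5], [1.0])
```

Under this definition a model cannot reach a high AP from precision alone, since every recall level it fails to reach contributes zero to the average.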
2

Tabular Information Extraction from Datasheets with Deep Learning for Semantic Modeling

Akkaya, Yakup 22 March 2022
The growing popularity of artificial intelligence and machine learning has led to the adoption of the automation vision in the industry by many other institutions and organizations. Many corporations have made it their primary objective to make the delivery of goods and services, as well as manufacturing, more efficient with minimal human intervention. Automated document processing and analysis is also a critical component of this cycle for many organizations that contribute to the supply chain. The massive volume and diversity of data created in this rapidly evolving environment make this a highly desired step. Despite this diversity, important information in the documents is provided in tables. As a result, extracting tabular data is a crucial aspect of document processing. This thesis applies deep learning methodologies to detect table structure elements for the extraction of data and preparation for semantic modelling. In order to find the optimal structure definition, we analyzed the performance of deep learning models on different formats such as row/column and cell. The combined row and column detection models perform poorly compared to the other models due to the highly overlapping nature of rows and columns. Separate row and column detection models achieve the best average F1-scores, with 78.5% and 79.1%, respectively. However, determining cell elements from the row and column detections for semantic modelling is a complicated task due to spanning rows and columns. Considering these facts, a new method for setting the ground-truth information, called content-focused annotation, is proposed to define table elements better. Our content-focused method is competent in handling ambiguities caused by huge white spaces and a lack of boundary lines in table structures; hence, it provides higher accuracy. Prior works have addressed the table analysis problem under table detection and table structure detection tasks.
However, the impact of dataset structures on table structure detection has not been investigated. We provide a comparison of table structure detection performance with cropped and uncropped datasets. The cropped set consists of only table images that are cropped from documents assuming tables are detected perfectly. The uncropped set consists of regular document images. Experiments show that deep learning models can improve the detection performance by up to 9% in average precision and average recall on the cropped versions. Furthermore, the impact of cropped images is negligible under the Intersection over Union (IoU) values of 50%-70% when compared to the uncropped versions. However, beyond 70% IoU thresholds, cropped datasets provide significantly higher detection performance.
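
The IoU thresholds mentioned above decide whether a predicted table element counts as a correct detection. The following is a minimal sketch of that test for axis-aligned boxes, not code from the thesis: the `(x1, y1, x2, y2)` box format and the helper names `iou` and `is_match` are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped to zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


def is_match(predicted, ground_truth, threshold=0.5):
    """A prediction counts as correct when its IoU reaches the threshold."""
    return iou(predicted, ground_truth) >= threshold


# A predicted row that covers 80% of the ground-truth row passes at a
# 70% threshold but fails at 90%.
truth = (0, 0, 10, 10)
prediction = (0, 0, 10, 8)
loose = is_match(prediction, truth, threshold=0.7)
strict = is_match(prediction, truth, threshold=0.9)
```

Raising the threshold demands tighter localization of each detected element, which is the regime (beyond 70% IoU) where the abstract reports that cropped datasets help most.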
