Convolutional Neural Networks for Named Entity Recognition in Images of Documents

This work researches named entity recognition (NER) with respect to images of documents with a domain-specific layout, by means of Convolutional Neural Networks (CNNs). Examples of such documents are receipts, invoices, forms and scientific papers, the latter of which are used in this work. An NER task is first performed statically, where a fixed number of entity classes is extracted per document. Networks based on the deep VGG-16 network are used for this task. Here, experimental evaluation shows that framing the task as a classification task, where the network classifies each bounding box coordinate separately, leads to the best network performance. Also, a multi-headed architecture is introduced, where the network has an independent fully-connected classification head per entity. VGG-16 achieves better performance with the multi-headed architecture than with its default, single-headed architecture. Additionally, it is shown that transfer learning does not improve the performance of these networks. Analysis suggests that the networks trained for the static NER task learn to recognize document templates, rather than the entities themselves, and therefore do not generalize well to new, unseen templates. For a dynamic NER task, where the type and number of entity classes vary per document, experimental evaluation shows that, on large entities in the document, the Faster R-CNN object detection framework achieves comparable performance to the networks trained on the static task. Analysis suggests that Faster R-CNN generalizes better to new templates than the networks trained for the static task, as Faster R-CNN is trained on local features rather than the full document template. Finally, analysis shows that Faster R-CNN performs poorly on small entities in the image, and suggestions are made to improve its performance.
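
The classification framing mentioned in the abstract can be illustrated with a small sketch. It is not taken from the thesis: the bin count, the image size, the entity names and the helper names (`decode_box`, `decode_heads`) are all assumptions made for illustration. The sketch supposes that each per-entity head emits one score vector per bounding box coordinate, and that the predicted coordinate is the centre of the highest-scoring bin.

```python
from typing import Dict, Sequence, Tuple

# Order of the four coordinate classifiers inside one hypothetical head.
COORDS = ("x_min", "y_min", "x_max", "y_max")


def argmax(scores: Sequence[float]) -> int:
    """Index of the highest score (first one wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def decode_box(
    coord_scores: Sequence[Sequence[float]],
    image_size: Tuple[int, int],
    num_bins: int,
) -> Dict[str, float]:
    """Turn four per-coordinate score vectors into pixel coordinates.

    Assumption: each coordinate axis is split into `num_bins` equal bins
    and the prediction is the centre of the winning bin.
    """
    width, height = image_size
    extents = (width, height, width, height)  # x uses width, y uses height
    box = {}
    for name, scores, extent in zip(COORDS, coord_scores, extents):
        box[name] = (argmax(scores) + 0.5) * extent / num_bins
    return box


def decode_heads(
    head_outputs: Dict[str, Sequence[Sequence[float]]],
    image_size: Tuple[int, int],
    num_bins: int,
) -> Dict[str, Dict[str, float]]:
    """Decode one independent head per entity (hypothetical entity names)."""
    return {
        entity: decode_box(scores, image_size, num_bins)
        for entity, scores in head_outputs.items()
    }


# Toy scores for a single hypothetical "title" head, 4 bins per coordinate.
example = {
    "title": [
        [0.0, 1.0, 0.0, 0.0],  # x_min -> bin 1
        [1.0, 0.0, 0.0, 0.0],  # y_min -> bin 0
        [0.0, 0.0, 0.0, 1.0],  # x_max -> bin 3
        [0.0, 0.0, 1.0, 0.0],  # y_max -> bin 2
    ]
}
boxes = decode_heads(example, image_size=(400, 800), num_bins=4)
```

Under these assumptions, every entity has its own head, so adding an entity class adds a head rather than widening a shared output layer. The actual output encoding used in the thesis may differ.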

http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-191213

Convolutional Neural Networks

Faster R-CNN

Named Entity Recognition

Images

Documents

Engineering and Technology

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-191213
Date	January 2016
Creators	van de Kerkhof, Jan
Publisher	KTH, Skolan för datavetenskap och kommunikation (CSC)
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
