Global ETD Search

Analýza rozložení textu v historických dokumentech / Text Layout Analysis in Historical Documents

The goal of this thesis is to design and implement an algorithm for text layout analysis in historical documents. A neural network, specifically the Faster R-CNN architecture, was used to solve this problem. A dataset of 6,135 images of historical newspapers was used for training and testing. For the purposes of the thesis, four neural network models were trained: a model for detecting words, a model for detecting headings, a model for detecting text regions, and a model for detecting words based on their position in a line. The outputs of these models were processed to determine the text layout of the input image. A modified F-score metric was used for evaluation. Based on this metric, the algorithm reached an accuracy of almost 80%.
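
The abstract does not define the modified F-score, so the following is only a minimal sketch of a conventional detection F-score, given here for orientation. It assumes axis-aligned bounding boxes, greedy one-to-one matching, and an IoU threshold of 0.5; the names `iou`, `detection_f_score`, and `iou_threshold` are hypothetical and are not taken from the thesis.

```python
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max); this layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_f_score(predicted: List[Box], ground_truth: List[Box],
                      iou_threshold: float = 0.5) -> float:
    """Standard F-score with greedy one-to-one IoU matching.

    This is NOT the modified metric used in the thesis, only the
    conventional baseline that such metrics usually build on.
    """
    unmatched = list(ground_truth)
    true_positives = 0
    for box in predicted:
        # Pick the still-unmatched ground-truth box with the highest overlap.
        best = max(unmatched, key=lambda gt: iou(box, gt), default=None)
        if best is not None and iou(box, best) >= iou_threshold:
            unmatched.remove(best)
            true_positives += 1
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(ground_truth)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: two of three predictions match the two annotated boxes.
predicted_boxes = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
annotated_boxes = [(0, 0, 10, 10), (20, 20, 30, 31)]
score = detection_f_score(predicted_boxes, annotated_boxes)
```

In this example precision is 2/3 and recall is 1, so the score is 0.8. The example boxes are invented for illustration and do not come from the thesis dataset.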

http://www.nusl.cz/ntk/nusl-445522

Identifier	oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:445522
Date	January 2021
Creators	Palacková, Bianca
Contributors	Hradiš, Michal, Kodym, Oldřich
Publisher	Vysoké učení technické v Brně. Fakulta informačních technologií
Source Sets	Czech ETDs
Language	Czech
Detected Language	English
Type	info:eu-repo/semantics/masterThesis
Rights	info:eu-repo/semantics/restrictedAccess
