Relationship extraction is the task of extracting semantic relationships between entities from a text. We create a Czech Relationship Extraction Dataset (CERED) using distant supervision on Wikidata and Czech Wikipedia. We detail the methodology we used and the pitfalls we encountered. Then we use CERED to fine-tune a neural network model for relationship extraction. We base our model on BERT, a language model pre-trained on extensive unlabeled data. We demonstrate that our model performs well on existing English relationship datasets (SemEval-2010 Task 8, TACRED) and report the results we achieved on CERED.
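As an illustration of the distant supervision idea mentioned above, the following minimal sketch labels a sentence with a relation whenever both the subject and the object of a knowledge-base fact appear in it. It is not the thesis's actual pipeline: the `Triple` type, the `label_sentences` function, the relation names, and the toy facts and sentences are all hypothetical, and plain substring matching stands in for whatever entity linking the real dataset construction used.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A knowledge-base fact (hypothetical stand-in for a Wikidata statement)."""
    subject: str
    relation: str
    obj: str


# Toy facts, assumed for illustration only.
KB = [
    Triple("Praha", "capital_of", "Česko"),
    Triple("Karel Čapek", "place_of_birth", "Malé Svatoňovice"),
]


def label_sentences(sentences, kb):
    """Emit (sentence, subject, object, relation) for every KB fact
    whose subject and object both occur in the sentence."""
    examples = []
    for sentence in sentences:
        for triple in kb:
            if triple.subject in sentence and triple.obj in sentence:
                examples.append(
                    (sentence, triple.subject, triple.obj, triple.relation)
                )
    return examples


sentences = [
    "Praha je hlavní město státu Česko.",
    "Karel Čapek se narodil v obci Malé Svatoňovice.",
    "Karel Čapek navštívil město Praha.",  # no KB fact links these two entities
]

examples = label_sentences(sentences, KB)
for sentence, subject, obj, relation in examples:
    print(f"{relation}({subject}, {obj}) <- {sentence}")
```

Labels produced this way are noisy, since a sentence can mention both entities without expressing the relation, which is one general pitfall of distant supervision.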
Identifier | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:434974 |
Date | January 2020 |
Creators | Šimečková, Zuzana |
Contributors | Straka, Milan; Straňák, Pavel |
Source Sets | Czech ETDs |
Language | English |
Detected Language | English |
Type | info:eu-repo/semantics/masterThesis |
Rights | info:eu-repo/semantics/restrictedAccess |