• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Bidirectional LSTM-CNNs-CRF Models for POS Tagging

Tang, Hao January 2018 (has links)
In order to achieve state-of-the-art performance for part-of-speech(POS) tagging, the traditional systems require a significant amount of hand-crafted features and data pre-processing. In this thesis, we present a discriminative word embedding, character embedding and byte pair encoding (BPE) hybrid neural network architecture to implement a true end-to-end system without feature engineering and data pre-processing. The neural network architecture is a combination of bidirectional LSTM, CNNs, and CRF, which can achieve a state-of-the-art performance for a wide range of sequence labeling tasks. We evaluate our model on Universal Dependencies (UD) dataset for English, Spanish, and German POS tagging. It outperforms other models with 95.1%, 98.15%, and 93.43% accuracy on testing datasets respectively. Moreover, the largest improvements of our model appear on out-of-vocabulary corpora for Spanish and German. According to statistical significance testing, the improvements of English on testing and out-of-vocabulary corpora are not statistically significant. However, the improvements of the other more morphological languages are statistically significant on their corresponding corpora.

Page generated in 0.0647 seconds