
The Impact of Semantic and Stylistic Features in Genre Classification for News

Pei, Ziming, January 2022
In this thesis, we investigate the usefulness of a group of features for genre classification of news. We choose a diverse feature set covering both the content and the style of the texts. The features are divided into two groups: semantic and stylistic. The semantic features include genre-exclusive words, emotional words and synonyms. The stylistic features include character-level and document-level features. To evaluate the effects of these features, we use three traditional machine learning classifiers and one neural network model: Support Vector Machine, Complement Naive Bayes, k-Nearest Neighbor, and Convolutional Neural Network. The results are evaluated by F1 score, precision and recall (both micro- and macro-averaged). We compare the performance of the models across feature sets to find the optimal feature set for this news genre classification task and, at the same time, the most suitable classifier. We show that genre-exclusive words and synonyms benefit the classification task, in that they are the most informative features in the training process. Emotional words have a negative effect on the results. The best result, 0.97 in macro-averaged F1 score, precision and recall, is achieved by the Complement Naive Bayes model on the feature set that combines the preprocessed dataset with its synonym sets generated from context. We discuss the experimental results and the best-performing models, answer the research questions, and provide suggestions for future studies.
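The difference between the micro- and macro-averaged scores mentioned in the abstract can be illustrated with a minimal sketch in plain Python. The genre labels and predictions below are hypothetical and are not taken from the thesis, and the function names are made up for illustration. Macro-averaged F1 is computed here as the unweighted mean of per-class F1 scores, which is one common convention; the thesis may define it differently.

```python
from collections import Counter


def per_class_counts(y_true, y_pred):
    """Count true positives, false positives and false negatives per label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[t] += 1  # true label t was missed
    labels = sorted(set(y_true) | set(y_pred))
    return labels, tp, fp, fn


def prf(tp, fp, fn):
    """Precision, recall and F1 from raw counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_average(y_true, y_pred):
    """Score each class separately, then take the unweighted mean."""
    labels, tp, fp, fn = per_class_counts(y_true, y_pred)
    scores = [prf(tp[l], fp[l], fn[l]) for l in labels]
    return tuple(sum(s[i] for s in scores) / len(labels) for i in range(3))


def micro_average(y_true, y_pred):
    """Pool the counts over all classes, then score once."""
    _, tp, fp, fn = per_class_counts(y_true, y_pred)
    return prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))


# Hypothetical genre labels, for illustration only.
y_true = ["news", "news", "news", "opinion", "opinion", "sports"]
y_pred = ["news", "news", "opinion", "opinion", "news", "sports"]

print("macro (P, R, F1):", macro_average(y_true, y_pred))
print("micro (P, R, F1):", micro_average(y_true, y_pred))
```

Macro-averaging gives every genre equal weight regardless of how many documents it contains, so a small genre that is classified well (here, the single "sports" document) lifts the macro score above the micro score, which weights every document equally.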
