• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Stylometry: Quantifying Classic Literature For Authorship Attribution : - A Machine Learning Approach

Yousif, Jacob, Scarano, Donato January 2024 (has links)
Classic literature is rich, be it linguistically, historically, or culturally, making it valuable for future studies. Consequently, this project chose a set of 48 classic books to conduct a stylometric analysis on the defined set of books, adopting an approach used by a related work to divide the books into text segments, quantify the resulting text segments, and analyze the books using the quantified values to understand the linguistic attributes of the books. Apart from the latter, this project conducted different classification tasks for other objectives. In one respect, the study used the quantified values of the text segments of the books for classification tasks using advanced models like LightGBM and TabNet to assess the application of this approach in authorship attribution. From another perspective, the study utilized a State-Of-The-Art model, namely, RoBERTa for classification tasks using the segmented texts of the books instead to evaluate the performance of the model in authorship attribution. The results uncovered the characteristics of the books to a reasonable degree. Regarding the authorship attribution tasks, the results suggest that segmenting and quantifying text using stylometric analysis and supervised machine learning algorithms is practical in such tasks. This approach, while showing promise, may still require further improvements to achieve optimal performance. Lastly, RoBERTa demonstrated high performance in authorship attribution tasks.

Page generated in 0.1069 seconds