
A machine learning system for the automatic identification of text structure, and application to research article abstracts in computer science

Teaching learners about the common structural patterns used in different types of texts, such as the abstract and introduction of research papers, has proved successful in many reading and writing courses. However, a major problem faced by researchers when analyzing texts is the vast amount of time needed to conduct the analysis. This has led to many studies reporting only 'preliminary' findings, based on a small corpus of target texts. In this thesis, I propose a computer system that uses machine learning to automatically identify the structure of texts, enabling researchers to quickly and effectively process very large corpora. The system also has applications in the classroom as a teacher resource when evaluating and selecting texts that highlight certain structural features, and as a student resource when conducting data-driven learning. To test the system, I applied it to research article abstracts in computer science journals and found it to be fast and accurate. It was also assessed by a practicing teacher and graduate school student, and shown to be flexible, easy to use, and a practical aid in the classroom.

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:551478
Date: January 2002
Creators: Anthony, L. E.
Publisher: University of Birmingham
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
