Enhancement of Deep Neural Networks and Their Application to Text Mining

Many current application domains of machine learning and artificial intelligence
involve knowledge discovery from text, such as sentiment analysis, document
ontology, and spam detection. Humans have years of experience and training with
language, enabling them to understand complicated, nuanced text passages with relative
ease. A text classifier attempts to emulate or replicate this knowledge so that
computers can discriminate between concepts encountered in text; however, learning
high-level concepts from text, such as those found in many applications of text
classification, is difficult because of the many challenges associated with text mining
and classification. Recently, classifiers trained using artificial neural networks have
been shown to be effective for a variety of text mining tasks. Convolutional neural
networks have been trained to classify text from character-level input, automatically
learning high-level abstract representations and avoiding the need for human-engineered
features.
This dissertation proposes two new techniques for character-level learning:
log(m) character embedding and convolutional window classification. Log(m) embedding
is a new character-vector representation for text data that is more compact and
memory efficient than previous embedding vectors. Convolutional window classification
is a technique for classifying long documents, i.e., documents with lengths
exceeding the input dimension of the neural network. Additionally, we investigate the
performance of convolutional neural networks combined with long short-term memory
networks, explore how document length impacts classification performance, and
compare performance of neural networks against non-neural network-based learners
in text classification tasks.
Includes bibliography. / Dissertation (Ph.D.)--Florida Atlantic University, 2018. / FAU Electronic Theses and Dissertations Collection
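
The two techniques named in the abstract can be illustrated with a rough sketch. The abstract does not give their details, so everything below is an assumption rather than the dissertation's method: log(m) embedding is taken to mean a binary code of about log2(m) bits per character for an alphabet of m characters (with the all-zero code reserved for unknown characters), and convolutional window classification is taken to mean splitting a long document into fixed-length windows and averaging per-window class scores. The names `logm_embedding`, `split_into_windows`, `classify_long_document`, and `classify_window` are hypothetical.

```python
def logm_embedding(text, alphabet):
    """Encode each character as a short binary vector (assumed reading of log(m) embedding)."""
    m = len(alphabet)
    # Codes 0..m need m.bit_length() bits, i.e. ceil(log2(m + 1)).
    width = m.bit_length()
    # Code 0 (all zeros) is reserved for characters outside the alphabet.
    index = {ch: i + 1 for i, ch in enumerate(alphabet)}
    vectors = []
    for ch in text:
        code = index.get(ch, 0)
        # Most significant bit first.
        vectors.append([(code >> bit) & 1 for bit in reversed(range(width))])
    return vectors


def split_into_windows(sequence, window, stride):
    """Cut a sequence into fixed-length windows; a final window covers any leftover tail."""
    if len(sequence) <= window:
        return [sequence]
    starts = list(range(0, len(sequence) - window + 1, stride))
    if starts[-1] + window < len(sequence):
        starts.append(len(sequence) - window)
    return [sequence[s:s + window] for s in starts]


def classify_long_document(sequence, window, stride, classify_window):
    """Average per-window class scores (one assumed way to combine window predictions)."""
    windows = split_into_windows(sequence, window, stride)
    scores = [classify_window(w) for w in windows]
    n_classes = len(scores[0])
    return [sum(s[c] for s in scores) / len(scores) for c in range(n_classes)]
```

Here `classify_window` stands in for a trained network whose input dimension equals `window`. Averaging is only one possible aggregation rule; voting or taking the maximum score would fit the same outline.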

Identifier: oai:union.ndltd.org:fau.edu/oai:fau.digital.flvc.org:fau_40842
Contributors: Prusa, Joseph Daniel (author), Khoshgoftaar, Taghi M. (Thesis advisor), Florida Atlantic University (Degree grantor), College of Engineering and Computer Science, Department of Computer and Electrical Engineering and Computer Science
Publisher: Florida Atlantic University
Source Sets: Florida Atlantic University
Language: English
Detected Language: English
Type: Electronic Thesis or Dissertation, Text
Format: 156 p., application/pdf
Rights: Copyright © is held by the author, with permission granted to Florida Atlantic University to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder., http://rightsstatements.org/vocab/InC/1.0/
