  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
221

Topological Analysis of Averaged Sentence Embeddings

Holmes, Wesley J. January 2020 (has links)
No description available.
222

A Language-Model-Based Approach for Detecting Incompleteness in Natural-Language Requirements

Luitel, Dipeeka 24 May 2023 (has links)
[Context and motivation]: Incompleteness in natural-language requirements is a challenging problem. [Question/Problem]: A common technique for detecting incompleteness in requirements is checking the requirements against external sources. With the emergence of language models such as BERT, an interesting question is whether language models are useful external sources for finding potential incompleteness in requirements. [Principal ideas/results]: We mask words in requirements and have BERT's masked language model (MLM) generate contextualized predictions for filling the masked slots. We simulate incompleteness by withholding content from requirements and measure BERT's ability to predict terminology that is present in the withheld content but absent from the content disclosed to BERT. [Contributions]: BERT can be configured to generate multiple predictions per mask. Our first contribution is determining the number of predictions per mask that best balances effective discovery of omissions in requirements against the level of noise in the predictions. Our second contribution is a machine-learning-based filter that post-processes the predictions made by BERT to further reduce noise. We empirically evaluate our solution over 40 requirements specifications drawn from the PURE dataset [30]. Our results indicate that: (1) predictions made by BERT are highly effective at pinpointing terminology that is missing from requirements, and (2) our filter can substantially reduce noise in the predictions, making BERT a more compelling aid for improving completeness in requirements.
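The evaluation idea described above — checking whether the MLM's predictions for masked slots recover terminology from the withheld content — can be sketched as a simple recall computation. The function and data below are illustrative assumptions, not the authors' actual code; the real pipeline obtains `predictions_per_mask` from BERT's masked-language-model head.

```python
def recall_of_withheld_terms(predictions_per_mask, withheld_terms):
    """Fraction of withheld terms that appear among the MLM's predictions.

    predictions_per_mask: one list of candidate words per masked slot.
    withheld_terms: terminology present only in the withheld requirement text.
    """
    predicted = {p.lower() for preds in predictions_per_mask for p in preds}
    withheld = {t.lower() for t in withheld_terms}
    return len(withheld & predicted) / len(withheld) if withheld else 0.0

# Hypothetical example: two masked slots, five predictions each.
preds = [
    ["encryption", "authentication", "logging", "storage", "access"],
    ["latency", "throughput", "encryption", "backup", "audit"],
]
withheld = ["encryption", "audit", "failover"]
print(recall_of_withheld_terms(preds, withheld))  # 2 of 3 withheld terms recovered
```

Raising the number of predictions per mask increases this recall but also adds noise, which is exactly the trade-off the first contribution quantifies.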
223

Characterizing Text Style Based on Semantic Structure

Muncy, Chloe January 2022 (has links)
No description available.
224

Detecting Dissimilarity in Discourse on Social Media

Mineur, Mattias January 2022 (has links)
A lot of interaction between humans takes place on social media. Groups and communities are sometimes formed, both intentionally and unintentionally. These interactions generate a large quantity of text data. This project aims to detect dissimilarity in discourse between communities on social media using a distributed approach. A data set of tweets was used to test and evaluate the method. Tweets produced by two communities were extracted from the data set. Two Natural Language Processing techniques were used to vectorise the tweets for each community: LIWC, a dictionary based on knowledge acquired from professionals in linguistics and psychology, and BERT, an embedding model which uses machine learning to represent words and sentences as vectors of decimal numbers. These vectors were then used as representations of the text to measure the similarity of discourse between the communities. Both distance and similarity were measured. It was concluded that none of the combinations of measure and vectorisation method that were tried could detect a dissimilarity in discourse on social media for the sample data set.
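The comparison step described above — measuring both similarity and distance between the communities' vector representations — can be sketched as follows. The three-dimensional vectors are hypothetical stand-ins; real LIWC or BERT representations have tens to hundreds of dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors (1.0 = identical direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    """Straight-line distance between two embedding vectors (0.0 = identical)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical mean vectors for two communities' tweets.
community_a = [0.2, 0.7, 0.1]
community_b = [0.3, 0.6, 0.2]
print(cosine_similarity(community_a, community_b))   # close to 1.0: similar discourse
print(euclidean_distance(community_a, community_b))  # close to 0.0: similar discourse
```

A dissimilarity detector would flag community pairs whose similarity falls below (or distance rises above) some threshold; the project's finding is that no such threshold separated the sampled communities.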
225

Self-supervised text sentiment transfer with rationale predictions and pretrained transformers

Sinclair, Neil 21 April 2023 (has links) (PDF)
Sentiment transfer involves changing the sentiment of a sentence, such as from a positive to a negative sentiment, whilst maintaining the informational content. Whilst this challenge in the NLP research domain can be framed as a translation problem, traditional sequence-to-sequence translation methods are inadequate due to the dearth of parallel corpora for sentiment transfer. Thus, sentiment transfer can be posed as an unsupervised learning problem in which a model must learn to transfer from one sentiment to another in the absence of parallel sentences. Given that the sentiment of a sentence is often defined by a limited number of sentiment-specific words within it, the problem can also be posed as one of identifying and altering sentiment-specific words as a means of transferring from one sentiment to another. In this dissertation we use a novel method of sentiment word identification from the interpretability literature called the method of rationales. This method identifies the words or phrases in a sentence that explain the 'rationale' for a classifier's class prediction, in this case the sentiment of the sentence. This method is then compared against a baseline heuristic sentiment word identification method. We also experiment with a pretrained encoder-decoder Transformer model, known as BART, as a method for improving upon previous sentiment transfer results. This pretrained model is first fine-tuned in an unsupervised manner as a denoising autoencoder to reconstruct sentences in which sentiment words have been masked out. The fine-tuned model then generates a parallel corpus, which is used to further fine-tune the final stage of the model in a self-supervised manner. Results were compared against a baseline using automatic evaluations of accuracy and BLEU score, as well as human evaluations of content preservation, sentiment accuracy and sentence fluency.
The results of this dissertation show that both neural network and heuristic-based methods of sentiment word identification achieve similar results across models for similar levels of sentiment word removal for the Yelp dataset. However, the heuristic approach leads to improved results with the pretrained model on the Amazon dataset. We also find that using the pretrained Transformers model improves upon the results of using the baseline LSTM trained from scratch for the Yelp dataset for all automatic metrics. The pretrained BART model scores higher across all human-evaluated outputs for both datasets, which is likely due to its larger size and pretraining corpus. These results also show a similar trade-off between content preservation and sentiment transfer accuracy as in previous research, with more favourable results on the Yelp dataset relative to the baseline.
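The masking step that produces the denoising autoencoder's training input can be sketched as below. The fixed mini-lexicon and `<mask>` token are illustrative assumptions; the dissertation identifies sentiment words with the method of rationales or a heuristic, not a hand-written list.

```python
# Hypothetical mini-lexicon; the actual sentiment words are identified
# by a learned rationale model or a heuristic, not a fixed list.
SENTIMENT_WORDS = {"great", "terrible", "delicious", "awful", "love", "hate"}
MASK = "<mask>"

def mask_sentiment_words(sentence):
    """Replace sentiment-bearing tokens with a mask token.

    The masked sentence is the denoising-autoencoder input; the original
    sentence is the reconstruction target during fine-tuning.
    """
    tokens = sentence.split()
    masked = [MASK if t.lower().strip(".,!?") in SENTIMENT_WORDS else t
              for t in tokens]
    return " ".join(masked)

print(mask_sentiment_words("The food was delicious and the staff were great!"))
# → The food was <mask> and the staff were <mask>
```

At transfer time, a model trained this way can fill the masked slots with words of the target sentiment while the unmasked content words carry the informational content unchanged.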
226

Driving by Speaking: Natural Language Control of Robotic Wheelchairs

Hecht, Steven A. 16 August 2013 (has links)
No description available.
227

Information and Representation Tradeoffs in Document Classification

Jin, Timothy 23 May 2022 (has links)
No description available.
228

SKEWER: Sentiment Knowledge Extraction with Entity Recognition

Wu, Christopher James 01 June 2016 (has links) (PDF)
The California state legislature introduces approximately 5,000 new bills each legislative session. While the legislative hearings are recorded on video, the recordings are not easily accessible to the public. The lack of official transcripts or summaries also increases the effort required to gain meaningful insight from those recordings. Therefore, the news media and the general population are largely oblivious to what transpires during legislative sessions. Digital Democracy, a project started by the Cal Poly Institute for Advanced Technology and Public Policy, is an online platform created to bring transparency to the California legislature. It features a searchable database of state legislative committee hearings, with each hearing accompanied by a transcript that was generated by an internal transcription tool. This thesis presents SKEWER, a pipeline for building a spoken-word knowledge graph from those transcripts. SKEWER utilizes a number of natural language processing tools to extract named entities, phrases, and sentiments from the transcript texts and aggregates the results of those tools into a graph database. The resulting graph can be queried to discover knowledge regarding the positions of legislators, lobbyists, and the general public towards specific bills or topics, and how those positions are expressed in committee hearings. Several case studies are presented to illustrate the new knowledge that can be acquired from the knowledge graph.
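The aggregation step described above — folding per-utterance extraction results into a queryable graph — can be sketched with a nested mapping. The triples and names are hypothetical; SKEWER itself stores the results of NER and sentiment tools in a graph database rather than an in-memory dict.

```python
from collections import defaultdict

def build_stance_graph(utterances):
    """Aggregate extracted (speaker, bill, sentiment) triples into a
    simple edge-labelled graph: speaker -> bill -> list of sentiment scores."""
    graph = defaultdict(lambda: defaultdict(list))
    for speaker, bill, sentiment in utterances:
        graph[speaker][bill].append(sentiment)
    return graph

# Hypothetical extraction output from committee-hearing transcripts.
triples = [
    ("Sen. Smith", "AB 123", 0.8),
    ("Sen. Smith", "AB 123", 0.6),
    ("Lobbyist Doe", "AB 123", -0.4),
]
graph = build_stance_graph(triples)
scores = graph["Sen. Smith"]["AB 123"]
avg = sum(scores) / len(scores)
print(f"Sen. Smith on AB 123: {avg:+.2f}")  # mean stance across utterances
```

Queries over such a graph answer exactly the questions the abstract mentions: which legislators or lobbyists took what positions on which bills, and how consistently.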
229

Towards Explainable Event Detection and Extraction

Mehta, Sneha 22 July 2021 (has links)
Event extraction refers to extracting specific knowledge of incidents from natural language text and consolidating it into a structured form. Some important applications of event extraction include search, retrieval, question answering and event forecasting. However, before events can be extracted it is imperative to detect them, i.e. to identify which documents in a large collection contain events of interest and to extract from those documents the sentences that might contain event-related information. This task is challenging because it is easier to obtain labels at the document level than fine-grained annotations at the sentence level. Current approaches for this task are suboptimal because they directly aggregate sentence probabilities estimated by a classifier to obtain document probabilities, resulting in error propagation. To alleviate this problem we propose to leverage recent advances in representation learning by using attention mechanisms. Specifically, for event detection we propose a method to compute document embeddings from sentence embeddings by leveraging attention, and we train a document classifier on those embeddings to mitigate the error propagation problem. However, we find that existing attention mechanisms are ill-suited for this task, because they are either suboptimal or use a large number of parameters. To address this problem we propose a lean attention mechanism which is effective for event detection. Current approaches for event extraction rely on fine-grained labels in specific domains. Extending extraction to new domains is challenging because of the difficulty of collecting fine-grained data. Machine reading comprehension (MRC) based approaches, which enable zero-shot extraction, struggle with syntactically complex sentences and long-range dependencies. To mitigate this problem, we propose a syntactic sentence simplification approach that is guided by the MRC model to improve its performance on event extraction.
/ Doctor of Philosophy / Event extraction is the task of extracting events of societal importance from natural language texts. The task has a wide range of applications, from search, retrieval and question answering to forecasting population-level events such as civil unrest and disease occurrences with reasonable accuracy. Before events can be extracted it is imperative to identify the documents that are likely to contain the events of interest and extract the sentences that mention those events. This is termed event detection. Current approaches for event detection are suboptimal. They assume that events are neatly partitioned into sentences and obtain document-level event probabilities directly from predicted sentence-level probabilities. In this dissertation, under the same assumption, we leverage representation learning to mitigate some of the shortcomings of previous event detection methods. Current approaches to event extraction are limited to restricted domains and require fine-grained labeled corpora for their training. One way to extend event extraction to new domains is by enabling zero-shot extraction. A machine reading comprehension (MRC) based approach provides a promising way forward for zero-shot extraction. However, this approach suffers from the long-range dependency problem and faces difficulty in handling syntactically complex sentences with multiple clauses. To mitigate this problem we propose a syntactic sentence simplification algorithm that is guided by the MRC system to improve its performance.
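The core idea of building a document embedding from sentence embeddings via attention can be sketched as a softmax-weighted average. This is a minimal, parameter-light illustration in the spirit of the "lean attention" the abstract describes, not the dissertation's actual mechanism; the embeddings and scores are hypothetical.

```python
import math

def attention_document_embedding(sentence_embeddings, scores):
    """Combine sentence embeddings into one document embedding using
    softmax-normalised attention weights over per-sentence scores."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(sentence_embeddings[0])
    return [sum(w * emb[d] for w, emb in zip(weights, sentence_embeddings))
            for d in range(dim)]

# Hypothetical 2-D sentence embeddings and raw relevance scores;
# the first sentence is scored as most event-relevant.
sents = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
doc = attention_document_embedding(sents, [2.0, 0.1, 0.1])
print(doc)  # dominated by the first (high-scoring) sentence
```

Training a document classifier on such embeddings, rather than aggregating per-sentence class probabilities, is what lets the approach sidestep the error-propagation problem described above.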
230

A Mixed Methods Study of Ranger Attrition:  Examining the Relationship of Candidate Attitudes, Attributions and Goals

Coombs, Aaron Keith 01 May 2023 (has links)
Elite military selection programs like the 75th Ranger Regiment's Ranger Assessment and Selection Program (RASP) are known for their difficulty and high attrition rates, despite substantial candidate screening just to get into such programs. The current study analyzes Ranger candidates' attitudes, attributions, and goals (AAGs) and their relationship with successful completion of pre-RASP, a preparation phase for the demanding eight-week RASP program. Candidates' entry and exit surveys were analyzed using natural language processing (NLP), as well as more traditional statistical analyses of Likert-measured survey items, to determine which reasons for joining and which individual goals related to candidate success. Candidates' Intrinsic Motivations and Satisfaction as measured on entry surveys were the strongest predictors of success. Specifically, candidates' desire to deploy or serve in combat, and the goal of earning credibility in the Rangers, were the most important reasons and goals provided through candidates' open-text responses. Additionally, between-groups analyses of Black, Hispanic, and White candidates showed that differences in candidate abilities and motivations explain pre-RASP attrition better than demographic proxies such as race or ethnicity. The study's use of NLP demonstrates the practical utility of applying machine learning to quantitatively analyze open-text responses that have traditionally been limited to qualitative analysis or subject to human coding, although predictive models utilizing more traditional Likert measurement of AAGs had better predictive accuracy. / Doctor of Philosophy / Elite military selection programs like the 75th Ranger Regiment's Ranger Assessment and Selection Program (RASP) are known for their difficulty and high attrition rates, despite substantial candidate screening just to get into such programs.
The current study analyzes Ranger candidates' attitudes and goals and their relationship with successful completion of pre-RASP, a preparation phase for the demanding eight-week RASP program. Candidates' entry and exit surveys were analyzed to better understand the relationship between candidates' reasons for volunteering and their goals in the organization. Candidates' Intrinsic Motivations and their Satisfaction upon arrival for pre-RASP best predicted candidate success. Specifically, candidates' desires to deploy or serve in combat, and the goal of earning credibility in the Rangers, were the most important reasons and goals provided through candidates' open-text responses. Additionally, between-groups analyses of Black, Hispanic, and White candidates showed that differences in candidate abilities and motivations explain pre-RASP attrition better than demographic proxies such as race or ethnicity.
