91 |
Developing a dynamic recommendation system for personalizing educational content within an E-learning network
Mirzaeibonehkhater, Marzieh January 2018
Indiana University-Purdue University Indianapolis (IUPUI) / This research proposed a dynamic recommendation system for a social learning environment called CourseNetworking (CN). CN gives users the opportunity to meet their academic requirements by receiving the most relevant and up-to-date content. In our research, we extracted implicit and explicit features from the system, namely the most relevant user features and post features. The selected features are used to build a rating scale between users and posts that represents the link between user and post in this learning management system (LMS). We developed an algorithm that measures the link between each individual user and each post. To achieve our design goal, we applied natural language processing (NLP) techniques for text analysis and applied various classification techniques for feature selection. We believe that treating the content of the posts as an impactful feature in learning environments will greatly affect the performance of our system. Our experimental results demonstrated that our recommender system predicts the most informative and relevant posts for the users. Our system design addressed the sparsity and cold-start problems, which are the two main challenges in recommender systems.
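To make the idea of a content-based link between a user and a post concrete, here is a minimal Python sketch. It is not the algorithm from the thesis: the abstract does not name its features or scoring function, so the bag-of-words profile, the cosine score, and the sample data below are all assumptions chosen for illustration. Scoring from text alone is one common way to rank posts for a user who has no rating history yet.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_posts(user_profile_text, posts):
    """Return (score, post) pairs sorted from most to least relevant."""
    profile = bag_of_words(user_profile_text)
    scored = [(cosine(profile, bag_of_words(p)), p) for p in posts]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical data: a new user who has only listed course interests.
profile_text = "machine learning and statistics course"
posts = [
    "Notes on machine learning for the statistics course",
    "Campus parking update",
]
ranking = rank_posts(profile_text, posts)
```

In this toy run the course-notes post ranks first and the parking post scores zero, because it shares no words with the profile.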
|
92 |
Tracking and Characterizing Natural Language Semantic Dynamics of Conversations in Real-Time
Alsayed, Omar 24 May 2022
No description available.
|
93 |
Formalizing Contract Refinements Using a Controlled Natural Language
Meloche, Regan 30 November 2023
The formalization of natural language contracts can make the prescriptions found in these contracts more precise, promoting the development of smart contracts, which are digitized forms of the documents where the monitoring and execution can be partially automated. Full formalization remains a difficult problem, and this thesis makes steps towards solving this challenge by focusing on a narrow sub-problem of formalizing contract refinements. We want to allow a contract author to customize a contract template, and automatically convert the resulting contract to a formal specification language called Symboleo, created specifically for the legal contract domain. The hope is that research towards partial formalization can be useful on its own, as well as useful towards the full formalization of contracts.
The main questions addressed by this thesis involve asking what linguistic forms these refinements will take. Answering these questions involves both linguistic analysis and empirical analysis on a set of real contracts to construct a controlled natural language (CNL). This language is expressive and natural enough to be adopted by contract authors, and it is precise enough that it can reliably be converted into the proper formal specification. We also design a tool, SymboleoNLP, that demonstrates this functionality on realistic contracts. This involves ensuring that the contract author can input contract refinements that adhere to our CNL, and that the refinements are properly formalized with Symboleo.
In addition to contributing an evidence-based CNL for contract refinements, this thesis also outlines a very clear methodology for constructing this CNL, which may need to go through iterations as requirements change and as the Symboleo language evolves. The SymboleoNLP tool is another contribution, and is designed for iterative improvement. We explore a number of potential areas where further NLP techniques may be integrated to improve performance, and the tool is designed for easy integration of these modules to adapt to emerging technologies and changing requirements.
|
94 |
‘How can one evaluate a conversational software agent framework?’
Panesar, Kulvinder 07 October 2020
Yes / This paper presents a critical evaluation framework for a linguistically orientated conversational software agent (CSA) (Panesar, 2017). The CSA prototype investigates the integration, intersection and interface of the language, knowledge, and speech act constructions (SAC) based on a grammatical object (Nolan, 2014), and the sub-model of belief, desires and intention (BDI) (Rao and Georgeff, 1995) and dialogue management (DM) for natural language processing (NLP). A long-standing issue within NLP CSA systems is refining the accuracy of interpretation to provide realistic dialogue to support human-to-computer communication.
This prototype comprises three phase models: (1) a linguistic model based on a functional linguistic theory, Role and Reference Grammar (RRG) (Van Valin Jr, 2005); (2) an Agent Cognitive Model with two inner models: (a) a knowledge representation model employing conceptual graphs serialised to Resource Description Framework (RDF); (b) a planning model underpinned by BDI concepts (Wooldridge, 2013), intentionality (Searle, 1983) and rational interaction (Cohen and Levesque, 1990); and (3) a dialogue model employing common ground (Stalnaker, 2002).
The evaluation approach for this Java-based prototype and its phase models is a multi-approach driven by grammatical testing (English language utterances), software engineering and agent practice. A set of evaluation criteria is grouped per phase model, and the testing framework aims to test the interface, intersection and integration of all phase models and their inner models. This multi-approach encompasses checking performance both at the internal processing stages of each model and in post-implementation assessments of the goals of RRG, together with RRG-specific tests.
The empirical evaluations demonstrate that the CSA is a proof-of-concept, demonstrating RRG’s fitness for purpose for describing and explaining phenomena, language processing and knowledge, and computational adequacy. In contrast, the evaluations identify the complexity of lower-level computational mappings of NL – agent to ontology with semantic gaps, which are further addressed by a lexical bridging consideration (Panesar, 2017).
|
95 |
Contextualizing antimicrobial resistance determinants using deep-learning language models
Edalatmand, Arman 11 1900
Bacterial outbreak publications outline the key factors involved in uncontrolled spread of infection. Such factors include the environments, pathogens, hosts, and antimicrobial resistance (AMR) genes involved. Individually, each paper published in this area gives a glimpse into the devastating impact drug resistant infections have on healthcare, agriculture, and livestock. When examined together, these papers reveal a story across time, from the discovery of new resistance genes to their dissemination to different pathogens, hosts, and environments. My work aims to extract this information from publications by using the biomedical deep-learning language model, BioBERT. BioBERT is pre-trained on all abstracts found in PubMed and has state-of-the-art performance with language tasks using biomedical literature. I trained BioBERT on two tasks: entity recognition to identify AMR-relevant terms (i.e., AMR genes, taxonomy, environments, geographical locations, etc.) and relation extraction to determine which terms identified through entity recognition contextualize AMR genes. Datasets were generated semi-automatically to train BioBERT for these tasks. My work currently collates results from 204,094 antimicrobial resistance publications worldwide and generates interpretable results about the sources where genes are commonly found. Overall, my work takes a large-scale approach to collect antimicrobial resistance data from a commonly overlooked resource, i.e., the systematic examination of the large body of AMR literature. / Thesis / Master of Science (MSc)
|
96 |
Language Identification on Short Textual Data
Cui, Yexin January 2020
Language identification is the task of automatically detecting the language(s) in which a given text or document is written, and it is the very first step of further natural language processing tasks. This task has been well studied for decades; however, most work has focused on long texts rather than short ones, which have proved more challenging because they carry insufficient syntactic and semantic information. In this work, we present approaches to this problem based on deep learning techniques, traditional methods, and their combination. The proposed ensemble model, composed of a learning-based method and a dictionary-based method, achieves 89.6% accuracy on our newly generated gold test set, surpassing the Google Translate API by 3.7% and the industry-leading tool Langid.py by 26.1%. / Thesis / Master of Applied Science (MASc)
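The dictionary-based half of such an ensemble can be pictured with a small Python sketch. This is only an assumption about what a dictionary-based identifier might look like, not the method from the thesis: the word lists, the language codes, and the function name `identify_language` are hypothetical, and a real system would use far larger dictionaries and combine the result with a learned model.

```python
import re

# Tiny hypothetical word lists; real dictionaries would be far larger.
DICTIONARIES = {
    "en": {"the", "is", "and", "where", "you", "are", "thank"},
    "fr": {"le", "la", "est", "et", "où", "vous", "merci"},
    "de": {"der", "die", "ist", "und", "wo", "sie", "danke"},
}


def identify_language(text, dictionaries=DICTIONARIES):
    """Return the language whose dictionary covers the most tokens, or None.

    Ties are resolved in favour of the language listed first.
    """
    tokens = re.findall(r"\w+", text.lower())
    hits = {
        lang: sum(token in words for token in tokens)
        for lang, words in dictionaries.items()
    }
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else None
```

Short inputs often contain no dictionary word at all, in which case this sketch returns `None`. That gap is one reason to pair a dictionary lookup with a learned model.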
|
97 |
Leveraging Structure for Effective Question Answering
Bonadiman, Daniele 25 September 2020
In this thesis, we focus on Answer Sentence Selection (A2S), which is the core task of retrieval-based question answering. A2S consists of selecting the sentences that answer user queries from a collection of documents retrieved by a search engine. Over more than two decades, several solutions based on machine learning have been proposed to solve this task, starting from simple approaches based on manual feature engineering, moving to more complex Structural Tree Kernel models, and recently to Neural Network architectures.
In particular, the latter require little human effort as they can automatically extract relevant features from plain text. The development of neural architectures brought improvements in many areas of A2S, reaching unprecedented results. They substantially increase accuracy on almost all benchmark datasets for A2S. However, this has come at the cost of a huge increase in the number of parameters and in the computational cost of the models. A large number of parameters has led to two drawbacks: the model requires a massive amount of data to train effectively, and huge computational power to maintain an acceptable number of transactions per second in a production environment. Current state-of-the-art techniques for A2S use huge Transformer architectures with up to 340 million parameters, pre-trained on a massive amount of data, e.g., BERT. The latter and related models in the same family, such as RoBERTa, are general architectures, i.e., they can be applied to many NLP tasks without any architectural change.
In contrast to the trend above, we focus on specialized architectures for A2S that can effectively encode the local structure of the question and answer candidate and global information, i.e., the structure of the task and the context in which the answer candidate appears.
In particular, we propose solutions to effectively encode both the local and the global structure of A2S in efficient neural network models. (i) We encode syntactic information in a fast CNN architecture exploiting the capabilities of Structural Tree Kernel to encode the syntactic structure. (ii) We propose an efficient model that can use semantic relational information between question and answer candidates by pretraining word representations on a relational knowledge base. (iii) This efficient approach is further extended to encode each answer candidate's contextual information, encoding all answer candidates in the original context. Lastly, (iv) we propose a solution to encode task-specific structure where it is available, for example, in the community Question Answering task.
The final model, which encodes different aspects of the task, achieves state-of-the-art performance on A2S compared with other efficient architectures. The proposed model is more efficient than attention-based architectures and outperforms BERT by two orders of magnitude in terms of transactions per second during training and testing, i.e., it processes 700 questions per second compared to 6 questions per second for BERT when training on a single GPU.
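For readers unfamiliar with the task itself, the following Python sketch shows the input and output of Answer Sentence Selection using a deliberately simple word-overlap baseline. It is not any of the models proposed in the thesis; the Jaccard scoring and the function names are assumptions used only to show that A2S takes a question plus candidate sentences and returns the best-scoring sentence.

```python
import re


def tokens(text):
    """Lowercased set of word tokens in a sentence."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_answer(question, candidates):
    """Return the candidate with the highest Jaccard overlap with the question.

    Raises ValueError if `candidates` is empty.
    """
    q = tokens(question)

    def score(sentence):
        s = tokens(sentence)
        union = q | s
        return len(q & s) / len(union) if union else 0.0

    return max(candidates, key=score)


# Hypothetical retrieved sentences for one question.
question = "Who wrote the novel Dracula?"
candidates = [
    "Dracula is a novel that Bram Stoker wrote in 1897.",
    "The castle is located in Romania.",
]
best = select_answer(question, candidates)
```

A lexical baseline like this ignores syntax and context, which are the kinds of structure the thesis sets out to encode.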
|
98 |
BCC’ing AI: Using Modern Natural Language Processing to Detect Micro and Macro E-ggressions in Workplace Emails
Cornett, Kelsi E. 24 May 2024
Subtle offensive statements in workplace emails, which I term "Micro E-ggressions," can significantly impact the psychological safety and subsequent productivity of work environments despite their often-ambiguous intent. This thesis investigates the prevalence and nature of both micro and macro e-ggressions within workplace email communications, utilizing state-of-the-art natural language processing (NLP) techniques. Leveraging a large dataset of workplace emails, the study aims to detect and analyze these subtle offenses, exploring their themes and the contextual factors that facilitate their occurrence. The research identifies common types of micro e-ggressions, such as questioning competence and work ethic, and examines the responses to these offenses. Results indicate a high prevalence of offensive content in workplace emails and reveal distinct thematic elements that contribute to the perpetuation of workplace incivility. The findings underscore the potential for NLP tools to bridge gaps in awareness and sensitivity, ultimately contributing to more inclusive and respectful workplace cultures. / Master of Science / Subtle offensive statements in workplace emails, which I term "Micro E-ggressions," can significantly impact the psychological safety and subsequent productivity of work environments despite their often-ambiguous intent. This thesis investigates the prevalence and nature of both micro and macro e-ggressions within workplace email communications, utilizing state-of-the-art natural language processing (NLP) techniques. Leveraging a large dataset of workplace emails, the study aims to detect and analyze these subtle offenses, exploring their themes and the contextual factors that facilitate their occurrence. The research identifies common types of micro e-ggressions, such as questioning competence and work ethic, and examines the responses to these offenses. 
The results show a high occurrence of offensive content in workplace emails and highlight patterns that help maintain a negative work environment. The study demonstrates that advanced language analysis tools can help raise awareness and sensitivity, ultimately fostering more inclusive and respectful workplace cultures.
|
99 |
NLP in Engineering Education - Demonstrating the use of Natural Language Processing Techniques for Use in Engineering Education Classrooms and Research
Bhaduri, Sreyoshi 19 February 2018
Engineering Education is a developing field, with new research and ideas constantly emerging and contributing to the ever-evolving nature of this discipline. Textual data (such as publications, open-ended questions on student assignments, and interview transcripts) form an important means of dialogue between the various stakeholders of the engineering community. Analysis of textual data demands consumption of a lot of time and resources. As a result, researchers end up spending a lot of time and effort in analyzing such text repositories. While there is a lot to be gained through in-depth research analysis of text data, some educators or administrators could benefit from an automated system which could reveal trends and present broader overviews for given datasets in more time and resource efficient ways. Analyzing datasets using Natural Language Processing is one solution to this problem.
The purpose of my doctoral research was two-pronged: first, to describe the current state of use of Natural Language Processing as it applies to the broader field of Education, and second, to demonstrate the use of Natural Language Processing techniques for two Engineering Education specific contexts of instruction and research respectively. Specifically, my research includes three manuscripts: (1) systematic review of existing publications on the use of Natural Language Processing in education research, (2) automated classification system for open-ended student responses to gauge metacognition levels in engineering classrooms, and (3) using insights from Natural Language Processing techniques to facilitate exploratory analysis of a large interview dataset led by a novice researcher.
A common theme across the three tasks was to explore the use of Natural Language Processing techniques to enable the computer to extract meaningful information from textual data for Engineering Education related contexts. Results from my first manuscript suggested that researchers in the broader fields of Education used Natural Language Processing for a wide range of tasks, primarily serving to automate instruction in terms of creating content for examinations, automated grading or intelligent tutoring purposes. In manuscripts two and three I implemented some of the Natural Language Processing techniques such as Part-of-Speech tagging and tf-idf (term frequency-inverse document frequency) that were found (through my systematic review) to be used by researchers, to (a) develop an automated classification system for student responses to gauge their metacognitive levels and (b) conduct an exploratory novice-led analysis of excerpts from interviews of students on career preparedness, respectively. Overall results of my research studies indicate that although the use of Natural Language Processing techniques in Engineering Education is not widespread, such research endeavors could facilitate research and practice in our field. Particularly, this type of approach to textual data could be of use to practitioners in large engineering classrooms who are unable to devote large amounts of time to data analysis but would benefit from algorithmic systems that could quickly present a summary based on information processed from available text data. / Ph. D. / Textual data (such as publications, open-ended questions on student assignments, and interview transcripts) form an important means of dialogue between the various stakeholders of the engineering community. However, analyzing these datasets can be time consuming as well as resource-intensive. Natural Language Processing techniques exploit the machine’s ability to process and handle data in time-efficient ways.
In my doctoral research I demonstrate how Natural Language Processing techniques can be used in classrooms and in education research. Specifically, I began my research by systematically reviewing current studies describing the use of Natural Language Processing for education related contexts. I then used this understanding to inform the use of Natural Language Processing techniques in two Engineering Education specific contexts: the first in the classroom, to automatically classify students’ responses to open-ended questions to understand their metacognitive levels, and the second in research, to inform analysis of a large dataset comprising excerpts from interview transcripts of engineering students describing career preparedness.
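The tf-idf weighting mentioned in the abstract above can be illustrated with a short Python sketch. It uses one common textbook formulation (relative term frequency multiplied by the natural log of the inverse document frequency); the thesis does not say which variant or library it used, so the formula choice, the function names, and the sample student responses are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = occurrences of the term in the document / tokens in the document
    idf = log(number of documents / documents containing the term)
    """
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    weights = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical open-ended student responses.
responses = [
    "I checked my work and planned my next step",
    "I planned my study schedule",
    "I guessed the answer",
]
weights = tf_idf(responses)
```

A word that appears in every response, such as "I", receives weight zero, while words specific to one response, such as "guessed", receive the highest weights. This is what makes tf-idf useful for surfacing distinguishing terms in a set of responses.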
|
100 |
Natural Language Driven Image Edits using a Semantic Image Manipulation Language
Mohapatra, Akrit 04 June 2018
Language provides us with a powerful tool to articulate and express ourselves! Understanding and harnessing the expressions of natural language can open the doors to a vast array of creative applications. In this work we explore one such application - natural language based image editing. We propose a novel framework to go from free-form natural language commands to performing fine-grained image edits.
Recent progress in the field of deep learning has motivated solving most tasks using end-to-end deep convolutional frameworks. Such methods have been shown to be very successful, even achieving super-human performance in some cases. Although such progress has shown significant promise for the future, we believe there is still progress to be made before their effective application to a task like fine-grained image editing. We approach the problem by dissecting the inputs (image and language query) and focusing on understanding the language input utilizing traditional natural language processing (NLP) techniques. We start by parsing the input query to identify the entities, attributes and relationships and generate a command entity representation. We define our own high-level image manipulation language that serves as an intermediate programming language, translating natural language requests that represent a creative intent over an image into the lower-level operations needed to execute them. The semantic command entity representations are mapped into this high-level language to carry out the intended execution. / Master of Science / Image editing is a very challenging task that requires a specific skill set. Hence, going from natural language to directly performing image edits, thereby automating the entire procedure, is a challenging problem as well as a potential application that could benefit a wide range of users. There are multiple stages involved in such a process, starting with understanding the intent of a command provided in natural language, identifying the editing tasks represented by it and the different objects and properties of the image the command intends to act upon, and finally performing the intended edit(s).
There has been significant progress in the fields of natural language processing and computer vision in recent years. On the natural language front, computers are now able to accurately parse sentences, analyze large amounts of text, classify sentiments and emotions, and much more. Similarly, on the computer vision side, computers can accurately identify objects, localize them, and even generate lifelike images from random noise pixels.
In this work, we propose a novel framework that enables us to go from natural language commands to performing image edits. Our approach starts by parsing the language input, identifying the entities and relations in the image from the language followed by mapping it into a set of sequential executable commands in an intermediate programming language that we define to execute the edit.
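The mapping from a parsed command to an intermediate instruction can be sketched in Python as follows. The thesis does not publish its language in this abstract, so everything here is hypothetical: the `EditCommand` fields, the `ADJUST` operation name, the verb lexicon, and the single sentence pattern are stand-ins for the real parser and instruction set.

```python
import re
from dataclasses import dataclass


@dataclass
class EditCommand:
    """One instruction in a made-up intermediate image-editing language."""
    operation: str   # e.g. "ADJUST"
    attribute: str   # e.g. "brightness"
    target: str      # entity in the image, e.g. "sky"
    direction: int   # +1 to increase, -1 to decrease


# Hypothetical verb lexicon mapping surface verbs to a direction.
VERBS = {"increase": 1, "raise": 1, "boost": 1,
         "decrease": -1, "reduce": -1, "lower": -1}

# Matches only commands of the form "VERB the ATTRIBUTE of the TARGET".
PATTERN = re.compile(
    r"^(?P<verb>\w+) the (?P<attribute>\w+) of the (?P<target>[\w ]+?)\.?$"
)


def parse_command(text):
    """Map a matching command to an EditCommand, or return None if unmatched."""
    match = PATTERN.match(text.strip().lower())
    if match is None or match["verb"] not in VERBS:
        return None
    return EditCommand(
        "ADJUST", match["attribute"], match["target"], VERBS[match["verb"]]
    )
```

Under these assumptions, `parse_command("Increase the brightness of the sky.")` produces an `ADJUST` instruction on the `brightness` attribute of the `sky` entity. A later stage would still need to localize that entity in the image and apply the edit.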
|