About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
101

Rational Design Inspired Application of Natural Language Processing Algorithms to Red Shift mNeptune684

Parkinson, Scott 26 March 2021 (has links)
Recent innovations and progress in machine learning algorithms from the Natural Language Processing (NLP) community have motivated efforts to apply these models and concepts to proteins. The representations generated by trained NLP models have been shown to capture important semantic and structural understanding of proteins, encompassing biochemical and biophysical properties, among other key concepts. In turn, these representations have demonstrated application to protein engineering tasks including mutation analysis and design of novel proteins. Here we use this NLP paradigm in a protein engineering effort to further red-shift the emission wavelength of the red fluorescent protein mNeptune684 using only a small number of functional training variants (the 'Low-N' scenario). This thesis, carried out in collaboration with the Department of Chemistry and Biomolecular Sciences, explores using these tools and methods in the rational design process.
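A minimal sketch of the 'Low-N' workflow described above, under stated assumptions: fixed-length embeddings from a pretrained protein language model serve as features for a simple regressor fit on a handful of characterized variants, which then ranks candidate mutants by predicted emission wavelength. The embed function, sequences, and measurements below are hypothetical placeholders, not the thesis's actual model or data.

```python
import numpy as np
from sklearn.linear_model import Ridge

def embed(sequence: str) -> np.ndarray:
    """Hypothetical stand-in for a pretrained protein language model encoder
    (e.g. mean-pooled per-residue representations); replace with a real model."""
    rng = np.random.default_rng(abs(hash(sequence)) % (2**32))
    return rng.normal(size=128)  # fixed-length sequence representation

# 'Low-N' training set: a small number of characterized variants (placeholders)
train_variants = ["MVSKGEE...A", "MVSKGEE...S", "MVSKGEE...T"]
emission_nm = [684.0, 686.5, 689.0]  # hypothetical measured emission maxima

X = np.stack([embed(s) for s in train_variants])
model = Ridge(alpha=1.0).fit(X, emission_nm)

# Rank unseen candidate mutants, most red-shifted prediction first
candidates = ["MVSKGEE...G", "MVSKGEE...V"]
preds = model.predict(np.stack([embed(s) for s in candidates]))
for seq, nm in sorted(zip(candidates, preds), key=lambda p: -p[1]):
    print(f"{seq}  predicted emission ~ {nm:.1f} nm")
```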
102

Natural Language Processing and Extracting Information From Medical Reports

Pfeiffer II, Richard D. 29 June 2006 (has links)
Submitted to the Health Informatics Graduate Program Faculty, Indiana University, in partial fulfillment of the requirements for the degree of Master of Science in Health Informatics, May 2006. / The purpose of this study is to examine the current use of natural language processing for extracting meaningful data from free text in medical reports. Natural language processing has been used to process information from various genres. To evaluate its use, a synthesized review of primary research papers specific to natural language processing and extracting data from medical reports was conducted. A three-phase approach is used to describe the process of gathering the final metrics for validating the use of natural language processing. The main purpose of any NLP system is to extract or understand human language and to process it into meaning for a specified area of interest or end user. There are three types of approaches: symbolic, statistical, and connectionist. There are identified problems with natural language processing and the different approaches; the problems noted in the research are acquisition, coverage, robustness, and extensibility. Metrics were gathered from primary research papers to evaluate the success of the natural language processors. The recall average of the four papers was 85%. The precision average of five papers was 87.7%. The accuracy average was 97%. The sensitivity average was 84%, while specificity was 97.4%. Based on the results of the primary research, there was no definitive way to validate one NLP approach as an industry standard. From the research reviewed, it is clear that there has been at least limited success with information extraction from free text using natural language processing. It is important to understand the continuum of data, information, and knowledge in the previous and future research of natural language processing. In the industry of health informatics, this is a technology necessary for improving healthcare and research.
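For reference, the metrics cited above follow the standard confusion-matrix definitions. A small sketch with hypothetical counts (not drawn from the reviewed papers):

```python
# Hypothetical binary confusion-matrix counts, for illustration only
tp, fp, fn, tn = 85, 12, 15, 388  # true positives, false positives, false negatives, true negatives

recall = tp / (tp + fn)               # same quantity as sensitivity
precision = tp / (tp + fp)
accuracy = (tp + tn) / (tp + fp + fn + tn)
sensitivity = recall
specificity = tn / (tn + fp)

for name, value in [("recall", recall), ("precision", precision), ("accuracy", accuracy),
                    ("sensitivity", sensitivity), ("specificity", specificity)]:
    print(f"{name}: {value:.1%}")
```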
103

Rättsäker Textanalys

Svensson, Henrik, Lindqvist, Kalle January 2019 (has links)
Natural language processing is a research area in which new advances are constantly being made. A significant portion of the text analysis that takes place in this field has the aim of achieving a satisfactory application in the dialogue between human and computer. In this study, we instead want to focus on the impact natural language processing can have on the human learning process. At the same time, the context for our research bears on one of the most basic prerequisites for a legally secure society, namely the writing of the police report. By creating a theoretical foundation of ideas that combines important aspects of natural language processing and police report writing, and then implementing them in an educational web platform intended for police students, we are of the opinion that our research adds something new to the computer science and social science fields. The purpose of this work is to act as the first steps towards a web application that supports Swedish police documentation.
104

Deductive, Inductive and Abductive Reasoning over Natural Language Text: A Case Study with Adaptations, Behaviors and Variations in Organisms

January 2019 (has links)
Question answering is a challenging problem and a long-term goal of Artificial Intelligence. There are many approaches proposed to solve this problem, including end-to-end machine learning systems, Information Retrieval based approaches, and Textual Entailment. Despite being popular, these methods find difficulty in solving problems that require multi-level reasoning and combining independent pieces of knowledge. For example, a question like "What adaptation is necessary in intertidal ecosystems but not in reef ecosystems?" requires the system to consider qualities, behaviors, or features of an organism living in an intertidal ecosystem and compare them with those of an organism in a reef ecosystem to find the answer. The proposed solution targets a genre of questions based on "Adaptation, Variation and Behavior in Organisms", where answering requires combining several independent sets of knowledge with reasoning. The method is implemented using Answer Set Programming and Natural Language Inference (which is based on machine learning) to find which of the given options is most likely to be the answer by matching it against the knowledge base. To evaluate this approach, a dataset of questions and a knowledge base in the domain of "Adaptation, Variation and Behavior in Organisms" are created. / Dissertation/Thesis / Masters Thesis Computer Science 2019
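A toy sketch of the option-selection step described above: each candidate answer is scored against sentences in a small knowledge base and the best-supported option is returned. The overlap score here merely stands in for a trained Natural Language Inference model, and the knowledge sentences and options are hypothetical; the thesis's actual pipeline additionally uses Answer Set Programming for the reasoning.

```python
import re

# Hypothetical knowledge base for the intertidal-vs-reef example question
knowledge = [
    "Organisms in intertidal ecosystems must tolerate desiccation during periodic exposure to air.",
    "Organisms in reef ecosystems remain submerged and rarely face exposure to air.",
]
options = ["resistance to desiccation", "ability to photosynthesize"]

def words(text: str) -> set:
    return set(re.findall(r"\w+", text.lower()))

def support(premise: str, hypothesis: str) -> float:
    """Placeholder entailment score: fraction of hypothesis words found in the premise."""
    h = words(hypothesis)
    return len(words(premise) & h) / max(len(h), 1)

def best_option(options, knowledge):
    # Score each option by its best-supporting knowledge sentence
    return max(options, key=lambda opt: max(support(k, opt) for k in knowledge))

print(best_option(options, knowledge))  # -> "resistance to desiccation"
```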
105

Developing a dynamic recommendation system for personalizing educational content within an E-learning network

Mirzaeibonehkhater, Marzieh January 2018 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / This research proposes a dynamic recommendation system for a social learning environment entitled CourseNetworking (CN). The CN gives users an opportunity to satisfy their academic requirements by receiving the most relevant and up-to-date content. In our research, we extracted implicit and explicit features from the system, namely the most relevant user and post features. The selected features are used to build a rating scale between users and posts that represents the link between a user and a post in this learning management system (LMS). We developed an algorithm that measures the link between each user and each post. To achieve our goal in the system design, we applied natural language processing (NLP) techniques for text analysis and various classification techniques for feature selection. We believe that treating the content of posts in learning environments as an impactful feature greatly affects the performance of our system. Our experimental results demonstrated that our recommender system predicts the most informative and relevant posts for users. Our system design addresses the sparsity and cold-start problems, which are two of the main challenges in recommender systems.
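A minimal sketch of content-based link scoring in the spirit described above: a user profile is built from the text of posts the user has engaged with, and candidate posts are ranked by TF-IDF cosine similarity to that profile. The posts and engagement history are hypothetical, and the thesis's actual feature set and rating scale go beyond this.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical engagement history and candidate posts
user_history = ["lecture notes on dynamic programming",
                "assignment discussion about recursion and memoization"]
candidate_posts = ["announcement: campus parking changes",
                   "worked examples of memoization in dynamic programming",
                   "study group for graph algorithms"]

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(user_history + candidate_posts)

# User profile: mean TF-IDF vector of the posts the user engaged with
profile = np.asarray(matrix[:len(user_history)].mean(axis=0))
scores = cosine_similarity(profile, matrix[len(user_history):])[0]

for post, score in sorted(zip(candidate_posts, scores), key=lambda p: -p[1]):
    print(f"{score:.3f}  {post}")
```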
106

Tracking and Characterizing Natural Language Semantic Dynamics of Conversations in Real-Time

Alsayed, Omar 24 May 2022 (has links)
No description available.
107

Formalizing Contract Refinements Using a Controlled Natural Language

Meloche, Regan 30 November 2023 (has links)
The formalization of natural language contracts can make the prescriptions found in these contracts more precise, promoting the development of smart contracts: digitized forms of the documents in which monitoring and execution can be partially automated. Full formalization remains a difficult problem, and this thesis takes steps towards solving this challenge by focusing on a narrow sub-problem of formalizing contract refinements. We want to allow a contract author to customize a contract template and automatically convert the resulting contract to a formal specification language called Symboleo, created specifically for the legal contract domain. The hope is that research towards partial formalization can be useful on its own, as well as a step towards the full formalization of contracts. The main questions addressed by this thesis concern what linguistic forms these refinements take. Answering these questions involves both linguistic analysis and empirical analysis of a set of real contracts to construct a controlled natural language (CNL). This language is expressive and natural enough to be adopted by contract authors, and precise enough that it can reliably be converted into the proper formal specification. We also design a tool, SymboleoNLP, that demonstrates this functionality on realistic contracts. This involves ensuring that the contract author can input contract refinements that adhere to our CNL, and that the refinements are properly formalized with Symboleo. In addition to contributing an evidence-based CNL for contract refinements, this thesis also outlines a clear methodology for constructing the CNL, which may need to go through iterations as requirements change and as the Symboleo language evolves. The SymboleoNLP tool is another contribution and is designed for iterative improvement. We explore a number of potential areas where further NLP techniques may be integrated to improve performance, and the tool is designed for easy integration of these modules to adapt to emerging technologies and changing requirements.
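As a rough illustration of the idea of a controlled natural language for contract refinements, the sketch below matches one hypothetical CNL pattern and extracts a structured obligation. The pattern, the example sentence, and the output record are assumptions for illustration; they are not the thesis's actual CNL grammar or Symboleo's syntax.

```python
import re
from typing import Optional

# One hypothetical CNL pattern for a payment-style obligation refinement
PATTERN = re.compile(
    r"^The (?P<debtor>\w+) shall (?P<action>[\w ]+?) the (?P<creditor>\w+)"
    r" within (?P<days>\d+) days\.$"
)

def parse_refinement(sentence: str) -> Optional[dict]:
    """Return a structured obligation if the sentence is inside the controlled language."""
    m = PATTERN.match(sentence)
    if not m:
        return None  # outside the CNL: reject or ask the author to rephrase
    return {
        "type": "obligation",
        "debtor": m["debtor"],
        "creditor": m["creditor"],
        "action": m["action"],
        "deadline_days": int(m["days"]),
    }

print(parse_refinement("The buyer shall pay the seller within 30 days."))
```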
108

‘How can one evaluate a conversational software agent framework?’

Panesar, Kulvinder 07 October 2020 (has links)
This paper presents a critical evaluation framework for a linguistically orientated conversational software agent (CSA) (Panesar, 2017). The CSA prototype investigates the integration, intersection, and interface of language, knowledge, and speech act constructions (SAC) based on a grammatical object (Nolan, 2014), the sub-model of beliefs, desires and intentions (BDI) (Rao and Georgeff, 1995), and dialogue management (DM) for natural language processing (NLP). A long-standing issue within NLP CSA systems is refining the accuracy of interpretation to provide realistic dialogue that supports human-to-computer communication. This prototype comprises three phase models: (1) a linguistic model based on a functional linguistic theory, Role and Reference Grammar (RRG) (Van Valin Jr, 2005); (2) an Agent Cognitive Model with two inner models: (a) a knowledge representation model employing conceptual graphs serialised to the Resource Description Framework (RDF); (b) a planning model underpinned by BDI concepts (Wooldridge, 2013), intentionality (Searle, 1983) and rational interaction (Cohen and Levesque, 1990); and (3) a dialogue model employing common ground (Stalnaker, 2002). The evaluation approach for this Java-based prototype and its phase models is multi-faceted, driven by grammatical testing (English language utterances), software engineering, and agent practice. A set of evaluation criteria is grouped per phase model, and the testing framework aims to test the interface, intersection, and integration of all phase models and their inner models. This multi-approach encompasses checking performance at internal processing stages per model, post-implementation assessments of the goals of RRG, and RRG-specific tests. The empirical evaluations demonstrate that the CSA is a proof of concept, demonstrating RRG's fitness for purpose in describing and explaining phenomena, language processing and knowledge, and its computational adequacy. In contrast, the evaluations identify the complexity of lower-level computational mappings from natural language to the agent ontology, with semantic gaps that are further addressed by a lexical bridging consideration (Panesar, 2017).
109

Contextualizing antimicrobial resistance determinants using deep-learning language models

Edalatmand, Arman 11 1900 (has links)
Bacterial outbreak publications outline the key factors involved in the uncontrolled spread of infection. Such factors include the environments, pathogens, hosts, and antimicrobial resistance (AMR) genes involved. Individually, each paper published in this area gives a glimpse into the devastating impact drug-resistant infections have on healthcare, agriculture, and livestock. When examined together, these papers reveal a story across time, from the discovery of new resistance genes to their dissemination to different pathogens, hosts, and environments. My work aims to extract this information from publications using the biomedical deep-learning language model BioBERT. BioBERT is pre-trained on all abstracts found in PubMed and has state-of-the-art performance on language tasks over biomedical literature. I trained BioBERT on two tasks: entity recognition, to identify AMR-relevant terms (i.e., AMR genes, taxonomy, environments, geographical locations, etc.), and relation extraction, to determine which terms identified through entity recognition contextualize AMR genes. Datasets were generated semi-automatically to train BioBERT for these tasks. My work currently collates results from 204,094 antimicrobial resistance publications worldwide and generates interpretable results about the sources where genes are commonly found. Overall, my work takes a large-scale approach to collecting antimicrobial resistance data from a commonly overlooked resource: the systematic examination of the large body of AMR literature. / Thesis / Master of Science (MSc)
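A brief sketch of the entity-recognition setup described above, using a BioBERT checkpoint with the Hugging Face transformers library. The model name, the example sentence, and the idea of running the pipeline directly are assumptions for illustration; in the thesis the model is fine-tuned on semi-automatically generated AMR datasets before it produces meaningful entity labels.

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

model_name = "dmis-lab/biobert-base-cased-v1.1"  # assumed BioBERT checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)

# The token-classification head is untrained until fine-tuned on labelled
# AMR entities (genes, taxa, environments, locations, ...)
model = AutoModelForTokenClassification.from_pretrained(model_name)

ner = pipeline("token-classification", model=model, tokenizer=tokenizer,
               aggregation_strategy="simple")

text = "blaNDM-1 was detected in Klebsiella pneumoniae isolates from hospital wastewater."
for entity in ner(text):
    print(entity["word"], entity["entity_group"], round(float(entity["score"]), 3))
```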
110

Language Identification on Short Textual Data

Cui, Yexin January 2020 (has links)
Language identification is the task of automatically detecting the language(s) a given text or document is written in, and it is also the very first step of further natural language processing tasks. This task has been well studied over the past decades; however, most work has focused on long texts rather than short ones, which have proved more challenging due to insufficient syntactic and semantic information. In this work, we present approaches to this problem based on deep learning techniques, traditional methods, and their combination. The proposed ensemble model, composed of a learning-based method and a dictionary-based method, achieves 89.6% accuracy on our newly generated gold test set, surpassing the Google Translate API by 3.7% and the industry-leading tool Langid.py by 26.1%. / Thesis / Master of Applied Science (MASc)
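A minimal sketch of an ensemble in the spirit described above: a learning-based classifier combined with a dictionary-based vote, preferring the dictionary on very short inputs where the model is least reliable. Here langid merely stands in for a trained model and the tiny word lists are hypothetical; the thesis's actual components and combination rule differ.

```python
import langid  # pip install langid

# Hypothetical word lists for the dictionary-based component
DICTIONARIES = {
    "en": {"the", "and", "is", "hello", "thanks"},
    "fr": {"le", "et", "est", "bonjour", "merci"},
    "sv": {"och", "är", "hej", "tack", "det"},
}

def dictionary_vote(text):
    words = set(text.lower().split())
    scores = {lang: len(words & vocab) for lang, vocab in DICTIONARIES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

def identify(text):
    model_lang, _ = langid.classify(text)  # learning-based component
    dict_lang = dictionary_vote(text)      # dictionary-based component
    # Prefer the dictionary vote on very short inputs
    if dict_lang is not None and len(text.split()) <= 3:
        return dict_lang
    return model_lang

print(identify("merci bonjour"))
print(identify("This is a somewhat longer English sentence."))
```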
