1. Forced Attention for Image Captioning
Hemanth Devarapalli (5930603), 17 January 2019
Automatic generation of captions for a given image is an active research area in Artificial Intelligence. Architectures have evolved from classical machine learning applied to image metadata to neural networks. Two different styles of architecture have evolved in the neural network space for image captioning: the Encoder-Attention-Decoder architecture and the Transformer architecture. This study attempts to modify the attention mechanism to allow any object to be specified. An archetypical Encoder-Attention-Decoder architecture (Show, Attend, and Tell (Xu et al., 2015)) is employed as a baseline, and a modification of the Show, Attend, and Tell architecture is proposed. Both architectures are evaluated on the MSCOCO (Lin et al., 2014) dataset using seven metrics: BLEU-1, -2, -3, and -4 (Papineni, Roukos, Ward & Zhu, 2002), METEOR (Banerjee & Lavie, 2005), ROUGE-L (Lin, 2004), and CIDEr (Vedantam, Lawrence & Parikh, 2015). Finally, the statistical significance of the results is evaluated by performing paired t-tests.
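Since the study's core idea is constraining the decoder's attention to a specified object, here is a minimal sketch of one way to force the soft attention of a Show, Attend, and Tell style decoder onto chosen image regions. The class name `ForcedAttention` and the `forced_mask` masking scheme are illustrative assumptions, not the thesis's exact formulation.

```python
import torch
import torch.nn as nn

class ForcedAttention(nn.Module):
    """Soft attention over CNN regions that can be restricted to a chosen object."""
    def __init__(self, encoder_dim, decoder_dim, attention_dim):
        super().__init__()
        self.encoder_att = nn.Linear(encoder_dim, attention_dim)  # project image features
        self.decoder_att = nn.Linear(decoder_dim, attention_dim)  # project LSTM hidden state
        self.full_att = nn.Linear(attention_dim, 1)               # scalar score per region

    def forward(self, encoder_out, decoder_hidden, forced_mask=None):
        # encoder_out: (batch, num_regions, encoder_dim); decoder_hidden: (batch, decoder_dim)
        scores = self.full_att(torch.tanh(
            self.encoder_att(encoder_out)
            + self.decoder_att(decoder_hidden).unsqueeze(1)
        )).squeeze(2)                                             # (batch, num_regions)
        if forced_mask is not None:
            # forced_mask: (batch, num_regions), 1 where the specified object appears
            # (assumed to contain at least one 1 per row); attention elsewhere is suppressed
            scores = scores.masked_fill(forced_mask == 0, float("-inf"))
        alpha = torch.softmax(scores, dim=1)                      # attention weights
        context = (encoder_out * alpha.unsqueeze(2)).sum(dim=1)   # weighted context vector
        return context, alpha
```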
2. MULTI-ATTRIBUTE AND TEMPORAL ANALYSIS OF PRODUCT REVIEWS USING TOPIC MODELLING AND SENTIMENT ANALYSIS
Meet Tusharbhai Suthar (14232623), 08 December 2022
Online reviews, along with photographs and one-to-five-star ratings, are frequently used to determine a product's quality before purchase. This research addressed two distinct problems observed in review systems.

First, because a product can have thousands of reviews, the characteristics of customer evaluations, such as consumer sentiment, cannot be understood by manually reading only a few of them. Second, it is extremely hard to understand from these reviews how sentiments and other important product aspects change over the years (temporal analysis). To address these problems, the study focused on two main research parts.

Part one of the research answered how topic modelling and sentiment analysis can work together to provide a deeper understanding of attribute-based product reviews. The second part compared different topic modelling approaches to evaluate the performance and advantages of emerging NLP models. For this purpose, a dataset consisting of 469 publicly accessible Amazon reviews of the Kindle E-reader and 15,000 reviews of iPhone products was used for sentiment analysis and topic modelling. The Latent Dirichlet Allocation (LDA) and BERTopic topic models were used to acquire the diverse topics of concern, and sentiment analysis was carried out to better understand each topic's positive and negative tones. Topic analysis of Kindle user reviews revealed the following major themes: (a) leisure consumption, (b) utility as a gift, (c) pricing, (d) parental control, (e) reliability and durability, and (f) charging. While the main themes that emerged from the analysis of iPhone reviews depended on the model and year of the device, some themes were consistent across all iPhone models, including (a) Apple vs Android, (b) utility as a gift, and (c) service. The study's approach can be used to analyze customer reviews for any product, and the results provide a deeper understanding of a product's strengths and weaknesses based on a comprehensive analysis of user feedback, which is useful for product makers, retailers, e-commerce platforms, and consumers.
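A minimal sketch of the attribute-level pipeline described above: LDA assigns each review a dominant topic, and a sentiment score is computed per review so it can be aggregated per topic. The toy reviews and the choice of VADER for sentiment are illustrative assumptions standing in for the study's actual data and tooling.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from nltk.sentiment import SentimentIntensityAnalyzer  # requires nltk.download('vader_lexicon')

reviews = [
    "The battery life is great and it charges quickly.",
    "Bought it as a gift for my mom, she loves reading on it.",
    "Way too expensive for what it offers.",
]

# Topic modelling with LDA on a bag-of-words representation
vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(reviews)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(doc_term)           # (n_docs, n_topics)

# Sentiment per review, grouped by its dominant topic
sia = SentimentIntensityAnalyzer()
for text, topic_dist in zip(reviews, doc_topics):
    topic = topic_dist.argmax()
    score = sia.polarity_scores(text)["compound"]  # -1 (negative) to +1 (positive)
    print(f"topic {topic}: sentiment {score:+.2f}  {text}")
```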
3. PROMPT-ASSISTED RELATION FUSION IN KNOWLEDGE GRAPH ACQUISITION
Xiaonan Jing (14230196), 08 December 2022
Knowledge Base (KB) systems have been studied for decades, and various approaches have been explored for acquiring accurate and scalable KBs. Recently, many studies have focused on Knowledge Graphs (KGs), which use a simple triple representation. A triple consists of a head entity, a predicate, and a tail entity; the head and tail entities are connected by the predicate, which indicates a certain relation between them. Three main research fields can be identified in KG acquisition. First, relation extraction aims at extracting triples from raw data. Second, entity linking addresses mapping mentions of the same entity together. Last, knowledge fusion integrates heterogeneous sources into one. This dissertation focuses on relation fusion, a sub-process of knowledge fusion. More specifically, it investigates whether the currently popular prompt-based learning method can assist with relation fusion. A framework for acquiring a KG from a real-world dataset is proposed. The framework contains a Preprocessing module, which annotates raw sentences and links known entities to the triples; a Prompting module, which generates and processes prompts for prediction with Pretrained Language Models (PLMs); and a Relation Fusion module, which creates predicate representations, clusters embeddings, and derives cluster labels. A series of experiments with comparison prompting groups was conducted. The results indicate that prompt-based learning, if applied appropriately, can help with grouping similar predicates. The framework proposed in this dissertation can be used effectively to assist human experts with the creation of relation types during knowledge acquisition.
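A minimal sketch of cloze-style prompting with a pretrained masked language model, in the spirit of the Prompting module described above. The prompt template and the example triple are illustrative assumptions; in the dissertation the PLM outputs feed into predicate representations that are then clustered by the Relation Fusion module.

```python
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Triple with an unknown predicate: (head="Paris", ?, tail="France")
prompt = "Paris is the [MASK] of France."
for candidate in fill_mask(prompt, top_k=3):
    # token_str is the predicted filler for the predicate slot; score is model confidence
    print(f"{candidate['token_str']:>10}  {candidate['score']:.3f}")
```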
4. USING ARTIFICIAL INTELLIGENCE TO PROVIDE DIFFERENTIATED FEEDBACK AND INSTRUCTION IN INTRODUCTORY PHYSICS
Jeremy M Munsell (12468648), 27 April 2022
Cognitive load theory (CLT) lays out a tripartite scheme for how learners cognitively interact with instructional materials during learning and problem solving. Cognitive load refers to the utilization of working memory resources, and CLT designates three types: intrinsic, extraneous, and germane cognitive load. Intrinsic cognitive load is related to the intrinsic complexity of the material. Extraneous cognitive load arises from unnecessary utilization of cognitive resources due to suboptimal instructional design. Germane cognitive load results from processing the intrinsic load and from schema acquisition. The expertise reversal effect follows as a consequence of CLT.
The expertise reversal effect (ERE) states that instructional materials that are beneficial to low prior knowledge (LPK) learners may be detrimental to high prior knowledge (HPK) learners. Less-guided materials have been shown to reduce extraneous cognitive load for HPK learners and therefore produce a greater benefit.
In this work we present the development of online instructional modules that deliver content in two distinct styles, differentiated by their use of guiding features. The high-level guidance (HLG) version uses guiding features, such as animations and voice narration, which have been shown to benefit LPK learners but to be detrimental to the learning of HPK students. The low-level guidance (LLG) version uses text in place of voice narration and pop-up content in place of continuous animations. Both versions led to a statistically significant improvement from pre-test to post-test. However, both HPK and LPK students showed a preference for the HLG version of the module, contrary to the ERE. Future work will focus on improving the ability to identify HPK and LPK students and refining methods for providing optimal instructional materials for these cohorts.
Meanwhile, the use of machine learning is an emerging trend in education; it has been used in roles such as automatic scoring of essays in scientific argumentation tasks and providing feedback to students in real time. In this work we report results from two projects using machine learning in education. In the first, we used machine learning to predict a student's correctness on a physics problem given an essay outlining their approach to solving it. Our overall accuracy in predicting problem correctness from a student's strategy essay was 80%, and we detected students whose approach would lead to an incorrect solution at a rate of 87%. However, deploying this model to provide real-time feedback would necessitate performance improvements. Planned future work on this problem includes hand-grading essays to produce a label that reflects the scientific merit of each essay, using more sophisticated models (like Google's BERT), and generalizing to a larger set of problems.
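A minimal sketch of the text-classification task described above: predicting problem correctness from a strategy essay. The TF-IDF plus logistic regression baseline and the toy essays are assumptions for illustration; the study's actual model is not specified in this abstract.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

essays = [
    "I will use conservation of energy to relate speed and height.",
    "I plan to add the forces directly without drawing a free-body diagram.",
]
correct = [1, 0]  # 1 = approach led to a correct solution

# Bag-of-ngrams features feeding a linear classifier
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(essays, correct)
print(model.predict(["I will draw a free-body diagram and apply Newton's second law."]))
```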
In the second study, we used data about students' prior academic behavior to predict academic risk in a first-year algebra-based physics course. The final course grade was used to define the risk category: B- and above was designated low risk, and C+ and below was designated high risk. Using a mix of numerical and categorical features, such as high school GPA, ACT/SAT scores, gender, and ethnicity, we were able to predict student academic risk with 75% overall accuracy. Students with a very high grade (A) or a very low grade (D, F, W) were identified at rates of 92% and 88%, respectively.
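A minimal sketch of the academic-risk classifier described above, mixing numerical and categorical features. The column names, toy data, and choice of a random forest are illustrative assumptions; the abstract does not name the study's model.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

data = pd.DataFrame({
    "hs_gpa": [3.9, 2.4, 3.1],
    "sat": [1450, 1050, 1200],
    "gender": ["F", "M", "F"],
})
high_risk = [0, 1, 0]  # 1 = final grade of C+ or below

# Numerical columns pass through; categorical columns are one-hot encoded
prep = ColumnTransformer([
    ("num", "passthrough", ["hs_gpa", "sat"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["gender"]),
])
model = make_pipeline(prep, RandomForestClassifier(random_state=0))
model.fit(data, high_risk)
print(model.predict(data))
```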
Prior work has shown that performance can be greatly increased by including in-class features in the model. Future work will focus on obtaining raw data, rather than the curved scores reported to the university registrar, and on obtaining more batches of data to improve the predictive power of the models developed in this study.
5. MULTILINGUAL CYBERBULLYING DETECTION SYSTEM
Rohit Sidram Pawar (6613247), 11 June 2019
As the use of social media has grown, so has the ability of its users to bully others. One prevalent form is cyberbullying, which occurs on social media sites such as Facebook©, WhatsApp©, and Twitter©. The past decade has witnessed a growth in cyberbullying, a form of bullying that occurs virtually through electronic devices, such as messaging, e-mail, online gaming, social media, or images or mail sent to a mobile phone. This bullying is not limited to the English language and occurs in other languages as well; hence, it is of the utmost importance to detect cyberbullying in multiple languages. Since current approaches to identifying cyberbullying focus mostly on English-language texts, this thesis proposes a new approach (called the Multilingual Cyberbullying Detection System) for the detection of cyberbullying in multiple languages (English, Hindi, and Marathi). It uses two techniques, namely machine-learning-based and lexicon-based, to classify the input data as bullying or non-bullying. The aim of this research is not only to detect cyberbullying but also to provide a distributed infrastructure for detecting it. We developed multiple prototypes (standalone, collaborative, and cloud-based) and carried out experiments with them to detect cyberbullying on datasets in multiple languages. The outcomes of our experiments show that the machine-learning model outperforms the lexicon-based model in all the languages. In addition, the results show that collaboration techniques can help improve the accuracy of a poorly performing node in the system. Finally, we show that the cloud-based configurations performed better than the local configurations.
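A minimal sketch of the lexicon-based technique mentioned above: a text is flagged as bullying if it contains words from a per-language abuse lexicon. The lexicon entries here are mild placeholders; real lexicons for English, Hindi, and Marathi would be curated separately.

```python
# Per-language abuse lexicons (placeholder entries; "hi" uses transliteration)
LEXICONS = {
    "en": {"stupid", "loser", "idiot"},
    "hi": {"bewakoof"},
}

def is_bullying(text: str, lang: str = "en") -> bool:
    """Flag a text as bullying if any token matches the language's lexicon."""
    tokens = {tok.strip(".,!?").lower() for tok in text.split()}
    return bool(tokens & LEXICONS.get(lang, set()))

print(is_bullying("You are such a loser!"))  # True
print(is_bullying("Have a great day!"))      # False
```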
6. Biomedical Concept Association and Clustering Using Word Embeddings
Setu Shah (5931128), 12 February 2019
Biomedical data exists in the form of journal articles, research studies, electronic health records, care guidelines, etc. While text mining and natural language processing tools have been widely employed across various domains, they are just taking off in the healthcare space.

A primary hurdle in building artificial intelligence models that use biomedical data is the limited amount of labelled data available. Since most models rely on supervised or semi-supervised methods, generating large amounts of pre-processed labelled data for training becomes extremely costly. Even for datasets that are labelled, the lack of normalization of biomedical concepts further affects the quality of the results and limits their applicability to a restricted dataset. This hurts the reproducibility of results and techniques across datasets, making it difficult to deploy research solutions that improve healthcare services.

The research presented in this thesis focuses on reducing the need to create labels for biomedical text mining by using unsupervised recurrent neural networks. The proposed method uses word embeddings to generate vector representations of biomedical concepts based on semantics and context. Experiments with unsupervised clustering of these biomedical concepts show that concepts similar to each other are clustered together. While this clustering captures different synonyms of the same concept, it also captures the similarities between various diseases and the symptoms associated with them.

To test the performance of the concept vectors on corpora of documents, a document vector generation method that uses these concept vectors is also proposed. The document vectors thus generated are used as input to clustering algorithms, and the results show that across multiple corpora, the proposed methods of concept and document vector generation outperform the baselines and provide more meaningful clustering. The applications of this document clustering are significant, especially in search and retrieval, providing clinicians, researchers, and patients with more holistic and comprehensive results than relying on the exact terms they search for.

Finally, a framework for extracting clinical information that can be mapped to electronic health records from preventive care guidelines is presented. The extracted information can be integrated with the clinical decision support system of an electronic health record. A visualization tool to better understand and observe patient trajectories is also explored. Both methods have the potential to improve the preventive care services provided to patients.
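A minimal sketch of the overall approach: learn concept embeddings from co-occurrence context, average them into document vectors, and cluster the documents. The tiny corpus and the use of word2vec and k-means here are illustrative assumptions standing in for the thesis's recurrent-network embeddings and its clustering pipeline.

```python
import numpy as np
from gensim.models import Word2Vec
from sklearn.cluster import KMeans

# Each document is a sequence of biomedical concepts (toy examples)
docs = [
    ["diabetes", "insulin", "glucose"],
    ["hyperglycemia", "glucose", "insulin"],
    ["fracture", "xray", "cast"],
]

# Concept vectors from co-occurrence context
emb = Word2Vec(docs, vector_size=16, window=2, min_count=1, seed=0)

# Document vector = mean of its concept vectors
doc_vecs = np.array([np.mean([emb.wv[c] for c in d], axis=0) for d in docs])
labels = KMeans(n_clusters=2, random_state=0, n_init=10).fit_predict(doc_vecs)
print(labels)  # documents about related concepts should share a cluster label
```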
7. Training Methodologies for Energy-Efficient, Low Latency Spiking Neural Networks
Nitin Rathi (11849999), 17 December 2021
Deep learning models have become the de facto solution in fields like computer vision, natural language processing, robotics, and drug discovery, among many others. The skyrocketing performance and success of multi-layer neural networks comes at a significant power and energy cost, so there is a need to rethink the current trajectory and explore different computing frameworks. One such option is spiking neural networks (SNNs), which are inspired by the spike-based processing observed in biological brains. SNNs operate with binary signals (or spikes), where a spike is a delta function with magnitude 1, and can potentially be an energy-efficient alternative to the power-hungry analog neural networks (ANNs) that operate on real-valued analog signals. The binary all-or-nothing spike-based communication in SNNs, implemented on event-driven hardware, offers a low-power alternative to ANNs. With all its appeal for low power, however, training SNNs efficiently for high accuracy remains an active area of research. Existing ANN training methodologies, when applied to SNNs, result in networks with very high latency, and supervised training of SNNs with spikes is challenging (due to discontinuous gradients) and resource-intensive in time, compute, and memory. Thus, we propose compression methods, training methodologies, and learning rules for SNNs.

First, we propose compression techniques for SNNs based on the unsupervised spike-timing-dependent plasticity (STDP) model. We present a sparse SNN topology where non-critical connections are pruned to reduce the network size and the remaining critical synapses are weight-quantized to accommodate the limited conductance levels in emerging in-memory computing hardware. Pruning is based on the power-law weight-dependent STDP model: synapses between pre- and post-neurons with high spike correlation are retained, whereas synapses with low correlation or uncorrelated spiking activity are pruned. The process of pruning non-critical connections and quantizing the weights of critical synapses is performed at regular intervals during training.

Second, we propose a multimodal SNN that combines two modalities (image and audio). The two unimodal ensembles are connected with cross-modal connections, and the entire network is trained with unsupervised learning. The network receives inputs in both modalities for the same class and predicts the class label. The excitatory connections in the unimodal ensembles and the cross-modal connections are trained with STDP. The cross-modal connections capture the correlation between neurons of different modalities. The multimodal network learns features of both modalities and improves classification accuracy compared to a unimodal topology, even when one of the modalities is distorted by noise. The cross-modal connections are only excitatory and do not inhibit the normal activity of the unimodal ensembles.

Third, we explore supervised learning methods for SNNs. Many works have shown that an SNN for inference can be formed by copying the weights from a trained ANN and setting the firing threshold for each layer to the maximum input received in that layer. These converted SNNs require a large number of time steps to achieve competitive accuracy, which diminishes the energy savings. The number of time steps can be reduced by training SNNs with spike-based backpropagation from scratch, but that is computationally expensive and slow. To address these challenges, we present a computationally efficient training technique for deep SNNs. We propose a hybrid training methodology: 1) take a converted SNN and use its weights and thresholds as an initialization step for spike-based backpropagation, and 2) perform incremental spike-timing-dependent backpropagation (STDB) on this carefully initialized network to obtain an SNN that converges within a few epochs and requires fewer time steps for input processing. STDB is performed with a novel surrogate gradient function defined using the neuron's spike time: the weight update is proportional to the difference between the current time step and the most recent time step at which the neuron generated an output spike.

Fourth, we present techniques to further reduce inference latency in SNNs. SNNs suffer from high inference latency resulting from inefficient input encoding and sub-optimal settings of the neuron parameters (firing threshold and membrane leak). We propose DIET-SNN, a low-latency deep spiking network that is trained with gradient descent to optimize the membrane leak and the firing threshold along with the other network parameters (weights). The membrane leak and threshold for each layer of the SNN are optimized with end-to-end backpropagation to achieve competitive accuracy at reduced latency. The analog pixel values of an image are applied directly to the input layer of DIET-SNN without conversion to a spike train; the first convolutional layer is trained to convert inputs into spikes, where leaky-integrate-and-fire (LIF) neurons integrate the weighted inputs and generate an output spike when the membrane potential crosses the trained firing threshold. The trained membrane leak controls the flow of input information and attenuates irrelevant inputs, increasing the activation sparsity in the convolutional and dense layers of the network. The reduced latency combined with high activation sparsity provides large improvements in computational efficiency.

Finally, we explore the application of SNNs to sequential learning tasks. We propose LITE-SNN, a lightweight SNN suitable for sequential learning tasks on data from dynamic vision sensors (DVS) and natural language processing (NLP). In general, sequential data is processed with complex recurrent neural networks (such as long short-term memory (LSTM) and gated recurrent unit (GRU) networks) that use explicit feedback connections and internal states to handle long-term dependencies. By contrast, the neuron models in SNNs, integrate-and-fire (IF) or leaky-integrate-and-fire (LIF), have implicit feedback in their internal state (the membrane potential) by design, which can be leveraged for sequential tasks. The membrane potential in the IF/LIF neuron integrates the incoming current and outputs an event (or spike) when the potential crosses a threshold value. Since SNNs compute with highly sparse spike-based spatio-temporal data, the energy per inference is lower than for LSTMs/GRUs. SNNs also have fewer parameters than LSTMs/GRUs, resulting in smaller models and faster inference. We observe the problem of vanishing gradients in vanilla SNNs for longer sequences and implement a convolutional SNN with attention layers to perform sequence-to-sequence learning tasks. The inherent recurrence in SNNs, in addition to the fully parallelized convolutional operations, provides an additional mechanism to model sequential dependencies and leads to better accuracy than convolutional neural networks with ReLU activations.
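Two ingredients recur throughout this abstract: the leaky-integrate-and-fire (LIF) neuron and surrogate-gradient backpropagation through its discontinuous spike. A minimal sketch of both follows; the piecewise-linear surrogate used here is an illustrative stand-in, whereas the thesis's STDB rule derives its surrogate from the neuron's spike time.

```python
import torch

class SpikeFn(torch.autograd.Function):
    """Heaviside spike with a piecewise-linear surrogate gradient."""
    @staticmethod
    def forward(ctx, v_minus_thresh):
        ctx.save_for_backward(v_minus_thresh)
        return (v_minus_thresh > 0).float()        # all-or-nothing spike

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Pass gradient only near the threshold crossing (surrogate derivative)
        return grad_output * (x.abs() < 0.5).float()

def lif_step(inp, v, leak=0.95, thresh=1.0):
    """One time step: leak the membrane, integrate input, fire, soft-reset."""
    v = leak * v + inp
    spike = SpikeFn.apply(v - thresh)
    v = v - spike * thresh                         # soft reset: subtract threshold
    return spike, v

# Unroll over T time steps and backpropagate through the surrogate
T, batch, n = 8, 2, 4
inp = torch.randn(T, batch, n, requires_grad=True)
v = torch.zeros(batch, n)
spike_count = torch.zeros(batch, n)
for t in range(T):
    spike, v = lif_step(inp[t], v)
    spike_count = spike_count + spike
spike_count.sum().backward()                       # gradients flow via SpikeFn.backward
print(inp.grad.shape)                              # torch.Size([8, 2, 4])
```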