Global ETD Search

21	Efficient Matrix-aware Relational Query Processing in Big Data Systems Yongyang Yu (5930462) 03 January 2019 In the big data era, large-scale machine learning methods are becoming ubiquitous in data exploration tasks ranging from business intelligence and bioinformatics to self-driving cars. In these domains, many queries are composed of various kinds of operators, such as relational operators for preprocessing input data and machine learning models for complex analysis. These learning methods usually rely heavily on matrix computations. As a result, it is imperative to develop novel query processing approaches and systems that are aware of big matrix data and the corresponding operators, scale to clusters of hundreds of machines, and leverage distributed memory for high-performance computation. This dissertation introduces and studies several matrix-aware relational query processing strategies, and analyzes and optimizes their performance. The first contribution of this dissertation is MatFast, a matrix computation system for efficiently processing and optimizing matrix-only queries in a distributed in-memory environment. We introduce a set of heuristic rules that rewrite special features of a matrix query to reduce its memory footprint, and cost models that estimate the sparsity of sparse matrix multiplications and distribute the matrix data partitions among compute workers for communication-efficient execution. We implement and test the query processing strategies in an open-source distributed dataflow engine (Apache Spark). In the second contribution of this dissertation, we extend MatFast to MatRel, where we study how to efficiently process queries that involve both matrix and relational operators. We identify a series of equivalent transformation rules to rewrite a logical plan when both relational and matrix operations are present. We introduce selection, projection, aggregation, and join operators over matrix data, and propose optimizations to reduce computation overhead. We also design a cost model to distribute matrix data among compute workers for communication-efficient evaluation of relational join operations. In the third and last contribution of this dissertation, we demonstrate how to leverage MatRel for optimizing complex matrix-aware relational query evaluation pipelines. In particular, we showcase how to efficiently learn model parameters with MatRel for deep neural networks used in various applications, e.g., Word2Vec. Applied Computer Science big data query optimization matrix computation distributed computing
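The abstract of entry 21 mentions cost models that estimate the sparsity of sparse matrix multiplications but does not state the formula. As a rough illustration only, and not MatFast's actual cost model, the sketch below uses a common textbook estimate that assumes nonzero entries are placed independently and uniformly at random. The function name `estimate_product_density` and its parameters are hypothetical.

```python
def estimate_product_density(density_a: float, density_b: float, inner_dim: int) -> float:
    """Estimate the density of C = A @ B.

    Assumption (not from the dissertation): nonzeros of A and B are independent
    and uniformly distributed, so each of the `inner_dim` terms contributing to
    an entry of C is nonzero with probability density_a * density_b.
    """
    if not (0.0 <= density_a <= 1.0 and 0.0 <= density_b <= 1.0):
        raise ValueError("densities must lie in [0, 1]")
    if inner_dim < 0:
        raise ValueError("inner_dim must be non-negative")
    p_term_nonzero = density_a * density_b
    # An entry of C is zero only if every contributing term is zero.
    return 1.0 - (1.0 - p_term_nonzero) ** inner_dim


if __name__ == "__main__":
    # Two matrices with 1% nonzeros each and an inner dimension of 1000.
    print(estimate_product_density(0.01, 0.01, 1000))
```

An estimate like this lets an optimizer guess whether an intermediate product is still worth storing in a sparse format before computing it. Real data is rarely uniform, so this is only a first approximation.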
22	Analysis of Controllability for Temporal Networks Babak Ravandi (7456850) 17 October 2019 Physical systems modeled by networks are fully dynamic in the sense that the process of adding edges and vertices never ends, and no edge or vertex is necessarily eternal. Temporal networks make it possible to study systems with a changing topology by explicitly capturing the temporal changes. The controllability of temporal networks is the study of driving the state of a temporal network to a target state at deadline t_f within Δt = t_f - t_0 steps by stimulating key nodes called driver nodes. In this research, the author aims to understand and analyze temporal networks from the controllability perspective at the global and nodal scales. To analyze controllability at the global scale, the author provides an efficient heuristic algorithm to build driver node sets capable of fully controlling temporal networks. At the nodal scale, the author presents the concept of the Complete Controllable Domain (CCD) to investigate the characteristics of the Maximum Controllable Subspaces (MCSs) of a driver node. The author shows that a driver node can have an exponential number of MCSs and introduces a branch and bound algorithm to approximate the CCD of a driver node. The proposed algorithms are evaluated on real-world temporal networks induced from ant interactions in six colonies and from a set of e-mail communications of a manufacturing company. At the global scale, the author provides ways to determine the control regime in which a network operates. Through empirical analysis, the author shows that ant interaction networks operate under a distributed control regime, whereas the e-mail network operates under a centralized regime. At the nodal scale, the analysis indicates that, on average, the number of nodes that a driver node always controls is equal to the number of driver nodes that always control a node. Applied Computer Science Complex Physical Systems Temporal Networks Controllability Network Science Complex Networks Complex Systems
23	Unsupervised Visual Knowledge Discovery and Accumulation in Dynamic Environments Ziyin Wang (7860227) 13 November 2019 Developing unsupervised vision systems for Dynamic Environments is one of the next challenges in Computer Vision. In Dynamic Environments, we usually lack complete domain knowledge of the target environment before deployment, and computation is limited by the need for prompt reaction and by on-board computational capacity. This thesis studies a series of key Computer Vision problems in Dynamic Environments. First, we propose a stream clustering algorithm and a number of variants for unsupervised feature learning and object discovery. They possess several characteristics that applications in Dynamic Environments require, e.g., fully progressive operation, support for arbitrary similarity measures, and matching objects while the feature space grows. We give strong provable guarantees on the clustering accuracy from a statistical perspective. Based on these approaches, we tackle the problem of discovering aerial objects on-the-fly, where we assume that all objects are unknown at the beginning of the deployment. The vision system is required to discover everything from low-level features to salient objects on-the-fly without any supervision. We propose a number of approaches for object proposal, tracking, recognition, and localization to achieve real-time performance. Extensive experiments on prevalent aerial video datasets showed that the approaches discover salient ground objects efficiently and accurately. To explore complex and deep architectures in Dynamic Environments, we propose Unsupervised Deep Encoding, which unifies traditional Visual Encoding and Convolutional Neural Networks. We found strong relationships between single-layer Neural Networks and Clustering, and thus performed unsupervised feature learning at each layer from the feature maps of the previous layer. We replaced the dot product inside each neuron with a similarity measure, which is also used in unsupervised feature learning. The weight vectors of our network are initialized with cluster centers. Therefore, each feature map is a visual encoding of the previous feature map. We applied this mechanism to pre-training Convolutional Neural Networks for image classification. Extensive experiments showed that pre-training gives the network more reliable learning dynamics (e.g., fast convergence without Batch Normalization) and better classification accuracy. Applied Computer Science Computer Vision Aerial Vision Dynamic Environments Deep Learning
24	Leipziger Beiträge zur Informatik 20 November 2014 The book series "Leipziger Beiträge zur Informatik" publishes reports from research projects, edited volumes on innovative and emerging research areas, habilitation theses and dissertations, as well as conference proceedings and outstanding student work. The series was founded in 2003 by the "Leipziger Informatik Verbund" (LIV), an association and interest group of various computer science institutions. Its value lies in reporting promptly and comprehensively on completed or ongoing scientific work and on newly emerging research fields. The series places the innovative thematic diversity of the edited volumes alongside the scientific depth of the habilitation theses and dissertations. It also complements research-relevant areas with practice-oriented technical contributions and documentation. Informatik Angewandte Informatik Wirtschaftsinformatik computer science applied computer science business informatics ddc:000 ddc:004
25	EXTRACTING SYMPTOMS FROM NARRATIVE TEXT USING ARTIFICIAL INTELLIGENCE Priyanka Rakesh Gandhi (9713879) 07 January 2021 Electronic health records collect an enormous amount of data about patients. However, the information about a patient’s illness is stored in progress notes that are in an unstructured format. It is difficult for humans to annotate symptoms listed in the free text. Recently, researchers have explored how advances in deep learning can be applied to process biomedical data. The information in the text can be extracted with the help of natural language processing. The research presented in this thesis aims at automating the process of symptom extraction. The proposed methods use pre-trained word embeddings such as BioWord2Vec, BERT, and BioBERT to generate vectors of the words based on the semantics and syntactic structure of sentences. BioWord2Vec embeddings are fed into a BiLSTM neural network with a CRF layer to capture the dependencies between correlated terms in the sentence. The pre-trained BERT and BioBERT embeddings are fed into the BERT model with a CRF layer to analyze the output tags of neighboring tokens. The research shows that with the help of the CRF layer in neural network models, longer symptom phrases can be extracted from the text. The proposed models are compared with the UMLS MetaMap tool, which uses various sources to assign the terms in the text to different semantic types, and with Stanford CoreNLP, a dependency parser that analyzes syntactic relations in the sentence to extract information. The performance of the models is analyzed using strict, relaxed, and n-gram evaluation schemes. The results show that BioBERT with a CRF layer can extract the majority of the human-labeled symptoms. Furthermore, the model is used to extract symptoms from COVID-19 tweets. The model was able to extract symptoms listed by the CDC as well as new symptoms. Applied Computer Science Advanced Neural Networks Artificial Intelligence Natural language processing Medical notes
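Entry 25 mentions strict and relaxed evaluation schemes without defining them. The sketch below shows one common convention for span extraction, assumed here rather than taken from the thesis: a strict match requires identical span boundaries, while a relaxed match accepts any overlap. The names `Span`, `overlaps`, and `span_scores` are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def overlaps(a: Span, b: Span) -> bool:
    """True if the two half-open spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


def span_scores(gold: List[Span], predicted: List[Span],
                relaxed: bool = False) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under strict or relaxed matching.

    Assumed definitions: strict = exact boundary match, relaxed = any overlap.
    """
    def matches(p: Span, g: Span) -> bool:
        return overlaps(p, g) if relaxed else p == g

    matched_pred = sum(1 for p in predicted if any(matches(p, g) for g in gold))
    matched_gold = sum(1 for g in gold if any(matches(p, g) for p in predicted))
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    gold = [(0, 3), (5, 7)]   # e.g. "shortness of breath", "dry cough"
    pred = [(0, 2), (5, 7)]   # first prediction misses the last token
    print(span_scores(gold, pred, relaxed=False))
    print(span_scores(gold, pred, relaxed=True))
```

Under these assumed definitions, a prediction that clips one token from a long symptom phrase counts as a miss in the strict scheme and as a hit in the relaxed one, which is why the two schemes are usually reported side by side.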
26	Disentangled Representations Learning for Covid-19 Sequelae Prediction Zhaorui Liu (11820731) 19 December 2021 Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged in late 2019 and became an unprecedented public health crisis. Hundreds of millions of people have been affected. What is worse, many researchers have found that COVID-19 may have long-term effects on a variety of organs even after recovery. Consequently, there is a need to study its sequelae. The purpose of this project is to use machine learning algorithms to study the relationship between patients’ EMR data and long-term sequelae, especially kidney diseases. Inspired by recent work on learning disentangled representations for recommendation, this project proposes a method that (i) predicts the development trend of kidney disease and (ii) learns representations that uncover and disentangle factors related to kidney diseases. The major contribution is that the model is highly interpretable, which enables medical workers to infer how a patient's condition will develop. Applied Computer Science Health Informatics Disentangled Representations Learning Covid-19 Sequelae
27	Evaluating the Effects of BKT-LSTM on Students' Learning Performance Jianyao Li (11794436) 20 December 2021 Today, machine learning models and Deep Neural Networks (DNNs) are prevalent in various areas. Educational Artificial Intelligence (AI) is also drawing increasing attention with the rapid development of online learning platforms. Researchers explore different types of educational AI to improve students’ learning performance and experience in online classes. Educational AIs can be categorized as “interactive” or “predictive.” Interactive AIs answer simple course questions for students, such as the due date of homework and the final project’s minimum page requirement. Predictive educational AIs predict students’ learning states, so that instructors can adjust the learning content accordingly. However, most AIs are not evaluated in an actual class setting. Therefore, we want to evaluate the effects of a state-of-the-art educational AI model, BKT (Bayesian Knowledge Tracing)-LSTM (Long Short-Term Memory), on students’ learning performance in an actual class setting. Data came from the course CNIT 25501, a large introductory Java programming class at Purdue University. Participants were randomly divided into a control group and an experimental group (AI-group). Weekly quizzes measured participants’ learning performance. A pre-quiz and base quizzes estimated participants’ prior knowledge levels. Using BKT-LSTM, participants in the experimental group received questions on the topics where their knowledge was weakest, whereas participants in the control group received questions on randomly selected topics. The results suggested that both the experimental and control groups had lower scores in review quizzes than in base quizzes. However, the score difference between base quizzes and review quizzes was statistically significant more often for the experimental group (three quizzes) than for the control group (two quizzes), demonstrating the predictive capability of BKT-LSTM to some extent. Initially, we expected that BKT-LSTM would enhance students’ learning performance. However, in the post-quiz, participants in the control group had significantly higher scores than those in the experimental group. This result suggests that a continuous stream of difficult questions may negatively affect students’ learning initiative, whereas relatively easy questions may improve it. Applied Computer Science Computer-Human Interaction BKT-LSTM Educational AI Knowledge Tracing Learning performance
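Entry 27 builds on Bayesian Knowledge Tracing (BKT). For readers unfamiliar with it, the sketch below shows the standard single-skill BKT update: a Bayesian posterior over whether the student knows the skill, followed by a learning transition. This is plain BKT only, not the BKT-LSTM model evaluated in the thesis, and the names `BKTParams` and `bkt_update` and the parameter values are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class BKTParams:
    p_learn: float  # P(T): chance of learning the skill after a practice step
    p_slip: float   # P(S): chance of answering wrong despite knowing the skill
    p_guess: float  # P(G): chance of answering right without knowing the skill


def bkt_update(p_known: float, correct: bool, params: BKTParams) -> float:
    """Return the updated probability that the student knows the skill."""
    if correct:
        numerator = p_known * (1.0 - params.p_slip)
        denominator = numerator + (1.0 - p_known) * params.p_guess
    else:
        numerator = p_known * params.p_slip
        denominator = numerator + (1.0 - p_known) * (1.0 - params.p_guess)
    # Posterior given the observed answer; keep the prior if the evidence is degenerate.
    posterior = numerator / denominator if denominator > 0 else p_known
    # Learning transition: an unknown skill may become known after this step.
    return posterior + (1.0 - posterior) * params.p_learn


if __name__ == "__main__":
    params = BKTParams(p_learn=0.1, p_slip=0.1, p_guess=0.2)  # illustrative values
    p = 0.3  # assumed prior mastery
    for answer in [True, False, True, True]:
        p = bkt_update(p, answer, params)
        print(round(p, 3))
```

A tutoring system could use the resulting mastery estimates to choose the skill with the lowest value for the next question, which matches the selection policy the abstract describes for the experimental group in spirit, though the thesis does not spell out its procedure.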
28	DRONE CLASSIFICATION WITH MOTION AND APPEARANCE FEATURE USING CONVOLUTIONAL NEURAL NETWORKS Eunsuh Lee (8981213) 17 June 2020 With the advancement of Unmanned Aerial Vehicle (UAV) technology, UAVs have become accessible to the public. However, recent world events have highlighted that the rapid increase in UAVs brings with it a threat to public privacy and security. Thus, it is important to consider how to counter the threats posed by UAVs in order to protect our privacy and safety. This study aims to provide an alternative to expensive systems by using 2D optical sensors that people can easily use. One of the main challenges for aerial object recognition with computer vision is discriminating the targets from other flying objects at long range. It is difficult to classify a flying object when it appears as a small set of black pixels in the frame. Movement features can help the system extract discriminative features, so that the classifier can distinguish a UAV from other objects, such as a bird. Thus, this study proposes a drone detection system that uses two kinds of information, appearance information and motion information, to overcome the limitations of a vision-based system. Applied Computer Science Autonomous Vehicles Computer Vision Image Processing Drone Detection computer vision technique
29	NON-INTRUSIVE LOAD EXTRACTION OF ELECTRIC VEHICLE CHARGING LOADS FOR EDGE COMPUTING Hyeonae Jang (8790983) 01 May 2020 The accelerated urbanization of countries has led to the adoption of the smart power grid alongside an explosion in high power usage. The emergence of Non-intrusive Load Monitoring (NILM), also referred to as Energy Disaggregation, has followed the recent worldwide adoption of smart meters in smart grids. NILM is a convenient process for analyzing a composite electrical energy load and determining electrical energy consumption. A number of state-of-the-art NILM (energy disaggregation) algorithms have been proposed recently to detect various individual appliances from one aggregated signal observation. Different kinds of classification methods, such as Hidden Markov Models (HMM), Support Vector Machines (SVM), neural networks, fuzzy logic, Naive Bayes, k-Nearest Neighbors (kNN), and many hybrid approaches, have been used to classify the estimated power consumption of electrical appliances from extracted appliance signatures. This study proposes an end-to-end edge computing system with an NILM algorithm that focuses on recognizing Electric Vehicle (EV) charging. The system consists of three main components: (1) data acquisition and preprocessing, (2) extraction of the EV charging load via an NILM algorithm (load identification) on the NILMTK framework, and (3) result reporting to the cloud server platform. Monitoring energy consumption through the proposed system is beneficial for demand response and energy efficiency. It helps to improve the understanding and prediction of power grid stress and to enhance the reliability and resilience of the power grid. Furthermore, it is advantageous for the integration of more renewable energy sources, which are under rapid development. As a result, many potential NILM use cases are expected to arise from monitoring and identifying energy consumption in a power grid. It would enable smarter power consumption plans for residents as well as more flexible power grid management for electric utility companies, such as Duke Energy and ComEd. Computer Engineering Applied Computer Science Computer Software NILM Edge Computing Electric Vehicle Smart Grid
30	Source Code Hyeonae Jang (8790983) 01 May 2020 This compressed file consists of the h5 and Python files created to conduct the thesis study. Computer Engineering Applied Computer Science Computer Software nilmtk nilm Electric Vehicle

Search results