Global ETD Search

41	Visualization design for improving layer-wise relevance propagation and multi-attribute image classification Huang, Xinyi 01 December 2021 (has links) No description available. Computer Science deep learning visualization XAI
42	Accelerating the Computation and Design of Nanoscale Materials with Deep Learning Ryczko, Kevin 03 December 2021 (has links) In this article-based thesis, we cover applications of deep learning to different problems in condensed matter physics, where the goal is to either accelerate the computation or design of a nanoscale material. We first motivate and introduce how machine learning methods can be used to accelerate traditional condensed matter physics calculations. In addition, we discuss what designing a material means, and how it has been previously done. We then consider the fundamentals of electronic structure and conventional calculations which include density functional theory (DFT), density functional perturbation theory (DFPT), quantum Monte Carlo (QMC), and electron transport with tight binding. In addition, we cover the basics of deep learning. Afterwards, we discuss 6 articles. The first 5 articles are dedicated to accelerating the computation of nanoscale materials. In Article 1, we use convolutional neural networks to predict energies for diatomic molecules modelled with a Lennard-Jones potential and density functional theory energies of hexagonal lattices with and without defects. In Article 2, we use extensive deep neural networks to represent density functional theory energy functionals for electron gases by using the electron density as input and bypass the Kohn-Sham equations by using the external potential as input. In addition, we use deep convolutional inverse graphics networks to map the external potential directly to the electron density. In Article 3, we use voxel deep neural networks (VDNNs) to map electron densities to kinetic energy densities and functional derivatives of the kinetic energies for graphene lattices. We also use VDNNs to calculate an electron density from a direct minimization calculation and introduce a Monte Carlo based solver that avoids taking a functional derivative altogether. 
In Article 4, we use a deep learning framework to predict the polarization, dielectric function, Born effective charges, longitudinal optical-transverse optical (LO-TO) splitting, Raman tensors, and Raman spectra for 2 crystalline systems. In Article 5, we use VDNNs to map DFT electron densities to QMC energy densities for graphene systems, and compute the energy barrier associated with forming a Stone-Wales defect. In Article 6, we design a graphene-based quantum transducer that can physically split valley currents by controlling the pn-doping of the lattice sites. The design is guided by a neural network that operates on a pristine lattice and outputs a lattice with pn-doping such that valley currents are optimally split. Lastly, we summarize the thesis and outline future work. Condensed Matter Physics Deep Learning Materials Design
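The Lennard-Jones potential mentioned for Article 1 has a standard closed form, V(r) = 4ε[(σ/r)^12 − (σ/r)^6]. The snippet below is only a minimal sketch of that formula in reduced units; the function name and the default values of `epsilon` and `sigma` are illustrative assumptions, not taken from the thesis.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy V(r) = 4*eps*((sigma/r)**12 - (sigma/r)**6).

    epsilon (well depth) and sigma (zero-crossing distance) default to
    reduced units; these defaults are assumptions for illustration only.
    """
    if r <= 0:
        raise ValueError("interatomic distance must be positive")
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# The energy is zero at r = sigma and reaches its minimum, -epsilon,
# at r = 2**(1/6) * sigma.
r_min = 2 ** (1 / 6)
print(lennard_jones(1.0), lennard_jones(r_min))
```

Energies of this kind, evaluated over many diatomic separations, are the sort of label a regression network could be trained to reproduce.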
43	Digital Twin Coaching for Edge Computing Using Deep Learning Based 2D Pose Estimation Gámez Díaz, Rogelio 15 April 2021 (has links) In these challenging times caused by COVID-19, technology that leverages the potential of Artificial Intelligence (AI) can help people cope with the pandemic, for example people looking to perform physical exercises while in quarantine. We also find another opportunity in the widespread adoption of mobile smart devices, which makes complex AI models accessible to the average user. Taking advantage of this situation, we propose a Smart Coaching experience on the Edge with our Digital Twin Coaching (DTC) architecture. Since the general population is advised to work from home, sedentary behavior has become prevalent. Coaching is a positive force in exercising, but keeping physical distance while exercising is a significant problem. Therefore, a Smart Coach can help in this scenario, as it involves using smart devices instead of direct communication with another person. Some researchers have worked on Smart Coaching, but their systems often involve complex devices such as RGB-Depth cameras, making them cumbersome to use. Our approach is one of the first to focus on everyday smart devices, like smartphones, to solve this problem. Digital Twin Coaching can be defined as a virtual system designed to help people improve in a specific field and is a powerful tool if combined with edge technology. The DTC architecture has six characteristics that we try to fulfill: adaptability, compatibility, flexibility, portability, security, and privacy. Since there was no dataset of Coach-Trainee videos, we collected training data from 10 subjects using a 2D pose estimation model to train our models. To use this information effectively, the most critical pre-processing step was synchronization. This step synchronizes the coach's and the trainee's poses to overcome the trainee's action lag while performing the routine in real time.
We trained a lightweight neural network called "Pose Inference Neural Network" (PINN) to serve as a fine-tuning mechanism. With this trained network, we improved the generalist 2D pose estimation model while keeping the time complexity relatively unaffected. We also propose an Angular Pose Representation for comparing the trainee's and the coach's stances that accounts for differences in people's body proportions. For the PINN model, we use Random Search Optimization to find the best configuration. The configurations tested included 1, 2, 3, 4, 5, and 10 layers. We chose the 2-Layer Neural Network (2-LNN) configuration because it was the fastest to train and predict while providing a fair tradeoff between performance and resource consumption. Using frame synchronization in pre-processing, we improved the test loss (Mean Squared Error) by 76% while training with the 2-LNN. The PINN improved the R² score of the PoseNet model by at least 15% and at most 93%, depending on the configuration. On average, our approach added only 4 seconds (roughly 2% of the total time) to the total processing time. Finally, the usability test results showed that our Proof of Concept application, DTCoach, was considered easy to learn and convenient to use. At the same time, some participants mentioned that they would like more features and improved clarity before they would use the app frequently. We hope DTCoach can help people stay more active, especially in quarantine, as the application can serve as a motivator. Since it can run on modern smartphones, it can quickly be adopted by many people. Digital Twin Pose Estimation Deep Learning E-coaching
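The abstract does not spell out the Angular Pose Representation, so the following is only a minimal sketch of the general idea, assuming a pose is summarized by the angle at each joint between its two adjacent limb segments. The function name and the keypoint layout are hypothetical. Angles are unchanged when limb lengths are rescaled, which is why such a representation can compare people with different body proportions.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at keypoint b, formed by the segments b->a and b->c.

    a, b, c are (x, y) image coordinates of three 2D keypoints, for
    example shoulder, elbow, wrist (an assumed, illustrative choice).
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0 or n2 == 0:
        raise ValueError("adjacent keypoints must not coincide")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against floating-point values slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Same elbow bend, different limb lengths: the two angles agree.
coach = joint_angle((1.0, 0.0), (0.0, 0.0), (0.0, 1.0))
trainee = joint_angle((2.0, 0.0), (0.0, 0.0), (0.0, 3.0))
print(coach, trainee)
```

A coach and a trainee could then be compared joint by joint through the difference between their angles, rather than through raw pixel coordinates.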
44	Shape-Tailored Invariant Descriptors for Segmentation Khan, Naeemullah 11 1900 (has links) Segmentation is one of the first steps in the human visual system that helps us see the world around us. Humans pre-attentively segment scenes into regions of unique textures in around 10-20 ms. In this thesis, we address the problem of segmentation by grouping dense pixel-wise descriptors. Our work is based on the fact that human vision has a feed-forward and a feed-backward loop: low-level features are used to refine high-level features in the forward feed, and higher-level feature information is used to refine the low-level features in the backward feed. Most vision algorithms are based on a feed-forward loop, where low-level features are used to construct and refine high-level features, but they lack the feedback loop. We introduce "Shape-Tailored Local Descriptors", where we use the high-level feature information (the region approximation) to update the low-level features, i.e., the descriptor, and the low-level feature information of the descriptor is used to update the segmentation regions. Our Shape-Tailored Local Descriptors are dense local descriptors tailored to an arbitrarily shaped region, aggregating data only within the region of interest. Since the segmentation, i.e., the regions, is not known a priori, we pose a joint problem for Shape-Tailored Local Descriptors and segmentation (regions). Furthermore, since natural scenes consist of multiple objects, which may have different visual textures at different scales, we propose a multi-scale approach to segmentation. We used both a set of discrete scales and a continuum of scales in our experiments; both resulted in state-of-the-art performance. Lastly, we examined the nature of the features selected: we tried handcrafted color and gradient channels, and we also introduced an algorithm to incorporate learning of optimal descriptors into segmentation approaches.
In the final part of this thesis we have introduced techniques for unsupervised learning of descriptors for segmentation. This eliminates the problem of deep learning methods where we need huge amounts of training data to train the networks. The optimum descriptors are learned, without any training data, on the go during segmentation. segmentation Computer Vision Deep Learning Descriptors Textures
45	RNA Sequence Classification Using Secondary Structure Fingerprints, Sequence-Based Features, and Deep Learning Sutanto, Kevin 12 March 2021 (has links) RNAs are involved in many facets of biological processes, including but not limited to controlling and inhibiting gene expression, enabling transcription and translation from DNA to proteins, processes involving diseases such as cancer, and virus-host interactions. As such, useful applications may arise from studies and analyses involving RNAs, such as detecting cancer by measuring the abundance of specific RNAs, detecting and identifying infections involving RNA viruses, identifying the origins of and relationships between RNA viruses, and identifying potential targets when designing novel drugs. Extracting sequences from RNA samples is usually no longer a major limitation thanks to sequencing technologies such as RNA-Seq. However, accurately identifying and analyzing the extracted sequences is often still the bottleneck when it comes to developing RNA-based applications. Like proteins, functional RNAs are able to fold into complex structures in order to perform specific functions throughout their lifecycle. This suggests that structural information can be used to identify or classify RNA sequences, in addition to the sequence information of the RNA itself. Furthermore, a strand of RNA may have more than one possible structural conformation it can fold into, and it is also possible for a strand to form different structures in vivo and in vitro. However, past studies that utilized secondary structure information for RNA identification purposes have relied on one predicted secondary structure for each RNA sequence, despite the possible one-to-many relationship between a strand of RNA and its possible secondary structures.
Therefore, we hypothesized that using a representation that includes the multiple possible secondary structures of an RNA may improve classification performance. We proposed and built a pipeline that, given an RNA sequence, produces secondary structure fingerprints that take into account the aforementioned multiple possible secondary structures for a single RNA. Using this pipeline, we explored and developed different types of secondary structure fingerprints in our studies. One type of fingerprint serves as a high-level topological representation of the RNA structure, while another type represents matches with common known RNA secondary structure motifs that we curated from databases and the literature. Next, to test our hypothesis, we used the different fingerprints with deep learning and with different datasets, alone and together with various sequence-based features, to investigate how the secondary structure fingerprints affect classification performance. Finally, by analyzing our findings, we also propose approaches that future studies can adopt to further improve our secondary structure fingerprints and classification performance. RNA secondary structure k-mer deep learning
46	3D Object Detection for Advanced Driver Assistance Systems Demilew, Selameab 29 June 2021 (has links) Robust and timely perception of the environment is an essential requirement of all autonomous and semi-autonomous systems. This necessity has been the main factor behind the rapid growth and adoption of LiDAR sensors within the ADAS sensor suite. In this thesis, we develop a fast and accurate 3D object detector that converts raw point clouds collected by LiDARs into sparse occupancy cuboids to detect cars and other road users using deep convolutional neural networks. The proposed pipeline reduces the runtime of PointPillars by 43% and performs on par with other state-of-the-art models. We do not gain improvements in speed by compromising the network's complexity and learning capacity but rather through the use of an efficient input encoding procedure. In addition to rigorous profiling on three different platforms, we conduct a comprehensive error analysis and recognize principal sources of error among the predicted attributes. Even though point clouds adequately capture the 3D structure of the physical world, they lack the rich texture information present in color images. In light of this, we explore the possibility of fusing the two modalities with the intent of improving detection accuracy. We present a late fusion strategy that merges the classification head of our LiDAR-based object detector with semantic segmentation maps inferred from images. Extensive experiments on the KITTI 3D object detection benchmark demonstrate the validity of the proposed fusion scheme. 3D Object Detection Autonomous Vehicles Deep Learning
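The abstract for entry 46 does not describe its input encoding in detail, so the snippet below is only a rough sketch of the general idea of turning a raw point cloud into a sparse occupancy grid. It assumes axis-aligned cubic cells; the function name, `voxel_size`, and `origin` are hypothetical and are not the thesis's encoding.

```python
import math


def occupied_voxels(points, voxel_size, origin=(0.0, 0.0, 0.0)):
    """Return the set of integer (i, j, k) cells that contain at least one point.

    points: iterable of (x, y, z) coordinates in metres (assumed units).
    Only occupied cells are stored, so empty space costs no memory.
    """
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    cells = set()
    for x, y, z in points:
        cells.add((
            math.floor((x - origin[0]) / voxel_size),
            math.floor((y - origin[1]) / voxel_size),
            math.floor((z - origin[2]) / voxel_size),
        ))
    return cells


# Two nearby points share one cell; the third falls in a neighbouring cell.
cloud = [(0.1, 0.1, 0.1), (0.2, 0.3, 0.4), (1.5, 0.0, 0.0)]
print(sorted(occupied_voxels(cloud, voxel_size=1.0)))
```

Storing only the occupied cells is one common way to keep the encoding step cheap for LiDAR scans, where most of the volume is empty.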
47	Reconfigurable Snapshot HDR Imaging Using Coded Masks Alghamdi, Masheal M. 10 July 2021 (has links) High Dynamic Range (HDR) image acquisition from a single image capture, also known as snapshot HDR imaging, is challenging because the bit depths of camera sensors are far from sufficient to cover the full dynamic range of the scene. Existing HDR techniques focus either on algorithmic reconstruction or hardware modification to extend the dynamic range. In this thesis, we propose a joint design for snapshot HDR imaging by devising a spatially varying modulation mask in the hardware combined with a deep learning algorithm to reconstruct the HDR image. In this approach, we achieve a reconfigurable HDR camera design that does not require custom sensors, and instead can be reconfigured between HDR and conventional mode with very simple calibration steps. We demonstrate that the proposed hardware-software solution offers a flexible, yet robust, way to modulate per-pixel exposures, and the network requires little knowledge of the hardware to faithfully reconstruct the HDR image. Comparative analysis demonstrated that our method outperforms the state of the art in terms of visual perception quality. We leverage transfer learning to overcome the lack of sufficiently large HDR datasets. We show how transferring from a different large-scale task (image classification on ImageNet) leads to considerable improvements in HDR reconstruction. computational photography high dynamic range deep learning
48	Minimalism in deep learning Jensen, Louis 24 February 2022 (has links) As deep learning continues to push the boundaries with applications previously thought impossible, it has become more important than ever to reduce the associated resource costs. Data is not always abundant, labelling costs valuable human time, and deep models are demanding of computer hardware. In this dissertation, I will examine questions of minimalism in deep learning. I will show that deep learning can learn with fewer measurements, fewer weights, and less information. With minimalism, deep learning can become even more ubiquitous, succeeding in more applications and on more everyday devices. Computer science Deep learning Neural networks
49	A Study of Transformer Models for Emotion Classification in Informal Text Esperanca, Alvaro Soares de Boa 12 1900 (has links) Indiana University-Purdue University Indianapolis (IUPUI) / Textual emotion classification is a task in affective AI that branches from sentiment analysis and focuses on identifying emotions expressed in a given text excerpt. It has a wide variety of applications that improve human-computer interactions, particularly by empowering computers to better understand subjective human language. Significant research has been done on this task, but very little of that research leverages one of the most emotion-bearing symbols used in modern communication: emojis. In this thesis, we propose several transformer-based models for emotion classification that process emojis as input tokens and leverage pretrained models. Among them is ReferEmo, a model that processes emojis as textual inputs and leverages DeepMoji to generate affective feature vectors used as a reference when aggregating different modalities of text encoding. To evaluate ReferEmo, we experimented on the SemEval 2018 and GoEmotions datasets, two benchmark datasets for emotion classification, and achieved competitive performance compared to state-of-the-art models tested on these datasets. Notably, our model performs better on the underrepresented classes of each dataset. NLP Deep Learning Emotion Classification BERT Emojis
50	Feature Detection from Mobile LiDAR Using Deep Learning Liu, Xian 12 March 2019 (has links) No description available. Computer Science Deep learning, LiDAR, Feature Detection
