331 |
Transfer Learning and Hyperparameter Optimisation with Convolutional Neural Networks for Fashion Style Classification and Image Retrieval
Alishev, Andrey January 2024
The thesis explores the application of Convolutional Neural Networks (CNNs) in the fashion industry, focusing on fashion style classification and image retrieval. Employing transfer learning, the study investigates the effectiveness of fine-tuning pre-trained CNN models to adapt them for a specific fashion recognition task by initially performing an extensive hyperparameter optimisation, utilising the Optuna framework. The impact of dataset size on model performance was examined by comparing the accuracy of models trained on datasets containing 2000 and 8000 images. Results indicate that larger datasets significantly improve model performance, particularly for more complex models like EfficientNetV2S, which showed the best overall performance with an accuracy of 85.38% on the larger dataset after fine-tuning. The best-performing and fine-tuned model was subsequently used for image retrieval as features were extracted from the last convolutional layer. These features were used in a cosine similarity measure to rank images by their similarity to a query image. This technique achieved a mean average precision (mAP) of 0.4525, indicating that CNNs hold promise for enhancing fashion retrieval systems, although further improvements and validations are necessary. Overall, this research highlights the versatility of CNNs in interpreting and categorizing complex visual data. The importance of well-prepared, targeted data and refined model training strategies is highlighted to enhance the accuracy and applicability of AI in diverse fields.
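The retrieval step described above (rank gallery images by cosine similarity to a query, then score the ranking with mean average precision) can be sketched as follows. This is a minimal illustration, not the thesis implementation: the feature vectors are assumed to have been extracted from the CNN already, and the function names `rank_gallery`, `average_precision` and `mean_average_precision` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def rank_gallery(query, gallery):
    """Return gallery indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)


def average_precision(ranked_labels, query_label):
    """Mean of precision@k over the ranks k that hold a relevant item."""
    hits = 0
    precisions = []
    for k, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(queries, query_labels, gallery, gallery_labels):
    """Average the per-query AP over all queries."""
    aps = []
    for q, q_label in zip(queries, query_labels):
        order = rank_gallery(q, gallery)
        aps.append(average_precision([gallery_labels[i] for i in order], q_label))
    return sum(aps) / len(aps)
```

Here an image counts as relevant when it shares the query's style label, which is an assumption; the abstract does not state the relevance criterion behind the reported mAP.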
|
332 |
Using Visual Abstractions to Improve Spatially Aware Nominal Safety in Autonomous Vehicles
Modak, Varun Nimish 05 June 2024
As autonomous vehicles (AVs) evolve, ensuring their safety extends beyond traditional metrics. While current nominal safety scores focus on the timeliness of AV responses like latency or instantaneous response time, this paper proposes expanding the concept to include spatial configurations formed by obstacles with respect to the ego-vehicle. By analyzing these spatial relationships, including proximity, density and arrangement, this research aims to demonstrate how these factors influence the safety force field around the AV. The goal is to show that beyond meeting Responsibility-Sensitive Safety (RSS) metrics, spatial configurations significantly impact the safety force field, particularly affecting path planning capability. High spatial occupancy of obstacle configurations can impede easy maneuverability, thus challenging safety-critical modules like path planning. This paper aims to capture this by proposing a safety score that leverages the ability of modern computer vision techniques, particularly image segmentation models, to capture high and low levels of spatial and contextual information. By enhancing the scope of nominal safety to include such spatial analysis, this research aims to broaden the understanding of drivable space and enable AV designers to evaluate path planning algorithms based on spatial configuration centric safety levels. / Master of Science / As self-driving cars become more common, ensuring their safety is crucial. While current safety measures focus on how quickly these cars can react to dangers, this paper suggests that understanding the spatial relationships between the car and obstacles is just as important, and needs to be explored further. Prior metrics use velocity and acceleration of all the actors, to determine the safe-distance of obstacles from the vehicle, and determine how fast the car should react before a predicted collision.
This paper aims to extend the scope of how safety is viewed during normal operating conditions of the vehicle by considering the arrangement of obstacles around it as an influencing factor to safety. By using advanced computer vision techniques, particularly models that can understand images in detail, this research proposes a new spatial safety metric. This score considers how well the car navigates through dense environments by understanding the spatial configurations that obstacles form. By studying these factors, I wish to introduce a metric that improves how self-driving cars are designed to navigate and path plan safely on the roads.
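For context on the "prior metrics" mentioned above, the commonly cited RSS minimum safe longitudinal distance between a rear vehicle and the vehicle ahead of it can be written as below. That this is the specific formula the thesis has in mind is an assumption; the abstract only names RSS. The symbols are illustrative: $v_r$ and $v_f$ are the rear and front vehicle speeds, $\rho$ is the rear vehicle's response time, $a_{\max,\mathrm{accel}}$ is its maximum acceleration during the response time, $a_{\min,\mathrm{brake}}$ is its minimum guaranteed braking deceleration, and $a_{\max,\mathrm{brake}}$ is the front vehicle's maximum braking deceleration.

```latex
d_{\min} = \left[ v_r \rho
  + \tfrac{1}{2}\, a_{\max,\mathrm{accel}}\, \rho^{2}
  + \frac{\left( v_r + \rho\, a_{\max,\mathrm{accel}} \right)^{2}}{2\, a_{\min,\mathrm{brake}}}
  - \frac{v_f^{2}}{2\, a_{\max,\mathrm{brake}}} \right]_{+},
\qquad [x]_{+} = \max(x, 0)
```

The expression depends only on speeds, accelerations and response time, which is the gap the abstract points to: the arrangement of obstacles around the ego-vehicle does not enter it.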
|
333 |
Novel Electrochemical Methods for Human Neurochemistry
Eltahir, Amnah 14 October 2020
Computational psychiatry describes psychological phenomena as abnormalities in biological computations. Currently available technologies span multiple organizational and temporal domains, but there remains a knowledge gap with respect to neuromodulator dynamics in humans. Recent efforts by members of the Montague Laboratory and collaborators adapted fast scan cyclic voltammetry (FSCV) from rodent experiments for use in human patients already receiving brain surgery. The process of modifying established FSCV methods for clinical application has led to improved model building strategies and a new "random burst" sensing protocol. The advent of random burst sensing raises questions about the capabilities of in-vivo electrochemistry techniques, while opening possibilities for novel approaches. Through a series of in-vitro experiments, this study aims to explore and validate novel electrochemical sensing approaches. Initial expository experiments tested assumptions about waveform design to detect dopamine concentrations by reducing amplitude and duration of forcing functions, as well as distinguishing norepinephrine concentrations. Next, large data sets collected on mixtures of dopamine, serotonin and pH validated a newly proposed "low amplitude random burst sensing" protocol, for both within-probe and out-of-probe modeling. Data collected on the same set of solutions also attempted to establish an order-millisecond random burst sensing approach. Preliminary endeavors into using convolutional neural networks also provided an example of an alternative modeling strategy. The results of this work challenge existing assumptions of neurochemistry, while demonstrating the capabilities of new neurochemical sensing approaches. This study will also act as a springboard for emerging technological developments in human neurochemistry. / Doctor of Philosophy / Neuroscience characterizes nervous system functions from the cellular to the systems level.
A gap in available technologies has prevented neuroscientists from studying how changes in the molecular dynamics in the brain relate to psychiatric conditions. Recent efforts by the Montague Laboratory have adapted neurochemistry techniques for use in human patients. Consequently, a new "random burst sensing" approach was developed that challenged existing assumptions about electrochemistry. In this study, in-vitro experiments were conducted to push the limits of electrochemical sensing by reducing the voltage amplitude range and increasing the temporal resolution of sensing beyond previously established limits. The results of this study offer novel neurochemistry approaches and act as a jumping-off point for future technological developments.
|
334 |
Rapid Prediction of Tsunamis and Storm Surges Using Machine Learning
Lee, Michael 27 April 2021
Tsunamis and storm surges are among the most destructive and costly natural hazards faced by coastal communities around the world. To enhance coastal resilience and to develop effective risk management strategies, accurate and efficient tsunami and storm surge prediction models are needed. However, existing physics-based numerical models struggle to deliver accuracy and efficiency at the same time. In this dissertation, several surrogate models are developed using statistical and machine learning techniques that can rapidly predict a tsunami and storm surge without substantial loss of accuracy, with respect to high-fidelity physics-based models. First, a tsunami run-up response function (TRRF) model is developed that can rapidly predict a tsunami run-up distribution from earthquake fault parameters. This new surrogate modeling approach reduces the number of simulations required to build a surrogate model by separately modeling the leading order contribution and the residual part of the tsunami run-up distribution. Second, a TRRF-based inversion (TRRF-INV) model is developed that can infer a tsunami source and its impact from tsunami run-up records. Since this new tsunami inversion model is based on the TRRF model, it can perform a large number of tsunami forward simulations in tsunami inversion modeling, which is impossible with physics-based models. Lastly, a one-dimensional convolutional neural network combined with principal component analysis and k-means clustering (C1PKNet) model is developed that can rapidly predict the peak storm surge from tropical cyclone track time series. Because the C1PKNet model uses the tropical cyclone track time series, it has the advantage of being able to predict more diverse tropical cyclone scenarios than the existing surrogate models that rely on a tropical cyclone condition at one moment (usually at or near landfall).
The surrogate models developed in this dissertation have the potential to save lives, mitigate coastal hazard damage, and promote resilient coastal communities. / Doctor of Philosophy / Tsunami and storm surge can cause extensive damage to coastal communities; to reduce this damage, accurate and fast computer models are needed that can predict the water level change caused by these coastal hazards. The problem is that existing physics-based computer models are either accurate but slow or less accurate but fast. In this dissertation, three new computer models are developed using statistical and machine learning techniques that can rapidly predict a tsunami and storm surge without substantial loss of accuracy compared to the accurate physics-based computer models. The three computer models are as follows: (1) a computer model that can rapidly predict the maximum ground elevation wetted by the tsunami along the coastline from earthquake information, (2) a computer model that can work backwards to infer a tsunami source and its impact from observations of the maximum ground elevation wetted by the tsunami, and (3) a computer model that can rapidly predict peak storm surges across a wide range of coastal areas from the tropical cyclone's track position over time. These new computer models have the potential to improve forecasting capabilities, advance understanding of historical tsunami and storm surge events, and lead to better preparedness plans for possible future tsunamis and storm surges.
|
335 |
Study of Critical Phenomena with Monte Carlo and Machine Learning Techniques
Azizi, Ahmadreza 08 July 2020
Dynamical properties of non-equilibrium systems, similar to equilibrium ones, have been shown to obey robust time scaling laws which have enriched the concept of physical universality classes. In the first part of this Dissertation, we present the results of our investigations of some of the critical dynamical properties of systems belonging to the Voter or the Directed Percolation (DP) universality class. To be more precise, we focus on the aging properties of two-state and three-state Potts models with absorbing states and we determine temporal scaling of autocorrelation and autoresponse functions.
We propose a novel microscopic model which exhibits non-equilibrium critical points belonging to the Voter, DP and Ising Universality classes. We argue that our model has properties similar to the Generalized Voter Model (GVM) in its Langevin description. Finally, we study the time evolution of the width of interfaces separating different absorbing states.
The second part of this Dissertation is devoted to the applications of Machine Learning models in physical systems. First, we show that a Convolutional Neural Network (CNN) trained on configurations from the Ising model with conserved magnetization is able to find the location of the critical point. Second, using as our training dataset configurations of Ising models with conserved or non-conserved magnetization obtained in importance sampling Monte Carlo simulations, we investigate the physical properties of configurations generated by the Restricted Boltzmann Machine (RBM) model.
The first part of this research was sponsored by the US Army Research Office and was accomplished under Grant Number W911NF-17-1-0156.
The second part of this work was supported by the United States National Science Foundation through grant DMR-1606814. / Doctor of Philosophy / Physical systems with equilibrium states share common properties by which they are categorized into different universality classes. Similar to these equilibrium systems, non-equilibrium systems may obey robust scaling laws and lie in different dynamic universality classes. In the first part of this Dissertation, we investigate the dynamical properties of two important dynamic universality classes, the Directed Percolation universality class and the Generalized Voter universality class. These two universality classes include models with absorbing states. A good example of an absorbing state is found in the contact process for epidemic spreading when no infected individuals remain. We also propose a microscopic model with tunable parameters which exhibits phase transitions belonging to the Voter, Directed Percolation and Ising universality classes. To identify these universality classes, we measure specific dynamic and static quantities, such as interface density at different values of the tunable parameters and show that the physical properties of these quantities are identical to what is expected for the different universality classes.
The second part of this Dissertation is devoted to the application of Machine Learning models in physical systems. Considering physical system configurations as the input dataset for our machine learning pipeline, we extract properties of the input data through our machine learning models. As a supervised learning model, we use a deep neural network model and train it using configurations from the Ising model with conserved dynamics. Finally, we address the question of whether generative models in machine learning (models that output objects that are similar to inputs) are able to produce new configurations with properties similar to those obtained from given physical models. To this end we train a well-known generative model, the Restricted Boltzmann Machine (RBM), on Ising configurations with either conserved or non-conserved magnetization at different temperatures and study the properties of configurations generated by the RBM.
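Training configurations of an Ising model with conserved magnetization, as described above, could be generated along these lines. This is a minimal sketch assuming Kawasaki spin-exchange dynamics with Metropolis acceptance on a periodic square lattice with coupling J = 1 and k_B = 1; the abstract does not name the update rule actually used, and `kawasaki_sweep` and `magnetization` are hypothetical names.

```python
import math
import random


def local_energy(spins, L, i, j):
    """Energy of the four bonds touching site (i, j); periodic lattice, J = 1."""
    neighbours = (
        spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
        + spins[i][(j + 1) % L] + spins[i][(j - 1) % L]
    )
    return -spins[i][j] * neighbours


def magnetization(spins):
    """Total magnetization: the sum of all spins."""
    return sum(sum(row) for row in spins)


def kawasaki_sweep(spins, L, temperature, rng):
    """One sweep of L*L attempted exchanges between nearest-neighbour spins."""
    for _ in range(L * L):
        i, j = rng.randrange(L), rng.randrange(L)
        di, dj = rng.choice(((1, 0), (-1, 0), (0, 1), (0, -1)))
        k, l = (i + di) % L, (j + dj) % L
        if spins[i][j] == spins[k][l]:
            continue  # exchanging equal spins changes nothing
        # The shared bond is counted twice in both sums but is unchanged
        # by the exchange, so it cancels in the difference.
        before = local_energy(spins, L, i, j) + local_energy(spins, L, k, l)
        spins[i][j], spins[k][l] = spins[k][l], spins[i][j]
        after = local_energy(spins, L, i, j) + local_energy(spins, L, k, l)
        delta = after - before
        # Metropolis rule: keep downhill moves, keep uphill moves with
        # probability exp(-delta / T); otherwise undo the exchange.
        if delta > 0 and rng.random() >= math.exp(-delta / temperature):
            spins[i][j], spins[k][l] = spins[k][l], spins[i][j]
```

Because every move swaps two spins rather than flipping one, the total magnetization is fixed by the initial configuration. Replacing the swap with a single-spin flip would give the non-conserved case.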
|
336 |
A Multimodal Graph Convolutional Approach to Predict Genes Associated with Rare Genetic Diseases
Sahasrabudhe, Dhruva Shrikrishna 11 September 2020
There exist a large number of rare genetic diseases in humans. Our knowledge of the specific gene variants whose presence in the genome of a person predisposes them towards developing a disease, called gene associations, is incomplete. Computational tools which can predict genes which may be associated with a rare disease have great utility in healthcare. However, a majority of existing prediction algorithms require a set of already known "seed genes" to further discover novel associations for a disease. This drawback becomes more serious for rare genetic diseases, since a large proportion do not have any known gene associations. In this work, we develop an approach for disease-gene association prediction that overcomes the reliance on seed genes. Our approach uses the similarity of the observable biological characteristics of diseases (i.e., phenotypes) along with a global map of direct and indirect human protein interactions, to transfer associations from diseases whose gene associations have been discovered to diseases with no known gene associations. We formulate disease-gene association prediction over a multimodal network of diseases and genes, and develop an approach based on graph convolutional networks. We show how our model design considerations impact prediction performance. We demonstrate that our approach outperforms simpler graph machine learning and traditional machine learning approaches, as well as a competitive network propagation based approach for the task of predicting disease-gene associations. / Master of Science / There exist a large number of rare genetic diseases in humans. Our knowledge of the specific gene variants whose presence in the genome of a person predisposes them towards developing a disease, called gene associations, is incomplete. Computational tools which can predict genes which may be associated with a rare disease have great utility in healthcare.
However, a majority of existing prediction algorithms require a set of already known "seed genes" to further discover novel associations for a disease. This drawback becomes more serious for rare genetic diseases, since a large proportion do not have any known gene associations. In this work, we develop an approach for disease-gene association prediction that overcomes the reliance on seed genes. Our approach uses the similarity of the observable biological characteristics of diseases (i.e., disease phenotypes) along with a global map of direct and indirect human protein interactions, to transfer gene associations from diseases whose gene associations have been discovered, to diseases with no known associations. We implement an approach based on the field of graph machine learning, namely graph convolutional networks, to predict the genes associated with rare genetic diseases. We show how our predictor performs, compared to other approaches, and analyze some of the choices made in the design of the predictor, along with some properties of the outputs of our predictor.
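To make the graph convolutional idea concrete, one propagation step of a generic graph convolutional layer can be sketched as below. It uses the widely known symmetric normalization, ReLU(D^{-1/2} (A + I) D^{-1/2} H W), on a plain adjacency matrix. This is an assumption for illustration only: the thesis works on a multimodal disease-gene network, and its exact layer design is not given in the abstract. The function names are hypothetical.

```python
import math


def normalized_adjacency(adj):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # always >= 1 because of the self-loop
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Dense matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, features, weights):
    """Average features over each node's neighbourhood, project, apply ReLU."""
    propagated = matmul(normalized_adjacency(adj), features)
    transformed = matmul(propagated, weights)
    return [[max(0.0, x) for x in row] for row in transformed]
```

Stacking such layers lets information travel several hops, which is one way known associations of well-studied diseases could inform diseases that have none.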
|
337 |
Vehicle Detection in Deep Learning
Xiao, Yao 08 July 2019
Computer vision techniques are becoming increasingly popular. For example, face recognition is used to help police find criminals, vehicle detection is used to prevent drivers from serious traffic accidents, and written word recognition is used to convert written words into printed words. Despite the rapid development of vehicle detection driven by deep learning techniques, there are still concerns about the performance of state-of-the-art vehicle detection techniques. For example, state-of-the-art vehicle detectors are restricted by the large variation of scales. People working on vehicle detection are developing techniques to solve this problem. This thesis proposes an advanced vehicle detection model that adopts two classical neural networks: the residual neural network and the region proposal network. The model utilizes the residual neural network as a feature extractor and the region proposal network to detect the potential objects' information. / Master of Science / Computer vision techniques are becoming increasingly popular. For example, face recognition is used to help police find criminals, vehicle detection is used to prevent drivers from serious traffic accidents, and written word recognition is used to convert written words into printed words. Despite the rapid development of vehicle detection driven by deep learning techniques, there are still concerns about the performance of state-of-the-art vehicle detection techniques. For example, state-of-the-art vehicle detectors are restricted by the large variation of scales. People working on vehicle detection are developing techniques to solve this problem. This thesis proposes an advanced vehicle detection model, utilizing deep learning techniques to detect the potential objects' information.
|
338 |
Supervised Inference of Gene Regulatory Networks
Sen, Malabika Ashit 09 September 2021
A gene regulatory network (GRN) records the interactions among transcription
factors and their target genes. GRNs are useful to study how transcription factors (TFs) control
gene expression as cells transition between states during differentiation and development.
Scientists usually construct GRNs by careful examination and study of the literature. This
process is slow and painstaking and does not scale to large networks. In this thesis, we study
the problem of inferring GRNs automatically from gene expression data. Recent data-driven
approaches to infer GRNs increasingly rely on single-cell level RNA-sequencing (scRNA-seq)
data. Most of these methods rely on unsupervised or association based strategies, which
cannot leverage known regulatory interactions by design. To facilitate supervised learning,
we propose a novel graph convolutional neural network (GCN) based autoencoder to infer
new regulatory edges from a known GRN and scRNA-seq data. As the name suggests, a
GCN-based autoencoder consists of an encoder that learns a low-dimensional embedding
of the nodes (genes) in the input graph (the GRN) through a series of graph convolution
operations and a decoder that aims to reconstruct the original graph as accurately as possible.
We investigate several GCN-based architectures to determine the ideal encoder-decoder
combination for GRN reconstruction. We systematically study the performance of these
and other supervised learning methods on different mouse and human scRNA-seq datasets
for two types of evaluation. We demonstrate that our GCN-based approach substantially
outperforms traditional machine learning approaches. / Master of Science / In multi-cellular living organisms, stem cells differentiate into multiple cell types.
Proteins called transcription factors (TFs) control the activity of genes to effect these transitions.
It is possible to represent these interactions abstractly using a gene regulatory network
(GRN). In a GRN, each node is a TF or a gene and each edge connects a TF to a gene or
TF that it controls. New high-throughput technologies that can measure gene expression
(activity) in individual cells provide rich data that can be used to construct GRNs. In this
thesis, we take advantage of recent advances in the field of machine learning to develop
a new computational method for constructing GRNs. The distinguishing
property of our technique is that it is supervised, i.e., it uses experimentally-known interactions
to infer new regulatory connections. We investigate several variations of this approach
to reconstruct a GRN as close to the original network as possible. We analyze and provide
a rationale for the decisions made in designing, evaluating, and choosing the characteristics
of our predictor. We show that our predictor has a reconstruction accuracy that is superior
to other supervised-learning approaches.
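The decoder half of a graph autoencoder can be illustrated with the simplest common choice, an inner-product decoder that scores an edge between genes i and j as sigmoid(z_i · z_j), where z_i is the embedding produced by the encoder. This is a hedged sketch only: the thesis compares several encoder-decoder combinations and the abstract does not say which one was selected. The names `decode_edge_scores` and `predict_new_edges` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_edge_scores(embeddings):
    """Score every node pair by the sigmoid of the embeddings' dot product."""
    n = len(embeddings)
    return [
        [sigmoid(sum(a * b for a, b in zip(embeddings[i], embeddings[j]))) for j in range(n)]
        for i in range(n)
    ]


def predict_new_edges(embeddings, known_edges, threshold=0.5):
    """Return (i, j, score) for unordered pairs above threshold not already known."""
    scores = decode_edge_scores(embeddings)
    n = len(embeddings)
    candidates = [
        (i, j, scores[i][j])
        for i in range(n)
        for j in range(i + 1, n)
        if (i, j) not in known_edges and scores[i][j] > threshold
    ]
    # Highest-scoring candidate edges first.
    return sorted(candidates, key=lambda edge: edge[2], reverse=True)
```

An inner product is symmetric, so this sketch cannot distinguish a TF regulating a gene from the reverse direction; a decoder for directed regulatory edges would need an asymmetric scoring function.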
|
339 |
Behind the Scenes: Evaluating Computer Vision Embedding Techniques for Discovering Similar Photo Backgrounds
Dodson, Terryl Dwayne 11 July 2023
Historical photographs can generate significant cultural and economic value, but often their subjects go unidentified. However, if analyzed correctly, visual clues in these photographs can open up new directions in identifying unknown subjects. For example, many 19th century photographs contain painted backdrops that can be mapped to a specific photographer or location, but this research process is often manual, time-consuming, and unsuccessful. AI-based computer vision algorithms could be used to automatically identify painted backdrops or photographers or cluster photos with similar backdrops in order to aid researchers. However, it is unknown which computer vision algorithms are feasible for painted backdrop identification or which techniques work better than others. We present three studies evaluating four different types of image embeddings – Inception, CLIP, MAE, and pHash – across a variety of metrics and techniques. We find that a workflow using CLIP embeddings combined with a background classifier and simulated user feedback performs best. We also discuss implications for human-AI collaboration in visual analysis and new possibilities for digital humanities scholarship. / Master of Science / Historical photographs can generate significant cultural and economic value, but often their subjects go unidentified. However, if these photographs are analyzed correctly, clues in these photographs can open up new directions in identifying unknown subjects. For example, many 19th century photographs contain painted backdrops that can be mapped to a specific photographer or location, but this research process is often manual, time-consuming, and unsuccessful. Artificial Intelligence-based computer vision techniques could be used to automatically identify painted backdrops or photographers or group together photos with similar backdrops in order to aid researchers. 
However, it is unknown which computer vision techniques are feasible for painted backdrop identification or which techniques work better than others. We present three studies comparing four different types of computer vision techniques – Inception, CLIP, MAE, and pHash – across a variety of metrics. We find that a workflow that combines the CLIP computer vision technique, software that automatically classifies photo backgrounds, and simulated human feedback performs best. We also discuss implications for collaboration between humans and AI for analyzing images and new possibilities for academic research combining technology and history.
|
340 |
Towards Cyberbullying-free social media in smart cities: a unified multi-modal approach
Kumari, K., Singh, J.P., Dwivedi, Y.K., Rana, Nripendra P. 27 September 2020
Smart cities are shifting the presence of people from the physical world to the cyber world (cyberspace). Along with the facilities for societies, the troubles of the physical world, such as bullying, aggression and hate speech, are also making their presence felt in cyberspace. This paper aims to mine social media posts to identify bullying comments containing text as well as images. We propose a unified representation of text and image together to eliminate the need for separate learning modules for image and text. A single-layer Convolutional Neural Network model is used with the unified representation. The major finding of this research is that text represented as an image is a better model for encoding the information. We also found that a single-layer Convolutional Neural Network gives better results with a two-dimensional representation. In the current scenario, we used three layers of text and three layers of a colour image to represent the input, which gives a recall of 74% on the bullying class with one layer of Convolutional Neural Network. / Ministry of Electronics and Information Technology (MeitY), Government of India
|