11

Self-supervised monocular image depth learning and confidence estimation

Chen, L., Tang, W., Wan, Tao Ruan, John, N.W. 17 June 2020 (has links)
No / We present a novel self-supervised framework for monocular image depth learning and confidence estimation. Our framework reduces the amount of ground-truth annotation data required for training Convolutional Neural Networks (CNNs), which is often a challenging problem for the fast deployment of CNNs in many computer vision tasks. Our DepthNet adopts a novel, fully differentiable patch-based cost function built on the Zero-Mean Normalized Cross Correlation (ZNCC), using multi-scale patches as its matching and learning strategy. This approach greatly increases the accuracy and robustness of depth learning. Because ZNCC is a normalized measure of similarity, the patch-based cost naturally yields a 0-to-1 score that can be read as the confidence of the depth estimate; this score is used to self-supervise the training of a parallel network for confidence map learning and estimation. The confidence network therefore also learns in a self-supervised manner and runs in parallel with DepthNet. Evaluations on the KITTI depth prediction benchmark and the Make3D dataset show that our method outperforms state-of-the-art results.
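The ZNCC-based patch cost at the heart of this idea can be illustrated with a small sketch. The following is a minimal NumPy version, assuming grayscale patches and an explicit per-pixel loop; it is not the thesis implementation, only an illustration of how a ZNCC score in [-1, 1] can be rescaled into a 0-to-1 confidence.

```python
import numpy as np

def zncc(patch_a: np.ndarray, patch_b: np.ndarray, eps: float = 1e-6) -> float:
    """Zero-Mean Normalized Cross Correlation between two equally sized patches.

    Returns a value in [-1, 1]; values near 1 indicate a strong match."""
    a = patch_a.astype(np.float64) - patch_a.mean()
    b = patch_b.astype(np.float64) - patch_b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum()) + eps
    return float((a * b).sum() / denom)

def patch_confidence(left: np.ndarray, right_warped: np.ndarray, size: int = 7) -> np.ndarray:
    """Per-pixel confidence map from ZNCC over local patches (borders left at 0)."""
    h, w = left.shape
    half = size // 2
    conf = np.zeros((h, w))
    for y in range(half, h - half):
        for x in range(half, w - half):
            p = left[y - half:y + half + 1, x - half:x + half + 1]
            q = right_warped[y - half:y + half + 1, x - half:x + half + 1]
            conf[y, x] = (zncc(p, q) + 1.0) / 2.0  # map [-1, 1] -> [0, 1]
    return conf
```

In the actual framework the cost would be evaluated densely at multiple patch scales inside the training loss rather than with an explicit Python loop.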
12

Squeeze and Excite Residual Capsule Network for Embedded Edge Devices

Naqvi, Sami 08 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / During recent years, the field of computer vision has evolved rapidly. Convolutional Neural Networks (CNNs) have become the default choice for implementing computer vision tasks. Their popularity rests on how successfully CNNs have performed well-known computer vision tasks such as image annotation and instance segmentation with promising outcomes. However, CNNs have their caveats and need further research to turn them into reliable machine learning algorithms. The disadvantages of CNNs become more evident once one looks at how they break down an input image: they group blobs of pixels to identify objects, which makes them incapable of decomposing the input into sub-parts that would distinguish the orientation and transformation of objects and their parts. The functions in a CNN are competent at learning only the shift-invariant features of the object in an image. These limitations give researchers and developers a reason to keep improving algorithms for computer vision. Several distinct approaches explore this opportunity, each tackling a different set of issues in the convolutional neural network's architecture. The Capsule Network (CapsNet) brings an innovative approach to resolving issues pertaining to affine transformations by sharing transformation matrices between the different levels of capsules, while the Residual Network (ResNet) introduced skip connections, which allow deeper networks to be more powerful and mitigate the vanishing-gradient problem. Motivated by fusing these advantageous ideas of CapsNet and ResNet with the Squeeze-and-Excite (SE) block from the Squeeze-and-Excite Network, this research work presents the SE-Residual Capsule Network (SE-RCN), an efficient neural network model. The proposed model replaces the traditional convolutional layers of CapsNet with skip connections and SE blocks to lower the complexity of CapsNet. The performance of the model is demonstrated on the well-known MNIST and CIFAR-10 datasets, and a substantial reduction in the number of training parameters is observed in comparison to similar neural networks. The proposed SE-RCN uses 6.37 million parameters and reaches 99.71% accuracy on the MNIST dataset, and 10.55 million parameters with 83.86% accuracy on CIFAR-10.
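As a rough illustration of the building blocks named above, here is a minimal PyTorch sketch of a Squeeze-and-Excite block attached to a residual (skip-connected) branch. The layer sizes and reduction ratio are assumptions for illustration, not the SE-RCN architecture itself.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excite: global average pool (squeeze), then a small bottleneck
    MLP produces per-channel weights (excite) that rescale the input."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))   # squeeze: (b, c)
        return x * w.view(b, c, 1, 1)     # excite: channel-wise rescale

class SEResidualBlock(nn.Module):
    """Residual block with an SE block on the main path and a skip connection."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        self.se = SEBlock(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(x + self.se(self.conv(x)))
```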
13

Deep face recognition using imperfect facial data

Elmahmudi, Ali A.M., Ugail, Hassan 27 April 2019 (has links)
Yes / Today, computer-based face recognition is a mature and reliable technology that is widely used in access control scenarios. As such, face recognition or authentication is predominantly performed using 'perfect' data: full frontal facial images. In reality, however, there are numerous situations where full frontal faces are not available; the imperfect face images that often come from CCTV cameras are a case in point. Hence, computer-based face recognition using partial facial data as probes remains a largely unexplored area of research. Given that humans and computers perform face recognition inherently differently, it is intriguing to understand how a computer favours various parts of the face when presented with the challenge of face recognition. In this work, we explore face recognition using partial facial data. We apply novel experiments to test the performance of machine learning using partial faces and other manipulations of face images, such as rotation and zooming, which we use as training and recognition cues. In particular, we study the rate of recognition subject to various parts of the face, such as the eyes, mouth, nose and cheek. We also study the effect of facial rotation on recognition, as well as the effect of zooming out of the facial images. Our experiments use a state-of-the-art convolutional neural network architecture, with the pre-trained VGG-Face model used to extract features for machine learning. We then use two classifiers, cosine similarity and linear support vector machines, to test the recognition rates. We ran our experiments on two publicly available datasets: the controlled Brazilian FEI dataset and the uncontrolled LFW dataset. Our results show that individual parts of the face, such as the eyes, nose and cheeks, have low recognition rates, though the recognition rate quickly goes up when combinations of facial parts are presented as probes.
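A minimal sketch of the cosine-similarity classification stage is shown below, assuming features have already been extracted (for example with a pre-trained VGG-Face model); the gallery structure and function names are illustrative assumptions, not the thesis code.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two feature vectors (e.g. deep face descriptors)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def identify(probe: np.ndarray, gallery: dict[str, np.ndarray]) -> str:
    """Return the gallery identity whose enrolled feature is most similar to the probe.

    `probe` could be a feature from a partial face (eyes, nose, cheek, ...),
    while `gallery` maps identity labels to features from full frontal images."""
    return max(gallery, key=lambda name: cosine_similarity(probe, gallery[name]))
```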
14

Deep Learning for Taxonomy Prediction

Ramesh, Shreyas 04 June 2019 (has links)
The last decade has seen great advances in Next-Generation Sequencing technologies, and, as a result, there has been a rise in the number of genomes sequenced each year. In 2017, as many as 10,000 new organisms were sequenced and added to the RefSeq database. Taxonomy prediction is a science involving the hierarchical classification of DNA fragments down to the rank of species. In this research, we introduce Predicting Linked Organisms, or Plinko for short. Plinko is a fully functioning, state-of-the-art predictive system that accurately captures DNA-taxonomy relationships where other state-of-the-art algorithms falter. Plinko leverages multi-view convolutional neural networks and the pre-defined taxonomy tree structure to improve multi-level taxonomy prediction. In the Plinko strategy, each network takes advantage of different word usage patterns corresponding to different levels of evolutionary divergence. Plinko has the advantages of relatively low storage and GPGPU-parallel training and inference, making the solution portable and scalable with anticipated genome database growth. To the best of our knowledge, Plinko is the first to use multi-view convolutional neural networks as the core algorithm in a compositional, alignment-free approach to taxonomy prediction. / Master of Science / Taxonomy prediction is a science involving the hierarchical classification of DNA fragments down to the rank of species. Given the diversity of species on Earth, taxonomy prediction becomes challenging with (i) an increasing number of species (labels) to classify and (ii) a decreasing input (DNA) size. In this research, we introduce Predicting Linked Organisms, or Plinko for short. Plinko is a fully functioning, state-of-the-art predictive system that accurately captures DNA-taxonomy relationships where other state-of-the-art algorithms falter. Three major challenges in taxonomy prediction are (i) large dataset sizes (on the order of 10^9 sequences), (ii) large label spaces (on the order of 10^3 labels), and (iii) low-resolution inputs (100 base pairs or less). Plinko leverages multi-view convolutional neural networks and the pre-defined taxonomy tree structure to improve multi-level taxonomy prediction for hard-to-classify sequences under the three conditions stated above. Plinko has the advantage of a relatively low storage footprint, making the solution portable and scalable with anticipated genome database growth. To the best of our knowledge, Plinko is the first to use multi-view convolutional neural networks as the core algorithm in a compositional, alignment-free approach to taxonomy prediction.
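The compositional, alignment-free idea can be illustrated with a small sketch that turns a DNA fragment into several k-mer frequency "views", one per word length; in a multi-view setup each view would feed its own network branch. The choice of k values here is an assumption for illustration, not the configuration used by Plinko.

```python
import numpy as np
from itertools import product

def kmer_count_view(seq: str, k: int) -> np.ndarray:
    """Normalized k-mer frequency vector (length 4**k) for one DNA fragment."""
    alphabet = "ACGT"
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    counts = np.zeros(len(index))
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in index:          # skip k-mers containing N or other symbols
            counts[index[kmer]] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts

def multi_view(seq: str, ks=(3, 4, 5)) -> list[np.ndarray]:
    """One compositional 'view' per word length k."""
    return [kmer_count_view(seq, k) for k in ks]
```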
15

Measuring the Functionality of Amazon Alexa and Google Home Applications

Wang, Jiamin 01 1900 (has links)
A Voice Personal Assistant (VPA) is a software agent that interprets the user's voice commands and responds with appropriate information or actions. Users operate a VPA by voice to complete many tasks, such as reading messages, ordering coffee, sending email, or checking the news. Although this technology brings interesting and useful features, it also poses new privacy and security risks. Current research has focused on proof-of-concept attacks that point out potential ways of launching attacks, e.g., crafting hidden voice commands that trigger malicious actions without the user noticing, or fooling the VPA into invoking the wrong application. However, the lack of a comprehensive understanding of the functionality of skills and their commands prevents us from systematically analyzing the potential threats of these attacks. In this project, we developed convolutional neural networks with active learning, together with a keyword-based approach, to investigate commands according to their capability (information retrieval or action injection) and sensitivity (sensitive or non-sensitive). Through these two levels of analysis, we provide a complete view of VPA skills and their susceptibility to existing attacks. / M.S. / A Voice Personal Assistant (VPA) is a software agent that interprets users' voice commands and responds with appropriate information or actions. The current popular VPAs are Amazon Alexa, Google Home, Apple Siri and Microsoft Cortana. Developers can build and publish third-party applications, called skills on Amazon Alexa and actions on Google Home, on the VPA server. Users simply "talk" to the VPA devices to complete different tasks, like reading messages, ordering coffee, sending email, or checking the news. Although this technology brings interesting and useful features, it also poses new security threats. Recent research has revealed vulnerabilities in VPA ecosystems: users can incorrectly invoke a malicious skill whose name is pronounced similarly to the intended skill, and inaudible voice commands can trigger unintended actions without the user noticing. Existing research has focused on potential ways of launching such attacks. The lack of a comprehensive understanding of the functionality of skills and their commands prevents us from systematically analyzing the potential consequences of these attacks. In this project, we carried out an extensive analysis of third-party applications for Amazon Alexa and Google Home to characterize the attack surface. First, we developed a convolutional neural network with an active learning framework to categorize commands according to their capability, i.e., whether they are information-retrieval or action-injection commands. Second, we employed a keyword-based approach to classify the commands into sensitive and non-sensitive classes. Through these two levels of analysis, we provide a complete view of VPA skills' functionality and their susceptibility to existing attacks.
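A toy sketch of the keyword-based side of this analysis is shown below; the keyword lists and the heuristic for action-injection commands are illustrative assumptions, not the vocabularies or rules used in the project.

```python
# Illustrative keyword lists; the actual vocabularies used in the project are not shown here.
SENSITIVE_KEYWORDS = {"password", "pin", "address", "credit", "card", "email",
                      "phone", "unlock", "pay", "order", "transfer"}
ACTION_VERBS = {"order", "send", "buy", "open", "unlock", "turn", "set", "pay"}

def is_action_injection(command: str) -> bool:
    """Rough capability heuristic: imperatives that change state count as action injection."""
    words = command.lower().split()
    return bool(words) and words[0] in ACTION_VERBS

def is_sensitive(command: str) -> bool:
    """Keyword-based sensitivity check over a lower-cased token set."""
    tokens = set(command.lower().replace(",", " ").split())
    return bool(tokens & SENSITIVE_KEYWORDS)

print(is_sensitive("send my credit card number"), is_action_injection("order a coffee"))
```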
16

Towards Explainable Decision-making Strategies of Deep Convolutional Neural Networks : An exploration into explainable AI and potential applications within cancer detection

Hammarström, Tobias January 2020 (has links)
The influence of Artificial Intelligence (AI) on society is increasing, with applications in highly sensitive and complicated areas. Examples include using deep convolutional neural networks within healthcare for diagnosing cancer. However, the inner workings of such models are often unknown, limiting the much-needed trust in the models. To combat this, Explainable AI (XAI) methods aim to provide explanations of the models' decision-making. Two such methods, Spectral Relevance Analysis (SpRAy) and Testing with Concept Activation Vectors (TCAV), were evaluated on a deep learning model classifying cat and dog images that contained introduced artificial noise. The task was to assess the methods' capability to explain the importance of the introduced noise for the learnt model. The task was constructed as an exploratory step, with the future aim of using the methods on models diagnosing oral cancer. In addition to using the TCAV method as introduced by its authors, this study also utilizes the CAV-sensitivity to introduce and perform a sensitivity magnitude analysis. Both methods proved useful in discerning between the model's two decision-making strategies, based on either the animal or the noise, although greater insight into the intricacies of these strategies is desired. Additionally, the methods provided a deeper understanding of the model's learning, as the model did not seem to properly distinguish between the noise and the animal conceptually. The methods thus accentuated the limitations of the model, thereby increasing our trust in its abilities. In conclusion, the methods show promise for the task of detecting visually distinctive noise in images, which could extend to other distinctive features present in more complex problems. Consequently, more research should be conducted on applying these methods to more complex areas with specialized models and tasks, e.g. oral cancer.
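The CAV-sensitivity and the sensitivity magnitude analysis mentioned above can be sketched as follows, assuming that gradients of the class score with respect to a chosen layer's activations, and a concept activation vector for that layer, are already available; this is a simplified illustration, not the TCAV reference implementation.

```python
import numpy as np

def cav_sensitivity(grad_activations: np.ndarray, cav: np.ndarray) -> np.ndarray:
    """Directional derivative of the class score along the concept activation vector.

    `grad_activations`: (n_examples, n_units) gradients of the class logit with
    respect to a layer's activations; `cav`: vector separating concept examples
    from random examples in that layer (e.g. a linear classifier's weights)."""
    return grad_activations @ (cav / np.linalg.norm(cav))

def tcav_score(sensitivities: np.ndarray) -> float:
    """Fraction of examples whose sensitivity to the concept is positive."""
    return float((sensitivities > 0).mean())

def sensitivity_magnitude(sensitivities: np.ndarray) -> float:
    """Mean absolute sensitivity, a simple magnitude summary of the kind explored here."""
    return float(np.abs(sensitivities).mean())
```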
17

Interiörs påverkan på lägenheters pris och värdering / The effects of interior condition on price and evaluation of real estate

Hemmingsson, Jesper, Häusler Redhe, Adrian January 2021 (has links)
Property valuations have historically been carried out by real estate agents or other experts in the field. With the growing number of online valuation tools, the question arises of how well these tools perform and what can be done to improve them. When a valuation is made with such tools, modern methods start from sales statistics for similar properties, using various forms of metadata such as size, location and year of construction. This study explores the possibility of using interior condition as a variable in apartment valuations by training Convolutional Neural Networks to classify the interior condition of apartments in Stockholm, and by examining the relationship between the interior condition and the error term produced by Booli's valuation algorithm. The classification models were trained on condition data collected from apartment owners together with 200 associated listing images obtained from Booli. The study shows a statistically significant relationship between interior condition and the valuation error: valuations of high-condition apartments tend on average to be 3% too low, while those of low-condition apartments are on average 3% too high. The results indicate that interior condition could be used as a variable to reduce the error of Booli's valuation algorithm, although the experiment in this work did not manage to reduce the error term to any great extent. / Housing evaluations have historically been made in person by real estate agents or other experts. With the growth of online evaluation tools, the question arises of how well they perform and what can be done to improve them. Modern approaches use sales data for similar housing when evaluating a certain house or apartment, with the variables mainly being different forms of metadata such as living area, location and year of construction. This study explores the possibility of using interior condition as a variable in housing evaluations by training Convolutional Neural Networks to classify the condition of kitchens and bathrooms for apartments in Stockholm, Sweden, and by testing the relationship between said conditions and the error of Booli's evaluation algorithm. The classification models were trained on crowd-sourced condition data and 200 advertisement images provided by Booli. The study finds a statistically significant relationship between interior condition and the evaluation error: evaluations tend on average to be 3% too low for high-condition apartments and 3% too high for low-condition apartments. The results indicate that including interior condition as a variable might reduce the error of Booli's evaluation algorithm; however, the experiment in this study did not manage to do so to any sizeable extent.
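A minimal sketch of how the relationship between interior condition and the valuation error term could be examined is shown below; the condition encoding and the toy numbers are assumptions for illustration, not the thesis data or its statistical tests.

```python
import numpy as np

def relative_error(estimate: np.ndarray, sold_price: np.ndarray) -> np.ndarray:
    """Signed relative valuation error: negative means the estimate was too low."""
    return (estimate - sold_price) / sold_price

def mean_error_by_condition(errors: np.ndarray, condition: np.ndarray) -> dict[int, float]:
    """Average error per interior-condition class (e.g. 0 = low, 1 = medium, 2 = high)."""
    return {int(c): float(errors[condition == c].mean()) for c in np.unique(condition)}

# Toy usage with made-up numbers (not the thesis data):
est = np.array([2.9e6, 3.1e6, 4.2e6])
sold = np.array([3.0e6, 3.0e6, 4.0e6])
cond = np.array([2, 1, 0])
print(mean_error_by_condition(relative_error(est, sold), cond))
```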
18

Interpretation of Swedish Sign Language using Convolutional Neural Networks and Transfer Learning

Halvardsson, Gustaf, Peterson, Johanna January 2020 (has links)
The automatic interpretation of signs of a sign language involves image recognition. An appropriate approach for this task is to use deep learning, and in particular, Convolutional Neural Networks. This method typically needs large amounts of data to perform well. Transfer learning could therefore be a feasible approach to achieve high accuracy despite using a small data set. The hypothesis of this thesis is that transfer learning works well for interpreting the hand alphabet of Swedish Sign Language. The goal of the project is to implement a model that can interpret signs, as well as to build a user-friendly web application for this purpose. The final testing accuracy of the model is 85%. Since this accuracy is comparable to those reported in other studies, the project's hypothesis is supported. The final network is based on the pre-trained model InceptionV3 with five frozen layers, and the optimization algorithm mini-batch gradient descent with a batch size of 32 and a step-size factor of 1.2. Transfer learning is used, but not to the extent that the network becomes too specialized in the pre-trained model and its data. The network has been shown to be unbiased on diverse test data sets. Suggestions for future work include integrating dynamic signing data to interpret words and sentences, evaluating the method on another sign language's hand alphabet, and integrating dynamic interpretation in the web application so that several letters or words can be interpreted in sequence. In the long run, this research could benefit deaf people who have access to technology and promote good health, quality education, decent work and reduced inequalities. / The automatic interpretation of signs in a sign language involves image recognition. A suitable approach for this task is deep learning, and more specifically, Convolutional Neural Networks. This method generally needs large amounts of data to perform well, so transfer learning can be a reasonable method for reaching high accuracy despite a small amount of data. The hypothesis of the thesis is to evaluate whether transfer learning works for interpreting the hand alphabet of Swedish Sign Language. The goal of the project is to implement a model that can interpret signs, and to build a user-friendly web application for this purpose. The model classifies 85% of the test instances correctly. Since this accuracy is comparable to those from other studies, it indicates that the project's hypothesis holds. The final network is based on the pre-trained model InceptionV3 with five frozen layers, and the optimization algorithm mini-batch gradient descent with a batch size of 32 and a step factor of 1.2. Transfer learning was used, but not to the level where the network became too specialized in the pre-trained model and its data. The network has proved to be unbiased on the diverse test data set. Suggestions for future work include integrating dynamic sign data to interpret words and sentences, evaluating the method on other sign languages' hand alphabets, and integrating dynamic interpretation in the web application so that several letters or words can be interpreted one after another. In the long run, this study could benefit deaf people who have access to technology, and thereby increase the chances of good health, quality education, decent work and reduced inequalities.
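A hedged Keras sketch of the transfer-learning setup described above (pre-trained InceptionV3, the first five layers frozen, mini-batch SGD with batch size 32) is shown below; the number of output classes, the learning rate and the data pipeline are assumptions for illustration, not the thesis configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 26          # assumption: one class per hand-alphabet letter
IMG_SIZE = (299, 299)     # InceptionV3's default input size

base = tf.keras.applications.InceptionV3(
    include_top=False, weights="imagenet", input_shape=IMG_SIZE + (3,), pooling="avg")
for layer in base.layers[:5]:      # freeze the first five layers
    layer.trainable = False

model = models.Sequential([
    base,
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),  # learning rate is an assumption
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(x_train, y_train, batch_size=32, epochs=10, validation_split=0.1)
```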
19

A Deep Learning Approach to Advertisement Detection in Newspapers / Detektion av annonser i Nyhetstidningar med hjälp av djupinlärning

Jonsson, Patrick January 2022 (has links)
Retrieving specific information from newspapers can be a difficult task due to differences in their design, layout, imagery, and typography. Using newspapers from different publishers that are archived at the National Library of Sweden, this thesis aims to train a deep learning model that is able to detect and classify advertisements. Experiments are performed to see how well the models generalize to different publishers, and to a time period that is nearby but outside the time period in which the models were trained. Results from the experiments show that a CNN can detect and classify advertisements to a high degree. Models were found to perform particularly well on data from the same publisher and time period as they were trained on. Performance losses were generally observed when models were tested on other publishers or on another time period than the training data. Further drops in performance were seen when models were tested on a combination of both a different publisher and a different time period. / Retrieving specific information from digitally stored newspapers can be a difficult challenge. This is partly due to the newspapers' varying designs, but also to their use of imagery and typography. In this work, newspapers from different publishers archived at the National Library of Sweden (Kungliga Biblioteket) are used to train machine learning models with the goal of detecting advertisements in newspapers. Experiments are also carried out to investigate how well the trained models generalize to other publishers, and how they generalize to a different time period than the one the models were trained on. The results of the experiments show that a CNN can detect and classify advertisements to a high degree. Models performed best on newspapers from the same time period and publisher they were trained on. Generalization tests showed lower performance when models were tested on other time periods and publishers, particularly when tested on a combination of both a different publisher and a different time period.
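Detection quality in this kind of experiment is commonly scored by the overlap between predicted and annotated advertisement boxes; the following intersection-over-union sketch is a generic illustration of that primitive, not the thesis's evaluation code, and the 0.5 threshold is an assumption.

```python
def iou(box_a: tuple, box_b: tuple) -> float:
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

# A predicted advertisement box counts as correct if it overlaps an annotated box enough:
print(iou((10, 10, 110, 60), (20, 15, 120, 70)) >= 0.5)
```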
20

Age Prediction in Breast Cancer Risk Stratification : Additive Value of Age Prediction on Healthy Mammography Images in Breast Cancer Risk Models

Peterson, Johanna January 2022 (has links)
Breast cancer is the most common cancer type for women worldwide. Early detection is key to improving prognosis and treatment success. A cost-efficient way of finding breast cancer early is mammography screening on a population basis. A major issue with mammography screening is cancers that appear between screening rounds. One method of targeting this issue is breast cancer risk stratification based on healthy mammography scans; however, this method is as of today insufficient. One proposed addition to refine risk stratification is to use artificial-intelligence-guided age prediction. The aim of this study was to investigate to what extent age prediction adds value to breast cancer risk stratification. Convolutional Neural Networks (CNNs) were used to train a model on an age prediction task using healthy mammography scans from the Cohort of Screen-Aged Women. The predicted ages and delta ages, calculated as predicted age minus chronological age, were then added to a logistic regression task together with, and without, the known risk factor mammographic density. The results showed an increase in breast cancer detection for some age groups with the risk model incorporating age prediction. This suggests that age prediction using CNNs might increase breast cancer detection. More studies are needed to confirm these findings. / Breast cancer is the most common cancer type for women globally. Early detection is a key factor in improving prognosis and treatment success. A cost-effective way to find breast cancer early is population-wide mammography screening. One problem with this screening is cancers that arise between screening rounds. One method of addressing this problem is risk stratification, which aims to estimate the risk of developing cancer from healthy mammography images, but this method is currently insufficient. One proposed addition to refine its results is to use artificial-intelligence-guided age prediction. The aim of this study was to investigate to what extent there is an additive value of age prediction in modelling the risk of developing breast cancer. Convolutional Neural Networks (CNNs) were used to train an age prediction model on healthy mammography images from the Cohort of Screen-Aged Women. The predicted ages and the delta ages, calculated as predicted age minus chronological age, were then used as input to a logistic regression task together with, and without, the known risk factor mammographic density. The results showed increased breast cancer detection for some age groups when the risk model included the delta ages. This suggests that age prediction with CNNs may increase breast cancer detection. More studies are needed to confirm these findings.
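A small sketch of the delta-age feature and the logistic-regression risk model described above is shown below, using scikit-learn and made-up numbers; the feature set, values and labels are illustrative assumptions, not the cohort data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def delta_age(predicted_age: np.ndarray, chronological_age: np.ndarray) -> np.ndarray:
    """Delta age: CNN-predicted age minus chronological age."""
    return predicted_age - chronological_age

# Toy cohort with made-up numbers (not the study's data):
chronological = np.array([52.0, 61.0, 47.0, 68.0])
predicted = np.array([55.0, 58.0, 50.0, 69.0])   # ages predicted from healthy mammograms
density = np.array([0.30, 0.12, 0.45, 0.20])     # mammographic density risk factor
y = np.array([1, 0, 1, 0])                       # 1 = later breast cancer diagnosis (illustrative)

# Risk model with the age-prediction features added alongside density:
X = np.column_stack([chronological, predicted, delta_age(predicted, chronological), density])
risk_model = LogisticRegression().fit(X, y)
print(risk_model.predict_proba(X)[:, 1])         # per-woman risk scores
```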
