291 |
Compact ConvNets with Ternary Weights and Binary Activations
Holesovsky, Ondrej, January 2017
Compact architectures and low-precision quantization (ternary weights and binary activations) are two methods suitable for making neural networks more efficient. We introduce a) a dithering binary activation, which improves the accuracy of ternary weight networks with binary activations by randomizing quantization error, and b) a method of implementing ternary weight networks with binary activations using binary operations. Despite these new approaches, training a compact SqueezeNet architecture with ternary weights and full-precision activations on ImageNet degrades classification accuracy significantly more than training a less compact architecture the same way. Therefore, ternary weights in their current form cannot be called the best method for reducing network size. However, the effect of weight decay on ternary weight network training should be investigated further to increase confidence in this finding.
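As an illustration of point b), the following is a minimal sketch of how a dot product between ternary weights and binary activations can be reduced to bitwise AND and bit counting. It is not taken from the thesis: the symmetric threshold rule in `ternarize`, the assumption that activations take values in {0, 1}, and all function names are hypothetical choices made for this example.

```python
def ternarize(weights, threshold):
    """Map each real weight to -1, 0, or +1 (assumed symmetric threshold rule)."""
    return [1 if w > threshold else -1 if w < -threshold else 0 for w in weights]


def pack_bits(flags):
    """Pack a list of booleans into one integer; bit i is set when flags[i] is true."""
    word = 0
    for i, flag in enumerate(flags):
        if flag:
            word |= 1 << i
    return word


def ternary_binary_dot(ternary_weights, binary_activations):
    """Dot product of {-1, 0, +1} weights with {0, 1} activations using AND + popcount."""
    positive_mask = pack_bits([w == 1 for w in ternary_weights])
    negative_mask = pack_bits([w == -1 for w in ternary_weights])
    active_mask = pack_bits([a == 1 for a in binary_activations])
    # Active inputs under +1 weights add one each; under -1 weights they subtract one.
    plus = bin(positive_mask & active_mask).count("1")
    minus = bin(negative_mask & active_mask).count("1")
    return plus - minus


# Hypothetical example values.
weights = ternarize([0.9, -0.7, 0.1, -0.05, 0.6], threshold=0.5)
activations = [1, 1, 0, 1, 1]
result = ternary_binary_dot(weights, activations)
```

Splitting the ternary weights into a positive mask and a negative mask is one common way to avoid multiplications entirely; zero weights simply appear in neither mask.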
292 |
Calibration in Eye Tracking Using Transfer Learning / Kalibrering inom Eye Tracking genom överföringsträning
Masko, David, January 2017
This thesis empirically studies transfer learning as a calibration framework for Convolutional Neural Network (CNN) based, appearance-based gaze estimation models. A dataset of approximately 1,900,000 eyestripe images distributed over 1682 subjects is used to train and evaluate several gaze estimation models. Each model is initially trained on the training data, resulting in generic gaze models. The models are subsequently calibrated for each test subject, using the subject's calibration data, by applying transfer learning through fine-tuning of the final layers of the network. Transfer learning is observed to reduce the Euclidean distance error of the generic models by 12-21%, which is in line with the current state of the art. The best performing calibrated model shows a mean error of 29.53 mm and a median error of 22.77 mm. However, calibrating heatmap output-based gaze estimation models decreases performance relative to the generic models. It is concluded that transfer learning is a viable calibration framework for improving the performance of CNN-based, appearance-based gaze estimation models.
293 |
Gland Segmentation with Convolutional Neural Networks: Validity of Stroma Segmentation as a General Approach / Konvolutionella neurala nätverk för segmentering av körtel: Validitet hos stroma-segmentering som en allmän metod
Binder, Thomas, January 2019
The analysis of glandular morphology within histopathology images is a crucial step in determining the stage of cancer. Manual annotation is a very laborious task. It is time-consuming and suffers from the subjectivity of the specialists who label the glands. One of the aims of computational pathology is developing tools to automate gland segmentation. Such an algorithm would improve the efficiency of cancer diagnosis. This is a complex task, as there is large variability in glandular morphologies and staining techniques. So far, specialised models focusing on only one organ have given promising results. This work investigated the idea of a cross-domain approximation. Unlike parenchyma, the stroma tissue that lies between the glands is similar throughout all organs in the body. Creating a model able to precisely segment the stroma would pave the way for a cross-organ model. It would be able to segment the tissue and therefore give access to the gland morphologies of different organs. To address this issue, we investigated different new and existing architectures, such as MILD-net, which is currently the best performing algorithm of the GlaS challenge. New architectures were created based on the promising U-shaped network, with Xception and ResNet used for feature extraction. These networks were trained on colon histopathology images, focusing on the glands and on the stroma. The comparison of the different results showed that this initial cross-domain approximation points in the right direction and motivates further development.
294 |
Electricity Price Forecasting Using a Convolutional Neural Network
Winicki, Elliott, 01 March 2020
Many methods have been used to forecast real-time electricity prices in various regions around the world. The problem is difficult because of market volatility affected by a wide range of exogenous variables from weather to natural gas prices, and accurate price forecasting could help both suppliers and consumers plan effective business strategies. Statistical analysis with autoregressive moving average methods and computational intelligence approaches using artificial neural networks dominate the landscape. With the rise in popularity of convolutional neural networks to handle problems with large numbers of inputs, and convolutional neural networks conspicuously lacking from current literature in this field, convolutional neural networks are used for this time series forecasting problem and show some promising results.
This document fulfills both MSEE Master's Thesis and BSCPE Senior Project requirements.
295 |
Exploration and Comparison of Image-Based Techniques for Strawberry Detection
Liu, Yongxin, 01 September 2020
Strawberry is an important cash crop in California, and its supply accounts for 80% of the US market [2]. However, in current practice, strawberries are picked manually, which is very labor-intensive and time-consuming. In addition, farmers need to hire an appropriate number of laborers to harvest the berries based on the estimated volume. Overestimating the yield wastes human resources, while underestimating it causes loss of part of the strawberry harvest [3]. Therefore, accurately estimating harvest volume in the field is important to farmers. This paper focuses on an image-based solution for detecting strawberries in the field using traditional computer vision techniques and deep learning methods.
When strawberries are in different growth stages, there are considerable differences in their color. Therefore, various color spaces are first studied in this work, and the most effective color components are used in detecting strawberries and differentiating mature and immature strawberries.
In some color channels, such as the R channel of the RGB color model, the Hue channel of the HSV color model, and the 'a' channel of the Lab color model, the pixels belonging to ripe strawberries are clearly distinguished from the background pixels. Thus, a color-based K-means clustering algorithm is used to detect red strawberries, achieving a 90.5% true-positive rate. For detecting unripe strawberries, this thesis first trained a Support Vector Machine classifier based on HOG features. After optimizing the classifier through hard negative mining, the true-positive rate reached 81.11%.
Finally, in exploring deep learning models, two detectors based on different pre-trained models were trained using the TensorFlow Object Detection API, accelerated by an Amazon Web Services GPU instance. On images of a single strawberry plant, they achieved true-positive rates of 89.2% and 92.3%, respectively, while on strawberry field images with multiple plants, they reached 85.5% and 86.3%.
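The color-based clustering step mentioned in this abstract can be sketched as a one-dimensional K-means over the values of a single color channel. This is only an illustrative sketch, not the thesis's implementation: the function name `kmeans_1d`, the evenly spaced initial centers, the fixed iteration count, and the sample channel values are all assumptions made for the example.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Cluster scalar channel values into k groups (k >= 2) with plain K-means."""
    lo, hi = min(values), max(values)
    # Assumed initialization: centers spread evenly between the min and max value.
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [
            sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)
        ]
    labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
    return centers, labels


# Hypothetical 'a'-channel values: low for background, high for red fruit pixels.
a_channel = [12, 15, 10, 60, 65, 58, 14, 62]
centers, labels = kmeans_1d(a_channel, k=2)
```

With two clusters, the cluster whose center has the higher channel value would be treated as the candidate "red" pixels in this sketch; a real pipeline would operate on full images and likely on more than one channel.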
296 |
Attacking Computer Vision Models Using Occlusion Analysis to Create Physically Robust Adversarial Images
Loh, Jacobsen, 01 June 2020
Self-driving cars rely on their sense of sight to function effectively in chaotic and uncontrolled environments. Thanks to recent developments in computer vision, specifically convolutional neural networks, autonomous vehicles have developed the ability to see at or above human-level capabilities, which in turn has allowed for rapid advances in self-driving cars. Unfortunately, much like humans being confused by simple optical illusions, convolutional neural networks are susceptible to simple adversarial inputs. As there is no overlap between the optical illusions that fool humans and the adversarial examples that threaten convolutional neural networks, little is understood as to why these adversarial examples dupe such advanced models and what effective mitigation techniques might exist to resolve these issues.
This thesis focuses on these adversarial images. By extending existing work, this thesis is able to offer a unique perspective on adversarial examples. Furthermore, these extensions are used to develop a novel attack that can generate physically robust adversarial examples. These physically robust instances provide a unique challenge as they transcend both individual models and the digital domain, thereby posing a significant threat to the efficacy of convolutional neural networks and their dependent applications.
297 |
Exploration of Semi-supervised Learning for Convolutional Neural Networks
Sheffler, Nicholas, 01 March 2023
Training a neural network requires a large amount of labeled data that has to be created by either human annotation or by very specifically created methods. Currently, there is a vast abundance of unlabeled data that is neglected sitting on servers, hard drives, websites, etc. These untapped data sources serve as the inspiration for this paper.
The goal of this thesis is to explore and test various methods of semi-supervised learning (SSL) for convolutional neural networks (CNN). These methods will be analyzed and evaluated based on their accuracy on a test set of data. Since this particular neural network will be used to offer paths for an autonomous robot, it is important for the networks to be lightweight in scale. This paper will then take this assortment of smaller neural networks and run them through a variety of semi-supervised training methods. The first method is to have a teacher model that is trained on properly labeled data create labels for unlabeled data and add this to the training set for the next student model. From this base method, a few variations were tried in the hopes of getting a significant improvement. The first variation tested by this thesis is the effects of having this teacher and student cycle run more than one iteration. After that, the effects of using the confidence values that the models produced were explored by both including only data with confidence above a certain value and in a different test, relabeling data below a confidence threshold. The last variation this thesis explored was to have two teacher models concurrently and have the combination of those two models decide on the proper label for the unlabeled data. Through exploration and testing, these methods are evaluated in the results section as to which one produces the best results for SSL.
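The confidence-based variations described above can be sketched in a few lines. The sketch below assumes that each teacher outputs a list of per-class probabilities for every unlabeled sample; the function names, the threshold values, and the choice to combine two teachers by averaging their probabilities are hypothetical, since the abstract does not state how the two teachers' outputs are combined.

```python
def select_pseudo_labels(class_probabilities, threshold=0.9):
    """Return (sample_index, label) pairs whose top confidence meets the threshold."""
    selected = []
    for index, probs in enumerate(class_probabilities):
        label = max(range(len(probs)), key=lambda c: probs[c])
        if probs[label] >= threshold:
            selected.append((index, label))
    return selected


def average_teachers(probs_a, probs_b):
    """Combine two teachers by averaging their class probabilities (an assumption)."""
    return [
        [(pa + pb) / 2 for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(probs_a, probs_b)
    ]


# Hypothetical teacher outputs for three unlabeled samples and two classes.
teacher_a = [[0.95, 0.05], [0.6, 0.4], [0.1, 0.9]]
teacher_b = [[0.9, 0.1], [0.2, 0.8], [0.3, 0.7]]

single_teacher = select_pseudo_labels(teacher_a, threshold=0.9)
two_teachers = select_pseudo_labels(average_teachers(teacher_a, teacher_b), threshold=0.75)
```

The selected pairs would then be appended to the labeled training set for the next student model; samples that fall below the threshold stay unlabeled in this sketch.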
298 |
Neural Network Based Diagnosis of Breast Cancer Using the BreakHis Dataset
Dalke, Ross E, 01 June 2022
Breast cancer is the most common type of cancer in the world, and it is the second deadliest cancer for females. In the fight against breast cancer, early detection plays a large role in saving people’s lives. In this work, an image classifier is designed to diagnose breast tumors as benign or malignant. The classifier is designed with a neural network and trained on the BreakHis dataset. After creating the initial design, a variety of methods are used to try to improve the performance of the classifier. These methods include preprocessing, increasing the number of training epochs, changing network architecture, and data augmentation. Preprocessing includes changing image resolution and trying grayscale images rather than RGB. The tested network architectures include VGG16, ResNet50, and a custom structure. The final algorithm creates 50 classifier models and keeps the best one. Classifier designs are primarily judged on the classification accuracies of their best model and their median model. Designs are also judged on how consistently they produce their highest performing models. The final classifier design has a median accuracy of 93.62% and best accuracy of 96.35%. Of the 50 models generated, 46 of them performed with over 85% accuracy. The final classifier design is compared to the works of two groups of researchers who created similar classifiers for the same dataset. This will show that the classifier performs at the same level or better than the classifiers designed by other researchers. The classifier achieves similar performance to the classifier made by the first group of researchers and performs better than the classifier from the second. Finally, the learned lessons and future steps are discussed.
299 |
Object Tracking in Games Using Convolutional Neural Networks
Venkatesh, Anirudh, 01 June 2018
Computer vision research has been growing rapidly over the last decade. Recent advancements in the field have been widely used in staple products across various industries. The automotive and medical industries have even pushed cars and equipment into production that use computer vision. However, there seems to be a lack of computer vision research in the game industry. With the advent of e-sports, competitive and casual gaming have reached new heights with regard to players, viewers, and content creators. This has allowed for avenues of research that did not exist prior.
In this thesis, we explore the practicality of object detection as applied in games. We designed a custom convolutional neural network detection model, SmashNet. The model was improved through classification weights generated from pre-training on the Caltech101 dataset with an accuracy of 62.29%. It was then trained on 2296 annotated frames from the competitive 2.5-dimensional fighting game Super Smash Brothers Melee to track coordinate locations of 4 specific characters in real-time. The detection model performs at a 68.25% accuracy across all 4 characters. In addition, as a demonstration of a practical application, we designed KirbyBot, a black-box adaptive bot which performs basic commands reactively based only on the tracked locations of two characters. It also collects very simple data on player habits. KirbyBot runs at a rate of 6-10 fps.
Object detection has several practical applications with regard to games, ranging from better AI design, to collecting data on player habits or game characters for competitive purposes or improvement updates.
300 |
Quantum ReLU activation for Convolutional Neural Networks to improve diagnosis of Parkinson’s disease and COVID-19
Parisi, Luca; Neagu, Daniel; Ma, R.; Campean, Felician, 17 September 2021
This study introduces a quantum-inspired computational paradigm to address the unresolved problem of Convolutional Neural Networks (CNNs) using the Rectified Linear Unit (ReLU) activation function (AF), i.e., the ‘dying ReLU’. This problem impacts the accuracy and the reliability of image classification in critical applications, such as healthcare. The proposed approach builds on the classical ReLU and Leaky ReLU, applying the quantum principles of entanglement and superposition at a computational level to derive two novel AFs, respectively the ‘Quantum ReLU’ (QReLU) and the ‘modified-QReLU’ (m-QReLU). The proposed AFs were validated when coupled with a CNN using seven image datasets on classification tasks involving the detection of COVID-19 and Parkinson’s Disease (PD). The out-of-sample/test classification accuracy and reliability (precision, recall and F1-score) of the CNN were compared against those of the same classifier when using nine classical AFs, including ReLU-based variations. Findings indicate higher accuracy and reliability for the CNN when using either QReLU or m-QReLU on five of the seven datasets evaluated. Whilst retaining the best classification accuracy and reliability for handwritten digit recognition on the MNIST dataset (ACC = 99%, F1-score = 99%), avoiding the ‘dying ReLU’ problem via the proposed quantum AFs improved recognition of PD-related patterns from spiral drawings, especially with the QReLU, which achieved the highest classification accuracy and reliability (ACC = 92%, F1-score = 93%). Therefore, with this increased accuracy and reliability, QReLU and m-QReLU can aid critical image classification tasks, such as diagnoses of COVID-19 and PD.
The authors declare that this was the result of a HEIF 2020 University of Bradford COVID-19 response-funded project ‘Quantum ReLU-based COVID-19 Detector: A Quantum Activation Function for Deep Learning to Improve Diagnostics and Prognostics of COVID-19 from Non-ionising Medical Imaging’. However, the funding source was not involved in conducting the study and/or preparing the article.
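For readers unfamiliar with the ‘dying ReLU’ problem named in this abstract, the short sketch below shows why it arises with the classical ReLU and how Leaky ReLU avoids it. It covers only these two classical functions; it does not implement QReLU or m-QReLU, whose definitions are not given in the abstract. The function names and the slope `alpha = 0.01` are illustrative assumptions.

```python
def relu(x):
    """Classical ReLU: passes positive inputs, outputs zero otherwise."""
    return x if x > 0 else 0.0


def relu_grad(x):
    """Gradient of ReLU: zero for every non-positive input."""
    return 1.0 if x > 0 else 0.0


def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: keeps a small slope alpha for non-positive inputs."""
    return x if x > 0 else alpha * x


def leaky_relu_grad(x, alpha=0.01):
    """Gradient of Leaky ReLU: never exactly zero."""
    return 1.0 if x > 0 else alpha


# Hypothetical pre-activations of one unit that is negative for every sample.
pre_activations = [-2.0, -0.5, -3.1]

# With ReLU the unit receives no gradient at all, so its weights stop updating.
relu_signal = sum(relu_grad(z) for z in pre_activations)
# With Leaky ReLU a small gradient still flows, so the unit can recover.
leaky_signal = sum(leaky_relu_grad(z) for z in pre_activations)
```

A unit whose gradient is zero on every training sample can never leave that state through gradient descent, which is the failure mode that ReLU variants aim to prevent.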