181
CLASSIFYING ANXIETY BASED ON A VOICE RECORDING USING LEARNING ALGORITHMS
Sherlock, Oscar; Rönnbäck, Olle (January 2022)
Anxiety is becoming more and more common. Seeking help to evaluate your anxiety can, first of all, take a long time; secondly, many of the tests are self-report assessments that can give incorrect results. It has been shown that several voice characteristics are affected in people with anxiety. Knowing this, we had the idea that an algorithm could be developed to classify the degree of anxiety based on a person's voice. Our goal is for the developed algorithm to be used alongside today's evaluation methods to increase the validity of anxiety evaluation; in our opinion, it would give a more objective result than self-report assessments. In this thesis we answer questions such as “Is it possible to classify anxiety based on a speech recording?”, as well as whether deep learning algorithms perform better than machine learning algorithms on such a task. To answer the research questions we compiled a data set containing samples of people speaking with varying degrees of anxiety applied to their voices. We then implemented two algorithms to classify the samples in our data set: a machine learning algorithm (ANN) with manual feature extraction, and a deep learning model (CNN) with automatic feature extraction. The performance of the two models was compared, and we concluded that the ANN was the better algorithm. The models were evaluated with 5-fold cross-validation using an 80/20 data split; each fold runs for 100 epochs, so each model is trained for a total of 500 epochs. For every fold the accuracy, precision, and recall are calculated, and from these we derive further metrics such as sensitivity and specificity to compare the models. The ANN model performed considerably better than the CNN model on every metric measured: accuracy, sensitivity, precision, F1-score, recall, and specificity.
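A minimal sketch of the evaluation protocol described in this abstract: 5-fold cross-validation with per-fold accuracy, precision, recall (sensitivity), specificity, and F1-score. The data, labels, and `model_factory` are hypothetical placeholders, not the thesis code.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

def cross_validate(model_factory, X, y, n_splits=5):
    """Binary classification, e.g. anxious (1) vs. non-anxious (0)."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    folds = []
    for train_idx, test_idx in skf.split(X, y):   # each fold is an 80/20 split
        model = model_factory()                   # fresh model per fold
        model.fit(X[train_idx], y[train_idx])     # e.g. 100 epochs inside fit
        pred = model.predict(X[test_idx])
        tn, fp, fn, tp = confusion_matrix(y[test_idx], pred).ravel()
        folds.append({
            "accuracy":    accuracy_score(y[test_idx], pred),
            "precision":   precision_score(y[test_idx], pred),
            "recall":      recall_score(y[test_idx], pred),  # = sensitivity
            "specificity": tn / (tn + fp),
            "f1":          f1_score(y[test_idx], pred),
        })
    # average each metric over the folds
    return {k: np.mean([f[k] for f in folds]) for k in folds[0]}
```

It could be called, for instance, as `cross_validate(lambda: MLPClassifier(max_iter=100), X, y)` with any scikit-learn-style classifier.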
182
Applications of machine learning
Yuen, Brosnan (01 September 2020)
In this thesis, machine learning algorithms were applied to electrocardiogram (ECG) analysis, spectral analysis, and Field Programmable Gate Arrays (FPGAs). In ECG, QRS complexes are useful for measuring the heart rate and for segmenting ECG signals. QRS complexes were detected using WaveletCNN autoencoder filters and ConvLSTM detectors: the WaveletCNN autoencoder filters the ECG signals using wavelet filters, while the ConvLSTM detects the spatio-temporal patterns of the QRS complexes. In spectral analysis, detecting chemical compounds is useful for identifying unknown substances, but spectral analysis algorithms require vast amounts of data. To solve this problem, B-spline neural networks were developed to generate infrared and ultraviolet/visible spectra, allowing large training datasets to be produced from a few experimental measurements. Graphics Processing Units (GPUs) are good for training and testing neural networks, but using multiple GPUs together is hard because the PCIe bus is not well suited to scatter and reduce operations. FPGAs are more flexible, as they can be arranged in mesh, toroid, or hypercube configurations on the PCB; these configurations provide higher data throughput and result in faster computations. A general neural network framework was written in VHDL for Xilinx FPGAs, allowing any neural network to be trained or tested on FPGAs.
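A minimal sketch of a ConvLSTM-style per-sample QRS detector in the spirit of the description above; the window length, time-step split, and layer sizes are illustrative assumptions, not the thesis architecture (`ConvLSTM1D` requires TensorFlow 2.6 or later).

```python
import tensorflow as tf

def build_qrs_detector(window=256, steps=8):
    # each 256-sample ECG window is split into 8 time steps of 32 samples
    inp = tf.keras.Input(shape=(steps, window // steps, 1))
    # ConvLSTM layers capture spatio-temporal structure of the waveform
    x = tf.keras.layers.ConvLSTM1D(16, kernel_size=5, padding="same",
                                   return_sequences=True)(inp)
    x = tf.keras.layers.ConvLSTM1D(8, kernel_size=5, padding="same",
                                   return_sequences=True)(x)
    # merge the time steps back into one 256-sample axis
    x = tf.keras.layers.Reshape((window, 8))(x)
    # per-sample probability that the sample belongs to a QRS complex
    out = tf.keras.layers.Conv1D(1, 1, activation="sigmoid")(x)
    return tf.keras.Model(inp, out)

model = build_qrs_detector()
model.compile(optimizer="adam", loss="binary_crossentropy")
```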
183
Monitorovací systém laboratória založený na detekcii tváre (Laboratory Monitoring System Based on Face Detection)
Gvizd, Peter (January 2019)
In recent decades there has been fundamental development in technologies for face detection and identification supported by computer vision. Algorithm optimization has reached the point where face detection is possible on mobile devices. This work first analyses commonly used algorithms for face detection and identification, for instance Haar features, LBP, EigenFaces, and FisherFaces. It then focuses on more up-to-date approaches, such as convolutional neural networks and Google's FaceNet. The goal of this work is the design and subsequent implementation of an automated monitoring system for a laboratory based on the aforementioned algorithms. Within the design of the monitoring system, the algorithms are compared with each other, and their success rates and possible application in the final solution are evaluated.
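As a minimal sketch of the classical detection step that such a system compares against CNN-based approaches, OpenCV's bundled Haar-cascade frontal-face model can be applied to a camera frame; the file names here are illustrative.

```python
import cv2

# load OpenCV's pretrained frontal-face Haar cascade
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

frame = cv2.imread("lab_camera_frame.jpg")        # hypothetical input frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)    # cascades work on grayscale
faces = cascade.detectMultiScale(gray, scaleFactor=1.1,
                                 minNeighbors=5, minSize=(60, 60))
for (x, y, w, h) in faces:                        # one box per detected face
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detections.jpg", frame)
```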
184
Word Recognition in Nutrition Labels with Convolutional Neural Network
Khasgiwala, Anuj (01 August 2018)
Nowadays everyone is busy trying to balance work and family life as working hours keep increasing, and in such a hurried life people often ignore or pay too little attention to a healthy diet. An important part of a healthy diet is awareness of nutritional information and an understanding of how different foods and nutrients affect our bodies. In the USA and many other countries, nutritional information is conveyed to consumers primarily through nutrition labels (NLs), found on all packaged food products in the form of a nutrition table. However, it can be challenging to use this information even for health-conscious consumers, who may be unfamiliar with nutritional terms and find it hard to relate the information to their daily activities for lack of time, motivation, or training. It is therefore worthwhile to automate this gathering and interpretation of information with machine-learning-based algorithms that extract nutritional information from NLs, since this enhances the consumer's ability to engage in continuous nutritional information gathering and analysis.
185
Adversarial Framework with Temperature as a Regularizer for Semantic Segmentation
Kim, Chanho (14 January 2022)
Semantic segmentation processes RGB scenes and classifies pixels collectively as objects. Recent deep learning methods have shown promising results in both the accuracy and the speed of semantic segmentation. However, deep learning models inevitably overfit the training data because of the data-centric nature of the approach.
Numerous regularization methods exist to overcome overfitting, such as data augmentation, additional loss terms (e.g., Euclidean or least-squares penalties), and structural methods that add or modify layers, like Dropout and DropConnect. Among these, penalizing a model via an additional loss or a weight constraint does not increase memory use.
With this in mind, our work aims to improve a given segmentation model using temperatures and a lightweight discriminator. Temperature generates different versions of the probability maps by dividing the logits in the softmax calculation. On top of the probability maps produced with different temperatures, we attach a simple discriminator after the segmentation network to set up a competition between ground-truth feature maps and modified feature maps, and we pass the additional loss calculated from those probability maps back into the principal network.
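A minimal sketch of the temperature mechanism just described: dividing the logits before the softmax produces sharper or smoother probability maps from the same segmentation output. The tensor shapes and temperature values are illustrative, not those of the thesis.

```python
import torch
import torch.nn.functional as F

def tempered_probability_map(logits: torch.Tensor, temperature: float = 1.0):
    """logits: (batch, classes, H, W) raw segmentation scores."""
    return F.softmax(logits / temperature, dim=1)  # divide logits, then softmax

logits = torch.randn(1, 19, 64, 64)                # e.g. 19 semantic classes
p_sharp = tempered_probability_map(logits, 0.5)    # T < 1: peakier map
p_smooth = tempered_probability_map(logits, 2.0)   # T > 1: softer map
```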
Our contribution consists of two parts. First, we use the adversarial loss as the regularization loss in segmentation networks and validate that it can substitute for the L2 regularization loss, with better validation results. Second, we apply temperatures to the segmentation probability maps to provide different information without using additional convolutional layers.
The experiments indicate that a spiking temperature in the generator, while keeping the original probability map in the discriminator, improves the model in terms of pixel accuracy and mean Intersection-over-Union (mIoU). Our framework shows that the segmentation model can be improved with only a small increase in training time and in the number of parameters.
186
The clash between two worlds in human action recognition: supervised feature training vs Recurrent ConvNet
Raptis, Konstantinos (28 November 2016)
Indiana University-Purdue University Indianapolis (IUPUI)
Action recognition has been an active research topic for over three decades, with applications such as surveillance, human-computer interaction, and content-based retrieval. Recent research focuses on datasets of movies, web videos, and TV shows. The nature of these datasets makes action recognition very challenging due to scene variability and complexity: background clutter, occlusions, viewpoint changes, fast irregular motion, and a large spatio-temporal search space (articulation configurations and motions). The use of local space-time image features shows promising results while avoiding cumbersome and often inaccurate frame-by-frame segmentation (boundary estimation). We focus on two state-of-the-art methods for the action classification problem: dense trajectories and recurrent neural networks (RNNs). Dense trajectories use typical supervised training (e.g., with Support Vector Machines) of features such as 3D-SIFT, extended SURF, HOG3D, and local trinary patterns; the main idea is to densely sample these features in each frame and track them through the sequence based on optical flow. The deep neural network, on the other hand, uses the input frames to detect action and produce part proposals, i.e., to estimate information on body parts (shapes and locations). We compare these two approaches, indicative of what is used today, qualitatively and numerically, and describe our conclusions with respect to accuracy and efficiency.
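A minimal sketch of the dense-trajectory tracking step mentioned above: points sampled on a regular grid are advected with dense optical flow between consecutive frames. This illustrates only the tracking idea, not the full descriptor pipeline; the grid spacing and flow parameters are assumptions.

```python
import cv2
import numpy as np

def track_dense_points(prev_gray, next_gray, step=8):
    """prev_gray, next_gray: consecutive grayscale frames, shape (H, W)."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = prev_gray.shape
    # densely sample points on a regular grid
    ys, xs = np.mgrid[step // 2:h:step, step // 2:w:step]
    pts = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float32)
    # displace each point by the flow vector at its location
    moved = pts + flow[pts[:, 1].astype(int), pts[:, 0].astype(int)]
    return pts, moved   # trajectory segments from frame t to frame t+1
```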
187
Real-Time Video Object Detection with Temporal Feature Aggregation
Chen, Meihong (05 October 2021)
In recent years, various high-performance networks have been proposed for single-image object detection, so an obvious choice is to design a video detection network based on state-of-the-art single-image detectors. However, video object detection remains challenging due to the lower quality of individual frames in a video, and hence the need to include temporal information for high-quality detection results. In this thesis, we design a novel interleaved architecture combining a 2D convolutional network and a 3D temporal network, using Yolov3 as the base detector. To exploit inter-frame information, we propose feature aggregation based on a temporal network that uses Appearance-Preserving 3D convolution (AP3D) to extract aligned features in the temporal dimension. Our multi-scale detector and multi-scale temporal network communicate at each scale and also across scales. The temporal network takes 4, 8, or 16 input frames; we name the corresponding variants TemporalNet-4, TemporalNet-8, and TemporalNet-16. Our approach achieves 77.1% mAP (mean Average Precision) on the ImageNet VID 2017 dataset with TemporalNet-4, while TemporalNet-16 achieves 80.9% mAP, a competitive result on this video object detection benchmark. Our network is also real-time, with a running time of 35 ms/frame.
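A minimal sketch of the temporal-aggregation idea: per-frame 2D feature maps from the base detector are stacked along a time axis and fused by a 3D convolution into a single enhanced feature map. The channel count, frame count, and layer shape are illustrative assumptions, not the thesis network.

```python
import torch
import torch.nn as nn

class TemporalAggregator(nn.Module):
    """Fuse T per-frame feature maps into one map for detection."""
    def __init__(self, channels=256, frames=4):
        super().__init__()
        # kernel spans all T frames; spatial padding preserves H x W
        self.fuse = nn.Conv3d(channels, channels,
                              kernel_size=(frames, 3, 3), padding=(0, 1, 1))

    def forward(self, feats):                 # feats: (B, C, T, H, W)
        return self.fuse(feats).squeeze(2)    # -> (B, C, H, W)

agg = TemporalAggregator(channels=256, frames=4)  # a "TemporalNet-4"-style fuser
clip_feats = torch.randn(2, 256, 4, 38, 38)       # 4 stacked frame features
fused = agg(clip_feats)                           # torch.Size([2, 256, 38, 38])
```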
188
Deep Learning based 3D Image Segmentation Methods and Applications
Chen, Yani (05 June 2019)
No description available.
189
Multiple Drone Detection and Acoustic Scene Classification with Deep Learning
Vemula, Hari Charan (January 2018)
No description available.
190
Detecting Image Forgery with Color Phenomenology
Stanton, Jamie Alyssa (30 May 2019)
No description available.