421 |
Accelerated Deep Learning using Intel Xeon Phi - Viebke, André, January 2015
Deep learning, a sub-topic of machine learning inspired by biology, has attracted wide attention in industry and the research community in recent years. State-of-the-art applications in computer vision and speech recognition (among others) are built using deep learning algorithms. In contrast to traditional algorithms, where the developer fully instructs the application what to do, deep learning algorithms instead learn from experience when performing a task. For the algorithm to learn, however, it requires training, which is computationally demanding. High Performance Computing can ease this burden through parallelization, reducing the training time; this is essential to fully utilize the algorithms in practice. Numerous works targeting GPUs have investigated ways to speed up training; less attention has been paid to the Intel Xeon Phi coprocessor. In this thesis we present a parallelized implementation of a Convolutional Neural Network (CNN), a deep learning architecture, together with our proposed parallelization scheme, CHAOS. Additionally, a theoretical analysis and a performance model discuss the algorithm in detail and allow for predictions should even more threads become available in the future. The algorithm is evaluated on an Intel Xeon Phi 7120p, a Xeon E5-2695v2 2.4 GHz, and a Core i5 661 3.33 GHz using various architectures and thread counts on the MNIST dataset. Findings show 103.5x, 99.9x, and 100.4x speedups for the large, medium, and small architectures, respectively, for 244 threads compared to 1 thread on the coprocessor, and a 10.9x-14.1x (large to small) speedup compared to the sequential version running on the Xeon E5. We managed to decrease training time from 7 days on the Core i5 and 31 hours on the Xeon E5 to 3 hours on the Intel Xeon Phi when training our large network for 15 epochs.
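The abstract does not state the performance model's form, but a common first-order way to predict speedup at higher thread counts is Amdahl's law. The Python sketch below fits a parallel fraction from the reported 103.5x speedup at 244 threads and extrapolates; it is illustrative only and is not the thesis's actual model:

    def amdahl_speedup(p, n):
        # Predicted speedup with parallel fraction p on n threads (Amdahl's law).
        return 1.0 / ((1.0 - p) + p / n)

    # Fit p from the reported large-architecture result: 103.5x on 244 threads.
    # Solving 103.5 = 1 / ((1 - p) + p / 244) for p gives:
    n, s = 244, 103.5
    p = (1.0 - 1.0 / s) / (1.0 - 1.0 / n)  # roughly 0.994

    # Extrapolate to hypothetical future thread counts.
    for threads in (244, 512, 1024):
        print(f"{threads:5d} threads -> predicted speedup {amdahl_speedup(p, threads):6.1f}x")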
|
422 |
Reconnaissance de l'émotion thermique / Thermal Emotion Recognition - Fu, Yang, 05 1900
To improve human-computer interactions in the areas of healthcare, e-learning, and video games, many researchers have studied recognizing emotions from text, speech, facial expressions, or electroencephalography (EEG) signals. Among these, emotion recognition using EEG has achieved satisfying accuracy. However, wearing electroencephalography devices limits the range of user movement, so a noninvasive method is required to facilitate emotion detection and its applications. We therefore proposed using a thermal camera to capture skin temperature changes and then applying machine learning algorithms to classify emotion changes accordingly. This thesis contains two studies on thermal emotion detection, both compared against EEG-based emotion detection. The first was to find thermal emotion detection profiles in comparison with EEG-based emotion detection technology; the second was to implement an application with deep machine learning algorithms to visually display the accuracy and performance of both thermal and EEG-based emotion detection. In the first study, we applied a Hidden Markov Model (HMM) to thermal emotion recognition and, after comparing with EEG-based emotion detection, identified emotion-related features of skin temperature in terms of intensity and rapidity. In the second study, we implemented an emotion detection application supporting both thermal and EEG-based emotion detection, applying the deep machine learning methods Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM). The accuracy of thermal image based emotion detection reached 52.59%, and the accuracy of EEG-based detection reached 67.05%. In further work, we will research tuning the machine learning algorithms to improve thermal emotion detection accuracy.
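Neither study's network architecture is specified in the abstract; purely as an illustration, a minimal CNN classifier for fixed-size thermal frames could look like the following Keras sketch, where the input size, class count, and layer widths are all assumptions:

    from tensorflow.keras import layers, models

    # Assumed setup: 64x64 single-channel thermal frames, 3 emotion classes.
    model = models.Sequential([
        layers.Input(shape=(64, 64, 1)),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(3, activation="softmax"),  # one output per emotion class
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()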
|
423 |
Climate Translators: Broadcast News' Contribution to the Political Divide over Climate Change in the United States - Macy, Dylan V, 01 January 2020
In many instances, television news is the primary outlet through which people gain knowledge about climate change. Both the perceived threat of climate change and American news media have grown politically divided since the 1980s. I argue that American news media influences the partisan divide over climate change, and that beyond the political landscape of news media, the focus on political events and figures in climate coverage further contributes to that divide. These claims are supported by research showing that climate change news is processed in a partisan manner, and by three case study periods in which climate change coverage spiked on MSNBC, CNN, and Fox News over the last twenty years (2000-2019). I collected news footage for all three case studies from the online database archive.org. Using this footage, an accompanying documentary short was produced focusing on the Paris Climate Accord withdrawal in 2017. Across the documentary and the three case study periods, Fox News maintained a consistently hands-off and dismissive tone toward climate change, while MSNBC and CNN incorporated climate science into their coverage and advocated for collective climate action. I report that news media is selected and processed along partisan lines by viewers; these case studies illustrate the ways in which news media drives the political divide on climate change. I conclude by offering ways future climate coverage could be more unifying, such as greater emphasis on the economic benefits of “a green economy” in news coverage.
|
424 |
AI on the Edge with CondenseNeXt: An Efficient Deep Neural Network for Devices with Constrained Computational Resources - Kalgaonkar, Priyank B., 08 1900
Indiana University-Purdue University Indianapolis (IUPUI) / The research work presented in this thesis proposes a neoteric variant of deep convolutional neural network architecture, CondenseNeXt, designed specifically for ARM-based embedded computing platforms with constrained computational resources. CondenseNeXt is an improved version of CondenseNet, the baseline architecture whose roots can be traced back to ResNet. CondenseNeXt replaces the group convolutions in CondenseNet with depthwise separable convolutions and introduces group-wise pruning, a model compression technique, to remove redundant and insignificant elements that are either irrelevant or do not affect the performance of the network. Cardinality, a new dimension in addition to the existing spatial dimensions, and a class-balanced focal loss function, a weighting factor inversely proportional to the number of samples, have been incorporated into the design of CondenseNeXt's algorithm to relieve the harsh effects of pruning. Furthermore, extensive analyses of this novel CNN architecture were performed on three benchmark image datasets: CIFAR-10, CIFAR-100, and ImageNet, by deploying the trained weights onto an ARM-based embedded computing platform, the NXP BlueBox 2.0, for real-time image classification. The outputs are observed in real time in RTMaps Remote Studio's console to verify the correctness of the predicted classes. CondenseNeXt achieves state-of-the-art image classification performance on the three benchmark datasets, including CIFAR-10 (4.79% top-1 error), CIFAR-100 (21.98% top-1 error), and ImageNet (7.91% single-model, single-crop top-5 error), with up to a 59.98% reduction in forward FLOPs compared to CondenseNet. CondenseNeXt can also achieve a final trained model size of 2.9 MB, albeit at the cost of a 2.26% loss in accuracy, thus performing image classification on ARM-based computing platforms efficiently without requiring CUDA-enabled GPU support.
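CondenseNeXt's full layer configuration is not given in the abstract, but the depthwise separable convolution it substitutes for group convolution is a standard building block: a per-channel spatial filter followed by a 1x1 channel-mixing convolution. A minimal PyTorch sketch, with placeholder channel counts:

    import torch
    import torch.nn as nn

    class DepthwiseSeparableConv(nn.Module):
        """A 3x3 depthwise convolution followed by a 1x1 pointwise convolution."""
        def __init__(self, in_ch, out_ch, stride=1):
            super().__init__()
            # Depthwise: one spatial filter per input channel (groups=in_ch).
            self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                       padding=1, groups=in_ch, bias=False)
            # Pointwise: 1x1 convolution mixes information across channels.
            self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
            self.bn = nn.BatchNorm2d(out_ch)
            self.act = nn.ReLU(inplace=True)

        def forward(self, x):
            return self.act(self.bn(self.pointwise(self.depthwise(x))))

    x = torch.randn(1, 32, 56, 56)           # one 32-channel feature map
    y = DepthwiseSeparableConv(32, 64)(x)    # -> torch.Size([1, 64, 56, 56])
    print(y.shape)

This factorization uses fewer parameters and multiply-accumulates than a dense 3x3 convolution, one source of the FLOP reduction reported above.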
|
425 |
Word2vec modely s přidanou kontextovou informací / Word2vec Models with Added Context Information - Šůstek, Martin, January 2017
This thesis is concerned with the explanation of word2vec models. Even though word2vec was introduced recently (2013), many researchers have already tried to extend, understand, or at least use the model, because it provides surprisingly rich semantic information. This information is encoded in an N-dimensional vector representation and can be recalled by performing algebraic operations on the vectors. In addition, I suggest model modifications in order to obtain different word representations; to achieve that, I use public picture datasets. This thesis also includes parts dedicated to a word2vec extension based on convolutional neural networks.
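The algebraic recall mentioned above is typically demonstrated with analogy queries of the form king - man + woman ≈ queen. A sketch using the gensim library (the toy corpus is an assumption, so the query result only demonstrates the API, not a meaningful analogy):

    from gensim.models import Word2Vec

    # Toy corpus (an assumption); a real model needs a large corpus.
    sentences = [["king", "rules", "the", "kingdom"],
                 ["queen", "rules", "the", "kingdom"],
                 ["man", "walks"], ["woman", "walks"]]
    model = Word2Vec(sentences, vector_size=50, min_count=1, epochs=100)

    # Analogy query: vector("king") - vector("man") + vector("woman") ~ ?
    print(model.wv.most_similar(positive=["king", "woman"],
                                negative=["man"], topn=3))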
|
426 |
Využití hlubokého učení pro rozpoznání textu v obrazu grafického uživatelského rozhraní / Deep Learning for OCR in GUI - Hamerník, Pavel, January 2019
Optical character recognition (OCR) has been a topic of interest for many years. It is defined as the process of digitizing a document image into a sequence of characters. Despite decades of intense research, building OCR systems with capabilities comparable to those of humans remains an open challenge. This work presents the design and implementation of such a system, capable of detecting text in graphical user interfaces.
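The thesis's own deep learning pipeline is not described in the abstract; as a hedged, off-the-shelf illustration of the same task, text can be pulled from a GUI screenshot with OpenCV preprocessing and the Tesseract engine via pytesseract (the file name is hypothetical, and this is not the system the thesis implements):

    import cv2
    import pytesseract

    # Hypothetical input: a GUI screenshot; requires the Tesseract binary installed.
    img = cv2.imread("screenshot.png")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Otsu thresholding often helps with anti-aliased GUI fonts.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Extract the recognized text and word-level bounding boxes.
    text = pytesseract.image_to_string(binary)
    data = pytesseract.image_to_data(binary, output_type=pytesseract.Output.DICT)
    print(text)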
|
428 |
Analýza zvukových nahrávek pomocí hlubokého učení / Deep learning based sound records analysis - Kramář, Denis, January 2021
This master's thesis deals with the problem of audio classification of chainsaw logging sounds in a natural environment, using mainly convolutional neural networks. First, the theory of graphical representation of audio signals is discussed. The following part is devoted to the machine learning area. The third chapter surveys existing work dealing with this problem. Within the practical part, the dataset used and the tested neural networks are presented. The final results are compared in terms of achieved accuracy and ROC curves. The robustness of the presented solutions was tested with the proposed detection program and evaluated using objective criteria.
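The "graphical representation of an audio signal" that convolutional networks usually consume is a spectrogram. A minimal librosa sketch (the file name and parameter values are assumptions):

    import librosa
    import numpy as np

    # Hypothetical recording; librosa resamples to 22050 Hz here.
    y, sr = librosa.load("forest_recording.wav", sr=22050)

    # Log-scaled mel spectrogram: the 2D "image" input typical for audio CNNs.
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=2048,
                                         hop_length=512, n_mels=128)
    log_mel = librosa.power_to_db(mel, ref=np.max)
    print(log_mel.shape)  # (128, number_of_frames)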
|
429 |
Visual Transformers for 3D Medical Images Classification: Use-Case Neurodegenerative Disorders - Khorramyar, Pooriya, January 2022
A Neurodegenerative Disease (ND) involves progressive damage to brain neurons, which the human body cannot repair or replace. Well-known examples of such conditions are Dementia and Alzheimer's Disease (AD), which affect millions of lives each year. Despite extensive research, there are no effective treatments for these diseases today; however, early diagnosis is crucial in disease management. Diagnosing NDs is challenging for neurologists and requires years of training and experience, so there has been a trend to harness the power of deep learning, including state-of-the-art Convolutional Neural Networks (CNNs), to assist doctors in diagnosing such conditions using brain scans. CNN models lead to promising results, comparable to those of experienced neurologists. However, the advent of transformers in the Natural Language Processing (NLP) domain and their outstanding performance persuaded Computer Vision (CV) researchers to adapt them to various CV tasks in multiple areas, including the medical field. This research aims to develop Vision Transformer (ViT) models using the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset to classify NDs. More specifically, the models classify three categories (Cognitively Normal (CN), Mild Cognitive Impairment (MCI), and Alzheimer's Disease (AD)) using brain Fluorodeoxyglucose (18F-FDG) Positron Emission Tomography (PET) scans. We also take advantage of the Automated Anatomical Labeling (AAL) brain atlas and attention maps to develop explainable models. We propose three ViTs, the best of which obtains an accuracy of 82% on the test dataset with the help of transfer learning. We encode the AAL brain atlas information into the best-performing ViT, so the model outputs the predicted label, the region most critical to its prediction, and an attention map overlaid on the input scan with the crucial areas highlighted. Furthermore, we develop two CNN models with 2D and 3D convolutional kernels as baselines to classify NDs, which achieve accuracies of 77% and 73%, respectively, on the test dataset. We also conduct a study of the importance of brain regions and their combinations in classifying NDs using ViTs and the AAL brain atlas. / This thesis was awarded a prize of 50,000 SEK by Getinge Sterilization for projects within Health Innovation.
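As an illustration of the core ViT idea applied to volumetric scans, the PyTorch sketch below splits a 3D volume into non-overlapping patches and projects each into a token embedding, the step that precedes the transformer encoder; all sizes are assumptions, not the thesis's configuration:

    import torch
    import torch.nn as nn

    class PatchEmbed3D(nn.Module):
        """Split a 3D volume into non-overlapping patches, one token each."""
        def __init__(self, patch=16, in_ch=1, dim=256):
            super().__init__()
            # A strided Conv3d is equivalent to patchify + linear projection.
            self.proj = nn.Conv3d(in_ch, dim, kernel_size=patch, stride=patch)

        def forward(self, x):
            x = self.proj(x)                     # (B, dim, D', H', W')
            return x.flatten(2).transpose(1, 2)  # (B, num_tokens, dim)

    vol = torch.randn(1, 1, 96, 96, 96)   # one single-channel volume, sizes assumed
    tokens = PatchEmbed3D()(vol)
    print(tokens.shape)                   # torch.Size([1, 216, 256]) = 6*6*6 patches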
|
430 |
VISUAL DETECTION OF PERSONAL PROTECTIVE EQUIPMENT & SAFETY GEAR ON INDUSTRY WORKERS - Strand, Fredrik; Karlsson, Jonathan, January 2022
Workplace injuries are common in today's society due to a lack of adequately worn safety equipment. A system that admits only appropriately equipped personnel can be created to improve working conditions and worker safety. The goal is thus to develop a system that improves construction workers' safety. Building such a system necessitates computer vision, which entails object recognition, facial recognition, and human recognition, among other things. The basic idea is to first detect the human and remove the background, to speed up the process and avoid potential interference. After that, the cropped image is subjected to facial and object recognition. The code is written in Python and includes libraries such as OpenCV, face_recognition, and CVZone. Among the algorithms chosen were YOLOv4 and Histogram of Oriented Gradients (HOG). The results were measured at distances of three and five meters. As a result of the system's pipeline, algorithms, and software, a mean average precision of 99% and 89% was achieved at the respective distances. At both three and five meters, the model achieved a precision of 100%. Recall rates were 96%-100% at 3 m and 54%-100% at 5 m. Finally, the frame rate was measured at 1.2 fps on a system without a GPU.
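The detect-then-crop step can be illustrated with OpenCV's built-in HOG pedestrian detector, one of the algorithm families the abstract names; the file name and downstream recognition steps are assumptions:

    import cv2

    # OpenCV ships a HOG descriptor with a pre-trained pedestrian SVM.
    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

    frame = cv2.imread("worker.jpg")  # hypothetical input frame
    rects, weights = hog.detectMultiScale(frame, winStride=(8, 8), scale=1.05)

    # Crop each detected person; the crops would then go on to face and
    # safety-gear recognition (e.g., YOLOv4), as the abstract describes.
    crops = [frame[y:y + h, x:x + w] for (x, y, w, h) in rects]
    print(f"detected {len(crops)} person(s)")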
|