121.
Convolutional Neural Networks for Predicting Blood Glucose Levels from Nerve Signals. Say, Daniel; Spang Dyhrberg Nielsen, Frederik. January 2024.
Convolutional Neural Networks (CNNs) have traditionally been used for image analysis and computer vision and are known for their ability to detect complex patterns in data. This report studies an application of CNNs within bioelectronic medicine, namely predicting blood glucose levels using nerve signals. Nerve signals and blood glucose levels were measured on a mouse before and after administration of glucose injections. The nerve signals were measured by placing 16 voltage-measuring electrodes on the vagus nerve of the mouse. The obtained nerve signal data was segmented into time intervals of 5 ms and aligned with the corresponding glucose measurements. Two LeNet-5 based CNN architectures, one 1-dimensional and one 2-dimensional, were implemented and trained on the data. Evaluation of the models’ performance was based on the mean squared error, the mean absolute error, and the R2-score of a simple moving average over the dataset. Both models had promising performance with an R2-score of above 0.92, suggesting a strong correlation between nerve signals and blood glucose levels. The difference in performance between the 1-dimensional and 2-dimensional model was insignificant. These results highlight the potential of using CNNs in bioelectronic medicine for prediction of physiological parameters from nerve signal data.
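The preprocessing described above, segmenting a 16-channel nerve-signal recording into 5 ms windows and pairing each window with the nearest glucose reading, can be sketched as follows. The 40 kHz sampling rate is an assumption for illustration; the thesis does not state it.

```python
import numpy as np

FS = 40_000                      # samples per second (assumed)
WINDOW = int(0.005 * FS)         # 5 ms -> 200 samples per window

def segment_and_align(signal, glucose_times, glucose_values):
    """signal: (n_samples, 16) array; glucose_*: sparse measurements."""
    n_windows = signal.shape[0] // WINDOW
    windows = signal[: n_windows * WINDOW].reshape(n_windows, WINDOW, 16)
    # Time stamp of each window's center, in seconds.
    centers = (np.arange(n_windows) + 0.5) * WINDOW / FS
    # Align each window with the glucose measurement closest in time.
    idx = np.abs(centers[:, None] - glucose_times[None, :]).argmin(axis=1)
    return windows, glucose_values[idx]

signal = np.random.randn(40_000, 16)   # 1 s of fake 16-electrode data
g_times = np.array([0.2, 0.8])         # two glucose readings (s)
g_values = np.array([5.1, 7.4])        # mmol/L
X, y = segment_and_align(signal, g_times, g_values)
print(X.shape, y.shape)                # (200, 200, 16) (200,)
```

Each `(200, 16)` window would then be fed to the 1-dimensional or 2-dimensional CNN as one training example, with the aligned glucose value as its regression target.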
122.
Enhanced Neural Network Training Using Selective Backpropagation and Forward Propagation. Bendelac, Shiri. 22 June 2018.
Neural networks are making headlines every day as the tool of the future, powering artificial intelligence programs and supporting technologies never seen before. However, the training of neural networks can take days or even weeks for bigger networks, and requires the use of supercomputers and GPUs in academia and industry in order to achieve state-of-the-art results. This thesis discusses employing selective measures to determine when to backpropagate and forward propagate in order to reduce training time while maintaining classification performance. This thesis tests these new algorithms on the MNIST and CASIA datasets, and achieves successful results with both algorithms on the two datasets. The selective backpropagation algorithm shows a reduction of up to 93.3% in backpropagations completed, and the selective forward propagation algorithm shows a reduction of up to 72.9% in forward propagations and backpropagations completed, compared to baseline runs of always forward propagating and backpropagating. This work also discusses employing the selective backpropagation algorithm on a modified dataset with disproportional under-representation of some classes compared to others. / Master of Science / Neural networks are some of the most commonly used and best-performing tools in machine learning. However, training them to perform well is a tedious task that can take days or even weeks, since bigger networks perform better but take exponentially longer to train. What can be done to reduce training time? Imagine a student studying for a test. The student likely solves practice problems that cover the different topics that may be covered on the test. The student then evaluates which topics he/she knew well, and forgoes extensive practice and review on those in favor of focusing on topics he/she missed or was not as confident on.
This thesis discusses following a similar approach in training neural networks in order to reduce their training time needed to achieve desired performance levels.
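The selection idea can be sketched in a few lines: after a forward pass, only samples the network is still unsure about are sent backward, and the rest skip backpropagation entirely. The confidence-threshold criterion below is an illustrative stand-in; the thesis's exact selection rule may differ.

```python
import numpy as np

def select_for_backprop(probs, labels, threshold=0.9):
    """probs: (batch, classes) softmax outputs; labels: (batch,) ints.
    Returns a boolean mask of samples that should be backpropagated."""
    confidence_on_truth = probs[np.arange(len(labels)), labels]
    return confidence_on_truth < threshold

probs = np.array([[0.95, 0.05],   # confidently correct -> skip backprop
                  [0.55, 0.45],   # uncertain           -> backprop
                  [0.10, 0.90]])  # confidently wrong   -> backprop
labels = np.array([0, 0, 0])
mask = select_for_backprop(probs, labels)
print(mask)                                   # [False  True  True]
print(f"skipped {100 * (1 - mask.mean()):.1f}% of backpropagations")
```

In a training loop, the weight update would then be computed only over `batch[mask]`, which is where the reported reductions in backpropagations come from.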
123.
On the Use of Convolutional Neural Networks for Specific Emitter Identification. Wong, Lauren J. 12 June 2018.
Specific Emitter Identification (SEI) is the association of a received signal to an emitter, and is made possible by the unique and unintentional characteristics an emitter imparts onto each transmission, known as its radio frequency (RF) fingerprint. SEI systems are of vital importance to the military for applications such as early warning systems, emitter tracking, and emitter location. More recently, cognitive radio systems have started making use of SEI systems to enforce Dynamic Spectrum Access (DSA) rules. The use of pre-determined and expert defined signal features to characterize the RF fingerprint of emitters of interest limits current state-of-the-art SEI systems in numerous ways. Recent work in RF Machine Learning (RFML) and Convolutional Neural Networks (CNNs) has shown the capability to perform signal processing tasks such as modulation classification, without the need for pre-defined expert features. Given this success, the work presented in this thesis investigates the ability to use CNNs, in place of a traditional expert-defined feature extraction process, to improve upon traditional SEI systems, by developing and analyzing two distinct approaches for performing SEI using CNNs. Neither approach assumes a priori knowledge of the emitters of interest. Further, both approaches use only raw IQ data as input, and are designed to be easily tuned or modified for new operating environments. Results show CNNs can be used to both estimate expert-defined features and to learn emitter-specific features to effectively identify emitters. / Master of Science / When a device sends a signal, it unintentionally modifies the signal due to small variations and imperfections in the device’s hardware. These modifications, which are typically called the device’s radio frequency (RF) fingerprint, are unique to each device, and, generally, are independent of the data contained within the signal.
The goal of a Specific Emitter Identification (SEI) system is to use these RF fingerprints to match received signals to the devices, or emitters, which sent the given signals. SEI systems are often used for military applications, and, more recently, have been used to help make more efficient use of the highly congested RF spectrum.
Traditional state-of-the-art SEI systems detect the RF fingerprint embedded in each received signal by extracting one or more features from the signal. These features have been defined by experts in the field, and are determined ahead of time, in order to best capture the RF fingerprints of the emitters the system will likely encounter. However, this use of pre-determined expert features in traditional SEI systems limits the system in a variety of ways.
The work presented in this thesis investigates the ability to use Machine Learning (ML) techniques in place of the typically used expert-defined feature extraction processes, in order to improve upon traditional SEI systems. More specifically, in this thesis, two distinct approaches for performing SEI using Convolutional Neural Networks (CNNs) are developed and evaluated. These approaches are designed to have no knowledge of the emitters they may encounter and to be easily modified, unlike traditional SEI systems.
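Feeding raw IQ data to a CNN, as both approaches above do, amounts to arranging the complex samples as a 2-channel (in-phase, quadrature) real-valued array. A minimal sketch, where the window length of 1024 samples is an illustrative choice, not one taken from the thesis:

```python
import numpy as np

def iq_to_cnn_input(iq, window=1024):
    """iq: 1-D complex array -> (n_windows, 2, window) real array."""
    n = len(iq) // window
    chunks = iq[: n * window].reshape(n, window)
    # Channel 0: in-phase (real part); channel 1: quadrature (imag part).
    return np.stack([chunks.real, chunks.imag], axis=1).astype(np.float32)

iq = np.exp(2j * np.pi * 0.01 * np.arange(4096))  # fake received signal
x = iq_to_cnn_input(iq)
print(x.shape)                                    # (4, 2, 1024)
```

No expert feature extraction happens here; whatever hardware-induced distortions constitute the RF fingerprint remain in the samples for the network to learn from.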
124.
Training and Evaluation of a Neural Network for Solving the "Visual Referee Challenge". Jurkat, Freijdis. 14 October 2024.
Pose estimation is a significant research area in artificial intelligence that advances human-machine interaction and is gaining ever more relevance in sports. While human soccer players interact quite naturally with the referees on the field, this aspect has so far been neglected in the Standard Platform League of RoboCup. This thesis investigates a further approach to classifying static and dynamic referee poses, thereby coming one step closer to the grand goal that, by the middle of the 21st century, a fully autonomous robot team should beat the reigning world champion under the official FIFA rules. To this end, videos of relevant referee poses were recorded and collected. The human joints were then extracted using MoveNet and the poses classified with a Convolutional Neural Network. Two different approaches were pursued: one model per pose and one model for all poses. The evaluation shows that good to very good results can be achieved for static and dynamic poses: the per-pose models reach accuracies of 91.3% to 99.3%, with an average of 96.1%, while the single model for all poses reaches an accuracy of 90.9%. The successful application of the developed pose-estimation methodology in robot soccer opens promising perspectives for the future of this field. The insights gained can contribute not only to improving the performance of soccer robots but also to the further integration of AI technologies into our society.

Contents
List of Figures
List of Tables
List of Abbreviations
1 Introduction
2 Application Scenario
2.1 RoboCup
2.2 The Standard Platform League
2.3 The In-Game Visual Referee Challenge
3 Fundamentals of Neural Networks
3.1 Artificial Neural Networks
3.2 Convolutional Neural Networks
3.2.1 Architecture
3.2.2 Activation Functions
3.2.3 Further Optimization Techniques
3.3 Different Learning Methods
3.4 Evaluation
4 State of the Art
4.1 Machine Learning Approaches
4.1.1 Decision Trees
4.1.2 k-NN Algorithm
4.2 Deep Learning Approaches
4.2.1 Artificial Neural Network
4.2.2 Convolutional Neural Network
4.2.3 Recurrent Neural Network
4.3 Choice of Approach
4.3.1 Keypoint Detection
4.3.2 Pose Recognition
5 Implementation
5.1 Dataset
5.2 Data Preprocessing
5.2.1 Video Preprocessing
5.2.2 Creation of Training and Validation Data
5.3 Approach 1: One Model per Pose
5.3.1 Dataset
5.3.2 Architecture
5.3.3 Evaluation
5.4 Approach 2: One Model for All Poses
5.4.1 Dataset
5.4.2 Architecture
5.4.3 Evaluation
5.5 Comparison of the Approaches
6 Conclusion and Outlook
6.1 Conclusion
6.2 Outlook
References
A Appendix
A.1 RoboCup Standard Platform League (NAO) Technical Challenges
A.2 MoveNet Model Card
A.3 Code and Datasets
Declaration of Authorship
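The pipeline in the abstract above, MoveNet joints in, pose class out, can be sketched assuming MoveNet's standard output of 17 (y, x, score) keypoints per frame. The normalization and the tiny nearest-centroid classifier below are illustrative stand-ins for the CNN classifier used in the thesis, and the pose names are hypothetical.

```python
import numpy as np

def keypoints_to_features(kp):
    """kp: (17, 3) array of (y, x, score) -> normalized feature vector."""
    yx = kp[:, :2]
    center = yx.mean(axis=0)                 # translate to body center
    scale = np.linalg.norm(yx - center) + 1e-8
    return ((yx - center) / scale).ravel()   # 34-dim pose descriptor

def classify(features, centroids):
    """centroids: dict pose_name -> 34-dim reference vector."""
    return min(centroids,
               key=lambda k: np.linalg.norm(features - centroids[k]))

rng = np.random.default_rng(0)
ref_a = rng.normal(size=(17, 3))             # fake "kick-in" reference
ref_b = rng.normal(size=(17, 3))             # fake "goal" reference
centroids = {"kick-in": keypoints_to_features(ref_a),
             "goal": keypoints_to_features(ref_b)}
print(classify(keypoints_to_features(ref_a), centroids))  # kick-in
```

Centering and scaling makes the descriptor invariant to where the referee stands in the frame, which is why keypoint-based classification can work from different camera positions.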
125.
Fully Convolutional Neural Networks for Pixel Classification in Historical Document Images. Stewart, Seth Andrew. 01 October 2018.
We use a Fully Convolutional Neural Network (FCNN) to classify pixels in historical document images, enabling the extraction of high-quality, pixel-precise and semantically consistent layers of masked content. We also analyze a dataset of hand-labeled historical form images of unprecedented detail and complexity. The semantic categories we consider in this new dataset include handwriting, machine-printed text, dotted and solid lines, and stamps. Segmentation of document images into distinct layers allows handwriting, machine print, and other content to be processed and recognized discriminatively, and therefore more intelligently than might be possible with content-unaware methods. We show that an efficient FCNN with relatively few parameters can accurately segment documents having similar textural content when trained on a single representative pixel-labeled document image, even when layouts differ significantly. In contrast to the overwhelming majority of existing semantic segmentation approaches, we allow multiple labels to be predicted per pixel location, which allows for direct prediction and reconstruction of overlapped content. We perform an analysis of prevalent pixel-wise performance measures, and show that several popular performance measures can be manipulated adversarially, yielding arbitrarily high measures based on the type of bias used to generate the ground-truth. We propose a solution to the gaming problem by comparing absolute performance to an estimated human level of performance. We also present results on a recent international competition requiring the automatic annotation of billions of pixels, in which our method took first place.
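The multi-label prediction scheme described above can be made concrete: each pixel gets an independent sigmoid score per class, so overlapping content (e.g. handwriting crossing a solid line) can carry several labels at once, unlike argmax over a softmax, which forces exactly one. A minimal sketch with illustrative class indices:

```python
import numpy as np

def multilabel_segment(logits, threshold=0.5):
    """logits: (H, W, C) -> boolean (H, W, C) mask, multiple True allowed."""
    probs = 1.0 / (1.0 + np.exp(-logits))    # independent sigmoid per class
    return probs > threshold

# Classes (illustrative): 0 = handwriting, 1 = machine print, 2 = solid line.
logits = np.zeros((2, 2, 3))
logits[0, 0, 0] = 3.0   # pixel (0,0): handwriting...
logits[0, 0, 2] = 2.0   # ...on top of a solid line
mask = multilabel_segment(logits)
print(mask[0, 0])        # [ True False  True] -> two labels at one pixel
```

Each boolean channel of `mask` is one of the extracted content layers, which is what allows overlapped handwriting and ruling lines to be reconstructed separately.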
127.
Ensembles of Single Image Super-Resolution Generative Adversarial Networks. Castillo Araújo, Victor. January 2021.
Generative Adversarial Networks have been used to obtain state-of-the-art results for low-level computer vision tasks like single image super-resolution; however, they are notoriously difficult to train due to the instability of the competing minimax framework. Additionally, traditional ensembling mechanisms cannot be applied effectively to these types of networks due to the resources they require at inference time and the complexity of their architectures. In this thesis, an alternative method of creating ensembles from individual models that are more stable and easier to train, by interpolating in the parameter space of the models, is found to produce better results than the initial individual models when evaluated using perceptual metrics as a proxy for human judges. This method can be used as a framework to train GANs with competitive perceptual results in comparison to state-of-the-art alternatives.
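The parameter-space interpolation above differs from prediction averaging: the trained weights themselves are combined, yielding a single network with no extra inference cost. Since the individual generators share one architecture, their parameter dicts line up key by key. A hedged sketch of the mechanism:

```python
import numpy as np

def interpolate_weights(models, coeffs=None):
    """models: list of {name: ndarray} with identical shapes per name.
    Returns one parameter dict: the coefficient-weighted combination."""
    coeffs = coeffs or [1.0 / len(models)] * len(models)
    return {name: sum(c * m[name] for c, m in zip(coeffs, models))
            for name in models[0]}

# Two toy "trained generators" with matching parameter shapes.
m1 = {"conv1": np.ones((3, 3)), "bias": np.array([2.0])}
m2 = {"conv1": np.zeros((3, 3)), "bias": np.array([4.0])}
merged = interpolate_weights([m1, m2])
print(merged["bias"])    # [3.]
```

Sweeping `coeffs` along a line between two checkpoints is one way such an ensemble could be tuned against a perceptual metric; the specific coefficient choice here is illustrative.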
128.
Deep Learning-Based Video Super-Resolution in Computer-Generated Graphics. Jain, Vinit. January 2020.
Super-Resolution is a widely studied problem in the field of computer vision, where the purpose is to increase the resolution of, or super-resolve, image data. In Video Super-Resolution, maintaining temporal coherence across consecutive video frames requires fusing information from multiple frames to super-resolve one frame. Current deep learning methods perform video super-resolution, yet most of them focus on working with natural datasets. In this thesis, we use a recurrent back-projection network for working with a dataset of computer-generated graphics, with example applications including upsampling low-resolution cinematics for the gaming industry. The dataset comes from a variety of gaming content, rendered at 3840 x 2160 resolution. The objective of the network is to produce the upscaled version of the low-resolution frame by learning an input combination of a low-resolution frame, a sequence of neighboring frames, and the optical flow between each neighboring frame and the reference frame. Under the baseline setup, we train the model to perform 2x upsampling from 1920 x 1080 to 3840 x 2160 resolution. In comparison with the bicubic interpolation method, our model achieved better results by a margin of 2 dB for Peak Signal-to-Noise Ratio (PSNR), 0.015 for Structural Similarity Index Measure (SSIM), and 9.3 for the Video Multi-method Assessment Fusion (VMAF) metric. In addition, we demonstrate the susceptibility of neural network performance to changes in image compression quality, and the inability of distortion metrics to capture perceptual details accurately.
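The 2 dB PSNR margin quoted above can be made concrete: PSNR compares the mean squared error between an upscaled frame and the ground truth on a log scale, so a 2 dB gain corresponds to roughly 37% lower MSE.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak Signal-to-Noise Ratio in dB for images scaled to [0, max_val]."""
    mse = np.mean((pred - target) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

target = np.zeros((4, 4))
pred = np.full((4, 4), 0.1)           # uniform error of 0.1 -> MSE = 0.01
print(round(psnr(pred, target), 1))   # 20.0
```

SSIM and VMAF are more involved perceptual measures; PSNR is shown here only because the quoted margin is stated in dB.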
129.
GVT-BDNet: Convolutional Neural Network with Global Voxel Transformer Operators for Building Damage Assessment. Remondini, Leonardo. January 2021.
Natural disasters strike anywhere, disrupting local communication and transportation infrastructure, making the process of assessing specific local damage difficult, dangerous, and slow. The goal of Building Damage Assessment (BDA) is to quickly and accurately estimate the location, cause, and severity of the damage to maximize the efficiency of rescuers and the number of lives saved. In current machine learning BDA solutions, attention operators are the most recent innovation adopted by researchers to increase the generalizability and overall performance of Convolutional Neural Networks for the BDA task. However, current solutions exploit attention operators tailored to the specific task and neural network architecture, making them hard to apply to other scenarios. In our research, we want to contribute to the BDA literature while also addressing this limitation. We propose Global Voxel Transformer Operators (GVTOs): flexible attention operators, originally proposed for Augmented Microscopy, that can replace up-sampling, down-sampling, and size-preserving convolutions within either a U-Net or a general CNN architecture without any limitation. Unlike local operators such as convolutions, GVTOs can aggregate global information and have input-specific weights at inference time, improving generalizability, as recent literature has already shown. We applied GVTOs to a state-of-the-art BDA model and named the result GVT-BDNet. We trained and evaluated the proposed neural network on the xBD dataset, the largest and most complete dataset for BDA. We compared GVT-BDNet's performance with the baseline architecture (BDNet) and observed that the former improves damaged-building segmentation by a factor of 0.11. Moreover, GVT-BDNet achieves state-of-the-art performance on a 10% split of the xBD training dataset and on the xBD test dataset, with overall F1-scores of 0.80 and 0.79, respectively. To evaluate the architecture's consistency, we also tested the generalizability of BDNet and GVT-BDNet on another segmentation task, Tree & Shadow segmentation. Results showed that both models achieved overall good performance, scoring F1-scores of 0.79 and 0.785, respectively.
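The F1-scores reported above for segmentation are the harmonic mean of precision and recall over the predicted pixels of a class. A minimal sketch on boolean "damaged" masks:

```python
import numpy as np

def f1_score(pred, truth):
    """pred, truth: boolean masks of the 'damaged' class."""
    tp = np.logical_and(pred, truth).sum()    # correctly flagged pixels
    fp = np.logical_and(pred, ~truth).sum()   # false alarms
    fn = np.logical_and(~pred, truth).sum()   # missed damage
    return 2 * tp / (2 * tp + fp + fn)

truth = np.array([1, 1, 1, 0, 0], dtype=bool)
pred  = np.array([1, 1, 0, 1, 0], dtype=bool)
print(f1_score(pred, truth))     # 2*2 / (4 + 1 + 1) = 0.666...
```

Because true negatives drop out of the formula, F1 is a stricter measure than pixel accuracy when the damaged class is rare, which is the usual situation in disaster imagery.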
130.
Using Satellite Images and Self-Supervised Deep Learning to Detect Water Hidden Under Vegetation. Iakovidis, Ioannis. January 2024.
In recent years the wide availability of high-resolution satellite images has made the remote monitoring of water resources all over the world possible. While the detection of open water from satellite images is relatively easy, a significant percentage of the water extent of wetlands is covered by vegetation. Convolutional Neural Networks have shown great success in the task of detecting wetlands in satellite images. However, these models require large amounts of manually annotated satellite images, which are slow and expensive to produce. In this paper we use self-supervised training methods to train a Convolutional Neural Network to detect water from satellite images without the use of annotated data. We use a combination of deep clustering and negative sampling based on the paper "Unsupervised Single-Scene Semantic Segmentation for Earth Observation", and we extend that work by changing the clustering loss and the model architecture, and by implementing an ensemble model. Our final ensemble of self-supervised models outperforms a single supervised model, showing the power of self-supervision. / In recent years, the wide availability of high-resolution satellite images has made remote monitoring of water resources across the world possible. Although detecting open water from satellite images is relatively easy, a significant share of the water extent of wetlands is covered by vegetation. Fortunately, radar signals can penetrate vegetation, allowing us to detect water hidden under vegetation from satellite radar images. In recent years, Convolutional Neural Networks have shown great success in this task. Unfortunately, these models require large amounts of manually annotated satellite images, which are slow and expensive to produce. Self-supervised learning is a field of machine learning that aims to train models without annotated data. In this paper, we use self-supervised training methods to train a Convolutional Neural Network-based model to detect water from satellite images without annotated data. We use a combination of deep clustering and contrastive learning based on the paper "Unsupervised Single-Scene Semantic Segmentation for Earth Observation", and we extend that work by modifying the clustering loss and the model architecture used. After observing high variance in our models' performance, we also implemented an ensemble variant of our model to obtain more consistent results. Our final ensemble of self-supervised models outperforms a single supervised model, demonstrating the power of self-supervision.
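The deep-clustering component above rests on a simple loop: cluster the model's pixel embeddings and use the resulting cluster ids as pseudo-labels for the next training step, so no human annotation is needed. One assignment step of that loop, with toy 2-d embeddings standing in for real network features:

```python
import numpy as np

def assign_pseudo_labels(embeddings, centroids):
    """embeddings: (n, d); centroids: (k, d) -> (n,) cluster ids."""
    # Distance from every embedding to every centroid, then nearest wins.
    dists = np.linalg.norm(embeddings[:, None, :] - centroids[None], axis=2)
    return dists.argmin(axis=1)

emb = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
centroids = np.array([[0.0, 0.0], [5.0, 5.0]])
print(assign_pseudo_labels(emb, centroids))   # [0 0 1]
```

Real training would alternate this assignment with gradient updates that pull each embedding toward its assigned centroid; which cluster ends up meaning "water" is decided afterwards, since the procedure itself never sees a label.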