  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
241

Algorithm Design and Optimization of Convolutional Neural Networks Implemented on FPGAs

Du, Zekun January 2019
Deep learning has developed rapidly in recent years and has been applied in many fields that are central to artificial intelligence. Combining deep learning with embedded systems is a promising technical direction. This project designs a deep learning neural network algorithm that can be implemented on hardware such as an FPGA. It is based on current research on deep neural networks and on hardware features, and the system uses PyTorch and CUDA as supporting tools. The project focuses on image classification with a convolutional neural network (CNN). Several established CNN models are studied, including ResNet, ResNeXt, and MobileNet. Candidate models are compared on floating-point operations (FLOPs), number of parameters and classification accuracy, and a MobileNet-based algorithm is selected, with a top-1 error of 5.5% in software on a 6-class data set. The MobileNet-based algorithm is then simulated for hardware. The parameters are converted from floating-point numbers to 8-bit integers, and the outputs of each individual layer are cut to fixed-bit integers to fit the hardware restrictions. A number-handling method is designed to simulate the numerical changes on hardware. With this simulation method, the top-1 error increases to 12.3%, which is considered acceptable.
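The float-to-8-bit conversion and fixed-bit output handling described in this abstract can be illustrated with a minimal sketch. It assumes symmetric per-tensor scaling and saturating (clamping) arithmetic; the thesis's actual number-handling method is not given here, and the function names and sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric linear quantization of floats to signed 8-bit integers.

    Assumption: one scale factor for the whole tensor, chosen so that the
    largest magnitude maps to 127.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-128, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def clip_to_bits(value, bits):
    """Saturate an integer layer output to a signed range of `bits` bits."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return max(lowest, min(highest, value))


# Hypothetical weights, for illustration only.
weights = [0.8, -1.0, 0.3, 0.0]
q_weights, scale = quantize_int8(weights)

# Approximate reconstruction shows the rounding error quantization adds.
restored = [q * scale for q in q_weights]
print(q_weights, restored)
print(clip_to_bits(300, 8), clip_to_bits(-300, 8))
```

Rounding and clamping errors of this kind accumulate across layers, which is consistent with the reported rise in top-1 error after conversion.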
242

Impact of data augmentations when training the Inception model for image classification

Barai, Milad, Heikkinen, Anthony January 2017
Image classification is the task of identifying the class to which a previously unobserved object belongs. Classifying images is a common task in companies, and many of them currently perform it manually. Automated classification, however, has a lower expected accuracy. This thesis examines how automated classification could be improved by adding augmented data to the learning process of the classifier. We conduct a quantitative empirical study of the effects of two image augmentations: random horizontal/vertical flips and random rotations (<180°). The data come from an auction house search engine with the commercial name Barnebys. There are three data sets of 700 000, 50 000 and 28 000 images, each containing 28 classes that map to the business. In this bachelor's thesis, we retrained a convolutional neural network, the Inception-v3 model, on the two larger data sets. The remaining set was then classified with the trained models to obtain class-specific accuracies. To get a more accurate estimate of the effects, we used tenfold cross-validation. The results of our quantitative study show that the Inception-v3 model can reach a baseline mean accuracy of 64.5% (700 000 data set) and a mean accuracy of 51.1% (50 000 data set). The overall accuracy decreased with augmentations on our data sets. However, our results show an increase in accuracy for some classes. The largest absolute increase observed is in the class "Whine & Spirits" in the small data set, where the share of correctly classified images of that class went from 42.3% to 72.7%.
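The flip augmentation and tenfold cross-validation mentioned in this abstract can be sketched as follows. This is an illustration only, using the standard library and tiny nested lists in place of real images; the function names are hypothetical, and random rotation is left out because arbitrary angles need an interpolating image library.

```python
import random


def random_flip(image, rng):
    """Randomly mirror a 2-D image (list of rows) horizontally and/or vertically."""
    if rng.random() < 0.5:
        image = [row[::-1] for row in image]  # horizontal flip
    if rng.random() < 0.5:
        image = image[::-1]                   # vertical flip
    return image


def k_fold_indices(n_samples, k, rng):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


rng = random.Random(0)            # fixed seed so the run is repeatable
image = [[1, 2], [3, 4]]          # stand-in for a real image
flipped = random_flip(image, rng)
splits = list(k_fold_indices(50, 10, rng))
print(flipped, len(splits))
```

Each sample lands in the validation fold exactly once, so the mean accuracy over the ten folds uses every image for evaluation.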
243

Accelerating CNN on FPGA : An Implementation of MobileNet on FPGA

Shen, Yulan January 2019
A convolutional neural network (CNN) is a deep learning algorithm that has had a revolutionary impact on computer vision. One of its applications is image classification. However, the algorithm involves a huge number of operations and parameters, which limits its use in time- and resource-constrained embedded applications. MobileNet, a neural network that uses separable convolutional layers instead of standard convolutional layers, greatly reduces computation compared to traditional CNN models. Implementing MobileNet on an FPGA could greatly accelerate image classification. In this thesis, we have designed an accelerator block for MobileNet and implemented a simplified MobileNet on a Xilinx UltraScale+ Zu104 FPGA board with 64 accelerators. We use the implemented MobileNet to solve a gesture classification problem. The design runs at 100 MHz. It shows a 28.4x speedup over a CPU (Intel(R) Pentium(R) CPU G4560 @ 3.50GHz) and a 6.5x speedup over a GPU (NVIDIA GeForce 940MX 1.004GHz). It is also power efficient, consuming 4.07 W. The accuracy reaches 43% in gesture classification.
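The computational saving from separable convolutional layers mentioned in this abstract can be made concrete by counting multiply-accumulate operations (MACs). The sketch below assumes a depthwise-separable layer (a depthwise k x k convolution followed by a 1 x 1 pointwise convolution) with the same output size as the standard layer; the layer dimensions are hypothetical and not taken from the thesis.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a standard k x k convolution producing an h x w output."""
    return h * w * c_in * c_out * k * k


def separable_conv_macs(h, w, c_in, c_out, k):
    """MACs for a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 56 x 56 output, 64 input channels, 128 output channels, 3 x 3 kernel.
std = standard_conv_macs(56, 56, 64, 128, 3)
sep = separable_conv_macs(56, 56, 64, 128, 3)
ratio = sep / std  # equals 1/c_out + 1/k**2
print(std, sep, ratio)
```

The ratio depends only on the kernel size and the number of output channels, so for 3 x 3 kernels the separable form needs roughly a ninth of the work once the output channel count is large.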
244

Multitemporal Remote Sensing for Urban Mapping using KTH-SEG and KTH-Pavia Urban Extractor

Jacob, Alexander January 2014
The objective of this licentiate thesis is to develop novel algorithms and improve existing methods for urban land cover mapping and urban extent extraction using multi-temporal remote sensing imagery. Past studies have demonstrated that synthetic aperture radar (SAR) has very good properties for the analysis of urban areas and that the synergy of SAR and optical data is advantageous for various applications. The specific objectives of this research are: 1. To develop a novel edge-aware region-growing and -merging algorithm, KTH-SEG, for effective segmentation of SAR and optical data for urban land cover mapping; 2. To evaluate the synergistic effects of multi-temporal ENVISAT ASAR and HJ-1B multi-spectral data for urban land cover mapping; 3. To improve the robustness of an existing method for urban extent extraction by adding effective pre- and post-processing. ENVISAT ASAR data and Chinese HJ-1B multispectral data, as well as TerraSAR-X data, were used in this research. For objectives 1 and 2, two main study areas were chosen: Beijing and Shanghai, China. For both sites, a number of multitemporal ENVISAT ASAR (30 m, C-band) scenes with varying image characteristics were selected during the vegetated season of 2009. For Shanghai, TerraSAR-X strip-map images (3 m resolution, X-band) were acquired for a similar period in 2010 to also evaluate high-resolution X-band SAR for urban land cover mapping. Ten major land cover classes were extracted: high-density built-up, low-density built-up, bare field, low vegetation, forest, golf course, grass, water, airport runway and major road. For objective 3, eleven globally distributed study areas were chosen: Berlin, Beijing, Jakarta, Lagos, Lombardia (northern Italy), Mexico City, Mumbai, New York City, Rio de Janeiro, Stockholm and Sydney. For all cities ENVISAT ASAR imagery was acquired, and for cities in or close to mountains SRTM digital elevation data were acquired as well.
The methodology of this thesis includes two major components, KTH-SEG and the KTH-Pavia Urban Extractor. KTH-SEG is an edge-aware region-growing and -merging algorithm that both finds local high-frequency changes and robustly determines homogeneous areas with low-frequency local change. Post-segmentation classification is performed using support vector machines. KTH-SEG was evaluated using multitemporal, multi-angle, dual-polarization ASAR data and multispectral HJ-1B data as well as TerraSAR-X data. The KTH-Pavia Urban Extractor is a processing chain that includes geometrical corrections, contrast enhancement, built-up area extraction using spatial statistics and GLCM texture features, logical-operator-based fusion and DEM-based mountain masking. For urban land cover classification using multitemporal ENVISAT ASAR data, the results showed that KTH-SEG achieved an overall accuracy of almost 80% (0.77 Kappa) for the 10 urban land cover classes in both Beijing and Shanghai, compared to eCognition results of 75% (0.71 Kappa). In particular, the detection of features that are small and linear relative to the image resolution, such as roads in 30 m data, went well, with 83% user accuracy from KTH-SEG versus 57% user accuracy using the segments derived from eCognition. The other urban classes, which in SAR imagery in particular are characterized by a high degree of heterogeneity, were classified better by KTH-SEG. eCognition in general performed better on vegetation classes such as grass, low vegetation and forest, which are usually more homogeneous. It was also found that the combination of ASAR and HJ-1B optical data was beneficial, increasing the final classification accuracy by at least 10% compared to ASAR or HJ-1B data alone. The results also further confirmed that a higher diversity of SAR image types is more important for the urban classification outcome.
However, this is not the case when classifying high-resolution TerraSAR-X strip-map imagery. Here the different image characteristics of different look angles and orbit orientations created more confusion, mainly due to the different layover and foreshortening effects on larger buildings. The TerraSAR-X results also showed that accurate urban classification can be achieved using high-resolution SAR data alone, with almost 84% for eight classes around the Shanghai international airport (high- and low-density built-up were not separated, nor were roads and runways). For urban extent extraction, the results demonstrated that built-up areas can be effectively extracted using a single ENVISAT ASAR image in 10 global cities, reaching overall accuracies around 85%, compared to 75% for the MODIS urban class and 73% for the GlobCover urban class. Multitemporal ASAR can improve the urban extraction results by 5-10% in Beijing. Mountain masking applied in Mumbai and Rio de Janeiro increased the accuracy by 3-5%. The research performed in this thesis has contributed to the remote sensing community by providing algorithms and methods for both extracting urban areas and identifying urban land cover in a more detailed fashion.
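The overall accuracy and Kappa figures quoted in this abstract are both derived from a confusion matrix. A minimal sketch of that calculation follows, assuming Cohen's kappa and a made-up two-class matrix; none of the numbers come from the thesis.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] counts samples of reference class i assigned to class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Agreement expected by chance, from the row and column totals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(n)
    ) / (total * total)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical 2-class confusion matrix, for illustration only.
confusion = [[40, 10],
             [5, 45]]
accuracy, kappa = overall_accuracy_and_kappa(confusion)
print(accuracy, kappa)
```

Kappa discounts the agreement a random labelling with the same class totals would reach, which is why it sits below the overall accuracy in the reported results.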
245

[pt] APRIMORAÇÃO DO ALGORITMO Q-NAS PARA CLASSIFICAÇÃO DE IMAGENS / [en] ENHANCED Q-NAS FOR IMAGE CLASSIFICATION

JULIA DRUMMOND NOCE 31 October 2022
Deep neural networks are powerful and flexible models that have gained the attention of the machine learning community over the last decade. Usually, an expert spends significant time designing the neural architecture, with long trial-and-error sessions to reach good and relevant results. Because this process is manual, there is growing interest in Neural Architecture Search (NAS), which aims to automate the search for neural network architectures. NAS is a subarea of Automated Machine Learning (AutoML) and an essential step towards automating machine learning methods. It is defined by three aspects: the search space of architectures, the search strategy and the performance estimation strategy. Quantum-inspired evolutionary algorithms present promising results regarding faster convergence when compared to other solutions with restricted search space and high computational costs. In this work, we enhance Q-NAS, a quantum-inspired algorithm that searches for deep networks by assembling simple substructures. Q-NAS can also evolve some numerical training hyperparameters, which is a first step in the direction of complete automation. Our contribution involves experimenting with other types of optimizers in the algorithm and making an in-depth study of the Q-NAS parameters. Additionally, we present Q-NAS results, evolved from scratch, on the CIFAR-100 dataset using only 18 GPU-days. We were able to achieve an accuracy of 76.40%, which is a competitive result compared with other works in the literature. Finally, we also present the enhanced Q-NAS applied to a case study for COVID-19 x Healthy classification on a real chest computed tomography database. In 9 GPU-days we were able to achieve an accuracy of 99.44% using fewer than 1000 samples for training data. This accuracy surpassed benchmark networks such as ResNet, GoogLeNet and VGG.
246

Generating Synthetic Training Data with Stable Diffusion

Rynell, Rasmus, Melin, Oscar January 2023
The use of image classification in various industries has grown significantly in recent years. There are, however, challenges concerning the data used to train such models. In many cases the training data is difficult and expensive to obtain. Furthermore, dealing with image data may come with additional problems such as privacy concerns. In recent years, synthetic image generation models such as Stable Diffusion have seen significant improvement. Using only a textual description, Stable Diffusion is able to generate a wide variety of photorealistic images. In addition to textual descriptions, other conditioning models such as ControlNet have made it possible to supply additional grounding information, such as canny edge and segmentation images. This thesis investigates whether synthetic images generated by Stable Diffusion can be used effectively in training an image classifier. To find the most effective method for generating training data, multiple conditioning methods are investigated and evaluated. The results show that it is possible to generate high-quality training data using several conditioning techniques. The best-performing method was using canny-edge-grounded images to augment already existing data. Extending two classes with additional synthetic data generated by the best-performing method achieved the highest average F1-score increase, 0.85 percentage points, compared with a baseline trained solely on real images.
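The average F1-score comparison reported in this abstract can be illustrated with a short sketch. It assumes the average is a macro average over classes, which the abstract does not state, and the per-class counts are invented for the example rather than taken from the thesis.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives and false negatives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(per_class_counts):
    """Unweighted mean of per-class F1 scores."""
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in per_class_counts]
    return sum(scores) / len(scores)


# Invented (tp, fp, fn) counts per class, for illustration only.
baseline = macro_f1([(80, 20, 20), (60, 30, 40)])
augmented = macro_f1([(82, 18, 18), (63, 28, 37)])
gain_pp = (augmented - baseline) * 100  # difference in percentage points
print(baseline, augmented, gain_pp)
```

Reporting the change in percentage points, as the abstract does, avoids confusing an absolute gain with a relative one.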
247

Mutual Enhancement of Environment Recognition and Semantic Segmentation in Indoor Environment

Challa, Venkata Vamsi January 2024
Background: The dynamic field of computer vision and artificial intelligence has continually evolved, pushing the boundaries in areas such as semantic segmentation and environmental recognition, which are pivotal for indoor scene analysis. This research investigates the integration of these two technologies, examining their synergy and its implications for enhancing indoor scene understanding. Applications of this integration span various domains, including smart home systems for enhanced ambient living, navigation assistance for cleaning robots, and advanced surveillance for security. Objectives: The primary goal is to assess the impact of integrating semantic segmentation data on the accuracy of environmental recognition algorithms in indoor environments. Additionally, the study explores how environmental context can enhance the precision and accuracy of contour-aware semantic segmentation. Methods: The research employed a range of machine learning models, including standard algorithms, Long Short-Term Memory networks, and ensemble methods. Transfer learning with models such as EfficientNet B3, MobileNetV3 and Vision Transformer was a key aspect of the experimentation. The experiments were designed to measure the effect of semantic segmentation on environmental recognition and its reciprocal influence. Results: The findings indicated that the integration of semantic segmentation data significantly enhanced the accuracy of environmental recognition algorithms. Conversely, incorporating environmental context into contour-aware semantic segmentation led to notable improvements in precision and accuracy, reflected in metrics such as Mean Intersection over Union (MIoU). Conclusion: This research underscores the mutual enhancement between semantic segmentation and environmental recognition, demonstrating how each technology significantly boosts the effectiveness of the other in indoor scene analysis.
The integration of semantic segmentation data notably elevates the accuracy of environmental recognition algorithms, while the incorporation of environmental context into contour-aware semantic segmentation substantially improves its precision and accuracy. The results also open avenues for advancements in automated annotation processes, paving the way for smarter environmental interaction.
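The Mean Intersection over Union metric named in this abstract can be sketched in a few lines. The example assumes flattened label maps and skips classes that appear in neither map; the label values are invented and the function name is hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union across classes for flat label sequences."""
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # ignore classes absent from both maps
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Invented 6-pixel label maps with three classes, for illustration only.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, 3)
print(score)
```

Because every class contributes equally to the mean, small classes weigh as much as large ones, which makes the metric stricter than plain pixel accuracy.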
248

Avoiding Catastrophic Forgetting in Continual Learning through Elastic Weight Consolidation

Evilevitch, Anton, Ingram, Robert January 2021
Image classification is an area of computer science with many areas of application. One key issue with using Artificial Neural Networks (ANN) for image classification is the phenomenon of Catastrophic Forgetting when training on tasks sequentially (i.e. Continual Learning): the network quickly loses its performance on a given task after it has been trained on a new task. Elastic Weight Consolidation (EWC) has previously been proposed as a remedy to lessen the effects of this phenomenon through the use of a loss function that utilizes a Fisher Information Matrix. We want to explore and establish whether this still holds true for modern network architectures, and to what extent it can be applied using today's state-of-the-art networks. We focus on applying this approach to tasks within the same dataset. Our results indicate that the approach is feasible and does in fact lessen the effect of Catastrophic Forgetting. These results are achieved, however, at the cost of much longer execution times and more time spent tuning the hyperparameters.
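The EWC loss function referred to in this abstract adds a penalty that anchors each weight to its value after the previous task, weighted by the Fisher information. A minimal sketch follows, assuming the common diagonal approximation of the Fisher Information Matrix; the parameter values and the weighting factor `lam` are hypothetical.

```python
def ewc_loss(task_loss, params, old_params, fisher_diag, lam):
    """New-task loss plus the EWC quadratic penalty.

    Assumption: the Fisher Information Matrix is approximated by its diagonal,
    so each parameter has a single importance weight.
    """
    penalty = sum(
        f * (p - p_old) ** 2
        for p, p_old, f in zip(params, old_params, fisher_diag)
    )
    return task_loss + 0.5 * lam * penalty


# Hypothetical values: the first weight mattered for the old task, the second barely did.
loss = ewc_loss(
    task_loss=0.30,
    params=[1.0, 2.0],
    old_params=[0.5, 2.0],
    fisher_diag=[4.0, 0.1],
    lam=2.0,
)
print(loss)
```

Weights with high Fisher values are expensive to move, so the network is pushed to learn the new task with the weights the old task depended on least. Choosing `lam` is one of the hyperparameter tuning costs the abstract mentions.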
249

Simulating Large Scale Memristor Based Crossbar for Neuromorphic Applications

Uppala, Roshni 03 June 2015
No description available.
250

A Comparison of CNN and Transformer in Continual Learning / En jämförelse mellan CNN och Transformer för kontinuerlig Inlärning

Fu, Jingwen January 2023
In computer vision, Convolutional Neural Networks (CNN) and Transformers are two predominant methodologies, and they are often compared extensively to bring out their respective merits and demerits. This thesis explores the two models within the framework of continual learning, with a specific focus on how well they resist catastrophic forgetting. We hypothesize that Transformer models are more resilient to catastrophic forgetting than their CNN counterparts. To test this hypothesis, a carefully constructed experimental design was implemented, involving the selection of diverse models and continual learning approaches, and careful tuning of the networks to ensure an equitable comparison. In the majority of the experiments, covering both class-incremental and task-incremental learning settings, our results support the hypothesis. Nevertheless, the insights gained also underscore the need for more exhaustive and comprehensive experimental evaluations to fully validate it.
