Global ETD Search

1	Deep Learning Using Vision And LiDAR For Global Robot Localization Gowling, Brett E 01 May 2024 As the field of mobile robotics rapidly expands, precise understanding of a robot’s position and orientation becomes critical for autonomous navigation and efficient task performance. In this thesis, we present a snapshot-based global localization machine learning model for a mobile robot, the e-puck, in a simulated environment. Our model uses multimodal data to predict both position and orientation using the robot’s on-board cameras and LiDAR sensor. In an effort to minimize localization error, we explore different sensor configurations by varying the number of cameras and LiDAR layers used. Additionally, we investigate the performance benefits of different multimodal fusion strategies while leveraging the EfficientNet CNN architecture as our model’s foundation. Data collection and testing are conducted using Webots simulation software, and our results show that, when tested in a 12m x 12m simulated apartment environment, our model is able to achieve positional accuracy within 0.2m for each of the x and y coordinates and orientation accuracy within 2°, all without the need for sequential data history. Our results demonstrate the potential for accurate global localization of mobile robots in simulated environments without the need for existing maps or temporal data. Deep Learning Sensor Fusion LiDAR Vision Localization EfficientNet Robotics
2	USING ADVANCED DEEP LEARNING TECHNIQUES TO IDENTIFY DRAINAGE CROSSING FEATURES Edidem, Michael Isaiah 01 August 2024 High-resolution digital elevation models (HRDEMs) enable precise mapping of hydrographic features. However, the absence of drainage crossings underpassing roads or bridges hinders accurate delineation of stream networks. Traditional methods such as on-screen digitization and field surveys for locating these crossings are time-consuming and expensive for extensive areas. This study investigates the effectiveness of deep learning models for automated drainage crossing detection using HRDEMs. The study also explores the performance of advanced classification algorithms, such as the EfficientNetV2 model, on various co-registered HRDEM-derived geomorphological features, such as positive openness, geometric curvature, and topographic position index (TPI) variants, for drainage crossing classification. The results reveal that individual layers, particularly HRDEM and TPI21, achieve the best performance, while combining all five layers does not improve accuracy. Hence, effective feature screening is crucial, as eliminating less informative features enhances the F1 score. For drainage crossing detection, this study develops and trains two deep learning object detectors, Faster R-CNN and YOLOv5, using HRDEM tiles and ground truth labels. These models achieve an average F1-score of 0.78 in the Nebraska watershed and demonstrate successful transferability to other watersheds. This spatial object detection approach offers a promising avenue for automated, large-scale drainage crossing detection, facilitating the integration of these features into HRDEMs and improving the accuracy of hydrographic network delineation. Drainage crossing EfficientNet Faster R-CNN GeoAI Hydrography YOLO
3	Исследование модели нейронной сети с пространственным вниманием для классификации фракции щебня : магистерская диссертация / Research on a neural network model with spatial attention for gravel fraction classification Тряпицын, Д. Л., Tryapitsyn, D. L. January 2024 In the construction industry, various fractions of gravel are used as aggregate materials. Because the cost of gravel depends on its type, an automated system is required to verify the type and eliminate human error. This work proposes a method for classifying gravel fractions in images using the EfficientNet-b1 architecture with added Spatial Attention, combined with the LDAM loss function. A dataset of 635 images, divided into 7 gravel fractions, was used for training and testing the model. The resulting model demonstrated high accuracy, reaching 97%. MASTER'S THESIS COMPUTER VISION CRUSHED STONE FRACTION EFFICIENTNET SPATIAL ATTENTION LDAM IMAGE CLASSIFICATION
4	Produktmatchning EfficientNet vs. ResNet : En jämförelse / Product matching EfficientNet vs. ResNet Malmgren, Emil, Järdemar, Elin January 2021 E-commerce is steadily increasing, and between 2010 and 2014 the share of consumers shopping online rose from 28.9% to 34.2%. Insufficient information about the price of a product forces buyers to search among several different retailers for the best price. There are different ways to produce the information required to compare prices. One method is automated product matching. This method uses image recognition algorithms, whose purpose is to detect, locate and recognize objects in images. Image recognition algorithms often have problems finding objects in images due to external factors such as lighting, viewing angles and images that contain a lot of unnecessary information. In the past, algorithms such as ANN (artificial neural network), random forest classifier and support vector machine have been used, but recent studies have shown that CNNs (convolutional neural networks) are better at finding important properties of objects, which makes them less sensitive to these external factors. Two examples of alternative CNN architectures that have emerged are EfficientNet and ResNet, both of which have shown good results in previous studies, but there is little research that helps one choose which CNN architecture leads to the best possible result. Our question is therefore: Which of the EfficientNet and ResNet architectures gives the highest result on product matching with the evaluation measures f1-score, precision, and recall? The results of the study show that EfficientNet is the overall best architecture for product matching on the study's dataset. The results also show that ResNet was better than EfficientNet at proposing the right matches for the images. The matches ResNet makes are more often correct than the matches EfficientNet suggests, since ResNet achieved a higher precision than EfficientNet. However, EfficientNet achieves a better recall, which shows that EfficientNet is better than ResNet at finding more or all correct matches among its potential matches. The difference in recall is greater than the difference in precision between the models, which means that EfficientNet gets a higher f1-score and is generally better than ResNet, although which measure matters most is open to discussion. Is it important that the suggested matches are correct or that all the correct matches are found? If the most important thing is that the proposed matches are correct, ResNet has an advantage, but if it is more important to find all correct matches, EfficientNet has an advantage. The result therefore depends on what is considered most important in determining which of the architectures gives the best results. EfficientNet ResNet CNN Convolutional Neural Network image classification product matching price matching object recognition Computer and Information Sciences
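The precision/recall trade-off described in record 4 follows from the f1-score being the harmonic mean of precision and recall. The sketch below illustrates this. The numbers are hypothetical and are NOT results from the thesis, and the function name `f1_score` is chosen here for illustration only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values for illustration only; not taken from the thesis.
resnet = {"precision": 0.80, "recall": 0.60}
efficientnet = {"precision": 0.75, "recall": 0.75}

f1_resnet = f1_score(**resnet)
f1_efficientnet = f1_score(**efficientnet)

# A model with slightly lower precision can still win on f1
# when its recall advantage is larger than its precision deficit.
print(f"ResNet f1:       {f1_resnet:.3f}")
print(f"EfficientNet f1: {f1_efficientnet:.3f}")
```

With these made-up inputs the second model trails by 0.05 in precision but leads by 0.15 in recall, so its f1-score is higher, which mirrors the reasoning in the abstract.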
5	Efektivnost hlubokých konvolučních neuronových sítí na elementární klasifikační úloze / Efficiency of deep convolutional neural networks on an elementary classification task Prax, Jan January 2021 This thesis compares deep convolutional neural network models with feature descriptor models. The feature descriptors are paired with suitably chosen classifiers. Because these models belong to machine learning, the types of machine learning are also described. The chosen models are then described, and their fundamentals and problems are explained. The hardware and software used for the tests are listed, followed by the test results and a summary of them. Finally, the models are compared on validation accuracy and training time.
6	Detekce aktuálního podlaží při jízdě výtahem / Floor detection during elevator ride Havelka, Martin January 2021 This diploma thesis deals with detecting the current floor during an elevator ride. This functionality is necessary for a robot to move in a multi-floor building. For this task, a fusion of accelerometer data recorded during the elevator ride and image data obtained from the information display inside the elevator cabin is used. The literature review describes existing solutions, data fusion methods and image classification options. Based on this review, suitable approaches for solving the problem were proposed. First, datasets from different types of elevator cabins were obtained. An algorithm for processing data from the accelerometer was developed. A convolutional neural network, used to classify image data from the displays, was selected and trained. Subsequently, the data fusion method was implemented. The individual parts were tested and evaluated. Based on their evaluation, they were integrated into one functional system. The system was successfully verified and tested. The detection result during rides in different elevators was 97%.
7	Improving Object Detection using Enhanced EfficientNet Architecture Michael Youssef Kamel Ibrahim (16302596) 30 August 2023 EfficientNet is designed to achieve top accuracy while using fewer parameters and fewer computational resources than previous models. In this paper, we present a compound scaling method that re-weights the network's width (w), depth (d), and resolution (r), which leads to better performance than traditional methods that scale only one or two of these dimensions by adjusting the hyperparameters of the model. Additionally, we present an enhanced EfficientNet backbone architecture. We show that EfficientNet achieves top accuracy on the ImageNet dataset, while being up to 8.4x smaller and up to 6.1x faster than previous top-performing models. EfficientNet's effectiveness is also demonstrated on transfer learning and object detection tasks, where it achieves higher accuracy with fewer parameters and less computation. The proposed enhanced architecture is then discussed in detail and compared to the original architecture. Our approach provides a scalable and efficient solution for both academic research and practical applications, where resource constraints are often a limiting factor. Automotive safety engineering EfficientNet Sandglass Bottleneck Mixed Convolution Hard Swish
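The compound scaling of width (w), depth (d), and resolution (r) mentioned in record 7 can be sketched as below. This is a minimal illustration, not the thesis's implementation: the constants alpha = 1.2, beta = 1.1, gamma = 1.15 are assumed from the original EfficientNet paper rather than from this abstract, and the function names `compound_scale` and `flops_multiplier` are hypothetical.

```python
def compound_scale(phi: float, alpha: float = 1.2, beta: float = 1.1,
                   gamma: float = 1.15) -> tuple[float, float, float]:
    """Return (depth, width, resolution) multipliers for coefficient phi.

    All three dimensions are scaled together by a single coefficient,
    instead of tuning one or two of them independently.
    The default constants are an assumption (original EfficientNet paper).
    """
    depth = alpha ** phi
    width = beta ** phi
    resolution = gamma ** phi
    return depth, width, resolution


def flops_multiplier(phi: float, alpha: float = 1.2, beta: float = 1.1,
                     gamma: float = 1.15) -> float:
    """Approximate FLOPs growth: cost scales with depth * width^2 * resolution^2."""
    return (alpha * beta ** 2 * gamma ** 2) ** phi


for phi in range(4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
          f"resolution x{r:.2f}, FLOPs x{flops_multiplier(phi):.2f}")
```

With these assumed constants, alpha * beta^2 * gamma^2 is about 1.92, so each unit increase of phi roughly doubles the computational cost.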
8	Boosting CNN Performance in Digital Pathology Using Colour Normalisation and Ensembling Kvarnström, Emelie, Tibbling, Axel January 2021 Researchers within digital pathology are endeavouring to develop machine-learning tools to support dentists when making a diagnosis. The purpose of this study was to investigate how applying colour normalisation (CN) algorithms to an oral histopathological dataset would impact both machine-learning models and ensembles of models when classifying cell types. The dataset was run through four different CN algorithms using a stain normalisation toolbox. The resulting five datasets (1 + 4) were then fed separately into a pipeline to create machine-learning models, specifically convolutional neural networks with the EfficientNet architecture. Two different ensembles were studied: one that used all the models and one that used the three models with the highest test accuracy. Each model gave a cell type prediction for each cell. The ensembles superposed their models’ predictions for the same cell and used the results as their own predictions. The models based on datasets created by two of the CN algorithms had a weighted average accuracy about four percentage points higher than the model based on the unnormalised dataset. Unexpectedly, the models based on the colour-normalised datasets had a larger standard deviation than the model based on the unnormalised dataset. All the models were generally bad at classifying two of the four cell types. Both ensembles had a weighted average accuracy about ten percentage points higher than the model based on the unnormalised dataset, as well as a larger standard deviation. The increase in accuracy is significant and could bring forward the point at which machine-learning tools can be implemented in dentists’ and pathologists’ workflows. / Bachelor's thesis project in electrical engineering 2021, KTH, Stockholm Colour normalisation Histogram Khan Macenko Reinhard ensemble digital pathology histopathology deep learning EfficientNet Electrical Engineering and Electronics
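One way to read the ensembling in record 8 is as summing the per-class probabilities of the member models for each cell and taking the highest-scoring class. The abstract does not specify the combination rule, so this is an assumption; the function names, the probabilities, and the accuracies below are all hypothetical.

```python
def ensemble_predict(model_probs: list[list[float]]) -> int:
    """Sum per-class probabilities over models and return the best class index.

    model_probs holds one probability vector per model for a single cell.
    Summation is an assumed combination rule, not confirmed by the abstract.
    """
    n_classes = len(model_probs[0])
    summed = [sum(probs[c] for probs in model_probs) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: summed[c])


def select_top_k(test_accuracy: dict[str, float], k: int = 3) -> list[str]:
    """Return the names of the k models with the highest test accuracy."""
    return sorted(test_accuracy, key=test_accuracy.get, reverse=True)[:k]


# Hypothetical predictions of three models for one cell, four cell types.
cell_probs = [
    [0.5, 0.3, 0.1, 0.1],  # this model alone would pick class 0
    [0.2, 0.6, 0.1, 0.1],
    [0.3, 0.4, 0.2, 0.1],
]
print(ensemble_predict(cell_probs))

# Hypothetical test accuracies for the five datasets (1 unnormalised + 4 CN).
accuracies = {"unnormalised": 0.70, "reinhard": 0.74, "macenko": 0.75,
              "khan": 0.69, "histogram": 0.72}
print(select_top_k(accuracies))
```

The second helper corresponds to the ensemble that keeps only the three models with the highest test accuracy.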
9	ENHANCED MULTIPLE DENSE LAYER EFFICIENTNET Aswathy Mohan (18806656) 03 September 2024 In the dynamic and ever-evolving landscape of Artificial Intelligence (AI), the domain of deep learning has emerged as a pivotal force, propelling advancements across a broad spectrum of applications, notably in the intricate field of image classification. Image classification, a critical task that involves categorizing images into predefined classes, serves as the backbone for numerous cutting-edge technologies, including but not limited to automated surveillance, facial recognition systems, and advanced diagnostics in healthcare. Despite the significant strides made in the area, the quest for models that not only excel in accuracy but also demonstrate robust generalization across varied datasets, and maintain resilience against the pitfalls of overfitting, remains a formidable challenge. EfficientNetB0, a model celebrated for its optimized balance between computational efficiency and accuracy, stands at the forefront of solutions addressing these challenges. However, the nuanced complexities of datasets such as CIFAR-10, characterized by its diverse array of images spanning ten distinct categories, call for specialized adaptations to harness the full potential of such sophisticated architectures. In response, this thesis introduces an optimized version of the EfficientNetB0 architecture, enhanced with strategic architectural modifications, including an additional Dense layer with 512 units and the use of Dropout regularization. These adjustments are designed to amplify the model's capacity for learning and interpreting complex patterns inherent in the data. Complementing these architectural refinements, a two-phase training methodology is also adopted in the proposed model. This approach begins with an initial training phase in which the base model's pre-trained weights are frozen, thus leveraging transfer learning to secure a solid foundation. The subsequent fine-tuning phase, characterized by the selective unfreezing of layers, calibrates the model to the intricacies of the CIFAR-10 dataset. This is further bolstered by adaptive learning rate adjustments, ensuring the model’s training process is both efficient and responsive to the nuances of the learning curve. Computer vision Image processing Neural networks image classification effect efficientnet algorithm Machine Learning (ML) models machine learning and AI computer vision approaches overfitting
10	Art to Genre through Deep Learning: A Comparative Analysis of ResNet and EfficientNet for Album Cover Image-Based Music Classification Bernsdorff Wallstedt, Simon January 2024 Musical genres enable listeners to differentiate between diverse styles and forms of music, serving as a practical tool to organize and categorize artists, albums, and songs. Album covers, featuring graphic depictions that reflect the vibe and tone of the music, serve as a visual intermediary between the artist and the audience. While numerous machine learning techniques leverage textual, visual, and audio information in a multi-modal approach to categorize music, the sole focus on visual aspects, specifically album cover images, and their correlation with musical genres has been less explored. The following question guides this research: How do EfficientNet and ResNet compare in their ability to accurately classify album cover images into specific genres based solely on visual features? Two state-of-the-art convolutional neural networks, ResNet and EfficientNet, are employed to classify a newly created dataset (the EquiGen dataset) of 60,000 album cover images into 15 distinct genres. The dataset was divided into 70% for training, 15% for validation, and 15% for testing. The findings reveal that both ResNet and EfficientNet achieve better-than-random classification accuracy, indicating that visual features alone can be informative for genre classification. Some genres, namely Metal, New Age and Rap, were classified much more accurately than others. EfficientNet demonstrated slightly superior performance compared to ResNet, with higher accuracy, precision, recall, and F1 scores. However, both models had difficulty generalizing well to unseen data and showed signs of overfitting. This study contributes to the interdisciplinary research on Music Genre Categorization (MGC), machine learning, and music.
CNN convolutional neural network deep learning music genre categorization (MGC) music information retrieval (MIR) EfficientNet ResNet album cover artwork Information Systems, Social aspects
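The 70% / 15% / 15% split of the 60,000 images in record 10 can be sketched as below, together with the accuracy that random guessing would give over 15 genres. This is an illustration only: the thesis's actual splitting procedure is not described in the abstract, the function name `split_indices` and the seed are hypothetical, and the chance level of 1/15 assumes the genres are equally represented.

```python
import random


def split_indices(n_items: int, train_pct: int = 70, val_pct: int = 15,
                  seed: int = 0) -> tuple[list[int], list[int], list[int]]:
    """Shuffle item indices and cut them into train / validation / test parts.

    Percentages are integers so the cut points are computed exactly.
    The test part receives whatever remains after train and validation.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = n_items * train_pct // 100
    n_val = n_items * val_pct // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train, val, test = split_indices(60_000)
print(len(train), len(val), len(test))

# Baseline for "better than random", assuming 15 equally sized genres.
chance_accuracy = 1 / 15
print(f"chance accuracy: {chance_accuracy:.1%}")
```

Under the balanced-classes assumption, any accuracy meaningfully above roughly 6.7% counts as better than random.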
