Global ETD Search

91	Real-time object detection robotcontrol : Investigating the use of real time object detection on a Raspberry Pi for robot control / Autonom robot styrning via realtids bildigenkänning : Undersökning av användningen av realtids bildigenkänning på en Raspberry Pi för robotstyrning
Ryberg, Simon; Jansson, Jonathan. January 2022
The field of autonomous robots has been explored increasingly over the last decade. Advances in machine learning, combined with increases in computational power, have made it possible to run machine learning models on edge devices. Object detection on edge devices is bottlenecked by their limited computational power, so it is more constrained than on other kinds of devices. This project explored the possibility of using real-time object detection on a Raspberry Pi as input to different control systems. With the help of a Coral USB Accelerator, the Raspberry Pi was able to find a specified object and drive to it, and it did so successfully with all the control systems tested. Because the robot navigated to the specified object with every control system, the use of real-time object detection on edge devices in faster-paced situations can now be explored.
Engineering and Technology
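As an illustration of how a detection can serve as input to a control system, the following minimal sketch turns the horizontal offset of a detected bounding box into differential wheel speeds. The abstract does not name the control systems that were tested, so the proportional controller, the function name `steer_towards`, the gain and the speed range are all assumptions made for this example.

```python
def steer_towards(box, frame_width, base_speed=0.5, gain=0.8):
    """Return (left, right) wheel speeds in [-1, 1] that turn toward a box.

    box is (x_min, y_min, x_max, y_max) in pixels. The name, the defaults
    and the proportional law are hypothetical, not taken from the thesis.
    """
    x_center = (box[0] + box[2]) / 2.0
    # Normalised horizontal error: -1 (far left) to +1 (far right).
    error = (x_center - frame_width / 2.0) / (frame_width / 2.0)
    turn = gain * error

    def clamp(v):
        return max(-1.0, min(1.0, v))

    # Object to the right: speed up the left wheel, slow the right one.
    return clamp(base_speed + turn), clamp(base_speed - turn)
```

A centered detection yields equal wheel speeds, while a detection near the right edge of the frame makes the left wheel run faster than the right one, turning the robot toward the object.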
92	APPLICATIONS OF 4-STATE NANOMAGNETIC LOGIC USING MULTIFERROIC NANOMAGNETS POSSESSING BIAXIAL MAGNETOCRYSTALLINE ANISOTROPY AND EXPERIMENTS ON 2-STATE MULTIFERROIC NANOMAGNETIC LOGIC
D'Souza, Noel. 01 January 2014
Nanomagnetic logic, which encodes logic bits in the magnetization orientations of single-domain nanomagnets, has garnered attention as an alternative to transistor-based logic because of its non-volatility and unprecedented energy efficiency. The energy efficiency of this scheme is determined by the method used to flip the magnetization orientations of the nanomagnets in response to one or more inputs and produce the desired output. Unfortunately, the large dissipative losses that occur when nanomagnets are switched with a magnetic field or spin-transfer torque undermine the promised energy efficiency. Another technique offering superior energy efficiency, “straintronics”, involves applying a voltage to a piezoelectric layer to generate a strain, which is transferred to an elastically coupled magnetostrictive layer and causes magnetization rotation. The functionality of this scheme can be enhanced further by introducing magnetocrystalline anisotropy in the magnetostrictive layer, thereby generating four stable magnetization states (instead of the two stable directions produced by shape anisotropy in ellipsoidal nanomagnets). Numerical simulations were performed to implement a low-power universal logic gate (NOR) using such 4-state magnetostrictive/piezoelectric nanomagnets (Ni/PZT) by clocking the piezoelectric layer with a small electrostatic potential (~0.2 V) to switch the magnetization of the magnetic layer. Unidirectional and reliable logic propagation in this system was also demonstrated theoretically. Besides doubling the logic density (4-state versus 2-state) for logic applications, these four-state nanomagnets can be exploited for higher-order applications such as image reconstruction and recognition in the presence of noise, associative memory and neuromorphic computing. Experimental work in strain-based switching has been limited to magnets that are multi-domain or magnets where strain moves domain walls. In this work, we also demonstrate strain-based switching in 2-state single-domain ellipsoidal magnetostrictive nanomagnets of lateral dimensions ~200 nm fabricated on a piezoelectric substrate (PMN-PT) and studied using Magnetic Force Microscopy (MFM). A nanomagnetic Boolean NOT gate and unidirectional bit information propagation through a finite chain of dipole-coupled nanomagnets are also shown through strain-based "clocking". This is the first experimental demonstration of strain-based switching in nanomagnets and clocking of nanomagnetic logic (Boolean NOT gate), as well as logic propagation in an array of nanomagnets.
nanomagnetic logic spintronics straintronics multiferroics four-state magnetic anisotropy magnetoresistive devices piezoelectric materials image recognition strain-based clocking NOR logic PMN-PT ultra low power devices Engineering
93	REGTEST - an Automatic & Adaptive GUI Regression Testing Tool.
Forsgren, Robert; Petersson Vasquez, Erik. January 2018
Software testing is very common and is done to increase the quality of, and confidence in, software. This report proposes a tool for GUI regression testing that uses image recognition to perform the steps of test cases. The problem with such a solution is that when a GUI has been changed, many test cases might break. For this reason REGTEST was created: a GUI regression testing tool that can handle one type of change made to a GUI component, such as a change in color, shape, location or text. This type of solution is interesting because setting up tests with such a tool can be very fast and easy, but a major drawback of using image recognition for GUI testing has been that it does not handle changes well. It can be compared to tools that use IDs to perform a test, where the actual appearance of a GUI component does not matter; it only matters that the ID stays the same. Such tools, however, require either underlying knowledge of the GUI component naming conventions or the use of tools that automatically construct XPath queries for the components. To verify that REGTEST can work as well as existing tools, it was compared against two professional tools, Ranorex and Kantu. In those tests, REGTEST proved very successful and performed close to or better than the other tools.
GUI Test Regression Regression test Machine vision Google Cloud Vision OpenCV Adaptive Automatic Automation Image recognition Web Testing Similarity. Other Engineering and Technologies Computer Engineering
94	Video Recommendation Based on Object Detection
Nyberg, Selma. January 2018
In this thesis, several machine learning domains have been combined to build a video recommender system based on object detection. The work combines two extensively studied research fields, recommender systems and computer vision, both of which are growing rapidly and are popular in commercial applications. To investigate the performance of the approach, three different content-based recommender systems have been implemented at Spotify, based on the following video features: object detections, titles and descriptions, and user preferences. These systems have then been evaluated and compared against each other, together with their hybridized result. Two algorithms have been implemented, the prediction algorithm and the top-N algorithm, of which the former is the more reliable source for evaluating the system's performance. The evaluation shows that the overall performance scores for predicting the users' liked and disliked videos range from about 40 % to 70 % for the prediction algorithm and from about 15 % to 70 % for the top-N algorithm. The approach based on object detection performs worse than the other approaches. Hence, there seems to be a low correlation between the user preferences and the video contents in terms of object detection data, so this data is not very suitable for describing the content of videos or for use in the recommender system. However, the results of this study cannot be generalized to other systems before the approach has been evaluated in other environments and on various data sets. Moreover, there is plenty of room for refinements and improvements to the system, and many interesting research areas remain for future work.
Spotify Machine Learning Artificial Intelligence Recommender Systems Content-Based Filtering Collaborative Filtering Hybrid Filtering Deep Learning Image Recognition Object Detection Natural Language Processing Paragraph Vectors Doc2Vec TensorFlow Classification K-Nearest Neighbors Cross-Validation Computer and Information Sciences
95	Méthodes fréquentielles pour la reconnaissance d'images couleur : une approche par les algèbres de Clifford / Frequency methods for color image recognition : An approach based on Clifford algebras
Mennesson, José. 18 November 2011
In this thesis, we focus on color image recognition using a new geometric approach in the frequency domain. Most existing methods only process grayscale images through descriptors defined from the usual Fourier transform. The extension of these methods to multichannel images, such as color images, usually consists of repeating the same processing on each channel. To avoid this marginal processing, we study and compare the different generalizations of the Fourier transform for color images. This work leads us to use the Clifford Fourier transform for color images, defined in the framework of geometric algebra. A detailed study of it leads us to define a fast algorithm and to propose a phase correlation method for color images. In a second step, with the aim of generalizing the Fourier descriptors of the literature through this Fourier transform, we study their properties, including invariance to translation, rotation and scale. This work leads us to propose three new descriptors called “generalized color Fourier descriptors” (GCFD), invariant to translation and rotation. The proposed methods are evaluated on standard image databases to estimate the contribution of color frequency content compared with grayscale and marginal methods. The results obtained using an SVM classifier show the potential of the proposed methods; the GCFD are more compact, have lower computational complexity and give better recognition rates. We also propose heuristics for choosing the parameter of the color Clifford Fourier transform. This thesis is a first step towards a generalization of frequency methods to multichannel images.
Fourier transform Image recognition Color Fourier descriptors Color phase correlation Clifford algebra Color images Geometric methods
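For readers unfamiliar with phase correlation, the sketch below shows the classical scalar version on a one-dimensional signal, using the usual Fourier transform. It is not the color method of entry 95, which relies on the Clifford Fourier transform; the function names `dft` and `phase_correlation_shift` and the tolerance `eps` are assumptions chosen for this example.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out

def phase_correlation_shift(a, b, eps=1e-12):
    """Estimate the circular shift s such that b[t] == a[(t - s) % N]."""
    fa, fb = dft(a), dft(b)
    cross = [y * x.conjugate() for x, y in zip(fa, fb)]
    # Keep only the phase of each bin; drop bins with near-zero magnitude.
    normalised = [c / abs(c) if abs(c) > eps else 0 for c in cross]
    corr = dft(normalised, inverse=True)
    # The inverse transform of the phase-only spectrum peaks at the shift.
    return max(range(len(corr)), key=lambda t: corr[t].real)
```

Because only the phase of the cross-power spectrum is kept, the inverse transform is close to an impulse located at the displacement, which is what makes the method robust to overall intensity changes.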
96	Charakterizace chodců ve videu / Pedestrian Attribute Analysis
Studená, Zuzana. January 2019
This work deals with obtaining information about pedestrians captured by static external cameras located in public outdoor or indoor spaces. The aim is to obtain as much information as possible. Information such as gender, age, type of clothing, accessories, fashion style or overall personality is obtained using convolutional neural networks. One part of the work consists of creating a new dataset that captures pedestrians and includes information about each person's sex, age and fashion style. Another part of the thesis is the design and implementation of convolutional neural networks that classify these pedestrian characteristics. The networks are evaluated on pedestrian images from the PETA, FashionStyle14 and BUT Pedestrian Attributes datasets. Experiments on the PETA and FashionStyle14 datasets compare the author's results with various convolutional neural networks described in the literature. Further experiments are presented on the newly created BUT Pedestrian Attributes dataset.
97	Detekce objektů pomocí Kinectu / Object Detection Using Kinect
Řehánek, Martin. January 2012
The release of the Kinect device opened new possibilities by making image depth easy to use in image processing. The aim of this thesis is to propose a method for object detection and recognition in a depth map. The well-known Bag of Words method and a descriptor based on the Spin Image method are used for object recognition. The Spin Image method is one of several existing approaches to depth maps that are described in this thesis. Objects are located in the image with the sliding window technique, which is improved and sped up by using the depth information.
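The abstract of entry 97 does not say how depth speeds up the sliding window, so the following is only one plausible sketch: positions without a usable depth reading are skipped, and the depth at each remaining position fixes a single window size instead of a search over many scales. The function name `depth_guided_windows`, the units (metres, with 0 meaning no reading) and every default value are assumptions.

```python
def depth_guided_windows(depth, object_size_m=0.3, focal_px=500.0,
                         stride=4, max_depth=4.0):
    """Yield (row, col, size) candidate windows for a 2D depth map.

    depth is a list of rows holding distances in metres; 0 marks a
    missing reading. All parameters are hypothetical illustration values.
    """
    rows, cols = len(depth), len(depth[0])
    for r in range(0, rows, stride):
        for c in range(0, cols, stride):
            z = depth[r][c]
            if z <= 0 or z > max_depth:
                continue  # no reading or too far away: skip this position
            # Pinhole model: apparent size in pixels shrinks as 1 / z.
            size = max(1, round(focal_px * object_size_m / z))
            yield r, c, size
```

Each yielded window would then be passed to the recognizer, so the saving comes from evaluating one scale per position and from discarding positions with no valid depth.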
98	Matching Sticky Notes Using Latent Representations / Matchning av klisterlappar med hjälp av latent representation
García San Vicent, Javier. January 2022
This project addresses the problem of accurately identifying repeated images of sticky notes. Because of environmental conditions and the 3D position of the camera, different pictures of sticky notes may look different enough that it is hard to determine whether they show the same note. More specifically, this thesis aims to create latent representations of these pictures that encode their content, so that all pictures of the same note have similar representations by which they can be identified. Those representations must therefore be invariant to lighting conditions, blur and camera position. To that end, a Siamese neural architecture is trained using data augmentation methods. The method consists of learning to embed two augmented versions of the same image into similar representations. The architecture has been trained with unsupervised learning and fine-tuned with supervised learning to detect whether or not two representations belong to the same note. The performance of ResNet, EfficientNet and Vision Transformers in encoding the images into their representations has been compared across different configurations. The results show that, while the most complex models overfit small amounts of data, the simplest encoders are capable of correctly identifying more than 95% of the sticky notes in grayscale. Those models can create invariant representations that are close to each other in the latent space for pictures of the same sticky note. Gathering more data could improve the performance of the model and make it possible to apply it to other fields, such as handwritten documents.
Pattern matching Image matching Image recognition Representation learning Unsupervised learning Semisupervised learning Siamese architecture Deep learning Transfer learning Computer and Information Sciences
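To make the idea of matching in the latent space concrete, the sketch below compares a query embedding with stored embeddings by cosine similarity and accepts the best match above a threshold. The thesis fine-tunes a supervised model to decide whether two representations belong to the same note; the fixed threshold, the names `cosine_similarity` and `match_note`, and the toy vectors are assumptions used in its place.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def match_note(query, gallery, threshold=0.9):
    """Return the id of the most similar stored embedding, or None.

    gallery maps a note id to its embedding. The threshold is a
    hypothetical stand-in for the learned same-note decision.
    """
    best_id, best_sim = None, threshold
    for note_id, embedding in gallery.items():
        sim = cosine_similarity(query, embedding)
        if sim >= best_sim:
            best_id, best_sim = note_id, sim
    return best_id
```

With invariant embeddings, two photos of the same note under different lighting should land close together and exceed the threshold, while photos of different notes should not.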
99	Mobilní aplikace využívající hlubokých konvolučních neuronových sítí / Mobile Application Using Deep Convolutional Neural Networks
Poliak, Sebastián. January 2018
This thesis describes the process of creating a mobile application that uses deep convolutional neural networks. The process starts with a proposal of the main idea, followed by product and technical design, implementation and evaluation. The thesis also explores the technical background of image recognition and chooses the options most suitable for the purpose of the application. These are object detection and multi-label classification, both of which are implemented, evaluated and compared. The resulting application aims to deliver value from both a user and a technical point of view.
100	Detekce vad vláknitého materiálu užitím metod strojového učení / Defect detection on fiber materials using machine learning
Lang, Matěj. January 2019
The aim of this master's thesis is to automate the detection of defects in fiber materials. The company SILON has been producing fine wadding from recycled PET bottles for more than fifty years. The wadding is used in construction and in the automotive industry, but most often in feminine hygiene products and baby diapers. The company aims to produce the highest-quality product possible, so every batch is tested in the laboratory against several strict criteria. One of the tests measures the amount of defective fibers, such as tangled clumps of fibers or undrawn fibers, which are hard and break easily. The proposed system consists of a scanning bench that works as a scanner and images a fiber sample placed between two glass plates. A series of tests with different lighting was carried out to verify the properties of Rhodamine, the dye used to distinguish defects from the other fibers. The defects generally have a different molecular structure, to which the dye binds better. Because Rhodamine is a fluorescent dye, it is easier to recognize under UV light, for example. This procedure is used in manual detection. When imaging with a camera, a filter on the camera can help by blocking the excitation light and passing only the light emitted by the Rhodamine. Building the scanner also included writing the control program. A custom library was created for controlling the motor, and the camera library was modified. Both systems could then be controlled through a single GUI, which handled capturing an image of the entire plate. The scanner was used to capture a series of images, which had to be annotated so that the computer could be taught to distinguish defects. Annotation was done at the pixel level; each defect was marked in a separate layer in a graphics editor. A convolutional artificial neural network was used to distinguish the defects. The network is fully convolutional, so its output is an image that should mark the defective pixels of the original. The results of the trained network are presented and discussed in the thesis. The network learned to recognize most defects and can reliably detect and segment them. It currently has difficulty with blurred defects at the edges of the field of view and with defects whose boundaries are not clearly visible in the input images. It should be noted that the customer is interested in a complete scanner solution including the detection software, and development of the device will continue after the completion of this thesis.
