Global ETD Search

51	Probabilistic incremental learning for image recognition : modelling the density of high-dimensional data Carvalho, Edigleison Francelino January 2014 Atualmente diversos sistemas sensoriais fornecem dados em fluxos e essas observações medidas são frequentemente de alta dimensionalidade, ou seja, o número de variáveis medidas é grande, e as observações chegam em sequência. Este é, em particular, o caso de sistemas de visão em robôs. Aprendizagem supervisionada e não-supervisionada com esses fluxos de dados é um desafio, porque o algoritmo deve ser capaz de aprender com cada observação e depois descartá-la antes de considerar a próxima, mas diversos métodos requerem todo o conjunto de dados a fim de estimar seus parâmetros e, portanto, não são adequados para aprendizagem em tempo real. Além disso, muitas abordagens sofrem com a denominada maldição da dimensionalidade (BELLMAN, 1961) e não conseguem lidar com dados de entrada de alta dimensionalidade. Para superar os problemas descritos anteriormente, este trabalho propõe um novo modelo de rede neural probabilístico e incremental, denominado Local Projection Incremental Gaussian Mixture Network (LP-IGMN), que é capaz de realizar aprendizagem perpétua com dados de alta dimensionalidade, ou seja, ele pode aprender continuamente considerando a estabilidade dos parâmetros do modelo atual e automaticamente ajustar sua topologia levando em conta a fronteira do subespaço encontrado por cada neurônio oculto. O método proposto pode encontrar o subespaço intrínseco onde os dados se localizam, o qual é denominado de subespaço principal. Ortogonal ao subespaço principal, existem as dimensões que são ruidosas ou que carregam pouca informação, ou seja, com pouca variância, e elas são descritas por um único parâmetro estimado. Portanto, LP-IGMN é robusta a diferentes fontes de dados e pode lidar com grande número de variáveis ruidosas e/ou irrelevantes nos dados medidos. 
Para avaliar a LP-IGMN nós realizamos diversos experimentos usando conjunto de dados simulados e reais. Demonstramos ainda diversas aplicações do nosso método em tarefas de reconhecimento de imagens. Os resultados mostraram que o desempenho da LP-IGMN é competitivo, e geralmente superior, com outras abordagens do estado da arte, e que ela pode ser utilizada com sucesso em aplicações que requerem aprendizagem perpétua em espaços de alta dimensionalidade. / Nowadays several sensory systems provide data in flows and these measured observations are frequently high-dimensional, i.e., the number of measured variables is large, and the observations arrive in a sequence. This is in particular the case of robot vision systems. Unsupervised and supervised learning with such data streams is challenging, because the algorithm should be capable of learning from each observation and then discarding it before considering the next one, but several methods require the whole dataset in order to estimate their parameters and, therefore, are not suitable for online learning. Furthermore, many approaches suffer from the so-called curse of dimensionality (BELLMAN, 1961) and cannot handle high-dimensional input data. To overcome the problems described above, this work proposes a new probabilistic and incremental neural network model, called Local Projection Incremental Gaussian Mixture Network (LP-IGMN), which is capable of performing life-long learning with high-dimensional data, i.e., it can continuously learn considering the stability of the current model's parameters and automatically adjust its topology taking into account the subspace's boundary found by each hidden neuron. The proposed method can find the intrinsic subspace where the data lie, which is called the principal subspace. Orthogonal to the principal subspace, there are the dimensions that are noisy or carry little information, i.e., with small variance, and they are described by a single estimated parameter. 
Therefore, LP-IGMN is robust to different sources of data and can deal with a large number of noisy and/or irrelevant variables in the measured data. To evaluate LP-IGMN we conducted several experiments using simulated and real datasets. We also demonstrated several applications of our method in image recognition tasks. The results have shown that the LP-IGMN performance is competitive with, and usually superior to, other state-of-the-art approaches, and it can be successfully used in applications that require life-long learning in high-dimensional spaces. Redes neurais Inteligência artificial Local projection Probabilistic learning Online learning Incremental learning High-dimensional data Gaussian mixture models Image recognition
52	Reconhecimento automático de defeitos de fabricação em painéis TFT-LCD através de inspeção de imagem SILVA, Antonio Carlos de Castro da 15 January 2016 A detecção prematura de defeitos nos componentes de linhas de montagem de fabricação é determinante para a obtenção de produtos finais de boa qualidade. Partindo desse pressuposto, o presente trabalho apresenta uma plataforma desenvolvida para detecção automática dos defeitos de fabricação em painéis TFT-LCD (Thin Film Transistor-Liquid Cristal Displays) através da realização de inspeção de imagem. A plataforma desenvolvida é baseada em câmeras, sendo o painel inspecionado posicionado em uma câmara fechada para não sofrer interferência da luminosidade do ambiente. As etapas da inspeção consistem em aquisição das imagens pelas câmeras, definição da região de interesse (detecção do quadro), extração das características, análise das imagens, classificação dos defeitos e tomada de decisão de aprovação ou rejeição do painel. A extração das características das imagens é realizada tomando tanto o padrão RGB como imagens em escala de cinza. Para cada componente RGB a intensidade de pixels é analisada e a variância é calculada, se um painel apresentar variação de 5% em relação aos valores de referência, o painel é rejeitado. A classificação é realizada por meio do algorítimo de Naive Bayes. 
Os resultados obtidos mostram um índice de 94,23% de acurácia na detecção dos defeitos. Está sendo estudada a incorporação da plataforma aqui descrita à linha de produção em massa da Samsung em Manaus. / The early detection of defects in the parts used in manufacturing assembly lines is crucial for assuring the good quality of the final product. Thus, this work presents a platform developed for automatically detecting manufacturing defects in TFT-LCD (Thin Film Transistor-Liquid Crystal Display) panels by image inspection. The developed platform is based on cameras. The panel under inspection is positioned in a closed chamber to avoid interference from ambient light. The inspection steps encompass image acquisition by the cameras, setting the region of interest (frame detection), feature extraction, image analysis, classification of defects, and the decision to approve or reject the panel. Features are extracted from the acquired images using both the RGB standard and grayscale images. For each RGB component the pixel intensity is analyzed and the variance is calculated. A panel is rejected if it shows a variation of 5% relative to the reference values. The classification is performed using the Naive Bayes algorithm. The results obtained show an accuracy rate of 94.23% in defect detection. Samsung (Manaus) is considering the possibility of incorporating the platform described here into its mass production line. TFT-LCD plataforma reconhecimento de imagem detecção automática classificador Naive Bayes TFT-LCD platform image recognition automatic detection Naive Bayes classifier
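The rejection rule in record 52 (per-channel variance compared against reference values with a 5% margin) can be illustrated with a small sketch. This is not the thesis's implementation: the function names, the use of population variance, and the reading of "5%" as a relative deviation of each channel's variance from its reference are all assumptions made for illustration.

```python
from statistics import pvariance


def channel_variances(pixels):
    """Return the population variance of the R, G and B channels.

    `pixels` is a sequence of (r, g, b) tuples (hypothetical input format).
    """
    reds, greens, blues = zip(*pixels)
    return tuple(pvariance(channel) for channel in (reds, greens, blues))


def panel_rejected(pixels, reference_variances, tolerance=0.05):
    """Reject the panel when any channel variance deviates from its
    reference by more than `tolerance` (assumed to be a relative margin)."""
    measured = channel_variances(pixels)
    for value, reference in zip(measured, reference_variances):
        if abs(value - reference) > tolerance * reference:
            return True
    return False


# Hypothetical reference panel and a panel with one much brighter pixel.
reference_panel = [(100, 100, 100), (110, 105, 102), (120, 110, 104)]
suspect_panel = [(100, 100, 100), (110, 105, 102), (140, 110, 104)]

reference = channel_variances(reference_panel)
print(panel_rejected(reference_panel, reference))
print(panel_rejected(suspect_panel, reference))
```

In the abstract the variance check is followed by a Naive Bayes classifier; that stage is not sketched here because the abstract does not describe its features or classes.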
53	Avståndsvarnare till Mobiltelefon Johansson, Joakim January 2011 This report describes a study, development and testing of parts of and techniques for an application adapted to the operating system Android. The application is supposed to measure the distance to a car ahead. Apart from distance measurements, the ability of the application to calculate its own speed with the help of GPS is tested. From these two parameters, speed and distance, and some constants, the theoretical stopping distance of the car will be calculated in order to warn the driver if the car is too close to the car ahead in relation to its own speed and stopping distance. Tests were conducted on the different applications that were programmed and the result showed that the camera technique in the mobile phone itself limits the maximum distance of the distance measurement application. The maximum distance the tests in this thesis revealed was approximately 5 meters. The measurements of the GPS speed calculating application showed that the application was more accurate than the speedometer in the test car. The result of this thesis was that if all the parts were to be put together into a single application, the maximum speed at which it could be used with some functionality was 13.8 km/h, assuming that the car ahead is at a standstill and the camera on the mobile phone is in a straight line from the license plate. / Denna uppsats beskriver en studie, utveckling och testning av delar och teknik till en applikation anpassad till operativsystemet Android. Applikationen skall kunna mäta avståndet till framförvarande bil. Utöver avståndsmätning så testas applikationens förmåga att kalkylera sin egna hastighet med hjälp av GPS. Utifrån dessa två parametrar, hastighet och avstånd samt några konstanter skall den teoretiska stoppsträckan kunna räknas ut för att kunna varna om fordonet är för nära framförvarande bil i förhållande till sin egna fart. 
Tester utfördes på de olika applikationerna som programmerades och resultatet visade att tekniken i sig sätter stopp för att kunna mäta avståndet till nummerplåten på ett längre avstånd än ca 5m. Mätning av hastigheten var mer noggrann än hastighetsmätaren i bilen. Resultatet blev att om alla delar sattes ihop till en enda applikation så skulle den i bästa fall kunna användas i maximalt 13,8 km/h förutsatt att framförvarande bil är stillastående, och att kameran från telefonen är i en rak vinkel mot framförvarande nummerplåt. Image recognition Android OpenCv Computer Vision Bildigenkänning OpenCv Android Computer Vision Computer Engineering Datorteknik Software Engineering Programvaruteknik
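The stopping-distance warning described in record 53 can be sketched as follows. The abstract does not state which formula or constants the thesis used, so this sketch assumes the common textbook model (reaction distance plus braking distance) with an assumed reaction time of 1 s and an assumed friction coefficient of 0.8; all names are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2


def stopping_distance_m(speed_kmh, reaction_time_s=1.0, friction=0.8):
    """Theoretical stopping distance: reaction distance + braking distance.

    reaction_time_s and friction are assumed constants, not values from the thesis.
    """
    v = speed_kmh / 3.6  # convert km/h to m/s
    return v * reaction_time_s + v ** 2 / (2 * friction * G)


def too_close(measured_distance_m, speed_kmh):
    """Warn when the measured gap is shorter than the stopping distance."""
    return measured_distance_m < stopping_distance_m(speed_kmh)


print(round(stopping_distance_m(13.8), 2))
print(too_close(4.0, 13.8))
```

With these assumed constants the stopping distance at 13.8 km/h comes out at roughly 4.8 m, which is of the same order as the approximately 5 m measuring range reported in the abstract. That agreement may be coincidental, since the thesis's actual constants are not given here.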
54	Development of three AI techniques for 2D platform games Persson, Martin January 2005 This thesis serves as an introduction for anyone who has an interest in artificial intelligence in games and has experience in programming, or anyone who knows nothing of computer games but wants to learn about them. The first part presents a brief introduction to AI, then gives an introduction to games and game programming for someone who has little knowledge about games. This part includes game programming terminology, different game genres and a little history of games. Then there is an introduction to a couple of common techniques used in game AI. The main contribution of this dissertation is in the second part, which introduces three techniques that were never properly implemented before 3D games took over the market and explains how they would be done if they were to live up to today’s standards and demands. These are: line of sight, image recognition and pathfinding. These three techniques are used in today’s 3D games, so if a 2D game were to be released today the demands on the AI would be much higher than they were ten years ago when 2D games stagnated. The last part is an evaluation of the three discussed topics. Artificial intelligence AI Game 2D games Platform games Line of sight Image recognition Pathfinding Computer Sciences Datavetenskap (datalogi)
55	Detection of safety equipment in the manufacturing industry using image recognition / Detektering av säkerhetsutrustning i tillverkningsindustrin med hjälp av bildigenkänning Hallonqvist, Linn, Cromsjö, Mimmi January 2021 Safety is an essential part of the paper industry, as the industry can be very hazardous and accidents can lead to serious injuries for the people involved. In order to mitigate and prevent accidents, it has been shown that proactive measures are of great value. One type of proactive measure is the use of Personal Protective Equipment (PPE), such as gloves, hard hats, safety glasses and reflective vests. Although wearing PPE is often required in a workplace, its use is not always guaranteed, and non-usage can affect the safety of workers. To detect unsafe conditions, such as non-usage of PPE, automated video monitoring with image recognition can be useful. The intention of this work is to investigate whether an image recognition model can be created using the cloud service Azure and used in a system that can detect PPE, which in this work is limited to reflective vests. The work results in an artifact using an image recognition model. Additionally, this work examines how the training data can affect the model's performance. It is found that the model can be improved by training it on images with varying backgrounds, angles, distances, and occlusions. While there are many advantages to automated monitoring, its use can raise questions regarding the privacy of the people being monitored and how it can be perceived in a workplace. Therefore, this thesis examines the privacy concerns and attitudes regarding an image recognition system for monitoring. This is accomplished by performing a literature study and interviews with employees at a paper mill. The results reveal challenges with systems for automated monitoring as well as factors that can affect how employees feel about them. 
Machine learning Image recognition Occupational safety Personal Protective Equipment Azure Monitoring Other Computer and Information Science Annan data- och informationsvetenskap
56	WaldBoost na GPU / WaldBoost on GPU Polok, Lukáš January 2009 Image recognition, and machine vision in general, is a quickly evolving field, owing to the boom in cheap and powerful computation technologies. Image recognition has many different applications in a wide spectrum of industries, ranging from communications through security to entertainment. Algorithms for image recognition are still evolving and are often quite computationally demanding. That is why some authors deal with implementing the algorithms on specialized hardware accelerators. This work describes an implementation of image recognition using the WaldBoost algorithm on the graphics accelerator (GPU) platform.
57	Omnidirectional pong playing robot : Pong playing robot using kiwi drive and a PID controller / Flerdimensionell pongrobot Björklund, Filip, Strand, Christopher January 2019 The goal of this project was to determine the flexibility of an omnidirectional robot with a physical implementation of the video game pong. A robot was created to follow and catch a ball and could play against a human player. The challenge of the project was to create a stable system that could move in a straight path and catch the ball within a reasonable distance from the other player. A camera was used to implement an image recognition system that could determine the two-dimensional position of the ball, and hard-coded values for the size of the ball were used to simulate a three-dimensional position. Given these values, the robot was able to follow the ball and push the ball when close. For the omnidirectional system, a so-called kiwi drive with three DC motors and omnidirectional wheels was used. Ultrasonic sensors were also used to stop the robot if it came too close to a wall. To make the robot move in a straight path, control theory together with a compass module was used to measure the angular error, which was fed as feedback to the system. This enabled the robot to travel in a straight path and catch the ball. The results of the project showed that it is possible to control an omnidirectional robot with control theory in a stable manner. Using image recognition with a web camera together with OpenCV is fast enough to create a fast robotic system that can successfully complete a given task. / Detta projekts mål var att analysera hur flexibel det går att göra en robot med flerdimensionella hjul, det vill säga en robot som har hjul som gör att den kan röra sig med tre frihetsgrader. Detta gjordes genom att implementera en fysisk version av datorspelet pong. I projektet byggdes en robot som kunde följa och fånga en boll samt spela mot en mänsklig spelare. 
Utmaningen i projektet var att skapa ett stabilt system som kunde möjliggöra för roboten att färdas en rak väg och fånga bollen inom ett rimligt avstånd från motspelaren. En webbkamera användes för att implementera ett bildigenkänningssystem som kunde avgöra den tvådimensionella positionen för bollen och hårdkodade värden på bollens storlek användes för att simulera en tredimensionell position. Givet dessa värden lyckades roboten följa efter bollen och trycka ifrån den när bollen närmade sig. Tre stycken DC-motorer med tillhörande hjul användes för att skapa en treaxlig konfiguration för det flerdimensionella systemet. Ultraljudssensorer användes för att stanna roboten om den kom för nära en vägg i spelplanen. För att få roboten att röra sig längs en rak linje användes en kompassmodul för att mäta vinkelfelet som uppstod när roboten körde på ett felaktigt sätt. Detta vinkelfel användes som återkoppling för en PID-regulator vilket i sin tur möjliggjorde för roboten att kunna följa och fånga bollen längs en rak linje. Resultaten från projektet visade att en flerdimensionell robot går att kontrollera på ett stabilt sätt genom en PID-regulator och bildigenkänning med hjälp av en webbkamera och OpenCV är tillräckligt snabbt för att kunna skapa ett robotsystem som kan lösa en given uppgift. Mechatronics Omnidirectional Kiwi drive Image recognition Control theory Mekatronik Bildigenkänning Flerdimensionell robot Reglerteknik Engineering and Technology Teknik och teknologier
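The heading correction described in record 57 (angular error from a compass module fed back to a PID controller) can be sketched in a few lines. This is a generic textbook PID loop, not the project's code: the class, the gains, and the angle-wrapping helper are hypothetical, and the abstract does not say how the controller output was mapped to the three motors.

```python
class PID:
    """Minimal discrete PID controller (generic sketch, gains are assumptions)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        """Return the control output for the current error and time step."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no derivative term on the first sample
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def heading_error_deg(target_deg, measured_deg):
    """Smallest signed angle from the measured to the target heading,
    wrapped into [-180, 180) so the robot turns the short way round."""
    return (target_deg - measured_deg + 180.0) % 360.0 - 180.0


# Hypothetical usage: compass reads 350 degrees while the target is 10 degrees.
pid = PID(kp=1.0, ki=0.5, kd=0.0)
error = heading_error_deg(10.0, 350.0)
correction = pid.update(error, dt=0.1)
print(error, correction)
```

Wrapping the error matters for compass headings: without it, a reading of 350 degrees against a target of 10 degrees would look like a 340 degree error rather than a 20 degree one.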
58	Detection and categorization of suggestive thumbnails : A step towards a safer internet / Upptäckt och kategorisering av suggestiva miniatyrer : Ett steg mot ett säkrare internet Oliveira Franca, Matheus January 2021 The aim of this work is to compare methods that predict whether an image has suggestive content, such as pornographic images and erotic fashion. Using binary classification, this work contributes to an internet environment where these images are not seen out of context. It is, therefore, necessary for user experience improvement purposes, such as child protection, publishers not having their campaign associated with inappropriate content, and companies improving their brand safety. For this study, a data set with more than 500k images was created to test the Convolutional Neural Network (CNN) models: NSFW model, ResNet, EfficientNet, BiT, NudeNet and Yahoo Model. The image classification models EfficientNet-B7 and Big Transfer (BiT) presented the best results, with over 91% of samples correctly classified on the test set and precision and recall around 0.7. Model prediction was further investigated using Local Interpretable Model-agnostic Explanation (LIME), a model explainability technique, and it was concluded that the model uses regions of the thumbnail that are coherent from a human perspective, such as legs, abdomen, and chest, to classify images as unsafe. / Syftet med detta arbete är att jämföra metoder som förutsäger om en bild har suggestivt innehåll, såsom pornografiska bilder och erotiskt mode. Med binär klassificering bidrar detta arbete till en internetmiljö där dessa bilder inte ses ur sitt sammanhang. Det är därför nödvändigt för att förbättra användarupplevelsen, till exempel barnskydd, utgivare som inte har sina kampanjer kopplade till olämpligt innehåll och företag som förbättrar deras varumärkessäkerhet. 
För denna studie skapades en datamängd med mer än 500 000 bilder för att testa Convolutional Neural Networks (CNN) modeller: NSFW-modell, ResNet, EfficientNet, BiT, NudeNet och Yahoo-modell. Bild klassificerings modellen EfficientNet-B7 och Big Transfer (BiT) presenterade de bästa resultaten med över 91 % prover korrekt klassificerade på testuppsättningen, med precision och återkallelse runt 0,7. Modell förutsägelse undersöktes ytterligare med hjälp av Local Interpretable Model-agnostic Explanation (LIME), en modell förklarbarhetsteknik, och drog slutsatsen att modellen använder sammanhängande regioner i miniatyren enligt ett mänskligt perspektiv såsom ben, buk och bröst för att klassificera bilder som osäkra. Suggestive Image Recognition Deep Learning NSFW CNN Förslag på bildigenkänning Deep Learning NSFW CNN Computer Sciences Datavetenskap (datalogi)
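Record 58 reports over 91% correctly classified samples alongside precision and recall of about 0.7. The short sketch below shows how those three figures relate for a binary classifier. The confusion-matrix counts are hypothetical, chosen only to show that high accuracy and lower precision and recall can coexist when the "unsafe" class is a minority; they are not the thesis's data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts for 1000 thumbnails, 100 of which are truly unsafe.
accuracy, precision, recall = classification_metrics(tp=70, fp=30, fn=30, tn=870)
print(accuracy, precision, recall)
```

Here most of the accuracy comes from the large number of correctly classified safe images, which is why precision and recall on the unsafe class are worth reporting separately.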
59	Track the number of people in a premises in real time / Spåra antalet personer i en lokal i realtid Heidar, Hamza January 2022 Det har blivit allt vanligare att inomhusverksamheter vill kunna bevaka antalet personer som befinner sig i deras lokaler. Att manuellt räkna antalet personer eller att använda sig utav rörelsesensorer har olika nackdelar. På grund av den anledningen är det lämpligt att utforska andra tekniska och mer automatiserade lösningar, som använder sig utav enkla komponenter. Litteraturstudien gav en förståelse om bildanalys och vilka tekniska verktyg som kan användas för att analysera bilder. Amazon Rekognition och OpenCV är två av de verktyg som användes för att kunna bygga en prototyp, som kan räkna antalet personer i en lokal i realtid. Resultatet visade att en lösning med OpenCV inte är möjlig, med de kunskaper litteraturstudien gav. Resultatet ifrån Amazon Rekognition indikerar att det är möjligt att räkna antalet personer med väldigt hög noggrannhet och precision. Precis som att en människa kan bli distraherad, kan även prototypen missa enstaka personer. Amazon Rekognition kunde även särskilja människor ifrån andra objekt, vilket en rörelsesensor inte kan göra. Resultatet visade även fåtal brister så som dålig responstid, men dessa brister hade kunnat åtgärdas ifall mer tid återstod. / It has become increasingly common for indoor businesses to want to monitor the number of people who are in their premises. Manually counting the number of people or using motion sensors has various disadvantages. For this reason, it is advisable to explore other technical and more automated solutions, which use simple components. The literature study provided an understanding of image analysis and the technical tools that can be used to analyze images. Amazon Rekognition and OpenCV are two of the tools used to build two prototypes that can count the number of people in a room in real time. 
The results showed that a solution with OpenCV is not possible with the knowledge the literature study provided. The result from Amazon Rekognition indicates that it is possible to count the number of people with very high accuracy and precision. Just as a human being can be distracted, the prototype can also miss individual people. Amazon Rekognition could also distinguish people from other objects, which a motion sensor cannot do. The results also showed a few shortcomings such as poor response time, but these shortcomings could have been remedied if more time had remained. Person counter image recognition Amazon Web Service automation Personräknare bildigenkänning Amazon Web Service automatisering Computer and Information Sciences Data- och informationsvetenskap
60	Traffic Sign Recognition Using Machine Learning / Igenkänning av parkeringsskyltar med hjälp av maskininlärning Sharif, Sharif, Lilja, Joanna January 2020 Computer vision is an area in computer science that attempts to give computers the ability to see and recognise objects using varying sources of input, such as video or pictures. This problem is usually solved by using artificial intelligence (AI) techniques, the most common being deep learning. The project investigates the possibility of using these techniques to recognise traffic signs in real time. This would make it possible in the future to build a user application that does this. The case study gathers information about available AI techniques, and three object detection deep learning models are selected. These are YOLOv3, SSD, and Faster R-CNN. The chosen models are used in a case study to find out which one is best suited to the task of identifying parking signs in real time. Faster R-CNN performed the best in terms of recall and precision combined. YOLOv3 lagged behind in recall, but this could be because of how we chose to label the training data. Finally, SSD performed the worst in terms of recall, but was also relatively fast. Evaluation of the case study shows that it is possible to detect parking signs in real time. However, the hardware necessary is more powerful than that offered by currently available mobile platforms. Therefore it is concluded that a cloud solution would be optimal if the techniques tested were to be implemented in a parking sign reading mobile app. / Datorseende är ett område inom datorvetenskap som fokuserar på att ge maskiner förmågan att se och känna igen objekt med olika typer av input, såsom bilder eller video. Detta är ett problem som ofta löses med hjälp av artificiell intelligens (AI). Mer specifikt, djupinlärning. I detta projekt undersöks möjligheten att använda djupinlärning för att känna igen trafikskyltar i realtid. 
Detta så att i framtiden kunna bygga en applikation, som kan byggas att känna igen parkeringsskyltar i realtid. Fallstudien samlar information om tillgängliga AI-tekniker, och tre djupinlärningsmodeller väljs ut. Dessa är YOLOv3, SSD, och Faster R-CNN. Dessa modeller används i en fallstudie för att ta reda på vilken av dem som är bäst lämpad för uppgiften att känna igen parkeringsskyltar i realtid. Faster R-CNN presterade bäst vad gäller upptäckande av objekt och precision tillsammans. YOLOv3 upptäckte färre objekt, men det är sannolikt att detta berodde på hur vi valde att markera träningsdatan. Slutligen upptäckte SSD minst antal objekt, men presterade också relativt snabbt. Bedömning av fallstudien visar att det är möjligt att känna igen parkeringsskyltar i realtid. Den nödvändiga hårdvaran är dock kraftfullare än den som erbjuds av mobiler för närvarande. Därför dras slutsatsen att en molnlösning skulle vara optimal, om de testade teknikerna skulle användas för att implementera en app för att känna igen parkeringsskyltar. Computer and Information Sciences Data- och informationsvetenskap
