Global ETD Search

41	Learning to Search for Targets : A Deep Reinforcement Learning Approach to Visual Search in Unseen Environments / Inlärd sökning efter mål Lundin, Oskar January 2022 (has links) Visual search is the perceptual task of locating a target in a visual environment. Due to applications in areas like search and rescue, surveillance, and home assistance, it is of great interest to automate visual search. An autonomous system can potentially search more efficiently than a manually controlled one and has the advantages of reduced risk and cost of labor. In many environments, there is structure that can be utilized to find targets quicker. However, manually designing search algorithms that properly utilize structure to search efficiently is not trivial. Different environments may exhibit vastly different characteristics, and visual cues may be difficult to pick up. A learning system has the advantage of being applicable to any environment where there is a sufficient number of samples to learn from. In this thesis, we investigate how an agent that learns to search can be implemented with deep reinforcement learning. Our approach jointly learns control of visual attention, recognition, and localization from a set of sample search scenarios. A recurrent convolutional neural network takes an image of the visible region and the agent's position as input. Its outputs indicate whether a target is visible and control where the agent looks next. The recurrent step serves as a memory that lets the agent utilize features of the explored environment when searching. We compare two memory architectures: an LSTM, and a spatial memory that remembers structured visual information. Through experimentation in three simulated environments, we find that the spatial memory architecture achieves superior search performance. It also searches more efficiently than a set of baselines that do not utilize the appearance of the environment and achieves similar performance to that of a human searcher. 
Finally, the spatial memory scales to larger search spaces and is better at generalizing from a limited number of training samples. visual search reinforcement learning deep learning computer vision autonomous systems
42	Region Proposal Based Object Detectors Integrated With an Extended Kalman Filter for a Robust Detect-Tracking Algorithm Khajo, Gabriel January 2019 (has links) In this thesis we present a detect-tracking algorithm (see figure 3.1) that combines the detection robustness of static region proposal based object detectors, like the faster region convolutional neural network (R-CNN) and the region-based fully convolutional networks (R-FCN) model, with the tracking prediction strength of extended Kalman filters, by using what we call a translating and non-rigid user-input region of interest (RoI) mapping. This RoI-mapping maps a region, which includes the object that one is interested in tracking, to a featureless three-channel image. The detection part of our proposed algorithm is then performed on the image that includes only the RoI features (see figure 3.2). After the detection step, our model re-maps the RoI features to the original frame, and translates the RoI to the center of the prediction. If no prediction occurs, our proposed model integrates a temporal dependence through a Kalman filter as a predictor; this filter is continuously corrected when detections do occur. To train the region proposal based object detectors that we integrate into our detect-tracking model, we used the TensorFlow® Object Detection API with random-search hyperparameter tuning, fine-tuning all models from TensorFlow® Slim base-network classification checkpoints. The trained region proposal based object detectors used the inception V2 base network for the faster R-CNN model and the R-FCN model, while the inception V3 base network was applied only to the faster R-CNN model. This was done to compare the two base networks and their effects on the detection models. In addition to the deep learning part of this thesis, we used Python and OpenCV® for the implementation part of our detect-tracking model, such as the extended Kalman filter.
The results show that, with a stationary camera reference frame, our proposed detect-tracking algorithm, combined with region proposal based object detectors on images of size 414 × 740 × 3, can detect and track a small object in real time, such as a tennis ball moving along a horizontal trajectory with an average velocity v ≈ 50 km/h at a distance d = 25 m, with a combined detect-tracking frequency of about 13 to 14 Hz. The largest state error between the actual state and the state predicted by the Kalman filter, at the aforementioned horizontal velocity, was measured at 10-15 pixels (see table 5.1), but in certain frames where many detections occur this error was much smaller (3-5 pixels). Additionally, thanks to the integrated extended Kalman filter, our combined detect-tracking model can handle obstacles and two overlapping learnable features. Lastly, our detect-tracking model was also applied to a set of infra-red images, where the goal was to detect and track a truck moving along a semi-horizontal path. Our results show that a faster R-CNN inception V2 model was able to extract features from a sequence of infra-red frames, and that our proposed RoI-mapping method worked relatively well at detecting only one truck in a short test sequence (see figure 5.22). Object Detection Extended Kalman Filter Tracking
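The predict-then-correct loop described in entry 42 can be illustrated with a much simpler filter. The sketch below is a linear, one-dimensional, constant-velocity Kalman filter, not the extended Kalman filter used in the thesis. The class name, the `track` helper, and all noise values are assumptions chosen for illustration. A missed detection is represented by `None`, in which case the filter only predicts.

```python
class ConstantVelocityKalman1D:
    """Linear Kalman filter with state (position x, velocity v). Illustrative only."""

    def __init__(self, x=0.0, v=0.0, p=1000.0, q=1e-3, r=4.0):
        self.x, self.v = x, v              # state estimate
        self.P = [[p, 0.0], [0.0, p]]      # state covariance (large = uncertain start)
        self.q, self.r = q, r              # assumed process and measurement noise

    def predict(self, dt=1.0):
        # State transition: x <- x + v * dt, v unchanged.
        self.x += self.v * dt
        (p00, p01), (p10, p11) = self.P
        # Covariance: P <- F P F^T + Q with F = [[1, dt], [0, 1]].
        self.P = [
            [p00 + dt * (p10 + p01) + dt * dt * p11 + self.q, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q],
        ]
        return self.x

    def correct(self, z):
        # Measurement model: only the position is observed (H = [1, 0]).
        (p00, p01), (p10, p11) = self.P
        s = p00 + self.r                   # innovation variance
        k0, k1 = p00 / s, p10 / s          # Kalman gain
        y = z - self.x                     # innovation
        self.x += k0 * y
        self.v += k1 * y
        # Covariance: P <- (I - K H) P.
        self.P = [
            [(1 - k0) * p00, (1 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]
        return self.x


def track(measurements, dt=1.0):
    """Run the filter over per-frame detections; None means no detection in that frame."""
    kf = ConstantVelocityKalman1D()
    estimates = []
    for z in measurements:
        kf.predict(dt)
        if z is not None:
            kf.correct(z)
        estimates.append(kf.x)
    return kf, estimates


# Ten frames with detections along x = 10 * t, then three frames without any.
kf, estimates = track([10.0 * t for t in range(1, 11)] + [None, None, None])
```

During the three frames without detections the estimate keeps moving at the learned velocity, which is the role the abstract assigns to the filter when the detector returns nothing.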
43	Dynamisk Kollisionsundvikande I Twin Stick shooter : Hastighetshinder och partikelseparation / Dynamic collision Avoidance In A twin stick shooter : Velocity Obstacle and particle seperation Bengtsson, Björn January 2019 (has links) This thesis compares collision avoidance and time efficiency between two methods, velocity obstacles and particle separation, in the twin stick shooter game genre. It seeks to answer the question: how do collision avoidance and time efficiency differ between the velocity obstacle and particle separation methods in twin stick shooters with flocking behavior? To answer the question, an artifact was created. In the artifact, agents chase a player while avoiding collisions with other agents, although the agents do aim to collide with the player. Various experiments are run in the artifact based on configured parameters. Each experiment runs for a fixed time, and all data on collisions and execution time for each method is saved to a text file. The results indicate that particle separation is better suited to twin stick shooters. Velocity obstacles produce fewer collisions, but their computation time is too high and scales poorly with the number of agents. This makes the method a poor fit for twin stick shooters, which usually have many agents on screen. The collision avoidance methods can also be applied to radio-controlled cars and robots, and to crowd simulation. Autonomous agents Steering behaviors Collision avoidance
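As a rough illustration of the particle separation (partikelseparation) idea in entry 43, the sketch below pushes an agent away from every neighbour that is closer than a given radius. This is one common form of separation steering, not necessarily the formulation used in the thesis. The function name, the linear falloff, and the default radius are assumptions.

```python
import math


def separation_force(agent, neighbours, radius=1.0):
    """Sum of repulsion vectors from neighbours closer than `radius` (hypothetical helper)."""
    ax, ay = agent
    fx = fy = 0.0
    for nx, ny in neighbours:
        dx, dy = ax - nx, ay - ny
        dist = math.hypot(dx, dy)
        if 0.0 < dist < radius:
            # Closer neighbours push harder; the push fades to zero at the radius.
            strength = (radius - dist) / radius
            fx += dx / dist * strength
            fy += dy / dist * strength
    return fx, fy


# One neighbour half a unit to the right pushes the agent to the left.
force = separation_force((0.0, 0.0), [(0.5, 0.0), (2.0, 0.0)])
```

Each agent performs only a distance check per neighbour, which fits the abstract's observation that this method is cheaper to compute than velocity obstacles when many agents are on screen.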
44	Dehazing of Satellite Images Hultberg, Johanna January 2018 (has links) The aim of this work is to find a method for removing haze from satellite imagery. This is done by taking two algorithms developed for images taken from the surface of the earth and adapting them for satellite images. The two algorithms are Single Image Haze Removal Using Dark Channel Prior by He et al. and Color Image Dehazing Using the Near-Infrared by Schaul et al. Both algorithms, altered to fit satellite images, plus the combination are applied on four sets of satellite images. The results are compared with each other and the unaltered images. The evaluation is both qualitative, i.e. looking at the images, and quantitative using three properties: colorfulness, contrast and saturated pixels. Both the qualitative and the quantitative evaluation determined that using only the altered version of Dark Channel Prior gives the result with the least amount of haze and whose colors look most like reality. Dehazing Satellite Images Dark Channel Remote sensing
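The dark channel that gives the He et al. method in entry 44 its name is the minimum intensity over all color channels within a local patch around each pixel. The sketch below computes only that quantity on a nested-list image. It does not include the rest of the dehazing pipeline or the satellite-specific alterations made in the thesis, and the function name and patch size are assumptions.

```python
def dark_channel(image, patch=3):
    """Dark channel of an image given as rows of (r, g, b) tuples with values in [0, 1]."""
    height, width = len(image), len(image[0])
    # Step 1: per-pixel minimum over the color channels.
    min_rgb = [[min(pixel) for pixel in row] for row in image]
    # Step 2: minimum over a square patch centered on each pixel (clipped at the borders).
    half = patch // 2
    result = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            rows = range(max(0, y - half), min(height, y + half + 1))
            cols = range(max(0, x - half), min(width, x + half + 1))
            result[y][x] = min(min_rgb[j][i] for j in rows for i in cols)
    return result


# A bright 3x3 image with one darker pixel in the center.
bright = (0.8, 0.9, 0.7)
image = [[bright] * 3, [bright, (0.2, 0.5, 0.9), bright], [bright] * 3]
dark = dark_channel(image, patch=3)
```

In haze-free regions the dark channel tends to be close to zero, so consistently high values are treated as an indication of haze.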
45	Improving Photogrammetry using Semantic Segmentation Kernell, Björn January 2018 (has links) 3D reconstruction is the process of constructing a three-dimensional model from images. It contains multiple steps where each step can induce errors. When doing 3D reconstruction of outdoor scenes, there are some types of scene content that regularly cause problems and affect the resulting 3D model. Two of these are water, due to its fluctuating nature, and sky, because it contains no useful (3D) data. These areas cause different problems throughout the process and generally do not benefit it in any way. Therefore, masking them early in the reconstruction chain could be a useful step in an outdoor scene reconstruction pipeline. Manual masking of images is a time-consuming and boring task, and it gets very tedious for the big data sets that are often used in large-scale 3D reconstructions. This master thesis explores whether this can be done automatically using Convolutional Neural Networks for semantic segmentation, and to what degree the masking would benefit a 3D reconstruction pipeline. / 3D reconstruction is the technology behind creating 3D models from images. It is a process with many steps, each of which can introduce errors. In 3D reconstruction of large outdoor environments, certain types of image content often cause problems. Two of these are water and sky. Water is problematic because it can fluctuate considerably from image to image and can contain reflections that look different from different angles. Sky, on the other hand, should never give rise to 3D information, so it might as well be masked out. Manual masking of images is very time-consuming and expensive. This thesis investigates whether this masking can be done automatically with convolutional neural networks for semantic segmentation, and how this could improve a 3D reconstruction process. photogrammetry semantic segmentation convolutional neural networks
46	A Single-Camera Gaze Tracker using Controlled Infrared Illumination Wallenberg, Marcus January 2009 (has links) Gaze tracking is the estimation of the point in space a person is “looking at”. This is widely used in both diagnostic and interactive applications, such as visual attention studies and human-computer interaction. The most common commercial solution used to track gaze today uses a combination of infrared illumination and one or more cameras. These commercial solutions are reliable and accurate, but often expensive. The aim of this thesis is to construct a simple single-camera gaze tracker from off-the-shelf components. The method used for gaze tracking is based on infrared illumination and a schematic model of the human eye. Based on images of reflections of specific light sources in the surfaces of the eye the user’s gaze point will be estimated. Evaluation is also performed on both the software and hardware components separately, and on the system as a whole. Accuracy is measured in spatial and angular deviation and the result is an average accuracy of approximately one degree on synthetic data and 0.24 to 1.5 degrees on real images at a range of 600 mm. gaze tracking eye tracking computer vision
47	Obstacle detection using stereo vision for unmanned ground vehicles Olsson, Martin January 2009 (has links) In recent years, the market for automatized surveillance and use of unmanned ground vehicles (UGVs) has increased considerably. In order for unmanned vehicles to operate autonomously, high level algorithms of artificial intelligence need to be developed and accompanied by some way to make the robots perceive and interpret the environment. The purpose of this work is to investigate methods for real-time obstacle detection using stereo vision and implement these on an existing UGV platform. To reach real-time processing speeds, the algorithms presented in this work are designed for parallel processing architectures and implemented using programmable graphics hardware. The reader will be introduced to the basics of stereo vision and given an overview of the most common real-time stereo algorithms in literature along with possible applications. A novel wide-baseline real-time depth estimation algorithm is presented. The depth estimation is used together with a simple obstacle detection algorithm, producing an occupancy map of the environment allowing for evasion of obstacles and path planning. In addition, a complete system design for autonomous navigation in multi-UGV systems is proposed. Depth estimation Stereo vision Obstacle detection UGV
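For readers unfamiliar with the stereo vision basics mentioned in entry 47, the standard relation for a rectified stereo pair is Z = f · B / d, where Z is depth, f the focal length in pixels, B the baseline between the cameras, and d the disparity in pixels. The sketch below applies that relation and flags pixels closer than a range threshold. It is not the wide-baseline algorithm or the occupancy map from the thesis, and the function names and numeric values are assumptions.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in meters for a rectified stereo pair; zero disparity means 'infinitely far'."""
    if disparity_px <= 0:
        return float("inf")
    return focal_px * baseline_m / disparity_px


def obstacle_mask(disparity_map, focal_px, baseline_m, max_range_m):
    """Mark every pixel whose estimated depth is within max_range_m (hypothetical helper)."""
    return [
        [depth_from_disparity(d, focal_px, baseline_m) <= max_range_m for d in row]
        for row in disparity_map
    ]


# Assumed camera: 700 px focal length, 0.5 m baseline; flag anything within 8 m.
mask = obstacle_mask([[70, 35, 0]], focal_px=700, baseline_m=0.5, max_range_m=8.0)
```

Because depth is inversely proportional to disparity, a wider baseline yields larger disparities at a given distance, which is one reason wide-baseline setups are attractive for estimating depth at longer ranges.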
48	Elevers hjälp till förståelse genom bilder : Hur bilder hjälper elever att förstå växthuseffekten Karlsson, Charlotte January 2008 (has links) The starting point of this study was to investigate how images help pupils understand the world around us. To find out, pupils were, among other things, asked to draw their own pictures. The pupils were tasked with explaining the greenhouse effect using pictures, and a selection of pupils were interviewed. The pictures were analyzed, and the transcribed interviews were compared with what the pupils had drawn. Some of the pupils' pictures contrast with their own views of how pictures should look in order to explain things and sequences of events well. Other pictures agree with their own views of how pictures that aid understanding should look. For most pupils, it was important that the picture contained explanatory text. Other important aspects were the use of different colors and that the pictures looked realistic. Whether or not a picture can show reality is also discussed in the essay. My conclusion from this work is that pupils appear to be helped by pictures as memory aids and for explaining connections, but a fair assessment of how pupils are helped requires a deeper study. This study may inspire reflection on how pupils are helped by pictures and perhaps stimulate further research. Using images in teaching
49	An image says more than words : a qualitative essay about the pictorial language of children and youth in Westafrica Exenberger, Margareta January 2007 (has links) The pictorial language of Swedish children is characterized by the idea that a “good” drawing should be in the right perspective and as photographically realistic as possible. This is a study of the pictorial language of children in the Gambia and Senegal. Does the pictorial language differ for children living in a culture that has a stronger tradition of spoken word and visual communication, compared with children living in Western civilisation? With the help of different theories concerning children’s creation of art, this study tries to sort out the differences. It also discusses different theories of developmental stages in children’s drawings and how culture, tradition and conventions influence both the pictorial grammar and the ideal image. The study is based on drawings collected in schools in the Gambia and Senegal, and the drawings are analysed with the help of theories in Karin Aronsson’s “Barns världar – barns bilder”. The study is also based on observations and interviews with children and teachers in a school in the Gambia. Pictorial language children West Africa
50	Detektering och Identifiering av Vägmärken / Road Sign Detection and Identification Palm, Magdalena January 2017 (has links) This dissertation describes a project to create a prototype application that simplifies road sign inventory. Instead of manually analyzing a captured image and comparing it against a database of road signs, the user can launch the application, load the image, and get the identification code of the road sign. The idea is that road sign inventory takers save the time it takes to review all the pictures taken during a day, because the system automatically records the identification code of each road sign. The basic application is written as a WPF application using the EmguCV framework, which in turn uses the .NET framework. 
The important aspect of this project is to see whether road signs can be matched with reasonable computing power and within a reasonable time; this was made possible by EmguCV's FLANN algorithm. The project resulted in a functional application in which users can upload images of circular speed limit signs, which the application detects and then identifies by matching them against a database of road signs. computer vision WPF openCV emguCV road signs detection

Search results