311 |
Facial Gestures for Infotainment Systems. Tantai, Along; Chen, Da. January 2014
The long-term purpose of this project is to reduce the attention demand on drivers when using infotainment systems in a car setting. With the development of the car industry, a contradiction between safety issues and entertainment demands in cars has arisen. Speech-recognition-based controls meet their bottleneck in the presence of background audio (such as engine noise, other passengers' speech and/or the infotainment system itself). In this thesis we propose a new method to control the infotainment system using computer vision technology. The project uses algorithms for object detection, optical flow (estimated motion) and feature analysis to build a communication channel between human and machine. By tracking the driver's head and measuring the optical flow over the lip region, the state of the driver's mouth can be inferred. The efficiency and accuracy of the system are analyzed. The contribution of this thesis is a method for communicating with the system through facial gestures, with particular focus on the movement of the lips. This method opens the possibility of a new mode of interaction between human and machine.
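The mouth-feature measurement can be illustrated with a much simpler stand-in for optical flow: mean frame differencing over a lip region. The region coordinates and the threshold below are hypothetical, and frame differencing is only a crude proxy; the thesis itself tracks the head and computes actual optical flow.

```python
import numpy as np

def lip_motion_score(prev_frame, cur_frame, lip_roi):
    """Mean absolute intensity change inside the lip region.

    lip_roi is (top, bottom, left, right) in pixel coordinates,
    e.g. obtained from a face/mouth detector.
    """
    t, b, l, r = lip_roi
    prev = prev_frame[t:b, l:r].astype(np.float32)
    cur = cur_frame[t:b, l:r].astype(np.float32)
    return float(np.abs(cur - prev).mean())

def mouth_moving(prev_frame, cur_frame, lip_roi, threshold=10.0):
    # A gesture "event" fires when the motion score exceeds an
    # empirically chosen threshold (the value here is illustrative).
    return lip_motion_score(prev_frame, cur_frame, lip_roi) > threshold
```

A real system would replace the differencing with a dense optical-flow estimate over the same region, but the thresholding logic stays the same.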
|
312 |
Deep Learning for Autonomous Collision Avoidance. Strömgren, Oliver. January 2018
Deep learning has grown rapidly in recent years, obtaining excellent results for many computer vision applications such as image classification and object detection. One reason for its increased popularity is that it mitigates the need for hand-crafted features. This thesis investigates deep learning as a methodology to solve the problem of autonomous collision avoidance for a small robotic car. To accomplish this, transfer learning is used with the VGG16 deep network pre-trained on the ImageNet dataset. A dataset has been collected and then used to fine-tune and validate the network offline. The deep network has then been used with the robotic car in real time: the car sends images to an external computer that runs the network, and the predictions are sent back to the car, which takes actions based on them. The results show that deep learning has great potential for solving the collision avoidance problem.
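The remote-inference loop described above, with the car sending images out and acting on the returned predictions, can be sketched as follows. The three-way steering label set is a hypothetical assumption; the thesis does not list the exact output classes of the fine-tuned network.

```python
def action_from_prediction(probs, labels=("forward", "left", "right")):
    """Map the network's class probabilities to a driving action.

    `labels` is an illustrative assumption, not the thesis' actual
    class set. The highest-probability class wins.
    """
    best = max(range(len(probs)), key=lambda i: probs[i])
    return labels[best]

def control_step(predict, image):
    """One iteration of the remote-inference loop.

    `predict` stands in for the round trip to the external computer
    running the network; the car then acts on the returned prediction.
    """
    probs = predict(image)
    return action_from_prediction(probs)
```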
|
313 |
Cognition Rehearsed : Recognition and Reproduction of Demonstrated Behavior / Robotövningar : Igenkänning och återgivande av demonstrerat beteende. Billing, Erik. January 2012
The work presented in this dissertation investigates techniques for robot Learning from Demonstration (LFD). LFD is a well-established approach in which the robot learns from a set of demonstrations. The dissertation focuses on LFD where a human teacher demonstrates a behavior by controlling the robot via teleoperation. After demonstration, the robot should be able to reproduce the demonstrated behavior under varying conditions. In particular, the dissertation investigates techniques where previous behavioral knowledge is used as bias for generalization of demonstrations. The primary contribution of this work is the development and evaluation of a semi-reactive approach to LFD called Predictive Sequence Learning (PSL). PSL has many interesting properties when applied as a learning algorithm for robots: few assumptions are introduced and little task-specific configuration is needed. PSL can be seen as a variable-order Markov model that progressively builds up the ability to predict or simulate future sensory-motor events, given a history of past events. The knowledge base generated during learning can be used to control the robot such that the demonstrated behavior is reproduced. The same knowledge base can also be used to recognize an ongoing behavior by comparing predicted sensor states with actual observations. Behavior recognition is an important part of LFD, both as a way to communicate with the human user and as a technique that allows the robot to use previous knowledge as parts of new, more complex controllers. In addition to the work on PSL, this dissertation provides a broad discussion on representation, recognition, and learning of robot behavior. LFD-related concepts such as demonstration, repetition, goal, and behavior are defined and analyzed, with focus on how bias is introduced by the use of behavior primitives. This analysis results in a formalism where LFD is described as transitions between information spaces.
Assuming that the behavior recognition problem is partly solved, ways to deal with remaining ambiguities in the interpretation of a demonstration are proposed. The evaluation of PSL shows that the algorithm can efficiently learn and reproduce simple behaviors. The algorithm is able to generalize to previously unseen situations while maintaining the reactive properties of the system. As the complexity of the demonstrated behavior increases, knowledge of one part of the behavior sometimes interferes with knowledge of other parts. As a result, different situations with similar sensory-motor interactions are sometimes confused and the robot fails to reproduce the behavior. One way to handle these issues is to introduce a context layer that can support PSL by providing bias for predictions. Parts of the knowledge base that appear to fit the present context are highlighted, while other parts are inhibited. Which context should be active is continually re-evaluated using behavior recognition. This technique takes inspiration from several neurocomputational models that describe parts of the human brain as a hierarchical prediction system. With behavior recognition active, continually selecting the most suitable context for the present situation, the problem of knowledge interference is significantly reduced and the robot can successfully reproduce more complex behaviors as well.
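The description of PSL as a variable-order Markov model that predicts future events from a history of past events can be sketched roughly as follows. This is an illustrative toy predictor under that general description, not the thesis' actual algorithm: it stores counts of which event follows each recent context and predicts from the longest matching context.

```python
from collections import Counter, defaultdict

class SequencePredictor:
    """Toy variable-order Markov predictor in the spirit of PSL."""

    def __init__(self, max_order=3):
        self.max_order = max_order
        # Maps a context tuple (the last 1..max_order events) to a
        # counter over the events observed to follow that context.
        self.counts = defaultdict(Counter)

    def train(self, sequence):
        for i in range(1, len(sequence)):
            for order in range(1, self.max_order + 1):
                if i - order < 0:
                    break
                context = tuple(sequence[i - order:i])
                self.counts[context][sequence[i]] += 1

    def predict(self, history):
        # Prefer the longest known context: more specific knowledge
        # wins, with shorter contexts as fallback.
        for order in range(min(self.max_order, len(history)), 0, -1):
            context = tuple(history[-order:])
            if context in self.counts:
                return self.counts[context].most_common(1)[0][0]
        return None
```

Trained on a demonstrated event sequence, the same structure supports both control (emit the predicted next event) and recognition (compare predictions with actual observations).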
|
314 |
Geometric Computer Vision for Rolling-shutter and Push-broom Sensors. Ringaby, Erik. January 2012
Almost all cell-phones and camcorders sold today are equipped with a CMOS (Complementary Metal Oxide Semiconductor) image sensor, and there is a general trend to incorporate CMOS sensors in other types of cameras as well. The sensor has many advantages over the more conventional CCD (Charge-Coupled Device) sensor, such as lower power consumption, cheaper manufacturing and the potential for on-chip processing. Almost all CMOS sensors make use of what is called a rolling shutter. Compared to a global shutter, which images all the pixels at the same time, a rolling-shutter camera exposes the image row by row. This leads to geometric distortions in the image when either the camera or the objects in the scene are moving: the recorded videos and images will look wobbly (the jello effect), skewed or otherwise strange, which is often not desirable. In addition, many computer vision algorithms assume that the camera used has a global shutter and will break down if the distortions are too severe. In airborne remote sensing it is common to use push-broom sensors. These sensors exhibit a similar kind of distortion as a rolling-shutter camera, due to the motion of the aircraft. If the acquired images are to be matched with maps or other images, the distortions need to be suppressed. The main contribution of this thesis is the development of three-dimensional models for rolling-shutter distortion correction. Previous attempts modelled the distortions as taking place in the image plane, and we have shown that our techniques give better results for hand-held camera motions. The basic idea is to estimate the camera motion, not only between frames, but also during frame capture. The motion can be estimated using inter-frame image correspondences, from which a non-linear optimisation problem can be formulated and solved.
All rows in a rolling-shutter image are imaged at different times, and when the motion is known, each row can be transformed to its rectified position. In addition to rolling-shutter distortions, hand-held footage often has shaky camera motion, and the thesis shows how to perform efficient video stabilisation, in combination with the rectification, using rotation smoothing. It also explores how similar techniques can be used to correct push-broom images, and how to rectify 3D point clouds from e.g. the Kinect depth sensor.
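The idea that every row has its own capture time, and can be moved to a rectified position once the motion is known, can be sketched for the simplest possible case: a constant, purely horizontal camera translation. The thesis estimates full 3D camera motion and handles rotation; the pure-translation model below is only an illustration of the per-row timing.

```python
import numpy as np

def row_timestamps(frame_time, readout_time, num_rows):
    """Capture time of each row in a rolling-shutter frame.

    Row 0 is exposed at frame_time; the remaining rows follow at
    evenly spaced intervals over the sensor's readout time.
    """
    return frame_time + np.arange(num_rows) * (readout_time / num_rows)

def rectify_rows(image, velocity_px_per_s, readout_time):
    """Undo the skew caused by a constant horizontal camera motion.

    Each row is shifted back by the displacement accumulated since
    row 0 was exposed. A pure integer-pixel translation per row is an
    illustrative simplification of the full rectification.
    """
    h = image.shape[0]
    times = row_timestamps(0.0, readout_time, h)
    out = np.empty_like(image)
    for r in range(h):
        shift = int(round(velocity_px_per_s * times[r]))
        out[r] = np.roll(image[r], -shift)
    return out
```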
|
315 |
Deep Learning for Point Detection in Images. Runow, Björn. January 2020
The main result of this thesis is a deep learning model named BearNet, which can be trained to detect an arbitrary number of objects as a set of points. The model is trained using the Weighted Hausdorff distance as loss function. BearNet has been applied and tested on two problems from industry: from an intensity image, detect the two pocket points of an EU-pallet, which an autonomous forklift could utilize when determining where to insert its forks; and from a depth image, detect the start, bend and end points of a straw attached to a juice package, in order to help determine whether the straw has been attached correctly. In the development of BearNet I took inspiration from the designs of U-Net, UNet++ and a high-resolution network named HRNet. Further, I used a dataset containing RGB images from a surveillance camera located inside a mall, on which the aim was to detect the head positions of all pedestrians. In an attempt to reproduce a result from another study, I found that the mall dataset suffers from training-set contamination when a model is trained, validated and tested on it with random sampling. Hence, I propose that the mall dataset be evaluated with a sequential data-split strategy to limit the problem. I found that the BearNet architecture is well suited for both the EU-pallet and straw datasets, and that it can be successfully used on RGB, intensity or depth images. On the EU-pallet and straw datasets, BearNet consistently produces point estimates within five and six pixels of ground truth, respectively. I also show that the straw dataset constitutes only a small subset of all the challenges that exist in the problem domain related to the attachment of a straw to a juice package, and that one therefore cannot train a robust deep learning model on it alone. As an example, models trained on the straw dataset cannot correctly handle samples in which no straw is visible.
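The Weighted Hausdorff distance used as the loss is a differentiable relaxation of a set-to-set distance between predicted and ground-truth points. The plain averaged Hausdorff distance below is not the loss itself, only an illustration of the underlying idea of comparing two point sets without requiring a one-to-one matching.

```python
import math

def avg_hausdorff(pred, gt):
    """Averaged Hausdorff distance between two 2D point sets.

    For each point in one set, take the distance to its nearest
    neighbour in the other set; average these in both directions.
    Both sets must be non-empty.
    """
    def d(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def directed(a, b):
        return sum(min(d(p, q) for q in b) for p in a) / len(a)

    return 0.5 * (directed(pred, gt) + directed(gt, pred))
```

The weighted variant replaces the hard point set on the prediction side with a per-pixel probability map, which is what makes the distance usable as a training loss.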
|
316 |
Rules of the Road at Sea and Autonomous Vessels : How the collision regulations could function in encounters with autonomous vessels / Sjövägsregler och autonoma fartyg : Hur sjövägsreglerna skulle kunna fungera i möte med autonoma fartyg. Lagerstam, Cristopher; Lundgren, Fabian. January 2019
There is a trend in the shipping industry towards autonomous vessels. How large a share of all vessels will be autonomous is currently unknown. The starting point for the study was that there will be a mix of autonomous and manned vessels at sea. The study aimed to find out how active deck officers would act if they encountered an autonomous vessel, based on today's regulations for preventing collisions at sea (COLREGs). Furthermore, the study investigated whether the respondents felt that the rules must be adapted for autonomous vessels. A qualitative method with standardized questions was chosen. The conclusion is that the COLREGs are complex due to their structure: it is possible to violate the rules, but at the same time it is possible to follow them. The background to the respondents' reasoning can be found in previous research on the relationship between human and machine: the less information a person has about a system, the lower the person's trust in that system. The interviews revealed that the respondents are in favour of a change to the COLREGs. The most obvious rule change was a change to the definitions, because the officers want to be informed that the vessel they are encountering is autonomous.
|
317 |
Knowledge Formation in Cross-sectoral Collaboration - A case study of autonomous vehicles in the City of Gothenburg / Kunskapsbildning i en tvärsektoriell samverkan - En fallstudie kring autonoma fordon i Göteborgs stad. Rosengren, Christofer; Nilsson, Christoffer. January 2018
Many of our contemporary cities face increased urbanization, which can lead to greater strains and new challenges for urban planning to deal with. Smart cities have emerged as a reaction to the increased complexity that cities face as more people compete for the same space. One of these complexities is how urban mobility should be improved without compromising other values. Today there is extensive development of autonomous vehicles, which many believe may be a solution to mobility issues. An identified problem is the great uncertainty that derives from the lack of experience regarding the effects of autonomous vehicles in the city. This uncertainty creates a need for cross-sectoral collaboration between municipal planning and technology-developing companies in order to achieve a more comprehensive knowledge base. A case study has been conducted in Gothenburg where representatives from municipal planning, private industry and a consulting company were interviewed. Observations were conducted to gain an understanding of what the knowledge-making process between the public and private sectors looks like, and of the challenges and opportunities that the various actors identify in relation to the implementation of autonomous vehicles in the city's long-term planning. The results of the case study point to problems that may arise when the delimitation of a knowledge-generating project leads to a loss of knowledge. The study also identifies issues that may arise when different technical languages meet to mutually generate a comprehensive knowledge base.
|
318 |
Pretraining a Neural Network for Hyperspectral Images Using Self-Supervised Contrastive Learning / Förträning av ett neuralt nätverk för hyperspektrala bilder baserat på självövervakad kontrastiv inlärning. Syrén Grönfelt, Natalie. January 2021
Hyperspectral imaging is an expanding topic within the field of computer vision that uses images of high spectral granularity. Contrastive learning is a discriminative approach to self-supervised learning, a form of unsupervised learning where the network is trained using self-created pseudo-labels. This work combines these two research areas and investigates how a pretrained network based on contrastive learning can be used for hyperspectral images. The hyperspectral images used in this work are generated from simulated RGB images and spectra from a spectral library. The network is trained with a pretext task based on data augmentations, and is evaluated through transfer learning and fine-tuning on a downstream task. The goal is to determine the impact of the pretext task on the downstream task and the required amount of labelled data. The results show that the downstream classifier based on the pretrained network barely performs better than a classifier without a pretrained network. More research is needed to confirm or reject the benefit of a pretrained network based on contrastive learning for hyperspectral images; the pretrained network should also be tested on real-world hyperspectral data and trained with a pretext task designed for hyperspectral images.
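A common contrastive objective for augmentation-based pretext tasks is an InfoNCE-style loss: embeddings of two augmented views of the same image form the positive pair, and embeddings of other images act as negatives. The sketch below is a generic illustration of that family of losses, not necessarily the exact objective used in the thesis.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss for one anchor embedding.

    Minimising the loss pulls the positive pair together and pushes
    the negatives apart. The temperature value is illustrative.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return -math.log(exps[0] / sum(exps))
```

The loss is low when the positive is more similar to the anchor than any negative, and high otherwise, which is what drives the network to produce augmentation-invariant features.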
|
319 |
Do Judge a Book by its Cover! : Predicting the genre of book covers using supervised deep learning. Analyzing the model predictions using explanatory artificial intelligence methods and techniques. Velander, Alice; Gumpert Harrysson, David. January 2021
In Storytel's application, in which a user can read and listen to digitalized literature, the user is displayed a list of books where the first thing encountered is the book title and cover. A book cover is therefore essential to attract a consumer's attention. In this study, we take a data-driven approach to investigating the design principles of book covers through deep learning models and explainable AI. The first aim is to explore how well a Convolutional Neural Network (CNN) can interpret and classify a book cover image according to its genre in a multi-class classification task. The second aim is to increase model interpretability and investigate correlations between model features and genres. With the help of the explainable AI method Gradient-weighted Class Activation Mapping (Grad-CAM), we analyze the pixel-wise contribution to the model prediction. In addition, object detection with YOLOv3 was implemented to investigate which objects are detectable and recurring in the book covers. An interplay between Grad-CAM and YOLOv3 was used to investigate how identified objects and features correlate with a specific book genre and, ultimately, to answer what makes a good book cover. Using a state-of-the-art CNN architecture we achieve an accuracy of 48%, with the best class-wise accuracies for the genres Erotica (73%), Economy & Business (67%) and Children (66%). Quantitative results from the Grad-CAM and YOLOv3 interplay show some strong associations between objects and genres, while indicating weak associations between abstract design principles and genres. Furthermore, a qualitative analysis of the Grad-CAM visualizations shows strong relevance of certain objects and text fonts for specific book genres. It was also observed that the portrayal of a feature was relevant for the model prediction of certain genres.
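Grad-CAM itself reduces to a small computation once a deep learning framework has produced the activations and gradients of the chosen convolutional layer: the channel weights are the spatially averaged gradients, and the heatmap is their weighted sum passed through a ReLU. A framework-free sketch with plain arrays standing in for the framework's tensors:

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Grad-CAM heatmap from one conv layer's activations and gradients.

    feature_maps: (K, H, W) activations A_k of the chosen layer.
    gradients:    (K, H, W) gradients of the class score w.r.t. A_k.
    Returns an (H, W) heatmap normalised to [0, 1].
    """
    weights = gradients.mean(axis=(1, 2))           # alpha_k per channel
    cam = np.tensordot(weights, feature_maps, axes=1)  # weighted sum
    cam = np.maximum(cam, 0.0)                      # ReLU: keep positive evidence
    if cam.max() > 0:
        cam = cam / cam.max()                       # normalise for visualisation
    return cam
```

The heatmap is then upsampled to the input resolution and overlaid on the book cover, which is how the pixel-wise contributions described above are visualised.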
|
320 |
Identification of alkaline fens using convolutional neural networks and multispectral satellite imagery. Jernberg, John. January 2021
The alkaline fen is a particularly valuable type of wetland with unique characteristics. Due to anthropogenic risk factors and the sensitive nature of the fens, protection is highly prioritized, with identification and mapping of current locations being important parts of this process. To accomplish this in a cost-effective manner for large areas, remote sensing methods using satellite images can be very effective. Following the rapid development in computer vision, deep learning using convolutional neural networks (CNNs) is the current state of the art for satellite image classification. Accordingly, this study evaluates the combination of different CNN architectures and multispectral Sentinel-2 satellite images for identification of alkaline fens using semantic segmentation. The implemented models are variations of the proven U-net network design. In addition, a Random Forest classifier was trained for baseline comparison. The best result was produced by a spatial-attention U-net with an IoU score of 0.31 for the alkaline fen class and a mean IoU score of 0.61. These findings suggest that identification of alkaline fens is possible with the current method even with a small dataset, although an optimal solution may require deeper research. The results also further establish deep learning as the superior choice over traditional machine learning algorithms for satellite image classification.
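The reported per-class and mean IoU scores follow the standard intersection-over-union definition for segmentation masks, which can be sketched directly (the class indices below are illustrative):

```python
import numpy as np

def iou_score(pred, target, cls):
    """Intersection-over-Union for one class in a segmentation mask.

    pred and target are integer label maps of the same shape; cls is
    the class index (e.g. the alkaline-fen class).
    """
    p = pred == cls
    t = target == cls
    union = np.logical_or(p, t).sum()
    if union == 0:
        return float("nan")  # class absent in both masks: undefined
    return float(np.logical_and(p, t).sum() / union)

def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes present in either mask."""
    scores = [iou_score(pred, target, c) for c in range(num_classes)]
    valid = [s for s in scores if s == s]  # drop NaN (absent classes)
    return sum(valid) / len(valid)
```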
|