41

Calibration using a general homogeneous depth camera model / Kalibrering av en generell homogen djupkameramodell

Sjöholm, Daniel January 2017 (has links)
Being able to measure distances accurately in depth images is important for reconstructing objects. However, depth measurement is a noisy process, and depth sensors benefit from additional correction even after factory calibration. We regard the pair of depth sensor and image sensor as a single unit returning complete 3D information, relying on the more accurate image sensor for everything except the depth measurement. We present a new linear method for correcting depth distortion, using an empirical model built around the constraint of only modifying depth data while keeping planes planar. The depth distortion model is implemented and tested on the Intel RealSense SR300 camera. The results show that the model is viable and generally decreases depth measurement errors after calibration, with an average improvement in the 50 percent range on the tested data sets.
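The abstract does not give the exact form of the linear correction model. Purely as an illustration of the general idea, the sketch below fits a simple per-sensor linear depth correction, d_corrected = a * d_measured + b, by least squares from samples whose reference depth is known (e.g. from a planar target seen by the image sensor). The function names, the parameterisation and the synthetic numbers are assumptions, not the model from the thesis.

    import numpy as np

    def fit_linear_depth_correction(measured, reference):
        """Fit reference ~= a * measured + b by least squares."""
        A = np.column_stack([measured, np.ones_like(measured)])
        (a, b), *_ = np.linalg.lstsq(A, reference, rcond=None)
        return a, b

    def correct_depth(depth_map, a, b):
        """Apply the fitted correction to a full depth image."""
        return a * depth_map + b

    # Synthetic example: a sensor that over-estimates depth by 2 % plus a
    # constant 5 mm offset, observed with measurement noise.
    rng = np.random.default_rng(0)
    true_depth = rng.uniform(400.0, 1500.0, size=1000)            # millimetres
    measured = 1.02 * true_depth + 5.0 + rng.normal(0, 2.0, 1000)
    a, b = fit_linear_depth_correction(measured, true_depth)
    print(a, b)   # roughly (0.980, -4.9), i.e. the inverse of the distortion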
42

Närmaskbestämning från stereoseende / Ranging from stereovision

Hedlund, Gunnar January 2005 (has links)
This thesis investigates distance estimation using image processing and stereo vision for a known camera setup. A large number of computational methods exist today for obtaining the distance to objects, but the performance of these methods has barely been measured. The work mainly examines different block-based methods for distance estimation and considers the possibilities and limitations of applying established knowledge in image processing and stereo vision to distance estimation. The work was carried out at Bofors Defence AB in Karlskoga, Sweden, with the aim of eventual use in an optical sensor system. The thesis evaluates well-established methods. The results indicate that it is difficult to determine a complete range map, i.e. the distance to every visible object, but the tested methods should still be usable point-wise for computing distances. The best method is based on computing the minimum absolute error and keeping only the most reliable values.
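As a rough illustration of block-based stereo ranging (not the specific methods evaluated in the thesis), the sketch below uses OpenCV's StereoBM block matcher on a rectified stereo pair and converts disparity to depth for a known camera setup. The image paths, focal length and baseline are placeholder assumptions.

    import cv2
    import numpy as np

    # Rectified grayscale stereo pair; the file names are placeholders.
    left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
    right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

    # Block-based matching: each block in the left image is compared against
    # blocks along the same row in the right image.
    matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # fixed point -> pixels

    # Triangulation for a known setup: Z = f * B / d, with focal length f
    # (pixels) and baseline B (metres) as assumed values.
    f, B = 700.0, 0.12
    valid = disparity > 0
    depth = np.zeros_like(disparity)
    depth[valid] = f * B / disparity[valid]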
43

Utveckling av ett active vision system för demonstration av EDSDK++ i tillämpningar inom datorseende / Development of an active vision system for demonstrating EDSDK++ in computer vision applications

Kargén, Rolf January 2014 (has links)
Computer vision is a rapidly growing, interdisciplinary field whose applications are taking an increasingly prominent role in today's society. With the increased interest in computer vision comes an increasing need to control the cameras connected to computer vision systems. At the division of computer vision at Linköping University, the framework EDSDK++ has been developed to remotely control digital cameras made by Canon Inc. The framework is very comprehensive, contains a large number of features and configuration options, and is therefore still largely untested. This thesis aims to develop a demonstrator for EDSDK++ in the form of a simple active vision system, which uses real-time face detection to control a camera tilt, and a camera mounted on the tilt, to follow, zoom in on and focus on a face or a group of faces. One requirement was that the OpenCV library be used for the face detection and that EDSDK++ be used to control the camera. In addition, an API for controlling the camera tilt was to be developed. During development, different methods for face detection were investigated. To improve performance, multiple face detectors were run in parallel using multithreading, each scanning the image from a different angle. Both experimental and theoretical approaches were used to determine the parameters needed to control the camera and the camera tilt. The project resulted in a fully functional demonstrator that fulfilled all requirements.
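The abstract does not name the detector used; as one plausible illustration of running several face detectors in parallel on rotated copies of a frame, the sketch below uses OpenCV Haar cascades and a thread pool. The rotation angles and the choice of cascade file are assumptions for illustration only.

    import cv2
    from concurrent.futures import ThreadPoolExecutor

    def detect_at_angle(frame_gray, angle):
        """Rotate the frame, run a cascade on it, and tag detections with the angle."""
        # One classifier per call keeps the worker threads independent.
        cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
        h, w = frame_gray.shape
        M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
        rotated = cv2.warpAffine(frame_gray, M, (w, h))
        faces = cascade.detectMultiScale(rotated, scaleFactor=1.1, minNeighbors=5)
        return [(angle, tuple(f)) for f in faces]

    def detect_faces_parallel(frame_bgr, angles=(-30, 0, 30)):
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        with ThreadPoolExecutor(max_workers=len(angles)) as pool:
            results = pool.map(lambda a: detect_at_angle(gray, a), angles)
        return [det for sub in results for det in sub]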
44

Objektföljning med roterbar kamera / Object tracking with rotatable camera

Zetterlund, Joel January 2021 (has links)
Today it is common for events to be filmed without a professional video photographer: the little-league football match, conference meetings, teaching, or YouTube clips. To film without a camera operator, one can use so-called object-tracking cameras, which follow an object's position over time without anyone steering the camera. This thesis describes how object tracking works and compares an object-tracking camera based on computer vision with a human camera operator. To make the comparison, a prototype was built. The prototype consists of a Raspberry Pi 4B running MOSSE, an object-tracking algorithm, and SSD300, an object-detection algorithm from computer vision. The steering consists of a gimbal with three brushless motors that control the camera through a feedback controller. The result was a prototype capable of following a person walking at a maximum of 100 pixels per second, or 1 meter per second in full frame, with a maximum range of 11.4 meters outdoors, while a camera operator managed to follow a person at 300-800 pixels per second, or 3 meters per second. The prototype is not as good as a camera operator, but could be used to follow a person who is teaching and walking slowly, provided the prototype were robust, which is not yet the case. To get better results, a stronger processor and better algorithms than those used in the prototype are needed, since a major problem was the low update rate of the detection algorithm.
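As a rough illustration of the tracking-plus-steering loop described above (not the prototype's actual code, and omitting the SSD300 detection side), the sketch below combines OpenCV's MOSSE tracker with a simple proportional controller that turns the pixel error between the target centre and the image centre into pan/tilt commands. The MOSSE tracker lives in the opencv-contrib package (under cv2.legacy in recent versions), and the gain value and the send_to_gimbal function are assumptions.

    import cv2

    def center_of(box):
        x, y, w, h = box
        return x + w / 2.0, y + h / 2.0

    def send_to_gimbal(pan_rate, tilt_rate):
        # Placeholder: a real system would command the brushless motors here.
        print(f"pan={pan_rate:+.3f}  tilt={tilt_rate:+.3f}")

    cap = cv2.VideoCapture(0)
    ok, frame = cap.read()
    tracker = cv2.legacy.TrackerMOSSE_create()          # requires opencv-contrib
    tracker.init(frame, cv2.selectROI("select target", frame))

    kp = 0.002                                          # assumed proportional gain
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        found, box = tracker.update(frame)
        if found:
            cx, cy = center_of(box)
            h, w = frame.shape[:2]
            # Proportional control: steer so the target drifts toward the image centre.
            send_to_gimbal(kp * (cx - w / 2), kp * (cy - h / 2))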
45

Evaluating Deep Learning Algorithms for Steering an Autonomous Vehicle / Utvärdering av Deep Learning-algoritmer för styrning av ett självkörande fordon

Magnusson, Filip January 2018 (has links)
With self-driving cars on the horizon, vehicle autonomy and its problems are a hot topic. In this study we use convolutional neural networks to make a robot car avoid obstacles. The robot car has a monocular camera, and our approach is to use the images taken by the camera as input and output a steering command, so that the car avoids any object in front of it. To reduce the amount of training data required, we use models that are pretrained on ImageNet, a large image database containing millions of images. The models are then trained on our own dataset, which consists of images taken directly by the robot car while driving around; each image is labeled with the steering command used when it was taken. During training we experiment with different numbers of frozen layers, where a frozen layer is a layer that has been pretrained on ImageNet but is not trained on our dataset. The Xception, MobileNet and VGG16 architectures are tested and compared to each other. We find that a smaller number of frozen layers produces better results, and our best model, which used the Xception architecture, achieved 81.19% accuracy on our test set. During a qualitative test, the car avoided collisions 78.57% of the time.
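A minimal sketch of the frozen-layer setup described above, written with tf.keras and an ImageNet-pretrained Xception backbone, is given below. The number of frozen layers, the input size and the number of steering classes are illustrative assumptions, not the values used in the thesis.

    import tensorflow as tf

    NUM_CLASSES = 3          # e.g. left / forward / right (assumed)
    NUM_FROZEN_LAYERS = 20   # layers kept at their ImageNet weights (assumed)

    base = tf.keras.applications.Xception(
        weights="imagenet", include_top=False, input_shape=(224, 224, 3))

    # Frozen layers keep their ImageNet-pretrained weights and are not
    # updated when training on the robot-car dataset.
    for layer in base.layers[:NUM_FROZEN_LAYERS]:
        layer.trainable = False

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(train_images, steering_labels, epochs=10)  # labels = commands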
46

Robot Tool Center Point Calibration using Computer Vision

Hallenberg, Johan January 2007 (has links)
Today, tool center point calibration is mostly done by a manual procedure. The method is very time consuming and the result may vary depending on how skilled the operators are. This thesis proposes a new automated iterative method for tool center point calibration of industrial robots, making use of computer vision and image processing techniques. The new method has several advantages over the manual calibration method. Experimental verification has shown that the proposed method is much faster while delivering comparable or even better accuracy. The setup of the proposed method is very simple: only one USB camera connected to a laptop computer is needed, and no contact with the robot tool is necessary during the calibration procedure. The method can be split into three parts. Initially, the transformation between the robot wrist and the tool is determined by solving a closed loop of homogeneous transformations. Second, an image segmentation procedure is described for finding point correspondences on a rotation-symmetric robot tool; this step is necessary for measuring the camera-to-tool transformation with six degrees of freedom. The last part of the proposed method is an iterative procedure which automates an ordinary four point tool center point calibration algorithm and ensures that the accuracy of the tool center point calibration depends only on the accuracy of the camera when registering a movement between two positions.
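The thesis's own formulation is not reproduced here, but the core of a standard multi-pose tool-center-point calibration can be sketched as a least-squares problem: if several wrist poses (R_i, t_i) all place the unknown tool tip at the same fixed point, then R_i x + t_i = R_j x + t_j for every pair of poses, which is linear in the tool offset x. A NumPy sketch under that assumption:

    import numpy as np

    def calibrate_tcp(rotations, translations):
        """Estimate the tool offset x (in the wrist frame) from wrist poses.

        Each pose (R_i, t_i) maps the tool tip to the same world point p:
            R_i @ x + t_i = p  for all i.
        Eliminating p gives (R_i - R_j) @ x = t_j - t_i, solved by least squares.
        """
        A_rows, b_rows = [], []
        for i in range(len(rotations)):
            for j in range(i + 1, len(rotations)):
                A_rows.append(rotations[i] - rotations[j])
                b_rows.append(translations[j] - translations[i])
        A = np.vstack(A_rows)
        b = np.concatenate(b_rows)
        x, *_ = np.linalg.lstsq(A, b, rcond=None)
        return x

    # Synthetic check: random wrist orientations, a known tool offset, tip at the origin.
    rng = np.random.default_rng(1)
    true_x = np.array([0.01, -0.02, 0.15])
    Rs, ts = [], []
    for _ in range(4):
        Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal matrix
        Rs.append(Q)
        ts.append(-Q @ true_x)                         # places the tip at the origin
    print(calibrate_tcp(Rs, ts))                       # ~ [0.01, -0.02, 0.15]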
47

LEAP, A Platform for Evaluation of Control Algorithms / Labyrintbaserad plattform för algoritmutvärdering

Öfjäll, Kristoffer January 2010 (has links)
Most people are familiar with the BRIO labyrinth game and the challenge of guiding the ball through the maze. The goal of this project was to use this game to create a platform for evaluation of control algorithms. The platform was used to evaluate a few different control algorithms, both traditional automatic control algorithms and algorithms based on online incremental learning. The game was fitted with servo actuators for tilting the maze, and a camera together with computer vision algorithms was used to estimate the state of the game. The evaluated control algorithm had the task of calculating a proper control signal given the estimated state of the game. The evaluated learning systems used traditional control algorithms to provide initial training data; after initial training, the systems learned from their own actions and after a while they outperformed the controller used to provide the initial training.
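The abstract does not detail which traditional controllers were evaluated; as one natural baseline for illustration only, a PD controller per tilt axis could turn the error between the ball's estimated position and the next waypoint into a servo command. The gain values below are assumptions.

    class PDController:
        """Simple PD controller for one tilt axis of the maze."""

        def __init__(self, kp, kd):
            self.kp, self.kd = kp, kd
            self.prev_error = 0.0

        def update(self, setpoint, measurement, dt):
            error = setpoint - measurement
            derivative = (error - self.prev_error) / dt
            self.prev_error = error
            return self.kp * error + self.kd * derivative

    # One controller per maze axis; the setpoint is the next waypoint along the
    # planned path and the measurement comes from the camera-based state estimate.
    ctrl_x = PDController(kp=0.8, kd=0.3)
    ctrl_y = PDController(kp=0.8, kd=0.3)
    tilt_x = ctrl_x.update(setpoint=0.42, measurement=0.40, dt=1 / 30)
    tilt_y = ctrl_y.update(setpoint=0.10, measurement=0.13, dt=1 / 30)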
48

Learning to Search for Targets : A Deep Reinforcement Learning Approach to Visual Search in Unseen Environments / Inlärd sökning efter mål

Lundin, Oskar January 2022 (has links)
Visual search is the perceptual task of locating a target in a visual environment. Due to applications in areas like search and rescue, surveillance, and home assistance, it is of great interest to automate visual search. An autonomous system can potentially search more efficiently than a manually controlled one and has the advantages of reduced risk and cost of labor. In many environments, there is structure that can be utilized to find targets quicker. However, manually designing search algorithms that properly utilize structure to search efficiently is not trivial. Different environments may exhibit vastly different characteristics, and visual cues may be difficult to pick up. A learning system has the advantage of being applicable to any environment where there is a sufficient number of samples to learn from. In this thesis, we investigate how an agent that learns to search can be implemented with deep reinforcement learning. Our approach jointly learns control of visual attention, recognition, and localization from a set of sample search scenarios. A recurrent convolutional neural network takes an image of the visible region and the agent's position as input. Its outputs indicate whether a target is visible and control where the agent looks next. The recurrent step serves as a memory that lets the agent utilize features of the explored environment when searching. We compare two memory architectures: an LSTM, and a spatial memory that remembers structured visual information. Through experimentation in three simulated environments, we find that the spatial memory architecture achieves superior search performance. It also searches more efficiently than a set of baselines that do not utilize the appearance of the environment and achieves similar performance to that of a human searcher. Finally, the spatial memory scales to larger search spaces and is better at generalizing from a limited number of training samples.
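As a rough structural sketch (not the thesis architecture), the recurrent convolutional agent described above could be laid out as follows in PyTorch: a small CNN encodes the visible region, the encoding is concatenated with the agent's position, an LSTM (one of the two compared memory variants) provides memory over the explored environment, and two heads output the target-visibility score and the next-gaze action. All layer sizes are assumptions.

    import torch
    import torch.nn as nn

    class SearchAgent(nn.Module):
        def __init__(self, num_actions=5, hidden=256):
            super().__init__()
            self.encoder = nn.Sequential(                 # encodes the visible patch
                nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
            self.rnn = nn.LSTMCell(64 + 2, hidden)        # +2 for (x, y) position
            self.detect_head = nn.Linear(hidden, 1)       # is a target visible?
            self.action_head = nn.Linear(hidden, num_actions)  # where to look next

        def forward(self, glimpse, position, state):
            feat = torch.cat([self.encoder(glimpse), position], dim=1)
            h, c = self.rnn(feat, state)
            return self.detect_head(h), self.action_head(h), (h, c)

    agent = SearchAgent()
    glimpse = torch.zeros(1, 3, 64, 64)                   # dummy visible region
    position = torch.zeros(1, 2)
    h = c = torch.zeros(1, 256)
    visible_logit, action_logits, state = agent(glimpse, position, (h, c))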
49

Quality Assuring an Image Data Pipeline with Transfer Learning : Using Computer Vision Methodologies

Wiberg, David January 2023 (has links)
The computer vision field has taken big steps forward, and the number of models and datasets being released is increasing. A large number of contemporary models are the result of extensive training on massive datasets, reflecting a significant investment of time and computational resources. This opens up a new opportunity: utilizing the knowledge in these pre-trained models. It is possible to transfer the knowledge from one domain to a fine-tuned solution on a custom-created dataset, which can help the field of computer vision improve rapidly. This project uses the pre-trained models ResNet50, ResNet18 and DenseNet121 to address the challenge of fine-tuning models on a custom-created dataset of grayscale images. The project's results show how a pre-trained model can be used to transfer learned features from one domain to another. In addition, the project included creating a binary classifier fine-tuned on a balanced dataset and another classifier fine-tuned on an imbalanced dataset.
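A minimal sketch of this kind of fine-tuning setup, using torchvision's ImageNet-pretrained ResNet18 (the weights enum requires a recent torchvision) and expanding grayscale input to three channels so it matches the pretrained network, might look as follows. The transform details and the two-class head are illustrative assumptions, not the thesis configuration.

    import torch
    import torch.nn as nn
    from torchvision import models, transforms

    # ImageNet-pretrained backbone; freeze it and train only a new binary head.
    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    for param in model.parameters():
        param.requires_grad = False
    model.fc = nn.Linear(model.fc.in_features, 2)   # new, trainable classifier head

    # Grayscale images are repeated across three channels to fit the
    # pretrained network's expected input.
    preprocess = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.Grayscale(num_output_channels=3),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()
    # A training loop would iterate over a DataLoader built with `preprocess`.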
50

Region Proposal Based Object Detectors Integrated With an Extended Kalman Filter for a Robust Detect-Tracking Algorithm

Khajo, Gabriel January 2019 (has links)
In this thesis we present a detect-tracking algorithm (see figure 3.1) that combines the detection robustness of static region proposal based object detectors, such as the faster region-based convolutional neural network (Faster R-CNN) and the region-based fully convolutional network (R-FCN) model, with the tracking prediction strength of extended Kalman filters, by using what we have called a translating and non-rigid user-input region of interest (RoI) mapping. This RoI mapping maps a region containing the object one is interested in tracking to a featureless three-channel image. The detection part of our proposed algorithm is then performed on the image that includes only the RoI features (see figure 3.2). After the detection step, our model re-maps the RoI features to the original frame and translates the RoI to the center of the prediction. If no detection occurs, our proposed model integrates a temporal dependence through a Kalman filter as a predictor; this filter is continuously corrected when detections do occur. To train the region proposal based object detectors that we integrate into our detect-tracking model, we used TensorFlow®'s object detection API with random-search hyperparameter tuning, where we fine-tuned all models from TensorFlow® Slim base-network classification checkpoints. The trained region proposal based object detectors used the Inception V2 base network for both the Faster R-CNN model and the R-FCN model, while the Inception V3 base network was applied only to the Faster R-CNN model. This was done to compare the two base networks and their corresponding effects on the detection models. In addition to the deep learning part of this thesis, for the implementation of our detect-tracking model, such as the extended Kalman filter, we used Python and OpenCV®. The results show that, with a stationary camera reference frame, our proposed detect-tracking algorithm, combined with region proposal based object detectors on images of size 414 × 740 × 3, can detect and track a small object in real time, such as a tennis ball moving along a horizontal trajectory with an average velocity v ≈ 50 km/h at a distance d = 25 m, with a combined detect-tracking frequency of about 13 to 14 Hz. The largest measured state error between the actual state and the predicted state from the Kalman filter, at the aforementioned horizontal velocity, was at most 10-15 pixels (see table 5.1), but in frames where many detections occur this error has been shown to be much smaller (3-5 pixels). Additionally, our combined detect-tracking model has been shown to handle obstacles and two overlapping learnable features, thanks to the integrated extended Kalman filter. Lastly, our detect-tracking model was also applied to a set of infrared images, where the goal was to detect and track a truck moving along a semi-horizontal path. Our results show that a Faster R-CNN Inception V2 model was able to extract features from a sequence of infrared frames, and that our proposed RoI-mapping method worked relatively well at detecting only one truck in a short test sequence (see figure 5.22).
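The thesis uses an extended Kalman filter; as a simplified illustration of how a temporal dependence can bridge frames without detections, the sketch below implements a plain linear Kalman filter with a constant-velocity state [x, y, vx, vy] for the RoI centre, predicted every frame and corrected only when the detector fires. The noise levels and the loop rate are assumptions.

    import numpy as np

    class CenterKalman:
        """Constant-velocity Kalman filter for a bounding-box centre (pixels)."""

        def __init__(self, x0, y0, dt=1 / 14):            # ~14 Hz detect-tracking loop
            self.x = np.array([x0, y0, 0.0, 0.0])          # state: x, y, vx, vy
            self.P = np.eye(4) * 100.0
            self.F = np.array([[1, 0, dt, 0],
                               [0, 1, 0, dt],
                               [0, 0, 1, 0],
                               [0, 0, 0, 1]], dtype=float)
            self.H = np.array([[1, 0, 0, 0],
                               [0, 1, 0, 0]], dtype=float)
            self.Q = np.eye(4) * 1.0                       # process noise (assumed)
            self.R = np.eye(2) * 25.0                      # measurement noise (assumed)

        def predict(self):
            """Run every frame, whether or not a detection is available."""
            self.x = self.F @ self.x
            self.P = self.F @ self.P @ self.F.T + self.Q
            return self.x[:2]

        def correct(self, cx, cy):
            """Run only when the detector returns a box centred at (cx, cy)."""
            z = np.array([cx, cy])
            S = self.H @ self.P @ self.H.T + self.R
            K = self.P @ self.H.T @ np.linalg.inv(S)
            self.x = self.x + K @ (z - self.H @ self.x)
            self.P = (np.eye(4) - K @ self.H) @ self.P
            return self.x[:2]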
