  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
631

Design of a multi-camera system for object identification, localisation, and visual servoing

Åkesson, Ulrik January 2019 (has links)
In this thesis, the development of a stereo camera system for an intelligent tool is presented. The task of the system is to identify and localise objects so that the tool can guide a robot. Different approaches to object detection have been implemented and evaluated, and the system's ability to localise objects has been tested. The results show that the system can achieve a localisation accuracy below 5 mm.
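The abstract does not describe the triangulation step itself, but for a rectified stereo pair, localisation depth reduces to the standard disparity relation Z = fB/d. A minimal sketch; the focal length, baseline, and disparity values are illustrative assumptions, not parameters from the thesis:

```python
# Depth from disparity for a rectified stereo pair; all numbers below
# are illustrative, not taken from the thesis.
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth Z = f * B / d for a rectified stereo pair."""
    return focal_px * baseline_m / disparity_px

# Example: a 1000 px focal length, 60 mm baseline, and 30 px disparity
# give Z = 1000 * 0.06 / 30 = 2.0 m.
print(depth_from_disparity(1000.0, 0.06, 30.0))  # 2.0
```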
632

Transforming Thermal Images to Visible Spectrum Images using Deep Learning

Nyberg, Adam January 2018 (has links)
Thermal spectrum cameras are gaining interest in many applications because their long operating wavelength lets them work in low light and harsh weather conditions. One disadvantage of thermal cameras is their limited visual interpretability for humans, which limits the scope of their applications. In this thesis, we address this problem by investigating the possibility of transforming thermal infrared (TIR) images into perceptually realistic visible spectrum (VIS) images using Convolutional Neural Networks (CNNs). Existing state-of-the-art colorization CNNs fail to provide the desired output, as they were trained to map grayscale VIS images to color VIS images. Instead, we utilize an auto-encoder architecture to perform the cross-spectral transformation between TIR and VIS images. This architecture was shown to perform very well quantitatively while producing perceptually realistic images. We show that the quantitative differences are insignificant when training this architecture using different color spaces, while there exist clear qualitative differences depending on the choice of color space. Finally, we found that a CNN trained on daytime examples generalizes well to nighttime test images.
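As a rough illustration of the auto-encoder idea the abstract describes, the sketch below pairs a convolutional encoder with a transposed-convolution decoder, mapping a one-channel TIR frame to a three-channel VIS frame. It assumes PyTorch; the depth and layer widths are assumptions for illustration, not the thesis architecture:

```python
import torch
import torch.nn as nn

class TIR2VISAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: 1-channel TIR input, downsampled twice.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder: upsample back to input resolution, 3-channel VIS output.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, tir: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(tir))

model = TIR2VISAutoencoder()
fake_tir = torch.randn(1, 1, 64, 64)  # one 64x64 thermal frame
print(model(fake_tir).shape)          # torch.Size([1, 3, 64, 64])
```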
633

Interactive Robot Art : A turn-based system for painting together with a robot

Westberg, Erik, Lindhqvist, Nils January 2019 (has links)
A large number of people suffer from mental illnesses such as depression and autism. Receiving the care they need can be a very difficult process, with long queues and expensive bills, meaning that many people suffer longer than they should or cannot afford the help they need. Automating part of the therapeutic process might be a solution: more patients could be treated at the same time, and the cost could be decreased. This project explores the possibilities of using a robot that paints together with patients. Such a robot would encourage the patient to be creative, which is thought to be an effective way of improving their well-being. The painting is done in a turn-based fashion, with robot and patient taking turns adding details to the same painting. Software is developed for the robot Baxter, made by Rethink Robotics. Computer vision concepts and algorithms are applied to interpret what the user has painted and to construct a plan of what Baxter will paint. Painting is then done by tracing the target shape through a set of pre-defined points on a canvas. The constructed system performs fairly well, although the user is limited to painting lines, squares, rectangles, and circles. Further work could increase the number of options available to the user. This system serves as a model of how a similar system could be used in practice.
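The abstract does not specify how the painted shapes are recognised; one plausible route, sketched below with OpenCV, classifies a contour as a line, square, rectangle, or circle from its polygonal approximation. The vertex counts and thresholds are assumptions for illustration, not the project's actual pipeline:

```python
import cv2

def classify_shape(contour) -> str:
    """Classify a contour from a binarised canvas image (illustrative heuristics)."""
    peri = cv2.arcLength(contour, True)
    approx = cv2.approxPolyDP(contour, 0.02 * peri, True)
    if len(approx) == 2:
        return "line"
    if len(approx) == 4:
        # Near-equal sides suggest a square rather than a general rectangle.
        x, y, w, h = cv2.boundingRect(approx)
        return "square" if abs(w - h) < 0.1 * max(w, h) else "rectangle"
    # Many vertices with high circularity (4*pi*A / P^2 near 1) suggest a circle.
    area = cv2.contourArea(contour)
    circularity = 4 * 3.14159 * area / (peri * peri) if peri > 0 else 0.0
    return "circle" if circularity > 0.8 else "unknown"
```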
634

One sample based feature learning and its application to object identification

Yang, Xu January 2018 (has links)
University of Macau / Faculty of Science and Technology. / Department of Computer and Information Science
635

Computer vision as a tool for forestry / Datorseende som ett verktyg för skogsbruket

Bång, Filip January 2019 (has links)
Forestry is a large industry in Sweden, and methods have been developed to optimize its processes. Yet computer vision has not been used to any large extent, despite other industries applying it with success. Computer vision, a subarea of machine learning, has become popular thanks to advancements in that field. This project investigates how some of the architectures used in computer vision perform when applied in the context of forestry. Four architectures were selected that had previously been shown to perform well on a general dataset, and were configured for continued training on trees and other objects found in forests. The trained architectures were tested by measuring frames per second (FPS) when performing object detection on a video, and mean average precision (mAP), a measure of how well a trained architecture detects objects. The fastest was an architecture using a Single Shot Detector with MobileNet v2 as its base network, achieving 29 FPS. The most accurate used Faster R-CNN with Inception ResNet as its base network, achieving 0.119 mAP on the test set. The overall poor mAP meant that none of the architectures was considered useful in a real-world scenario as-is. Suggestions for improving the mAP focus on improvements to the dataset.
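For readers unfamiliar with the mAP figures quoted above, a detection is typically matched to ground truth by intersection over union (IoU). A minimal sketch; the 0.5 threshold in the comment is the common PASCAL VOC convention, assumed here rather than stated in the thesis:

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# A detection usually counts as a true positive when IoU >= 0.5.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 50 / (100 + 100 - 50) = 0.333...
```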
636

Using Deep Learning Semantic Segmentation to Estimate Visual Odometry

Unknown Date (has links)
This research addresses real-time image segmentation and visual odometry estimation, and makes two main contributions to the field. First, a new image segmentation and classification algorithm named DilatedU-NET is introduced. This deep-learning-based algorithm processes seven frames per second and achieves over 84% accuracy on the Cityscapes dataset. Second, a new method to estimate visual odometry is introduced. Using the KITTI benchmark dataset as a baseline, the visual odometry error was larger than could be accurately measured; however, the robust frame rate made up for this, with the method processing 15 frames per second. / Includes bibliography. / Thesis (M.S.)--Florida Atlantic University, 2018. / FAU Electronic Theses and Dissertations Collection
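The "Dilated" in DilatedU-NET presumably refers to dilated convolutions, which grow the receptive field without downsampling. A hedged PyTorch sketch of that idea; the channel counts and dilation rates are assumptions, not the thesis architecture:

```python
import torch
import torch.nn as nn

# Stacking increasing dilation rates widens the receptive field while
# keeping the spatial resolution, which matters for dense segmentation.
dilated_block = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1, dilation=1), nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, padding=2, dilation=2), nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, padding=4, dilation=4), nn.ReLU(),
)

x = torch.randn(1, 3, 128, 128)
print(dilated_block(x).shape)  # spatial size preserved: [1, 64, 128, 128]
```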
637

Reconstruction and motion estimation of sparsely sampled ionospheric data

Foster, Matthew January 2009 (has links)
This thesis covers two main areas related to the mapping and examination of the ionosphere. The first is an examination of the performance and specific nuances of various state-of-the-art interpolation methods, with specific application to mapping the ionosphere. This work forms the most widely scoped examination of interpolation techniques for ionospheric imaging to date, and includes the introduction of normalised convolution techniques to geophysical data. In this study, adaptive normalised convolution was found to perform well in ionospheric electron-content mapping, while kriging, a popular technique, was found to have problems that limit its usefulness. The second is the development and examination of automatic, data-driven motion-estimation methods for ionospheric electron-content data. Particular emphasis is given to storm events, during which characteristic shapes appear and move across the North Pole; this is a particular challenge, as images covering this region tend to have very low resolution. Several motion-estimation methods are developed and applied to such data, including methods based on optical flow, correlation, and boundary correspondence. Correlation- and relaxation-labelling-based methods were found to perform reasonably, and boundary-based methods built on shape-context matching performed well when coupled with a regularisation stage. Overall, the techniques examined and developed here will help advance the examination of the features and morphology of the ionosphere, during both storms and quiet times.
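Normalised convolution, which the abstract introduces to geophysical data, filters the signal weighted by a certainty map and divides by the filtered certainty. A minimal sketch assuming a Gaussian applicability function, whose width is an illustrative choice rather than the thesis's:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def normalised_convolution(sparse_map, certainty, sigma=2.0):
    """sparse_map: data with gaps; certainty: 1 where sampled, 0 where missing."""
    num = gaussian_filter(sparse_map * certainty, sigma)
    den = gaussian_filter(certainty, sigma)
    # Avoid division by zero far from any samples.
    return np.divide(num, den, out=np.zeros_like(num), where=den > 1e-9)

# Example: fill a random 50% gap pattern in a smooth synthetic field.
field = np.fromfunction(lambda i, j: np.sin(i / 8.0) + np.cos(j / 8.0), (64, 64))
mask = (np.random.rand(64, 64) > 0.5).astype(float)
filled = normalised_convolution(field * mask, mask)
```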
638

Labeling problems with smoothness-based priors in computer vision. / CUHK electronic theses & dissertations collection

January 2008 (has links)
Many applications in computer vision can be formulated as labeling problems of assigning each pixel a label, where the labels represent some local quantities. If all pixels are regarded as independent, i.e., the label of each pixel has nothing to do with the labels of other pixels, such labeling problems are seriously sensitive to noise. On the other hand, for applications in videos, neglecting inter-frame information degrades the performance of the algorithms. / To improve the results of these labeling problems, smoothness-based priors can be enforced in the formulations. For a single image, the smoothness is spatial coherence, meaning that spatially close pixels tend to have similar labels. For a video, an additional temporal coherence is enforced, meaning that corresponding pixels in different frames should have similar labels. The spatial coherence constraint makes algorithms robust to noise, and the temporal coherence constraint exploits inter-frame information for better video-based applications. / Such labeling problems with smoothness-based priors can be solved by minimizing a Markov energy. Depending on how the energy function is defined, different optimization tools can be used to obtain the results. In this thesis, three optimization approaches are used for their good performance: graph cuts, belief propagation, and optimization with a closed-form solution. / Five algorithms in different applications are proposed in this thesis, all formulated as smoothness-based labeling problems: single image segmentation, video object cutout, image/video completion, image denoising, and image matting. In single image segmentation and video object cutout, graph-cut algorithms are used; in image/video completion, belief propagation is used; and in image denoising and image matting, closed-form optimization is implemented. / Successful performance of the five proposed algorithms, with comparisons to related methods, demonstrates that the proposed models of the labeling problems using smoothness-based priors work very well in these computer vision applications. / Chen, Shifeng. / Adviser: Liu Jian Zhuang. / Source: Dissertation Abstracts International, Volume: 70-06, Section: B, page: 3594. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2008. / Includes bibliographical references (leaves 130-145). / Abstracts in English and Chinese. / School code: 1307.
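The smoothness-based formulation the abstract describes is conventionally written as a Markov energy over pixels p and neighbouring pairs (p, q); the notation below is the textbook form, not copied from the thesis:

```latex
E(f) = \sum_{p \in \mathcal{P}} D_p(f_p)
     + \lambda \sum_{(p,q) \in \mathcal{N}} V_{pq}(f_p, f_q)
```

Here the data term D_p penalises labels that disagree with the observation at pixel p, and the smoothness term V_pq penalises neighbouring pixels taking dissimilar labels; graph cuts, belief propagation, or a closed-form solver minimise E depending on the form of V_pq.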
639

Applying image processing techniques to pose estimation and view synthesis.

January 1999 (has links)
Fung Yiu-fai Phineas. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. / Includes bibliographical references (leaves 142-148). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Model-based Pose Estimation --- p.3 / Chapter 1.1.1 --- Application - 3D Motion Tracking --- p.4 / Chapter 1.2 --- Image-based View Synthesis --- p.4 / Chapter 1.3 --- Thesis Contribution --- p.7 / Chapter 1.4 --- Thesis Outline --- p.8 / Chapter 2 --- General Background --- p.9 / Chapter 2.1 --- Notations --- p.9 / Chapter 2.2 --- Camera Models --- p.10 / Chapter 2.2.1 --- Generic Camera Model --- p.10 / Chapter 2.2.2 --- Full-perspective Camera Model --- p.11 / Chapter 2.2.3 --- Affine Camera Model --- p.12 / Chapter 2.2.4 --- Weak-perspective Camera Model --- p.13 / Chapter 2.2.5 --- Paraperspective Camera Model --- p.14 / Chapter 2.3 --- Model-based Motion Analysis --- p.15 / Chapter 2.3.1 --- Point Correspondences --- p.16 / Chapter 2.3.2 --- Line Correspondences --- p.18 / Chapter 2.3.3 --- Angle Correspondences --- p.19 / Chapter 2.4 --- Panoramic Representation --- p.20 / Chapter 2.4.1 --- Static Mosaic --- p.21 / Chapter 2.4.2 --- Dynamic Mosaic --- p.22 / Chapter 2.4.3 --- Temporal Pyramid --- p.23 / Chapter 2.4.4 --- Spatial Pyramid --- p.23 / Chapter 2.5 --- Image Pre-processing --- p.24 / Chapter 2.5.1 --- Feature Extraction --- p.24 / Chapter 2.5.2 --- Spatial Filtering --- p.27 / Chapter 2.5.3 --- Local Enhancement --- p.31 / Chapter 2.5.4 --- Dynamic Range Stretching or Compression --- p.32 / Chapter 2.5.5 --- YIQ Color Model --- p.33 / Chapter 3 --- Model-based Pose Estimation --- p.35 / Chapter 3.1 --- Previous Work --- p.35 / Chapter 3.1.1 --- Estimation from Established Correspondences --- p.36 / Chapter 3.1.2 --- Direct Estimation from Image Intensities --- p.49 / Chapter 3.1.3 --- Perspective-3-Point Problem --- p.51 / Chapter 3.2 --- Our Iterative P3P Algorithm --- p.58 / Chapter 3.2.1 --- Gauss-Newton Method --- p.60 / Chapter 3.2.2 --- Dealing with Ambiguity --- p.61 / Chapter 3.2.3 --- 3D-to-3D Motion Estimation --- p.66 / Chapter 3.3 --- Experimental Results --- p.68 / Chapter 3.3.1 --- Synthetic Data --- p.68 / Chapter 3.3.2 --- Real Images --- p.72 / Chapter 3.4 --- Discussions --- p.73 / Chapter 4 --- Panoramic View Analysis --- p.76 / Chapter 4.1 --- Advanced Mosaic Representation --- p.76 / Chapter 4.1.1 --- Frame Alignment Policy --- p.77 / Chapter 4.1.2 --- Multi-resolution Representation --- p.77 / Chapter 4.1.3 --- Parallax-based Representation --- p.78 / Chapter 4.1.4 --- Multiple Moving Objects --- p.79 / Chapter 4.1.5 --- Layers and Tiles --- p.79 / Chapter 4.2 --- Panorama Construction --- p.79 / Chapter 4.2.1 --- Image Acquisition --- p.80 / Chapter 4.2.2 --- Image Alignment --- p.82 / Chapter 4.2.3 --- Image Integration --- p.88 / Chapter 4.2.4 --- Significant Residual Estimation --- p.89 / Chapter 4.3 --- Advanced Alignment Algorithms --- p.90 / Chapter 4.3.1 --- Patch-based Alignment --- p.91 / Chapter 4.3.2 --- Global Alignment (Block Adjustment) --- p.92 / Chapter 4.3.3 --- Local Alignment (Deghosting) --- p.93 / Chapter 4.4 --- Mosaic Application --- p.94 / Chapter 4.4.1 --- Visualization Tool --- p.94 / Chapter 4.4.2 --- Video Manipulation --- p.95 / Chapter 4.5 --- Experimental Results --- p.96 / Chapter 5 --- Panoramic Walkthrough --- p.99 / Chapter 5.1 --- Problem Statement and Notations --- p.100 / Chapter 5.2 --- Previous Work --- p.101 / Chapter 5.2.1 --- 3D Modeling and Rendering --- p.102 / Chapter 5.2.2 --- Branching Movies --- p.103 / Chapter 5.2.3 --- Texture Window Scaling --- p.104 / Chapter 5.2.4 --- Problems with Simple Texture Window Scaling --- p.105 / Chapter 5.3 --- Our Walkthrough Approach --- p.106 / Chapter 5.3.1 --- Cylindrical Projection onto Image Plane --- p.106 / Chapter 5.3.2 --- Generating Intermediate Frames --- p.108 / Chapter 5.3.3 --- Occlusion Handling --- p.114 / Chapter 5.4 --- Experimental Results --- p.116 / Chapter 5.5 --- Discussions --- p.116 / Chapter 6 --- Conclusion --- p.121 / Chapter A --- Formulation of Fischler and Bolles' Method for P3P Problems --- p.123 / Chapter B --- Derivation of z1 and z3 in terms of z2 --- p.127 / Chapter C --- Derivation of e1 and e2 --- p.129 / Chapter D --- Derivation of the Update Rule for Gauss-Newton Method --- p.130 / Chapter E --- Proof of (λ1λ2 − λ4) > 0 --- p.132 / Chapter F --- Derivation of φ and hi --- p.133 / Chapter G --- Derivation of w1j to w4j --- p.134 / Chapter H --- More Experimental Results on Panoramic Stitching Algorithms --- p.138 / Bibliography --- p.148
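The Gauss-Newton step in the iterative P3P algorithm (Chapter 3.2.1 above) presumably takes the standard form, with r the residual vector and J its Jacobian; this is the textbook update, not the thesis's own notation:

```latex
x_{k+1} = x_k - \left(J^\top J\right)^{-1} J^\top r(x_k)
```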
640

Inferring facial and body language

Shan, Caifeng January 2008 (has links)
Machine analysis of human facial and body language is a challenging topic in computer vision, with an impact on important applications such as human-computer interaction and visual surveillance. In this thesis, we present research building towards computational frameworks capable of automatically understanding facial expression and behavioural body language. The thesis commences with a thorough examination of issues surrounding facial representation based on Local Binary Patterns (LBP). Extensive experiments with different machine learning techniques demonstrate that LBP features are efficient and effective for person-independent facial expression recognition, even in low-resolution settings. We then present and evaluate a conditional-mutual-information-based algorithm to efficiently learn the most discriminative LBP features, and show that the best recognition performance is obtained by SVM classifiers with the selected LBP features. However, this recognition is performed on static images without exploiting the temporal behaviour of facial expressions. We therefore present a method to capture and represent temporal dynamics of facial expression by discovering the underlying low-dimensional manifold. Locality Preserving Projections (LPP) is exploited to learn the expression manifold in the LBP-based appearance feature space. By deriving a universal discriminant expression subspace using a supervised LPP, we can effectively align manifolds of different subjects on a generalised expression manifold. Different linear subspace methods are comprehensively evaluated for expression subspace learning, and we formulate and evaluate a Bayesian framework for dynamic facial expression recognition employing the derived manifold representation. However, the manifold representation only addresses temporal correlations of the whole face image and does not consider spatio-temporal correlations among different facial regions. We then employ Canonical Correlation Analysis (CCA) to capture correlations among face parts. To overcome the inherent limitations of classical CCA for image data, we introduce and formalise a novel Matrix-based CCA (MCCA), which better measures correlations in 2D image data. We show this technique provides superior performance in regression and recognition tasks while requiring significantly fewer canonical factors. All the above work focuses on facial expressions. However, the face is usually perceived not as an isolated object but as an integrated part of the whole body, and the visual channel combining facial and bodily expressions is the most informative. Finally, we investigate two understudied problems in body language analysis: gait-based gender discrimination and affective body gesture recognition. To effectively combine face and body cues, CCA is adopted to establish the relationship between the two modalities and derive a semantic joint feature space for feature-level fusion. Experiments on large datasets demonstrate that our multimodal systems achieve superior performance in gender discrimination and affective state analysis.
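As a concrete anchor for the LBP representation the thesis builds on, the basic 3x3 operator thresholds the eight neighbours of each pixel against the centre and packs the results into one byte. A minimal sketch; the neighbour ordering is a common convention, assumed here rather than taken from the thesis:

```python
import numpy as np

def lbp_code(patch: np.ndarray) -> int:
    """patch: a 3x3 grayscale neighbourhood; returns the 8-bit LBP code."""
    center = patch[1, 1]
    # Clockwise neighbour order starting at the top-left corner.
    neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                  patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    return sum(1 << i for i, n in enumerate(neighbours) if n >= center)

patch = np.array([[5, 9, 1],
                  [4, 6, 7],
                  [2, 8, 3]])
print(lbp_code(patch))  # 42: bits set where a neighbour >= the centre value 6
```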
