771

Weighted image registration

Λαμπρινού, Νεφέλη 15 June 2015 (has links)
Image registration is one of the most important problems in computer vision, since the alignment of two or more images is used, at least as a preprocessing step, in a large number of applications. In this work we address the alignment of images in which the photometric distortions are local and cannot be modeled by the usual global contrast-and-brightness model, and/or in which portions of one image are occluded in the other. To handle these problems, registration is approached through the weighted minimization of error metrics based on the squared error. Specifically, we exploit the invariance of the normalized image gradient to local photometric distortions and the fact that each pair of corresponding pixels can be aligned by maximizing the correlation between them. This decouples the original problem into two subproblems whose solution reduces to two overdetermined systems of linear equations, each having as unknowns the per-direction parameters of the transformation sought to remove the geometric distortion, and as right-hand side the values of the photometric distortions. Finally, by selecting two suitable subsets of these linear equations that guarantee the feasibility of the partial solutions, we arrive at the optimal parameters. The proposed technique was tested on the Yale B face database, which has been used by other registration techniques specifically tailored to face alignment. Its performance is very good: it outperforms the other techniques in both convergence rate and solution accuracy, both on images that have undergone geometric distortions (from very mild to very severe) and on images with different, severe photometric distortions. The technique was also tested, with equally good performance, on the Affine Covariant Regions datasets of the University of Oxford, in which the image content is generic and the special-purpose techniques fail.
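The abstract does not include an implementation, but its core numerical step — a weighted least-squares solve of an overdetermined linear system for the per-direction transform parameters — can be sketched as below. The matrix J, right-hand side r, and weights w are stand-ins for the quantities the thesis derives from the normalized image gradients, not its actual formulation:

```python
import numpy as np

def weighted_lls(J, r, w):
    """Solve an overdetermined system J p ~ r in the weighted
    least-squares sense: min_p || W^(1/2) (J p - r) ||^2.

    J : (n, k) equation matrix, one row per selected pixel equation
    r : (n,)   right-hand side (photometric deformation values)
    w : (n,)   non-negative per-equation weights
    """
    sw = np.sqrt(w)
    # Scale each equation by the square root of its weight, then solve.
    p, *_ = np.linalg.lstsq(J * sw[:, None], r * sw, rcond=None)
    return p

# Toy usage: 100 equations, 3 unknown per-direction parameters.
rng = np.random.default_rng(0)
J = rng.normal(size=(100, 3))
p_true = np.array([0.5, -1.0, 2.0])
r = J @ p_true + 0.01 * rng.normal(size=100)
w = rng.uniform(0.1, 1.0, size=100)   # e.g., down-weight occluded pixels
print(weighted_lls(J, r, w))          # recovers values close to p_true
```

Down-weighting (rather than discarding) suspect equations is what lets such a scheme tolerate occluded regions while still using every reliable pixel.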
772

Who Moved My Slide? Recognizing Entities In A Lecture Video And Its Applications

Tung, Qiyam Junn January 2014 (has links)
Lecture videos have proliferated in recent years thanks to increasing Internet bandwidth and the availability of video cameras. Despite the massive volume of videos available, very few systems parse useful information from them. Extracting meaningful data can help with searching and indexing of lecture videos as well as improve understanding and usability for viewers. While video tags and user preferences are good indicators of relevant videos, they depend entirely on human-generated data. Furthermore, many lecture videos are technical by nature, and sparse video tags are too coarse-grained to relate parts of a video by a specific topic. Extracting the text from the presentation slides ameliorates this issue, but a lecture video still contains significantly more information than the slides alone: the actions and words of the speaker contribute to a richer and more nuanced understanding of the lecture material. The goal of the Semantically Linked Instructional Content (SLIC) project is to relate videos using more specific and relevant features such as slide text and other entities. In this work, we present the algorithms used to recognize the entities of a lecture: laser and pointing hand gestures, and the location of the slide, its text, and its images in the video. Our algorithms work under the assumption that the slide location (homography) is known for each frame, and they extend the knowledge of the scene; in particular, gestures inform when and where on a slide notable events occur. We also show how recognition of these entities can help viewers understand lectures better and save energy on mobile devices. A user study shows that magnifying text based on laser gestures on a slide helps direct a viewer's attention to the relevant text, and empirical measurements on real cellphones confirm that selectively dimming less relevant regions of the video frame reduces energy consumption significantly.
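The abstract assumes the slide-to-frame homography is known per frame. A minimal sketch of the one step that assumption enables — mapping a detected laser point from frame coordinates back onto the slide — follows; the homography values are illustrative, not from the thesis:

```python
import numpy as np

def frame_to_slide(point_xy, H):
    """Map a frame-coordinate point onto the slide using the inverse
    of the slide-to-frame homography H (3x3), in homogeneous coords."""
    p = np.array([point_xy[0], point_xy[1], 1.0])
    q = np.linalg.inv(H) @ p
    return q[:2] / q[2]                     # de-homogenize

# Illustrative homography: slide scaled by 0.8 and shifted into the frame.
H = np.array([[0.8, 0.0, 120.0],
              [0.0, 0.8,  60.0],
              [0.0, 0.0,   1.0]])
laser_in_frame = (520.0, 300.0)             # e.g., a detected laser dot
print(frame_to_slide(laser_in_frame, H))    # -> (500, 300) in slide coords
```

Once a gesture lands in slide coordinates, it can be compared directly against known text and image regions on the slide, which is what makes magnification and selective dimming possible.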
773

Digital Trails

Morris, Scott Howard January 2010 (has links)
"May your trails be crooked, winding, lonesome, dangerous, leading to the most amazing view. May your mountains rise into and above the clouds." (Edward Abbey)

The digital representation of trails is a relatively new concept. Only in the last decade, with the increasing adoption and accuracy of GPS technology, have large quantities of reliable data become a reality. However, the development of algorithms specific to processing digital trails has received little attention. This dissertation presents a set of methods for collecting, improving, and processing digital trails, laying the groundwork for a science of trails. We first present a solution to the GPS-network problem, which determines the salient trails and structure of a trail network from a set of GPS tracklogs. This method has received significant attention from industry and online GPS-sharing sites, since it provides the basis for forming a digital library of trails from user-submitted GPS tracks. A set of tracks through a trail network further presents the opportunity to model and understand trail-user behavior; such models are useful to land managers faced with difficult management decisions. We present the K-history model, a probabilistic method for understanding and simulating trail-user decisions based on GPS data, use it to evaluate current simulation techniques, and show how optimizing the number of historical decisions can lead to better predictive power. With collections of GPS trail data we can begin to learn what trails look like in aerial images. We present a statistical learning approach for automatically extracting trail data from aerial imagery, using GPS data to train our model. While the problem of recognizing relatively straight and well-defined roads has been well studied in the literature, the more difficult problem of extracting trails has received no attention. We extensively test our method on a 2,500-mile trail, showing promise for obtaining digital trail data without the use of GPS. These methods open further possibilities for the study of trails and trail-user behavior, resulting in increased opportunity for the outdoors lover and more informed management of our natural areas.
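The K-history model is described only at a high level here. A minimal sketch of the underlying idea — conditioning a user's next-edge choice at a junction on the last K edges traveled — might look like this; all identifiers are hypothetical, not the dissertation's code:

```python
from collections import Counter, defaultdict

def fit_k_history(tracks, k=2):
    """Estimate P(next edge | last k edges) from GPS-derived edge
    sequences through a trail network, one sequence per user."""
    counts = defaultdict(Counter)
    for seq in tracks:
        for i in range(k, len(seq)):
            history = tuple(seq[i - k:i])
            counts[history][seq[i]] += 1
    # Normalize raw counts into conditional probabilities.
    return {h: {e: n / sum(c.values()) for e, n in c.items()}
            for h, c in counts.items()}

tracks = [["a", "b", "c", "d"],
          ["a", "b", "c", "e"],
          ["a", "b", "c", "d"]]
model = fit_k_history(tracks, k=2)
print(model[("b", "c")])   # {'d': 0.667, 'e': 0.333} (approximately)
```

"Optimizing the number of historical decisions" then amounts to choosing the K that best predicts held-out tracks.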
774

Vision utility framework : a new approach to vision system development

Afrah, Amir 05 1900 (has links)
We address two aspects of vision-based system development that are not fully exploited in current frameworks: abstraction over low-level details and high-level module reusability. Through an evaluation of existing frameworks, we relate these shortcomings to the lack of a systematic classification of sub-tasks in vision-based system development. Our approach for addressing these two issues is to classify vision into decoupled sub-tasks, hence defining a clear scope for a vision-based system development framework and its sub-components. First, we decompose the task of vision system development into data management and processing. We then further decompose data management into three components: data access, conversion, and transportation. To verify our approach we present two frameworks: the Vision Utility (VU) framework, which provides abstraction over the data management component, and the Hive framework, which provides data transportation and high-level code reuse. VU provides data management functionality for developers while hiding low-level system details behind a simple yet flexible Application Programming Interface (API). VU mediates the communication between the developer's application, vision processing modules, and data sources by utilizing different frameworks for data access, conversion, and transportation (Hive). We demonstrate VU's ability to abstract over low-level system details by examining a vision system developed with the framework. Hive is a standalone event-based framework for developing distributed vision-based systems; it provides simple high-level methods for managing the communication, control, and configuration of reusable components. We verify the requirements of Hive (reusability and abstraction over inter-module data transportation) by presenting a number of different systems built on the framework from a set of reusable modules. Through this work we aim to demonstrate that this novel approach could fundamentally change vision-based system development by providing the necessary abstraction and promoting high-level code reuse.
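The decomposition is described abstractly above. A minimal sketch of the three decoupled data-management roles as interfaces follows; the names are hypothetical illustrations of the separation of concerns, not the actual VU or Hive API:

```python
from abc import ABC, abstractmethod

class DataAccess(ABC):
    """Acquire raw frames from a source (camera, file, network)."""
    @abstractmethod
    def read(self) -> bytes: ...

class DataConversion(ABC):
    """Convert raw frames between pixel formats or representations."""
    @abstractmethod
    def convert(self, raw: bytes, target_format: str) -> bytes: ...

class DataTransport(ABC):
    """Move frames between distributed processing modules."""
    @abstractmethod
    def send(self, destination: str, frame: bytes) -> None: ...
```

The point of the separation is that a processing module written against these roles can be reused unchanged when the source, pixel format, or transport mechanism changes.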
775

Recursive Estimation of Structure and Motion from Monocular Images

Fakih, Adel January 2010 (has links)
The determination of the 3D motion of a camera and the 3D structure of the scene in which the camera is moving, known as the Structure from Motion (SFM) problem, is a central problem in computer vision. The recursive (online) estimation is of major interest for robotics applications such as navigation and mapping. Several problems still hinder the deployment of SFM in real-life applications, namely (1) robustness to noise, outliers, and ambiguous motions, (2) numerical tractability with a large number of features, and (3) rapidly varying camera velocities. Towards solving these problems, this research presents four contributions that can be used individually, together, or combined with other approaches. First, a motion-only filter is devised by capitalizing on algebraic threading constraints. This filter efficiently integrates information over multiple frames, achieving performance comparable to the best state-of-the-art filters; unlike other filter-based approaches, however, it is not affected by large baselines (displacements between camera centers). Second, an approach is introduced to incorporate, with only a small computational overhead, a large number of frame-to-frame features (i.e., features matched only in pairs of consecutive frames) into any analytic filter. The overhead grows linearly with the number of added features, and the experimental results show increased accuracy and consistency. Third, a novel filtering approach that scales to a large number of features is proposed; it matches the scalability of the most scalable state-of-the-art filter and the accuracy of the most accurate one. Fourth, a solution is presented to the problem of prediction over large baselines in monocular Bayesian filters. This problem arises because a simple prediction, using constant-velocity models for example, is not suitable for large baselines, while the projections of the 3D points in the state vector cannot be used in the prediction without violating the statistical independence of the prediction and update steps.
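As context for the last contribution, here is a minimal sketch of the constant-velocity prediction step in a Bayesian filter — the step the thesis argues breaks down over large baselines. The 1-D state layout and noise values are illustrative only:

```python
import numpy as np

def predict_cv(x, P, dt, q=1e-3):
    """Constant-velocity prediction for a state [position, velocity].

    Shown in 1-D for clarity; a camera state stacks several such
    blocks (translation and rotation components)."""
    F = np.array([[1.0, dt],
                  [0.0, 1.0]])                 # position += velocity * dt
    Q = q * np.array([[dt**3 / 3, dt**2 / 2],
                      [dt**2 / 2, dt]])        # white-acceleration noise
    return F @ x, F @ P @ F.T + Q              # predicted mean, covariance

x = np.array([0.0, 1.0])                       # at origin, 1 unit/s
P = np.eye(2) * 0.01
x, P = predict_cv(x, P, dt=0.5)
print(x)                                       # [0.5, 1.0]
```

Over a short baseline this extrapolation stays close to the truth; over a large one the predicted state can drift far enough that the subsequent measurement update fails, which is the failure mode the fourth contribution targets.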
776

An Energy Efficient FPGA Hardware Architecture for the Acceleration of OpenCV Object Detection

Brousseau, Braiden 21 November 2012 (has links)
The use of Computer Vision in programmable mobile devices could lead to novel and creative applications. However, the computational demands of Computer Vision are ill-suited to low-performance mobile processors, and its rapidly evolving algorithms, driven by active research in this field, are ill-suited to dedicated digital circuits. This thesis proposes the inclusion of an FPGA co-processor in smartphones as a means of efficiently computing tasks such as Computer Vision. To motivate this proposal, an open-source object detection algorithm is run on a mobile device and implemented on an FPGA. Our hardware implementation presents a novel memory architecture and a SIMD processing style that achieves both high performance and energy efficiency. The FPGA implementation outperforms a mobile device by 59 times while being 13.5 times more energy-efficient.
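The two reported figures jointly imply the FPGA's relative power draw. Since energy = power x time, a quick check using only the numbers quoted above:

```python
speedup = 59.0           # t_mobile / t_fpga, as reported
energy_ratio = 13.5      # E_mobile / E_fpga, as reported
# P = E / t, so P_fpga / P_mobile = (E_fpga / E_mobile) * (t_mobile / t_fpga)
power_ratio = speedup / energy_ratio
print(round(power_ratio, 1))   # ~4.4
```

That is, the FPGA draws roughly 4.4 times the instantaneous power of the mobile processor, but finishes the task so much sooner that it still consumes 13.5 times less total energy.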
778

Computational Techniques for Detecting Coronary Atherosclerosis

Abrich, Richard 20 November 2013 (has links)
Coronary atherosclerosis is one of the leading causes of mortality in developed countries and is increasingly diagnosed via X-ray computed tomography. Due to the large resulting volume of data, recent research has been directed towards developing automated methods of screening CT scans for coronary atherosclerosis. This task typically consists of lumen extraction, plaque detection, plaque quantification, and material discrimination. In this work, we describe a novel set of techniques for accomplishing the first three steps, aiming to provide higher precision than previous efforts. We also discuss how such a high-precision detection and quantification system could be used to significantly improve on the state of the art in material discrimination. Our methods extract the lumen for 71.2% of centreline points, detect plaque with a sensitivity of 67% on CTA reference data, and quantify plaque with a linear weighted kappa coefficient of 0.08.
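For context on the last figure: the linearly weighted kappa is the standard chance-corrected agreement statistic for ordinal ratings. With observed and expected confusion matrices O and E over k ordered categories, the usual (disagreement-weight) definition is:

```latex
\kappa_w = 1 - \frac{\sum_{i,j} w_{ij}\, O_{ij}}{\sum_{i,j} w_{ij}\, E_{ij}},
\qquad
w_{ij} = \frac{|i - j|}{k - 1}
```

This is the textbook definition, not a formula taken from the thesis; under it, values near 0 (such as the reported 0.08) indicate agreement with the reference close to chance level.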
779

3-D Reconstruction from Single Projections, with Applications to Astronomical Images

Cormier, Michael January 2013 (has links)
A variety of techniques exist for three-dimensional reconstruction when multiple views are available, but less attention has been given to reconstruction from a single view. Such a situation is normal in astronomy, where a galaxy (for example) is so distant that it is impossible to obtain views from significantly different angles. In this thesis I examine the problem of reconstructing the three-dimensional structure of a galaxy from this single viewpoint. I accomplish this by taking advantage of the image formation process, symmetry relationships, and other structural assumptions that may be made about galaxies. Most galaxies are approximately symmetric in some way; frequently this symmetry is about an axis of rotation, which allows strong statements to be made about the relationships between the luminosity at different points in the galaxy. Through these relationships, the number of unknown values needed to describe the structure of the galaxy can be reduced to the number of constraints provided by the image, so that the optimal reconstruction is well defined. Other structural properties can also be described under this framework. I provide a mathematical framework and analyses that prove the uniqueness of solutions under certain conditions and show how uncertainty may be precisely and explicitly expressed. Empirical results are shown using real and synthetic data. I also compare against a state-of-the-art two-dimensional modelling technique to demonstrate the contrasts between the two frameworks and the important advantages of the three-dimensional approach. In combination, the theoretical and experimental aspects of this thesis demonstrate that the proposed framework is versatile, practical, and novel: a contribution to both computer science and astronomy.
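The abstract states the symmetry argument in prose. As one classical illustration of how axial symmetry ties a single projection to 3-D structure (offered as standard background, not necessarily the formulation used in the thesis): for an optically thin, axisymmetric luminosity density f(r) viewed edge-on, the observed profile F(y) is the Abel transform of f, and it is invertible:

```latex
F(y) = 2 \int_{y}^{\infty} \frac{f(r)\, r}{\sqrt{r^2 - y^2}}\, dr,
\qquad
f(r) = -\frac{1}{\pi} \int_{r}^{\infty} \frac{F'(y)}{\sqrt{y^2 - r^2}}\, dy
```

The existence of such an inverse is exactly the kind of result that makes a single-view reconstruction well defined once a symmetry is assumed.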
780

Evaluation of Quality of Apple Slices During Convection Drying Using Real-Time Image Analysis

Sampson, David 26 August 2011 (has links)
Computer-vision methods for assessing food quality were evaluated for their ability to provide non-contact measurements of apple slices. The methods evaluated were camera calibration, measurement of the physical parameters of apple slices, and measurement of biochemical changes in apple slices. Each measure of food quality assessed by computer vision was compared to a conventional method of measurement. The computer-vision system was capable of measuring the area, thickness, and volume of apple slices. Color measurements from the system correlated with phenolic-compound degradation early in the drying process and with hydroxymethylfurfural development later in the drying process.
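A minimal sketch of the kind of non-contact area measurement described above — the threshold, calibration factor, and file name are placeholders, and the thesis's actual pipeline is not specified here:

```python
import cv2

def slice_area_mm2(image_path, mm_per_px=0.2, thresh=128):
    """Estimate the area of an apple slice from a top-down photo by
    thresholding and taking the largest contour. Assumes the slice
    is brighter than the background."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    _, mask = cv2.threshold(gray, thresh, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    largest = max(contours, key=cv2.contourArea)
    # Convert pixel area to physical units via the camera calibration.
    return cv2.contourArea(largest) * mm_per_px ** 2

# print(slice_area_mm2("slice_0001.png"))   # hypothetical input image
```

Tracking this area frame by frame during drying is what allows shrinkage to be logged in real time without touching the sample.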
