151

Topics in Content Based Image Retrieval : Fonts and Color Emotions

Solli, Martin January 2009 (has links)
Two novel contributions to Content Based Image Retrieval are presented and discussed. The first is a search engine for font recognition, intended for searching very large font databases. The input to the search engine is an image of a text line, and the output is the name of the font used to print the text. After pre-processing and segmentation of the input image, a local approach is used, where features are calculated for individual characters. The method is based on eigenimages calculated from edge-filtered character images, which enables compact feature vectors that can be computed rapidly. A system for visualizing the entire font database is also proposed. Applying geometry-preserving linear and non-linear manifold learning methods, the structure of the high-dimensional feature space is mapped to a two-dimensional representation, which can be reorganized into a grid-based display. The performance of the search engine and the visualization tool is illustrated with a large database containing more than 2700 fonts. The second contribution is the inclusion of color-based emotion-related properties in image retrieval. The color emotion metric used is derived from psychophysical experiments and uses three scales: activity, weight and heat. It was originally designed for single colors and was later extended to include pairs of colors. A modified approach for statistical analysis of color emotions in images, involving transformations of ordinary RGB histograms, is used for image classification and retrieval. The methods are very fast in feature extraction, and the descriptor vectors are very short. This is essential in the intended application: searching huge image databases containing millions or billions of images. The proposed method is evaluated in psychophysical experiments, using both category scaling and interval scaling.
The results show that people in general perceive color emotions for multi-colored images in similar ways, and that observer judgments correlate with the derived values. Both the font search engine and the emotion-based retrieval system are implemented in publicly available search engines. User statistics gathered over periods of 20 and 14 months, respectively, are presented and discussed.
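The eigenimage-based feature extraction described above can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the helper names and the plain nearest-neighbour matching step are assumptions for this sketch.

```python
import numpy as np

def build_eigenspace(char_images, n_components=8):
    """Stack edge-filtered character images as vectors and compute an
    eigenimage basis (PCA) that yields compact feature vectors."""
    X = np.stack([img.ravel().astype(float) for img in char_images])
    mean = X.mean(axis=0)
    # SVD of the centered data gives the principal axes (eigenimages).
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def project(img, mean, basis):
    """Project a character image onto the eigenimage basis."""
    return basis @ (img.ravel().astype(float) - mean)

def nearest_font(query_feat, db_feats, db_labels):
    """Return the font label whose stored feature vector is closest
    to the query character's feature vector."""
    d = np.linalg.norm(db_feats - query_feat, axis=1)
    return db_labels[int(np.argmin(d))]
```

Because the eigenimage projection is a single matrix-vector product, features for each segmented character can be computed rapidly, which matters when the database holds thousands of fonts.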
152

Colorimetric and Multispectral Image Acquisition

Nyström, Daniel January 2006 (has links)
The trichromatic principle of representing color has long been dominant in color imaging. The reason is the trichromatic nature of human color vision, but as the characteristics of typical color imaging devices differ from those of human eyes, there is a need to go beyond the trichromatic approach. The interest in multi-channel imaging, i.e. increasing the number of color channels, has made it an active research topic with substantial application potential. To achieve consistent color imaging, one needs to map the imaging-device data to the device-independent colorimetric representations CIEXYZ or CIELAB, the key concept of color management. As the color coordinates depend not only on the reflective spectrum of the object but also on the spectral properties of the illuminant, the colorimetric representation suffers from metamerism, i.e. objects of the same color under a specific illumination may appear different when they are illuminated by other light sources. Furthermore, when the sensitivities of the imaging device differ from the CIE color matching functions, two spectra that appear different to human observers may result in identical device responses. In contrast, in multispectral imaging, color is represented by the object's physical characteristics, namely its spectrum, which is illuminant independent. With multispectral imaging, different spectra are readily distinguishable, whether they are metameric or not. The spectrum can then be transformed to any color space and rendered under any illumination. The focus of the thesis is high-quality image acquisition in colorimetric and multispectral formats. The image acquisition system used is an experimental system with great flexibility in illumination and image acquisition setup. Besides the conventional trichromatic RGB filters, the system also provides the possibility of acquiring multi-channel images, using 7 narrowband filters.
A thorough calibration and characterization of all the components involved in the image acquisition system is carried out. The spectral sensitivity of the CCD camera, which cannot be derived by direct measurements, is estimated using least squares regression, optimizing the camera response with respect to the measured spectral reflectance of carefully selected color samples. To derive mappings to colorimetric and multispectral representations, two conceptually different approaches are used. In the model-based approach, the physical model describing the image acquisition process is inverted to reconstruct spectral reflectance from the recorded device response. In the empirical approach, the characteristics of the individual components are ignored, and the functions are derived by relating the device responses for a set of test colors to the corresponding colorimetric and spectral measurements, using linear and polynomial least squares regression. The results indicate that for trichromatic imaging, accurate colorimetric mappings can be derived with the empirical approach, using polynomial regression to CIEXYZ and CIELAB. Because of the media dependency, the characterization functions should be derived for each combination of media and colorants. Accurate spectral reconstruction, however, requires multi-channel imaging and the model-based approach. Moreover, the model-based approach is general, since it is based on the spectral characteristics of the image acquisition system rather than the characteristics of a set of color samples. / Report code: LiU-TEK-LIC-2006:70
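The empirical characterization approach, polynomial least-squares regression from device response to CIEXYZ, can be illustrated with a small sketch. The function names and the choice of a second-order expansion are assumptions for this illustration, not the thesis's exact formulation.

```python
import numpy as np

def poly_terms(rgb):
    """Second-order polynomial expansion of an (N, 3) RGB array:
    [1, R, G, B, R^2, G^2, B^2, RG, RB, GB]."""
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    return np.stack([np.ones_like(r), r, g, b,
                     r * r, g * g, b * b, r * g, r * b, g * b], axis=1)

def fit_characterization(rgb_train, xyz_train):
    """Least-squares fit of a polynomial device-to-CIEXYZ mapping:
    relate device responses for a set of test colors to the
    corresponding colorimetric measurements."""
    A = poly_terms(rgb_train)
    coeffs, *_ = np.linalg.lstsq(A, xyz_train, rcond=None)
    return coeffs  # shape (10, 3): one column per X, Y, Z

def apply_characterization(rgb, coeffs):
    """Map new device responses through the fitted polynomial."""
    return poly_terms(rgb) @ coeffs
```

Because the fit is anchored to a set of measured color samples, the derived coefficients are media-dependent, which is exactly why the abstract notes that a separate characterization is needed per media/colorant combination.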
153

Multiple Session 3D Reconstruction using RGB-D Cameras / 3D-rekonstruktion med RGB-D kamera över multipla sessioner

Widebäck West, Nikolaus January 2014 (has links)
In this thesis we study the problem of multi-session dense RGB-D SLAM for 3D reconstruction. Multi-session reconstruction can allow users to capture parts of an object that could not easily be captured in one session, for instance due to poor accessibility or user mistakes. We first present a thorough overview of single-session dense RGB-D SLAM and describe the multi-session problem as a loosening of the incremental-camera-movement and static-scene assumptions commonly held in the single-session case. We then implement and evaluate several variations of a system for two-session reconstruction as an extension to a single-session dense RGB-D SLAM system. The extension from one to several sessions is divided into registering the separate sessions into a single reference frame, re-optimizing the camera trajectories, and fusing the data together to generate a final 3D model. Registration is done by matching reconstructed models from the separate sessions using one of two adaptations of a 3D object detection pipeline. The registration pipelines are evaluated with many different sub-steps on a challenging dataset, and it is found that robust registration can be achieved using the proposed methods on scenes without degenerate shape symmetry. In particular, we find that using plane matches between two sessions as constraints for as much as possible of the registration pipeline improves results. Several different strategies for re-optimizing camera trajectories using data from both sessions are implemented and evaluated. The re-optimization strategies are based on re-tracking the camera poses from all sessions together, and then optionally optimizing over the full problem as represented in a pose graph. The camera tracking is done by incrementally building and tracking against a TSDF volume, from which a final 3D mesh model is extracted. The whole system is qualitatively evaluated against a realistic dataset for multi-session reconstruction.
It is concluded that the overall approach is successful in reconstructing objects from several sessions, but that other fine-grained registration methods would be required in order to achieve multi-session reconstructions that are indistinguishable from single-session results in terms of reconstruction quality.
154

Pedestrian Detection Using Convolutional Neural Networks

Molin, David January 2015 (has links)
Pedestrian detection is an important field with applications in active safety systems for cars as well as autonomous driving. Since autonomous driving and active safety are now becoming technically feasible, interest in these applications has increased dramatically. The aim of this thesis is to investigate convolutional neural networks (CNNs) for pedestrian detection. The reason for this is that CNNs have recently been successfully applied to several different computer vision problems. The main applications of pedestrian detection are in real-time systems. For this reason, this thesis investigates strategies for reducing the computational complexity of forward propagation for CNNs. The approach used in this thesis for extracting pedestrians is to use a CNN to find a probability map of where pedestrians are located. From this probability map, bounding boxes for pedestrians are generated. A method for handling scale invariance for the objects of interest has also been developed in this thesis. Experiments show that using this method gives significantly better results for the problem of pedestrian detection. The accuracy achieved in this thesis is similar to that of other works that use CNNs.
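Generating bounding boxes from a per-pixel probability map, as the abstract describes, can be done in several ways; one minimal sketch (an illustrative assumption, not the author's method) thresholds the map and boxes each connected high-probability region with a flood fill:

```python
import numpy as np

def boxes_from_probmap(prob, thresh=0.5):
    """Threshold a per-pixel pedestrian probability map and return one
    bounding box (x0, y0, x1, y1) per 4-connected region above the
    threshold, found with a simple stack-based flood fill."""
    mask = prob >= thresh
    seen = np.zeros_like(mask, dtype=bool)
    h, w = mask.shape
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if mask[sy, sx] and not seen[sy, sx]:
                stack = [(sy, sx)]
                seen[sy, sx] = True
                y0 = y1 = sy
                x0 = x1 = sx
                while stack:
                    y, x = stack.pop()
                    y0, y1 = min(y0, y), max(y1, y)
                    x0, x1 = min(x0, x), max(x1, x)
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            stack.append((ny, nx))
                # exclusive upper bounds, image convention
                boxes.append((x0, y0, x1 + 1, y1 + 1))
    return boxes
```

Since the CNN only produces the probability map once per frame, this post-processing step is cheap compared with forward propagation, which is why the thesis focuses its complexity reduction on the network itself.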
155

3D Position Estimation of a Person of Interest in Multiple Video Sequences : Person of Interest Recognition / 3D positions estimering av sökt person i multipla videosekvenser : Igenkänning av sökt person

Johansson, Victor January 2013 (has links)
Because of the increase in the number of security cameras, there is more video footage available than a human could efficiently process. In combination with the fact that computers are getting more powerful, it is becoming more and more interesting to solve the problem of detecting and recognizing people automatically. Therefore a method is proposed for estimating the 3D path of a person of interest in multiple, non-overlapping, monocular cameras. This project is a collaboration between two master's theses. This thesis focuses on recognizing a person of interest from several possible candidates, as well as estimating the 3D position of a person and providing a graphical user interface for the system. The recognition of the person of interest includes keeping track of said person frame by frame, and identifying said person in video sequences where the person of interest has not been seen before. The final product is able to both detect and recognize people in video, as well as estimate their 3D position relative to the camera. The product is modular, and any part can be improved or changed completely without changing the rest of the product. This results in a highly versatile product which can be tailored to any given situation.
156

3D Position Estimation of a Person of Interest in Multiple Video Sequences : People Detection

Markström, Johannes January 2013 (has links)
Today, when a specific person's whereabouts are monitored through video surveillance, it is in most cases done manually, and his or her location when not seen is based on assumptions about how fast he or she can move. Since humans are good at recognizing people, this can be done accurately given good video data, but the time needed to go through all the data is extensive and therefore expensive. Because of the rapid technical development, computers are getting cheaper to use and therefore more interesting for tedious work. This thesis is part of a larger project that aims to see to what extent it is possible to estimate a person of interest's time-dependent 3D position when seen in surveillance videos. The surveillance videos are recorded with non-overlapping monocular cameras. Furthermore, the project aims to see if the person of interest's movement, when position data is unavailable, can be predicted. The outcome of the project is software capable of following a person of interest's movement, with an error estimate visualized as an area indicating where the person of interest might be at a specific time. The main focus of this thesis is to implement and evaluate a people detector to be used in the project, reduce noise in the position measurements, predict the position when the person of interest's location is unknown, and evaluate the complete project. The project combines known methods in computer vision and signal processing, and the outcome is software that can be used on a normal PC running a Windows operating system. The software implemented in the thesis uses a Hough-transform-based people detector and a Kalman filter for one-step-ahead prediction. The detector is evaluated with known methods such as miss rate vs. false positives per window or image (FPPW and FPPI, respectively) and recall vs. 1-precision. The results indicate that it is possible to estimate a person of interest's 3D position with single monocular cameras.
It is also possible to follow the movement, to some extent, where position data are unavailable. However, the software needs more work in order to be robust enough to handle the diversity that may appear in different environments and to handle large-scale sensor networks.
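The Kalman-filter-based one-step-ahead prediction can be sketched for a single coordinate. The constant-velocity motion model and the noise parameters below are illustrative assumptions, not the thesis's tuned values; skipping the update step models frames where the person is not seen.

```python
import numpy as np

def kalman_step(x, P, z, dt=1.0, q=1e-2, r=1e-1):
    """One predict/update cycle of a constant-velocity Kalman filter on
    a 1D position track. State x = [position, velocity]. Returns the
    updated state, covariance, and the one-step-ahead position
    prediction. Pass z=None when no measurement is available."""
    F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity motion
    H = np.array([[1.0, 0.0]])              # we observe position only
    Q = q * np.eye(2)                       # process noise
    R = np.array([[r]])                     # measurement noise
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update (skipped when the person is not seen this frame)
    if z is not None:
        y = np.array([z]) - H @ x           # innovation
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
        x = x + K @ y
        P = (np.eye(2) - K @ H) @ P
    predicted_next_pos = (F @ x)[0]
    return x, P, predicted_next_pos
```

The growing covariance P during measurement-free frames is what would drive the visualized area indicating where the person might be at a specific time.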
157

Development of an Active Vision System for Demonstrating EDSDK++ in Computer Vision Applications / Utveckling av ett active vision system för demonstration av EDSDK++ i tillämpningar inom datorseende

Kargén, Rolf January 2014 (has links)
Datorseende är ett snabbt växande, tvärvetenskapligt forskningsområde vars tillämpningar tar en allt mer framskjutande roll i dagens samhälle. Med ett ökat intresse för datorseende ökar också behovet av att kunna kontrollera kameror kopplade till datorseende system. Vid Linköpings tekniska högskola, på avdelningen för datorseende, har ramverket EDSDK++ utvecklats för att fjärrstyra digitala kameror tillverkade av Canon Inc. Ramverket är mycket omfattande och innehåller en stor mängd funktioner och inställningsalternativ. Systemet är därför till stor del ännu relativt oprövat. Detta examensarbete syftar till att utveckla ett demonstratorsystem till EDSDK++ i form av ett enkelt active vision system, som med hjälp av ansiktsdetektion i realtid styr en kameratilt, samt en kamera monterad på tilten, till att följa, zooma in och fokusera på ett ansikte eller en grupp av ansikten. Ett krav var att programbiblioteket OpenCV skulle användas för ansiktsdetektionen och att EDSDK++ skulle användas för att kontrollera kameran. Dessutom skulle ett API för att kontrollera kameratilten utvecklas. Under utvecklingsarbetet undersöktes bl.a. olika metoder för ansiktsdetektion. För att förbättra prestandan användes multipla ansiktsdetektorer, som med hjälp av multitrådning avsöker en bild parallellt från olika vinklar. Såväl experimentella som teoretiska ansatser gjordes för att bestämma de parametrar som behövdes för att kunna reglera kamera och kameratilt. Resultatet av arbetet blev en demonstrator, som uppfyllde samtliga krav. / Computer vision is a rapidly growing, interdisciplinary field whose applications are taking an increasingly prominent role in today's society. With an increased interest in computer vision there is also an increasing need to be able to control cameras connected to computer vision systems. At the division of computer vision, at Linköping University, the framework EDSDK++ has been developed to remotely control digital cameras made by Canon Inc. 
The framework is very comprehensive and contains a large number of features and configuration options. The system is therefore still largely untested. This thesis aims to develop a demonstrator for EDSDK++ in the form of a simple active vision system, which uses real-time face detection to steer a camera tilt, and a camera mounted on the tilt, to follow, zoom in on and focus on a face or a group of faces. A requirement was that the OpenCV library be used for the face detection and that EDSDK++ be used to control the camera. Moreover, an API for controlling the camera tilt was to be developed. During development, different methods for face detection were investigated. To improve performance, multiple face detectors, running in parallel via multithreading, scan the image from different angles. Both experimental and theoretical approaches were taken to determine the parameters needed to control the camera and the camera tilt. The project resulted in a fully functional demonstrator which fulfilled all requirements.
158

Evaluating the Impact of Uncertainty on the Integrity of Deep Neural Networks

Harborn, Jakob January 2021 (has links)
Deep Neural Networks (DNNs) have demonstrated excellent performance and are very successful in image classification and object detection. Safety-critical industries such as the automotive and aerospace industries aim to develop autonomous vehicles with the help of DNNs. In order to certify the usage of DNNs in safety-critical systems, it is essential to prove the correctness of the data within the system. In this thesis, the research focuses on investigating the sources of uncertainty, the effects various sources of uncertainty have on neural networks (NNs), and how it is possible to reduce uncertainty within an NN. Probabilistic methods are used to implement an NN with uncertainty estimation in order to analyze and evaluate how the integrity of the NN is affected. By analyzing and discussing the effects of uncertainty in an NN, it is possible to understand the importance of including a method for estimating uncertainty. Preventing, reducing, or removing the presence of uncertainty in such a network improves the correctness of the data within the system. With the implementation of the NN, results show that estimating uncertainty makes it possible to identify and classify the presence of uncertainty in the system and to reduce it, achieving an increased level of integrity that improves the correctness of the predictions.
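As a toy illustration of probabilistic uncertainty estimation: the abstract does not name the concrete method, so the ensemble/Monte-Carlo-style predictive-entropy sketch below is an assumption, showing only the general idea of turning multiple stochastic forward passes into an uncertainty score.

```python
import numpy as np

def predictive_uncertainty(member_probs):
    """Given class-probability vectors from several stochastic forward
    passes (e.g. an ensemble, or MC dropout samples), return the mean
    prediction and its predictive entropy as an uncertainty estimate.
    High entropy flags inputs whose prediction should not be trusted."""
    p = np.mean(member_probs, axis=0)            # average over members
    entropy = -np.sum(p * np.log(p + 1e-12))     # predictive entropy
    return p, entropy
```

Flagging high-entropy predictions is one concrete way an uncertainty estimate can be used to protect the integrity of downstream decisions: uncertain outputs can be rejected or deferred instead of acted on.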
159

Classification of tree species from 3D point clouds using convolutional neural networks

Wiklander, Marcus January 2020 (has links)
In forest management, knowledge about a forest's distribution of tree species is key. Being able to automate tree species classification for large forest areas is of great interest, since doing it manually is tedious and costly. In this project, the aim was to investigate the efficiency of classifying individual tree species (pine, spruce and deciduous forest) from 3D point clouds acquired by airborne laser scanning (ALS), using convolutional neural networks. The raw data consisted of 3D point clouds and photographic images of forests in northern Sweden, collected from a helicopter flying at low altitude. The point cloud of each individual tree was linked to its representation in the photos, which allowed manual labeling of training data for the convolutional neural networks. The training data consisted of labels and 2D projections created from the point clouds, represented as images. Two different convolutional neural networks were trained and tested: an adaptation of the LeNet architecture and the ResNet architecture. Both networks reached an accuracy close to 98 %, with the LeNet adaptation having a slightly lower loss score for both validation and test data compared to ResNet. Confusion matrices for both networks showed similar F1 scores for all tree species, between 97 % and 98 %. The accuracies computed for both networks were higher than those achieved in similar studies using ALS data to classify individual tree species. However, the results in this project were never tested against a true population sample to confirm the accuracy. To conclude, the use of convolutional neural networks is indeed an efficient method for classification of tree species, but further studies on unbiased data are needed to validate these results.
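Turning a tree's 3D point cloud into a 2D projection image for a CNN, as described above, might look roughly like this. The choice of projection plane and the density normalization are assumptions for this sketch, not the project's exact preprocessing.

```python
import numpy as np

def project_to_image(points, res=32):
    """Project a tree's 3D point cloud (N, 3) onto a vertical (x, z)
    plane as a normalized point-density image, the kind of 2D
    representation that can be fed to an image CNN."""
    xz = points[:, [0, 2]]                       # keep width and height
    lo = xz.min(axis=0)
    span = np.maximum(xz.max(axis=0) - lo, 1e-9)
    idx = np.minimum(((xz - lo) / span * res).astype(int), res - 1)
    img = np.zeros((res, res))
    for i, j in idx:
        img[res - 1 - j, i] += 1.0               # row 0 = top of the tree
    return img / img.max()                       # normalize densities
```

Each labeled tree then becomes one (image, species) training pair, so standard 2D architectures such as the LeNet adaptation and ResNet mentioned above can be applied without any 3D-specific layers.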
160

Natural Fingerprinting of Steel

Strömbom, Johannes January 2021 (has links)
A cornerstone of the industry's ongoing digital revolution, sometimes referred to as Industry 4.0, is the ability to trace products not only within one's own production line but also throughout the remaining lifetime of the products. Traditionally, this is done by labeling products with, for instance, bar codes or radio-frequency identification (RFID) tags. In recent years, using the structure of the product itself as a unique identifier, a "fingerprint", has become a popular area of research. The purpose of this work was to develop software for an identification system using laser speckles as unique identifiers of steel components. Laser speckles, or simply speckles, are generated by illuminating a rough surface with coherent light, typically laser light. As the light is reflected, the granular pattern known as speckle can be seen by an observer. The complex nature of a speckle pattern, together with its sensitivity to changes in the setup, makes it robust against false-positive identifications and almost impossible to counterfeit. Because of this, speckles are suitable as unique identifiers. In this work, three different identification algorithms were tested in both simulations and experiments: one correlation-based method, one method based on local feature extraction, and one method based on global feature extraction. The results showed that the correlation-based identification is the most robust against speckle decorrelation, i.e. changes in the speckle pattern, while being quite computationally expensive. The local feature-based method was shown to be unfit for the current application due to its sensitivity to speckle decorrelation and erroneous results. The global feature extraction method achieved high accuracy and fast computation when combined with a clustering method based on overlapping speckle patterns and a k-nearest neighbours (k-NN) search.
In all the investigated methods, parallel calculations can be utilized to increase the computational speed.
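The correlation-based identification with a guard against false positives can be sketched as follows. The acceptance threshold and function names are illustrative assumptions; the thesis's actual correlation method and tuning are not reproduced here.

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation between two equal-size
    speckle images; 1.0 means a perfect match, ~0 means unrelated."""
    a = a - a.mean()
    b = b - b.mean()
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def identify(query, database, min_score=0.7):
    """Return the index of the best-matching enrolled speckle pattern,
    or None if nothing correlates above the acceptance threshold
    (the guard against false-positive identifications)."""
    scores = [ncc(query, ref) for ref in database]
    best = int(np.argmax(scores))
    return best if scores[best] >= min_score else None
```

The per-reference correlations are independent, so this is also the kind of loop that the parallel calculations mentioned above can speed up by distributing database entries across threads or processes.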
