Global ETD Search

161	Methods and systems for vision-based proactive applications Huttunen, S. (Sami) 22 November 2011 (has links) Abstract Human-computer interaction (HCI) is an integral part of modern society. Since the number of technical devices around us is increasing, the way of interacting is changing as well. The systems of the future should be proactive, so that they can adapt and adjust to people’s movements and actions without requiring any conscious control. Visual information plays a vital role in this kind of implicit human-computer interaction due to its expressiveness. It is therefore obvious that cameras equipped with computing power and computer vision techniques provide an unobtrusive way of analyzing human intentions. Despite its many advantages, use of computer vision is not always straightforward. Typically, every application sets specific requirements for the methods that can be applied. Given these motivations, this thesis aims to develop new vision-based methods and systems that can be utilized in proactive applications. As a case study, the thesis covers two different proactive computer vision applications. Firstly, an automated system that takes care of both the selection and switching of the video source in a distance education situation is presented. The system is further extended with a pan-tilt-zoom camera system that is designed to track the teacher when s/he walks at the front of the classroom. The second proactive application is targeted at mobile devices. The system presented recognizes landscape scenes which can be utilized in automatic shooting mode selection. Distributed smart cameras have been an active area of research in recent years, and they play an important role in many applications. Most of the research has focused on either the computer vision algorithms or on a specific implementation. There has been less activity on building generic frameworks which allow different algorithms, sensors and distribution methods to be used. 
In this field, the thesis presents an open and expandable framework for development of distributed sensor networks with an emphasis on peer-to-peer networking. From the methodological point of view, the thesis makes its contribution to the field of multi-object tracking. The method presented utilizes soft assignment to associate the measurements to the objects tracked. In addition, the thesis also presents two different ways of extracting location measurements from images. As a result, the method proposed provides location and trajectories of multiple objects which can be utilized in proactive applications. / Tiivistelmä Ihmisen ja eri laitteiden välisellä vuorovaikutuksella on keskeinen osa nyky-yhteiskunnassa. Teknisten laitteiden lisääntymisen myötä vuorovaikutustavat ovat myös muuttumassa. Tulevaisuuden järjestelmien tulisi olla proaktiivisia, jotta ne voisivat sopeutua ihmisten liikkeisiin ja toimintoihin ilman tietoista ohjausta. Ilmaisuvoimansa ansiosta visuaalisella tiedolla on keskeinen rooli tällaisessa epäsuorassa ihminen-tietokone –vuorovaikutuksessa. Tämän vuoksi on selvää, että kamerat yhdessä laskentaresurssien ja konenäkömenetelmien kanssa tarjoavat huomaamattoman tavan ihmisten toiminnan analysointiin. Lukuisista eduistaan huolimatta konenäön soveltaminen ei ole aina suoraviivaista. Yleensä jokainen sovellus asettaa erikoisvaatimuksia käytettäville menetelmille. Tästä syystä väitöskirjassa on päämääränä kehittää uusia kuvatietoon perustuvia menetelmiä ja järjestelmiä, joita voidaan hyödyntää proaktiivisissa sovelluksissa. Tässä väitöskirjassa esitellään kaksi proaktiivista sovellusta, jotka molemmat hyödyntävät tietokonenäköä. Ensimmäinen sovellus on etäopetusjärjestelmä, joka valitsee ja vaihtaa kuvalähteen automaattisesti. Järjestelmään esitellään myös ohjattavaan kameraan perustava laajennus, jonka avulla opettajaa voidaan seurata hänen liikkuessaan eri puolilla luokkahuonetta. 
Toinen proaktiivisen tekniikan sovellus on tarkoitettu mobiililaitteisiin. Kehitetty järjestelmä kykenee tunnistamaan maisemakuvat, jolloin kameran kuvaustila voidaan asettaa automaattisesti. Monissa sovelluksissa on tarpeen käyttää useampia kameroita. Tämän seurauksena eri puolille ympäristöä sijoitettavat älykkäät kamerat ovat olleet viime vuosina erityisen kiinnostuksen kohteena. Suurin osa kehityksestä on kuitenkin keskittynyt lähinnä eri konenäköalgoritmeihin tai yksittäisiin sovelluksiin. Sen sijaan panostukset yleisiin ja helposti laajennettaviin ratkaisuihin, jotka mahdollistavat erilaisten menetelmien, sensoreiden ja tiedonvälityskanavien käyttämisen, ovat olleet vähäisempiä. Tilanteen parantamiseksi väitöskirjassa esitellään hajautettujen sensoriverkkojen kehitykseen tarkoitettu avoin ja laajennettavissa oleva ohjelmistorunko. Menetelmien osalta tässä väitöskirjassa keskitytään useiden kohteiden seurantaan. Kehitetty seurantamenetelmä yhdistää saadut paikkamittaukset seurattaviin kohteisiin siten, että jokaiselle mittaukselle lasketaan todennäköisyys, jolla se kuuluu jokaiseen yksittäiseen seurattavaan kohteeseen. Seurantaongelman lisäksi työssä esitellään kaksi erilaista tapaa, joilla kohteiden paikka kuvassa voidaan määrittää. Esiteltyä kokonaisuutta voidaan hyödyntää proaktiivisissa sovelluksissa, jotka tarvitsevat usean kohteen paikkatiedon tai kohteiden kulkeman reitin. Kalman filter human-computer interaction object tracking scene classification sensor network shooting mode smart classroom Kalman-suodatin ihminen-tietokone -vuorovaikutus kohteen seuranta kuvaustila näkymän luokittelu sensoriverkko älykäs luokkahuone
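The soft assignment described in entry 161 (each measurement receives a probability of belonging to each tracked object) can be illustrated with a minimal sketch. This is not the thesis's implementation: the isotropic Gaussian likelihood, the function name `soft_assign`, and the `sigma` parameter are all assumptions made for illustration.

```python
import math


def soft_assign(measurements, predictions, sigma=10.0):
    """Return weights[i][j]: probability that measurement i belongs to object j.

    measurements, predictions: lists of (x, y) positions.
    sigma: assumed standard deviation of the measurement noise (hypothetical).
    """
    if not predictions:
        return [[] for _ in measurements]
    weights = []
    for mx, my in measurements:
        # Unnormalized Gaussian likelihood of the measurement under each object.
        likelihoods = []
        for px, py in predictions:
            d2 = (mx - px) ** 2 + (my - py) ** 2
            likelihoods.append(math.exp(-d2 / (2.0 * sigma ** 2)))
        total = sum(likelihoods)
        if total == 0.0:
            # Measurement is far from every object: fall back to uniform weights.
            weights.append([1.0 / len(predictions)] * len(predictions))
        else:
            weights.append([l / total for l in likelihoods])
    return weights


if __name__ == "__main__":
    w = soft_assign([(0.0, 0.0), (100.0, 0.0)], [(1.0, 0.0), (99.0, 0.0)])
    for row in w:
        print([round(p, 3) for p in row])
```

In a tracker built around a Kalman filter (one of the record's keywords), weights like these would typically scale each measurement's contribution to each object's update, instead of committing to a single hard match.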
162	Fusion en ligne d'algorithmes de suivi visuel d'objet / On-line fusion of visual object tracking algorithms Leang, Isabelle 15 December 2016 (has links) Le suivi visuel d’objet est une fonction élémentaire de la vision par ordinateur ayant fait l’objet de nombreux travaux. La dérive au cours du temps est l'un des phénomènes les plus critiques à maîtriser, car elle aboutit à la perte définitive de la cible suivie. Malgré les nombreuses approches proposées dans la littérature pour contrer ce phénomène, aucune ne surpasse une autre en terme de robustesse face aux diverses sources de perturbations visuelles : variation d'illumination, occultation, mouvement brusque de caméra, changement d'aspect. L’objectif de cette thèse est d’exploiter la complémentarité d’un ensemble d'algorithmes de suivi, « trackers », en développant des stratégies de fusion en ligne capables de les combiner génériquement. La chaîne de fusion proposée a consisté à sélectionner les trackers à partir d'indicateurs de bon fonctionnement, à combiner leurs sorties et à les corriger. La prédiction en ligne de dérive a été étudiée comme un élément clé du mécanisme de sélection. Plusieurs méthodes sont proposées pour chacune des étapes de la chaîne, donnant lieu à 46 configurations de fusion possibles. Évaluées sur 3 bases de données, l’étude a mis en évidence plusieurs résultats principaux : une sélection performante améliore considérablement la robustesse de suivi ; une correction de mise à jour est préférable à une réinitialisation ; il est plus avantageux de combiner un petit nombre de trackers complémentaires et de performances homogènes qu'un grand nombre ; la robustesse de fusion d’un petit nombre de trackers est corrélée à la mesure d’incomplétude, ce qui permet de sélectionner la combinaison de trackers adaptée à un contexte applicatif donné. / Visual object tracking is an elementary function of computer vision that has been the subject of numerous studies. 
Drift over time is one of the most critical phenomena to master because it leads to the permanent loss of the target being tracked. Despite the numerous approaches proposed in the literature to counter this phenomenon, none consistently outperforms the others in terms of robustness to the various sources of visual perturbation: illumination variation, occlusion, sudden camera movement, and change of appearance. The objective of this thesis is to exploit the complementarity of a set of tracking algorithms by developing on-line fusion strategies capable of combining them generically. The proposed fusion chain consists of selecting the trackers using indicators of good functioning, combining their outputs, and correcting them. On-line drift prediction was studied as a key element of the selection mechanism. Several methods are proposed for each step of the chain, giving rise to 46 possible fusion configurations. Evaluation on 3 databases highlighted several key findings: effective selection greatly improves robustness; correction improves robustness but is sensitive to bad selection, making updating preferable to reinitialization; it is more advantageous to combine a small number of complementary trackers with homogeneous performance than a large number; and the robustness of fusing a small number of trackers is correlated with the incompleteness measure, which makes it possible to select the combination of trackers suited to a given application context. Suivi visuel d'objet Robustesse de suivi Complémentarité des trackers Prédiction de dérive Fusion d'informations Correction de modèles Visual object tracking Fusion of trackers Drift prediction 612.367
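The three-step fusion chain of entry 162 (select, combine, correct) can be sketched as follows. This is only an illustration under stated assumptions: the confidence threshold, the confidence-weighted average used for combination, and the name `fuse` are hypothetical and are not taken from the thesis, which evaluates many different configurations.

```python
def fuse(outputs, min_confidence=0.5):
    """Fuse tracker outputs for one frame.

    outputs: dict mapping tracker name -> (box, confidence),
             where box = (x, y, w, h) and confidence is in [0, 1].
    Returns (fused_box, names_to_correct).
    """
    if not outputs:
        raise ValueError("at least one tracker output is required")
    # Selection: keep trackers whose functioning indicator is high enough.
    selected = {n: bc for n, bc in outputs.items() if bc[1] >= min_confidence}
    if not selected:
        selected = dict(outputs)  # nothing passes: fall back to all trackers
    # Combination: confidence-weighted mean of the selected boxes.
    total = sum(c for _, c in selected.values())
    if total <= 0:
        weights = {n: 1.0 / len(selected) for n in selected}
    else:
        weights = {n: c / total for n, (_, c) in selected.items()}
    fused = tuple(
        sum(selected[n][0][k] * weights[n] for n in selected) for k in range(4)
    )
    # Correction: rejected trackers are candidates for an update toward the fused box.
    to_correct = sorted(n for n in outputs if n not in selected)
    return fused, to_correct


if __name__ == "__main__":
    frame_outputs = {
        "a": ((0, 0, 10, 10), 1.0),
        "b": ((10, 0, 10, 10), 1.0),
        "c": ((500, 500, 10, 10), 0.1),  # drifting tracker
    }
    print(fuse(frame_outputs))
```

The abstract reports that updating a drifting tracker is preferable to reinitializing it; the sketch only identifies which trackers would need that correction and leaves the update itself to the individual tracker.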
163	Rastreamento de objetos baseado em reconhecimento estrutural de padrões / Object tracking based on structural pattern recognition Ana Beatriz Vicentim Graciano 23 March 2007 (has links) Diversos problemas práticos envolvendo sistemas de visão computacional, tais como vigilância automatizada, pesquisas de conteúdo específico em bancos de dados multimídias ou edição de vídeo, requerem a localização e o reconhecimento de objetos dentro de seqüências de imagens ou vídeos digitais. Mais formalmente, denomina-se rastreamento o processo de determinação da posição de certo(s) objeto(s) ao longo do tempo numa seqüência de imagens. Já a tarefa de reconhecimento caracteriza-se pela classificação desses objetos de acordo com algum rótulo pré-estabelecido ou apoiada em conhecimento prévio tipicamente introduzido através de um modelo dos objetos de interesse. No entanto, rastrear e classificar objetos em vídeo digital são tarefas desafiadoras, tanto pelas dificuldades inerentes a esse tipo de elemento pictórico, quanto pelo variável grau de complexidade que os quadros sob análise podem apresentar. Este documento apresenta uma metodologia baseada em modelo para rastrear e reconhecer objetos em vídeo digital através de uma representação por grafos relacionais com atributos (ARGs). Tais estruturas surgiram dentro do paradigma de reconhecimento estrutural de padrões e têm se mostrado bastante flexíveis e poderosas para modelar problemas diversos, pois podem transmitir dados quantitativos, relacionais, estruturais e simbólicos. Como modelo e entrada são descritos através desses grafos, a questão de reconhecimento é interpretada como um problema de casamento inexato entre grafos, que consiste em mapear os vértices do ARG de entrada nos vértices do ARG modelo. Em seguida, é realizado o rastreamento dos objetos de acordo com uma transformação afim derivada de parâmetros obtidos da etapa de reconhecimento. 
Para validar a metodologia proposta, resultados sobre seqüências de imagens digitais, sintéticas e reais, são apresentados e discutidos. / Several practical problems involving computer vision systems, such as automated surveillance, content-based queries in multimedia databases or video editing require the location and recognition of objects within image sequences or digital video. More formally, the process of determining the position of certain objects in an image sequence throughout time is called tracking, whereas the recognition task is characterized by the classification of such objects according to pre-defined labels or a priori knowledge, typically introduced by means of a model of the target objects. However, tracking and recognition of objects in digital video are not simple tasks, either because of the inherent difficulties of such a pictorial element, or due to the variable level of complexity that the frames under consideration might present. This document presents a model-based methodology for tracking and recognizing objects represented by attributed relational graphs (ARGs) in digital video. These structures have arisen from the paradigm of structural pattern recognition and have proven to be very flexible and powerful for modeling various problems, as they can hold many sorts of data (e.g: quantitative, relational, structural and symbolic). Since both model and input data are described through these graphs, the recognition matter may be interpreted as an inexact graph matching problem, which consists in finding a correspondence between the set of vertices of the input ARG and that of the model ARG. In the next step, object tracking is performed according to an affine transform derived from parameters extracted from the recognition phase. To validate the proposed methodology, results obtained from real and synthetic digital image sequences are presented and discussed. 
grafos relacionais com atributos processamento de imagens rastreamento reconhecimento estrutural visão computacional attributed relational graph computer vision image processing object tracking structural recognition
164	Anonymizace videa / Video Anonymization Mokrý, Martin January 2019 (has links) The goal of this thesis is to design and implement an automatic system for video anonymization. The system uses several object detectors to find objects in each image and actively tracks the objects detected in this way. The detected objects are then modified to ensure a sufficient level of anonymization. The main benefit of the system is that it speeds up the anonymization of videos that are to be published afterwards.
165	Detekce pohybujících se objektů ve video sekvenci / Moving Objects Detection in Video Sequences Němec, Jiří January 2012 (has links) This thesis deals with methods for the detection of people and tracking objects in video sequences. An application for detection and tracking of players in video recordings of sport activities, e.g. hockey or basketball matches, is proposed and implemented. The designed application uses the combination of histograms of oriented gradients and classification based on SVM (Support Vector Machines) for detecting players in the picture. Moreover, a particle filter is used for tracking detected players. The whole system was fully tested and the results are shown in the graphs and tables with verbal descriptions.
166	Mapování pohybu osob stacionární kamerou / Mapping the Motion of People by a Stationary Camera Bartl, Vojtěch January 2015 (has links) The aim of this diploma thesis is to obtain information on the motion of people in a scene from the recording of a stationary camera. A procedure for detecting exceptional events in the scene was designed. Exceptional events can be fast-moving persons, or persons moving in different places than everyone else in the scene. To trace the motion of persons, two algorithms were applied and tested: optical flow and CAMSHIFT. The resulting motions are analyzed by monitoring the progress of each motion and comparing it with the other motions in the scene. The result of the analysis is the set of exceptional motions detected in the video. The areas where motion occurs in the scene and where it is most common are also described, together with an analysis of motion direction. The exceptional motion segments extracted from the video represent the main result of the work.
167	Sledování objektu ve videu / Object Tracking in Video Sojma, Zdeněk January 2011 (has links) This master's thesis describes the principles of the most widely used systems for object tracking in video and then focuses mainly on the description and implementation of an interactive offline tracking system for generic color objects. The strength of the algorithm lies in the high accuracy of the estimated object trajectory. The system creates the output trajectory from input data specified by the user, which may be interactively modified and extended to improve accuracy. The algorithm is based on a detector that uses color bin features and on the temporal coherence of object motion to generate multiple candidate object trajectories. The optimal output trajectory is then computed by dynamic programming, whose parameters can also be modified interactively by the user. The system achieves 15-70 fps on 480x360 video. The thesis also describes the implementation of an application whose purpose is to evaluate the tracker's accuracy. The final results are also discussed.
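The dynamic-programming step of entry 167, which picks one candidate per frame so that the whole trajectory is optimal, can be sketched in a Viterbi-like form. The cost function below (negative detector score plus a penalty on the distance moved between frames) and the `smoothness` parameter are assumptions for illustration; the abstract does not state the thesis's actual cost terms.

```python
import math


def best_trajectory(frames, smoothness=1.0):
    """Pick one candidate per frame minimizing total cost.

    frames: list (one entry per frame) of lists of (x, y, score) candidates.
    smoothness: assumed weight of the motion penalty (hypothetical parameter).
    Returns the list of chosen candidate indices, one per frame.
    """
    if not frames or any(not f for f in frames):
        raise ValueError("every frame needs at least one candidate")
    # cost[j]: minimal cost of any path ending at candidate j of the current frame.
    cost = [-c[2] for c in frames[0]]
    back = []  # back[t][j]: best predecessor of candidate j in frame t + 1
    for prev, cur in zip(frames, frames[1:]):
        new_cost, pointers = [], []
        for x, y, score in cur:
            # Cost of reaching this candidate from each candidate of the previous frame.
            steps = [
                cost[i] + smoothness * math.hypot(x - px, y - py)
                for i, (px, py, _) in enumerate(prev)
            ]
            best_i = min(range(len(steps)), key=steps.__getitem__)
            new_cost.append(steps[best_i] - score)
            pointers.append(best_i)
        cost = new_cost
        back.append(pointers)
    # Backtrack from the cheapest final candidate.
    j = min(range(len(cost)), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path


if __name__ == "__main__":
    candidates = [
        [(0, 0, 1.0), (50, 50, 1.0)],
        [(1, 1, 1.0), (60, 60, 0.9)],
        [(2, 2, 1.0), (100, 100, 1.2)],
    ]
    print(best_trajectory(candidates))
```

Changing `smoothness` trades detector confidence against temporal coherence, which is the kind of parameter the abstract says the user can adjust interactively.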
168	Sledování a rozpoznávání lidí na videu / Tracking and Recognition of People in Video Šajboch, Antonín January 2016 (has links) This master's thesis deals with detecting and tracking people in video. For recognition, a convolutional neural network is used, which extracts a feature vector from the bounding frame of the face. The extracted vector is then classified. The recognition process must run in real time, and the methods were selected with this requirement in mind. A new face dataset was created from video recordings made on the faculty premises. The videos and the dataset were used in experiments to verify the accuracy of the resulting system. The recognition accuracy is about 85%. The proposed system can be used, for example, to register people, to count passages, or to report the presence of an unknown person in a building.
169	Efficient multiple hypothesis tracking using a purely functional array language Nolkrantz, Marcus January 2022 (has links) An autonomous vehicle is a complex system that requires a good perception of the surrounding environment to operate safely. One part of that is multiple object tracking, which is an essential component in camera-based perception whose responsibility is to estimate object motion from a sequence of images. This requires an association problem to be solved where newly estimated object positions are mapped to previously predicted trajectories, for which different solution strategies exist. In this work, a multiple hypothesis tracking algorithm is implemented. The purpose is to demonstrate that measurement associations are improved compared to less compute-intensive alternatives. It was shown that the implemented algorithm performed 13 percent better than an intersection over union tracker when evaluated using a standard evaluation metric. Furthermore, this work also investigates the usage of abstraction layers to accelerate time-critical parallel operations on the GPU. It was found that the execution time of the tracking algorithm could be reduced by 42 percent by replacing four functions with implementations written in the purely functional array language Futhark. Finally, it was shown that a GPU code abstraction layer can reduce the knowledge barrier required to write efficient CUDA kernels. multiple object tracking multiple hypothesis tracking tracking-by-detection GPGPU GPU code abstraction functional programming Futhark Computer and Information Sciences Data- och informationsvetenskap
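Entry 169 uses an intersection-over-union (IoU) tracker as its baseline for measurement association. A minimal sketch of such a baseline follows; the greedy matching order, the `threshold` value, and the function names are assumptions for illustration and do not come from the thesis.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, threshold=0.3):
    """Greedily match track boxes to detection boxes by descending IoU.

    Returns a dict mapping track index -> detection index.
    """
    pairs = sorted(
        ((iou(t, d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    matches, used_tracks, used_dets = {}, set(), set()
    for score, ti, di in pairs:
        if score < threshold:
            break  # remaining pairs overlap too little to be the same object
        if ti in used_tracks or di in used_dets:
            continue
        matches[ti] = di
        used_tracks.add(ti)
        used_dets.add(di)
    return matches


if __name__ == "__main__":
    tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
    detections = [(21, 21, 31, 31), (1, 1, 11, 11)]
    print(associate(tracks, detections))
```

A greedy matcher like this commits to one association per frame. Multiple hypothesis tracking, the subject of the thesis, instead keeps several candidate associations alive and resolves them later, which is why it is more compute-intensive.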
170	Object representation in local feature spaces : application to real-time tracking and detection / Représentation d'objets dans des espaces de caractéristiques locales : application à la poursuite de cibles temps-réel et à la détection Tran, Antoine 25 October 2017 (has links) La représentation visuelle est un problème fondamental en vision par ordinateur. Le but est de réduire l'information au strict nécessaire pour une tâche désirée. Plusieurs types de représentation existent, comme les caractéristiques de couleur (histogrammes, attributs de couleurs...), de forme (dérivées, points d'intérêt...) ou d'autres, comme les bancs de filtres. Les caractéristiques bas-niveau (locales) sont rapides à calculer. Elles ont un pouvoir de représentation limité, mais leur généricité présente un intérêt pour des systèmes autonomes et multi-tâches, puisque les caractéristiques haut-niveau découlent d'elles. Le but de cette thèse est de construire puis d'étudier l'impact de représentations fondées seulement sur des caractéristiques locales de bas-niveau (couleurs, dérivées spatiales) pour deux tâches : la poursuite d'objets génériques, nécessitant des caractéristiques robustes aux variations d'aspect de l'objet et du contexte au cours du temps; la détection d'objets, où la représentation doit décrire une classe d'objets en tenant compte des variations intra-classe. Plutôt que de construire des descripteurs d'objets globaux dédiés, nous nous appuyons entièrement sur les caractéristiques locales et sur des mécanismes statistiques flexibles visant à estimer leur distribution (histogrammes) et leurs co-occurrences (Transformée de Hough Généralisée). La Transformée de Hough Généralisée (THG), créée pour la détection de formes quelconques, consiste à créer une structure de données représentant un objet, une classe... Cette structure, d'abord indexée par l'orientation du gradient, a été étendue à d'autres caractéristiques. 
Travaillant sur des caractéristiques locales, nous voulons rester proche de la THG originale. En poursuite d'objets, après avoir présenté nos premiers travaux, combinant la THG avec un filtre particulaire (utilisant un histogramme de couleurs), nous présentons un algorithme plus léger et rapide (100fps), plus précis et robuste. Nous présentons une évaluation qualitative et étudierons l'impact des caractéristiques utilisées (espace de couleur, formulation des dérivées partielles...). En détection, nous avons utilisé l'algorithme de Gall appelé forêts de Hough. Notre but est de réduire l'espace de caractéristiques utilisé par Gall, en supprimant celles de type HOG, pour ne garder que les dérivées partielles et les caractéristiques de couleur. Pour compenser cette réduction, nous avons amélioré deux étapes de l'entraînement : le support des descripteurs locaux (patchs) est partiellement produit selon une mesure géométrique, et l'entraînement des nœuds se fait en générant une carte de probabilité spécifique prenant en compte les patchs utilisés pour cette étape. Avec l'espace de caractéristiques réduit, le détecteur n'est pas plus précis. Avec les mêmes caractéristiques que Gall, sur une même durée d'entraînement, nos travaux ont permis d'avoir des résultats identiques, mais avec une variance plus faible et donc une meilleure répétabilité. / Visual representation is a fundamental problem in computer vision. The aim is to reduce the information to what is strictly necessary for a given task. Many types of representation exist, such as color features (histograms, color attributes...), shape features (derivatives, keypoints...), or filter banks. Low-level (local) features are fast to compute. Their representational power is limited, but their genericity is of interest for autonomous or multi-task systems, since higher-level features derive from them. 
We aim to build, and then study the impact of, low-level local feature spaces (color and derivatives only) for two tasks: generic object tracking, which requires features that are robust to changes in the appearance of the object and its environment over time; and object detection, for which the representation should describe an object class and cope with intra-class variations. Rather than using global object descriptors, we rely entirely on local features and on statistical mechanisms to estimate their distribution (histograms) and their co-occurrences (Generalized Hough Transform). The Generalized Hough Transform (GHT), created for the detection of arbitrary shapes, consists of building a codebook that models an object or a class; the codebook was originally indexed by gradient orientation and later extended to other features. As we work on local features, we aim to remain close to the original GHT. In tracking, after presenting preliminary work combining the GHT with a particle filter (using color histograms), we present a lighter and faster (100 fps) tracker that is also more accurate and robust. We present a qualitative evaluation and study the impact of the features used (color space, spatial derivative formulation). In detection, we used Gall's Hough Forest. We aim to reduce Gall's feature space by discarding HOG features and keeping only derivatives and color features. To compensate for this reduction, we enhanced two training steps: the support of the local descriptors (patches) is partially chosen using a geometric measure, and node training uses a specific probability map based on the patches used at that step. With the reduced feature space, the detector is less accurate than with Gall's feature space; with the same features as Gall and the same training time, our method yields identical results, but with higher stability and therefore better repeatability. Vision par ordinateur Espace de caractéristiques locales Poursuite de cibles Détection d'objets Transformée de Hough Computer vision Local feature space Object tracking Object detection Hough Transform 006.4
