101.
Tracking and Measuring Objects in Obscure Image Scenarios Through the Lens of Shot Put in Track and Field
Smith, Ashley Nicole, 23 May 2022
Object tracking and object measurement are two well-established and prominent concepts within the field of computer vision. While the two techniques are fairly robust in images and videos where the objects of interest are clearly visible, performance drops significantly when objects appear obscured due to factors such as motion blur, large distance from the camera, and blending with the background. Additionally, most established object detection models focus on detecting as many objects as possible rather than striving for high accuracy on a few predetermined objects. One application of computer vision tracking and measurement in imprecise, single-object scenarios is programmatically measuring the distance of a shot put throw in the sport of track and field. Shot put throws in competition are currently measured by human officials, which is both time-consuming and often erroneous. In this work, a computer vision system is developed that automatically tracks the path of a shot put throw by combining a custom-trained YOLO model and a path predictor based on kinematic formulas, and then measures the distance traveled by triangulation using binocular stereo vision. The final distance measurements are directionally accurate, with an average error of 82% after removing one outlier, an average detection time of 2.9 ms per frame, and a total average run time of 4.5 minutes from the time the shot put leaves the thrower's hand. Shortcomings of tracking and measurement in imperfect or single-object settings are addressed and potential improvements are suggested, with an eye toward increasing the accuracy and efficiency of the sporting event. (Master of Science)
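The triangulation step described above follows the standard pinhole binocular-stereo relation. A minimal sketch of that step (the focal length, baseline, and pixel coordinates below are hypothetical calibration values, not figures from the thesis):

```python
def stereo_depth(f_px, baseline_m, x_left, x_right):
    """Depth from binocular disparity: Z = f * B / d, with d = x_left - x_right."""
    d = x_left - x_right
    if d <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    return f_px * baseline_m / d

# Example: the shot put's landing point seen at pixel columns 400 (left) and 350 (right)
depth = stereo_depth(f_px=700.0, baseline_m=0.5, x_left=400.0, x_right=350.0)
print(depth)  # 7.0 metres
```

Once both cameras yield 3D coordinates for the landing point and the throwing circle, the reported throw distance is the Euclidean distance between them.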
102.
Detecting and tracking moving objects from a moving platform
Lin, Chung-Ching, 04 May 2012
Detecting and tracking moving objects are important topics in computer vision research. Classical methods perform well with stationary cameras. However, these techniques are not suitable for moving cameras, because the unconstrained nature of realistic environments and sudden camera movements make cues to object positions unreliable. A major difficulty is that every pixel moves and new background keeps appearing as a handheld or car-mounted camera moves. In this dissertation, a novel estimation method for camera motion parameters is discussed first. Based on the estimated camera motion parameters, two detection algorithms are developed using Bayes' rule and belief propagation. Next, an MCMC-based feature-guided particle filtering method is presented to track detected moving objects. In addition, two detection algorithms that do not use camera motion parameters are discussed; these two approaches require no pre-defined class or model to be trained in advance. The experimental results demonstrate robust detection and tracking performance across a range of object sizes and positions.
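The Bayes'-rule detection idea can be illustrated with a toy per-pixel update performed after camera-motion compensation (the likelihood values and prior below are invented for the example, not the dissertation's model):

```python
def moving_posterior(p_obs_given_moving, p_obs_given_static, prior_moving=0.1):
    """P(moving | observation) via Bayes' rule for one pixel's residual motion."""
    num = p_obs_given_moving * prior_moving
    den = num + p_obs_given_static * (1.0 - prior_moving)
    return num / den

# A pixel whose residual motion is 9x more likely under "moving" than "static"
print(moving_posterior(0.9, 0.1))  # 0.5 with a 10% prior
```

Belief propagation would then smooth these per-pixel posteriors over a neighborhood graph; the single-pixel update above is only the local evidence term.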
103.
Automatic Classification of Fish in Underwater Video; Pattern Matching - Affine Invariance and Beyond
Gundam, Madhuri, 15 May 2015
Underwater video is used by marine biologists to observe, identify, and quantify living marine resources. Video sequences are typically analyzed manually, which is a time-consuming and laborious process. Automating this process will significantly save time and cost. This work proposes a technique for automatic fish classification in underwater video. The steps involved are background subtraction, fish region tracking, and classification using shape features. Background subtraction separates moving objects from their surrounding environment. Tracking associates multiple views of the same fish in consecutive frames. This step is especially important, since recognizing and classifying even one or a few of the views as a species of interest may allow labeling the whole sequence as that species. Shape features are extracted from each object using Fourier descriptors and presented to a nearest neighbor classifier. Finally, the nearest neighbor classifier results are combined in a probabilistic framework to classify an entire sequence.
The majority of existing pattern matching techniques focus on affine invariance, mainly because rotation, scale, translation, and shear are common image transformations. However, in some situations other transformations may be modeled as a small deformation on top of an affine transformation. The proposed algorithm complements existing Fourier transform-based pattern matching methods in such situations. First, the spatial-domain pattern is decomposed into non-overlapping concentric circular rings centered at the middle of the pattern. The Fourier transforms of the rings are computed and then mapped to the polar domain. The algorithm assumes that the individual rings are rotated with respect to each other; these variable angles of rotation provide information about the directional features of the pattern. The angle of rotation is determined starting from the Fourier transform of the outermost ring and moving inwards to the innermost ring. Two different approaches, one using a dynamic programming algorithm and the other using a greedy algorithm, are used to determine the directional features of the pattern.
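The Fourier-descriptor shape signature used in the classification stage can be sketched as follows (the contour sampling and the number of coefficients kept are arbitrary illustrative choices, not the thesis's configuration):

```python
import numpy as np

def fourier_descriptors(contour_xy, n_coeffs=8):
    """Translation/rotation/scale-invariant signature of a closed contour."""
    z = contour_xy[:, 0] + 1j * contour_xy[:, 1]   # contour points as complex samples
    F = np.fft.fft(z)
    mag = np.abs(F[1:n_coeffs + 1])  # drop F[0] (translation); magnitudes discard rotation phase
    return mag / mag[0]              # normalize by |F[1]| for scale invariance

# A unit circle and a rotated, scaled, shifted copy yield the same descriptors
k = np.arange(64)
circle = np.stack([np.cos(2 * np.pi * k / 64), np.sin(2 * np.pi * k / 64)], axis=1)
moved = 3.0 * np.exp(1j * (2 * np.pi * k / 64 + 0.7)) + (5 + 2j)
moved_xy = np.stack([moved.real, moved.imag], axis=1)
print(np.allclose(fourier_descriptors(circle), fourier_descriptors(moved_xy), atol=1e-8))  # True
```

A nearest neighbor classifier would then compare these fixed-length descriptor vectors between a detected fish region and labeled training shapes.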
104.
Analysis and optimisation of algorithms for color object tracking
Laguzet, Florence, 27 September 2013
The work of this thesis focuses on the improvement and optimization of the Mean-Shift color object tracking algorithm, from both a robustness and an architectural point of view, to improve both accuracy and execution speed. The first part of the work consisted in improving the robustness of the tracking. For this, the impact of the color space representation on tracking quality was studied, and a method was proposed for selecting the color space that best represents the object to be tracked. Since the target's environment changes over time, the method is coupled with a strategy that determines the appropriate time to reselect the color space. To further improve robustness on particularly difficult sequences, the Mean-Shift tracker with selection strategy was also coupled with another, more time-consuming algorithm: covariance tracking. The objective of this work is to obtain a complete system running in real time on multi-core SIMD processors. A study and optimization phase was therefore carried out to make the algorithms configurable in complexity, so that they can run in real time on different platforms, for various image and tracked-object sizes. With this speed/accuracy trade-off, real-time tracking becomes possible even on ARM Cortex-A9 class processors.
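The color-space selection idea can be illustrated with a toy separability criterion: pick the space in which the object's histogram differs most from its background's. The Bhattacharyya distance below is one plausible measure, and the two-bin histograms are invented for the example:

```python
import numpy as np

def bhattacharyya_distance(p, q):
    """Distance between two normalized histograms (0 = identical, 1 = disjoint)."""
    return float(np.sqrt(max(0.0, 1.0 - np.sum(np.sqrt(p * q)))))

def select_color_space(obj_hists, bg_hists):
    """Pick the color space in which the object stands out most from its background."""
    return max(obj_hists,
               key=lambda name: bhattacharyya_distance(obj_hists[name], bg_hists[name]))

# Hypothetical 2-bin histograms: the object is indistinguishable in "rgb" but not in "hsv"
obj = {"rgb": np.array([0.5, 0.5]), "hsv": np.array([1.0, 0.0])}
bg = {"rgb": np.array([0.5, 0.5]), "hsv": np.array([0.0, 1.0])}
print(select_color_space(obj, bg))  # hsv
```

Re-running this selection periodically, as the thesis's strategy does at chosen moments, lets the tracker adapt when the target's surroundings change.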
105.
Real-time Detection and Tracking of Moving Objects Using Deep Learning and Multi-threaded Kalman Filtering: A joint solution of 3D object detection and tracking for Autonomous Driving
Söderlund, Henrik, January 2019
Perception is the most essential function for safe and reliable autonomous driving, and LiDAR sensors are strong candidates for this task. In this thesis, we present a novel real-time solution for the detection and tracking of moving objects which utilizes deep learning based 3D object detection. Moreover, we present a joint solution in which the predictability of Kalman Filters feeds object properties and semantics back into the object detection algorithm, resulting in a closed loop of object detection and object tracking. On one hand, we present YOLO++, a 3D object detection network operating on point clouds only, which extends YOLOv3, the latest contribution to standard real-time object detection for three-channel images. Our object detection solution is fast, processing inputs at 20 frames per second. Our experiments on the KITTI benchmark suite show state-of-the-art efficiency but mediocre accuracy for car detection, comparable to the result of Tiny-YOLOv3 on the COCO dataset. The main advantage of YOLO++ is that it allows fast detection of objects with rotated bounding boxes, something Tiny-YOLOv3 cannot do. YOLO++ also regresses the bounding box in all directions, allowing 3D bounding boxes to be extracted from a bird's-eye-view perspective. On the other hand, we present a Multi-threaded Kalman Filtering (MTKF) solution for multiple object tracking. Each unique observation is associated to a thread through a novel concurrent data association process, and each thread contains an Extended Kalman Filter that predicts and estimates the associated object's state over time. Furthermore, a LiDAR odometry algorithm is used to obtain absolute information about the movement of objects, since the movement of objects is inherently relative to the sensor perceiving them. We obtain 33 state updates per second with as many threads as cores in our main workstation. Even though the joint solution has not been tested on a system with enough computational power, it is ready for deployment; using YOLO++ in combination with MTKF, our real-time constraint of 10 frames per second is satisfied by a large margin. Finally, we show that our system can take advantage of the predicted semantic information from the Kalman Filters to enhance the inference process in our object detection architecture.
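The per-thread filtering loop can be sketched with a single predict/update cycle of a linear Kalman filter (the thesis uses an Extended Kalman Filter; the constant-velocity model and noise values below are illustrative stand-ins):

```python
import numpy as np

def kf_step(x, P, z, F, H, Q, R):
    """One Kalman filter predict/update cycle (linear case, for illustration)."""
    x = F @ x                       # predict state
    P = F @ P @ F.T + Q             # predict covariance
    y = z - H @ x                   # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)  # Kalman gain
    x = x + K @ y
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# Constant-velocity model tracking 1-D position measurements 1..10
F = np.array([[1.0, 1.0], [0.0, 1.0]]); H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2); R = np.array([[0.1]])
x, P = np.zeros(2), 10.0 * np.eye(2)
for t in range(1, 11):
    x, P = kf_step(x, P, np.array([float(t)]), F, H, Q, R)
print(round(float(x[0]), 1))  # close to 10.0, with velocity near 1.0
```

In the MTKF design, one such filter runs on its own thread per tracked object, with the concurrent data-association step deciding which detections feed which filter.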
106.
A graph-based approach for online multi-object tracking in structured videos with an application to action recognition
Morimitsu, Henrique, 20 October 2015
In this thesis we propose a novel approach for tracking multiple objects using structural information. The objects are tracked by combining particle filtering with frame descriptions based on Attributed Relational Graphs (ARGs). We start by learning a probabilistic structural model graph from annotated images. The graphs are then used to evaluate the current tracking state and to correct it if necessary. By doing so, the proposed method is able to deal with challenging situations such as abrupt motion and tracking loss due to occlusion. The main contribution of this thesis is the exploitation of the learned probabilistic structural model: the structural information of the scene itself is used to guide the object detection process in case of tracking loss. This approach differs from previous works, which use structural information only to evaluate the scene but do not use it to generate new tracking hypotheses. The proposed approach is very flexible and can be applied to any situation in which structural relation patterns can be found between the objects. Object tracking has many practical applications, such as surveillance, activity analysis, and autonomous navigation. In this thesis, we use it to track multiple objects in sports videos, where the rules of the game create structural patterns between the objects. Besides detecting the objects, the tracking results are also used as input for recognizing the action each player is performing. This step is performed by classifying a segment of the tracking sequence using Hidden Markov Models (HMMs). The proposed tracking method is tested on several videos of table tennis matches and on the ACASVA dataset, showing that the method is able to continue tracking the objects even after occlusion or a camera cut.
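The HMM-based action classification step can be illustrated with a scaled forward pass: each candidate action has its own HMM, and the segment is assigned to the model with the highest likelihood. The one-state models and binary motion feature below are hypothetical, chosen only to keep the example small:

```python
import math

def forward_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM (scaled forward pass)."""
    n = len(pi)
    alpha = [pi[s] * B[s][obs[0]] for s in range(n)]
    c = sum(alpha); loglik = math.log(c); alpha = [a / c for a in alpha]
    for o in obs[1:]:
        alpha = [sum(alpha[q] * A[q][s] for q in range(n)) * B[s][o] for s in range(n)]
        c = sum(alpha); loglik += math.log(c); alpha = [a / c for a in alpha]
    return loglik

def classify(obs, models):
    """models: {action_name: (pi, A, B)} - pick the HMM that best explains the segment."""
    return max(models, key=lambda a: forward_loglik(obs, *models[a]))

# Hypothetical one-state HMMs for two actions over a binary motion feature
models = {
    "serve": ([1.0], [[1.0]], [[0.9, 0.1]]),   # mostly emits symbol 0
    "rally": ([1.0], [[1.0]], [[0.1, 0.9]]),   # mostly emits symbol 1
}
print(classify([0, 0, 0, 1], models))  # serve
```

In the thesis's setting, the observation symbols would be quantized features of a player's tracked trajectory segment rather than a raw binary flag.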
107.
Change detection from mobile laser scanning point clouds
Xiao, Wen, 12 November 2015
Mobile mapping systems are increasingly used for street environment mapping; in particular, mobile laser scanning technology enables precise street mapping, scene understanding, facade modelling, etc. In this research, change detection from mobile laser scanning point clouds is investigated. First, street environment change detection using RIEGL data is studied for the purpose of database updating and temporary object identification. An occupancy-based method is presented to overcome the challenges encountered by conventional distance-based methods, such as occlusion and anisotropic sampling. Occluded areas are identified by modelling the occupancy states of the space swept by the laser beams. The gaps between points and scan lines are interpolated in the sensor reference frame, where the sampling density is isotropic. Despite some conflicts on penetrable objects, e.g. trees and fences, the occupancy-based method is able to improve significantly on the point-to-triangle distance-based method.
The change detection method is then applied to data acquired by different laser scanners at different temporal scales, in order to demonstrate its wider range of applications. The local sensor reference frame is adapted to the Velodyne laser scanning geometry, and the occupancy-based method is used to detect moving objects. Since the method detects the change of each point, moving objects are detected at point level. As the Velodyne scanner constantly scans its surroundings, the trajectories of moving objects can be recovered: a simultaneous detection and tracking algorithm is proposed to recover pedestrian trajectories in order to accurately estimate pedestrian traffic flow in public places.
Changes can be detected not only at point level but also at object level. The changes of cars parked on street sides at different times of day are detected to help regulate on-street car parking, since parking duration is limited. In this case, cars are detected first, then corresponding cars are compared across passes at different times of day. Apart from car changes, parking positions and car types are also important information for parking management; all of this information is extracted in a supervised learning framework. Furthermore, a model-based car reconstruction method, fitting a generic deformable model to the data, is proposed to precisely locate cars. The model parameters are also treated as car features for better decision making, and the geometrically accurate models can be used for visualization purposes. Under the theme of change detection, related topics such as tracking, classification, and modelling are also studied and illustrated with practical applications. More importantly, the change detection methods are applied to different data acquisition geometries at multiple temporal scales, through both bottom-up (point-based) and top-down (object-based) strategies.
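The occupancy-based reasoning can be illustrated with a toy one-dimensional ray: space in front of a laser return is observed free, the return itself is occupied, and space behind it is occluded and therefore unknown. The cell size and three-state labeling below are simplifications of the thesis's 3D formulation:

```python
def occupancy_along_ray(hit_range, max_range, cell=1.0):
    """Label cells along a laser ray: free up to the hit, occupied at it, unknown behind."""
    states, r = [], 0.0
    while r < max_range:
        if r + cell <= hit_range:
            states.append("free")
        elif r <= hit_range:
            states.append("occupied")
        else:
            states.append("unknown")   # occluded space: no evidence either way
        r += cell
    return states

def changed(s0, s1):
    """Only cells observed in both epochs can be declared changed."""
    return "unknown" not in (s0, s1) and s0 != s1

print(occupancy_along_ray(2.0, 5.0))
# ['free', 'free', 'occupied', 'unknown', 'unknown']
```

This is what lets the method avoid the false changes a pure distance-based comparison reports in occluded areas: an "unknown" cell never votes for change.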
108.
Automatic tracking of the soccer ball in videos (Rastreamento automático da bola de futebol em vídeos)
Ilha, Gustavo, January 2009
The location of objects in an image and the tracking of their movement through a sequence of images are tasks of theoretical and practical interest. Applications of pattern and object recognition and tracking have spread lately, especially in the fields of control, automation, and surveillance. This dissertation presents an effective method to automatically locate and track objects in videos, using the case of tracking the ball in sports videos, specifically soccer matches. The algorithm first locates the ball using segmentation, elimination, and weighting of candidates, followed by a Viterbi algorithm that decides which of these candidates is actually the ball. Once found, the ball is tracked using a Particle Filter aided by histogram similarity. No initialization of the ball or human intervention is needed during the algorithm. Next, the Kalman Filter is compared to the Particle Filter in the scope of ball tracking in soccer videos, and, additionally, different similarity functions for use in the Particle Filter are compared. Difficulties such as the presence of noise and of occlusion, both partial and total, had to be overcome.
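The Viterbi candidate-selection step can be sketched as a best-path problem over per-frame ball candidates: each candidate carries a detection weight, and transitions are penalized by implausible motion. The weights, positions, and penalty function below are invented for the example:

```python
def viterbi_path(frames, motion_penalty):
    """frames: per-frame list of (weight, position) ball candidates.
    Returns the index of the chosen candidate in each frame."""
    score = [w for w, _ in frames[0]]
    back = []
    for t in range(1, len(frames)):
        new_score, ptr = [], []
        for w, p in frames[t]:
            best, arg = max(
                (score[j] - motion_penalty(frames[t - 1][j][1], p), j)
                for j in range(len(frames[t - 1])))
            new_score.append(best + w)
            ptr.append(arg)
        score = new_score
        back.append(ptr)
    idx = max(range(len(score)), key=lambda i: score[i])
    path = [idx]
    for ptr in reversed(back):
        idx = ptr[idx]
        path.append(idx)
    return path[::-1]

# Hypothetical 1-D example: a decoy at frame 1 scores higher in isolation,
# but the motion penalty keeps the smooth trajectory.
frames = [[(1.0, 0.0)], [(1.0, 1.0), (1.2, 10.0)], [(1.0, 2.0)]]
print(viterbi_path(frames, lambda a, b: abs(a - b)))  # [0, 0, 0]
```

This is why isolated high-scoring false candidates (e.g. white socks or field markings) lose to a physically consistent ball trajectory.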
109.
Object Tracking System With Seamless Object Handover Between Stationary And Moving Camera Modes
Emeksiz, Deniz, 01 November 2012
As the number of surveillance cameras and mobile platforms with cameras increases, automated detection and tracking of objects on these systems gain importance. There are various tracking methods designed for stationary or moving cameras. For stationary cameras, correspondence-based tracking methods combined with background subtraction have various advantages, such as detecting object entry and exit in a scene, and they provide robust tracking while the camera is static. However, they fail when the camera moves. Conversely, histogram-based methods such as mean shift enable object tracking with moving cameras, although mean shift cannot detect an object's entry or exit automatically, so each new object must be initialized manually.
In this thesis, we propose a dual-mode object tracking system which combines the benefits of correspondence-based tracking and mean shift tracking. For each frame, a reliability measure based on the background update rate is calculated. Outliers in this measure are detected with the interquartile range, signaling camera movement. While the camera is stationary, correspondence-based tracking is used; when the camera moves, the system switches to mean shift tracking until the reliability measure indicates that correspondence-based tracking is again sufficient.
The results demonstrate that, in stationary camera mode, new objects are detected automatically by correspondence-based tracking with background subtraction. When the camera starts to move, the generation of false objects by correspondence-based tracking is prevented by switching to mean shift mode and handing over the correct bounding boxes in a seamless operation that enables continuous tracking.
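The interquartile-range reliability test can be sketched as follows (the reliability values, crude quartile indexing, and 1.5x factor are illustrative assumptions, not the thesis's exact procedure):

```python
def camera_moving(reliability, history, k=1.5):
    """Flag camera motion when the background-update reliability measure is an
    outlier with respect to its recent history (crude quartiles, for illustration)."""
    s = sorted(history)
    q1, q3 = s[len(s) // 4], s[(3 * len(s)) // 4]
    iqr = q3 - q1
    return reliability < q1 - k * iqr or reliability > q3 + k * iqr

history = [0.90, 0.92, 0.91, 0.93, 0.89, 0.92, 0.90, 0.91]
print(camera_moving(0.91, history))  # False: typical value, stay in correspondence mode
print(camera_moving(0.30, history))  # True: sudden drop, switch to mean-shift mode
```

The mode switch then hands the last valid bounding boxes from the correspondence tracker to the mean-shift tracker, which is what makes the handover seamless.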
110.
Region-based face detection, segmentation and tracking: framework definition and application to other objects
Vilaplana Besler, Verónica, 17 December 2010
One of the central problems in computer vision is the automatic recognition of object classes. In particular, the detection of the class of human faces is a problem that generates special interest due to the large number of applications that require face detection as a first step.
In this thesis we approach the problem of face detection as a joint detection and segmentation problem, in order to precisely localize faces with pixel-accurate masks. Even though this is our primary goal, in finding a solution we have tried to create a general framework as independent as possible of the type of object being searched.
For that purpose, the technique relies on a hierarchical region-based image model, the Binary Partition Tree, where objects are obtained by the union of regions in an image partition. In this work, this model is optimized for the face detection and segmentation tasks. Different merging and stopping criteria are proposed and compared through a large set of experiments.
In the proposed system the intra-class variability of faces is managed within a learning framework. The face class is characterized using a set of descriptors measured on the tree nodes, and a set of one-class classifiers. The system is formed by two strong classifiers: first, a cascade of binary classifiers simplifies the search space, and afterwards, an ensemble of more complex classifiers performs the final classification of the tree nodes.
The system is extensively tested on different face data sets, producing accurate segmentations and proving to be quite robust to variations in scale, position, orientation, lighting conditions and background complexity.
We show that the technique proposed for faces can be easily adapted to detect other object classes. Since the construction of the image model does not depend on any object class, different objects can be detected and segmented using the appropriate object model on the same image model. New object models can be easily built by selecting and training a suitable set of descriptors and classifiers.
Finally, a tracking mechanism is proposed. It combines the efficiency of the mean-shift algorithm with the use of regions to track and segment faces through a video sequence, where both the face and the camera may move. The method is extended to deal with other deformable objects, using a region-based graph-cut method for the final object segmentation at each frame. Experiments show that both mean-shift based trackers produce accurate segmentations even in difficult scenarios such as those with similar object and background colors and fast camera and object movements.
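The Binary Partition Tree construction can be illustrated with a greedy merge over scalar region means, a strong simplification of the actual region model and merging criteria used in the thesis:

```python
def build_bpt(colors, adjacency):
    """Greedy Binary Partition Tree: repeatedly merge the pair of adjacent regions
    with the most similar mean value. colors: {region_id: mean}; adjacency: set of
    frozenset({a, b}) edges. Returns the merge sequence as (child_a, child_b, parent)."""
    colors, adjacency = dict(colors), set(adjacency)
    sizes = {r: 1 for r in colors}
    next_id = max(colors) + 1
    merges = []
    while adjacency:
        edge = min(adjacency, key=lambda e: abs(colors[min(e)] - colors[max(e)]))
        a, b = sorted(edge)
        parent, next_id = next_id, next_id + 1
        total = sizes[a] + sizes[b]
        colors[parent] = (colors[a] * sizes[a] + colors[b] * sizes[b]) / total
        sizes[parent] = total
        # reconnect surviving edges to the new parent region
        adjacency = {frozenset(parent if r in edge else r for r in e)
                     for e in adjacency if e != edge}
        adjacency = {e for e in adjacency if len(e) == 2}
        merges.append((a, b, parent))
    return merges

# Three regions in a row: the two dark ones merge first, the bright one last
print(build_bpt({0: 0.0, 1: 0.1, 2: 1.0},
                {frozenset({0, 1}), frozenset({1, 2})}))
# [(0, 1, 3), (2, 3, 4)]
```

Object candidates then correspond to tree nodes, so detection becomes a classification of nodes rather than a sliding-window scan.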