11 |
Region-based face detection, segmentation and tracking. Framework definition and application to other objects
Vilaplana Besler, Verónica 17 December 2010
One of the central problems in computer vision is the automatic recognition of object classes. In particular, the detection of the class of human faces is a
problem that generates special interest due to the large number of applications that require face detection as a first step.
In this thesis we approach the problem of face detection as a joint detection and segmentation problem, in order to precisely localize faces with
pixel-accurate masks. Even though this is our primary goal, in finding a solution we have tried to create a general framework as independent as possible of
the type of object being searched.
For that purpose, the technique relies on a hierarchical region-based image model, the Binary Partition Tree, where objects are obtained by the union of
regions in an image partition. In this work, this model is optimized for the face detection and segmentation tasks. Different merging and stopping criteria
are proposed and compared through a large set of experiments.
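The Binary Partition Tree idea described above can be sketched as a greedy merge of neighbouring regions. This is only an illustration under stated assumptions: the merging criterion (difference of mean gray values), the function name `build_bpt` and its arguments are hypothetical, and the thesis compares several merging and stopping criteria that are not reproduced here.

```python
def build_bpt(means, sizes, adjacency):
    """Greedily merge the most similar pair of neighbouring regions.

    means, sizes: dicts mapping region id -> mean gray value / pixel count.
    adjacency: iterable of (id_a, id_b) pairs of neighbouring regions.
    Returns (root_id, children), where children[parent] = (left, right).
    Assumes the adjacency graph is connected.
    """
    means, sizes = dict(means), dict(sizes)
    edges = {frozenset(pair) for pair in adjacency}
    children = {}
    next_id = max(means) + 1

    def cost(edge):
        a, b = sorted(edge)
        # Hypothetical merging criterion: difference of mean values.
        # The ids are included only to break ties deterministically.
        return (abs(means[a] - means[b]), a, b)

    while edges:
        a, b = sorted(min(edges, key=cost))
        merged_size = sizes[a] + sizes[b]
        means[next_id] = (means[a] * sizes[a] + means[b] * sizes[b]) / merged_size
        sizes[next_id] = merged_size
        children[next_id] = (a, b)
        # Reconnect the neighbours of a and b to the new node.
        rewired = set()
        for edge in edges:
            new_edge = frozenset(next_id if r in (a, b) else r for r in edge)
            if len(new_edge) == 2:  # drops the edge that was just merged
                rewired.add(new_edge)
        edges = rewired
        next_id += 1
    return next_id - 1, children
```

Every internal node of the resulting tree is the union of two regions, so candidate objects can be examined node by node instead of over all possible pixel windows.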
In the proposed system the intra-class variability of faces is managed within a learning framework. The face class is characterized using a set of
descriptors measured on the tree nodes, and a set of one-class classifiers. The system is formed by two strong classifiers. First, a cascade of binary
classifiers simplifies the search space, and afterwards, an ensemble of more complex classifiers performs the final classification of the tree nodes.
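The two-stage decision described above might look roughly like the following. The descriptor names, the voting rule and the function `classify_node` are assumptions made for illustration; the abstract does not specify how the ensemble combines its outputs.

```python
def classify_node(descriptors, cascade, ensemble, min_votes):
    """Decide whether a tree node is accepted as a face candidate.

    descriptors: dict of values measured on the node (hypothetical names).
    cascade: list of cheap tests, each returning True (keep) or False (reject).
    ensemble: list of more expensive classifiers, each returning True or False.
    min_votes: number of positive ensemble votes required (assumed rule).
    """
    # Stage 1: the cascade prunes the search space by rejecting early.
    for test in cascade:
        if not test(descriptors):
            return False
    # Stage 2: only surviving nodes reach the more complex classifiers.
    votes = sum(1 for classifier in ensemble if classifier(descriptors))
    return votes >= min_votes
```

Because most tree nodes fail an early cascade test, the expensive classifiers run on only a small fraction of the nodes.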
The system is extensively tested on different face data sets, producing accurate segmentations and proving to be quite robust to variations in scale,
position, orientation, lighting conditions and background complexity.
We show that the technique proposed for faces can be easily adapted to detect other object classes. Since the construction of the image model does
not depend on any object class, different objects can be detected and segmented using the appropriate object model on the same image model. New
object models can be easily built by selecting and training a suitable set of descriptors and classifiers.
Finally, a tracking mechanism is proposed. It combines the efficiency of the mean-shift algorithm with the use of regions to track and segment faces
through a video sequence, where both the face and the camera may move. The method is extended to deal with other deformable objects, using a
region-based graph-cut method for the final object segmentation at each frame. Experiments show that both mean-shift based trackers produce
accurate segmentations even in difficult scenarios such as those with similar object and background colors and fast camera and object movements.
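For readers unfamiliar with mean-shift, a minimal sketch of its core iteration follows. It assumes a per-pixel weight map (for example a color likelihood of the tracked object) is already available; the region-based and graph-cut parts of the thesis are not shown, and the function name and parameters are hypothetical.

```python
def mean_shift(weights, center, half, max_iter=20):
    """Move a square window to the local centroid of a weight map.

    weights: 2D list of non-negative floats (assumed object likelihoods).
    center: (row, col) starting position of the window.
    half: half-size of the window in pixels.
    Returns the (row, col) where the window settles.
    """
    rows, cols = len(weights), len(weights[0])
    r, c = center
    for _ in range(max_iter):
        total = sum_r = sum_c = 0.0
        for i in range(max(0, r - half), min(rows, r + half + 1)):
            for j in range(max(0, c - half), min(cols, c + half + 1)):
                w = weights[i][j]
                total += w
                sum_r += w * i
                sum_c += w * j
        if total == 0:
            break  # no evidence inside the window, stay where we are
        new_r, new_c = round(sum_r / total), round(sum_c / total)
        if (new_r, new_c) == (r, c):
            break  # converged
        r, c = new_r, new_c
    return r, c
```

In a tracker, the position found in one frame would serve as the starting `center` for the next frame.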
|
12 |
ROBOMIRROR: A SIMULATED MIRROR DISPLAY WITH A ROBOTIC CAMERA
Zhang, Yuqi 01 January 2014
Simulated mirror displays hold promise for many applications because of their capability for virtual visualization. In most existing mirror displays, cameras are placed on top of the display and cannot capture the person in front of it at the highest possible resolution. The lack of a direct frontal capture of the subject's face and the geometric error introduced by image warping techniques make realistic mirror image rendering a challenging problem. The objective of this thesis is to explore the use of a robotic camera that tracks the face of the subject in front of the display to obtain a high-quality image capture. Our system uses a Bislide system to control a camera for face capture, while using a separate color-depth camera for accurate face tracking. We construct an optical device in which a one-way mirror is used so that the robotic camera behind it can capture the subject while the rendered images are displayed by reflecting off the mirror from an overhead projector. A key challenge of the proposed system is the reduction of light due to the one-way mirror. The optimal 2D Wiener filter is selected to enhance the low-contrast images captured by the camera.
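The abstract does not say which formulation of the Wiener filter was used. As a reminder, the standard frequency-domain form is given below, with symbols introduced here for illustration: G is the observed image spectrum, H the degradation transfer function, and S_f and S_n the power spectra of the clean image and the noise.

```latex
\hat{F}(u,v) = \frac{H^{*}(u,v)}{\lvert H(u,v)\rvert^{2} + S_{n}(u,v)/S_{f}(u,v)}\, G(u,v)
```

With no blur (H = 1) this reduces to the pure denoising filter S_f / (S_f + S_n), which attenuates frequencies where noise dominates the signal.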
|
13 |
Facial Features Tracking using Active Appearance Models
Fanelli, Gabriele January 2006
This thesis aims at building a system capable of automatically extracting and parameterizing the position of a face and its features in images acquired from a low-end monocular camera. Such a challenging task is justified by the importance and variety of its possible applications, ranging from face and expression recognition to animation of virtual characters using video depicting real actors. The implementation includes the construction of Active Appearance Models of the human face from training images. The existing face model Candide-3 is used as a starting point, making the translation of the tracking parameters to standard MPEG-4 Facial Animation Parameters easy. The Inverse Compositional Algorithm is employed to adapt the models to new images, working on a subspace where the appearance is "projected out" and thus focusing only on shape. The algorithm is tested on a generic model, aiming at tracking different people’s faces, and on a specific model, considering one person only. In the former case, the need for improvements in the robustness of the system is highlighted. By contrast, the latter case gives good results regarding both quality and speed, with real time performance being a feasible goal for future developments.
|
14 |
Utveckling av ett active vision system för demonstration av EDSDK++ i tillämpningar inom datorseende [Development of an active vision system for demonstrating EDSDK++ in computer vision applications]
Kargén, Rolf January 2014
Computer vision is a rapidly growing, interdisciplinary field whose applications are taking an increasingly prominent role in today's society. With increased interest in computer vision comes an increasing need to control cameras connected to computer vision systems. At the division of computer vision at Linköping University, the framework EDSDK++ has been developed to remotely control digital cameras made by Canon Inc. The framework is very comprehensive and contains a large number of features and configuration options, so it is still largely untested. This thesis aims to develop a demonstrator for EDSDK++ in the form of a simple active vision system, which uses real-time face detection to control a camera tilt unit, and a camera mounted on it, to follow, zoom in and focus on a face or a group of faces. A requirement was that the OpenCV library be used for face detection and that EDSDK++ be used to control the camera. Moreover, an API to control the camera tilt unit was to be developed. During development, different methods for face detection were investigated. To improve performance, multiple face detectors were run in parallel using multithreading to scan an image from different angles. Both experimental and theoretical approaches were taken to determine the parameters needed to control the camera and the tilt unit. The project resulted in a fully functional demonstrator that fulfilled all requirements.
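The control loop implied above, turning a detected face position into a tilt movement, could be approximated with a simple proportional rule. Everything in this sketch is an assumption: the function name, the gain, the dead zone and the sign convention (positive means tilt down) are illustrative and are not taken from the thesis or from EDSDK++.

```python
def tilt_step(face_box, frame_height, gain_deg=30.0, dead_zone=0.05):
    """Return a tilt adjustment in degrees for one detected face.

    face_box: (x, y, width, height) in pixels, origin at the top-left corner.
    frame_height: height of the video frame in pixels.
    gain_deg: hypothetical proportional gain (degrees per unit of error).
    dead_zone: normalized error below which the tilt is left unchanged.
    """
    x, y, w, h = face_box
    face_center_y = y + h / 2
    # Normalized vertical offset of the face from the image centre.
    error = (face_center_y - frame_height / 2) / frame_height
    if abs(error) < dead_zone:
        return 0.0  # close enough, avoid jitter
    return gain_deg * error
```

The dead zone keeps the tilt unit from oscillating when the face is already near the centre of the frame.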
|
15 |
[en] COLLABORATIVE FACE TRACKING: A FRAMEWORK FOR THE LONG-TERM FACE TRACKING / [pt] RASTREAMENTO DE FACES COLABORATIVO: UMA METODOLOGIA PARA O RASTREAMENTO DE FACES AO LONGO PRAZO
VICTOR HUGO AYMA QUIRITA 22 March 2021
[en] Visual tracking is fundamental in several computer vision applications.
In particular, face tracking is challenging because of the variations in facial
appearance, due to age, ethnicity, gender, facial hair, and cosmetics, as well
as appearance variations in long video sequences caused by facial deformations,
lighting conditions, abrupt movements, and occlusions. Generally,
trackers are robust to some of these factors but do not achieve satisfactory
results when dealing with combined occurrences. An alternative is to combine
the results of different trackers to achieve more robust outcomes. This
work fits into this context and proposes a new method for scalable, robust
and accurate tracker fusion able to combine trackers regardless of their models.
The method further provides the integration of face detectors into the
fusion model to increase the tracking accuracy. The proposed method was
implemented for validation purposes and was tested in different configurations
that combined up to five different trackers and one face detector. In
tests on four video sequences that present different imaging conditions, the
method outperformed the trackers used individually.
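To make the idea of model-independent tracker fusion concrete, here is one very simple fusion rule, a coordinate-wise median of the bounding boxes. This is explicitly not the method proposed in the thesis, whose details are not given in the abstract; the function name and box format are assumptions.

```python
from statistics import median


def fuse_boxes(tracker_boxes, detector_box=None):
    """Fuse several (x, y, width, height) boxes into one.

    tracker_boxes: boxes reported by independent trackers for one frame.
    detector_box: optional box from a face detector, treated as one more vote.
    Uses only the boxes themselves, so any tracker can take part
    regardless of its internal model.
    """
    boxes = list(tracker_boxes)
    if detector_box is not None:
        boxes.append(detector_box)
    if not boxes:
        raise ValueError("at least one box is required")
    # The median is robust to a single tracker that has drifted away.
    return tuple(median(box[k] for box in boxes) for k in range(4))
```

With three trackers, one of which has drifted far from the face, the median simply ignores the outlier.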
|
16 |
Model-Based Eye Detection and Animation
Trejo Guerrero, Sandra January 2006
In this thesis we present a system that extracts the eye motion from a video stream containing a human face and applies this eye motion to a virtual character. By eye motion estimation, we mean the information that describes the location of the eyes in each frame of the video stream. Applying this eye motion estimation to a virtual character makes the virtual face move its eyes in the same way as the human face, synthesizing eye motion in the virtual character. In this study, a system capable of face tracking, eye detection and extraction, and finally iris position extraction from a video stream containing a human face has been developed. Once an image containing a human face is extracted from the current frame of the video stream, the detection and extraction of the eyes is applied. The detection and extraction of the eyes is based on edge detection. Then the iris center is determined by applying different image preprocessing and region segmentation steps that use edge features on the extracted eye picture.
Once we have extracted the eye motion, it is translated into Facial Animation Parameters (FAPs) using MPEG-4 Facial Animation. Thus we can improve the quality and quantity of Facial Animation expressions that we can synthesize in a virtual character.
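The abstract states that eye detection is based on edge detection without naming an operator. As a generic illustration, assuming a Sobel operator (a common choice, not confirmed by the source), the gradient magnitude of a grayscale image can be computed as follows.

```python
def sobel_magnitude(image):
    """Return the Sobel gradient magnitude of a grayscale image.

    image: 2D list of numbers (rows of pixel intensities).
    Border pixels are left at 0.0 because the 3x3 kernel does not fit there.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Horizontal gradient: right column minus left column.
            gx = (image[i - 1][j + 1] + 2 * image[i][j + 1] + image[i + 1][j + 1]) - (
                image[i - 1][j - 1] + 2 * image[i][j - 1] + image[i + 1][j - 1]
            )
            # Vertical gradient: bottom row minus top row.
            gy = (image[i + 1][j - 1] + 2 * image[i + 1][j] + image[i + 1][j + 1]) - (
                image[i - 1][j - 1] + 2 * image[i - 1][j] + image[i - 1][j + 1]
            )
            out[i][j] = (gx * gx + gy * gy) ** 0.5
    return out
```

Strong responses tend to outline the eyelids and the iris boundary, which is the kind of edge feature the later segmentation steps could build on.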
|
17 |
Face recognition from video
Harguess, Joshua David 30 January 2012
While the area of face recognition has been extensively studied in recent years, it remains a largely open problem, despite what movie and television studios would lead you to believe. Frontal, still face recognition research has seen a lot of success in recent years from many different researchers. However, the accuracy of such systems can be greatly diminished in cases such as increasing the variability of the database, occluding the face, and varying the illumination of the face. Further varying the pose of the face (yaw, pitch, and roll) and the facial expression (smile, frown, etc.) adds even more complexity to the face recognition task, such as in the case of face recognition from video. In a more realistic video surveillance setting, a face recognition system should be robust to scale, pose, resolution, and occlusion as well as successfully track the face between frames. Also, a more advanced face recognition system should be able to improve the face recognition result by utilizing the information present in multiple video cameras.
We approach the problem of face recognition from video in the following manner. We assume that the training data for the system consists of only still image data, such as passport photos or mugshots in a real-world system. We then transform the problem of face recognition from video to a still face recognition problem. Our research focuses on solutions to detecting, tracking and extracting face information from video frames so that they may be utilized effectively in a still face recognition system.
We have developed four novel methods that assist in face recognition from video and multiple cameras. The first uses a patch-based method to handle the face recognition task when only patches, or parts, of the face are seen in a video, such as when occlusion of the face happens often. The second fuses the recognition results of multiple cameras to improve the recognition accuracy. In the third solution, we utilize multiple overlapping video cameras to improve the face tracking result, which in turn improves the face recognition accuracy of the system. We additionally implement a methodology to detect and handle occlusion so that unwanted information is not used in the tracking algorithm. Finally, we introduce the average-half-face, which is shown to improve the results of still face recognition by utilizing the symmetry of the face. In one attempt to understand the use of the average-half-face in face recognition, an analysis of the effect of face symmetry on face recognition results is shown.
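The average-half-face mentioned above exploits facial symmetry by averaging one half of the face with the mirror image of the other. The sketch below assumes a grayscale face image that is already centered and aligned on its vertical midline; the function name is hypothetical and the alignment step used in the thesis is not shown.

```python
def average_half_face(image):
    """Average the left half of a face with the mirrored right half.

    image: 2D list of pixel intensities, assumed centered on the face midline.
    Returns a 2D list half as wide as the input. For odd widths the
    middle column is dropped.
    """
    width = len(image[0])
    half = width // 2
    result = []
    for row in image:
        left = row[:half]
        right_mirrored = row[width - half:][::-1]
        result.append([(a + b) / 2 for a, b in zip(left, right_mirrored)])
    return result
```

The output has half the columns of the input, so a recognizer trained on it works with a smaller representation.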
|
18 |
Model-Based Eye Detection and Animation
Trejo Guerrero, Sandra January 2006
In this thesis we present a system that extracts the eye motion from a video stream containing a human face and applies this eye motion to a virtual character. By eye motion estimation, we mean the information that describes the location of the eyes in each frame of the video stream. Applying this eye motion estimation to a virtual character makes the virtual face move its eyes in the same way as the human face, synthesizing eye motion in the virtual character. In this study, a system capable of face tracking, eye detection and extraction, and finally iris position extraction from a video stream containing a human face has been developed. Once an image containing a human face is extracted from the current frame of the video stream, the detection and extraction of the eyes is applied. The detection and extraction of the eyes is based on edge detection. Then the iris center is determined by applying different image preprocessing and region segmentation steps that use edge features on the extracted eye picture. Once we have extracted the eye motion, it is translated into Facial Animation Parameters (FAPs) using MPEG-4 Facial Animation. Thus we can improve the quality and quantity of Facial Animation expressions that we can synthesize in a virtual character.
|
19 |
A Software Framework for Facial Modelling and Tracking
Strand, Mattias January 2010
The WinCandide application, a platform for face tracking and model-based coding, had become out of date and needed to be upgraded. This report documents an investigation of open-source GUI and computer vision toolkits that could replace the old, unsupported ones. Multi-platform GUIs are of special interest.
|
20 |
Facial Expressions as Indicator for Discomfort in Automated Driving
Beggiato, Matthias, Rauh, Nadine, Krems, Josef 26 August 2021
Driving comfort is considered a key factor for broad public acceptance of automated driving. Based on continuous driver/passenger monitoring, potential discomfort could be avoided by adapting automation features such as the driving style. The EU project MEDIATOR (mediatorproject.eu) aims at developing a mediating system in automated vehicles by constantly evaluating the performance of driver and automation. As facial expressions could be an indicator of discomfort, a driving simulator study was carried out to investigate this relationship. A total of 41 participants experienced three potentially uncomfortable automated approach situations to a truck driving ahead. Face video from four cameras was analyzed with the Visage facial feature detection and face analysis software, extracting 23 Action Units (AUs). Situation-specific effects showed that the eyes were kept open and eye blinks were reduced (AU43). Inner brows (AU1) as well as upper lids (AU5) were raised, indicating surprise. Lips were pressed (AU24) and stretched (AU20) as a sign of tension. Overall, facial expression analysis could contribute to detecting discomfort in automated driving.
|