91 |
Detekce obličeje / Face Detection
Šašinka, Ondřej. January 2009
This MSc thesis deals with face detection in images. In this approach, facial features (eyes, nose, mouth corners) are detected first and then joined into a whole face. Classifiers trained with the AdaBoost algorithm are used to detect the facial features, with Haar wavelets as the classification features.
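As a rough illustration of the Haar-type features mentioned in this abstract, the sketch below computes a two-rectangle feature with an integral image (summed-area table). It is not taken from the thesis: the function names, the toy 4x4 image, and the choice of a horizontal two-rectangle feature are assumptions made for the example.

```python
def integral_image(img):
    """Return a summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect_horizontal(ii, x, y, w, h):
    """Left half minus right half of the window; w is assumed to be even."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Hypothetical toy image: bright left half, dark right half.
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(image)
feature = haar_two_rect_horizontal(ii, 0, 0, 4, 4)
```

Each rectangle sum costs four table lookups regardless of rectangle size, which is why such features suit boosted classifiers that evaluate many of them per window.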
|
92 |
AdaBoost v počítačovém vidění / AdaBoost in Computer Vision
Hradiš, Michal. Unknown date
In this thesis, we present local rank differences (LRD). These novel image features are invariant to lighting changes and are suitable for object detection in programmable hardware such as FPGAs. AdaBoost classifiers using LRD were tested on a face detection dataset, with results similar to those of Haar-like features, the state of the art in real-time object detection. These results, together with the fact that LRD are evaluated much faster on an FPGA than Haar-like features, are very encouraging and suggest that LRD may be a solution for future hardware object detectors. We also present a framework for experiments with boosting methods in computer vision. The framework is very flexible while offering high learning performance and the possibility of future parallelization. It is available as open source software, and we hope it will simplify work for other researchers.
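The idea of a rank-based feature can be sketched as follows. This is a minimal illustration that assumes an LRD is the difference between the ranks of two cells within a small grid of cell sums; the 3x3 grid, the function names, and the toy data are assumptions, not the thesis's exact definition.

```python
def cell_sums(img, x0, y0, cell_w, cell_h, grid=3):
    """Sum the pixels of each cell in a grid x grid block starting at (x0, y0)."""
    sums = []
    for gy in range(grid):
        for gx in range(grid):
            total = 0
            for y in range(y0 + gy * cell_h, y0 + (gy + 1) * cell_h):
                for x in range(x0 + gx * cell_w, x0 + (gx + 1) * cell_w):
                    total += img[y][x]
            sums.append(total)
    return sums


def rank(values, index):
    """Number of values strictly smaller than values[index]."""
    return sum(1 for v in values if v < values[index])


def local_rank_difference(values, a, b):
    """Rank of cell a minus rank of cell b within the same grid."""
    return rank(values, a) - rank(values, b)


# Hypothetical 3x3 patch with 1x1 cells.
patch = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
# The same patch after a gain and offset change in brightness.
brighter = [[2 * v + 5 for v in row] for row in patch]

lrd_original = local_rank_difference(cell_sums(patch, 0, 0, 1, 1), 0, 8)
lrd_brighter = local_rank_difference(cell_sums(brighter, 0, 0, 1, 1), 0, 8)
```

Because ranks depend only on the ordering of the cell sums, a positive gain and an offset applied to the whole patch leave the feature unchanged, which matches the lighting invariance the abstract describes.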
|
93 |
IR-Depth Face Detection and Lip Localization Using Kinect V2
Fong, Katherine Kayan. 01 June 2015
Face recognition and lip localization are two main building blocks in the development of audio visual automatic speech recognition systems (AV-ASR). In many earlier works, face recognition and lip localization were conducted in uniform lighting conditions with simple backgrounds. However, such conditions are seldom the case in real world applications. In this paper, we present an approach to face recognition and lip localization that is invariant to lighting conditions. This is done by employing infrared and depth images captured by the Kinect V2 device. First we present the use of infrared images for face detection. Second, we use the face’s inherent depth information to reduce the search area for the lips by developing a nose point detection. Third, we further reduce the search area by using a depth segmentation algorithm to separate the face from its background. Finally, with the reduced search range, we present a method for lip localization based on depth gradients. Experimental results demonstrated an accuracy of 100% for face detection, and 96% for lip localization.
|
94 |
Face Detection and Lip Localization
Husain, Benafsh Nadir. 01 August 2011
Integration of audio and video signals for automatic speech recognition has become an important field of study. Audio-Visual Speech Recognition (AVSR) systems are known to be more accurate than audio-only or visual-only systems. Research on the visual front end has centered on lip segmentation. Experiments on lip feature extraction have mainly been performed in constrained environments with controlled background noise. In this thesis we focus on a database collected in a moving car, an environment that degraded the quality of the imagery.
We first introduce the concept of illumination compensation, which aims to reduce the effect of lighting in over- or under-exposed images. As a precursor to lip segmentation, we focus on a robust face detection technique that reaches an accuracy of 95%. We detail and compare three face detection techniques and find a successful way of concatenating them to increase overall accuracy. One of the techniques is the object detection algorithm proposed by Viola and Jones. We experimented with different color spaces using the Viola-Jones algorithm and reached interesting conclusions.
Following face detection, we implement a lip localization algorithm based on the vertical gradients of hybrid equations of color. Despite the challenging background and image quality, a success rate of 88% was achieved for lip segmentation.
|
95 |
Modelling and Recognition of Manuals and Non-manuals in American Sign Language
Ding, Liya. 26 June 2009
No description available.
|
96 |
Digital image processing via combination of low-level and high-level approaches
Wang, Dong. January 2011
With the growth of computing power, digital image processing plays an increasingly important role in the modern world, in fields including industry, medicine, communications and spaceflight technology. There is no clear definition of how to divide digital image processing, but it normally includes three main steps: low-level, mid-level and high-level processing. Low-level processing involves primitive operations such as image preprocessing to reduce noise, contrast enhancement, and image sharpening. Mid-level processing involves tasks such as segmentation (partitioning an image into regions or objects), description of those objects to reduce them to a form suitable for computer processing, and classification (recognition) of individual objects. Finally, high-level processing involves "making sense" of an ensemble of recognised objects, as in image analysis.
Following this division, the thesis is organised in three parts: Colour Edge and Face Detection; Hand Motion Detection; Hand Gesture Detection and Medical Image Processing. In Colour Edge Detection, two new images, the G-image and the R-image, are built through a colour space transform; the edges extracted from the G-image and the R-image are then combined to obtain the final edge. In Face Detection, a skin model is built first, and its boundary condition is extracted so that it covers almost all skin pixels. After skin detection, knowledge about size, size ratio, and the locations of the ears and mouth is used to recognise the face within the skin regions. In Hand Motion Detection, the frame difference is compared with an automatically chosen threshold in order to identify the moving object. For some special situations with slow or smooth object motion, background modelling and frame differencing are combined to improve performance. In Hand Gesture Recognition, three features of every testing image are input to a Gaussian Mixture Model (GMM), and the Expectation Maximization (EM) algorithm is then used to compare the GMM from the testing images with the GMM from the training images in order to classify the results. In Medical Image Processing (mammograms), an Artificial Neural Network (ANN) and a clustering rule are applied to choose the features. Two classifiers, an ANN and a Support Vector Machine (SVM), are applied to classify the results; in this processing, balanced learning theory and an optimized decision are applied to improve performance.
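The frame-differencing step described for Hand Motion Detection can be sketched as below. The abstract does not say how the threshold is chosen automatically, so the rule used here (mean plus k standard deviations of the difference image) is an assumption, as are the function names and the toy frames.

```python
from statistics import mean, pstdev


def frame_difference(prev, curr):
    """Per-pixel absolute difference between two grayscale frames."""
    return [[abs(c - p) for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]


def auto_threshold(diff, k=2.0):
    """Assumed rule: mean + k * standard deviation of the difference image."""
    flat = [v for row in diff for v in row]
    return mean(flat) + k * pstdev(flat)


def motion_mask(prev, curr, k=2.0):
    """Mark pixels whose change exceeds the automatically chosen threshold."""
    diff = frame_difference(prev, curr)
    threshold = auto_threshold(diff, k)
    return [[1 if v > threshold else 0 for v in row] for row in diff]


# Hypothetical 4x4 frames: a static background with one pixel that changes.
previous_frame = [[10] * 4 for _ in range(4)]
current_frame = [row[:] for row in previous_frame]
current_frame[1][2] = 200

mask = motion_mask(previous_frame, current_frame)
```

A statistics-based threshold adapts to overall frame noise, but with slow or smooth motion the differences stay small, which is the case the abstract addresses by adding background modelling.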
|
97 |
Analýza emocionálních stavů na základě obrazových předloh / Emotional State Analysis Upon Image Patterns
Přinosil, Jiří. January 2009
This dissertation deals with an automatic system for recognizing basic emotional facial expressions in static images. The system is divided into three independent, linked parts. The first part deals with automatic face detection in color images. It proposes a face detector based on skin color, together with methods that use color maps to localize the eyes and lips in detected faces. This part also includes a modified Viola-Jones face detector, which was additionally used experimentally for eye detection. Both face detectors were tested on the Georgia Tech Face Database. The second part is the feature extraction process, which consists of two statistical methods and one method based on image filtering with a set of Gabor filters. Several combinations of the features extracted by these methods were used experimentally. The last part is the classifier, a feed-forward neural network. The system is complemented by accurate localization of particular facial features using an active shape model. The whole system was benchmarked on recognizing basic emotional facial expressions using the Japanese Female Facial Expression database.
|
98 |
Contributions in face detection with deep neural networks
Paula, Thomas da Silva. 28 March 2017
Face detection is one of the most studied subjects in the field of Computer Vision. Given an arbitrary image or video frame, the goal of face detection is to determine whether there are any faces in the image and, if so, to return the location and extent of each face. Such detection is easily done by humans, but it remains a challenge within Computer Vision. The high degree of variability and the dynamicity of the human face make it very difficult to detect, mainly in complex environments. Recently, Deep Learning approaches started to be applied to Computer Vision tasks with great results. They opened new research possibilities in different applications, including face detection. Even though Deep Learning has been successfully applied to this task, most state-of-the-art implementations use off-the-shelf face detectors and do not evaluate the differences among them. In other cases, the face detectors are trained in a multitask manner that includes face landmark detection, age detection, and so on. Hence, our goal is threefold. First, we summarize and explain many advances in deep learning, detailing how each architecture and implementation works. Second, we focus on the face detection problem itself, performing a rigorous analysis of some existing face detectors as well as implementations of our own. We experiment with and evaluate variations of hyper-parameters for each detector and their impact on different datasets. We explore both traditional and more recent approaches, and we implement our own face detectors. Finally, we implement, test, and compare a meta-learning approach for face detection, which aims to learn the best face detector for a given image. Our experiments contribute to understanding the role of deep learning in face detection, as well as the subtleties of changing the hyper-parameters of face detectors and their impact on face detection. We also show how well features obtained with deep neural networks trained on a general-purpose dataset perform in a meta-learning approach for face detection. Our experiments and conclusions show that deep learning indeed has a notable role in face detection.
|
99 |
Detecção de faces humanas em imagens coloridas utilizando redes neurais artificiais / Detection of human faces in color images using artificial neural networks
Gouveia, Wellington da Rocha. 28 January 2010
The task of finding faces in images is extremely complex: brightness varies, backgrounds can be highly complex, and objects may partially overlap the face to be located, among other problems. As computer vision has advanced, more recent image processing and artificial intelligence techniques have been combined to develop more efficient face detection algorithms.
This work presents a computer vision methodology that uses MLP (Multilayer Perceptron) neural networks to segment skin color and face texture from other objects in an image with a complex background. The resulting image is divided into regions, and the features extracted from each region are fed to a second MLP neural network that decides whether the region contains a face. The software was evaluated on two image databases: one with standardized images (the AR database) and one with images acquired from the Internet, containing faces with different skin tones and complex backgrounds. In the final results, 83% of faces were detected in the Internet database and 88% in the AR database; the better results on the AR database are due to its images being standardized, with no tilted faces or complex backgrounds. The segmentation step, although it reduces the amount of information the remaining modules must process, contributed the largest number of false negatives.
|
100 |
Segmentação de imagens de pessoas em tempo real para videoconferências / Real-time segmentation of images of people for videoconferences
Parolin, Alessandro. 22 March 2011
HP - Hewlett-Packard Brasil Ltda / Milton Valente
Object segmentation in images and video has been studied in the computer vision and image processing fields for quite some time. Recently, given the evolution of hardware and the popularization of the internet, videoconferencing has become a prominent application of the segmentation of images of people, in both academic and commercial settings.
This kind of application brings advantages to many fields, such as telemedicine, distance learning, and above all business. Many companies use videoconferences for worldwide meetings in order to save a substantial amount of resources. However, videoconferences still do not provide the same experience people have when they are in the same room. Therefore, this work proposes a system that segments the image of the speaker, specifically for videoconferences, in order to allow later processing that increases the participants' sense of immersion. For instance, the background of the scene could be replaced by a standard one for all participants. The proposed system uses a dynamic programming algorithm guided by energies extracted from the image, involving edge, motion and probability information. After extensive tests, we conclude that the results are comparable to the state of the art and that the system can run in real time at 8 FPS, even with unoptimized code. The main advantage of the proposed system over others is that no prior training is required to perform the segmentation.
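To illustrate what a dynamic programming algorithm guided by energies can look like, the sketch below finds a minimum-energy top-to-bottom path through an energy grid. This is a generic seam-style formulation, not the author's algorithm; the single combined energy map, the function name, and the toy values are assumptions.

```python
def min_energy_path(energy):
    """Return (path, total): one column index per row and the summed energy.

    Each step may move at most one column left or right between rows.
    """
    h, w = len(energy), len(energy[0])
    cost = [row[:] for row in energy]      # cost[y][x]: best total ending at (x, y)
    back = [[0] * w for _ in range(h)]     # back[y][x]: column chosen in row y - 1
    for y in range(1, h):
        for x in range(w):
            candidates = [(cost[y - 1][nx], nx)
                          for nx in (x - 1, x, x + 1) if 0 <= nx < w]
            best_cost, best_x = min(candidates)
            cost[y][x] = energy[y][x] + best_cost
            back[y][x] = best_x
    # Start from the cheapest cell in the last row and walk back to the top.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    total = cost[h - 1][x]
    path = [x]
    for y in range(h - 1, 0, -1):
        x = back[y][x]
        path.append(x)
    path.reverse()
    return path, total


# Hypothetical energy map: low values mark a likely person/background boundary.
energy_map = [
    [5, 1, 5, 5],
    [5, 5, 1, 5],
    [5, 5, 1, 5],
    [5, 1, 5, 5],
]
path, total = min_energy_path(energy_map)
```

In a setting like the one the abstract describes, the energy map would presumably combine edge, motion and probability terms, so that the cheapest path follows the contour of the speaker; no trained model is involved in the path search itself.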
|