Global ETD Search

11	Mistura de cores: uma nova abordagem para processamento de cores e sua aplicação na segmentação de imagens / Colors mixture: a new approach for color processing and its application in image segmentation Severino Junior, Osvaldo 28 May 2009 (has links) Inspirado nas técnicas utilizadas por pintores que sobrepõem camadas de tintas de diversos matizes na geração de uma tela artística e também observando-se a distribuição da quantidade dos cones na retina do olho humano na interpretação destas cores, este trabalho propõe uma técnica de processamento de imagens baseada na mistura de cores. Trata-se de um método de quantização de cores estático que expressa a proporção das cores preto, azul, verde, ciano, vermelho, magenta, amarelo e branco obtida pela representação binária da cor que compõe os pixels de uma imagem RGB com 8 bits por canal. O histograma da mistura é denominado de misturograma e gera planos que interceptam o espaço RGB, definindo o espaço de cor HSM (Hue, Saturation and Mixture). A posição destes planos dentro do cubo RGB é modelada por meio da distribuição dos cones sensíveis aos comprimentos de onda curta (Short), média (Middle) e longa (Long) consideradas para a retina humana. Para demonstrar a aplicabilidade do espaço de cor HSM, é proposta, neste trabalho, a segmentação dos pixels de uma imagem digital em pele humana ou não pele com o uso dessa nova abordagem. Para análise de desempenho da mistura de cores foi implementado um método tradicional no espaço de cor RGB e também usando uma distribuição Gaussiana nos espaços de cores HSV e HSM. Os resultados obtidos demonstram o potencial da técnica que emprega a mistura de cores para a segmentação de imagens digitais coloridas. Verificou-se também que, baseando-se apenas na camada mais significativa da mistura de cores, gera-se a imagem esboço de uma imagem facial denominada esboço da face. Os resultados obtidos comprovam o bom desempenho do esboço da face em aplicações CBIR. 
/ Inspired by the techniques painters use to overlay layers of paint of various hues when creating a painting, and by observations of the distribution of cones in the human retina for the interpretation of these colors, this thesis proposes an image processing technique based on color mixing. This is a static color quantization method that expresses the proportion of the colors black, blue, green, cyan, red, magenta, yellow and white obtained from the binary representation of the color of each pixel of an RGB image with 8 bits per channel. The mixture histogram, called a mixturegram, generates planes that intersect the RGB color space, defining the HSM (Hue, Saturation and Mixture) color space. The position of these planes inside the RGB cube is modeled by the distribution of cones sensitive to the short (S), middle (M) and long (L) wavelengths in the human retina. To demonstrate the applicability of the HSM color space, this thesis proposes the segmentation of the pixels of a digital image into human skin or non-skin using this new approach. To evaluate the color mixture, a traditional method was implemented in the RGB color space, along with a Gaussian distribution in the HSV and HSM color spaces. The results demonstrate the potential of the proposed technique for color image segmentation. It was also noted that, based only on the most significant layer of the color mixture, it is possible to generate a sketch of a facial image, called the face sketch. The results confirm the good performance of the face sketch in CBIR applications. Color images processing Color quantization Espaço de cor HSM HSM color space Human skin segmentation Misturograma Mixturogram Processamento de imagens coloridas Quantização de cor Segmentação de pele humana
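
The abstract above describes the mixture as derived from the binary representation of an 8-bit-per-channel RGB pixel. One possible reading, sketched below in Python, treats each of the 8 bit planes as a layer: the (R, G, B) bits of a layer select one of the eight basic colors, and the layer contributes its binary weight. The function name `color_mixture` and the indexing scheme are assumptions made for illustration; the thesis itself may define the mixture differently.

```python
# Basic colors in the order the abstract lists them. Under the assumed
# indexing 4*r_bit + 2*g_bit + b_bit, that order matches 000 .. 111.
MIXTURE_COLORS = ("black", "blue", "green", "cyan",
                  "red", "magenta", "yellow", "white")


def color_mixture(r, g, b):
    """Return the weight of each basic color in one 8-bit RGB pixel.

    Hypothetical reading of the abstract: bit plane k of the three
    channels picks a basic color and adds the binary weight 2**k.
    """
    weights = dict.fromkeys(MIXTURE_COLORS, 0)
    for bit in range(8):
        r_bit = (r >> bit) & 1
        g_bit = (g >> bit) & 1
        b_bit = (b >> bit) & 1
        name = MIXTURE_COLORS[4 * r_bit + 2 * g_bit + b_bit]
        weights[name] += 1 << bit
    return weights


if __name__ == "__main__":
    print(color_mixture(200, 120, 80))
```

Under this reading the weights of a pixel always sum to 255, and the most significant layer (bit 7) alone yields an 8-color image, which is one plausible interpretation of the "most significant layer" mentioned for the face sketch.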
12	Sistema de reconhecimento automático de Língua Brasileira de Sinais / Automatic Recognition System of Brazilian Sign Language Beatriz Tomazela Teodoro 23 October 2015 (has links) O reconhecimento de língua de sinais é uma importante área de pesquisa que tem como objetivo atenuar os obstáculos impostos no dia a dia das pessoas surdas e/ou com deficiência auditiva e aumentar a integração destas pessoas na sociedade majoritariamente ouvinte em que vivemos. Baseado nisso, esta dissertação de mestrado propõe o desenvolvimento de um sistema de informação para o reconhecimento automático de Língua Brasileira de Sinais (LIBRAS), que tem como objetivo simplificar a comunicação entre surdos conversando em LIBRAS e ouvintes que não conheçam esta língua de sinais. O reconhecimento é realizado por meio do processamento de sequências de imagens digitais (vídeos) de pessoas se comunicando em LIBRAS, sem o uso de luvas coloridas e/ou luvas de dados e sensores ou a exigência de gravações de alta qualidade em laboratórios com ambientes controlados, focando em sinais que utilizam apenas as mãos. Dada a grande dificuldade de criação de um sistema com este propósito, foi utilizada uma abordagem para o seu desenvolvimento por meio da divisão em etapas. Considera-se que todas as etapas do sistema proposto são contribuições para trabalhos futuros da área de reconhecimento de sinais, além de poderem contribuir para outros tipos de trabalhos que envolvam processamento de imagens, segmentação de pele humana, rastreamento de objetos, entre outros. Para atingir o objetivo proposto foram desenvolvidas uma ferramenta para segmentar sequências de imagens relacionadas à LIBRAS e uma ferramenta para identificar sinais dinâmicos nas sequências de imagens relacionadas à LIBRAS e traduzi-los para o português. 
Além disso, também foi construído um banco de imagens de 30 palavras básicas escolhidas por uma especialista em LIBRAS, sem a utilização de luvas coloridas, laboratórios com ambientes controlados e/ou imposição de exigências na vestimenta dos indivíduos que executaram os sinais. O segmentador implementado e utilizado neste trabalho atingiu uma taxa média de acurácia de 99,02% e um índice overlap de 0,61, a partir de um conjunto de 180 frames pré-processados extraídos de 18 vídeos gravados para a construção do banco de imagens. O algoritmo foi capaz de segmentar pouco mais de 70% das amostras. Quanto à acurácia para o reconhecimento das palavras, o sistema proposto atingiu 100% de acerto para reconhecer as 422 amostras de palavras do banco de imagens construído, as quais foram segmentadas a partir da combinação da técnica de distância de edição e um esquema de votação com um classificador binário para realizar o reconhecimento, atingindo assim, o objetivo proposto neste trabalho com êxito. / The recognition of sign language is an important research area that aims to mitigate the obstacles in the daily lives of people who are deaf and/or hard of hearing and to increase their integration into the predominantly hearing society in which we live. Based on this, this dissertation proposes the development of an information system for the automatic recognition of Brazilian Sign Language (BSL), which aims to simplify communication between deaf people conversing in BSL and hearing people who do not know this sign language. Recognition is performed by processing digital image sequences (videos) of people communicating in BSL, without the use of colored gloves and/or data gloves and sensors, and without requiring high-quality recordings in laboratories with controlled environments, focusing on signs that use only the hands. Given the great difficulty of building a system for this purpose, its development was divided into stages. 
All stages of the proposed system are considered contributions to future work in sign recognition, and may also contribute to other kinds of work involving image processing, human skin segmentation, object tracking, among others. To achieve this purpose, we developed a tool to segment image sequences related to BSL and a tool to identify dynamic signs in those sequences and translate them into Portuguese. Moreover, an image database of 30 basic words chosen by a BSL expert was built, without the use of colored gloves, laboratory-controlled environments and/or restrictions on the clothing of the individuals who performed the signs. The segmentation algorithm implemented and used in this study reached an average accuracy of 99.02% and an overlap index of 0.61 on a set of 180 preprocessed frames extracted from 18 videos recorded for the construction of the database. The segmentation algorithm was able to segment slightly more than 70% of the samples. Regarding word recognition accuracy, the proposed system correctly recognized 100% of the 422 word samples from the constructed database (the ones that were segmented), using a combination of the edit distance technique and a voting scheme with a binary classifier to carry out the recognition, thus successfully reaching the purpose of this work. LIBRAS língua brasileira de sinais processamento de imagens Reconhecimento de língua de sinais segmentação de pele humana brazilian sign language BSL human skin segmentation image processing Sign language recognition
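
The abstract names edit distance combined with a voting scheme as the recognition technique but gives no details. The Python sketch below shows the general idea only: it compares a symbol sequence against labeled references using Levenshtein distance and takes a majority vote among the closest ones. The nearest-neighbour vote stands in for the thesis's binary-classifier voting scheme, and the symbol strings, the function `recognize` and the parameter `k` are all hypothetical.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between two sequences (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution
        prev = curr
    return prev[-1]


def recognize(sample, references, k=3):
    """Majority vote among the k references closest to `sample`.

    `references` is a list of (label, sequence) pairs. The sequences are
    assumed to be symbol strings, e.g. quantized hand positions per frame.
    """
    ranked = sorted(references, key=lambda ref: edit_distance(sample, ref[1]))
    votes = Counter(label for label, _ in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    refs = [("casa", "ABCD"), ("casa", "ABCE"), ("agua", "XYZW")]
    print(recognize("ABCF", refs))
```

Edit distance tolerates signs performed at slightly different speeds or with a few misclassified frames, which is presumably why it suits dynamic signs.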
13	Recognition Of Human Face Expressions Ener, Emrah 01 September 2006 (has links) (PDF) In this study a fully automatic and scale invariant feature extractor which does not require manual initialization or special equipment is proposed. Face location and size is extracted using skin segmentation and ellipse fitting. Extracted face region is scaled to a predefined size, later upper and lower facial templates are used for feature extraction. Template localization and template parameter calculations are carried out using Principal Component Analysis. Changes in facial feature coordinates between analyzed image and neutral expression image are used for expression classification. Performances of different classifiers are evaluated. Performance of proposed feature extractor is also tested on sample video sequences. Facial features are extracted in the first frame and KLT tracker is used for tracking the extracted features. Lost features are detected using face geometry rules and they are relocated using feature extractor. As an alternative to feature based technique an available holistic method which analyses face without partitioning is implemented. Face images are filtered using Gabor filters tuned to different scales and orientations. Filtered images are combined to form Gabor jets. Dimensionality of Gabor jets is decreased using Principal Component Analysis. Performances of different classifiers on low dimensional Gabor jets are compared. Feature based and holistic classifier performances are compared using JAFFE and AF facial expression databases.
14	Human skin segmentation using correlation rules on dynamic color clustering / Segmentação de pele humana usando regras de correlação baseadas em agrupamento dinâmico de cores Rodrigo Augusto Dias Faria 31 August 2018 (has links) Human skin is made of a stack of different layers, each of which reflects a portion of impinging light, after absorbing a certain amount of it by the pigments which lie in the layer. The main pigments responsible for skin color origins are melanin and hemoglobin. Skin segmentation plays an important role in a wide range of image processing and computer vision applications. In short, there are three major approaches for skin segmentation: rule-based, machine learning and hybrid. They differ in terms of accuracy and computational efficiency. Generally, machine learning and hybrid approaches outperform the rule-based methods but require a large and representative training dataset and, sometimes, costly classification time as well, which can be a deal breaker for real-time applications. In this work, we propose an improvement, in three distinct versions, of a novel method for rule-based skin segmentation that works in the YCbCr color space. Our motivation is based on the hypotheses that: (1) the original rule can be complemented and, (2) human skin pixels do not appear isolated, i.e. neighborhood operations are taken into consideration. The method is a combination of some correlation rules based on these hypotheses. Such rules evaluate the combinations of chrominance Cb, Cr values to identify the skin pixels depending on the shape and size of dynamically generated skin color clusters. The method is very efficient in terms of computational effort as well as robust in very complex images. / A pele humana é constituída de uma série de camadas distintas, cada uma das quais reflete uma porção de luz incidente, depois de absorver uma certa quantidade dela pelos pigmentos que se encontram na camada. 
Os principais pigmentos responsáveis pela origem da cor da pele são a melanina e a hemoglobina. A segmentação de pele desempenha um papel importante em uma ampla gama de aplicações em processamento de imagens e visão computacional. Em suma, existem três abordagens principais para segmentação de pele: baseadas em regras, aprendizado de máquina e híbridos. Elas diferem em termos de precisão e eficiência computacional. Geralmente, as abordagens com aprendizado de máquina e as híbridas superam os métodos baseados em regras, mas exigem um conjunto de dados de treinamento grande e representativo e, por vezes, também um tempo de classificação custoso, que pode ser um fator decisivo para aplicações em tempo real. Neste trabalho, propomos uma melhoria, em três versões distintas, de um novo método de segmentação de pele baseado em regras que funciona no espaço de cores YCbCr. Nossa motivação baseia-se nas hipóteses de que: (1) a regra original pode ser complementada e, (2) pixels de pele humana não aparecem isolados, ou seja, as operações de vizinhança são levadas em consideração. O método é uma combinação de algumas regras de correlação baseadas nessas hipóteses. Essas regras avaliam as combinações de valores de crominância Cb, Cr para identificar os pixels de pele, dependendo da forma e tamanho dos agrupamentos de cores de pele gerados dinamicamente. O método é muito eficiente em termos de esforço computacional, bem como robusto em imagens muito complexas. Agrupamento dinâmico de cores Detecção de pele Modelo de cores YCbCr Regras de correlação Segmentação de pele humana Correlation rules Dynamic color clustering Human skin segmentation Skin detection YCbCr color model
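
To make the rule-based approach in the YCbCr color space concrete, the Python sketch below converts an RGB pixel to YCbCr with the full-range BT.601 (JFIF) equations and applies a fixed Cb/Cr range test. It is a baseline illustration only: the default ranges are commonly cited values used here as assumptions, and the thesis's correlation rules on dynamically generated clusters are not reproduced. The names `rgb_to_ycbcr` and `is_skin` are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF) conversion for 8-bit RGB values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_skin(r, g, b, cb_range=(77, 127), cr_range=(133, 173)):
    """Fixed-range chrominance rule; the ranges are illustrative defaults.

    Luminance Y is ignored, so the rule depends only on Cb and Cr.
    """
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return (cb_range[0] <= cb <= cb_range[1]
            and cr_range[0] <= cr <= cr_range[1])


if __name__ == "__main__":
    print(is_skin(224, 172, 138))  # a light skin tone
    print(is_skin(0, 0, 255))      # pure blue
```

A per-pixel rule like this is cheap, which matches the abstract's emphasis on computational efficiency, but it ignores neighboring pixels; the second hypothesis in the abstract addresses exactly that limitation.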
15	Investigation of New Techniques for Face detection Abdallah, Abdallah Sabry 18 July 2007 (has links) The task of detecting human faces within either a still image or a video frame is one of the most popular object detection problems. For the last twenty years researchers have shown great interest in this problem because it is an essential pre-processing stage for computing systems that process human faces as input data. Example applications include face recognition systems, vision systems for autonomous robots, human computer interaction systems (HCI), surveillance systems, biometric based authentication systems, video transmission and video compression systems, and content based image retrieval systems. In this thesis, non-traditional methods are investigated for detecting human faces within color images or video frames. The attempted methods are chosen such that the required computing power and memory consumption are adequate for real-time hardware implementation. First, a standard color image database is introduced in order to accomplish fair evaluation and benchmarking of face detection and skin segmentation approaches. Next, a new pre-processing scheme based on skin segmentation is presented to prepare the input image for feature extraction. The presented pre-processing scheme requires relatively low computing power and memory needs. Then, several feature extraction techniques are evaluated. This thesis introduces feature extraction based on Two Dimensional Discrete Cosine Transform (2D-DCT), Two Dimensional Discrete Wavelet Transform (2D-DWT), geometrical moment invariants, and edge detection. It also attempts to construct a hybrid feature vector by the fusion between 2D-DCT coefficients and edge information, as well as the fusion between 2D-DWT coefficients and geometrical moments. A self organizing map (SOM) based classifier is used within all the experiments to distinguish between facial and non-facial samples. 
Two strategies are tried to make the final decision from the output of a single SOM or multiple SOM. Finally, an FPGA based framework that implements the presented techniques, is presented as well as a partial implementation. Every presented technique has been evaluated consistently using the same dataset. The experiments show very promising results. The highest detection rate of 89.2% was obtained when using a fusion between DCT coefficients and edge information to construct the feature vector. A second highest rate of 88.7% was achieved by using a fusion between DWT coefficients and geometrical moments. Finally, a third highest rate of 85.2% was obtained by calculating the moments of edges. / Master of Science Self Organized Map (SOM) Edge Detection Hybrid Feature Vector Fusion Discrete Cosine Transform (DCT) Geometrical Moments Face Detection Skin Segmentation Discrete Wavelet Transform (DWT)
16	Sistema de visão computacional para detecção do uso de telefones celulares ao dirigir / A computer vision system for detecting use of mobile phones while driving Berri, Rafael Alceste 21 February 2014 (has links) Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / In this work, three systems were developed that use a frontal camera to monitor the driver and identify whether a cell phone is being used while driving. It is estimated that 80% of crashes and 65% of near collisions involved drivers who were inattentive in traffic for three seconds before the event. Five videos in a real environment were recorded to test the systems. The pattern recognition system (RP) uses adaptive skin segmentation, feature extraction, and machine learning to detect cell phone usage in each frame. Cell phone use is detected when, in periods of 3 seconds, 60% (threshold) or more of the frames are individually identified as cell phone use. The average accuracy achieved on the videos was 87.25% with a Multilayer Perceptron (MLP), Gaussian activation function, and two neurons in the intermediate layer. The movement detection system (DM) uses optical flow, filtering of the most relevant movements in the scene, and three successive frames to detect the movements of bringing the phone to the ear and taking it away. The DM proposal did not prove to be an effective solution for detecting cell phone use, reaching an accuracy of 52.86%. The third solution is a hybrid system. It uses the RP system for classification and the DM for choosing the RP parameters. The parameters chosen for RP are the threshold and the classification system. The definition of these two parameters occurs at the end of each period, based on movement detected by the DM. 
Experimentally it was established that, when the movement suggests cell phone use, it is appropriate to use a threshold of 60% and an MLP/Gaussian classifier with seven neurons in the intermediate layer; otherwise, a threshold of 85% and an MLP/Gaussian classifier with two neurons in the intermediate layer are used. The hybrid solution is the most robust system, with an average accuracy of 91.68% in a real environment. / Neste trabalho, são desenvolvidas três propostas de sistemas que permitem identificar o uso de celular, durante o ato de dirigir um veículo, utilizando imagens capturadas de uma câmera posicionada em frente ao motorista. Estima-se que 80% das colisões e 65% das quase colisões envolveram motoristas que não estavam prestando a devida atenção ao trânsito por três segundos antes do evento. Cinco vídeos em ambiente real foram gerados com o intuito de testar os sistemas. A proposta de reconhecimento de padrões (RP) emprega segmentação de pele adaptativa, extração de características e aprendizado de máquina (classificador) na detecção do celular em cada quadro processado. A detecção do uso do celular ocorre quando, em períodos de 3 segundos, ao menos em 60% dos quadros (corte) são identificados com celular. A acurácia média nos vídeos alcançou 87,25% ao utilizar Perceptron Multi-camadas (MLP) com função de ativação gaussiana e dois neurônios na camada intermediária como classificador. A proposta de detecção de movimento (DM) utiliza o fluxo ótico, filtragem dos movimentos mais relevantes da cena e três quadros consecutivos para detectar os momentos de levar o celular ao ouvido e o retirá-lo. A aplicação do DM, como solução para detectar o uso do celular, não se demonstrou eficaz atingindo uma acurácia de 52,86%. A terceira proposta, uma solução híbrida, utiliza o sistema RP como classificador e o de DM como seu parametrizador. Os parâmetros escolhidos para o sistema de RP são o corte e o sistema classificador. 
A definição desses dois parâmetros ocorre ao final de cada período, baseada na movimentação detectada pela DM. Com experimentações definiu-se que, caso a movimentação induza ao uso do celular, é adequado o uso do corte de 60% e o classificador MLP/Gaussiana com sete neurônios na camada intermediária, caso contrário, utiliza-se o corte de 85% e classificador MLP/Gaussiana com dois neurônios na mesma camada. A versão híbrida é a solução desenvolvida mais robusta, atingindo a melhor acurácia média de 91,68% em ambiente real. Algoritmo genético Aprendizado de máquina Distração do motorista Fluxo ótico Máquina de vetor de suporte Perceptron multicamada Reconhecimento de gestos Segmentação de pele Telefones celulares Visão computacional Cell phones Computer vision Driver distraction Genetic algorithm Gesture recognition Machine learning Multilayer perceptron Optical flow Skin segmentation Support vector machines
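
The period-level decision described in this abstract (cell phone use is flagged when at least 60% of the frames in a 3-second period are individually classified as phone use) can be sketched in Python as follows. The function name `detect_phone_use`, the use of non-overlapping periods and the handling of a trailing partial period are assumptions; the abstract does not specify them.

```python
def detect_phone_use(frame_flags, fps, period_seconds=3, threshold=0.60):
    """Return one decision per complete, non-overlapping period.

    `frame_flags` holds one boolean per frame (True means the per-frame
    classifier reported cell phone use). A trailing partial period is
    ignored, which is an assumption of this sketch.
    """
    frames_per_period = int(fps * period_seconds)
    if frames_per_period < 1:
        raise ValueError("fps * period_seconds must cover at least one frame")
    decisions = []
    last_start = len(frame_flags) - frames_per_period
    for start in range(0, last_start + 1, frames_per_period):
        window = frame_flags[start:start + frames_per_period]
        decisions.append(sum(window) / frames_per_period >= threshold)
    return decisions


if __name__ == "__main__":
    # Two periods at 10 fps: 20 of 30 positive frames, then 15 of 30.
    flags = [True] * 20 + [False] * 10 + [True] * 15 + [False] * 15
    print(detect_phone_use(flags, fps=10))
```

In the hybrid system described above, the same function could be called with `threshold=0.85` for periods in which the movement detector sees no phone-related motion, since the abstract reports switching between the 60% and 85% thresholds at the end of each period.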
