41 |
Um sistema para detecção e reconhecimento de face em vídeo utilizando a transformada cosseno discreta / A system for face detection and recognition in video using the discrete cosine transform
Omaia, Derzu 27 August 2009
Previous issue date: 2009-08-27 / Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
The human face has a very complex and variable pattern, which makes face detection and recognition a challenging problem. These operations have a broad field of application, mainly in security: authorization for physical and logical access, people tracking, and real-time authentication. Beyond security, face detection and recognition can also support other applications, such as human-computer interaction and virtual reality.
Researchers have proposed and developed many face detection and recognition methods, continually pursuing greater precision and efficiency. Face detectors and recognizers with accuracy exceeding 95% are now available, as are commercial systems.
This work presents a study of existing face detection and recognition methods. It also analyzes the possibility of developing a new face detection method using Prediction by Partial Match (PPM), entropy, and the Discrete Cosine Transform (DCT). A new face recognition method based on the DCT is then proposed. Finally, an architecture for a face detection and recognition system in video is presented. To validate the architecture, the proposed system was implemented using one of the best detectors in the literature and the recognizer produced in this work.
Several experiments were performed, and both the face detector used and the recognizer developed proved effective, achieving success rates comparable to those of the most current methods.
|
42 |
Characterization of the Voice Source by the DCT for Speaker Information
Abhiram, B January 2014
Extracting speaker-specific information from speech is of great interest to researchers and developers alike, since speaker recognition technology finds application in a wide range of areas, chief among them forensics and biometric security systems.
Several models and techniques have been employed to extract speaker information from the speech signal. Speech production is generally modeled as an excitation source followed by a filter. Physiologically, the source corresponds to the vocal fold vibrations and the filter corresponds to the spectrum-shaping vocal tract. Vocal tract-based features such as the mel-frequency cepstral coefficients (MFCCs) and linear prediction cepstral coefficients have been shown to contain speaker information. However, high-speed videos of the larynx show that the vocal folds of different individuals vibrate differently. Voice source (VS)-based features have also been shown to perform well in speaker recognition tasks, revealing that the VS does contain speaker information. Moreover, a combination of vocal tract and VS-based features has been shown to improve performance, showing that the latter contain supplementary speaker information.
In this study, the focus is on extracting speaker information from the VS. The existing techniques for doing so are reviewed, and it is observed that features obtained by fitting a time-domain model to the VS perform worse than those obtained by simple transformations of the VS. Here, an alternative way of characterizing the VS to extract speaker information is proposed, and the merits and shortcomings of the proposed speaker-specific features are studied.
The VS cannot be measured directly. Thus, to characterize the VS, we first need an estimate of the VS, and the integrated linear prediction residual (ILPR) extracted from the speech signal is used as the VS estimate in this study. The voice source linear prediction model, which was proposed in an earlier study to obtain the ILPR, is used in this work.
It is hypothesized here that a speaker’s voice may be characterized by the relative proportions of the harmonics present in the VS. The pitch synchronous discrete cosine transform (DCT) is shown to capture these, and the gross shape of the ILPR in a few coefficients. The ILPR and hence its DCT coefficients are visually observed to distinguish between speakers. However, it is also observed that they do have intra-speaker variability, and thus it is hypothesized that the distribution of the DCT coefficients may capture speaker information, and this distribution is modeled by a Gaussian mixture model (GMM).
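The idea that a few DCT coefficients of one pitch cycle capture its gross shape can be illustrated with a minimal sketch. It is not the thesis's implementation: the synthetic `cycle` below is a hypothetical stand-in for one pitch period of the ILPR, and the period length (64 samples) and the number of retained coefficients (12) are assumptions chosen only for illustration.

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a real sequence."""
    n = len(x)
    coeffs = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * s)
    return coeffs

def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(c)
    samples = []
    for i in range(n):
        s = math.sqrt(1.0 / n) * c[0]
        for k in range(1, n):
            s += math.sqrt(2.0 / n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
        samples.append(s)
    return samples

def truncation_error(cycle, num_coeffs):
    """Relative L2 error when only the first num_coeffs coefficients are kept."""
    coeffs = dct2(cycle)
    kept = coeffs[:num_coeffs] + [0.0] * (len(cycle) - num_coeffs)
    approx = idct2(kept)
    err = math.sqrt(sum((a - b) ** 2 for a, b in zip(approx, cycle)))
    norm = math.sqrt(sum(b * b for b in cycle))
    return err / norm

# Hypothetical stand-in for one pitch period: a fundamental plus one harmonic.
PERIOD = 64
cycle = [math.sin(2 * math.pi * i / PERIOD) + 0.4 * math.sin(4 * math.pi * i / PERIOD)
         for i in range(PERIOD)]

# Feature vector for this cycle: the first 12 DCT coefficients (assumed count).
features = dct2(cycle)[:12]
```

Because the transform is orthonormal, the reconstruction error can only shrink as more coefficients are kept, so `truncation_error` gives a simple way to judge how many coefficients a given cycle shape needs.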
The DCT coefficients of the ILPR (termed the DCTILPR) are used directly as a feature vector in speaker identification (SID) tasks. Issues related to the GMM, such as the type of covariance matrix, are studied, and diagonal covariance matrices are found to perform better than full covariance matrices. Thus, mixtures of Gaussians with diagonal covariances are used as speaker models, and SID experiments on three standard databases show that the proposed DCTILPR features fare comparably with existing VS-based features. It is also found that the gross shape of the VS contains most of the speaker information, while the very fine structure of the VS does not help in distinguishing speakers and instead leads to more confusion between them. The major drawbacks of the DCTILPR are session and handset variability, but these also affect existing state-of-the-art speaker-specific VS-based features and the MFCCs, and hence seem to be common problems. Techniques exist to compensate for these variabilities, and they need to be used when systems based on these features are deployed in an actual application.
The DCTILPR is found to improve the SID accuracy of a system trained with MFCC features by 12%, indicating that the DCTILPR features capture speaker information that the MFCCs miss. It is also found that combining MFCC and DCTILPR features on a speaker verification task gives a significant performance improvement for short test utterances. On the whole, this study proposes an alternative way of extracting speaker information from the VS and adds to the evidence that the VS carries speaker information.
|
43 |
Modèles géométriques avec defauts pour la fabrication additive / Skin Model Shapes for Additive Manufacturing
Zhu, Zuowei 10 July 2019
The intricate error sources within the different stages of the Additive Manufacturing (AM) process raise major issues regarding the dimensional and geometrical accuracy of the manufactured product. Effective modeling of geometric deviations is therefore critical for AM. The Skin Model Shapes (SMS) paradigm offers a comprehensive framework for modeling deviations at different stages of the product lifecycle, and is thus a promising solution for deviation modeling in AM.
In this thesis, considering the layer-wise nature of AM, a new SMS framework is proposed that characterizes deviations in AM from in-plane and out-of-plane perspectives. The modeling of in-plane deviation aims to capture the variability of the 2D shape of each layer. A shape transformation perspective is proposed that maps the variational effects of deviation sources into affine transformations of the nominal shape. Under this assumption, a parametric deviation model is established in the polar coordinate system, which captures deviation patterns regardless of shape complexity. This model is further enhanced with a statistical learning capability so that it learns simultaneously from deviation data of multiple shapes and improves performance on all of them.
Out-of-plane deviation is defined as the deformation of a layer in the build direction. A layer-level investigation of out-of-plane deviation is conducted with a data-driven method. Based on deviation data collected from a number of finite element simulations, two modal analysis methods, the Discrete Cosine Transform (DCT) and Statistical Shape Analysis (SSA), are used to identify the most significant deviation modes in the layer-wise data. The effect of part and process parameters on the identified modes is further characterized with a Gaussian Process (GP) model.
The methods presented are finally used to obtain high-fidelity SMSs of AM products by deforming the nominal layer contours with the predicted deviations and rebuilding the complete non-ideal surface model from the deformed contours. A toolbox is developed in the MATLAB environment to demonstrate the effectiveness of the proposed methods.
|
44 |
Porovnání možností komprese multimediálních signálů / Comparison of Multimedia Signal Compression Possibilities
Špaček, Milan January 2013
This thesis compares options for compressing multimedia signals, focusing on video and advanced codecs. Specifically, it describes the encoding and decoding of video recordings according to the MPEG standard. The theoretical part describes the characteristic properties of the video signal and justifies the need for compression in recording and transmission. It also describes methods for eliminating redundancy and irrelevance in the encoded video signal, and discusses ways of measuring video signal quality. A separate chapter covers the characteristics of currently used and promising codecs. In the practical part, functions were created in the Matlab environment and integrated into a graphical user interface that simulates the functional blocks of the encoder and decoder. Based on user-specified input parameters, it encodes and decodes any given input composed of images in RGB format and displays the outputs of the individual functional blocks. The implemented algorithms cover the initial processing of the input sequence, including sub-sampling, as well as DCT, quantization, motion compensation, and their inverse operations. Separate chapters are dedicated to the implementation of the codec in the Matlab environment and to the outputs of the individual processing steps. The thesis also compares compression algorithms and examines the impact of parameter changes on the final signal. The findings are summarized in the conclusion.
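The initial processing step mentioned above (colour conversion followed by sub-sampling) can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's Matlab code: it uses the full-range BT.601 conversion as found in JFIF, whereas MPEG encoders commonly use a limited-range variant, and it assumes 4:2:0 sub-sampling by averaging each 2x2 block. The function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF-style) conversion for 8-bit RGB values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def subsample_420(plane):
    """Halve a plane in both directions by averaging each 2x2 block.

    Assumes the plane has an even number of rows and columns.
    """
    h, w = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1]
              + plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

def convert_frame(frame):
    """frame: rows of (r, g, b) tuples. Returns full-size Y, half-size Cb and Cr."""
    y_plane, cb_plane, cr_plane = [], [], []
    for row in frame:
        converted = [rgb_to_ycbcr(r, g, b) for (r, g, b) in row]
        y_plane.append([p[0] for p in converted])
        cb_plane.append([p[1] for p in converted])
        cr_plane.append([p[2] for p in converted])
    return y_plane, subsample_420(cb_plane), subsample_420(cr_plane)

# A uniform mid-grey 4x4 test frame.
frame = [[(128, 128, 128)] * 4 for _ in range(4)]
luma, cb, cr = convert_frame(frame)
```

For a 4x4 frame this keeps 16 luma samples and 4 samples per chroma plane, 24 values instead of the original 48, before any transform coding is applied.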
|
45 |
Digitální vodoznačení obrazu / Digital image watermarking
Číka, Petr January 2009
Digital image watermarking was developed to protect intellectual property rights to multimedia data. This thesis searches for alternative digital image watermarking methods. Its main aim is a detailed analysis of watermarking methods, particularly in the frequency domain, and the modification of these methods, with improved watermark extraction performance as one of the main goals. First, the common static image watermarking methods, possible attacks on watermarked data, and techniques for objectively measuring the quality of a watermarked image are briefly introduced. The next part describes techniques that work in the spatial domain, inserting the watermark into the least significant bits of an image in both the RGB and the YUV domain. The main part of the thesis describes modified and newly developed static image watermarking methods in the frequency domain. These methods use various transforms and error-correction codes, which increase watermark robustness. All the methods developed are tested in MATLAB, and the results are presented together with tables and graphs. The thesis closes with a comparison and evaluation of all the developed methods.
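The spatial-domain technique mentioned above, inserting the watermark into the least significant bits, can be sketched in a few lines. This is a generic textbook illustration, not one of the thesis's methods; the function names and the flat list of 8-bit channel values are assumptions.

```python
def embed_lsb(pixels, bits):
    """Write each watermark bit into the least significant bit of a pixel value."""
    if len(bits) > len(pixels):
        raise ValueError("watermark is longer than the cover data")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & 0xFE) | (bit & 1)  # clear the LSB, then set it
    return marked

def extract_lsb(pixels, count):
    """Read back the first count watermark bits."""
    return [p & 1 for p in pixels[:count]]

# Hypothetical 8-bit channel values and a short watermark.
cover = [200, 13, 54, 255, 0, 91, 128, 77]
watermark = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed_lsb(cover, watermark)
recovered = extract_lsb(marked, len(watermark))
```

Each value changes by at most one intensity level, which is why the mark is invisible, and also why it is fragile: any lossy recompression or filtering tends to destroy it, which motivates the frequency-domain methods the thesis concentrates on.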
|
46 |
Komprese dat / Data compression
Krejčí, Michal January 2009
This thesis deals with lossless and lossy methods of data compression and their possible applications in measurement engineering. The first part is a theoretical treatment that introduces the reader to the basic terminology, the reasons for data compression, the use of data compression in standard practice, and the classification of compression algorithms. The practical part deals with the implementation of compression algorithms in Matlab and LabWindows/CVI.
|
47 |
Výukový video kodek / Educational video codec
Dvořák, Martin January 2012
The first goal of this diploma thesis is to study the basic principles of video signal compression and to introduce the techniques used to reduce irrelevancy and redundancy in the video signal. The second goal is to use this knowledge to implement the individual compression tools in the Matlab programming environment and assemble a simple model of a video codec. The thesis describes the three basic blocks, namely interframe coding, intraframe coding, and variable-length coding, according to the MPEG-2 standard.
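Interframe coding relies on finding, for each block of the current frame, the best-matching block in a reference frame. A minimal sketch of full-search block matching with the sum of absolute differences (SAD) is given below. It is an assumption-laden illustration rather than the thesis's Matlab model: the function names, the 4x4 block size, the search range of 3 pixels, and the synthetic frames are all hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a current block and a displaced reference block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total

def best_motion_vector(cur, ref, bx, by, size, search):
    """Full search over displacements in [-search, search]; returns ((dx, dy), cost)."""
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that would fall outside the reference frame.
            if by + dy < 0 or by + dy + size > h or bx + dx < 0 or bx + dx + size > w:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost

# Synthetic 16x16 reference frame with a non-repeating pattern.
SIZE = 16
ref = [[(7 * x + 13 * y) % 251 for x in range(SIZE)] for y in range(SIZE)]
# Current frame: the reference content shifted 2 pixels right and 1 pixel down.
cur = [[ref[y - 1][x - 2] if y >= 1 and x >= 2 else 0 for x in range(SIZE)]
       for y in range(SIZE)]

vector, cost = best_motion_vector(cur, ref, bx=4, by=4, size=4, search=3)
```

Here the content moved by (+2, +1) between frames, so the block at (4, 4) in the current frame is found at displacement (-2, -1) in the reference frame with zero residual; an encoder would then transmit the vector and the (here empty) prediction error instead of the block itself.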
|
48 |
Wireless Networking in Future Factories: Protocol Design and Evaluation Strategies
Naumann, Roman 17 January 2020
As smart factory trends gain momentum, there is a growing need for robust information transmission protocols that make available the sensor information gathered by individual machines. Wireless transmission provides the flexibility required for industry adoption, but timely and reliable information delivery is hard to guarantee in challenging industrial environments. This work focuses on protocol design and evaluation for industrial applications. We first introduce the industrial use case, identify its requirements, and derive concrete design principles that protocols should implement. We then propose mechanisms that implement these principles for different types of protocols and that retain compatibility with existing networks and hardware to varying degrees. We show that use-case-tailored prioritization at the source is a powerful tool for achieving robustness against challenged connectivity, because it first conveys an accurate preview of information from the production process. We also derive precise bounds for the quality of that preview. Moving part of the computational work into the network, we show that reordering queues in accordance with our prioritization scheme improves fairness among machines. We also demonstrate that network coding can benefit our use case by introducing specialized encoding and decoding mechanisms. Last, we propose a novel software architecture and evaluation technique that allows possibly proprietary networking protocol implementations to be used within modern discrete event network simulators, which, among other benefits, makes adapting protocols to specific industrial use cases more cost-efficient. We demonstrate that our approach provides sufficient performance for practical use and improves the validity of evaluation results over the state of the art.
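For readers unfamiliar with network coding, the basic idea can be shown with the textbook XOR example below. It is not one of the specialized encoding or decoding mechanisms of the thesis, which the abstract does not detail; the two-machine relay scenario, the payloads, and the function name are assumptions made purely for illustration.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length payloads."""
    if len(a) != len(b):
        raise ValueError("payloads must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

# Two machines exchange readings through a relay (hypothetical payloads).
packet_a = b"temp=71.5"
packet_b = b"rpm=01200"

# Instead of forwarding both packets separately, the relay broadcasts one coded packet.
coded = xor_bytes(packet_a, packet_b)

# Each machine combines the coded packet with its own packet to recover the other one.
recovered_at_a = xor_bytes(coded, packet_a)
recovered_at_b = xor_bytes(coded, packet_b)
```

The relay sends one broadcast instead of two forwards, which is the kind of airtime saving that makes coding attractive on a shared and unreliable wireless medium.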
|