1 |
Alignement du chant par rapport à une référence audio en temps réel / Real-time alignment of singing to an audio reference
Julien, Eric, January 2013
With a view to building a karaoke system that modifies an a cappella sung performance in real time, the singer must be located relative to a reference so that the target of a voice-modification algorithm can be determined. For such a system to work well, the alignment algorithm must exploit the specific characteristics of the voice as fully as possible, rely on information tied to the pronounced text rather than on the artistic aspects of singing, run in real time and offer the lowest possible latency. To meet these objectives, an alignment system based on Dynamic Time Warping (DTW) was developed. A simple real-time adaptation of the ordinary DTW algorithm that meets the stated objectives is proposed and compared with other approaches reported in the literature. This adaptation gave better results than the other techniques tested. A comparative study of three types of spectral analysis commonly used in automatic speech recognition systems was carried out in the specific setting of a singing-voice alignment algorithm. The coefficients evaluated are the Mel-Frequency Cepstrum Coefficients (MFCC), the Warped Discrete Cosine Transform Coefficients (WDCTC) and the coefficients of Perceptual Linear Prediction (PLP) analysis. The results indicate better performance for PLP analysis. Applying a piecewise-linear transformation function to the instantaneous cost matrices makes the alignment as easy as possible to distinguish in the computed cumulative cost matrices. The parameters of the transformation function can be obtained by closed-loop optimization using direct pattern search. An objective function that avoids discontinuities in the mean squared alignment error is developed.
Several cost matrices can be combined by taking a weighted sum of the transformed instantaneous cost matrices of each parameter considered. The weighting is also obtained by optimization. Several combinations are compared: the best results are obtained by combining PLP analysis, the energy level and their derivatives. The mean deviation from the reference alignment is on the order of 50 ms, with a standard deviation of about 75 ms for the sequences tested. Perspectives are given for improving the convergence of the algorithm on pairs of audio sequences that are difficult to align and for obtaining better cost matrices by using other local constraints, by integrating new parameters such as pitch, or by using a database of segmented singing voice to optimize a distance measure.
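The cost-matrix pipeline described in this abstract (piecewise-linear transformation, weighted sum, cumulative cost) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis implementation: the function names, the breakpoints of the transformation and the weights are hypothetical, and the standard three-neighbour DTW recurrence stands in for whatever local constraints the author actually used.

```python
def piecewise_linear(c, x0, x1, y0, y1):
    """Map a cost through a piecewise-linear curve (hypothetical breakpoints):
    costs <= x0 map to y0, costs >= x1 map to y1, linear in between."""
    if c <= x0:
        return y0
    if c >= x1:
        return y1
    return y0 + (y1 - y0) * (c - x0) / (x1 - x0)


def combine(cost_matrices, weights, breakpoints):
    """Weighted sum of transformed instantaneous cost matrices.
    One matrix per feature (e.g. PLP, energy); all share the same shape."""
    rows, cols = len(cost_matrices[0]), len(cost_matrices[0][0])
    combined = [[0.0] * cols for _ in range(rows)]
    for matrix, w, bp in zip(cost_matrices, weights, breakpoints):
        for i in range(rows):
            for j in range(cols):
                combined[i][j] += w * piecewise_linear(matrix[i][j], *bp)
    return combined


def accumulate(cost):
    """Cumulative cost matrix with the classic DTW recurrence
    (steps from the left, lower and lower-left neighbours)."""
    rows, cols = len(cost), len(cost[0])
    inf = float("inf")
    acc = [[inf] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if i == 0 and j == 0:
                best = 0.0
            else:
                best = min(
                    acc[i - 1][j] if i > 0 else inf,
                    acc[i][j - 1] if j > 0 else inf,
                    acc[i - 1][j - 1] if i > 0 and j > 0 else inf,
                )
            acc[i][j] = cost[i][j] + best
    return acc
```

In the thesis the breakpoints and weights are obtained by optimization; here they would simply be passed in by the caller, for example `combine([plp_cost, energy_cost], [0.7, 0.3], [(0.2, 0.8, 0.0, 1.0)] * 2)` with made-up values.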
|
2 |
Design and Realization of the Gesture-Interaction System Based on Kinect
Xu, Jie, January 2014
In the past 20 years humans have mostly used a mouse to interact with computers. However, with the rapidly growing use of computers, a need for alternative means of interaction has emerged. The advent of Kinect introduced a brand-new way of human-computer interaction. It allows the use of gestures, the most natural body language, to communicate with computers, freeing users from traditional constraints and providing an intuitive method for executing operations. This thesis presents the design and implementation of a program that lets people interact with computers without the traditional mouse, using a Kinect device (an XNA Game framework with Microsoft Kinect SDK v1.7). For dynamic gesture recognition, the Hidden Markov Model (HMM) and Dynamic Time Warping (DTW) are considered. The choice of DTW is motivated by experimental analysis. A dynamic-gesture-recognition program based on DTW is developed to help computers recognize gestures customized by users. The experiment also shows that DTW can achieve rather good performance. As further development, the use of the XNA Game 4.0 framework, which integrates Kinect body tracking with DTW gesture recognition, is introduced. Finally, a functional test is conducted on the interaction system. In addition to summarizing the results, the thesis discusses what can be improved in the future.
|
3 |
From Time series signal matching to word spotting in multilingual historical document images / De la mise en correspondance de séries temporelles au word spotting dans les images de documents historiques multilingues
Mondal, Tanmoy, 18 December 2015
This thesis deals with sequence matching techniques applied to word spotting (locating keywords in document images without interpreting the content). Several sequence matching techniques exist in the literature, but very few of them have been evaluated in the context of word spotting. The thesis begins with a comparative study of these methods for word spotting on several datasets of historical document images. After analyzing these approaches, we propose a new algorithm, called Flexible Sequence Matching (FSM), which combines most of the advantages offered separately by the previously explored sequence matching algorithms. FSM is able to skip outliers in the target sequence, whether they occur at the beginning, at the end or in the middle of the target sequence. Moreover, it can perform one-to-one, one-to-many and many-to-one correspondences between the query and target sequences without considering noisy elements in the target sequence. We then extend these characteristics to the query sequence by defining a new algorithm, Exemplary Sequence Cardinality (ESC). Finally, we propose an alternative word matching technique that uses inexact matching of chain codes (shape codes) describing the words.
|
4 |
Detekce klíčových slov v mluvené řeči / Keyword spotting
Zemánek, Tomáš, January 2011
This thesis aims to design a keyword detector. It describes the methods used for this purpose and presents the design of an algorithm for keyword detection. The proposed detector is based on Dynamic Time Warping (DTW). The problem was analyzed using a module written in ANSI C, which was created as part of the thesis. The results of the detector were evaluated using the metrics WER (word error rate) and AUC (area under the curve).
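One common way to use DTW for keyword detection is subsequence DTW, in which the keyword template may start and end anywhere in a longer utterance. The abstract does not say which DTW variant the detector uses, so the following is a minimal sketch under that assumption; it is written in Python rather than the thesis's ANSI C, the function name is hypothetical, and scalar values stand in for real speech feature vectors.

```python
def subsequence_dtw(query, target, dist=lambda a, b: abs(a - b)):
    """Return (cost, end_index) of the best match of `query` inside `target`.

    The match may start at any target frame (free start) and end at any
    target frame (free end); every query frame must be used.
    """
    n, m = len(query), len(target)
    inf = float("inf")
    # First query frame: starting anywhere in the target costs nothing extra.
    prev = [dist(query[0], target[j]) for j in range(m)]
    for i in range(1, n):
        curr = [inf] * m
        for j in range(m):
            best = prev[j]  # same target frame, next query frame
            if j > 0:
                best = min(best, prev[j - 1], curr[j - 1])
            curr[j] = dist(query[i], target[j]) + best
        prev = curr
    # Free end: pick the target frame where the accumulated cost is lowest.
    end = min(range(m), key=prev.__getitem__)
    return prev[end], end
```

A detector built on this would compare the returned cost (typically normalized by the query length) against a threshold; sweeping that threshold is what yields a detection curve from which an AUC figure can be computed.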
|
5 |
Rastreamento e reconhecimento de movimentos de punho na execução de excertos musicais ao piano: uma abordagem com MD-DTW (Multi-Dimensional Dynamic Time Warping) / Tracking and recognition of wrist movements in the performance of musical excerpts at the piano
Carvalho, Thyago Peres, 09 October 2015
Submitted by Luciana Ferreira (lucgeral@gmail.com) on 2016-02-18T08:08:52Z
No. of bitstreams: 2
Dissertação - Thyago Peres Carvalho - 2015.pdf: 7596482 bytes, checksum: ecea89258dff2ed3c5eb4038bb4f3967 (MD5)
license_rdf: 23148 bytes, checksum: 9da0b6dfac957114c6a7714714b86306 (MD5)
Approved for entry into archive by Luciana Ferreira (lucgeral@gmail.com) on 2016-02-18T08:10:31Z (GMT)
Made available in DSpace on 2016-02-18T08:10:31Z (GMT)
Previous issue date: 2015-10-09 / Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES
This work proposes a method to support the teaching and learning of wrist gestures in piano performance, based on tracking and recognizing wrist movements during the execution of musical excerpts at the piano. To this end, a system was built using computer vision techniques. It presents the student with videos produced to verify the learner's execution of an exercise and provides performance-related data. The system also uses the same computer vision techniques to let the tutor generate the exercises proposed for a class, so that it supports the production of teaching material as well. To recognize gestures, the system uses a regular low-cost webcam; wrist movements are detected and tracked from a colored marker on the back of the musician's hand. The tool was developed with the Multidimensional Dynamic Time Warping (MD-DTW) algorithm, an n-dimensional version of Dynamic Time Warping (DTW). Three rounds of experiments were then performed. The first adjusted the system parameters using videos of the excerpts performed by an expert instructor. The second and third assessed, respectively, the learning gain of piano students using the proposed method and the usability of the system. The experiments were carried out with volunteers who could read music, without requiring any minimum level of piano technique. The results showed that the method is able to detect and recognize gestures successfully and that the volunteers achieved a learning gain in the medium range, which suggests that the method is promising. In addition, the usability test indicated that the implemented interface is adequate and obtained good satisfaction results among the volunteers. The method and the proposed prototype therefore demonstrate the potential of these tools for passing on techniques, such as musical performance gestures, in a piano teaching-learning environment.
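To illustrate what an n-dimensional DTW looks like, here is a minimal sketch that aligns two sequences of tracked marker positions. It assumes a Euclidean distance between frames and the classic three-neighbour recurrence; the abstract does not specify the frame distance, normalization or step pattern actually used, and the function name is hypothetical.

```python
import math


def md_dtw(seq_a, seq_b):
    """DTW distance between two sequences of n-dimensional points.

    Each element of seq_a / seq_b is a tuple such as (x, y) for the
    tracked marker position in one video frame.
    """
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    # acc[i][j]: cost of the best alignment of the first i and j frames.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])  # Euclidean frame distance
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]
```

A student gesture could then be labelled with the reference gesture whose template gives the smallest distance, which tolerates differences in playing speed between student and instructor.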
|
6 |
Robustní detekce klíčových slov v řečovém signálu / Robust detection of keywords in speech signal
Vrba, Václav, January 2014
The master's thesis is divided into two parts, theoretical and practical. The theoretical part focuses on methods for the analysis and detection of speech signals. In the practical part, a system for isolated word recognition was created in Matlab. The system is speaker-independent, separately for men and for women. Two speech databases were also created for further use in an aircraft cockpit. Tests and evaluations were also performed with added noise.
|
7 |
Méthodologie pour la détection de défaillance des procédés de fabrication par ACP : application à la production de dispositifs semi-conducteurs / PCA Methodology for Production Process Fault Detection: Application to Semiconductor Manufacturing Processes
Thieullen, Alexis, 09 July 2014
This thesis develops a fault detection methodology for semiconductor manufacturing equipment. The proposed approach relies on Principal Component Analysis (PCA) to build a model representative of the nominal operation of a piece of equipment. The method exploits all available measurements collected from the internal and external equipment sensors during manufacturing operations, for each processed wafer. The industrial context and processes raise two additional problems. First, the signals collected from the sensors differ in length, or duration, which is a limitation for PCA. Second, synchronization and alignment must be considered: semiconductor manufacturing equipment is mostly dynamic, with strong temporal correlations between sensor measurements throughout a process. To address the first point, we developed a data preprocessing module that transforms raw sensor data into a dataset suitable for PCA while identifying outlying data and products that would disturb the PCA model. This step is based on expert knowledge, statistical analysis and Dynamic Time Warping, a well-known algorithm from signal processing. To address the second point, we combine extensions of linear PCA, namely multiway PCA, filtered PCA and recursive PCA, so as to adapt the model to the characteristics of the systems. An exponentially weighted moving average (EWMA) filter accounts for the process dynamics during an operation. Recursive PCA adapts the model to changes in equipment behavior after specific events such as maintenance or a restart. All the steps of the methodology are illustrated with real data from a chemical vapor deposition tool operated at the STMicroelectronics Rousset fab. Finally, the efficiency and industrial interest of the proposed methodology are verified on multiple equipment types over longer operating periods.
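The exponentially weighted moving average filter mentioned in this abstract can be sketched as follows. This is the textbook recursive form and only an assumption about how the filter is applied in the thesis; the smoothing factor `alpha` and the function name are hypothetical.

```python
def ewma(samples, alpha=0.3):
    """Exponentially weighted moving average of a sensor signal.

    s[0] = x[0];  s[t] = alpha * x[t] + (1 - alpha) * s[t-1]
    A larger alpha follows the raw signal more closely; a smaller one
    gives more weight to the past and smooths more strongly.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    state = None
    for x in samples:
        state = x if state is None else alpha * x + (1.0 - alpha) * state
        smoothed.append(state)
    return smoothed
```

Because each filtered sample depends on the previous ones, feeding filtered signals to PCA lets an otherwise static model reflect some of the temporal correlation within a process run.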
|
8 |
Generalized k-means-based clustering for temporal data under time warp / Alignement temporel généralisé pour la classification non supervisée de séries temporelles
Soheily-Khah, Saeid, 07 October 2016
Temporal alignment of multiple time series is an important unresolved problem in many scientific disciplines. Major challenges for an accurate temporal alignment include determining and modeling the common and differential characteristics of classes of time series. This thesis is motivated by recent work on extending Dynamic Time Warping (DTW) to align multiple time series in applications including speech recognition, curve matching, micro-array data analysis, temporal segmentation and human motion analysis. However, these DTW-based approaches suffer from several limitations: 1) they address the alignment of two time series regardless of the remaining time series; 2) they weight the features of the multiple time series uniformly; 3) they align the time series globally, using all observations. The aim of this thesis is to explore a generalized dynamic time warping for time series clustering. The work first addresses the problem of prototype extraction, then the alignment of multiple and multidimensional time series.
|
9 |
Desenvolvimento de uma técnica computacional de processamento espaço-temporal aplicada em séries de precipitação / Development of a computational spatio-temporal processing technique applied to precipitation series
Guarienti, Gracyeli Santos Souza, 27 May 2015
Submitted by Jordan (jordanbiblio@gmail.com) on 2017-05-04T13:38:27Z
No. of bitstreams: 1
DISS_2015_Gracyeli Santos Souza Guarienti.pdf: 4160382 bytes, checksum: 066e507b4df1c012a091983043416a9b (MD5)
Approved for entry into archive by Jordan (jordanbiblio@gmail.com) on 2017-05-04T15:41:01Z (GMT)
Made available in DSpace on 2017-05-04T15:41:01Z (GMT)
Previous issue date: 2015-05-27 / CAPES
Climatological variables can be studied through their temporal behavior. This work therefore developed a computational technique for spatio-temporal processing of climatological variables that uses similarity search and allows comparison at several temporal resolutions. To demonstrate the technique and verify the results, processing sequences were applied to precipitation series covering a period of fifteen years, using the Dynamic Time Warping (DTW) algorithm and wavelets, in four biomes: Amazon, Cerrado, Pantanal and Atlantic Forest. The technique was applied to the original series and to their wavelets at monthly, semi-annual, annual and fifteen-year temporal resolutions, so that analyses specific to each resolution can be applied and the data can be visualized and compared across these scales. The flexibility and variety of temporal resolutions allowed by the technique make it possible to bring new perspectives to decision-making in environmental monitoring processes.
|
10 |
Shluková analýza v oblasti biosignálů / Cluster analysis in biosignal processing
Kalous, Stanislav, January 2008
This diploma thesis deals with cluster analysis for clustering long-term electrocardiograms (ECG). Linear filtering is used for ECG preprocessing. Segmentation of the ECG signal into individual heart cycles is based on detection of the QRS complex, followed by the application of dynamic time warping algorithms. To apply all of these processes and interpret the results, a program called Cluster analysis was created in the Matlab environment. The results of this thesis confirm that cluster analysis is able to distinguish cardiac arrhythmias whose shape differs distinctly from that of normal heart cycles.
|