71

Query-by-Example Keyword Spotting

Skácel, Miroslav January 2015 (has links)
This master's thesis deals with modern approaches to keyword spotting and spoken term detection in speech data. The introductory part presents the problem and gives a theoretical description of the detection methods. A description of the representation of the input data sets used in the experiments and evaluation follows. Methods for detecting keywords defined by an example query are then introduced, followed by the evaluation methods and scoring techniques. After the experiments on the data sets have been carried out and evaluated, the results are discussed. In the next step, modern techniques for improving the detection system are proposed and implemented, and the evaluation and discussion of the achieved results are repeated. The final part summarizes the thesis and suggests further directions for the development of our system. The appendix contains a manual for the implemented scripts.
72

Sada JavaAppletů pro demonstraci zpracování řeči / Set of JavaApplets Demonstrations for Speech Processing

Kudr, Michal January 2010 (has links)
The goal of this thesis is to become familiar with the methods and techniques used in speech processing. Using the obtained knowledge, I propose three Java applets demonstrating selected methods. The thesis also contains a theoretical analysis of the selected problems.
73

Vyhodnocení příbuznosti organismů pomocí číslicového zpracování genomických dat / Evaluation of Organisms Relationship by Genomic Signal Processing

Škutková, Helena January 2016 (has links)
This dissertation deals with alternative techniques for analyzing the genetic information of organisms. The theoretical part presents two different approaches to evaluating the relationship between organisms based on the mutual similarity of the genetic information contained in their DNA sequences. The first approach is the currently standard phylogenetic analysis of character-based records of DNA sequences. Although this approach is computationally expensive due to the need for multiple sequence alignment, it allows evaluation of both global and local similarity of DNA sequences. The second approach is represented by techniques for classifying DNA sequences in the form of numerical vectors representing characteristic features of their genetic information. These methods, known as "alignment-free", allow fast evaluation of global similarity but cannot evaluate local changes. The new method presented in this dissertation combines the advantages of both approaches. It utilizes a numerical representation similar to a 1D digital signal, i.e. a representation that exhibits a specific trend along the x-axis. The experimental part of the dissertation deals with the design of a set of genomic signal processing tools that allow evaluation of the mutual similarity of taxonomically specific trends. On the basis of the mutual similarity of genomic signals, a classification in the form of a dendrogram is derived, which corresponds to the phylogenetic trees used in standard phylogenetics.
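The character-to-signal conversion described above can be illustrated with a generic mapping; the dissertation's own representation is not specified in this abstract, so the cumulative purine/pyrimidine walk below is only an assumed, common choice.

```python
# Minimal sketch of turning a character DNA sequence into a 1D numerical
# signal in the spirit of genomic signal processing. The purine (+1) /
# pyrimidine (-1) cumulative walk is an illustrative mapping, not necessarily
# the representation used in the dissertation.
import numpy as np

def dna_to_signal(sequence: str) -> np.ndarray:
    """Cumulative purine/pyrimidine walk along the sequence."""
    step = {"A": 1, "G": 1, "C": -1, "T": -1}
    increments = [step.get(base, 0) for base in sequence.upper()]
    return np.cumsum(increments)

signal = dna_to_signal("ATGCGTACGTTAGC")
print(signal)  # the trend along the x-axis reflects local base composition
```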
74

Identifikace cover verzí skladeb pomocí harmonických příznaků, modelu harmonie a harmonické složitosti / Cover Song Identification using Music Harmony Features, Model and Complexity Analysis

Maršík, Ladislav January 2019 (has links)
Title: Cover Song Identification using Music Harmony Features, Model and Complexity Analysis Author: Ladislav Maršík Department: Department of Software Engineering Supervisor: Prof. RNDr. Jaroslav Pokorný, CSc., Department of Software Engineering Abstract: Analysis of digital music and its retrieval based on audio features is one of the popular topics within the music information retrieval (MIR) field. Every musical piece has its characteristic harmony structure, but harmony analysis is seldom used for retrieval. Retrieval systems that do not focus on similarities in harmony progressions may consider two versions of the same song different, even though they differ only in instrumentation or the singing voice. This thesis takes various paths in exploring how music harmony can be used in MIR, and in particular in the cover song identification (CSI) task. We first create a music harmony model based on the knowledge of music theory. We define novel concepts: the harmonic complexity of a musical piece, as well as the chord and chroma distance features. We show how these concepts can be used for retrieval and complexity analysis, and how they compare with the state of the art of music harmony modeling. An extensive comparison of harmony features is then performed, using both the novel features and the...
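The chord and chroma distance features named above are the thesis's own contributions and are not defined in this abstract; purely as a generic baseline for what a chroma distance can look like, a cosine distance between 12-bin pitch-class profiles is sketched below.

```python
# Generic cosine distance between two 12-bin chroma (pitch-class) vectors,
# shown only as a baseline illustration; the novel distance features defined
# in the thesis itself are not reproduced here.
import numpy as np

def chroma_distance(u, v) -> float:
    """1 - cosine similarity between two chroma vectors (0 = identical profile)."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

c_major = np.array([1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0])  # pitch classes C, E, G
a_minor = np.array([1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0])  # pitch classes A, C, E
print(chroma_distance(c_major, a_minor))
```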
75

Automatic Annotation of Speech: Exploring Boundaries within Forced Alignment for Swedish and Norwegian / Automatisk Anteckning av Tal: Utforskning av Gränser inom Forced Alignment för Svenska och Norska

Biczysko, Klaudia January 2022 (has links)
In Automatic Speech Recognition, there is an extensive need for time-aligned data. Manual speech segmentation has been shown to be more laborious than manual transcription, especially when dealing with tens of hours of speech. Forced alignment is a technique for matching a signal with its orthographic transcription with respect to the duration of linguistic units. Most forced aligners, however, are language-dependent and trained on English data, whereas under-resourced languages lack both the resources to develop the acoustic model required for an aligner and manually aligned data. An alternative to training new models can be cross-language forced alignment, in which an aligner trained on one language is used to align data in another language. This thesis aimed to evaluate the state-of-the-art forced alignment algorithms available for Swedish and to test whether a Swedish model could be applied to aligning Norwegian. Three approaches to forced alignment were employed: (1) a forced aligner based on Dynamic Time Warping and text-to-speech synthesis, Aeneas; (2) two forced aligners based on Hidden Markov Models, namely the Munich AUtomatic Segmentation System (WebMAUS) and the Montreal Forced Aligner (MFA); and (3) the Connectionist Temporal Classification (CTC) segmentation algorithm with two pre-trained and fine-tuned Wav2Vec2 Swedish models. First, small speech test sets for Norwegian and Swedish, covering different degrees of spontaneity in the speech, were created and manually aligned to produce gold-standard alignments. Second, the aligners' performance on the Swedish data set was evaluated with respect to the gold standard. Finally, it was tested whether the Swedish forced aligners could be applied to aligning Norwegian data. The performance of the aligners was assessed by measuring the difference between the boundaries set in the gold standard and those of the comparison alignment. The accuracy was estimated by calculating the proportion of boundary differences below particular thresholds proposed in the literature. It was found that the performance of the CTC segmentation algorithm with Wav2Vec2 (VoxRex) was superior to the other forced alignment systems. The differences between the alignments of the two Wav2Vec2 models suggest that the training data may have a larger influence on the alignments than the architecture of the algorithm. At lower thresholds, the traditional HMM approach outperformed the deep learning models. Finally, the findings of the thesis demonstrate promising results for cross-language forced alignment using Swedish models to align related languages, such as Norwegian.
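The boundary evaluation described above reduces to measuring how far each predicted boundary falls from its gold-standard counterpart and reporting the proportion within a tolerance; a minimal sketch follows, where the threshold values and toy boundaries are illustrative assumptions rather than the thesis's actual figures.

```python
# Hedged sketch of the boundary evaluation described above: absolute deviation
# of each predicted boundary from the gold standard, plus the proportion of
# boundaries falling under each tolerance threshold. Thresholds and data are
# illustrative, not the values used in the thesis.
from typing import Dict, Sequence

def boundary_accuracy(gold: Sequence[float], predicted: Sequence[float],
                      thresholds: Sequence[float]) -> Dict[float, float]:
    """Proportion of boundaries whose deviation (in seconds) is within each threshold."""
    deviations = [abs(g - p) for g, p in zip(gold, predicted)]
    return {t: sum(d <= t for d in deviations) / len(deviations) for t in thresholds}

gold = [0.50, 1.20, 2.05, 3.40]   # manually aligned boundaries (seconds)
pred = [0.52, 1.31, 2.01, 3.38]   # forced-aligner output (seconds)
print(boundary_accuracy(gold, pred, thresholds=[0.01, 0.025, 0.05, 0.1]))
```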
76

Classification de transcrits d’ARN à partir de données brutes générées par le séquençage par nanopores / Classification of RNA transcripts from raw data generated by nanopore sequencing

Atanasova, Kristina 12 1900 (has links)
The impressive rate at which sequencing technologies are progressing is fueled by their promise to revolutionize healthcare and biomedical research. Nanopore sequencing has become an attractive technology for addressing the shortcomings of previous technologies, but also for expanding our knowledge of the transcriptome by generating long reads that simplify assembly and the detection of large structural variations. During the sequencing process, the nanopores measure electrical current signals representing the bases (A, C, G, T) moving through each nanopore. All nanopores simultaneously produce signals that can be analyzed in real time and translated into bases by the base-calling process. Despite the reduced sequencing cost and the portability of the sequencers, the base-calling error rate hampers their adoption in biomedical research. The aim of this thesis is to classify individual mRNA sequences into different groups of isoforms through the elucidation of common motifs in their raw signal. We propose to use the dynamic time warping (DTW) algorithm for sequence alignment, combined with nanopore technology, to bypass the base-calling process entirely. We explored new strategies to demonstrate the impact of different signal segments on signal classification. We performed comparative analyses to suggest parameters that increase classification performance and guide future analyses of raw nanopore sequencing data.
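The abstract above names dynamic time warping on the raw current signal as the core alignment step; a textbook DTW distance between two 1D signals is sketched below as a generic illustration, not the thesis's actual implementation or parameterization.

```python
# Minimal textbook DTW between two 1D signal fragments, as a sketch of the
# alignment step discussed above; not the thesis's actual implementation.
import numpy as np

def dtw_distance(x: np.ndarray, y: np.ndarray) -> float:
    """Classic O(len(x) * len(y)) dynamic time warping with absolute-difference cost."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return float(cost[n, m])

a = np.array([0.1, 0.3, 0.8, 0.7, 0.2])        # toy raw-signal fragments
b = np.array([0.1, 0.4, 0.8, 0.6, 0.6, 0.2])
print(dtw_distance(a, b))
```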
77

Détection des événements rares dans des vidéos / Detecting rare events in video sequences

Pop, Ionel 23 September 2010 (has links)
The growing volume of video data often makes it difficult, even impossible, to watch it all. In the context of automatic video analysis, a recurring request is to identify the moments in a video when something unusual happens. We propose several algorithms to identify unusual events, under the hypothesis that these events have a low probability. We address several types of events, from those generated by moving areas to those revealed by the trajectories of tracked objects. In the first part of the study, we build a simple tracking system. We propose several measures of similarity between trajectories. These measures estimate the similarity of trajectories by taking into account both spatial and temporal aspects, making it possible to differentiate between objects moving along the same path but at different speeds. Based on these measures, we build models of trajectories representing the common behavior of objects, so that we can identify those that are abnormal. We noticed that tracking yields poor results, especially in crowded scenes.
Therefore, we use the optical flow vectors to build a movement model based on a codebook. This model stores the preferred movement directions for each pixel, making it possible to identify abnormal movement at the pixel level without having to use a tracker. By using temporal coherence, we can further improve the detection rate, which is affected by errors in the optical flow estimation. In a second step, we change the way this model is built so that we can extract higher-level features (the equivalent of trajectories), but still without object tracking. In this situation, we can partially reuse the trajectory analysis to detect rare events. All aspects presented in this study have been implemented. In addition, we have designed some applications, such as predicting the trajectories of visible objects or storing and retrieving tracked objects in a database.
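The per-pixel codebook of preferred flow directions mentioned above can be sketched roughly as follows; the number of direction bins and the rarity threshold are illustrative assumptions, not values taken from the thesis.

```python
# Rough sketch of a per-pixel codebook of preferred optical-flow directions,
# in the spirit of the movement model described above. The bin count and the
# rarity threshold are illustrative assumptions, not the thesis's values.
import numpy as np

N_BINS = 8            # quantize flow direction into 8 sectors (assumption)
RARE_FRACTION = 0.05  # directions seen in < 5% of frames count as abnormal (assumption)

def direction_bins(flow: np.ndarray) -> np.ndarray:
    """flow: H x W x 2 array of (dx, dy) vectors -> H x W array of direction bins."""
    angles = np.arctan2(flow[..., 1], flow[..., 0])               # range [-pi, pi]
    return ((angles + np.pi) / (2 * np.pi) * N_BINS).astype(int) % N_BINS

def build_codebook(flows: list) -> np.ndarray:
    """Accumulate a per-pixel histogram of observed direction bins."""
    h, w, _ = flows[0].shape
    hist = np.zeros((h, w, N_BINS), dtype=np.int64)
    for flow in flows:
        hist += np.eye(N_BINS, dtype=np.int64)[direction_bins(flow)]  # one-hot per pixel
    return hist

def abnormal_mask(hist: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """True where the current direction was rarely observed at that pixel."""
    bins = direction_bins(flow)
    seen = np.take_along_axis(hist, bins[..., None], axis=2)[..., 0]
    total = hist.sum(axis=2).clip(min=1)
    return (seen / total) < RARE_FRACTION

rng = np.random.default_rng(0)
training_flows = [rng.normal(size=(4, 4, 2)) for _ in range(50)]
codebook = build_codebook(training_flows)
print(abnormal_mask(codebook, rng.normal(size=(4, 4, 2))))
```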
78

Intégration d'images multimodales pour la caractérisation de cardiomyopathies hypertrophiques et d'asynchronismes cardiaques / Multimodal image registration for the characterization of the hypertrophic cardiomyopathy and the cardiac asynchronism

Betancur Acevedo, Julian Andrés 27 May 2014 (has links)
This work concerns cardiac characterization, a major methodological and clinical issue, both to improve disease diagnosis and to optimize treatment. Multisensor registration and fusion methods are proposed to bring into a common referential data from cardiac magnetic resonance imaging (CMRI), dynamic cardiac X-ray computed tomography (CT), speckle tracking echocardiography (STE) and electro-anatomical mappings of the inner left ventricular chamber (EAM). These data are used to describe the heart by its anatomy, its electrical and mechanical function, and the state of the myocardial tissue. The methods proposed to register the multimodal data sets rely on two main processes: temporal registration and spatial registration. The temporal dimensions of the input images are warped with an adaptive dynamic time warping (ADTW) method, which handles the nonlinear temporal relationship between the different acquisitions.
Concerning the spatial registration, iconic methods were developed, on the one hand, to correct for motion artifacts in cine acquisitions, to register cine-CMRI with late gadolinium enhancement CMRI (LGE-CMRI), and to register cine-CMRI with dynamic CT. On the other hand, a contour-based method developed in previous work was enhanced to account for multiview STE acquisitions. These methods were evaluated on real data to select the most suitable metrics, to quantify the accuracy of the iconic approaches, and to assess the STE to cine-CMRI registration. The fusion of these multisensor data provided insights into the diseased heart in the context of hypertrophic cardiomyopathy (HCM) and cardiac asynchronism. For HCM, we aimed to improve the understanding of STE by fusing fibrosis information from LGE-CMRI with strain from multiview 2D STE. This analysis allowed us to assess the significance of regional STE strain as a surrogate for the presence of regional myocardial fibrosis. Concerning cardiac asynchronism, we aimed to describe the intra-segment electro-mechanical coupling of the left ventricle using fused data from STE, EAM, CT and, where relevant, LGE-CMRI. This feasibility study provided new elements for selecting the optimal sites for LV stimulation.
79

Rozpoznáváni standardních PILOT-CONTROLLER řídicích povelů v hlasové podobě / Voice recognition of standard PILOT-CONTROLLER control commands

Kufa, Tomáš January 2009 (has links)
The subject of this graduation thesis is the application of speech recognition to ATC commands. The choice of methods and approaches for the automatic recognition of ATC commands arises from a detailed study of air traffic. Because there is no single definitive solution in a field as extensive as speech recognition, this diploma thesis focuses on a speech recognizer based on comparison with templates (DTW). This recognizer is implemented and compared with the freely available HTK system from Cambridge University, which is based on statistical methods using hidden Markov models. The suitability of both methods is verified by practical testing and evaluation of the results.
80

Identifikace osob pomocí otisku hlasu / Identification of persons via voice imprint

Mekyska, Jiří January 2010 (has links)
This work deals with text-dependent speaker recognition in systems where only a few training samples exist. For this purpose, a voice imprint based on different features (e.g. MFCC, PLP, ACW) is proposed. The work first describes how the speech signal is produced and mentions some speech characteristics important for speaker recognition. The next part deals with speech signal analysis, covering preprocessing and feature extraction methods. The following part describes the speaker recognition process and the evaluation of the two tasks considered: speaker identification and verification. The last theoretical part deals with classifiers suitable for text-dependent recognition, namely classifiers based on fractional distances, dynamic time warping, dispersion matching and vector quantization. The work continues with the design and implementation of a system that evaluates all the described classifiers on voice imprints based on the different features.
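The features listed above (MFCC, PLP, ACW) are standard front-ends for building a voice imprint; as a rough illustration of the MFCC part only, the sketch below uses librosa as an assumed toolkit, and the remaining features and classifiers are not shown.

```python
# Illustrative MFCC extraction for a voice imprint, using librosa as an
# assumed toolkit; the thesis does not state which implementation it uses,
# and the other features it mentions (PLP, ACW) are not covered here.
import numpy as np
import librosa

def voice_imprint(path: str, n_mfcc: int = 13) -> np.ndarray:
    """Return a frames-by-coefficients MFCC matrix for one recording."""
    signal, sr = librosa.load(path, sr=16000)             # mono, resampled to 16 kHz
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return mfcc.T                                         # shape: (frames, n_mfcc)

# Two recordings of the same pass-phrase could then be compared frame by frame
# with one of the classifiers mentioned above (e.g. dynamic time warping).
```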
