71

Identifikace cover verzí skladeb pomocí harmonických příznaků, modelu harmonie a harmonické složitosti / Cover Song Identification using Music Harmony Features, Model and Complexity Analysis

Maršík, Ladislav January 2019 (has links)
Title: Cover Song Identification using Music Harmony Features, Model and Complexity Analysis Author: Ladislav Maršík Department: Department of Software Engineering Supervisor: Prof. RNDr. Jaroslav Pokorný, CSc., Department of Software Engineering Abstract: Analysis of digital music and its retrieval based on audio features is one of the popular topics within the music information retrieval (MIR) field. Every musical piece has its characteristic harmony structure, but harmony analysis is seldom used for retrieval. Retrieval systems that do not focus on similarities in harmony progressions may consider two versions of the same song different, even though they differ only in instrumentation or a singing voice. This thesis takes various paths in exploring how music harmony can be used in MIR, and in particular in the cover song identification (CSI) task. We first create a music harmony model based on the knowledge of music theory. We define novel concepts: the harmonic complexity of a musical piece, as well as chord and chroma distance features. We show how these concepts can be used for retrieval and complexity analysis, and how they compare with the state of the art in music harmony modeling. An extensive comparison of harmony features is then performed, using both the novel features and the...
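The chord and chroma distance features referred to above are defined in the thesis itself; the fragment below is only a generic sketch of how a chroma distance can be computed (cosine distance between 12-bin pitch-class vectors), not the author's definition.

```python
import numpy as np

def chroma_distance(c1, c2):
    """Cosine distance between two 12-bin chroma (pitch-class) vectors.

    A generic illustration of a chroma distance, not the specific feature
    defined in the thesis.
    """
    c1, c2 = np.asarray(c1, float), np.asarray(c2, float)
    norm = np.linalg.norm(c1) * np.linalg.norm(c2)
    if norm == 0.0:
        return 1.0  # treat silent frames as maximally dissimilar
    return 1.0 - float(np.dot(c1, c2) / norm)

# Example: C major triad vs. A minor triad (they share two pitch classes)
c_major = np.zeros(12); c_major[[0, 4, 7]] = 1.0   # C, E, G
a_minor = np.zeros(12); a_minor[[9, 0, 4]] = 1.0   # A, C, E
print(chroma_distance(c_major, a_minor))           # ~0.33
```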
72

Automatic Annotation of Speech: Exploring Boundaries within Forced Alignment for Swedish and Norwegian / Automatisk Anteckning av Tal: Utforskning av Gränser inom Forced Alignment för Svenska och Norska

Biczysko, Klaudia January 2022 (has links)
In Automatic Speech Recognition, there is an extensive need for time-aligned data. Manual speech segmentation has been shown to be more laborious than manual transcription, especially when dealing with tens of hours of speech. Forced alignment is a technique for matching a signal with its orthographic transcription with respect to the duration of linguistic units. Most forced aligners, however, are language-dependent and trained on English data, whereas under-resourced languages lack both the resources needed to develop the acoustic model an aligner requires and manually aligned data. An alternative to training new models is cross-language forced alignment, in which an aligner trained on one language is used to align data in another language.  This thesis aimed to evaluate state-of-the-art forced alignment algorithms available for Swedish and to test whether a Swedish model could be applied to aligning Norwegian. Three approaches to forced alignment were employed: (1) a forced aligner based on Dynamic Time Warping and text-to-speech synthesis, Aeneas; (2) two forced aligners based on Hidden Markov Models, namely the Munich AUtomatic Segmentation System (WebMAUS) and the Montreal Forced Aligner (MFA); and (3) the Connectionist Temporal Classification (CTC) segmentation algorithm with two pre-trained and fine-tuned Wav2Vec2 Swedish models. First, small speech test sets for Norwegian and Swedish, covering different degrees of spontaneity in the speech, were created and manually aligned to produce gold-standard alignments. Second, the performance of the aligners on the Swedish dataset was evaluated with respect to the gold standard. Finally, it was tested whether the Swedish forced aligners could be applied to aligning Norwegian data. The performance of the aligners was assessed by measuring the difference between the boundaries set in the gold standard and those of the comparison alignment. Accuracy was estimated by calculating the proportion of alignments below particular thresholds proposed in the literature. It was found that the performance of the CTC segmentation algorithm with Wav2Vec2 (VoxRex) was superior to that of the other forced alignment systems. The differences between the alignments of the two Wav2Vec2 models suggest that the training data may have a larger influence on the alignments than the architecture of the algorithm. At lower thresholds, the traditional HMM approach outperformed the deep learning models. Finally, the findings of this thesis demonstrate promising results for cross-language forced alignment using Swedish models to align related languages such as Norwegian.
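The boundary-based evaluation described above can be expressed compactly; the sketch below is a generic illustration (the tolerance values and example boundaries are assumptions, not figures from the thesis).

```python
import numpy as np

def boundary_accuracy(gold, predicted, tolerances=(0.01, 0.02, 0.05, 0.1)):
    """Compare predicted segment boundaries (seconds) against a gold standard.

    Returns the mean absolute deviation and, for each tolerance (seconds),
    the proportion of boundaries whose deviation falls within it.
    Assumes both lists contain the same number of corresponding boundaries.
    """
    gold = np.asarray(gold, float)
    predicted = np.asarray(predicted, float)
    deviations = np.abs(gold - predicted)
    accuracy = {tol: float(np.mean(deviations <= tol)) for tol in tolerances}
    return float(deviations.mean()), accuracy

gold_boundaries = [0.12, 0.48, 0.95, 1.40]
aligner_output  = [0.13, 0.50, 0.90, 1.41]
mad, acc = boundary_accuracy(gold_boundaries, aligner_output)
print(mad, acc)  # e.g. proportion within 20 ms: 0.75
```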
73

Classification de transcrits d’ARN à partir de données brutes générées par le séquençage par nanopores / Classification of RNA transcripts from raw data generated by nanopore sequencing

Atanasova, Kristina 12 1900 (has links)
The impressive rate at which sequencing technologies are progressing is fueled by their promise to revolutionize healthcare and biomedical research. Nanopore sequencing has become an attractive technology for addressing the shortcomings of previous technologies, but also for expanding our knowledge of the transcriptome by generating long reads that simplify assembly and the detection of large structural variations. During the sequencing process, the nanopores measure electrical current signals representing the bases (A, C, G, T) moving through each nanopore. All nanopores simultaneously produce signals that can be analyzed in real time and translated into bases by the base calling process. Despite the reduction in sequencing cost and the portability of sequencers, the base calling error rate hampers their adoption in biomedical research. The aim of this project is to classify individual mRNA sequences into different isoform groups by elucidating common motifs in their raw signal. We propose to use the dynamic time warping (DTW) algorithm for sequence alignment, combined with nanopore technology, to bypass the base calling process entirely. We explored new strategies to demonstrate the impact of different signal segments on signal classification. We performed comparative analyses to suggest parameters that increase classification performance and to guide future analyses of raw nanopore sequencing data.
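As a minimal sketch of the core idea under the simplest assumptions (textbook DTW between two toy 1-D current traces; real pipelines use banded or pruned variants for long squiggles):

```python
import numpy as np

def dtw_distance(x, y):
    """Classic dynamic time warping distance between two 1-D signals.

    A textbook O(len(x) * len(y)) implementation for illustration only.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],       # step in x only
                                 cost[i, j - 1],       # step in y only
                                 cost[i - 1, j - 1])   # step in both
    return cost[n, m]

# Two toy "squiggles": same level pattern, different durations
a = np.repeat([80., 95., 70., 110.], 5)
b = np.repeat([80., 95., 70., 110.], 7)
print(dtw_distance(a, b))  # small despite the different lengths
```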
74

Détection des événements rares dans des vidéos / Detecting rare events in video sequences

Pop, Ionel 23 September 2010 (has links)
The work presented in this thesis falls within the context of automatic video analysis. The growing volume of video data often makes it difficult, even impossible, for one or more operators to watch it all. A recurring request is to identify the moments in a video when something unusual happens, that is, to detect abnormal events. We therefore propose several algorithms to identify unusual events, under the hypothesis that such events have a low probability. We address several types of events, from the analysis of moving areas to the analysis of the trajectories of tracked objects. After dedicating part of the thesis to building a tracking system, we propose several measures of similarity between trajectories. These measures, based on DTW (Dynamic Time Warping), estimate the similarity of trajectories by taking both spatial and temporal aspects into account, making it possible, for example, to distinguish objects that move along the same path but at different speeds. Based on these measures, we build models of trajectories representing the usual behavior of objects, so that those deviating from the norm can be detected. Because tracking yields poor results in practice, especially in crowded scenes, we also analyze optical flow vectors and build a movement map. This map models, in the form of a codebook, the preferred movement directions of each pixel, making it possible to identify abnormal movement at the pixel level without the notion of a tracked object. By exploiting temporal coherence, we can further improve the detection rate, which is affected by errors in optical flow estimation. In a second step, we change the way this movement map is built so that higher-level features, the equivalent of trajectories, can be extracted, still without requiring object tracking; the trajectory analysis can then be partially reused to detect rare events. All the aspects presented in this thesis have been implemented, and several applications were built, such as predicting the movements of objects or storing and retrieving tracked objects.
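A toy illustration of a per-pixel direction codebook of the kind described above; the number of direction bins, the update rule and the thresholds are assumptions made for the sketch, not the parameters of the thesis.

```python
import numpy as np

N_BINS = 8  # quantize flow directions into 8 sectors of 45 degrees

def update_codebook(counts, flow, min_magnitude=0.5):
    """Accumulate per-pixel direction histograms from one optical flow field.

    counts: (H, W, N_BINS) integer histogram, flow: (H, W, 2) with (dx, dy).
    Only pixels whose flow magnitude exceeds min_magnitude are counted.
    """
    dx, dy = flow[..., 0], flow[..., 1]
    magnitude = np.hypot(dx, dy)
    angle = np.arctan2(dy, dx)                                   # [-pi, pi]
    bins = ((angle + np.pi) / (2 * np.pi) * N_BINS).astype(int) % N_BINS
    moving = magnitude > min_magnitude
    for b in range(N_BINS):
        counts[..., b] += (moving & (bins == b))
    return counts

def is_abnormal(counts, flow, min_support=0.05, min_magnitude=0.5):
    """Flag pixels whose current direction was rarely seen during training."""
    totals = counts.sum(axis=-1, keepdims=True).clip(min=1)
    freq = counts / totals                                       # direction frequencies
    dx, dy = flow[..., 0], flow[..., 1]
    angle = np.arctan2(dy, dx)
    bins = ((angle + np.pi) / (2 * np.pi) * N_BINS).astype(int) % N_BINS
    current_freq = np.take_along_axis(freq, bins[..., None], axis=-1)[..., 0]
    return (np.hypot(dx, dy) > min_magnitude) & (current_freq < min_support)

# Typical usage: counts = np.zeros((H, W, N_BINS), dtype=int), updated over a
# training video, then is_abnormal() applied to each new flow field.
```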
75

Intégration d'images multimodales pour la caractérisation de cardiomyopathies hypertrophiques et d'asynchronismes cardiaques / Multimodal image registration for the characterization of the hypertrophic cardiomyopathy and the cardiac asynchronism

Betancur Acevedo, Julian Andrés 27 May 2014 (has links)
This work concerns cardiac characterization, a major methodological and clinical issue, both for improving disease diagnosis and for optimizing treatment. Multisensor registration and fusion methods are proposed to bring data from cardiac magnetic resonance imaging (CMRI), dynamic cardiac X-ray computed tomography (CT), speckle-tracking echocardiography (STE) and electro-anatomical mappings of the inner left ventricular chamber (EAM) into a common frame of reference. These data are used to describe the heart by its anatomy, its electrical and mechanical function, and the state of the myocardial tissue. The methods proposed to register the multimodal datasets rely on two main processes: temporal registration and spatial registration. The temporal dimensions of the input images are aligned with an adaptive dynamic time warping (ADTW) method, which handles the nonlinear temporal relationship between the different acquisitions. Concerning the spatial registration, iconic methods were developed to correct for motion artifacts in cine acquisitions, to register cine-CMRI with late gadolinium enhancement CMRI (LGE-CMRI), and to register cine-CMRI with dynamic CT. In addition, a contour-based method developed in a previous work was enhanced to account for multiview STE acquisitions. These methods were evaluated on real data to select the most suitable metrics, to quantify the accuracy of the iconic approaches and to assess the STE to cine-CMRI registration. The fusion of these multisensor data provided insights into the diseased heart in the context of hypertrophic cardiomyopathy (HCM) and cardiac asynchronism. For HCM, the objective was to improve the interpretation of STE by fusing fibrosis information from LGE-CMRI with strain from multiview 2D STE; this analysis allowed regional STE strain to be assessed as a surrogate for the presence of regional myocardial fibrosis. Concerning cardiac asynchronism, the intra-segment electro-mechanical coupling of the left ventricle was described using fused data from STE, EAM, CT and, where relevant, LGE-CMRI. This feasibility study opens perspectives on new descriptors for selecting the optimal left-ventricular stimulation sites for cardiac resynchronization therapy.
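As a purely illustrative sketch of the temporal registration step (not the ADTW algorithm of the thesis), the fragment below shows how a warping path produced by any DTW variant can be used to resample one sequence onto the timeline of another:

```python
import numpy as np

def warp_to_reference(path, target, n_ref):
    """Resample `target` onto a reference timeline using a DTW warping path.

    path: iterable of (i, j) index pairs matching reference frame i to
    target frame j (as returned by a DTW implementation).
    target: 1-D feature sequence (e.g. a ventricular volume curve).
    n_ref: number of frames in the reference sequence.
    For each reference frame, the matched target samples are averaged.
    """
    target = np.asarray(target, float)
    sums = np.zeros(n_ref)
    hits = np.zeros(n_ref)
    for i, j in path:
        sums[i] += target[j]
        hits[i] += 1
    return sums / np.maximum(hits, 1)

# Toy example: the target runs at half the reference frame rate
path = [(0, 0), (1, 0), (2, 1), (3, 1), (4, 2), (5, 2)]
target = np.array([10.0, 20.0, 30.0])
print(warp_to_reference(path, target, n_ref=6))
# -> [10. 10. 20. 20. 30. 30.]
```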
76

Rozpoznáváni standardních PILOT-CONTROLLER řídicích povelů v hlasové podobě / Voice recognition of standard PILOT-CONTROLLER control commands

Kufa, Tomáš January 2009 (has links)
The subject of this graduation thesis is the application of speech recognition to ATC commands. The selection of methods and approaches for automatic recognition of ATC commands arises from detailed studies of air traffic. Because there is no definitive solution in a field as extensive as speech recognition, this diploma work focuses on a speech recognizer based on comparison with templates (DTW). This recognizer is implemented in the thesis and compared with the freely available HTK system from Cambridge University, which is based on statistical methods using Hidden Markov models. The suitability of both methods is verified by practical testing and evaluation of the results.
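A minimal sketch of template-based (DTW) recognition of isolated commands; the file names, and the use of librosa for MFCC extraction and DTW, are illustrative choices rather than the tooling used in the thesis.

```python
import numpy as np
import librosa

def mfcc_features(path):
    """Load a recording and return its MFCC sequence (n_mfcc x frames)."""
    y, sr = librosa.load(path, sr=16000)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

def recognize(utterance_path, templates):
    """Return the label of the template with the lowest DTW alignment cost.

    templates: dict mapping a command label to the path of its reference
    recording (one template per command, as in small-vocabulary DTW systems).
    """
    query = mfcc_features(utterance_path)
    best_label, best_cost = None, np.inf
    for label, template_path in templates.items():
        ref = mfcc_features(template_path)
        D, _ = librosa.sequence.dtw(X=query, Y=ref, metric='euclidean')
        cost = D[-1, -1] / (query.shape[1] + ref.shape[1])  # length-normalized
        if cost < best_cost:
            best_label, best_cost = label, cost
    return best_label, best_cost

# Hypothetical usage with made-up file names:
# templates = {"CLIMB": "climb.wav", "DESCEND": "descend.wav"}
# print(recognize("unknown_command.wav", templates))
```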
77

Identifikace osob pomocí otisku hlasu / Identification of persons via voice imprint

Mekyska, Jiří January 2010 (has links)
This work deals with text-dependent speaker recognition in systems where only a few training samples exist. For this purpose, a voice imprint based on different features (e.g. MFCC, PLP, ACW) is proposed. At the beginning, the way the speech signal is produced is described, and some speech characteristics important for speaker recognition are mentioned. The next part of the work deals with speech signal analysis, covering preprocessing as well as feature extraction methods. The following part describes the process of speaker recognition and the evaluation of the methods used: speaker identification and verification. The last theoretical part of the work deals with classifiers suitable for text-dependent recognition; classifiers based on fractional distances, dynamic time warping, dispersion matching and vector quantization are mentioned. The work continues with the design and implementation of a system that evaluates all the described classifiers for voice imprints based on different features.
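The fractional-distance classifier mentioned above compares feature vectors with a Minkowski metric whose exponent is smaller than one; the sketch below is generic, and the exponent and toy imprints are assumptions rather than values from the thesis.

```python
import numpy as np

def fractional_distance(u, v, p=0.5):
    """Minkowski distance with fractional exponent p < 1.

    Fractional norms are sometimes preferred in high-dimensional feature
    spaces (such as concatenated MFCC statistics) because they discriminate
    between neighbors better than the Euclidean norm.
    """
    u, v = np.asarray(u, float), np.asarray(v, float)
    return float(np.sum(np.abs(u - v) ** p) ** (1.0 / p))

def identify(query, enrolled):
    """Return the enrolled speaker whose imprint is closest to the query."""
    return min(enrolled, key=lambda name: fractional_distance(query, enrolled[name]))

# Toy imprints (in practice: statistics of MFCC/PLP features per speaker)
enrolled = {"alice": np.array([1.0, 2.0, 0.5]), "bob": np.array([0.2, 1.1, 2.3])}
print(identify(np.array([0.9, 2.1, 0.4]), enrolled))  # -> "alice"
```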
78

Smart Sheet Music Reader for Android

Smejkal, Vojtěch January 2014 (has links)
Areas such as automatic page turning and automatic musical accompaniment have been studied for several decades. This thesis summarizes current methods for real-time computer score following. It also deals with musical features such as chroma classes and synthesized spectral templates, and describes key parts of the system such as the short-time Fourier transform and Dynamic Time Warping. Within the project, a custom system for tracking the player's position in the score was designed and developed, and subsequently implemented as a mobile application. The resulting system can follow even pieces played at a significantly different tempo, with pauses during playing, or with small deviations from the written notes.
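The chroma features and short-time Fourier transform mentioned above can be combined in a few lines; the sketch below is a generic illustration of folding an STFT magnitude frame into 12 pitch classes (the parameters are assumptions, not those of the application).

```python
import numpy as np

def chroma_from_frame(magnitude, sr, n_fft, f_ref=440.0):
    """Fold one STFT magnitude frame into a 12-bin pitch-class (chroma) vector.

    magnitude: spectrum of length n_fft // 2 + 1, sr: sample rate in Hz.
    Each frequency bin contributes its energy to the pitch class of its
    center frequency; bin 0 of the chroma vector (A) corresponds to f_ref.
    """
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    chroma = np.zeros(12)
    for f, m in zip(freqs[1:], magnitude[1:]):       # skip the DC bin
        pitch_class = int(round(12 * np.log2(f / f_ref))) % 12
        chroma[pitch_class] += m
    total = chroma.sum()
    return chroma / total if total > 0 else chroma

# Toy check: a pure 440 Hz component should land in pitch class 0 (A)
n_fft, sr = 2048, 22050
spectrum = np.zeros(n_fft // 2 + 1)
spectrum[int(round(440 * n_fft / sr))] = 1.0
print(np.argmax(chroma_from_frame(spectrum, sr, n_fft)))  # -> 0
```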
79

Machine-Vision-Based Activity, Mobility and Motion Analysis for Assistance Systems in Human Health Care

Richter, Julia 18 April 2019 (has links)
Due to the continuous ageing of our society, both the care and the health sector will encounter challenges in maintaining the quality of human care and health standards. While the number of people with diseases such as dementia and physical impairments continues to rise, there is a simultaneous shortage of medical personnel such as caregivers and therapists. One possible approach to this problem is the employment of technical assistance systems that support both medical personnel and elderly people living alone at home. This thesis presents approaches that provide assistance for these target groups. In this work, algorithms were developed and integrated into prototypical assistance systems for vision-based analysis of daily activities, mobility and motion. The developed algorithms process 3-D point clouds as well as skeleton joint positions to generate meta information concerning the activities and mobility of elderly persons living alone at home. This type of information was not accessible before and is now available for monitoring; it creates a basis for detecting both long-term and short-term health changes. Besides monitoring such meta information, mobilisation for maintaining physical capabilities, either ambulatory or at home, is a further focus of this thesis. Algorithms for the qualitative assessment of physical exercises were therefore investigated. For this purpose, motion sequences in the form of skeleton joint trajectories as well as the heat development in active muscles were considered. These algorithms enable autonomous physical training under the supervision of a virtual therapist, even at home.
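As a small illustration of the kind of quantity such an exercise assessment can derive from skeleton joint positions (a generic joint-angle computation, not an algorithm taken from the thesis):

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle (degrees) at joint b formed by the segments b->a and b->c.

    a, b, c: 3-D joint positions, e.g. hip, knee and ankle, in which case
    the result is the knee flexion angle for one frame of a skeleton stream.
    """
    a, b, c = (np.asarray(p, float) for p in (a, b, c))
    u, v = a - b, c - b
    cos_angle = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))

# Toy frame: a right angle at the knee
hip, knee, ankle = [0.0, 1.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.5, 0.4]
print(joint_angle(hip, knee, ankle))  # -> 90.0
```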
80

Investigating the Portuguese-English Bilingual Mental Lexicon: Crosslinguistic Orthographic and Phonological Overlap in Cognates and False Friends

Alves-Soares, Leonardo 01 October 2020 (has links)
This dissertation investigates how cognates are organized in the bilingual mental lexicon and examines whether orthography in one language, via phonological representations, influences the processing of cognates and false friends in the other language. Within the framework of two well-known models of bilingual visual word recognition, the Bilingual Interactive Activation (BIA) and the Bilingual Interactive Activation Plus (BIA+), the premise is that there is activation from orthography to phonology across a bilingual’s two languages and that this activation is modulated by the degree of orthographic and phonological code overlap. Two objective metrics were used to assess the crosslinguistic similarity of Portuguese-English cognates and false friends selected for a cross-language lexical decision task with masked priming. Dynamic time warping (DTW), an algorithm originally conceived to compare different speech patterns in automatic speech recognition and to measure acoustic similarity between two time-dependent sequences, was used to compute crosslinguistic phonological similarity. The Normalized Levenshtein Distance (NLD), an algorithm that calculates the minimum number of single-character insertions, deletions or substitutions required to change one word into another and normalizes the result by word length, was used to compute crosslinguistic orthographic similarity. Portuguese-English bilinguals who acquired their second language after reaching puberty, and English functional monolinguals who grew up speaking primarily English, were recruited to participate in the experimental task. Based on the collected reaction time and accuracy data, mixed-effects model analyses are used to estimate the individual effects of crosslinguistic orthographic, phonological and semantic similarity, and the role each of them, along with English proficiency, word frequency and word length, plays in the organization of the Portuguese-English bilingual mental lexicon.
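The Normalized Levenshtein Distance described above can be sketched in a few lines; normalizing by the longer word's length is one common convention, and the dissertation may normalize differently.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,                 # delete ca
                               current[j - 1] + 1,              # insert cb
                               previous[j - 1] + (ca != cb)))   # substitute or match
        previous = current
    return previous[-1]

def normalized_levenshtein(a, b):
    """Levenshtein distance scaled to [0, 1] by the longer word's length."""
    if not a and not b:
        return 0.0
    return levenshtein(a, b) / max(len(a), len(b))

# A Portuguese-English cognate pair vs. a less similar (false friend) pair
print(normalized_levenshtein("actor", "ator"))        # 0.2   (1 edit / 5)
print(normalized_levenshtein("library", "livraria"))  # 0.375 (less overlap)
```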
