  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
81

Métodos estatísticos multi-percursos para a identificação cega de canais da fonte de aplicações às comunicações sem fio / High-order statistical methods for blind channel identification and source detection with applications to wireless communications

Carlos Estevão Rolim Fernandes 30 May 2008 (has links)
Laboratoire I3S/CNRS / Modern telecommunication systems offer services that demand very high transmission rates. In this context, channel identification is a problem of major importance. Blind techniques are of great interest in the search for a better trade-off between an adequate bit rate and the quality of the recovered information. Relying on special properties of the fourth-order cumulants of the channel output signals, this thesis introduces new signal processing tools with applications in mobile radio communication systems. Exploiting the symmetric structure of the output cumulants, the blind channel identification problem is addressed through a multilinear model of the fourth-order cumulant tensor, based on a parallel factor (Parafac) decomposition. In the SISO case, the components of the new tensor model have a Hankel structure. In the case of memoryless MIMO channels, the redundancy of the tensor factors is exploited to estimate the channel coefficients. In this context, new blind channel identification algorithms are developed based on a single-step least squares (SS-LS) optimization problem. The proposed methods fully exploit the multilinear structure of the cumulant tensor as well as its symmetries and redundancies, thereby avoiding any form of preprocessing. Indeed, the SS-LS approach yields a solution based on a single minimization procedure, without intermediate steps, unlike most existing methods in the literature.
Using only fourth-order cumulants and exploiting the concept of a virtual array, the source localization problem is also treated in a multiuser context. An original contribution consists in increasing the number of virtual sensors by means of a particular decomposition of the cumulant tensor, thus improving the resolution of the array, whose structure is equivalent to that typically obtained with sixth-order statistics. The estimation of the physical parameters of a multipath MIMO communication channel is also considered. Using a completely blind approach, the multipath channel is first treated as a convolutive model and a new technique is proposed to estimate its coefficients. This nonparametric technique generalizes the methods previously proposed for the SISO and memoryless MIMO cases. Using a tensor formalism to represent the multipath MIMO channel, its physical parameters can be obtained with a combined ALS-MUSIC technique based on a subspace algorithm.
Finally, the problem of order determination for FIR channels is considered, particularly for MISO systems. A complete procedure is introduced for the detection and estimation of frequency-selective MISO communication channels. The new algorithm, based on a deflation approach, successively detects each signal source, determines the order of its individual transmission channel and estimates the associated coefficients.
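As a minimal illustration of the statistic this abstract builds on, the sketch below estimates the scalar, zero-lag fourth-order cumulant of a zero-mean signal, c4 = E|x|^4 - 2(E|x|^2)^2 - |E[x^2]|^2. It is not the tensor model or the SS-LS algorithm of the thesis; the function name and the test signals are assumptions chosen for the example. The point it shows is why such cumulants suit blind methods: they vanish for Gaussian data but not for digital modulation symbols.

```python
import numpy as np

def fourth_order_cumulant(x):
    """Sample estimate of c4 = E|x|^4 - 2 (E|x|^2)^2 - |E[x^2]|^2.

    Hypothetical helper for illustration; x is treated as a zero-mean
    (possibly complex) signal and is centered before estimation.
    """
    x = np.asarray(x, dtype=complex)
    x = x - x.mean()                  # enforce the zero-mean assumption
    m2 = np.mean(np.abs(x) ** 2)      # E|x|^2
    m4 = np.mean(np.abs(x) ** 4)      # E|x|^4
    p2 = np.mean(x ** 2)              # E[x^2], the pseudo-variance
    return float(m4 - 2.0 * m2 ** 2 - np.abs(p2) ** 2)

rng = np.random.default_rng(0)
gaussian = rng.standard_normal(100_000)   # Gaussian samples: cumulant near 0
bpsk = np.tile([1.0, -1.0], 500)          # balanced +/-1 symbols: cumulant -2

c4_gaussian = fourth_order_cumulant(gaussian)
c4_bpsk = fourth_order_cumulant(bpsk)
```

For the balanced ±1 sequence the three moments are all exactly 1, so the estimate is 1 - 2 - 1 = -2, while the Gaussian estimate only fluctuates around zero.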
82

Propriétés de moyennage d'ensemble des signaux acoustiques en milieu réverbérant et applications potentielles au contrôle et à la caractérisation des structures. / Ensemble-averaging properties of acoustic signals in reverberant media and potential applications to control and characterization of structures

Achdjian, Hossep 05 December 2014 (has links)
The propagation of acoustic or elastic waves in a finite medium with low attenuation produces measured signals of long duration (reverberation). In conventional non-destructive testing and imaging techniques, only the first wave packets are usually exploited, and the information potentially contained in the reverberation codas is lost. The work presented in this thesis aims to exploit the ensemble behavior of codas recorded in plate-like structures, in order to extract the maximum information from a limited number of sensors and simple processing. We have developed statistical models that predict the behavior of reverberant acoustic waves in a plate (in the form of ensemble averages) from a limited set of experimentally accessible parameters. It is shown that the mathematical expectations of the envelopes, the correlation functions or the so-called Schroeder integral of reverberant signals received at a few points contain potentially useful information about the structural properties of the medium, the sources or the defects. After numerical and experimental validation of the models, potential applications are presented, such as estimating the structural properties of a plate or locating a source. These estimates require neither time measurements nor synchronization between the sensors, which could allow an implementation with few embedded resources. Such methods could also be used to characterize a defect in a reverberant structure, possibly as a complement to conventional NDT and structural health monitoring techniques.
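The Schroeder integral named in this abstract is the backward-integrated energy of a reverberant signal, E(t) = ∫ from t to the end of h²(τ) dτ. A minimal sketch of that computation follows, assuming a sampled signal; the synthetic decaying-noise signal, the sampling rate and the decay constant `tau` are illustrative assumptions, not data from the thesis.

```python
import numpy as np

def schroeder_integral(h, fs):
    """Backward-integrated energy of h: E[n] = sum_{k >= n} h[k]^2 / fs."""
    energy = np.asarray(h, dtype=float) ** 2
    # Reverse, accumulate, reverse again to integrate from the end backwards.
    return np.cumsum(energy[::-1])[::-1] / fs

fs = 1000.0                                   # assumed sampling rate in Hz
t = np.arange(0.0, 2.0, 1.0 / fs)
rng = np.random.default_rng(1)
tau = 0.3                                     # hypothetical decay time constant (s)
h = rng.standard_normal(t.size) * np.exp(-t / tau)   # synthetic reverberant coda

edc = schroeder_integral(h, fs)
```

The resulting curve is non-increasing by construction, which is what makes it a smoother decay indicator than the raw squared signal.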
83

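Ανάπτυξη και υλοποίηση τεχνικής εντοπισμού θέσεων πολλαπλών πηγών από δίκτυα τυχαία διασκορπισμένων αισθητήρων / Development and implementation of a technique for localizing multiple sources from networks of randomly scattered sensors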

Μαυροκεφαλίδης, Χρήστος 12 September 2007 (has links)
Sensor networks are used to monitor an environment and extract useful information automatically. In recent years, largely owing to the development of suitable integrated circuits, very small sensor nodes have emerged. These nodes can process data, communicate with each other and carry more than one type of sensor. This thesis is concerned with networks of randomly scattered sensors. The problem studied is the localization of multiple sources by the network. The sources emit broadband signals that are modelled as AR processes. The proposed method works serially. First, one of the sources is selected and the time differences of arrival of its signal among the sensor nodes are estimated. Then, the position of the source is computed using the least squares criterion. Finally, the signal of the source is cancelled from the signals received by the sensor nodes and the whole procedure starts over. Experimental results show that the method works when one, two or three sources are present in the network area.
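The localization step of this serial method (time differences of arrival followed by least squares) can be sketched with a standard linearization. This is one common textbook formulation, assumed here for illustration, and not necessarily the estimator used in the thesis. With a reference sensor r_0, range differences δ_i = c·τ_i and unknown reference range d_0, each other sensor r_i gives the linear equation 2(r_i − r_0)·s + 2δ_i·d_0 = ‖r_i‖² − ‖r_0‖² − δ_i². The sensor layout, source position and propagation speed below are made up.

```python
import numpy as np

def locate_from_tdoa(sensors, tdoa, c):
    """Least-squares source position from TDOAs relative to sensors[0].

    sensors: (M, 2) positions; tdoa: (M-1,) arrival-time differences
    t_i - t_0 for sensors 1..M-1; c: propagation speed.
    """
    r0, ri = sensors[0], sensors[1:]
    delta = c * np.asarray(tdoa, dtype=float)      # range differences d_i - d_0
    # Unknown vector is [s_x, s_y, d_0].
    A = np.column_stack([2.0 * (ri - r0), 2.0 * delta])
    b = np.sum(ri ** 2, axis=1) - np.sum(r0 ** 2) - delta ** 2
    theta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return theta[:2]

c = 343.0                                          # assumed speed of sound (m/s)
sensors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0],
                    [10.0, 10.0], [5.0, 12.0]])
source = np.array([3.0, 4.0])                      # hypothetical true position
dist = np.linalg.norm(sensors - source, axis=1)
tdoa = (dist[1:] - dist[0]) / c                    # noise-free TDOAs

estimate = locate_from_tdoa(sensors, tdoa, c)
```

With noise-free TDOAs the estimate reproduces the true position; with noisy TDOAs the same system is solved in the least squares sense, and at least four sensors are needed in the plane.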
84

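Αλγοριθμικές τεχνικές εντοπισμού και παρακολούθησης πολλαπλών πηγών από ασύρματα δίκτυα αισθητήρων / Algorithmic techniques for localization and tracking of multiple sources by wireless sensor networks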

Αμπελιώτης, Δημήτριος 12 April 2010 (has links)
Advances in microelectronics and wireless communications have enabled the development of low-cost, low-power, small-scale devices that integrate sensing, processing and short-range radio capabilities. The deployment of a large number of such devices, referred to as sensor nodes, over a territory of interest defines the so-called wireless sensor network. Wireless sensor networks have attracted considerable attention in recent years and have motivated many new challenges, most of which require the synergy of many disciplines, including signal processing, networking and distributed algorithms. Among many other applications, source localization and tracking has been widely viewed as a canonical problem of wireless sensor networks. Furthermore, it is an easily perceived problem that can be used as a vehicle to study more involved information processing and organization problems.
Most of the source localization methods that have appeared in the literature can be classified into two broad categories, according to the physical variable they utilize. The algorithms of the first category use time difference of arrival (TDOA) measurements, and the algorithms of the second category use direction of arrival (DOA) measurements. DOA estimates are particularly useful for locating sources emitting narrowband signals, while TDOA measurements offer the increased capability of localizing sources emitting broadband signals. However, the methods of both categories impose two major requirements that make them ill-suited to wireless sensor networks: (a) the analog signals at the outputs of the spatially distributed sensors must be sampled in a synchronized fashion, and (b) the sampling rate must be high enough to capture the features of interest. These requirements, in turn, imply that accurate distributed synchronization methods must be implemented to keep the remote sensor nodes synchronized, and that high-frequency electronics as well as increased bandwidth are needed to transmit the acquired measurements. Because of these limitations, source localization methods that rely on received signal strength (RSS) measurements, originally explored for locating electromagnetic sources, have recently received renewed attention. Localization from RSS measurements is an estimation problem in which the measurements depend nonlinearly on the parameters to be estimated.
In this thesis, we begin by considering the localization of an isotropic acoustic source using energy measurements from distributed sensors, in the case where the energy decays according to an inverse square law with respect to distance. While most acoustic source localization algorithms require that distance estimates between the sensors and the source be available, we propose a linear least squares criterion that makes no such assumption. The new criterion yields the location of the source and its transmit power in closed form. A weighted least squares cost function is also considered, and distributed implementation of the proposed estimators through the adaptive Least Mean Square (LMS) and Recursive Least Squares (RLS) algorithms is studied. Numerical results indicate a significant performance improvement over a linear least squares approach that uses energy ratios, and performance comparable to that of other estimators of higher computational complexity.
We then turn to the case where the energy decay model is not known and is assumed only to be an unknown, strictly decreasing function of distance. To solve the localization problem in this case, we first assume that the locations of the nodes near the source are well described by a uniform distribution. Using this assumption, we derive distance estimates that are independent of both the energy decay model and the transmit power of the source. Numerical results show that these estimates lead to improved localization accuracy compared with other model-independent approaches. Next, we consider the more general case where the assumption of uniform sensor deployment is not required. For this case, an optimization problem that does not require knowledge of the underlying energy decay model is proposed, and a condition under which the optimal solution can be computed is given. This condition employs a new geometric construct, called the sorted order-K Voronoi diagram; the solution is an interior point of a convex polygon, the sorted order-K Voronoi cell. We give centralized algorithms for source localization in this setting, as well as distributed algorithms based on projections onto convex sets, and derive bounds on the area of these cells when the sensor node locations are uniformly distributed in the plane. Finally, analytical results and simulations are used to verify the performance of the developed algorithms.
The next problem we consider is the estimation of the locations of multiple acoustic sources, with a known energy decay model, by a network of distributed energy-measuring sensors. The maximum likelihood (ML) solution to this problem requires optimizing a non-convex function of, usually, many variables, so high-complexity search-based methods are needed to obtain an accurate solution. To reduce the computational complexity of the multiple source localization problem, we propose two methods. The first is a sequential estimation algorithm, in which each source is localized, its contribution is cancelled, and the next source is considered. For each source, the algorithm first computes an approximate location, then identifies a set of nodes that receive little interference from the other sources, and finally refines the location estimate. The second method uses an alternating projection (AP) algorithm that decomposes the original problem into a number of simpler, yet also non-convex, optimization steps. The particular form of the cost function of each step indicates that, in some cases, an approximate form can be used, which can be evaluated at considerably lower computational cost. Thus, a low-complexity version of the AP algorithm is proposed. Extensive simulation results demonstrate that the proposed algorithm performs close to the exact AP implementation and, in some cases, similarly to the ML estimator.
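The single-source case under the inverse square law shows how a closed-form linear least squares solution for both location and transmit power can arise. The sketch below is one possible linearization, assumed for illustration and not claimed to be the thesis's exact criterion. From y_i = P / ‖r_i − s‖² it follows that 2 r_i·s − ‖s‖² + P / y_i = ‖r_i‖², which is linear in the unknowns [s_x, s_y, ‖s‖², P]. The sensor layout, source position and power are made up, and the sensors must not all lie on one circle or line, otherwise the system is rank deficient.

```python
import numpy as np

def locate_from_energy(sensors, energy):
    """Closed-form least-squares estimate of source position and power.

    sensors: (M, 2) positions; energy: (M,) readings assumed to follow
    y_i = P / ||r_i - s||^2 (noise-free inverse square law).
    """
    sensors = np.asarray(sensors, dtype=float)
    energy = np.asarray(energy, dtype=float)
    # Unknown vector is [s_x, s_y, ||s||^2, P].
    A = np.column_stack([2.0 * sensors,
                         -np.ones(len(sensors)),
                         1.0 / energy])
    b = np.sum(sensors ** 2, axis=1)
    theta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return theta[:2], theta[3]

sensors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0],
                    [10.0, 10.0], [5.0, 12.0], [2.0, 7.0]])
source = np.array([3.0, 4.0])           # hypothetical true position
power = 50.0                            # hypothetical transmit power
energy = power / np.sum((sensors - source) ** 2, axis=1)

position_estimate, power_estimate = locate_from_energy(sensors, energy)
```

No distance estimates are needed as input, only the energy readings and the sensor positions. The formulation treats ‖s‖² as a free unknown, which keeps the problem linear at the cost of ignoring its dependence on s.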
85

Audiovisual voice activity detection and localization of simultaneous speech sources / Detecção de atividade de voz e localização de fontes sonoras simultâneas utilizando informações audiovisuais

Minotto, Vicente Peruffo January 2013 (has links)
Given the trend toward human-machine interfaces that allow increasingly simple means of interaction, it is only natural that research effort is put into techniques that seek to simulate the most conventional means of communication humans use: speech. In the human auditory system, voice is automatically processed by the brain in an effortless and effective way, commonly aided by visual cues such as mouth movement and the location of the speakers. This processing done by the brain includes two important components that speech-based communication requires: Voice Activity Detection (VAD) and Sound Source Localization (SSL). Consequently, VAD and SSL also serve as mandatory preprocessing tools for high-end Human Computer Interface (HCI) applications, as in the case of automatic speech recognition and speaker identification. However, VAD and SSL are still challenging problems when dealing with realistic acoustic scenarios, particularly in the presence of noise, reverberation and multiple simultaneous speakers. In this work we propose approaches for tackling these problems using audiovisual information, both for the single-source and the competing-sources scenarios, exploiting distinct ways of fusing the audio and video modalities. Our work also employs a microphone array for the audio processing, which allows the spatial information of the acoustic signals to be exploited through the state-of-the-art Steered Response Power (SRP) method. As an additional consequence, a very fast GPU version of the SRP is developed, so that real-time processing is achieved. Our experiments show an average accuracy of 95% when performing VAD of up to three simultaneous speakers and an average error of 10 cm when locating such speakers.
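The Steered Response Power idea mentioned in this abstract can be sketched as a grid search: for each candidate position, the microphone signals are phase-aligned for that position, summed, and the power of the sum is recorded; the position with the highest power is the estimate. The version below is a plain NumPy, CPU-only sketch without any spectral weighting, written under simplifying assumptions (2-D geometry, free field, circular delays, one source). It is not the GPU implementation of the thesis, and the array layout, grid and signal are made up.

```python
import numpy as np

def srp_map(signals, mics, grid, fs, c=343.0):
    """Steered response power at each candidate point.

    signals: (M, N) microphone signals; mics: (M, 2) positions;
    grid: (G, 2) candidate source positions; fs: sampling rate.
    """
    n_samples = signals.shape[1]
    spectra = np.fft.rfft(signals, axis=1)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    power = np.empty(len(grid))
    for g, point in enumerate(grid):
        delays = np.linalg.norm(mics - point, axis=1) / c
        # Advance each channel by its propagation delay, then sum (delay-and-sum).
        steering = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
        beam = np.sum(spectra * steering, axis=0)
        power[g] = np.sum(np.abs(beam) ** 2)
    return power

# Synthetic single-source check with hypothetical geometry.
fs, n_samples, c = 16000.0, 1024, 343.0
mics = np.array([[0.0, 0.0], [4.0, 0.5], [0.3, 4.0], [3.6, 3.8], [2.0, -0.5]])
source = np.array([1.5, 3.0])
rng = np.random.default_rng(2)
spectrum = np.fft.rfft(rng.standard_normal(n_samples))
spectrum[-1] = 0.0          # drop the Nyquist bin so fractional delays stay exact
freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
true_delays = np.linalg.norm(mics - source, axis=1) / c
phases = np.exp(-2j * np.pi * freqs[None, :] * true_delays[:, None])
signals = np.fft.irfft(spectrum[None, :] * phases, n=n_samples, axis=1)

axis = np.arange(0.0, 4.01, 0.5)
grid = np.array([(x, y) for x in axis for y in axis])
estimate = grid[np.argmax(srp_map(signals, mics, grid, fs, c))]
```

The cost grows with the number of grid points times the number of microphones and frequency bins, and each grid point is evaluated independently, which is the kind of workload that parallel hardware handles well.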
86

Audiovisual voice activity detection and localization of simultaneous speech sources / Detecção de atividade de voz e localização de fontes sonoras simultâneas utilizando informações audiovisuais

Minotto, Vicente Peruffo January 2013 (has links)
Em vista da tentência de se criarem intefaces entre humanos e máquinas que cada vez mais permitam meios simples de interação, é natural que sejam realizadas pesquisas em técnicas que procuram simular o meio mais convencional de comunicação que os humanos usam: a fala. No sistema auditivo humano, a voz é automaticamente processada pelo cérebro de modo efetivo e fácil, também comumente auxiliada por informações visuais, como movimentação labial e localizacão dos locutores. Este processamento realizado pelo cérebro inclui dois componentes importantes que a comunicação baseada em fala requere: Detecção de Atividade de Voz (Voice Activity Detection - VAD) e Localização de Fontes Sonoras (Sound Source Localization - SSL). Consequentemente, VAD e SSL também servem como ferramentas mandatórias de pré-processamento em aplicações de Interfaces Humano-Computador (Human Computer Interface - HCI), como no caso de reconhecimento automático de voz e identificação de locutor. Entretanto, VAD e SSL ainda são problemas desafiadores quando se lidando com cenários acústicos realísticos, particularmente na presença de ruído, reverberação e locutores simultâneos. Neste trabalho, são propostas abordagens para tratar tais problemas, para os casos de uma e múltiplas fontes sonoras, através do uso de informações audiovisuais, explorando-se variadas maneiras de se fundir as modalidades de áudio e vídeo. Este trabalho também emprega um arranjo de microfones para o processamento de som, o qual permite que as informações espaciais dos sinais acústicos sejam exploradas através do algoritmo estado-da-arte SRP (Steered Response Power). Por consequência adicional, uma eficiente implementação em GPU do SRP foi desenvolvida, possibilitando processamento em tempo real do algoritmo. Os experimentos realizados mostram uma acurácia média de 95% ao se efetuar VAD de até três locutores simultâneos, e um erro médio de 10cm ao se localizar tais locutores. 
/ Given the tendency of creating interfaces between human and machines that increasingly allow simple ways of interaction, it is only natural that research effort is put into techniques that seek to simulate the most conventional mean of communication humans use: the speech. In the human auditory system, voice is automatically processed by the brain in an effortless and effective way, also commonly aided by visual cues, such as mouth movement and location of the speakers. This processing done by the brain includes two important components that speech-based communication require: Voice Activity Detection (VAD) and Sound Source Localization (SSL). Consequently, VAD and SSL also serve as mandatory preprocessing tools for high-end Human Computer Interface (HCI) applications in a computing environment, as the case of automatic speech recognition and speaker identification. However, VAD and SSL are still challenging problems when dealing with realistic acoustic scenarios, particularly in the presence of noise, reverberation and multiple simultaneous speakers. In this work we propose some approaches for tackling these problems using audiovisual information, both for the single source and the competing sources scenario, exploiting distinct ways of fusing the audio and video modalities. Our work also employs a microphone array for the audio processing, which allows the spatial information of the acoustic signals to be explored through the stateof- the art method Steered Response Power (SRP). As an additional consequence, a very fast GPU version of the SRP is developed, so that real-time processing is achieved. Our experiments show an average accuracy of 95% when performing VAD of up to three simultaneous speakers and an average error of 10cm when locating such speakers.
87

Audiovisual voice activity detection and localization of simultaneous speech sources / Detecção de atividade de voz e localização de fontes sonoras simultâneas utilizando informações audiovisuais

Minotto, Vicente Peruffo January 2013
Given the trend toward human-machine interfaces that allow increasingly simple forms of interaction, it is natural that research effort goes into techniques that simulate the most conventional means of human communication: speech. In the human auditory system, voice is processed by the brain automatically, effortlessly and effectively, often aided by visual cues such as mouth movement and the location of the speakers. This processing includes two important components that speech-based communication requires: Voice Activity Detection (VAD) and Sound Source Localization (SSL). Consequently, VAD and SSL also serve as mandatory preprocessing tools for high-end Human Computer Interface (HCI) applications, such as automatic speech recognition and speaker identification. However, VAD and SSL remain challenging problems in realistic acoustic scenarios, particularly in the presence of noise, reverberation and multiple simultaneous speakers. In this work we propose approaches for tackling these problems using audiovisual information, for both the single-source and the competing-sources scenarios, exploiting distinct ways of fusing the audio and video modalities. Our work also employs a microphone array for the audio processing, which allows the spatial information of the acoustic signals to be exploited through the state-of-the-art Steered Response Power (SRP) method. As an additional outcome, a very fast GPU implementation of SRP was developed, so that real-time processing is achieved. Our experiments show an average accuracy of 95% when performing VAD for up to three simultaneous speakers, and an average error of 10 cm when locating such speakers.
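As a rough illustration of the Steered Response Power idea named in this abstract, the sketch below steers a time-domain delay-and-sum beamformer to each candidate position and keeps the one with the highest output power. It is not the thesis's GPU implementation: the function names, the speed of sound (343 m/s) and the rounding of delays to whole samples are all assumptions made for this example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the abstract


def steered_power(signals, mic_positions, candidate, sample_rate):
    """Mean delay-and-sum output power for one candidate source position.

    Hypothetical helper: delays are rounded to whole samples for simplicity.
    """
    dists = [math.dist(candidate, mic) for mic in mic_positions]
    nearest = min(dists)
    # Extra propagation delay of each microphone relative to the nearest one.
    lags = [round((d - nearest) / SPEED_OF_SOUND * sample_rate) for d in dists]
    n = min(len(sig) - lag for sig, lag in zip(signals, lags))
    if n <= 0:
        raise ValueError("signals are too short for this array geometry")
    power = 0.0
    for t in range(n):
        # Advance each channel by its lag so that a source located at
        # `candidate` adds up coherently across microphones.
        total = sum(sig[t + lag] for sig, lag in zip(signals, lags))
        power += total * total
    return power / n


def srp_localize(signals, mic_positions, candidates, sample_rate):
    """Return the candidate position with the highest steered power."""
    return max(
        candidates,
        key=lambda c: steered_power(signals, mic_positions, c, sample_rate),
    )
```

Practical SRP implementations usually work in the frequency domain (often with PHAT weighting) and search a dense grid, which is where a GPU helps; the exhaustive loop above only shows the structure of the search.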
88

Sound source localization with data and model uncertainties using the EM and Evidential EM algorithms / Estimation de sources acoustiques avec prise en compte de l'incertitude de propagation

Wang, Xun 09 December 2014
This work addresses the problem of localizing multiple sound sources from deterministic and random signals measured by an array of microphones. The problem is solved in a statistical framework via maximum likelihood estimation. The pressure measured by a microphone is interpreted as a mixture of latent signals emitted by the sources; both the source locations and strengths can then be estimated using an expectation-maximization (EM) algorithm. Two kinds of uncertainty are also considered: the microphone locations and the wave number are assumed to be imprecisely known. These uncertainties are transposed to the data in the belief functions framework. The source locations and strengths can then be estimated using a variant of the EM algorithm for uncertain data, known as the Evidential EM (E2M) algorithm. The first part of this work considers the deterministic signal model without uncertainty. The EM algorithm is used to estimate the source locations and strengths, and the update equations for the model parameters are provided. Experimental results are presented and compared with beamforming and statistically optimized near-field holography (SONAH), which demonstrates the advantage of the EM algorithm.
The second part raises the issue of model uncertainty and shows how the uncertainties on the microphone locations and the wave number can be taken into account at the data level. In this case, the notion of likelihood is extended to uncertain data, and the E2M algorithm is used to solve the sound source estimation problem. Experiments on both simulated and real data show that the EM and E2M algorithms give similar results when the data are certain, but that E2M is more robust in the presence of uncertainty in the model parameters.
The third part considers the case of random signals, in which the amplitude is modeled as a Gaussian random variable. Both the certain and uncertain cases are investigated. In the former case, the EM algorithm is employed to estimate the sound sources. In the latter case, the microphone location and wave number uncertainties are transposed to the data as in the second part, and the source locations and the variances of the random amplitudes are estimated using the E2M algorithm. The results again show the advantage of using a statistical model to estimate the sources, and the value of accounting for uncertainty in the model parameters.
89

Who Spoke What And Where? A Latent Variable Framework For Acoustic Scene Analysis

Sundar, Harshavardhan 26 March 2016
Speech is by far the most natural form of communication between human beings. It is intuitive, expressive and contains information at several cognitive levels. We humans are perceptive to several of these levels: in addition to the content of what is being spoken, we can gather information about the identity of the speaker, the speaker's gender, emotion, location, the language, and so on. This makes speech-based human-machine interaction (HMI) both desirable and challenging, for the same set of reasons. For HMI to be natural for humans, it is imperative that a machine understand the information present in speech, at least at the level of speaker identity, language, location in space, and a summary of what is being spoken. Although one can draw parallels between human-human interaction and HMI, the two differ in their purpose. We interact with a machine mostly to get a task done more efficiently than is possible without the machine. Thus, in HMI the primary goal is typically to control the machine in a specific manner. In this context, it can be argued that HMI with a limited vocabulary of specific commands would suffice for a more efficient use of the machine. In this thesis, we address the problem of "Who spoke what and where" in the context of a machine understanding the identities of the speakers, their locations in space and the keywords they spoke, thus considering three levels of information: speaker identity (who), location (where) and keywords (what). This can be addressed with the help of multiple sensors such as microphones, video cameras, proximity sensors and motion detectors, combining all these modalities. However, we explore the use of microphones only. In practical scenarios, there are often times when multiple people are talking at the same time.
Thus, the goal of this thesis is to detect all the speakers, their keywords, and their locations in mixture signals containing speech from simultaneous speakers. Addressing this problem of "Who spoke what and where" using only microphone signals forms a part of acoustic scene analysis (ASA) of speech-based acoustic events. We divide the problem into two sub-problems: "Who spoke what?" and "Who spoke where?". Each of these problems is cast in a generic latent variable (LV) framework to capture information in speech at different levels. We associate an LV with each of these levels and model the relationship between the levels using conditional dependency. The sub-problem of "who spoke what" is addressed using a single-channel microphone signal, by modeling the mixture signal in terms of the LV mass functions of speaker identity, the conditional mass function of the keyword spoken given the speaker identity, and a speaker-specific keyword model. The LV mass functions are estimated in a maximum likelihood (ML) framework using the Expectation Maximization (EM) algorithm, with Student's-t Mixture Models (tMM) as the speaker-specific keyword models. Motivated by HMI in a home environment, we have created our own database. In mixture signals containing two speakers uttering keywords simultaneously, the proposed framework achieves an accuracy of 82% for detecting both the speakers and their respective keywords. The other sub-problem, "who spoke where?", is addressed in two stages. In the first stage, the enclosure is discretized into sectors. The speakers and the sectors in which they are located are detected with an approach similar to the one employed for "who spoke what", using signals collected from a Uniform Circular Array (UCA). However, in place of speaker-specific keyword models, we use tMM-based speaker models trained on clean speech, along with a simple Delay and Sum Beamformer (DSB).
In the second stage, the speakers are localized within the active sectors using a novel region-constrained localization technique based on time difference of arrival (TDOA). Since the problem being addressed is a multi-label classification task, we use the average Hamming score (accuracy) as the performance metric. Although the proposed approach yields an accuracy of 100% in an anechoic setting for detecting both the speakers and their corresponding sectors in two-speaker mixture signals, the performance degrades to an accuracy of 67% in a reverberant setting with a 60 dB reverberation time (RT60) of 300 ms. To improve the performance under reverberation, prior knowledge of the location of multiple sources is obtained using a novel technique derived from geometrical insights into TDOA estimation. With this prior knowledge, the accuracy of the proposed approach improves to 91%. It is worth noting that the accuracies are computed for mixture signals containing more than 90% overlap of competing speakers. The proposed LV framework offers a convenient methodology for representing information at broad levels. In this thesis, we have shown its use with three different levels. It can be extended to several such levels for a generic analysis of an acoustic scene consisting of broad levels of events. It will turn out that not all levels depend on each other, and hence the LV dependencies can be minimized through independence assumptions, which leads to solving several smaller sub-problems, as we have shown above. The LV framework is also attractive for incorporating prior knowledge about the acoustic setting, which is combined with the evidence from the data to derive information about the presence of an acoustic event. The performance of the framework depends on the choice of stochastic models, which model the likelihood function of the data given the presence of acoustic events.
However, it provides a way to compare and contrast different stochastic models for representing the likelihood function.
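The abstract reports accuracy as an average Hamming score for a multi-label task but does not spell out the formula. The sketch below assumes the definition commonly used for multi-label classification: per mixture, the size of the intersection of true and predicted label sets divided by the size of their union, averaged over mixtures. The function name and the label strings are hypothetical.

```python
def average_hamming_score(true_label_sets, predicted_label_sets):
    """Average of |truth & prediction| / |truth | prediction| over all samples.

    Assumed definition; two empty label sets count as a perfect match.
    """
    if len(true_label_sets) != len(predicted_label_sets):
        raise ValueError("both inputs must cover the same samples")
    scores = []
    for truth, prediction in zip(true_label_sets, predicted_label_sets):
        truth, prediction = set(truth), set(prediction)
        union = truth | prediction
        if not union:
            scores.append(1.0)
        else:
            scores.append(len(truth & prediction) / len(union))
    return sum(scores) / len(scores)


# Hypothetical labels: each one pairs a speaker with a sector.
truth = [{"spk1@sector2", "spk2@sector5"}, {"spk1@sector1", "spk3@sector4"}]
predicted = [{"spk1@sector2", "spk2@sector5"}, {"spk1@sector1", "spk3@sector6"}]
print(average_hamming_score(truth, predicted))
```

Under this definition a mixture scores 1.0 only when every speaker and sector is detected with no spurious labels, so partially correct detections still earn partial credit.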
90

A comparative study on the neurophysiological mechanisms underlying effects of methylphenidate and neurofeedback on inhibitory control in attention deficit hyperactivity disorder

Bluschke, Annet, Friedrich, Julia, Schreiter, Marie Luise, Roessner, Veit, Beste, Christian 28 December 2018
In Attention Deficit Hyperactivity Disorder (AD(H)D), treatments using methylphenidate (MPH) and behavioral interventions like neurofeedback (NF) reflect major therapeutic options. These treatments also ameliorate executive dysfunctions in AD(H)D. However, the mechanisms underlying effects of MPH and NF on executive functions in AD(H)D (e.g. the ability to inhibit prepotent responses) are far from understood. It is particularly unclear whether these interventions affect similar or dissociable neural mechanisms and associated functional neuroanatomical structures. This, however, is important when aiming to further improve these treatments. We compared the neurophysiological mechanisms of MPH and theta/beta NF treatments on inhibitory control on the basis of EEG recordings and source localization analyses. The data show that MPH and theta/beta NF both increase the ability to inhibit prepotent responses to a similar extent. However, the data suggest that MPH and NF target different neurophysiological mechanisms, especially when it comes to functional neuroanatomical structures associated with these effects. Both treatments seem to affect neurophysiological correlates of a ‘braking function’ in medial frontal areas. However, in case of the NF intervention, inferior parietal areas are also involved. This likely reflects the updating and stabilisation of efficient internal representations in order to initiate appropriate actions. No effects were seen in correlates of perceptual and attentional selection processes. Notably, reliable effects were only obtained after accounting for intra-individual variability in the neurophysiological data, which may also explain the diversity of findings in studies on treatment effects in AD(H)D, especially concerning neurofeedback.
