Global ETD Search

81	Robust Single-Channel Speech Enhancement and Speaker Localization in Adverse Environments Mosayyebpour, Saeed 30 April 2014 In speech communication systems such as voice-controlled systems, hands-free mobile telephones and hearing aids, the received signals are degraded by room reverberation and background noise. This degradation can reduce the perceived quality and intelligibility of the speech, and decrease the performance of speech enhancement and source localization. These problems are difficult to solve due to the colored and nonstationary nature of the speech signals, and features of the Room Impulse Response (RIR) such as its long duration and non-minimum phase. In this dissertation, we focus on two topics: speech enhancement and speaker localization in noisy reverberant environments. A two-stage speech enhancement method is presented to suppress both early and late reverberation in noisy speech using only one microphone. It is shown that this method works well even in highly reverberant rooms. Experiments under different acoustic conditions confirm that the proposed blind method is superior in terms of reducing early and late reverberation effects and noise compared to other well-known single-microphone techniques in the literature. Time Difference Of Arrival (TDOA)-based methods usually provide the most accurate source localization in adverse conditions. The key issue for these methods is to accurately estimate the TDOA using the smallest number of microphones. Two robust Time Delay Estimation (TDE) methods are proposed which use the information from only two microphones. One method is based on adaptive inverse filtering, which provides superior performance even in highly reverberant and moderately noisy conditions. It also produces a negligible number of failed estimates, which makes it a reliable method in realistic environments. This method has high computational complexity due to the estimation in the first stage for the first microphone.
As a result, it cannot be applied in time-varying environments and real-time applications. Our second method addresses this problem by introducing two effective preprocessing stages for the conventional Cross Correlation (CC)-based methods. The results obtained in different noisy reverberant conditions, including a real and time-varying environment, demonstrate that the proposed methods are superior to the conventional TDE methods. / Graduate / 2015-04-23 / 0544 / 0984 / saeed.mosayyebpour@gmail.com skewness early and late reverberation noise single-microphone spectral subtraction Time Delay Estimation (TDE) Time Difference of Arrival (TDOA) Adaptive Inverse Filtering (AIF) Generalized Cross-Correlation (GCC) room impulse response (RIR)
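As an illustration of the conventional cross-correlation baseline that entry 81 refers to (not the dissertation's proposed methods), the following minimal sketch estimates the delay between two microphone signals by locating the peak of their cross-correlation. The function names, the `max_lag` search window, and the sampling rate `fs` are assumptions made for this example.

```python
def cross_correlation_delay(x, y, max_lag):
    """Return the lag in samples (within +/- max_lag) that maximizes
    the cross-correlation sum_n x[n] * y[n + lag].

    A positive result means y is a delayed copy of x.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, xn in enumerate(x):
            m = n + lag
            if 0 <= m < len(y):  # skip samples that fall outside y
                score += xn * y[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def tdoa_seconds(x, y, fs, max_lag):
    """Convert the estimated sample lag to a time difference in seconds.

    fs is the sampling rate in Hz (an assumed parameter of this sketch).
    """
    return cross_correlation_delay(x, y, max_lag) / fs
```

Calling `tdoa_seconds(x, y, fs=16000, max_lag=20)` on two equally long recordings returns the estimated time difference of arrival. Plain cross-correlation of this kind is known to degrade under reverberation, which is the weakness the preprocessing stages described in the abstract are meant to address.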
82	Bayesian 3D multiple people tracking using multiple indoor cameras and microphones Lee, Yeongseon 13 May 2009 This thesis presents Bayesian joint audio-visual tracking of the 3D locations of multiple people and a current speaker in a real conference environment. To achieve this objective, it focuses on several different research interests, such as acoustic-feature detection, visual-feature detection, a non-linear Bayesian framework, data association, and sensor fusion. For acoustic-feature detection, time-delay-of-arrival (TDOA) estimation is used for multiple source detection. Localization performance using TDOAs is also analyzed according to different configurations of microphones. For visual-feature detection, Viola-Jones face detection is used to initialize the locations of unknown multiple objects. Then, a corner feature, based on the results from the Viola-Jones face detection, is used for motion detection for robust objects. Simple point-to-line correspondences between multiple cameras using fundamental matrices are used to determine which features are more robust. As a method for data association and sensor fusion, Monte-Carlo JPDAF and a data association with IPPF (DA-IPPF) are implemented in the framework of particle filtering. Three different tracking scenarios (acoustic source tracking, visual source tracking, and joint acoustic-visual source tracking) are presented using the proposed algorithms. Finally, the real-time implementation of this joint acoustic-visual tracking system using a PC, four cameras, and six microphones is addressed in two parts: system implementation and real-time processing. Object tracking Particle filter Data association Sensor fusion Visual feature detection TDOA detection Multiple target tracking Automatic tracking Sensor networks Multisensor data fusion Context-aware computing Acoustic localization
83	Who Spoke What And Where? A Latent Variable Framework For Acoustic Scene Analysis Sundar, Harshavardhan 26 March 2016 Speech is by far the most natural form of communication between human beings. It is intuitive, expressive and contains information at several cognitive levels. We, as humans, are perceptive to several of these cognitive levels of information, as we can gather the information pertaining to the identity of the speaker, the speaker's gender, emotion, location, the language, and so on, in addition to the content of what is being spoken. This makes speech-based human-machine interaction (HMI) both desirable and challenging for the same set of reasons. For HMI to be natural for humans, it is imperative that a machine understands information present in speech, at least at the level of speaker identity, language, location in space, and the summary of what is being spoken. Although one can draw parallels between human-human interaction and HMI, the two differ in their purpose. We, as humans, interact with a machine mostly to get a task done more efficiently than is possible without the machine. Thus, typically in HMI, controlling the machine in a specific manner is the primary goal. In this context, it can be argued that HMI with a limited vocabulary containing specific commands would suffice for a more efficient use of the machine. In this thesis, we address the problem of "Who spoke what and where", in the context of a machine understanding the information pertaining to identities of the speakers, their locations in space and the keywords they spoke, thus considering three levels of information: speaker identity (who), location (where) and keywords (what). This can be addressed with the help of multiple sensors like microphones, video cameras, proximity sensors, motion detectors, etc., and combining all these modalities. However, we explore the use of only microphones to address this issue.
In practical scenarios, multiple people often talk at the same time. Thus, the goal of this thesis is to detect all the speakers, their keywords, and their locations in mixture signals containing speech from simultaneous speakers. Addressing this problem of "Who spoke what and where" using only microphone signals forms a part of acoustic scene analysis (ASA) of speech-based acoustic events. We divide the problem of "who spoke what and where" into two sub-problems: "Who spoke what?" and "Who spoke where?". Each of these problems is cast in a generic latent variable (LV) framework to capture information in speech at different levels. We associate an LV with each of these levels and model the relationship between the levels using conditional dependency. The sub-problem of "who spoke what" is addressed using a single-channel microphone signal, by modeling the mixture signal in terms of LV mass functions of speaker identity, the conditional mass function of the keyword spoken given the speaker identity, and a speaker-specific-keyword model. The LV mass functions are estimated in a Maximum Likelihood (ML) framework using the Expectation Maximization (EM) algorithm, with Student's-t Mixture Models (tMM) as speaker-specific-keyword models. Motivated by HMI in a home environment, we have created our own database. In mixture signals containing two speakers uttering the keywords simultaneously, the proposed framework achieves an accuracy of 82 % for detecting both the speakers and their respective keywords. The other sub-problem of "who spoke where?" is addressed in two stages. In the first stage, the enclosure is discretized into sectors. The speakers and the sectors in which they are located are detected in an approach similar to the one employed for "who spoke what", using signals collected from a Uniform Circular Array (UCA).
However, in place of speaker-specific-keyword models, we use tMM-based speaker models trained on clean speech, along with a simple Delay and Sum Beamformer (DSB). In the second stage, the speakers are localized within the active sectors using a novel region-constrained localization technique based on time difference of arrival (TDOA). Since the problem being addressed is a multi-label classification task, we use the average Hamming score (accuracy) as the performance metric. Although the proposed approach yields an accuracy of 100 % in an anechoic setting for detecting both the speakers and their corresponding sectors in two-speaker mixture signals, the performance degrades to an accuracy of 67 % in a reverberant setting, with a 60 dB reverberation time (RT60) of 300 ms. To improve the performance under reverberation, prior knowledge of the location of multiple sources is derived using a novel technique derived from geometrical insights into TDOA estimation. With this prior knowledge, the accuracy of the proposed approach improves to 91 %. It is worthwhile to note that the accuracies are computed for mixture signals containing more than 90 % overlap of competing speakers. The proposed LV framework offers a convenient methodology to represent information at broad levels. In this thesis, we have shown its use with three different levels. This can be extended to several such levels to be applicable for a generic analysis of the acoustic scene consisting of broad levels of events. It will turn out that not all levels are dependent on each other and hence the LV dependencies can be minimized by independence assumptions, which will lead to solving several smaller sub-problems, as we have shown above. The LV framework is also attractive for incorporating prior knowledge about the acoustic setting, which is combined with the evidence from the data to derive the information about the presence of an acoustic event.
The performance of the framework depends on the choice of stochastic models, which model the likelihood function of the data given the presence of acoustic events. However, it provides a means to compare and contrast the use of different stochastic models for representing the likelihood function. Signal Processing Acoustic Scene Analysis (ASA) Latent Variables Expectation Maximization EM Algorithm Mixture Models Acoustic Source Localization Hyperboloids Student's-t Mixture Models Gaussian Mixture Models Robust Mixture Modeling Time Difference of Arrival (TDOA) Communication Engineering
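Entry 83 reports its results as an average Hamming score for a multi-label task. The sketch below shows one common definition of that metric (per-sample intersection over union of the true and predicted label sets, averaged over samples). Whether the thesis uses exactly this definition is an assumption, and the function name and label encoding are hypothetical.

```python
def hamming_score(true_labels, predicted_labels):
    """Average over samples of |true & predicted| / |true | predicted|.

    Each element of the two arguments is an iterable of hashable labels,
    e.g. (speaker, sector) pairs. Two empty label sets count as a perfect match.
    """
    scores = []
    for truth, pred in zip(true_labels, predicted_labels):
        truth, pred = set(truth), set(pred)
        union = truth | pred
        scores.append(len(truth & pred) / len(union) if union else 1.0)
    return sum(scores) / len(scores)
```

With this definition, a mixture in which one of two (speaker, sector) pairs is detected correctly and the other is assigned to the wrong sector scores 1/3 for that sample, because the union then contains three distinct labels and only one of them is shared.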
84	Lokalizace pohyblivých akustických zdrojů / Localization of moving acoustical sources Bezdíček, Martin January 2010 This master's thesis focuses on localizing static acoustic sources (the assignment of the semester project) and moving acoustic sources (the assignment of the master's thesis) using microphone arrays. The first part deals with common problems of localization. It then describes the types of microphone arrays, the simplifying assumptions that delimit the problem, and general information about room acoustics. The next part presents, step by step, methods for localizing acoustic sources. The practical part uses two classes of algorithms: Steered-Beamformer-Based Locators and TDOA-Based Locators. The last part presents the results of these algorithms.
85	Mikrofonová pole pro prostorovou separaci akustických signálů / Microphone arrays for spatial separation of acoustic signals Grobelný, Petr January 2011 The goal of this master’s thesis is to explore the possibilities of multichannel localization of acoustic signal sources and their subsequent application to the localization and separation of real signals using beamforming methods. Two beamforming methods were selected, namely Delay and Sum and Constant Directivity Beamforming - Circular Arrays, and were applied to real-environment signals using two microphone array geometries: ULA (Uniform Linear Array) and UCA (Uniform Circular Array).
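To make the Delay and Sum method named in entry 85 concrete, here is a minimal sketch for a uniform linear array under a far-field assumption. It is not the thesis's implementation: the speed of sound, the rounding of steering delays to whole samples, and all function and parameter names are assumptions chosen for this example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def ula_steering_delays(num_mics, spacing_m, angle_deg, fs):
    """Arrival delay of each microphone relative to mic 0, in whole samples.

    Assumes a far-field source at angle_deg from broadside of a uniform
    linear array with element spacing spacing_m, sampled at fs Hz.
    """
    tau = spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND
    return [round(m * tau * fs) for m in range(num_mics)]


def delay_and_sum(channels, delays):
    """Advance each channel by its steering delay and average the channels.

    channels is a list of equally long sample lists; samples shifted
    outside the recording are treated as zero.
    """
    n = len(channels[0])
    out = []
    for t in range(n):
        acc = 0.0
        for ch, d in zip(channels, delays):
            idx = t + d
            if 0 <= idx < n:
                acc += ch[idx]
        out.append(acc / len(channels))
    return out
```

Steering toward the true source direction aligns the channels so they add coherently, while other directions add incoherently and are attenuated. A practical system would typically use fractional delays or frequency-domain phase shifts instead of the integer rounding used here.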
86	Detekce přítomnosti osob pomocí IoT senzorů / Room Occupancy Detection with IoT Sensors Kolarčík, Tomáš January 2021 The aim of this work was to create a module for the home automation tool Home Assistant. The module is able to determine which room is occupied and to estimate the position of people inside the room more accurately. GPS location cannot be used for this purpose because it is inaccurate inside buildings, so an indoor localization technique needs to be used. A solution based on Bluetooth Low Energy wireless technology was chosen. The localization technique is the fingerprinting method, which estimates the position from the signal strengths measured at a point in space by comparing them with a database of such points using machine learning. The system can be supplemented with motion sensors that ensure a quick response when entering the room. This system can be deployed within a house, apartment or small to medium-sized company to determine the position of people in the building and can serve as a very powerful element of home automation.
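The fingerprinting idea in entry 86 can be sketched as a nearest-neighbour lookup: a measured vector of signal strengths is compared with stored reference vectors, and the room is decided by majority vote among the closest ones. The abstract does not say which machine-learning model the thesis uses, so k-nearest neighbours is an assumed stand-in here, and the RSSI values, room names, and function name are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical fingerprint database: RSSI in dBm from three BLE receivers,
# each vector labelled with the room in which it was recorded.
FINGERPRINTS = [
    ((-45, -80, -90), "kitchen"),
    ((-48, -78, -88), "kitchen"),
    ((-82, -50, -75), "living room"),
    ((-85, -47, -79), "living room"),
    ((-90, -76, -44), "bedroom"),
    ((-88, -79, -49), "bedroom"),
]


def knn_room(fingerprints, rssi, k=3):
    """Return the majority room label among the k fingerprints whose
    RSSI vectors are closest (Euclidean distance) to the measurement."""
    nearest = sorted(fingerprints, key=lambda fp: math.dist(fp[0], rssi))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

For example, `knn_room(FINGERPRINTS, (-47, -79, -89))` compares a new measurement against the six stored vectors. In a real deployment the database would be collected during a calibration walk through the building.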
87	The Frequency Monitor Network (FNET) Design and Situation Awareness Algorithm Development Zuo, Jian 24 April 2008 Wide Area Measurements (WAMs) have been widely used in the energy management systems (EMS) of power systems for monitoring, operation and control. In recent years, the advent of synchronized Phasor Measurement Units (PMUs) has added another dimension to the field of wide-area measurement. However, the high cost of the PMU, which includes manufacturing and deployment costs, is a hurdle to the wide use of PMUs in power systems. Unlike traditional PMUs, the frequency monitoring network (FNET) developed by the Virginia Tech Power IT lab is an Internet-based, GPS-synchronized, wide-area frequency monitoring network deployed at the distribution level, providing a low-cost and easily deployable WAMs solution. In this dissertation, the research work can be categorized into two parts: FNET Design and Situation Awareness Algorithm Development. / Ph. D. Event location estimation FNET Wide Area Measurement and Control (WAMC) Swing Equation Time Difference of Arrival (TDOA) Single Machine Infinite Bus (SMIB)
88	Mikrofonní pole malých rozměrů pro odhad směru přicházejícího zvuku / Small-Size Microphone Array for Estimation of Direction of Arrival of Sound Kubišta, Ladislav January 2020 This thesis describes detection of the direction of arriving sound with a small-size microphone array. The work is based on analyzing methods of time delay estimation, energy decay, and signal phase difference. It focuses mainly on finding the angle of arrival from small time differences. Measurement results, both for programmatically generated sound and for sound recorded in laboratory conditions and in a real environment, are given in the conclusion. All calculations were done in MATLAB.
89	A Wide-Area Perspective on Power System Operation and Dynamics Gardner, Robert Matthew 23 April 2008 Classically, wide-area synchronized power system monitoring has been an expensive task requiring significant investment in utility communications infrastructures for the service of relatively few costly sensors. The purpose of this research is to demonstrate the viability of power system monitoring from very low voltage levels (120 V). Challenging the accepted norms in power system monitoring, the document will present the use of inexpensive GPS time synchronized sensors in mass numbers at the distribution level. In the past, such low level monitoring has been overlooked due to a perceived imbalance between the required investment and the usefulness of the resulting deluge of information. However, distribution level monitoring offers several advantages over bulk transmission system monitoring. First, practically everyone with access to electricity also has a measurement port into the electric power system. Second, internet access and GPS availability have become pedestrian commodities providing a communications and synchronization infrastructure for the transmission of low-voltage measurements. Third, these ubiquitous measurement points exist in an interconnected fashion irrespective of utility boundaries. This work offers insight into which parameters are meaningful to monitor at the distribution level and provides applications that add unprecedented value to the data extracted from this level. System models comprising the entire Eastern Interconnection are exploited in conjunction with a bounty of distribution level measurement data for the development of wide-area disturbance detection, classification, analysis, and location routines.
The main contributions of this work are fivefold: the introduction of a novel power system disturbance detection algorithm; the development of a power system oscillation damping analysis methodology; the development of several parametric and non-parametric power system disturbance location methods; new methods of power system phenomena visualization; and the proposal and mapping of an online power system event reporting scheme. / Ph. D. TDOA FNET FDR GPS Wide-Area monitoring wide-area measurements power system event power system load shedding generation trip eastern interconnection wams ems nerc ercot wecc parzen window interconnection islanding PMU half-plane method least squares event trigger generation-load mismatch electromechanical wave wave propagation time delay of arrival oscillation trigger modal analysis electric grid transmission network transmission system hypocenter frequency matrix pencil mahalanobis distance
90	Brave New World Reloaded: Advocating for Basic Constitutional Search Protections to Apply to Cell Phones from Eavesdropping and Tracking by Government and Corporate Entities Berrios-Ayala, Mark 01 December 2013 Imagine a world where someone’s personal information is constantly compromised, where federal government entities, AKA Big Brother, always know what anyone is Googling, who an individual is texting, and their emoticons on Twitter. Government entities have been doing this for years; they never cared if they were breaking the law or their moral compass of human dignity. Every day the Federal government blatantly siphons data with programs from the original ECHELON to the new series like PRISM and Xkeyscore so they can keep their tabs on issues that are none of their business; namely, the personal lives of millions. Our allies are taking note; some are learning our bad habits, from Government Communications Headquarters’ (GCHQ) mass shadowing sharing plan to America’s Russian inspiration, SORM. Some countries are following the United States’ poster child pose of a Brave New World-like order of global events. Others like Germany are showing their resolve in their disdain for the rise of tyranny. Soon, these newfound surveillance troubles will test the resolve of the American Constitution and its nation’s strong love and tradition of liberty. Courts are currently at work to resolve how current concepts of liberty and privacy apply to the current conditions facing the privacy of society. It remains to be determined how liberty will be affected as well; liberty for the United States of America, for the European Union, the Russian Federation and for the people of the World in regards to the extent of privacy in today’s blurred privacy expectations. Legal Studies

Search results