  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
51

Réseaux à grand nombre de microphones : applicabilité et mise en œuvre / Implementation and applicability of very large microphone arrays

Vanwynsberghe, Charles 12 December 2016
Digital MEMS microphones have recently opened new perspectives for the design of large-aperture, massively multichannel acoustic acquisition systems. Such systems can localize acoustic sources with good performance, but they raise two new problems. First, the array produces a high data rate that must be processed in a reasonable time. Second, when a large array is deployed in situ, retrieving the positions of its many microphones becomes a challenging task. This thesis proposes methods that address both problems. The first part describes the acquisition system developed during the thesis. It shows that the characteristics of MEMS microphones are suitable for array processing applications. Real-time processing of the channel signals is then achieved with a parallel GPU implementation, which is one answer to the data-rate problem. The result is a real-time acoustic imaging tool for wide-band sources that provides a dynamic diagnosis of the sound scene over an arbitrary duration. The second part presents several robust geometric calibration methods that retrieve the microphone positions from the array's acoustic signals alone. In real-life conditions, state-of-the-art methods are ineffective for large arrays with many microphones. The proposed techniques favor the robustness of the calibration process and cover the range of real acoustic environments, from free field to reverberant field. Several experimental campaigns demonstrate their effectiveness.
52

Aplikace mikrofonního pole / Microphone array application

Toman, Vít January 2013
This master's thesis describes the problems of capturing spatial audio signals with a microphone array. The basic beamforming method (Delay and Sum) is characterized for the chosen microphone array design. Specific problems of audio detection and of digital processing of the converted audio signals are characterized, and some ways of solving them are outlined. The features and limitations of the chosen ARM processor for digital processing of multiple audio signals are described, in particular those of its internal A/D converter from the perspective of beamforming.
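The Delay and Sum method named in the abstract can be sketched in a few lines. This is a minimal illustration and not the thesis's ARM implementation: it assumes a far-field plane wave, a linear array, integer-sample delays, and a sign convention in which a microphone at coordinate x hears the wavefront x·sin(θ)/c later than a microphone at the origin. The function names, the array layout and the speed of sound value are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def steering_delays(mic_positions, angle_deg, fs):
    """Integer sample delays for a far-field plane wave on a linear array.

    mic_positions: microphone x-coordinates in metres (hypothetical layout).
    angle_deg: arrival angle measured from broadside (assumed convention).
    fs: sampling rate in Hz.
    """
    sin_theta = math.sin(math.radians(angle_deg))
    raw = [x * sin_theta / SPEED_OF_SOUND * fs for x in mic_positions]
    earliest = min(raw)
    # Express every delay relative to the microphone that is reached first.
    return [round(d - earliest) for d in raw]


def delay_and_sum(channels, delays):
    """Advance each channel by its delay, then average the aligned samples."""
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(usable)
    ]
```

Rounding to whole samples is the simplest choice and ties the angular resolution to the sampling rate, which is one reason the A/D converter matters for beamforming; fractional-delay filters would lift that limit at extra processing cost.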
53

Capturing and Modeling a Three-Dimensional Stationary Noise Source Directivity Pattern with a Dynamic Array in the Near Field

Mieskoski, Randy January 2013
No description available.
54

Audiovisual voice activity detection and localization of simultaneous speech sources / Detecção de atividade de voz e localização de fontes sonoras simultâneas utilizando informações audiovisuais

Minotto, Vicente Peruffo January 2013
Given the trend toward human-machine interfaces that allow ever simpler ways of interacting, it is natural that research effort goes into techniques that seek to simulate the most conventional means of communication humans use: speech. In the human auditory system, voice is processed automatically by the brain in an effortless and effective way, often aided by visual cues such as mouth movement and the location of the speakers. This processing includes two important components that speech-based communication requires: Voice Activity Detection (VAD) and Sound Source Localization (SSL). Consequently, VAD and SSL also serve as mandatory preprocessing tools for high-end Human Computer Interface (HCI) applications, such as automatic speech recognition and speaker identification. However, VAD and SSL remain challenging problems in realistic acoustic scenarios, particularly in the presence of noise, reverberation and multiple simultaneous speakers. This work proposes approaches for tackling these problems using audiovisual information, for both the single-source and the competing-sources scenarios, exploring different ways of fusing the audio and video modalities. It also employs a microphone array for the audio processing, which allows the spatial information of the acoustic signals to be exploited through the state-of-the-art Steered Response Power (SRP) method. As an additional outcome, a very fast GPU implementation of SRP was developed, so that real-time processing is achieved. The experiments show an average accuracy of 95% when performing VAD for up to three simultaneous speakers and an average error of 10 cm when locating those speakers.
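The idea behind Steered Response Power can be illustrated with a small sketch: steer a delay-and-sum beamformer to each candidate position and keep the position whose output carries the most power. What follows is a plain time-domain, single-threaded version written under stated assumptions (integer-sample delays, no PHAT weighting, a short list of candidate points). It is not the GPU implementation developed in the thesis, and the function name and parameters are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def srp_localize(channels, mic_positions, candidates, fs):
    """Return the candidate point whose steered output has the most power.

    channels: one list of samples per microphone.
    mic_positions, candidates: coordinate tuples in metres.
    fs: sampling rate in Hz.
    """
    best_point, best_power = None, -1.0
    for point in candidates:
        # Propagation delay (in samples) from the candidate to each microphone.
        delays = [
            round(math.dist(point, mic) / SPEED_OF_SOUND * fs)
            for mic in mic_positions
        ]
        shift = min(delays)
        delays = [d - shift for d in delays]
        usable = min(len(ch) - d for ch, d in zip(channels, delays))
        # Power of the delay-and-sum output when steered to this candidate.
        power = 0.0
        for n in range(usable):
            steered = sum(ch[n + d] for ch, d in zip(channels, delays))
            power += steered * steered
        if power > best_power:
            best_point, best_power = point, power
    return best_point
```

Each candidate is evaluated independently of the others, which is why the search maps naturally onto parallel hardware such as a GPU.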
55

Caractérisation du rayonnement acoustique d'un rail à l'aide d'un réseau de microphones / Spatial characterization of the wheel/rail contact noise by a multi-sensors method

Faure, Baldrik 22 September 2011
In France, railway transport has been boosted by the expansion of the high-speed rail network and the return of tram networks to many cities. In this context, reducing noise pollution is a crucial issue for its development. To act directly at the source, the sources responsible for this nuisance at train pass-by must be precisely identified and studied. Among the possible approaches, microphone arrays and the associated signal processing techniques are particularly suited to characterizing omnidirectional, uncorrelated moving point sources. For speeds below 300 km/h, rolling noise is the main source of railway noise. It arises from the acoustic radiation of elements such as the wheels, the rail and the sleepers. The rail, which dominates rolling noise at mid-frequencies (from approximately 500 Hz to 1000 Hz), is an extended coherent source for which classical array processing methods are inappropriate. The characterization method proposed in this thesis is an inverse parametric optimization method that uses the acoustic signals measured by a microphone array. The unknown parameters of a vibro-acoustic model are estimated by minimizing a least-squares criterion on the measured and modelled spectral matrices at the array. In this model, the rail is treated as a cylindrical monopole whose lengthwise amplitude distribution is tied to that of the vibratory velocity. The models proposed for computing this velocity feature vibration waves propagating in the rail on both sides of each excitation point. Each wave is characterized by an amplitude at the excitation point, a real structural wavenumber and a decay rate. These parameters are estimated by minimizing the criterion and are then used to reconstruct the acoustic field radiated by the rail. First, simulations are carried out to assess the performance of the proposed method for vertical point excitations. In particular, its robustness is tested against additive noise and against uncertainties in the model parameters that are assumed to be known. The effect of using simplified models is also investigated. The results show that the method is efficient and robust for estimating the amplitudes of the excitations nearest to the array. The other parameters, in contrast, are better estimated when the array is shifted away from the contact points. The wavenumber is generally well estimated over the entire frequency range studied, and when the decay rate is low, a conventional plane-wave beamforming technique may be sufficient. For the decay rate, the efficiency of the method is limited by the low sensitivity of the criterion. Finally, some of the simulation results are verified by in situ measurements. The vibratory model is first validated for vertical flexural waves by exciting an experimental rail with an impact hammer. The parametric optimization method is then tested by exciting the rail vertically with a modal shaker. The main simulation results are reproduced, and particular behaviors due to the presence of several waves in the rail are observed, opening the way to a generalization of the vibratory model.
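The least-squares criterion described in this abstract can be written schematically as follows. The notation is an assumption made for illustration and is not taken from the thesis: Γ^meas and Γ^mod stand for the measured and modelled spectral matrices of an M-microphone array at frequency f, θ gathers the unknown parameters of the N waves, x_n is the position of the n-th excitation, and the exponential wave shape is only one plausible reading of "an amplitude, a real structural wavenumber and a decay rate".

```latex
\hat{\theta}(f) = \arg\min_{\theta} \sum_{i=1}^{M} \sum_{j=1}^{M}
  \left| \Gamma^{\mathrm{meas}}_{ij}(f) - \Gamma^{\mathrm{mod}}_{ij}(f;\theta) \right|^{2},
\qquad
\theta = \{A_n, k_n, \beta_n\}_{n=1}^{N}

v(x) = \sum_{n=1}^{N} A_n \, e^{-(\beta_n + \mathrm{j} k_n)\,\lvert x - x_n \rvert}
```

Under this reading, a small decay rate β_n makes the criterion vary only slightly with β_n over the length of rail seen by the array, which is consistent with the low sensitivity reported for the decay rate estimate.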
58

Tvarování přijímací charakteristiky mikrofonních polí / Beamforming using microphone arrays

Bartoň, Zdeněk January 2010
The aim of this master's thesis is to summarize the theory of beamforming methods for microphone arrays and to verify their functionality. The work begins by simulating different variants of uniform and nonuniform linear microphone arrays and of circular arrays. The results are verified by practical measurement in ideal conditions. The thesis then focuses on implementing the DAS (Delay And Sum), SAB (Sub Array Beamforming), CDB (Constant Directivity Beamforming) and CDB-CA (CDB-Circular Arrays) beamformers, including theoretical and practical verification of their functionality in ideal conditions. Finally, all the beamforming methods are compared with each other in terms of SNR (Signal to Noise Ratio) and directivity.
59

Large-scale structures and noise generation in high-speed jets

Hileman, James Isaac 10 March 2004
No description available.
60

Approach for frequency response-calibration for microphone arrays / Metod för kalibrering av frekvenssvar för mikrofonarrayer

Drotz, Jacob January 2023
Matched frequency responses are a fundamental starting point for a variety of implementations for microphone arrays. In this report, two methods for frequency response calibration of a pre-assembled microphone array are presented and evaluated. This is done by extracting the deviation in the frequency response of each microphone relative to a selected reference microphone, using a swept sine as the stimulus signal and an inverse filter. The swept sine covers all frequencies within the bandwidth of human speech, which allows the full frequency response of every microphone to be measured from a single recording. The deviation obtained in this way represents the scaling factor with which each microphone must be calibrated to match the reference microphone. Applying the scaling factors to the recorded stimulus signal shows an improvement for both implemented methods, one of which matches the frequency responses of the microphones with high accuracy. Once the scaling factors of the microphones have been obtained, they can be used to calibrate other recorded signals. This yields only a minor improvement in matching the frequency responses, because the differences in frequency response between the microphones turn out to be signal-dependent and vary between recordings. These differences depend on sound propagation in the room, the design of the array, the loudspeaker and the acoustic frequency dispersion that occurs with sound waves. This makes it difficult to calibrate the frequency responses of the microphones without appropriate equipment, because the response of the microphones is noticeably affected by these other factors. Proposals to address these problems are discussed in the report as future work.
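The notion of a per-frequency scaling factor relative to a reference microphone can be sketched as below. This is a simplified illustration under stated assumptions, not either of the report's two methods: it takes a plain ratio of DFT magnitudes and skips the swept-sine inverse-filter step, uses a naive DFT that only suits short signals, and the function names and the `floor` threshold are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short examples)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


def scaling_factors(reference, microphone, floor=1e-9):
    """Per-bin gain that maps the microphone's magnitude onto the reference.

    Bins where the microphone has essentially no energy keep a gain of 1.0,
    because a ratio there would only amplify numerical noise.
    """
    ref_mag = magnitude_spectrum(reference)
    mic_mag = magnitude_spectrum(microphone)
    return [r / m if m > floor else 1.0 for r, m in zip(ref_mag, mic_mag)]
```

For instance, if a microphone records the same tone as the reference at half the amplitude, the factor at that tone's frequency bin comes out at about 2. Factors derived from one recording will not necessarily transfer to another, which matches the signal dependence the abstract reports.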
