  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Annulation d'écho acoustique pour terminaux mobiles à un et deux microphones / Acoustic echo cancellation for single- and dual-microphone devices : application to mobile devices

Yemdji Tchassi, Christelle 18 June 2013
Mobile terminals are arguably the most popular telecommunications devices of the present day. With the expectation of use anytime, anywhere, mobile terminals are increasingly used in adverse scenarios such as in hands-free mode and in noisy environments. Speech quality is commonly degraded in such cases by the presence of acoustic echo and ambient noise. In consequence, mobile terminals are generally equipped with speech signal processing algorithms in order to ensure acceptable speech quality. Classical approaches to speech signal processing involve independent acoustic echo cancellation, noise suppression and post-filtering. While performance is generally acceptable, degradations are noticeable at low signal-to-echo ratios (hands-free scenarios) and computational complexity can be high. Furthermore, while mobile terminals are increasingly equipped with multiple microphones, they are generally exploited for noise suppression alone, even if there is natural potential for combined noise suppression and echo control. This thesis presents a new combination and synchronization architecture for acoustic echo cancellation for single- and dual-microphone devices. It moves beyond the current state of the art by reducing computational complexity while improving performance in low signal-to-echo conditions. The thesis also presents the first dual-microphone solution to double-talk detection. These contributions pave the way for further applied research in speech processing; the novel architecture is readily extendible to multiple-microphone scenarios while respecting levels of computational efficiency required for integration in current mobile terminals. / Les téléphones mobiles sont sans aucun doute les terminaux de télécommunication les plus populaires de nos jours. Le besoin de mobilité étant toujours croissant, les téléphones mobiles sont parfois utilisés dans des conditions très adverses : mains-libres ou environnements bruités.
Dans ces conditions, la qualité de la parole est perturbée par la présence de l'écho acoustique et du bruit ambiant. Les terminaux sont généralement équipés d'algorithmes de traitement de la parole afin de garantir une qualité de la parole acceptable. Composés d’un annuleur d’écho adaptatif, d’une réduction de bruit et d’une suppression d’écho résiduel, les chaines de traitement de parole classiques fournissent en général une qualité de la parole acceptable moyennant une complexité de calcul importante. Néanmoins, lorsque le rapport signal à écho est faible on peut noter des dégradations du signal utile. Les terminaux mobiles récents sont de plus en plus équipés de plusieurs microphones qui ne sont alors utilisés que pour la réduction de bruit bien qu’ils présentent un indéniable intérêt pour les systèmes de réduction conjointe de bruit et d’écho résiduel. Cette thèse présente une nouvelle architecture combinée d’annulation d’écho pour terminaux mobiles à un ou deux microphones. L’architecture proposée réduit efficacement la complexité de calcul tout en améliorant la qualité de la parole dans les scénarios défavorables. Nous présentons également la première solution bi-microphones de détection de double parole. Enfin, nos techniques bi-microphones peuvent facilement être appliquées aux terminaux multi-microphones et tout en ayant une capacité calculatoire acceptable pour les téléphones mobiles.
12

nsAnalyser : Speech quality testing application for telephone service / nsAnalyser : Talkvalitetstestapplikation för telefonitjänst

Stahl, Alexander January 2013
This degree project was carried out in collaboration with Nordicstation. The task was to develop a testing application for a telephone survey service developed in-house, which relies on third-party software. This third-party software proved unreliable at higher loads. The purpose of the application is to analyse the speech quality of clients connected to the service. This report introduces the speech quality algorithms Perceptual Evaluation of Speech Quality (PESQ) and Single Sided Speech Quality Measure (3SQM). It also describes the methods used to develop the application. The final chapters cover the testing of the telephone service. The primary result of the testing was that the telephone service cannot acceptably handle 80+ clients; the recommendation to Nordicstation is to cap the number of simultaneously connected clients at 80 or to find an alternative to the third-party software currently in use. / Detta examensarbete har gjorts i samarbete med Nordicstation. Projektets uppgift var att utveckla ett test program för att testa en egenutvecklad telefonundersökning-tjänst, baserad på tredjeparts mjukvara. Denna tredjeparts mjukvara visade sig vara opålitlig vid högre belastning. Syftet med programmet är att analysera samtals kvalitéten på de klienter som är anslutna till tjänsten. Denna rapport ger en introduktion till ljudkvalitetsalgoritmer så som Perceptual Evaluation of Speech Quality (PESQ) och Single Sided Speech Quality Measure (3SQM). Rapporten går även igenom de metoder som använts för att utveckla programmet. De sista kapitlen i denna rapport är om själva testningen av telefonitjänsten. Det primära resultatet av testningen var att telefontjänsten inte kan hantera 80+ klienter acceptabelt och rekommendationer till Nordicstation är att sätta ett tak på maximalt parallellt anslutna klienter till 80 eller hitta ett alternativ till den tredjeparts mjukvara som nu används.
13

Operating system based perceptual evaluation of call quality in radio telecommunications networks : development of call quality assessment at mobile terminals using the Symbian operating system, comparison with traditional approaches and proposals for a tariff regime relating call charging to perceived speech quality

Aburas, Akram January 2012
Call quality has been crucial from the inception of telecommunication networks. Operators need to monitor call quality from the end-user's perspective, in order to retain subscribers and reduce subscriber 'churn'. Operators worry not only about call quality and interconnect revenue loss, but also about network connectivity issues in areas where mobile network gateways are prevalent. Bandwidth quality as experienced by the end-user is equally important in helping operators to reduce churn. The parameters that network operators use to improve call quality are mainly from the end-user's perspective. These parameters are usually ASR (answer seizure ratio), PDD (post-dial delay), NER (network efficiency ratio), the number of calls for which these parameters have been analyzed and successful calls. Operators use these parameters to evaluate and optimize the network to meet their quality requirements. Analysis of speech quality is a major arena for research. Traditionally, users' perception of speech quality has been measured offline using subjective listening tests. Such tests are, however, slow, tedious and costly. An alternative method is therefore needed; one that can be automatically computed on the subscriber's handset, be available to the operator as well as to subscribers and, at the same time, provide results that are comparable with conventional subjective scores. QMeter®, a set of tools for signal and bandwidth measurement that have been developed bearing in mind all the parameters that influence call and bandwidth quality experienced by the end-user, addresses these issues and, additionally, facilitates dynamic tariff propositions which enhance the credibility of the operator. This research focuses on call quality parameters from the end-user's perspective. The call parameters used in the research are signal strength, successful call rate, normal drop call rate, and hand-over drop rate.
Signal strength is measured for every five milliseconds of an active call and average signal strength is calculated for each successful call. The successful call rate, normal drop rate and hand-over drop rate are used to achieve a measurement of the overall call quality. Call quality with respect to bundles of 10 calls is proposed. An attempt is made to visualize these parameters for better understanding of where the quality is bad, good and excellent. This will help operators, as well as user groups, to measure quality and coverage. Operators boast about their bandwidth but in reality, to know the locations where speed has to be improved, they need a tool that can effectively measure speed from the end-user's perspective. BM (bandwidth meter), a tool developed as a part of this research, measures the average speed of data sessions and stores the information for analysis at different locations. To address issues of quality in the subscriber segment, this research proposes the varying of tariffs based on call and bandwidth quality. Call charging based on call quality as perceived by the end-user is proposed, both to satisfy subscribers and help operators to improve customer satisfaction and increase average revenue per user. Tariff redemption procedures are put forward for bundles of 10 calls and 10 data sessions. In addition to the varying of tariffs, quality escalation processes are proposed. Deploying such tools on selected or random samples of users will result in substantial improvement in user loyalty which, in turn, will bring operational and economic advantages.
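The per-call averaging and the bundle-of-10 measurement described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's QMeter implementation: the outcome labels, the function names, and the choice to drop an incomplete trailing bundle are all hypothetical, and the abstract does not say how the three rates are combined into a single quality score.

```python
from collections import Counter
from typing import Dict, List

BUNDLE_SIZE = 10  # the abstract proposes bundles of 10 calls

# Hypothetical outcome labels; the abstract only names the three rates.
OUTCOMES = ("success", "normal_drop", "handover_drop")


def mean_signal_strength(samples: List[float]) -> float:
    """Average the signal-strength samples taken during one call.

    A plain arithmetic mean is an assumption; the abstract does not
    specify the unit or the averaging method.
    """
    if not samples:
        raise ValueError("a call needs at least one sample")
    return sum(samples) / len(samples)


def bundle_rates(outcomes: List[str]) -> List[Dict[str, float]]:
    """Return the three call rates for each complete bundle of calls.

    Calls left over after the last complete bundle are ignored
    (an assumption made for this sketch).
    """
    rates = []
    for start in range(0, len(outcomes) - BUNDLE_SIZE + 1, BUNDLE_SIZE):
        counts = Counter(outcomes[start:start + BUNDLE_SIZE])
        rates.append({name: counts[name] / BUNDLE_SIZE for name in OUTCOMES})
    return rates
```

For example, thirteen logged calls yield one complete bundle; the remaining three calls would wait until the next bundle fills.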
14

Multisensor Segmentation-based Noise Suppression for Intelligibility Improvement in MELP Coders

Demiroglu, Cenk 18 January 2006
This thesis investigates the use of an auxiliary sensor, the GEMS device, for improving the quality of noisy speech and designing noise preprocessors to MELP speech coders. Use of auxiliary sensors for noise-robust ASR applications is also investigated to develop speech enhancement algorithms that use acoustic-phonetic properties of the speech signal. A Bayesian risk minimization framework is developed that can incorporate the acoustic-phonetic properties of speech sounds and knowledge of human auditory perception into the speech enhancement framework. Two noise suppression systems are presented using the ideas developed in the mathematical framework. In the first system, an aharmonic comb filter is proposed for voiced speech where low-energy frequencies are severely suppressed while high-energy frequencies are suppressed mildly. The proposed system outperformed an MMSE estimator in subjective listening tests and DRT intelligibility test for MELP-coded noisy speech. The effect of aharmonic comb filtering on the linear predictive coding (LPC) parameters is analyzed using a missing data approach. Suppressing the low-energy frequencies without any modification of the high-energy frequencies is shown to improve the LPC spectrum using the Itakura-Saito distance measure. The second system combines the aharmonic comb filter with the acoustic-phonetic properties of speech to improve the intelligibility of the MELP-coded noisy speech. Noisy speech signal is segmented into broad level sound classes using a multi-sensor automatic segmentation/classification tool, and each sound class is enhanced differently based on its acoustic-phonetic properties. The proposed system is shown to outperform both the MELPe noise preprocessor and the aharmonic comb filter in intelligibility tests when used in concatenation with the MELP coder. 
Since the second noise suppression system uses an automatic segmentation/classification algorithm, exploiting the GEMS signal in an automatic segmentation/classification task is also addressed using an ASR approach. Current ASR engines can segment and classify speech utterances in a single pass; however, they are sensitive to ambient noise. Features that are extracted from the GEMS signal can be fused with the noisy MFCC features to improve the noise-robustness of the ASR system. In the first phase, a voicing feature is extracted from the clean speech signal and fused with the MFCC features. The actual GEMS signal could not be used in this phase because of insufficient sensor data to train the ASR system. Tests are done using the Aurora2 noisy digits database. The speech-based voicing feature is found to be effective at around 10 dB but, below 10 dB, the effectiveness rapidly drops with decreasing SNR because of the severe distortions in the speech-based features at these SNRs. Hence, a novel system is proposed that treats the MFCC features in a speech frame as missing data if the global SNR is below 10 dB and the speech frame is unvoiced. If the global SNR is above 10 dB or the speech frame is voiced, both MFCC features and voicing feature are used. The proposed system is shown to outperform some of the popular noise-robust techniques at all SNRs. In the second phase, a new isolated monosyllable database is prepared that contains both speech and GEMS data. ASR experiments conducted for clean speech showed that the GEMS-based feature, when fused with the MFCC features, decreases the performance. The reason for this unexpected result is found to be partly related to some of the GEMS data that is severely noisy. The non-acoustic sensor noise exists in all GEMS data but the severe noise happens rarely. A missing data technique is proposed to alleviate the effects of severely noisy sensor data.
The GEMS-based feature is treated as missing data when it is detected to be severely noisy. The combined features are shown to outperform the MFCC features for clean speech when the missing data technique is applied.
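The frame-level rule stated in this abstract (MFCC features are treated as missing when the global SNR is below 10 dB and the frame is unvoiced) can be illustrated with a short sketch. It is a reading of the abstract only, not the author's code: the `Frame` type and `select_features` name are hypothetical, `None` is used here to stand for "missing data", an SNR of exactly 10 dB is assumed to count as "above", and keeping the voicing feature when the MFCCs are dropped is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

SNR_THRESHOLD_DB = 10.0  # threshold quoted in the abstract


@dataclass
class Frame:
    """One speech frame (hypothetical container for this sketch)."""
    mfcc: List[float]
    voicing: float
    is_voiced: bool


def select_features(
    frame: Frame, global_snr_db: float
) -> Tuple[Optional[List[float]], float]:
    """Return (mfcc, voicing), with mfcc set to None when it is missing.

    MFCCs are marked missing only when the global SNR is below the
    threshold AND the frame is unvoiced; otherwise both features are used.
    """
    if global_snr_db < SNR_THRESHOLD_DB and not frame.is_voiced:
        return None, frame.voicing
    return frame.mfcc, frame.voicing
```

A downstream recognizer would then need a missing-data strategy (for instance marginalizing over the absent features); the abstract does not say which one is used.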
15

Τεχνικές προσανατολισμένης λήψης για μη στάσιμα ακουστικά σήματα : συγκριτική πειραματική αξιολόγηση σε πραγματικές συνθήκες

Πλατυπόδη, Μαρία 27 April 2015
Οι τεχνικές προσανατολισμένης λήψης έχουν μελετηθεί εκτενώς τις τελευταίες δεκαετίες, καθώς βρίσκουν εφαρμογή σε διάφορους τομείς. Ωστόσο, για σήματα ευρείας ζώνης το πρόβλημα αυτό δεν έχει διερευνηθεί διεξοδικά. Σκοπός αυτής της εργασίας είναι να αναδείξει τις δυνατότητες και τους εγγενείς περιορισμούς των τεχνικών προσανατολισμένης λήψης. Στα πρώτα κεφάλαια παρουσιάζονται οι θεμελιώδεις έννοιες της επεξεργασίας σημάτων σε διατάξεις μικροφώνων και οι πιο ευρέως χρησιμοποιούμενες τεχνικές προσανατολισμένης λήψης. Στο τελευταίο κεφάλαιο πραγματοποιούνται εξομοιώσεις πραγματικών ακουστικών συνθηκών σύμφωνα με το πρότυπο ETSI EG 202 396. Το μη-ανηχοϊκό μοντέλο υιοθετείται και πραγματικά ακουστικά σήματα λαμβάνονται από γραμμικές διατάξεις μικροφώνων. Ακόμη, η τεχνική ημίτονου εκθετικής σάρωσης χρησιμοποιείται για την εκτίμηση της κρουστικής απόκρισης των Ν-ακουστικών καναλιών. Τέλος, το μοντέλο 3-QUEST χρησιμοποιείται για την μέτρηση της ποιότητας ομιλίας σε θορυβώδη περιβάλλοντα. / Beamforming techniques have been studied extensively due to their applications in various areas. However, most of the efforts have been focused on the narrowband case. For wideband signals, this problem has not been thoroughly investigated. This thesis aims to highlight the potential and the limitations of conventional beamforming techniques. In the first chapters, the fundamental array processing theory and the most widely used beamforming techniques are presented. In the last chapter, different real-world acoustic scenarios are simulated according to the ETSI EG 202 396-3 standard. In the simulations, the reverberant model is assumed and real audio signals are captured by a linear microphone array. The coefficients of the spatial filter are computed with the MVDR criterion. Moreover, acoustic impulse response measurements are presented and performed for the construction of the steering vector. The speech quality in the presence of background noise is measured by the 3-QUEST model.
16

Développement d'une méthode de diagnostic technique des dégradations de qualité vocale perçue des communications téléphoniques à partir d'une analyse du signal de parole / Development of a technical diagnostic method for voice quality impairments perceived in telephone communications, based on an analysis of speech signal

Tiemounou, Sibiri 17 February 2014
Les opérateurs de télécommunications se doivent de maîtriser et d'évaluer la qualité des services qu'ils offrent à leurs clients, dans un contexte en perpétuelle évolution. Comme alternative rapide et à moindre coût aux évaluations fondées sur l'interrogation d'utilisateurs, des outils de mesure ont été développés, qui intègrent des modèles permettant de prédire la qualité perçue. Cette thèse avait pour but de concevoir un outil de diagnostic de qualité vocale (applicable aux services de téléphonie), complémentaire à de tels modèles objectifs, afin d'obtenir des informations spécifiques sur la nature des défauts présents sur le signal audio et d'orienter vers des causes potentielles de ces défauts. En partant de l'hypothèse que la qualité vocale est multidimensionnelle, nous avons fondé l'outil de diagnostic sur la modélisation des quatre dimensions identifiées dans la littérature : la Bruyance, représentative des bruits de fond, la Continuité, relative à la perception des discontinuités dans le signal, la Coloration, liée aux distorsions du spectre de la voix, et la Sonie, traduisant la perception du niveau sonore. Chacune de ces dimensions est quantifiée à l'aide d'indicateurs de qualité issus de l'analyse du signal audio. Notre démarche a consisté, dans un premier temps, à rechercher dans des modèles objectifs récents (notamment la norme P.863 de l'UIT-T) des indicateurs de qualité et à en développer d'autres pour caractériser parfaitement chaque dimension. S'est ensuivie une étude de performances de ces indicateurs, les plus pertinents devant être intégrés dans notre outil de diagnostic. Finalement, pour chaque dimension, nous avons développé un module de classification automatique de défauts perçus en fonction de la nature du défaut identifié dans le signal, ainsi qu'un module supplémentaire estimant l'impact du défaut sur la qualité vocale. 
L'outil proposé couvre les trois bandes audio (bande étroite, bande élargie et bande super-élargie) couramment utilisées dans les systèmes de télécommunications avec, toutefois, une priorité pour les signaux en bande super-élargie, plus représentatifs des contenus audio qu'on sera amené à rencontrer dans les futurs services de télécommunications. / Quality of service is a huge issue for telecommunications operators since they have to master and evaluate it in order to satisfy their customers. To replace expensive and time-consuming human judgment methods, objective methods, integrating objective models providing a prediction of the perceived quality, have been conceived. Our research aimed at developing a technical diagnostic method, complementary to objective voice quality models, which provides specific information about the nature of the perceived voice quality impairments and identifies the underlying technical causes. Assuming that speech quality is a multidimensional phenomenon, our technical diagnostic method is built on the modelling of the four perceptual dimensions identified in the literature: “Noisiness” relative to the perceived background noise, “Continuity” linked to discontinuity, “Coloration” related to frequency–response degradations and “Loudness” corresponding to the impact of the speech level, each one being quantified by quality degradation indicators based on audio signal analysis. A crucial step of our research was to find and/or to develop relevant quality degradation indicators to perfectly characterize each dimension. To do so, we identified quality degradation indicators in the most recent objective voice quality models (particularly the ITU-T P.863 recommendation, known as POLQA) and we analysed the performance of identified indicators. Then, the most relevant indicators have been considered in our diagnostic method. 
Finally, for each dimension, we proposed a detection block which automatically classifies a perceived degradation according to the nature of the defect detected in the audio signal, and an additional block providing information about the impact of degradations on speech quality. The proposed technical diagnostic method is designed to cover three bandwidths (Narrowband, Wideband and Super Wideband) used in telecommunications systems, with priority given to Super Wideband speech signals, which remain very useful for future telephony applications.
17

Kombinované vícepásmové adaptivní zvýraznění řeči / Composite Subband Adaptive Speech Enhancement

Hovorka, Jaroslav January 2016
The thesis deals with single-channel and multiple-channel algorithms for speech enhancement. The goal of this work is to perform an in-depth analysis of both single-channel and multiple-channel algorithms in terms of their behaviour in the noisy environment of combat vehicles and platforms. Based on this analysis, a new composite speech enhancement algorithm will be designed. This new approach is expected to increase the quality of the processed speech in military communications systems. These systems are characterised by their operation under very noisy conditions, where background noise is very high and the signal-to-noise ratio is extremely low. These noisy conditions are typical for the range of military and combat platforms and vehicles.
18

Evaluation of the Perceived Speech Quality for G729D and Opus : With Different Network Scenarios and an Implemented VoIP Application

Almér, Louise January 2022
Communication has always been a vital part of our society, and day-to-day communication is becoming increasingly digital. VoIP (voice over IP) is used for real-time communication, and to send the information over the internet the speech must be compressed to lower the number of bits needed for transmission. Codecs are used to compress the speech, or any other type of data transmitted over a network, which can introduce some noise if lossy compression is used. Depending on the bandwidth, bit rate, and codec used, distortion can be minimized, which results in higher perceived speech quality. In the thesis, two codecs, G729D and Opus, were tested and evaluated with two different objective perceived speech quality metrics, POLQA and PESQ. The codecs were also tested with different emulated network scenarios: 2G, 3G, 4G, satellite two-hop, and LAN. Furthermore, Opus was tested with and without VAD (voice activity detection) to see how VAD could affect the perceived speech quality. The different network scenarios did not impact the results of the evaluation, since the main difference between the network scenarios was latency, which POLQA and PESQ do not consider in the evaluation. Opus achieved a higher MOS-LQO (mean opinion score listening quality objective) than G729D. However, when VAD was enabled with Opus for a low bit rate, 8 kbit/s, the MOS-LQO was lower than without VAD.
19

Deep learning methods for reverberant and noisy speech enhancement

Zhao, Yan 15 September 2020
No description available.
20

Operating System Based Perceptual Evaluation of Call Quality in Radio Telecommunications Networks. Development of call quality assessment at mobile terminals using the Symbian operating system, comparison with traditional approaches and proposals for a tariff regime relating call charging to perceived speech quality.

Aburas, Akram January 2012
Call quality has been crucial from the inception of telecommunication networks. Operators need to monitor call quality from the end-user's perspective, in order to retain subscribers and reduce subscriber 'churn'. Operators worry not only about call quality and interconnect revenue loss, but also about network connectivity issues in areas where mobile network gateways are prevalent. Bandwidth quality as experienced by the end-user is equally important in helping operators to reduce churn. The parameters that network operators use to improve call quality are mainly from the end-user's perspective. These parameters are usually ASR (answer seizure ratio), PDD (postdial delay), NER (network efficiency ratio), the number of calls for which these parameters have been analyzed and successful calls. Operators use these parameters to evaluate and optimize the network to meet their quality requirements. Analysis of speech quality is a major arena for research. Traditionally, users' perception of speech quality has been measured offline using subjective listening tests. Such tests are, however, slow, tedious and costly. An alternative method is therefore needed; one that can be automatically computed on the subscriber's handset, be available to the operator as well as to subscribers and, at the same time, provide results that are comparable with conventional subjective scores. QMeter®, a set of tools for signal and bandwidth measurement that have been developed bearing in mind all the parameters that influence call and bandwidth quality experienced by the end-user, addresses these issues and, additionally, facilitates dynamic tariff propositions which enhance the credibility of the operator. This research focuses on call quality parameters from the end-user's perspective. The call parameters used in the research are signal strength, successful call rate, normal drop call rate, and hand-over drop rate.
Signal strength is measured for every five milliseconds of an active call and average signal strength is calculated for each successful call. The successful call rate, normal drop rate and hand-over drop rate are used to achieve a measurement of the overall call quality. Call quality with respect to bundles of 10 calls is proposed. An attempt is made to visualize these parameters for better understanding of where the quality is bad, good and excellent. This will help operators, as well as user groups, to measure quality and coverage. Operators boast about their bandwidth but in reality, to know the locations where speed has to be improved, they need a tool that can effectively measure speed from the end-user's perspective. BM (bandwidth meter), a tool developed as a part of this research, measures the average speed of data sessions and stores the information for analysis at different locations. To address issues of quality in the subscriber segment, this research proposes the varying of tariffs based on call and bandwidth quality. Call charging based on call quality as perceived by the end-user is proposed, both to satisfy subscribers and help operators to improve customer satisfaction and increase average revenue per user. Tariff redemption procedures are put forward for bundles of 10 calls and 10 data sessions. In addition to the varying of tariffs, quality escalation processes are proposed. Deploying such tools on selected or random samples of users will result in substantial improvement in user loyalty which, in turn, will bring operational and economic advantages.
