81 |
Localisation et rehaussement de sources de parole au format Ambisonique : analyse de scènes sonores pour faciliter la commande vocale / Localization and enhancement of speech from the Ambisonics format
Perotin, Lauréline, 31 October 2019
This work was conducted in the fast-growing context of hands-free voice command. In domestic environments, smart devices usually sit in a fixed position, while the speaker gives commands from anywhere, not necessarily close to the device or even facing it. This adds difficulties compared with near-field voice command (typically on mobile phones): reverberation is stronger, early reflections off furniture around the device can blur the signal, and surrounding noise also interferes. Moreover, competing speakers may make the target speaker particularly hard to understand. To facilitate speech recognition in such adverse conditions, several preprocessing methods are introduced here. We use a spatialized audio format suited to audio scene analysis: the Ambisonic format. We first propose a sound source localization method that relies on a convolutional recurrent neural network. We define input features inspired by the acoustic intensity vector, which improve localization performance, in particular in real conditions involving several sources and a microphone array placed on a table. We use the visualization technique called layerwise relevance propagation (LRP) to highlight the time-frequency zones that correlate positively with the network output in a given case. Besides being methodologically indispensable, this analysis shows that the neural network relies mainly on time-frequency zones where direct sound dominates reverberation and background noise. We then present a method to enhance the speech of the main speaker and ease its recognition. We adopt a beamforming framework based on time-frequency masks estimated by a neural network. To handle multiple speakers with similar loudness, we use the localization information to perform a first wideband enhancement in the direction of the target speaker. We show that giving the network this additional information is not enough when two speakers are close to each other; however, additionally giving it the enhanced version of the interfering speaker lets the network return better masks. The multichannel filter derived from these masks greatly improves speech recognition performance. We evaluate this algorithm in various environments, including real ones, using a black-box automatic speech recognition engine. Finally, we combine the localization and enhancement systems and evaluate the robustness of the latter to localization errors on real examples.
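As a rough illustration of the acoustic intensity vector mentioned in this abstract, the sketch below estimates one direction of arrival from first-order Ambisonic channels. It is not the thesis's neural network: it only averages a pseudo-intensity vector over all time-frequency bins. The helper names `stft` and `foa_doa`, the frame sizes, and the encoding convention (W = s, X = s·cos(az)·cos(el), Y = s·sin(az)·cos(el), Z = s·sin(el), so that the vector points toward the source) are assumptions made for this example.

```python
import numpy as np

def stft(x, frame=512, hop=256):
    """Very small STFT: Hann-windowed frames followed by a real FFT."""
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack([x[i * hop:i * hop + frame] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def foa_doa(w, x, y, z):
    """Estimate (azimuth, elevation) in radians from first-order Ambisonic signals.

    Assumes a single dominant source and the encoding convention stated above.
    """
    W, X, Y, Z = (stft(c) for c in (w, x, y, z))
    # Pseudo-intensity per time-frequency bin: Re{conj(W) * [X, Y, Z]}
    intensity = np.stack([np.real(np.conj(W) * C) for C in (X, Y, Z)])
    ix, iy, iz = intensity.sum(axis=(1, 2))  # pool over all frames and bins
    azimuth = np.arctan2(iy, ix)
    elevation = np.arctan2(iz, np.hypot(ix, iy))
    return azimuth, elevation
```

With several sources or strong reverberation, pooling over all bins is no longer adequate, which is the kind of situation the learned approach in the abstract is aimed at.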
82 |
Human Urine : can it be applied as fertilizer in agricultural systems?
Filling, Julia, January 2018
In cities today, vast amounts of nutrients are wasted. Improved nutrient management in agriculture can contribute to a more sustainable society. Reusing nutrients in agriculture could help create a more circular system in which organic fertilizers are used instead of chemical fertilizers. Urine is a liquid with a high nutrient content. According to the Swedish Environmental Protection Agency, human urine can replace mineral fertilizers through methods such as source separation, where urine is separated from faeces. This is a cheap, effective and sustainable fertilizer management system that can be easily achieved. In this study, urine fertilizers were compared with ecological and conventional fertilizers (NPK and cow manure). The study examined the effect of different urine fertilizers, compared with organic and inorganic ones, on plant growth, nutrient content, pH value and microbial growth. The plant growth experiment was carried out in the greenhouse facilities in Alnarp, Sweden. The results show that cow manure gave better plant growth, but Aurin, one of the urine fertilizers, had the highest uptake of nitrate. Undiluted urine gave stable results in all analyses. According to this study, human urine is a fertilizer that can be used in crop cultivation systems and can deliver good agricultural results.
83 |
Insamling och behandling av klosettvatten från slutna tankar i Södertälje : en utvärdering av massflöden och förbättringsområden / Collection and treatment of blackwater from cesspits in Södertälje, Sweden : an evaluation of mass flows and potential of improvement
Jernå, Charlotta, January 2022
Source-separating wastewater systems make it possible to recover the resources in different wastewater streams. Blackwater in particular contains nutrients that are important to return to agriculture. Södertälje has a facility that treats blackwater from cesspits (closed tanks), and the product is used as fertilizer. Sanitization takes place in two steps: first wet composting, which raises the temperature, and then ammonia sanitization through the addition of urea. For wet composting to work efficiently, the blackwater must be as concentrated as possible so that its energy content is high. For this reason, vacuum toilets or other extremely low-flush toilets should be used. The wet composting plant in Södertälje has been in operation since 2012, and this degree project aimed to evaluate the facility. The first objective was to quantify the mass flows of nitrogen and phosphorus. However, the variation in the underlying data proved so large that no reliable conclusions could be drawn from the available data. The uncertainty is shown by the 90% confidence intervals for the total incoming and outgoing amounts of nitrogen and phosphorus for the years 2014–2021. Incoming amounts were estimated at 0–170 kg P and 370–6200 kg N, while outgoing amounts were estimated at 60–280 kg P and 4000–23000 kg N. Incoming nutrient concentrations suggest that the blackwater came, on average, from around 100 people per year, although there is large variation and uncertainty. The second objective was to estimate gas emissions from the facility. As noted, the uncertainty in the data was large, as seen in the 90% confidence intervals, which ranged from -9 tonnes to 11 tonnes for nitrogenous gases and from -4.4 tonnes to 20 tonnes for carbon dioxide, as total emissions over the years 2014–2021. Results from the literature review suggest that emissions of methane and nitrous oxide should be very low once urea has been added during treatment. It is important that the product is stored covered to prevent ammonia losses.
The third objective was to investigate the possibility of treating other substrates in the wet composting plant, since it has capacity to receive more material. Both securing deliveries from cesspits and connecting latrine waste from allotment associations were identified as having great potential to increase the quantity and quality of incoming substrate. Handling latrine bins, however, requires investment in a receiving facility, and if this is of interest, alternatives and costs need to be investigated further. Latrine waste from allotment associations that is collected in closed tanks, as is the case for two associations in the municipality, can easily be added to the wet compost and is expected to increase the total solids (TS) content, making wet composting more efficient. As a final part of the work, a sampling plan was developed for monitoring latrine waste collected in closed tanks. Several actors are involved in the facility, and continuous work is required for the system to function satisfactorily. It is therefore of the utmost importance that the different actors have both clear routines and close cooperation.
84 |
Parameter Estimation and Signal Processing Techniques for Operational Modal Analysis
CHAUHAN, SHASHANK, 18 April 2008
No description available.
85 |
Estimation of Atmospheric Phase Scintillation Via Decorrelation of Water Vapor Radiometer Signals
Nessel, James Aaron, January 2015
No description available.
86 |
Multichannel audio processing for speaker localization, separation and enhancement
Martí Guerola, Amparo, 29 October 2013
This thesis is related to the field of acoustic signal processing and its applications to emerging
communication environments. Acoustic signal processing is a very wide research area covering
the design of signal processing algorithms involving one or several acoustic signals to perform
a given task, such as locating the sound source that originated the acquired signals, improving
their signal to noise ratio, separating signals of interest from a set of interfering sources or recognizing
the type of source and the content of the message. Among the above tasks, Sound Source
Localization (SSL) and Automatic Speech Recognition (ASR) have been specifically addressed in
this thesis. In fact, the localization of sound sources in a room has received a lot of attention in
the last decades. Most real-world microphone array applications require the localization of one
or more active sound sources in adverse environments (low signal-to-noise ratio and high reverberation).
Some of these applications are teleconferencing systems, video-gaming, autonomous
robots, remote surveillance, hands-free speech acquisition, etc. Indeed, performing robust sound
source localization under high noise and reverberation is a very challenging task. One of the
most well-known algorithms for source localization in noisy and reverberant environments is
the Steered Response Power - Phase Transform (SRP-PHAT) algorithm, which constitutes the
baseline framework for the contributions proposed in this thesis. Another challenge in the design
of SSL algorithms is to achieve real-time performance and high localization accuracy with a reasonable
number of microphones and limited computational resources. Although the SRP-PHAT
algorithm has been shown to be an effective localization algorithm for real-world environments,
its practical implementation is usually based on a costly fine grid-search procedure, making the
computational cost of the method a real issue. In this context, several modifications and optimizations
have been proposed to improve its performance and applicability. An effective strategy
that extends the conventional SRP-PHAT functional is presented in this thesis. This approach
performs a full exploration of the sampled space rather than computing the SRP at discrete spatial
positions, increasing its robustness and allowing for a coarser spatial grid that reduces the
computational cost required in a practical implementation with a small hardware cost (reduced
number of microphones). This strategy makes it possible to implement real-time applications based on
location information, such as automatic camera steering or the detection of speech/non-speech
fragments in advanced videoconferencing systems.
As stated before, besides the contributions related to SSL, this thesis is also related to the
field of ASR. This technology allows a computer or electronic device to identify the words spoken
by a person so that the message can be stored or processed in a useful way. ASR is used on
a day-to-day basis in a number of applications and services such as natural human-machine
interfaces, dictation systems, electronic translators and automatic information desks. However,
there are still some challenges to be solved. A major problem in ASR is to recognize people
speaking in a room by using distant microphones. In distant-speech recognition, the microphone
does not only receive the direct path signal, but also delayed replicas as a result of multi-path
propagation. Moreover, there are multiple situations in teleconferencing meetings when multiple
speakers talk simultaneously. In this context, when multiple speaker signals are present, Sound
Source Separation (SSS) methods can be successfully employed to improve ASR performance
in multi-source scenarios. This is the motivation behind the training method for multiple talk
situations proposed in this thesis. This training, which is based on a robust transformed model
constructed from separated speech in diverse acoustic environments, makes use of an SSS method
as a speech enhancement stage that suppresses the unwanted interferences. The combination
of source separation and this specific training has been explored and evaluated under different
acoustical conditions, leading to improvements of up to 35% in ASR performance. / Martí Guerola, A. (2013). Multichannel audio processing for speaker localization, separation and enhancement [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/33101
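The SRP-PHAT algorithm named in this abstract is built on pairwise cross-correlations with phase-transform weighting, so a two-microphone GCC-PHAT delay estimate gives a feel for its basic ingredient. The sketch below is a generic textbook version, not the modified SRP-PHAT functional proposed in the thesis; the function name `gcc_phat` and its parameters are chosen for this example only.

```python
import numpy as np

def gcc_phat(sig, ref, fs):
    """Estimate the delay of `sig` relative to `ref`, in seconds, using GCC-PHAT."""
    n = len(sig) + len(ref)              # zero-pad so the circular correlation is linear
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-12       # PHAT weighting: discard magnitude, keep phase
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    # Reorder so that index 0 corresponds to lag -max_shift
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs
```

A full SRP-PHAT search would sum such correlations over all microphone pairs at the lags implied by each candidate source position, which is where the grid-search cost discussed in the abstract comes from.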
87 |
Κατασκευή μικροϋπολογιστικού συστήματος διαχωρισμού σημάτων με τον αλγόριθμο ICA / Construction of a microcomputer system for signal separation using the ICA algorithm
Χονδρός, Παναγιώτης, 13 October 2013
This thesis deals with the construction of a microcomputer system for signal separation. The separation is based on the theory of Independent Component Analysis (ICA). After the theory of the technique is presented, the ADuC 7026 microcontroller chosen for the implementation is introduced. The software used to simulate the microcontroller is then presented, along with basic examples of how to program it. Finally, three algorithms, two versions of FastICA and one version of InfoMax, are developed without the use of peripherals and simulated with the use of peripherals. These algorithms are evaluated for their performance and conclusions are drawn.
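For readers unfamiliar with FastICA, a minimal NumPy sketch of the symmetric fixed-point iteration with a tanh nonlinearity follows. It is a desktop illustration under generic textbook assumptions, not the microcontroller implementation described in the thesis; the names `whiten` and `fast_ica`, the iteration count, and the random initialization are all choices made for this example.

```python
import numpy as np

def whiten(X):
    """Center the mixtures (rows of X) and decorrelate them to unit variance."""
    X = X - X.mean(axis=1, keepdims=True)
    cov = X @ X.T / X.shape[1]
    d, E = np.linalg.eigh(cov)
    return E @ np.diag(d ** -0.5) @ E.T @ X

def fast_ica(X, n_iter=200, seed=0):
    """Symmetric FastICA with g = tanh; returns the estimated sources (rows)."""
    Z = whiten(X)
    n, T = Z.shape
    rng = np.random.default_rng(seed)
    U, _, Vt = np.linalg.svd(rng.standard_normal((n, n)))
    W = U @ Vt                                   # random orthogonal starting point
    for _ in range(n_iter):
        Y = W @ Z
        g = np.tanh(Y)
        g_prime = 1.0 - g ** 2
        # Fixed-point update: E[g(y) z^T] - diag(E[g'(y)]) W
        W_new = g @ Z.T / T - np.diag(g_prime.mean(axis=1)) @ W
        # Symmetric decorrelation: W <- (W W^T)^(-1/2) W, computed via the SVD
        U, _, Vt = np.linalg.svd(W_new)
        W = U @ Vt
    return W @ Z
```

The recovered components come back in arbitrary order and with arbitrary sign and scale, which is inherent to ICA rather than a property of this sketch.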
88 |
Bekontaktis pulso matavimas naudojant internetinę vaizdo kamerą / Non-contact cardiac pulse measurement using web camera
Seniut, Konstantin, 10 June 2011
This master's thesis examines a non-contact method of measuring cardiac pulse. A Logitech C310 web camera was used. The video resolution was 640×480 pixels, the capture rate was 15 frames per second, and each recording was 30 seconds long. Participants were filmed at a distance of about 0.5 m from the camera and were between 24 and 64 years old. Video was recorded under varied lighting: both in daylight and under lamps of different power. For comparison, a wrist-worn ReliOn pulse measurement device was used, which operates on the variation in pressure of the blood pulsing in the vessels. To process the extracted pulse signal, two independent component analysis algorithms were used and compared: Fast ICA and stSobi. The experiments were carried out using the C# programming language and the Matlab 2008 numerical computing package.
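Once a pulse-like component has been extracted from the video, one common way to read off the heart rate is to take the strongest spectral peak within a plausible frequency band. The sketch below assumes that approach; the abstract does not say how the thesis converts the signal to a pulse value. The function name `estimate_bpm` and the 0.75 to 4 Hz band (45 to 240 beats per minute) are assumptions for this example, while the 15 frames per second and 30 second duration come from the abstract.

```python
import numpy as np

def estimate_bpm(signal, fps, lo_hz=0.75, hi_hz=4.0):
    """Return the dominant frequency of `signal` within [lo_hz, hi_hz], in beats per minute."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()                    # remove the DC offset
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)         # keep physiologically plausible rates
    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz
```

At 15 frames per second over 30 seconds there are 450 samples, so the spectrum has a resolution of 1/30 Hz, or 2 beats per minute.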
89 |
Manipulations spatiales de sons spectraux / Spatial manipulations of spectral sounds
Mouba Ndjila, Joan, 09 November 2009
In active listening applications, it is essential to be able to interact with the individual sources present in the mix, for example by changing their spatial position. In this thesis, we propose binaural techniques for localization and spatialization, based on interaural differences in amplitude and time of arrival. The techniques are developed in the time-frequency plane. They make it possible to locate any source and project it into the space surrounding a listener. We also develop binaural source separation techniques based on maximum likelihood and probabilistic spatial masks. Finally, we extend the binaural techniques to multi-diffusion techniques that use a set of loudspeakers. The proposed techniques are tested and compared with reference techniques from the literature. For performance similar to that of existing techniques, our proposals have a significant advantage in complexity, which makes them suitable for real-time applications.
90 |
Decomposition methods of NMR signal of complex mixtures : models and applications
Toumi, Ichrak, 28 October 2013
The objective of this work was to test blind source separation (BSS) methods for separating the complex NMR spectra of mixtures into the simpler spectra of the pure compounds. In the first part, known methods, namely JADE and NNSC, were applied in the context of DOSY, and an application to CPMG data was demonstrated. In the second part, we focused on developing an efficient algorithm, "beta-SNMF", which was shown to outperform NNSC for beta less than or equal to 2. Since in the literature the choice of beta has been adapted to the statistical assumptions on the additive noise, a statistical study of the noise in NMR DOSY data was carried out to obtain a more complete picture of the NMR data studied.