Global ETD Search

1	Towards Green AI: Cost-Efficient Deep Learning using Domain Knowledge. Srivastava, Sangeeta. 12 August 2022. No description available.
Keywords: Computer Science; Artificial Intelligence; Green AI; knowledge guided learning; model compression; sound event detection; edge intelligence; on-device intelligence; physics-guided machine learning; efficient deep learning
2	Simulation de scènes sonores environnementales : Application à l’analyse sensorielle et l’analyse automatique / Simulation of environmental acoustic scenes: Application to sensory and computational analyses. Lafay, Grégoire. 08 December 2016. This thesis deals with the analysis of environmental sound scenes, the auditory result of mixing distinct, concurrent emitting sources. The sound environment is a complex object that opens the field of possible sources and research beyond the more specific domains of speech and music. Its analysis, the process by which a listener makes sense of it, relies on both the perceived data and the context in which they are perceived. In perception research as in machine learning, any experiment requires the experimenter to have fine control over the stimuli. Nevertheless, the nature of the sound environment calls for an ecological framework, that is, for real recordings as stimuli rather than synthetic ones. We therefore propose a model that simulates, from recordings of isolated sounds, sound scenes whose high-level structural properties (the type and diversity of sources, their sound levels, and the event density) are set by the experimenter. Based on available knowledge of the human auditory system, the model treats the sound scene as a composite object, a sum of source sounds. We apply this tool in two areas. The first concerns perception and the notion of perceived pleasantness in urban environments: the model supports an innovative experimental protocol, and the use of simulated data lets us assess precisely the impact of each sound source on pleasantness. The second concerns automatic sound event detection, where evaluation is a major issue: we propose an evaluation methodology, based on simulated data, that tests the generalization capacities of the algorithms.
Keywords: Data simulation; Sound event detection; Auditory scene analysis; Perception assessment; Cognition; Urban soundscape; Soundscape quality; Grounded cognition; Cognitive psychology
3	Ψηφιακή επεξεργασία και αυτόματη κατηγοριοποίηση περιβαλλοντικών ήχων / Digital processing and automatic classification of environmental sounds. Νταλαμπίρας, Σταύρος. 20 September 2010. The dissertation is outlined as follows. Chapter 1 presents a general overview of the automatic recognition of generalized sound events, discusses the applications of audio signal recognition technology, briefly describes the state of the art, and states the contribution of the thesis. Chapter 2 introduces the reader to non-speech audio processing and presents current approaches to feature extraction and pattern recognition. Chapter 3 proposes a novel sound recognition system designed specifically for urban environmental sound events and analyzes the design of the corresponding database; a hierarchical probabilistic structure was built along with two groups of acoustic parameters, which together lead to high recognition accuracy. Chapter 4 has two parts: (a) it explores multiresolution analysis applied to the speech/music discrimination problem, and (b) it uses that technique to build a system that combines features from different domains for efficient analysis of online radio signals. Chapter 5 experiments extensively on a new application of sound recognition technology, space monitoring based on the acoustic modality, and proposes a system that detects atypical situations in a metro station environment to assist authorized personnel in continuously monitoring the space. Chapter 6 proposes an adaptive framework for acoustic surveillance of potentially hazardous situations in environments with different acoustic properties; we show that the system achieves high performance and adapts to heterogeneous acoustic conditions in an unsupervised way. Chapter 7 investigates novelty detection for acoustic monitoring of indoor and outdoor spaces; a database of real-world data was recorded and three probabilistic techniques are proposed. Chapter 8 presents a novel methodology for generalized sound recognition that leads to high recognition accuracy, exploiting temporal feature integration and multi-domain descriptors in combination with a state-of-the-art generative classification technique.
Keywords: Sound classification; Probability theory; 621.382 8; Sound event detection; Hidden Markov models; Generalized recognition of audio signals; Computational auditory scene analysis; Novelty detection
4	Towards a Nuanced Evaluation of Voice Activity Detection Systems: An Examination of Metrics, Sampling Rates and Noise with Deep Learning / Mot en nyanserad utvärdering av system för detektering av talaktivitet. Joborn, Ludvig; Beming, Mattias. January 2022. Deep learning has recently revolutionized many fields, one of which is Voice Activity Detection (VAD). VAD is of great interest to sectors of society concerned with detecting speech in sound signals. One such sector is the police, where criminal investigations regularly involve analysis of audio material. Convolutional Neural Networks (CNNs) have recently become the state-of-the-art method for detecting speech in audio, but the impact of noise and sampling rates on such methods is still incompletely understood. In addition, evaluation metrics from neighboring fields have not yet been integrated into VAD. We trained on four different sampling rates and found that changing the sampling rate could have dramatic effects on the results. We therefore recommend explicitly evaluating CNN-based VAD systems at the pertinent sampling rates. Further, with increasing amounts of white Gaussian noise, we observed better performance when we increased the capacity of our Gated Recurrent Unit (GRU). Finally, we discuss why the main evaluation metric must be chosen with care, which leads us to recommend the Polyphonic Sound Detection Score (PSDS).
Keywords: voice activity detection; VAD; deep learning; machine learning; ML; artificial intelligence; AI; convolutional neural network; CNN; deep neural network; DNN; sound event detection; SED; mel spectrogram; audio processing; polyphonic sound detection score; PSDS; signal processing; signal to noise ratio; SNR; RCRNN; sampling rate; Gaussian noise; Computer Sciences
5	Detection and Classification of Sparse Traffic Noise Events / Detektering och klassificering av bullerhändelser från gles trafik. Golshani, Kevin; Ekberg, Elias. January 2023. Noise pollution is a major health hazard for people living in urban areas, and its effects on humans are a growing field of research. One of the major contributors to urban noise pollution is the noise generated by traffic. Noise simulations can be used to build noise maps for noise management action plans, but testing their accuracy requires real measurements, in this case noise measurements taken adjacent to a road. The aim of this project is to test machine learning based methods in order to develop a robust way of detecting and classifying vehicle noise in sparse traffic conditions. The primary focus is to detect traffic noise events, and the secondary focus is to classify what kind of vehicle is producing the noise. The data used in this project come from sensors installed on a testbed on a street in southern Stockholm. The sensors include a microphone that continuously measures the local noise environment, a radar that detects each passing vehicle, and a camera that also detects a vehicle by capturing its license plate. Only sparse traffic noise is considered in this thesis, so the audio recordings used are those where the radar detected only one vehicle in a 40-second window. This makes the gathered data weakly labeled. The resulting detection method is a two-step process. First, the unsupervised learning method k-means is used to generate strong labels. Second, a supervised learning method, either random forest or support vector machine, uses the strong labels to classify audio features. The detection system for sparse traffic noise achieved satisfactory results. However, the unsupervised vehicle classification method produced inadequate results, and the clustering could not distinguish between vehicle classes based on the noise data.
Keywords: Noise pollution; Machine learning; Sound event detection; SED; Support vector machine; SVM; Random forest; RF; Decision tree; K-means clustering; Spherical k-means clustering; Traffic noise; Noise event; Other Mathematics
