1

Neural networks for analysing music and environmental audio

Sigtia, Siddharth January 2017
In this thesis, we consider the analysis of music and environmental audio recordings with neural networks. Recently, neural networks have been shown to be an effective family of models for speech recognition, computer vision, natural language processing and a number of other statistical modelling problems. The composite layer-wise structure of neural networks allows for flexible model design, where prior knowledge about the domain of application can be used to inform the design and architecture of the neural network models. Additionally, it has been shown that when trained on sufficient quantities of data, neural networks can be directly applied to low-level features to learn mappings to high-level concepts like phonemes in speech and object classes in computer vision. In this thesis we investigate whether neural network models can be usefully applied to processing music and environmental audio. With regard to music signal analysis, we investigate two different problems. The first problem, automatic music transcription, aims to identify the score or the sequence of musical notes that comprise an audio recording. We also consider the problem of automatic chord transcription, where the aim is to identify the sequence of chords in a given audio recording. For both problems, we design neural network acoustic models which are applied to low-level time-frequency features in order to detect the presence of notes or chords. Our results demonstrate that the neural network acoustic models perform similarly to state-of-the-art acoustic models, without the need for any feature engineering. The networks are able to learn complex transformations from time-frequency features to the desired outputs, given sufficient amounts of training data. Additionally, we use recurrent neural networks to model the temporal structure of sequences of notes or chords, similar to language modelling in speech. 
Our results demonstrate that combining the acoustic and language model predictions yields improved performance over the acoustic models alone. We also observe that convolutional neural networks yield better performance than other neural network architectures for acoustic modelling. For the analysis of environmental audio recordings, we consider the problem of acoustic event detection. Acoustic event detection has a similar structure to automatic music and chord transcription, where the system is required to output the correct sequence of semantic labels along with onset and offset times. We compare the performance of neural network architectures against Gaussian mixture models and support vector machines. In order to account for the fact that such systems are typically deployed on embedded devices, we compare performance as a function of the computational cost of each model. We evaluate the models on two large datasets of real-world recordings of baby cries and smoke alarms. Our results demonstrate that the neural networks clearly outperform the other models, and they are able to do so without incurring a heavy computational cost.
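The combination of acoustic and language model predictions described above can be sketched as Viterbi decoding over note or chord states, where frame-wise acoustic log-probabilities are combined with transition log-probabilities from the language model. This is only an illustrative sketch under that assumption; the function name and toy dimensions are hypothetical, not the thesis's actual hybrid architecture.

```python
import numpy as np

def decode(acoustic_logp, trans_logp, init_logp):
    """Viterbi decoding combining frame-wise acoustic scores with a
    transition ("language") model over note/chord states.
    acoustic_logp: (T, S) log p(frame_t | state_s) from the acoustic model
    trans_logp:    (S, S) log p(state_j | state_i) from the language model
    init_logp:     (S,)   log prior over the initial state
    Returns the most likely state sequence of length T."""
    T, S = acoustic_logp.shape
    delta = init_logp + acoustic_logp[0]       # best score ending in each state
    back = np.zeros((T, S), dtype=int)         # backpointers for traceback
    for t in range(1, T):
        scores = delta[:, None] + trans_logp   # (S, S): previous -> current
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + acoustic_logp[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

With a uniform transition model this reduces to frame-wise argmax decoding; a learned language model biases the path toward musically plausible sequences.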
2

Percepční citlivost ve frekvenční a temporální doméně u hudebních a řečových stimulů / Perceptual sensitivity to music and speech stimuli in the frequency and temporal domains

Lukeš, David January 2014
The subject of this thesis is perceptual sensitivity with respect to subtle frequency-based and temporal manipulations in speech, music and mixed stimuli. We hypothesize that an individual's sensitivity to variation in all three types of stimuli should be similar (i.e. a correlation should exist), since findings in evolutionary biology, neuroscience, psychology and experimental phonetics point towards a relatively strong link between the mechanisms of perception in speech and music. Our listening experiment revealed mostly intermediate correlations; additionally, we argue that by employing syntactically less complicated stimuli, which would specifically target fundamental sensitivity without requiring a complex syntactic analysis in parallel, even more robust correlations could be obtained. While the influence of prior formal linguistic education on performance in the test was negligible, the influence of musical experience was considerable, which lends further support to the idea of simplifying especially the music stimuli in future research. Key words: music, speech, perception, sensitivity, correlation
3

Feature selection for multimodal acoustic event detection

Butko, Taras 08 July 2011
Acoustic Event Detection / The detection of acoustic events (AEs) naturally produced in a meeting room may help describe human and social activity. The automatic description of interactions between humans and the environment can be useful for providing implicit assistance to the people inside the room, context-aware and content-aware information requiring a minimum of human attention or interruptions, support for high-level analysis of the underlying acoustic scene, etc. On the other hand, the recent fast growth of available audio and audiovisual content strongly demands tools for analyzing, indexing, searching and retrieving the available documents. Given an audio document, the first processing step is usually audio segmentation (AS), i.e. the partitioning of the input audio stream into acoustically homogeneous regions which are labelled according to a predefined broad set of classes like speech, music, noise, etc. Acoustic event detection (AED) is the objective of this thesis work. A variety of features, coming not only from the audio but also from the video modality, is proposed to deal with the detection problem in the meeting-room and broadcast-news domains. Two basic detection approaches are investigated in this work: joint segmentation and classification using Hidden Markov Models (HMMs) with Gaussian Mixture Densities (GMMs), and detection-by-classification using discriminative Support Vector Machines (SVMs). For the first case, a fast one-pass-training feature selection algorithm is developed in this thesis to select, for each AE class, the subset of multimodal features that shows the best detection rate. AED in meeting-room environments aims at processing the signals collected by distant microphones and video cameras in order to obtain the temporal sequence of (possibly overlapped) AEs that have been produced in the room. 
When applied to interactive seminars with a certain degree of spontaneity, detection of acoustic events from the audio modality alone shows a large number of errors, mostly due to the temporal overlaps of sounds. This thesis includes several novelties regarding the task of multimodal AED. First, the use of video features: since in the video modality the acoustic sources do not overlap (except for occlusions), the proposed features improve AED in such rather spontaneous recordings. Second, the inclusion of acoustic localization features, which, in combination with the usual spectro-temporal audio features, yield a further improvement in recognition rate. Third, the comparison of feature-level and decision-level fusion strategies for combining the audio and video modalities. In the latter case, the system output scores are combined using two statistical approaches: the weighted arithmetic mean and the fuzzy integral. Finally, due to the scarcity of annotated multimodal data, and in particular of data with temporal sound overlaps, a new multimodal database with a rich variety of meeting-room AEs has been recorded, manually annotated, and made publicly available for research purposes.
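The decision-level fusion by weighted arithmetic mean mentioned above can be illustrated with a minimal sketch. The function name, the weight value, and the score vectors are hypothetical assumptions, not taken from the thesis.

```python
import numpy as np

def fuse_scores(audio_scores, video_scores, w_audio=0.6):
    """Decision-level fusion by weighted arithmetic mean: each modality's
    classifier outputs per-class scores in [0, 1]; the fused score is a
    convex combination, and the class with the highest fused score wins."""
    fused = w_audio * audio_scores + (1.0 - w_audio) * video_scores
    return fused, int(np.argmax(fused))
```

The weight can be tuned on held-out data so that the more reliable modality dominates; the fuzzy integral generalizes this by weighting every subset of modalities rather than each one individually.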
4

Identifikace hudby, řeči, křiku, zpěvu v audio (video) záznamu / Music, Speech, Crying, Singing Detection in Audio (Video)

Danko, Michal January 2016
This thesis follows the trend of recent decades of using neural networks to detect speech in noisy data. The text begins with background on the discussed topics, such as audio features, machine learning and neural networks. The network parameters are examined in order to provide the most suitable setup for the experiments. The main focus of the experiments is to observe the influence of various sound events on speech detection using a small, diverse database; the sound events correlated with speech proved to be the most beneficial. In addition, the detection accuracy for the acoustic events themselves, previously used only as a supplement to speech detection, is also examined. An experiment on extending the dataset with more evenly distributed data shows that this does not guarantee an improvement. Finally, the last experiment demonstrates that the network indeed succeeds in learning to predict voice activity in both clean and noisy data.
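The frame-wise speech-detection setup described above can be sketched with a toy one-hidden-layer network trained on a synthetic log-energy-like feature. The data, architecture and hyperparameters here are illustrative assumptions, not the thesis's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic frame-level feature: "speech" frames have high log-energy,
# "noise" frames low (an illustrative stand-in for real audio features).
speech = rng.normal(2.0, 0.5, size=(200, 1))
noise = rng.normal(-2.0, 0.5, size=(200, 1))
X = np.vstack([speech, noise])
y = np.concatenate([np.ones(200), np.zeros(200)])

# One-hidden-layer network trained by full-batch gradient descent.
W1 = rng.normal(0.0, 0.5, (1, 8)); b1 = np.zeros(8)
W2 = rng.normal(0.0, 0.5, (8, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # per-frame speech probability
    return h, p.ravel()

for _ in range(1500):
    h, p = forward(X)
    g = ((p - y) / len(y))[:, None]       # gradient of cross-entropy wrt logits
    gW2, gb2 = h.T @ g, g.sum(0)
    gh = (g @ W2.T) * (1.0 - h ** 2)      # backpropagate through tanh
    gW1, gb1 = X.T @ gh, gh.sum(0)
    W2 -= 1.0 * gW2; b2 -= 1.0 * gb2
    W1 -= 1.0 * gW1; b1 -= 1.0 * gb1

_, p = forward(X)
accuracy = float(((p > 0.5) == y).mean())  # frame-level speech/non-speech accuracy
```

On such well-separated synthetic frames the network learns an almost perfect decision boundary; real speech/noise features overlap far more, which is what makes the choice of sound events in the training data matter.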
5

Sistema para monitoramento e análise de paisagens acústicas submarinas. / System for monitoring and analysing underwater acoustic landscapes.

Alvarez Rosario, Alexander 14 October 2015
O Monitoramento Acústico Passivo (PAM) submarino refere-se ao uso de sistemas de escuta e gravação subaquática, com o intuito de detectar, monitorar e identificar fontes sonoras através das ondas de pressão que elas produzem. Diz-se que é passivo já que tais sistemas unicamente ouvem, sem perturbar o meio ambiente acústico existente, diferentemente dos ativos, como os sonares. O PAM submarino tem diversas áreas de aplicação, como em sistemas de vigilância militar, segurança portuária, monitoramento ambiental, desenvolvimento de índices de densidade populacional de espécies, identificação de espécies, etc. Tecnologia nacional nesta área é praticamente inexistente apesar da sua importância. Neste contexto, o presente trabalho visa contribuir com o desenvolvimento de tecnologia nacional no tema através da concepção, construção e operação de equipamento autônomo de PAM e de métodos de processamento de sinais para detecção automatizada de eventos acústicos submarinos. Foi desenvolvido um equipamento, nomeado OceanPod, que possui características como baixo custo de fabricação, flexibilidade e facilidade de configuração e uso, voltado para a pesquisa científica, industrial e para controle ambiental. Vários protótipos desse equipamento foram construídos e utilizados em missões no mar. Essas jornadas de monitoramento permitiram iniciar a criação de um banco de dados acústico, o qual permitiu fornecer a matéria prima para o teste de detectores de eventos acústicos automatizados e em tempo real. Adicionalmente também é proposto um novo método de detecção-identificação de eventos acústicos, baseado em análise estatística da representação tempo-frequência dos sinais acústicos. Este novo método foi testado na detecção de cetáceos, presentes no banco de dados gerado pelas missões de monitoramento. 
/ Passive Acoustic Monitoring (PAM) refers to the use of systems that listen to and record the underwater soundscape in order to detect, track and identify sound sources through the pressure waves they produce. It is said to be passive because such systems only listen, without putting noise into the environment, unlike active systems such as sonars. Underwater PAM has various application areas, such as military surveillance systems, port security, environmental monitoring, development of population density indices for species, species identification, etc. National technology in this field is practically nonexistent despite its importance. In this context, this thesis aims to contribute to national technology development in the field by designing, building and operating a self-contained PAM device, and by developing signal-processing methods for automated detection of underwater acoustic events. A device named OceanPod was developed, with characteristics such as low manufacturing cost, flexibility, and ease of setup and use, intended for scientific and industrial research and for environmental control. Several prototypes of the equipment were built and used in missions at sea. These monitoring missions began the creation of an acoustic database, which provided the raw material for testing automated, real-time acoustic event detectors. Additionally, a new method for detecting and identifying acoustic events is proposed, based on statistical analysis of the time-frequency representation of the acoustic signals. This new method was tested on the detection of cetaceans present in the database generated by the monitoring missions.
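One simple instance of detecting events from a time-frequency representation is thresholding per-frame spectrogram energy against background statistics and merging flagged frames into (onset, offset) events. This is only an illustrative sketch under that assumption; the statistical method proposed in the thesis is more elaborate, and all names and parameters here are hypothetical.

```python
import numpy as np

def detect_events(spec, k=3.0, min_len=3):
    """Flag frames whose total band energy exceeds the background mean by
    k standard deviations, then merge consecutive flagged frames into
    (onset, offset) events at least min_len frames long.
    spec: (T, F) magnitude spectrogram (time x frequency)."""
    energy = spec.sum(axis=1)
    thresh = energy.mean() + k * energy.std()
    active = energy > thresh
    events, start = [], None
    for t, a in enumerate(active):
        if a and start is None:
            start = t                         # event onset
        elif not a and start is not None:
            if t - start >= min_len:
                events.append((start, t))     # event offset (exclusive)
            start = None
    if start is not None and len(active) - start >= min_len:
        events.append((start, len(active)))   # event runs to end of recording
    return events
```

In a real PAM pipeline the threshold would be adapted per frequency band and over time, since ocean background noise is neither white nor stationary.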
