1. Wired for sound: on the digitalisation of music and music culture. Beer, David. January 2006.
No description available.
2. Accelerating finite difference models with field programmable gate arrays: application to real-time audio synthesis and acoustic modelling. Gibbons, J. A. January 2006.
No description available.
3. Hardware and algorithm architectures for real-time additive synthesis. Symons, Peter Robert. January 2005.
No description available.
4. Development and exploration of a timbre space representation of audio. Nicol, Craig Andrew. January 2005.
No description available.
5. Real-time sound spatialization, software design and implementation. Moore, David Robert. January 2004.
No description available.
6. Novel techniques for audio music classification and search. West, Kristopher C. January 2008.
No description available.
7. Artificial intelligence-based approach to modelling of pipe organs. Hamadicharef, Brahim. January 2005.
The aim of the project was to develop a new Artificial Intelligence-based method to aid the modelling of musical instruments and sound design. Despite significant advances in music technology, sound design and synthesis of complex musical instruments remain time consuming and error prone, and require expert understanding of the instrument's attributes and significant expertise to produce high-quality synthesised sounds that meet the needs of musicians and musical instrument builders. Artificial Intelligence (AI) offers an effective means of capturing this expertise and of handling the imprecision and uncertainty inherent in audio knowledge and data. This thesis presents new techniques to capture and exploit audio expertise, developed following extended knowledge elicitation with two renowned music technology/audio experts and embodied in an intelligent audio system. The AI, combined with perceptual auditory modelling techniques (ITU-R BS.1387), forms a generic modelling framework that provides a robust methodology for optimising sound synthesis parameters with objective prediction of sound synthesis quality. The evaluation, carried out using typical pipe organ sounds, has shown that the intelligent audio system can automatically design sounds judged by the experts to be of very good quality, while reducing the expert's workload by up to a factor of three and lessening the need for extensive subjective tests. This research, the first initiative to explicitly capture knowledge from audio experts for sound design, represents an important contribution to the future design of electronic musical instruments based on perceptual sound quality. It will help to develop a new sound quality index for benchmarking sound synthesis techniques and serve as a research framework for modelling a wide range of musical instruments.
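The core loop of such a framework can be illustrated with a minimal sketch: candidate synthesis parameters are scored by an objective quality function and iteratively refined. The toy synthesiser, target spectrum, and quality score below are hypothetical stand-ins; the thesis itself uses an AI knowledge base and an ITU-R BS.1387-style perceptual model rather than this simple objective.

```python
# Minimal sketch of objective-quality-driven parameter optimisation (not the thesis code).
import numpy as np

rng = np.random.default_rng(1)
TARGET = np.array([1.0, 0.45, 0.20, 0.08])   # stand-in for measured partial amplitudes of a pipe

def synthesise(params):
    """Toy additive 'synthesiser': parameters are treated directly as partial amplitudes."""
    return np.clip(params, 0.0, None)

def quality_score(candidate):
    """Stand-in for an objective perceptual quality measure (higher is better)."""
    return -np.sum((synthesise(candidate) - TARGET) ** 2)

params = rng.uniform(0.0, 1.0, TARGET.size)   # initial guess at the synthesis parameters
best = quality_score(params)
for step in range(2000):                      # simple stochastic hill-climbing loop
    trial = params + rng.normal(0.0, 0.05, params.size)
    score = quality_score(trial)
    if score > best:                          # keep the candidate if predicted quality improves
        params, best = trial, score

print("optimised parameters:", np.round(params, 3), "score:", round(best, 5))
```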
8. A study on reusing resources of speech synthesis for closely-related languages. Samsudin, Nur Hana. January 2017.
This thesis describes research on building a text-to-speech (TTS) framework that can accommodate the lack of linguistic information for under-resourced languages by using existing resources from another language, and it describes the adaptation process required when such limited resources are used. The main natural languages involved in this research are Malay and Iban. The thesis includes a study on grapheme-to-phoneme mapping and the substitution of phonemes. A set of substitution matrices is presented showing phoneme confusion, in terms of perception, among respondents. The experiments conducted study intelligibility as well as perception based on the context of utterances. A study of phonetic prosody is then presented and compared with the Klatt duration model, to establish whether similarities exist that would support a cross-language duration model. A comparative study of an Iban native speaker against an Iban polyglot TTS built using Malay resources is then presented, to confirm that Malay prosody can be used to generate Iban synthesised speech. The central hypothesis of this thesis is that, by using the resources of a closely-related language, natural-sounding speech can be produced. The aim of this research was to show that, by adhering to the characteristics of the indigenous language, it is possible to build a polyglot synthesised speech system even with insufficient speech resources.
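As an illustration of how a perceptual substitution matrix of this kind can be used, the sketch below turns listener confusion counts into a lookup that picks, for each out-of-inventory phone, the in-inventory phone it is most often confused with. The phone labels and counts are illustrative placeholders, not data from the thesis.

```python
# Minimal sketch of deriving a phoneme substitution table from perceptual confusions.
from collections import Counter

# confusions[p] counts which in-inventory phone listeners reported when hearing phone p
confusions = {
    "p1": Counter({"a": 14, "e": 5, "o": 1}),   # hypothetical out-of-inventory phone
    "p2": Counter({"s": 9, "t": 8, "c": 3}),
}

def best_substitute(phone):
    """Return the in-inventory phone most often confused with `phone` (perceptually closest)."""
    return confusions[phone].most_common(1)[0][0]

substitution_table = {p: best_substitute(p) for p in confusions}
print(substitution_table)   # e.g. {'p1': 'a', 'p2': 's'}
```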
9. Intelligibility of synthetic speech in noise and reverberation. Isaac, Karl Bruce. January 2015.
Synthetic speech is a valuable means of output, in a range of application contexts, for people with visual, cognitive, or other impairments, or for situations where other means are not practicable. Noise and reverberation occur in many of these application contexts and are known to have devastating effects on the intelligibility of natural speech, yet very little was known about their effects on synthetic speech based on unit selection or hidden Markov models. In this thesis, we put forward an approach for assessing the intelligibility of synthetic and natural speech in noise, reverberation, or a combination of the two. The approach uses an experimental methodology consisting of Amazon Mechanical Turk, Matrix sentences, and noises that approximate the real world, evaluated with generalized linear mixed models. The experimental methodologies were assessed against their traditional counterparts and were found to provide a number of additional benefits while maintaining equivalent measures of relative performance. Subsequent experiments were carried out to establish the efficacy of the approach in measuring intelligibility in noise and then in reverberation. Finally, the approach was applied to natural speech and the two synthetic speech systems in combinations of noise and reverberation. We examine and report on the intelligibility of current synthesis systems in real-life noise and reverberation using techniques, including Amazon Mechanical Turk, that bridge the gap between the audiology and speech synthesis communities. In the process, we establish Amazon Mechanical Turk and Matrix sentences as valuable tools in the assessment of synthetic speech intelligibility.
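A minimal sketch of how such stimuli can be prepared is given below: speech is mixed with noise at a chosen signal-to-noise ratio and reverberation is added by convolution with a room impulse response. The signals here are random placeholders standing in for the synthetic Matrix sentences, real-world noise recordings, and impulse responses used in the thesis.

```python
# Minimal sketch of building a noise + reverberation stimulus (not the thesis code).
import numpy as np

rng = np.random.default_rng(2)
fs = 16000
speech = rng.normal(size=fs)                 # 1 s placeholder for a synthetic Matrix sentence
noise = rng.normal(size=fs)                  # placeholder for a real-world noise recording
t = np.arange(int(0.3 * fs))
rir = np.exp(-t / (0.05 * fs)) * rng.normal(size=t.size)   # toy exponentially decaying impulse response

def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so the speech-to-noise power ratio equals snr_db, then add it."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10.0)))
    return speech + scale * noise

def add_reverb(signal, rir):
    """Apply reverberation by convolution with a (measured or simulated) impulse response."""
    wet = np.convolve(signal, rir)[: len(signal)]
    return wet / np.max(np.abs(wet))          # normalise to avoid clipping

stimulus = add_reverb(mix_at_snr(speech, noise, snr_db=0.0), rir)
print(stimulus.shape)
```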
10. Connectionist multivariate density-estimation and its application to speech synthesis. Uria, Benigno. January 2016.
Autoregressive models factorize a multivariate joint probability distribution into a product of one-dimensional conditional distributions. The variables are assigned an ordering, and the conditional distribution of each variable is modelled using all variables preceding it in that ordering as predictors. Calculating normalized probabilities and sampling has polynomial computational complexity under autoregressive models. Moreover, binary autoregressive models based on neural networks obtain statistical performance similar to that of some intractable models, like restricted Boltzmann machines, on several datasets. The use of autoregressive probability density estimators based on neural networks to model real-valued data, while proposed before, has never been properly investigated and reported. In this thesis we extend the formulation of neural autoregressive distribution estimators (NADE) to real-valued data; a model we call the real-valued neural autoregressive density estimator (RNADE). Its statistical performance on several datasets, including visual and auditory data, is reported and compared to that of other models. RNADE obtained higher test likelihoods than other tractable models, while retaining all the attractive computational properties of autoregressive models. However, autoregressive models are limited by the ordering of the variables inherent to their formulation. Marginalization and imputation tasks can only be solved analytically if the missing variables are at the end of the ordering. We present a new training technique that obtains a set of parameters that can be used for any ordering of the variables. By choosing a model with a convenient ordering of the dimensions at test time, it is possible to solve any marginalization and imputation task analytically. The same training procedure also makes it practical to train NADEs and RNADEs with several hidden layers. The resulting deep and tractable models display higher test likelihoods than the equivalent one-hidden-layer models for all the datasets tested. Ensembles of NADEs or RNADEs can be created inexpensively by combining models that share their parameters but differ in the ordering of the variables. These ensembles of autoregressive models obtain state-of-the-art statistical performance on several datasets. Finally, we demonstrate the application of RNADE to speech synthesis, and confirm that capturing the phone-conditional dependencies of acoustic features improves the quality of synthetic speech. Our model generates synthetic speech that was judged by naive listeners as being of higher quality than that generated by mixture density networks, which are considered a state-of-the-art synthesis technique.
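The factorisation p(x) = Π_d p(x_d | x_<d) with NADE-style shared hidden activations can be sketched as follows. This is a simplified illustration only: it assumes a single Gaussian conditional per dimension rather than the mixture components and activation rescaling used in RNADE, and its parameters are random and untrained, so it only demonstrates the O(D·H) likelihood computation and ancestral sampling.

```python
# Minimal sketch of a Gaussian autoregressive density estimator with NADE-style weight sharing.
import numpy as np

rng = np.random.default_rng(0)
D, H = 5, 16                       # data dimensionality, hidden units
W = rng.normal(0, 0.1, (H, D))     # shared input-to-hidden weights (one column per dimension)
c = np.zeros(H)                    # hidden bias
V_mu = rng.normal(0, 0.1, (D, H))  # hidden-to-mean weights, one row per conditional
V_ls = rng.normal(0, 0.1, (D, H))  # hidden-to-log-std weights
b_mu, b_ls = np.zeros(D), np.zeros(D)

def log_density(x):
    """Log p(x) = sum_d log p(x_d | x_<d), computed with the shared cumulative activation."""
    a = c.copy()                   # running pre-activation, updated as dimensions are consumed
    logp = 0.0
    for d in range(D):
        h = 1.0 / (1.0 + np.exp(-a))             # hidden state summarising x_<d
        mu = V_mu[d] @ h + b_mu[d]               # conditional mean of x_d
        sigma = np.exp(V_ls[d] @ h + b_ls[d])    # conditional standard deviation of x_d
        logp += -0.5 * np.log(2 * np.pi) - np.log(sigma) - 0.5 * ((x[d] - mu) / sigma) ** 2
        a += W[:, d] * x[d]                      # fold x_d into the shared activation
    return logp

def sample():
    """Ancestral sampling: draw each x_d from its conditional, one dimension at a time."""
    a = c.copy()
    x = np.zeros(D)
    for d in range(D):
        h = 1.0 / (1.0 + np.exp(-a))
        mu = V_mu[d] @ h + b_mu[d]
        sigma = np.exp(V_ls[d] @ h + b_ls[d])
        x[d] = rng.normal(mu, sigma)
        a += W[:, d] * x[d]
    return x

x = sample()
print(x, log_density(x))
```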