Global ETD Search

31	The synthesis of sound with application in a MIDI environment. Kesterton, Anthony James. January 1991. The many options for experimenting with sound synthesis are usually expensive, difficult to obtain, or limiting for the experimenter. The work described in this thesis shows how the IBM PC and software can be combined to provide a suitable platform for experimentation with different synthesis techniques. This platform is based on the PC, the Musical Instrument Digital Interface (MIDI) and a musical instrument called a digital sampler. The fundamental concepts of sound are described, with reference to digital sound reproduction. A number of synthesis techniques are described and evaluated according to the criteria of generality, efficiency and control. The techniques discussed are additive synthesis, frequency modulation synthesis, subtractive synthesis, granular synthesis, resynthesis, wavetable synthesis, and sampling. Spiral synthesis, physical modelling, waveshaping and spectral interpolation are discussed briefly. The Musical Instrument Digital Interface is a standard method of connecting digital musical instruments together. It is the MIDI standard and equipment conforming to that standard that make this implementation of synthesis techniques possible. As a demonstration of the PC platform, additive synthesis, frequency modulation synthesis, granular synthesis and spiral synthesis have been implemented in software. A PC equipped with a MIDI interface card is used to perform the synthesis. The MIDI protocol is used to transmit the resultant sound to a digital sampler. The INMOS transputer is used as an accelerator, as the calculation of a waveform using software is a computationally intensive process. It is concluded that sound synthesis can be performed successfully using a PC and the appropriate software, and utilizing the facilities provided by a MIDI environment including a digital sampler.
Computer sound processing -- Research; Music -- Data processing -- Research; MIDI (Standard)
32	A distributed approach to surround sound production. Smith, Adrian Wilfrid. January 1999. The requirement for multi-channel surround sound in audio production applications is growing rapidly. Audio processing in these applications can be costly, particularly in multi-channel systems. A distributed approach is proposed for the development of a real-time spatialization system for surround sound music production, using Ambisonic surround sound methods. The latency in the system is analyzed, with a focus on the audio processing and network delays, in order to ascertain the feasibility of an enhanced, distributed real-time spatialization system.
Surround-sound systems; Computer sound processing; Music -- Data processing
33	High-level audio morphing strategies. Hatch, Wesley. January 2005. No description available.
Computer sound processing
34	A comparative study of time-stretching algorithms for audio signals. Markle, Blake L. January 2001. No description available.
Computer sound processing; Vocoder
35	The effects of 16 variables on a telephone information system which uses synthetic speech. Beaudet, Douglas Barrett. January 1988. Information systems that employ synthetic speech are emerging daily in the consumer market. However, many of these systems are being developed without first investigating the numerous factors that affect the design and usability of these systems. This study investigated the effects of 16 variables on a telephone information system which uses synthetic speech as the display modality. The information system was for a fictitious department store. Subjects telephoned the system and searched for information messages on specific store items. Upon hearing the message, subjects transcribed what they heard and rated their perceived difficulty in understanding the message, their confidence in correctly remembering the message, and their perceived difficulty in finding the store item in the system. Subject search performance measures were recorded during each search, and system evaluation subjective ratings were collected at the end of each experimental session. A Hadamard 32x32 matrix design was used in this screening study to test efficiently the main effects of the 16 variables on 23 measures of user performance. Only 32 data points were required to evaluate the variables in the screening study. The analyses identified 8 variables (speech rate, menu organization, number of targets, wallet guide, menu feedback, background music, subject age, and subject gender) as having a significant effect in at least two tests; 4 variables (voice type, pause/resume, repeat keyword, and command feedback) as having a significant effect in one test; and 4 variables (input timeout, system response time, selection feedback, and spell-out keyword) that did not have a significant effect in any test. The analyses also assessed the worth of the 12 dependent measures in providing meaningful test results.
Master of Science. LD5655.V855 1988.B427
Computer sound processing; Psychoacoustics
36	Using subjective ratings to select independent variables in the design of telephone inquiry systems. Merkle, Peter Jay Jr. January 1988. This thesis describes a two-part research program in which the applicability of subjective ratings to the selection of independent variables was evaluated. The first portion of the research reviewed a case study involving the application of complex system investigation to the development of a telephone inquiry system. A telephone inquiry system is one in which users seek information in a database by calling the system, listening to information presented by a synthetic voice, and directing movement through the database with commands on the telephone keypad. The complex system investigation method used included identifying the independent variables by brainstorming, then reducing the list by subjecting the variables to literature review, feasibility analysis, relevance analysis, and subjective ratings of the factors based on a prototype system. Variables which were not likely to have an immediate impact on human performance in the system were set to a constant value. The use of subjective ratings to select independent variables stems from the need to reduce large numbers of independent variables to a list which can be used as candidates for a screening study. The result of the case study was a list of 19 candidate factors suggested for implementation in a screening study. The second portion of the research describes an experiment in which 5 independent variables (number of steps in a search, adapting speech rate, transaction summary, native/non-native, and sex of the voice) were chosen to represent the 19 candidate factors in an experiment testing the validity of the subjective ratings technique. The results indicated that the subjective ratings of the prototype system were effective in predicting performance and subjective ratings. The impact of these results on the methodology and telephone inquiry systems is also discussed.
M.S. LD5655.V855 1988.M474
Computer sound processing; Psychoacoustics; Telephone systems
37	A Study of Timbre Modulation Using a Digital Computer, with Applications to Composition. Hamilton, Richard L. 12 1900. This paper presents a means of modulating timbre in digital sound synthesis using additive processes. A major portion of the paper is a computer program, written in PL/I, which combines this additive method of timbre modulation with several other sound manipulation ideas to form a compositional program. This program, named CART for Computer Aided Rotational Translation, provides input for the Music 360 digital sound synthesis program. The paper contains three major parts: (1) a discussion of the CART program's evolution; (2) a manual describing in detail the use of CART; and (3) two tape compositions realized using the program. An appendix contains the program listing and a listing of the input cards that were used to produce the two compositions.
Computer sound processing; Computer music; Digital sound synthesis; Computer composition; CART (Computer file)
38	Detecção de atividade vocal empregando máquinas de Boltzmann restritas. / Voice activity detection employing restricted Boltzmann machines. Borin, Rogério Guerra. 06 December 2016. In this work, a type of Restricted Boltzmann Machine (RBM) having a classification layer is adapted to allow its use with data defined in a continuous domain. This adaptation gives rise to a variant of the model for which the parameter update rules are developed for the discriminative, generative and hybrid types of training. The application of the variant as a classifier to the Voice Activity Detection (VAD) problem is then investigated. By means of simulations involving the NOIZEUS corpus and employing Mel-Frequency Cepstral Coefficients (MFCCs) or Filter-Bank Energies (FBEs) as classifier inputs, results comparable to those of state-of-the-art detectors are achieved at a lower computational cost. The RBM variant is also compared to linear and Gaussian-kernel Support Vector Machines (SVMs). With discriminative training, the RBM performs at a level intermediate between the two SVM types, but at a computational cost considerably lower than either. Additionally, a set of audio measures whose application in VAD has recently been proposed is evaluated using the RBM with discriminative training. Although the results are not conclusive, the performances obtained indicate that these measures are not advantageous when compared to the traditional MFCCs.
Artificial intelligence; Signal processing; Sound processing; Telephony
40	Cochlear implant sound coding with across-frequency delays. Taft, Daniel Adam. January 2009. The experiments described in this thesis investigate the temporal relationship between frequency bands in a cochlear implant sound processor. Initial studies were of cochlea-based traveling wave delays for cochlear implant sound processing strategies. These were later broadened into studies of an ensemble of across-frequency delays. Before incorporating cochlear delays into a cochlear implant processor, a set of suitable delays was determined with a psychoacoustic calibration to pitch perception, since normal cochlear delays are a function of frequency. The first experiment assessed the perception of pitch evoked by electrical stimuli from cochlear implant electrodes. Six cochlear implant users with acoustic hearing in their non-implanted ears were recruited for this, since they were able to compare electric stimuli to acoustic tones. Traveling wave delays were then computed for each subject using the frequencies matched to their electrodes. These were similar across subjects, ranging over 0-6 milliseconds along the electrode array. The next experiment applied the calibrated delays to the ACE strategy filter outputs before maxima selection. The effects upon speech perception in noise were assessed with cochlear implant users, and a small but significant improvement was observed. A subsequent sensitivity analysis indicated that accurate calibration of the delays might not be necessary after all; instead, a range of across-frequency delays might be similarly beneficial. A computational investigation was performed next, where a corpus of recorded speech was passed through the ACE cochlear implant sound processing strategy in order to determine how across-frequency delays altered the patterns of stimulation. A range of delay vectors was used in combination with a number of processing parameter sets and noise levels. The results showed that additional stimuli from broadband sounds (such as the glottal pulses of vowels) are selected when frequency bands are desynchronized with across-frequency delays. Background noise contains fewer dominant impulses than a single talker and so is not enhanced in this way. In the following experiment, speech perception with an ensemble of across-frequency delays was assessed with eight cochlear implant users. Reverse cochlear delays (high frequency delays) were equivalent to conventional cochlear delays. Benefit was diminished for larger delays. Speech recognition scores were at baseline with random delay assignments. An information transmission analysis of speech in quiet indicated that the discrimination of voiced cues was most improved with across-frequency delays. For some subjects, this was seen as improved vowel discrimination based on formant locations and improved transmission of the place of articulation of consonants. A final study indicated that benefits to speech perception with across-frequency delays are diminished when the number of maxima selected per frame is increased above 8-out-of-22 frequency bands.
