591 |
Automatic Speech Recognition for ageing voices
Vipperla, Ravichander January 2011
With ageing, human voices undergo several changes which are typically characterised by increased hoarseness, breathiness, changes in articulatory patterns and slower speaking rate. The focus of this thesis is to understand the impact of ageing on Automatic Speech Recognition (ASR) performance and improve the ASR accuracies for older voices. Baseline results on three corpora indicate that the word error rates (WER) for older adults are significantly higher than those of younger adults and that the decrease in accuracy is larger for male speakers than for female speakers. Acoustic parameters such as jitter and shimmer that measure glottal source disfluencies were found to be significantly higher for older adults. However, the hypothesis that these changes explain the differences in WER for the two age groups is proven incorrect. Experiments with artificial introduction of glottal source disfluencies in speech from younger adults do not display a significant impact on WERs. Changes in fundamental frequency, observed quite often in older voices, have only a marginal impact on ASR accuracies. Analysis of phoneme errors between younger and older speakers shows a pattern of certain phonemes, especially lower vowels, being more affected by ageing. These changes, however, are seen to vary across speakers. Another factor that is strongly associated with ageing voices is a decrease in the rate of speech. Experiments to analyse the impact of slower speaking rate on ASR accuracies indicate that insertion errors increase when slower speech is decoded with models trained on relatively faster speech. We then propose a way to characterise speakers in acoustic space based on speaker adaptation transforms and observe that speakers (especially males) can be segregated with reasonable accuracies based on age. Inspired by this, we look at supervised hierarchical acoustic models based on gender and age. 
Significant improvements in word accuracies are achieved over the baseline results with such models. The idea is then extended to construct unsupervised hierarchical models which also outperform the baseline models by a good margin. Finally, we hypothesize that the ASR accuracies can be improved by augmenting the adaptation data with speech from acoustically closest speakers. A strategy to select the augmentation speakers is proposed. Experimental results on two corpora indicate that the hypothesis holds true only when the amount of available adaptation data is limited to a few seconds. The efficacy of such a speaker selection strategy is analysed for both younger and older adults.
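For readers unfamiliar with the metric, the word error rate used throughout the abstract above is conventionally the number of substitutions, deletions and insertions in a minimum-cost word alignment, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring tool used in the thesis; the function name `wer_counts` and the example sentences are illustrative assumptions.

```python
def wer_counts(reference, hypothesis):
    """Return WER and error counts from a minimum edit-distance word alignment.

    Illustrative sketch of the standard WER definition; the function name and
    return format are assumptions, not taken from the thesis.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # cell[i][j] = (cost, substitutions, deletions, insertions)
    # for aligning ref[:i] with hyp[:j]
    cell = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    cell[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        cell[i][0] = (i, 0, i, 0)          # only deletions
    for j in range(1, len(hyp) + 1):
        cell[0][j] = (j, 0, 0, j)          # only insertions

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost, subs, dels, ins = cell[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (cost, subs, dels, ins)           # correct word
            else:
                diagonal = (cost + 1, subs + 1, dels, ins)   # substitution
            cost, subs, dels, ins = cell[i - 1][j]
            deletion = (cost + 1, subs, dels + 1, ins)
            cost, subs, dels, ins = cell[i][j - 1]
            insertion = (cost + 1, subs, dels, ins + 1)
            cell[i][j] = min(diagonal, deletion, insertion, key=lambda c: c[0])

    cost, subs, dels, ins = cell[-1][-1]
    return {"wer": cost / len(ref), "substitutions": subs,
            "deletions": dels, "insertions": ins}


# A repeated word in the recogniser output counts as one insertion error.
result = wer_counts("the cat sat on the mat", "the cat sat on on the mat")
print(result)
```

Because insertions are counted against the reference length, a hypothesis with many spurious words can push WER above 1.0, which is one reason insertion-heavy output on slow speech is costly.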
|
592 |
Multichannel Digital Signal Processor Based Red/Black Keyset
Smith, Quentin D. 10 1900
International Telemetering Conference Proceedings / October 26-29, 1992 / Town and Country Hotel and Convention Center, San Diego, California / This paper addresses a method to provide both secure and non-secure voice communications to a DS-1 network from a common keyset. In order to comply with both the electrical isolation requirements and the operational security issues regarding voice communications, an all-digital approach to the keyset was developed based upon the AD2101 DSP. Protocols that are handled by the keyset include: Multiple PTT modes, hot mike, telephone access, priority override, direct access, indirect access, paging, and monitor only. Special features that are addressed include: independent channel by channel assignment of access protocols, headset assignment, speaker assignment, and PTT assignment. Multiple microprocessors are used to implement the foregoing as well as down-loadable configurations, remote keyset control and monitoring, and composite audio outputs. Partitioning of the digital design provides RED to BLACK channel isolation and RED channel to AC power isolation of greater than 107 dB.
|
593 |
Edwards Digital Switch System Overview
Switzer, Earl R., Straehley, Erwin H. 10 1900
International Telemetering Conference Proceedings / October 26-29, 1992 / Town and Country Hotel and Convention Center, San Diego, California / The Edwards Digital Switch (EDS) is a digital communication system that provides advanced voice networking capabilities to the Edwards Test Range. (The original name, Edwards Communication Switching System (ECSS), was changed to Edwards Digital Switch (EDS) in 1990.) The EDS is a member of a new family of all-digital switching systems that internally handle data in digital form. To accommodate analog voice and data circuits, conversions between analog and digital formats occur at the system interfaces. The EDS consists of six groups of configuration items: System-level control and monitoring is centralized in the Control and Display Subsystem. Workstations provide subsystem-level control and monitoring. The Central Switching Subsystem, as the primary interface with the range environment, provides system connectivity to radios, telephone circuits, and communications links to other facilities. It integrates the EDS with links to the Control Room Switching Subsystems. Each Control Room Switching Subsystem connects individual user stations within a Mission Control Room or other localized area. The user equipment element consists of a Subscriber Terminal Unit, Channel Expander, and interface panels for headsets, foot switches, and speakers. The Remote Radio Control Unit optimizes usage of available frequencies, allowing control of tunable radios from the Control and Display Subsystem. The Site Selection Unit facilitates the handover of voice communications between receiver sites when a long-range test is monitored. The system architecture is based on a central system-level control element, a central switch, multiple subsystem-level control elements, multiple subsystem switches, and end-equipment items that are interconnected through the switch network. The EDS combines multiple voice communications applications in a single system. 
The system is being expanded to integrate voice and data switching. Its major function is support of multiparty networked voice communications within Mission Control Rooms and between other test participants. Other voice functions are an intercom capability including both Direct Access (hot line) and Indirect Access (dial-up), subscriber loop connections to the base-level telephone exchange, and the Public Switched Network System. Digital interfaces allow integration of ciphertext data and Time Space Position Information data switching functions. A system based on the EDS design has also been installed by the Air Force at Eglin AFB. Engineering studies for systems that make use of the EDS design are currently underway by the Navy at China Lake and the Army at White Sands Missile Range. The EDS project office has actively pursued promising program management concepts such as: specifying nondevelopmental items, requiring industry standard interconnectivity and interoperability, and using a multiyear fixed-price requirements-type contract to encourage multiservice participation.
|
594 |
Contemporary Plans for Training the Boy's Changing Voice
Cox, Rolla Kenneth 01 1900
The purpose of this study is to describe contemporary plans for training the boy's changing voice and to prescribe ensemble material for these voices. Specific Problems: Analysis of the general problem leads to subordinate questions, or sub-problems, which may be stated as follows: 1. What are the contemporary plans of training the boy's changing voice? 2. What are the most usable musical materials available for use by ensembles which include boys with changing voices?
|
595 |
A Comparison of Major Theories of Laryngeal Vibration
Smith, Sue Ellen 12 1900
The purpose of this study was to compare major theories of laryngeal vibration. The basic hypothesis of the study was that the differences and similarities between the major theories of laryngeal vibration could be made evident and clear through a comparative study. It was assumed that there are two or more theories of laryngeal vibration and that all the major theories of laryngeal vibration from 1945 to the present have been described in written form in English.
|
596 |
Multibiometric security in wireless communication systems
Sepasian, Mojtaba January 2010
This thesis explores an application of multibiometrics to secure wireless communications. The media studied were Wi-Fi, 3G, and WiMAX, over which simulations and experimental studies were carried out to assess performance. Specifically, access is restricted to authorized users by a technique referred to hereafter as a multibiometric cryptosystem. In brief, the system is built upon a complete challenge/response methodology that obtains a high level of security by identifying the user by fingerprint and then confirming the user's identity through text-dependent speaker recognition. The first phase is enrolment, in which the database of fingerprints watermarked with memorable texts, along with voice features based on the same texts, is created by sending them to the server over a wireless channel. The second phase is verification, in which users claiming to be genuine are verified against the database; it consists of five steps. At the identification level, the user is first asked to present a fingerprint and a memorable word, the latter being watermarked into the former, so that the system can authenticate the fingerprint and verify its validity by retrieving the challenge for the accepted user. The following three steps then involve speaker recognition: the user responds to the challenge with text-dependent speech, the server authenticates the response, and finally the server accepts or rejects the user. To implement fingerprint watermarking, i.e. incorporating the memorable word as a watermark message into the fingerprint image, an algorithm of five steps has been developed. 
The first three steps, which are novel, concern fingerprint image enhancement (CLAHE with 'Clip Limit', standard deviation analysis and sliding neighborhood); two further steps embed the watermark into, and extract it from, the enhanced fingerprint image using the Discrete Wavelet Transform (DWT). In the speaker recognition stage, the limitations of this technique in wireless communication have been addressed by sending voice features (cepstral coefficients) instead of raw samples. This scheme reduces the transmission time and the dependency of the data on the communication channel, and avoids packet loss. Finally, the obtained results have verified the claims.
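To make the idea of sending cepstral coefficients instead of raw samples concrete, the sketch below computes a plain real cepstrum (the inverse DFT of the log magnitude spectrum) for one short frame. This is only one textbook way to obtain cepstral coefficients and is assumed here for illustration; the abstract does not say which cepstral features the thesis uses, and the function names `dft` and `real_cepstrum`, the `floor` parameter and the frame values are hypothetical.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform, adequate for short frames."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def real_cepstrum(frame, floor=1e-12):
    """Real cepstrum of one frame: inverse DFT of the log magnitude spectrum.

    `floor` (an assumption of this sketch) guards against log(0) on empty bins.
    """
    n = len(frame)
    log_magnitude = [math.log(max(abs(value), floor)) for value in dft(frame)]
    return [
        sum(log_magnitude[k] * cmath.exp(2j * cmath.pi * k * t / n)
            for k in range(n)).real / n
        for t in range(n)
    ]


# Hypothetical 8-sample frame; a real system would use windowed speech frames.
frame = [0.0, 0.5, 0.9, 0.4, -0.2, -0.7, -0.5, 0.1]
coefficients = real_cepstrum(frame)
# Only the first few (low-order) coefficients would typically be kept.
print([round(c, 4) for c in coefficients[:4]])
```

Keeping only a handful of low-order coefficients per frame is what makes the feature vector smaller than the raw frame it summarises.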
|
597 |
ESTILL VOICE TRAINING: THE KEY TO HOLISTIC VOICE AND SPEECH TRAINING FOR THE ACTOR
Salsbury, Katharine 28 April 2014
The aim of this paper is to examine the Estill Voice Training System and to explain how it may be used in tandem with widely accepted voice and speech methodologies, such as those developed by Kristin Linklater, Patsy Rodenburg and Dudley Knight/Phil Thompson, in order to produce versatile performers able to meet the vocal gauntlet flung at the feet of the contemporary actor. Students must be able to function effectively as voice-over talent, sing musical theatre, rattle off classical text with aplomb and work in film, all with superior vocal health. Synthesizing proven techniques with the skills presented in the inter-disciplinary Estill Voice Training System, I hope to develop a new, anatomically specific voice and speech training progression that efficiently helps the student actor discover the physical and emotional vocal ranges demanded of the contemporary actor.
|
598 |
An American Actor's Dialect
Bruckmueller, Michael J. 01 January 2004
Over the course of the past ten years, both studying and teaching Voice & Speech for the Actor, I have become frustrated with the status quo of so-called 'standard speech'. The two dialects that I have studied in depth are Edith Skinner's 'American Classical Stage Standard' and Kenneth Crannell's 'Career Speech'. I have found something lacking in both the Skinner dialect and Crannell's 'Career Speech'. Yet I believe that each has a strength from which the other could benefit. The specificity of the Skinner dialect makes 'American Classical Stage Standard' not only easy to learn but also an excellent tool in ear training. The problem with this dialect is that before its artificial creation, it did not exist in the American English language. Additionally, 'American Classical Stage Standard' is not appropriate for theatrical works in a contemporary setting. Conversely, the 'standards' that have been formed in reaction to Skinner's method, such as Crannell's 'Career Speech', are rooted in American English speech. But since Crannell's 'Career Speech' relies heavily on observation, the resulting paradigm avoids specificity, because in the real world not everyone speaks in the same way. The dialect that I am setting forth in this project is my attempt to combine the Skinner dialect and Crannell's 'Career Speech' to create a dialect that is contemporary but not geographically specific in sound. My American Actor's dialect will be simple and efficient to learn and teach. It will provide the student with a base dialect for further study in voice and speech for the stage, and for contemporary American theatrical works set after 1980 when no dialect is called for in the script or the director chooses not to include dialect work in that specific production.
|
599 |
The significance of kinaesthetic vocal sensations related to listening behaviour : an MRI study
Miller, Nicola Anne January 2014
The aim of this project was to investigate the nature and possible significance of first-person kinaesthetic vocal sensations observed in association with musical listening. Hearing and voice are known to be closely linked but the mechanisms that underlie their close relationship are not yet understood. The presence of kinaesthetic vocal sensations challenges accounts of auditory processing that are divorced from peripheral vocal input and, instead, suggests the hypothesis that auditory and vocal processing mechanisms rely on shared peripheral substrates in addition to their increasingly recognized shared (brain-based) central substrates. To investigate this hypothesis, I used MRI and developed a measurement protocol (informed by established methods in cephalometry) that would allow me to relate vocal structures to their direct and indirect bony attachments to the craniofacial skeleton, cervical spine and sternum. After establishing the method's validity in subjects at rest, I acquired midsagittal MR images (under conditions where articulatory and postural input was negligible) while subjects (1) hummed and (2) listened (in a focused way) to low and high notes at each end of their range. Geometric and shape analysis of craniocaudal, craniocervical and anteroposterior variables revealed significant differences between low- and high-note conditions and widespread correlations between variables for both humming and listening investigations. An unexpected association between pitch change and changes of cervical alignment was also found. These results were complemented and extended by using the same MR images to build an active shape model (ASM). In addition to showing how vocal structures move together, ASM showed goal-related vocal activity to consist of one or more independent modes of variation. 
Together, the observations, experimental results, and evidence from diverse historical and contemporary sources, support the hypothesis that mechanisms underlying auditory and vocal processing rely on shared central and peripheral substrates. Wide-ranging implications arising from this hypothesis are also discussed.
|
600 |
Operational benefit of implementing VoIP in a tactical environment / Operational benefit of implementing Voice Over Internet Protocol in a tactical environment
Lewis, Rosemary 06 1900
Approved for public release, distribution is unlimited / In this thesis, Voice over Internet Protocol (VoIP) technology will be explored and a recommendation on the operational benefit of VoIP will be provided. A network model will be used to demonstrate improvement of voice end-to-end delay by implementing quality of service (QoS) controls. An overview of VoIP requirements will be covered and recommended standards will be reviewed. A clear definition of a Battle Group will be presented and an overview of current analog RF voice technology will be given. A comparison of RF voice technology and VoIP will be modeled using OPNET Modeler 9.0. / Lieutenant, United States Navy
|