• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 2
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Active Control of the Human Voice from a Sphere

Anderson, Monty J 01 May 2015 (has links) (PDF)
This work investigates the application of active noise control (ANC) to speech. ANC has had success reducing tonal noise. In this work, that success was extended to noise that is not completely tonal but has some tonal elements such as speech. Limitations such as causality were established on the active control of human speech. An optimal configuration for control actuators was developed for a sphere using a genetic algorithm. The optimal error sensor location was found from exploring the nulls associated with the magnitude of the radiated pressure with reference to the primary pressure field. Both numerically predicted and experimentally validated results for the attenuation of single frequency tones were shown. The differences between the numerically predicted results for attenuation with a sphere present in the pressure field and monopoles in the free-field are also discussed.The attenuation from ANC of both monotone and natural speech is shown and a discussion about the effect of causality on the results is given. The sentence “Joe took father’s shoe bench out” was used for both monotone and natural speech. Over this entire monotone speech sentence, the average attenuation was 8.6 dB with a peak attenuation of 10.6 dB for the syllable “Joe”. Natural speech attenuation was 1.1 dB for the sentence average with a peak attenuation on the syllable “bench” of 2.4 dB. In addition to the lower attenuation values for natural speech, the pressure level for the word “took” was increased by 2.3 dB. Also, the harmonic at 420 Hz in the word “father’s” of monotone speech was reduced globally up to 20 dB. Based on the results of the attenuation of monotone and natural speech, it was concluded that a reasonable amount of attenuation could be achieved on natural speech if its correlation could approach that of monotone speech.
2

Design of a Robust and Flexible Grammar for Speech Control

Ludyga, Tomasz 28 May 2024 (has links)
Voice interaction is an established automatization and accessibility feature. While many satisfactory speech recognition solutions are available today, the interpretation of text se-mantic is in some use-cases difficult. Differentiated can be two types of text semantic ex-traction models: probabilistic and pure rule-based. Rule-based reasoning is formalizable into grammars and enables fast language validation, transparent decision-making and easy customization. In this thesis we develop a context-free ANTLR semantic grammar to control software by speech in a medical, smart glasses related, domain. The implementation is preceded by research of state-of-the-art, requirements consultation and a thorough design of reusable system abstractions. Design includes definitions of DSL, meta grammar, generic system ar-chitecture and tool support. Additionally, we investigate trivial and experimental grammar improvement techniques. Due to multifaceted flexibility and robustness of the designed framework, we indicate its usability in critical and adaptive systems. We determine 75% semantic recognition accuracy in the medical main use-case. We compare it against se-mantic extraction using SpaCy and two fine-tuned AI classifiers. The evaluation reveals high accuracy of BERT for sequence classification and big potential of hybrid solutions with AI techniques on top grammars, essentially for detection of alerts. The accuracy is strong dependent on input quality, highlighting the importance of speech recognition tailored to specific vocabulary.:1 Introduction 1 1.1 Motivation 1 1.2 CAIS.ME Project 2 1.3 Problem Statement 2 1.4 Thesis Overview 3 2 Related Work 4 3 Foundational Concepts and Systems 6 3.1 Human-Computer Interaction in Speech 6 3.2 Speech Recognition 7 3.2.1 Open-source technologies 8 3.2.2 Other technologies 9 3.3 Language Recognition 9 3.3.1 Regular expressions 10 3.3.2 Lexical tokenization 10 3.3.3 Parsing 10 3.3.4 Domain Specific Languages 11 3.3.5 Formal grammars 11 3.3.6 Natural Language Processing 12 3.3.7 Model-Driven Engineering 14 4 State-of-the-Art: Grammars 15 4.1 Overview 15 4.2 Workbenches for Grammar Design 16 4.2.1 ANTLR 16 4.2.2 Xtext 17 4.2.3 JetBrains MPS 17 4.2.4 Other tools 18 4.3 Design Approaches 19 5 Problem Analysis 23 5.1 Methodology 23 5.2 Identification of Use-Cases 24 5.3 Requirements Analysis 26 5.3.1 Functional requirements 26 5.3.2 Qualitative requirements 26 5.3.3 Acceptance criteria 27 6 Design 29 6.1 Preprocessing 29 6.2 Underlying Domain Specific Modelling 31 6.2.1 Language model definition 31 6.2.2 Formalization 32 6.2.3 Constraints 32 6.3 Generic Grammar Syntax 33 6.4 Architecture 36 6.5 Integration of AI Techniques 38 6.6 Grammar Improvement 40 6.6.1 Identification of synonyms 40 6.6.2 Automatic addition of synonyms 42 6.6.3 Addition of same-meaning strings 42 6.6.4 Addition and modification of rules 43 6.7 Processing of unrecognized input 44 6.8 Summary 45 7 Implementation and Evaluation 47 7.1 Development Environment 47 7.2 Implementation 48 7.2.1 Grammar model transformation 48 7.2.2 Output construction 50 7.2.3 Testing 50 7.2.4 Reusability for similar use-cases 51 7.3 Limitations and Challenges 52 7.4 Comparison to NLP Solutions 54 8 Conclusion 58 8.1 Summary of Findings 58 8.2 Future Research and Development 60 Acronyms 62 Bibliography 63 List of Figures 73 List of Tables 74 List of Listings 75

Page generated in 0.0654 seconds