
Design of a Robust and Flexible Grammar for Speech Control

Voice interaction is an established automation and accessibility feature. While many satisfactory speech recognition solutions are available today, interpreting the semantics of the recognized text remains difficult in some use cases. Two types of text semantic extraction models can be distinguished: probabilistic and purely rule-based. Rule-based reasoning can be formalized into grammars and enables fast language validation, transparent decision-making, and easy customization. In this thesis we develop a context-free ANTLR semantic grammar to control software by speech in a medical domain related to smart glasses. The implementation is preceded by a review of the state of the art, requirements consultation, and a thorough design of reusable system abstractions. The design includes definitions of a DSL, a meta-grammar, a generic system architecture, and tool support. Additionally, we investigate trivial and experimental grammar improvement techniques. Given the multifaceted flexibility and robustness of the designed framework, we indicate its usability in critical and adaptive systems. We measure 75% semantic recognition accuracy in the main medical use case and compare it against semantic extraction using SpaCy and two fine-tuned AI classifiers. The evaluation reveals high accuracy of BERT for sequence classification and great potential for hybrid solutions that apply AI techniques on top of grammars, especially for the detection of alerts. Accuracy depends strongly on input quality, highlighting the importance of speech recognition tailored to a specific vocabulary.

1 Introduction 1
1.1 Motivation 1
1.2 CAIS.ME Project 2
1.3 Problem Statement 2
1.4 Thesis Overview 3
2 Related Work 4
3 Foundational Concepts and Systems 6
3.1 Human-Computer Interaction in Speech 6
3.2 Speech Recognition 7
3.2.1 Open-source technologies 8
3.2.2 Other technologies 9
3.3 Language Recognition 9
3.3.1 Regular expressions 10
3.3.2 Lexical tokenization 10
3.3.3 Parsing 10
3.3.4 Domain Specific Languages 11
3.3.5 Formal grammars 11
3.3.6 Natural Language Processing 12
3.3.7 Model-Driven Engineering 14
4 State-of-the-Art: Grammars 15
4.1 Overview 15
4.2 Workbenches for Grammar Design 16
4.2.1 ANTLR 16
4.2.2 Xtext 17
4.2.3 JetBrains MPS 17
4.2.4 Other tools 18
4.3 Design Approaches 19
5 Problem Analysis 23
5.1 Methodology 23
5.2 Identification of Use-Cases 24
5.3 Requirements Analysis 26
5.3.1 Functional requirements 26
5.3.2 Qualitative requirements 26
5.3.3 Acceptance criteria 27
6 Design 29
6.1 Preprocessing 29
6.2 Underlying Domain Specific Modelling 31
6.2.1 Language model definition 31
6.2.2 Formalization 32
6.2.3 Constraints 32
6.3 Generic Grammar Syntax 33
6.4 Architecture 36
6.5 Integration of AI Techniques 38
6.6 Grammar Improvement 40
6.6.1 Identification of synonyms 40
6.6.2 Automatic addition of synonyms 42
6.6.3 Addition of same-meaning strings 42
6.6.4 Addition and modification of rules 43
6.7 Processing of unrecognized input 44
6.8 Summary 45
7 Implementation and Evaluation 47
7.1 Development Environment 47
7.2 Implementation 48
7.2.1 Grammar model transformation 48
7.2.2 Output construction 50
7.2.3 Testing 50
7.2.4 Reusability for similar use-cases 51
7.3 Limitations and Challenges 52
7.4 Comparison to NLP Solutions 54
8 Conclusion 58
8.1 Summary of Findings 58
8.2 Future Research and Development 60
Acronyms 62
Bibliography 63
List of Figures 73
List of Tables 74
List of Listings 75

Identifier: oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:91619
Date: 28 May 2024
Creators: Ludyga, Tomasz
Contributors: Wendt, Karsten; Aßmann, Uwe; Technische Universität Dresden
Source Sets: Hochschulschriftenserver (HSSS) der SLUB Dresden
Language: English
Detected Language: English
Type: info:eu-repo/semantics/publishedVersion, doc-type:masterThesis, info:eu-repo/semantics/masterThesis, doc-type:Text
Rights: info:eu-repo/semantics/openAccess
