Global ETD Search

1	Deep Learning basierte Sprachinteraktion für Social Assistive Robots [Deep learning-based voice interaction for socially assistive robots]. Guhr, Oliver. 25 September 2024.
In this dissertation, a Voice User Interface (VUI) for Socially Assistive Robots (SARs) was designed and developed, with the aim of enabling voice-based interaction in care applications. The work fills a research gap by developing a VUI that operates on natural, everyday German. The focus was on using advances in deep learning-based speech and language processing to meet the requirements of robotics and of the user groups. Two central research questions were addressed: determining the requirements for a VUI for SARs in care applications, and designing and implementing such a VUI. The work discusses the specific requirements that robotics and users place on a VUI. Furthermore, the planned application scenarios and user groups of the developed VUI, including its application in dementia therapy and in care homes, are described in detail. The main part of the thesis presents the designed VUI, which is characterised by its offline capability and by the integration of the robot's external sensors and actuators into the VUI. The thesis also covers the central building blocks for implementing the VUI, including speech recognition, processing of transcribed text, sentiment analysis and text segmentation. The dialogue management model that was developed and an evaluation of current speech synthesis systems are also discussed. A user study tested the applicability of the VUI: without specific training, participants successfully solved 93% of the tasks. Future research and development activities include long-term evaluations of the VUI in robotics and the development of a digital assistant. The integration of LLMs and multimodal models into VUIs is another important research focus, as is increasing the efficiency of deep learning models for mobile robots.
Contents:
Summary
Abstract
1 Introduction
  1.1 Motivation
  1.2 Problem statement
    1.2.1 Robotics in care
    1.2.2 Ambient Assisted Living
  1.3 Objectives and research questions
  1.4 Structure of the thesis
2 Fundamentals
  2.1 Socially Assistive Robotics
  2.2 Voice User Interfaces
  2.3 Deep learning for speech and language processing
3 Conceptual design
  3.1 Requirements of older people for VUIs
  3.2 Requirements of robotics for VUIs
  3.3 Application context
    3.3.1 Robot-assisted MAKS therapy
    3.3.2 AAL apartment
  3.4 User analysis
  3.5 Development goals for the VUI
4 System architecture
  4.1 Architectural decisions
  4.2 Components of the VUI
    4.2.1 System context
    4.2.2 General component model
    4.2.3 Detailed component model
  4.3 Modular, extensible interaction
    4.3.1 Interaction through skills
    4.3.2 Interface model of the skills
    4.3.3 Implementation model of the skills
5 Speech recognition
  5.1 From spoken word to text
  5.2 Voice Activity Detection
  5.3 Automatic Speech Recognition
    5.3.1 Evaluation
    5.3.2 Optimising the models for CPU inference
6 Language processing
  6.1 From text to intention
  6.2 Intent recognition and Named Entity Recognition
  6.3 Segmentation of utterances
    6.3.1 Dataset
    6.3.2 Models
    6.3.3 Training
    6.3.4 Ablation studies
    6.3.5 Results
  6.4 Text-based sentiment analysis
    6.4.1 Data
    6.4.2 Models
    6.4.3 Results
7 Dialogue management
  7.1 From intention to action
  7.2 Methods for dialogue management
  7.3 Modular dialogue management
  7.4 Speech synthesis
8 Evaluation of the framework
  8.1 Setup and objectives
  8.2 Test design and methodology
  8.3 Study participants
  8.4 Analysis and interpretation of the results
  8.5 Limitations of the study
  8.6 Conclusion
9 Summary and outlook
  9.1 Summary
  9.2 Outlook
Keywords: voice interface; speech; robotics. Classification: info:eu-repo/classification/ddc/006 ddc:006
2	Exploring Mothers’ Perspectives on Social Assistive Robots in Postpartum Depression Healthcare. Paulsson, Tobiaz. January 2022.
Postpartum depression (PPD) affects 8-15 percent of new mothers in Sweden every year. The majority of PPD cases go undetected, and only a few percent receive adequate care. New ways to detect and diagnose PPD are required. In my previous work, PPD experts expressed willingness to integrate social assistive robots (SARs) into their medical teams. That work also found that future research needed to include patients' perspectives on the subject. This thesis aims to provide insights from mothers with experience of mental health issues related to their pregnancy, in order to elicit perceptions, attitudes and opinions towards SARs in PPD healthcare. Semi-structured interviews with participants (n=10) and a generative design activity were conducted and analyzed using thematic analysis. The results suggest split opinions towards SARs in PPD healthcare. Participants described their healthcare needs and how SARs could be used to address them. Opinions on the robot's appearance, including its characteristics, gender and ethnicity, were also discussed. Future work that includes the perspectives of midwives and child health nurses is needed, as is a larger sample of women to validate the robot’s appearance, gender and characteristics.
Keywords: Social Assistive Robots; Postpartum Depression; Maternity Healthcare; Qualitative Research; Thematic Analysis; Human-Robot Interaction; Human Computer Interaction. Subject categories: Computer and Information Sciences; Robotics
