1

Avaliação de usabilidade para sistemas de transcrição automática de laudos em radiologia. / Usability evaluation for automatic transcription system of radiology reports.

Martins, Valéria Farinazzo 15 April 2011
This work combines knowledge from Computer Science and Health Science to propose a usability evaluation methodology for automatic transcription systems of radiology reports. First, a study of Voice User Interfaces is presented, identifying requirements for systems based on voice-mediated communication as well as existing initiatives toward a methodology for their evaluation. Next, a study of automatic report transcription systems is presented, in which the main requirements are characterized and classified. The two studies were then integrated into a methodology for evaluating automatic transcription systems for radiology reports. The methodology was first validated through inspections and usability tests conducted outside the hospital environment, using an automatic radiology report transcription system, and was afterwards applied in a hospital in the city of São Paulo. The main result is a detailed guide for evaluating automatic transcription systems for radiology reports, which are increasingly present in hospitals and clinics in the country, together with accounts of the experience gained in applying the methodology in a real case.
2

Whole Care+: An integrated health care for the elderly living in their homes

Park, Hyo Ri 01 May 2011
The health of the elderly deteriorates significantly as they age. They suffer not only from chronic conditions but also from various geriatric diseases such as high blood pressure, arthritis, and cardiovascular disease. Their mental health also declines, creating challenges that range from short-term memory loss to dementia. Furthermore, after retirement, the elderly's social network shrinks as their social activities are inevitably limited to a small circle of family and friends. Faced with such impairments in their physical, mental, and social health, many elderly people end up institutionalized or sent to specialized facilities such as nursing homes, which provide professional care. However, a study indicates that most Americans prefer to stay in their homes as they get older, since they can maintain their social connections to neighbors and friends, stay close to their medical caregivers in town, and retain the emotional comfort and security of familiar surroundings. On top of that, Americans of all ages value keeping their independence and autonomy by controlling their lives in general. Various health care aids and services offer specific support for the elderly's health care activities in their homes. However, such aids have focused mainly on situations in which the elderly's health has already degraded, or on narrow tasks such as tracking health data like blood pressure, blood sugar, and calorie intake. To keep living independently and safely in their own homes, the elderly need a comprehensive understanding of their health problems, healthy daily habits, and timely interactions with their families and caregivers. Smart Home technology has much potential to support the elderly's independent living as well as their interactions with others. To better understand this, we conducted a user-centered design project that looks at the management of the elderly's health enabled by Smart Home technology.
3

A User Centered Design and Prototype of a Mobile Reading Device for the Visually Impaired

Keefer, Robert B. 10 June 2011
No description available.
4

A Voice-based Multimodal User Interface for VTQuest

Schneider, Thomas W. 14 June 2005
The original VTQuest web-based software system requires users to interact using a mouse or keyboard, keeping the users' hands and eyes constantly occupied while communicating with the system and preventing them from performing other tasks that require their hands or eyes at the same time. This unnecessary restriction on the user's ability to multitask has been eliminated with the creation of the VTQuest Voice web-based software system. VTQuest Voice extends the original VTQuest functionality by providing a voice interface, built with Speech Application Language Tags (SALT) technology, with which the user can navigate the site, submit queries, browse query results, and receive helpful hints for better utilizing the voice system. Individuals with a handicap that prevents them from using their arms or hands, users who are not familiar with the mouse and keyboard style of communication, and those who have their hands preoccupied all need alternative communication interfaces that do not require the use of their hands, and all of these users benefit from a voice interface added onto VTQuest. Through the voice interface, all of the system's features can be accessed exclusively by voice, without the use of the hands. A voice interface also frees the user's eyes during the selection of an option or link on a page, allowing the user to look at the system less frequently. VTQuest Voice is implemented and tested on computers running Microsoft Windows with Microsoft Internet Explorer and the correct SALT and Adobe Scalable Vector Graphics (SVG) Viewer plug-ins installed. VTQuest Voice offers a variety of features, including an extensive grammar and out-of-turn interaction, that are flexible for future growth. The grammar offers several ways in which users may begin or end a query, to better accommodate the variety of ways users phrase their queries, and it also includes nicknames for buildings to accommodate abbreviations and alternate pronunciations of building names. Out-of-turn interaction combines multiple steps into one spoken sentence, thereby shortening the interaction and making the process more natural for the user. The addition of a voice interface is recommended for web applications in which a user may need his or her eyes and hands to multitask. Additional functionality that can be added later to VTQuest Voice includes touch screen support and accessibility from cell phones, Personal Digital Assistants (PDAs), and other mobile devices. / Master of Science
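As a concrete illustration of the two grammar features this abstract highlights, the sketch below shows, in plain Python rather than the thesis's actual SALT/SRGS markup, how building nicknames can be normalized to canonical names and how a single out-of-turn sentence can fill several query slots at once. The building names, nicknames, and query phrasings are hypothetical examples, not taken from VTQuest Voice.

```python
# Illustrative sketch only -- not the actual SALT/SRGS grammar from VTQuest Voice.
# Building names, nicknames, and phrasings are hypothetical examples of the two
# ideas the abstract describes: nickname normalization and out-of-turn input.
import re

# Nicknames and abbreviations map to one canonical building name, so that
# "Torg" and "Torgersen" both resolve to the same place.
NICKNAMES = {
    "mcb": "McBryde Hall",        # hypothetical abbreviation
    "mcbryde": "McBryde Hall",
    "torg": "Torgersen Hall",     # hypothetical nickname
    "torgersen": "Torgersen Hall",
}

# Several opening/closing phrasings are accepted for the same query intent,
# mirroring a grammar that lets users begin or end a query in different ways.
ROUTE_QUERY = re.compile(
    r"(?:please\s+)?(?:show|give)\s+me\s+(?:the\s+)?route\s+"
    r"from\s+(?P<src>[\w\s]+?)\s+to\s+(?P<dst>[\w\s]+?)(?:\s+please)?$",
    re.IGNORECASE,
)

def canonical(name: str) -> str:
    """Resolve a spoken building name or nickname to its canonical form."""
    return NICKNAMES.get(name.strip().lower(), name.strip().title())

def parse_route_query(utterance: str):
    """Out-of-turn interaction: one sentence fills both route slots at once,
    instead of the system prompting for origin and destination separately."""
    m = ROUTE_QUERY.match(utterance.strip())
    if not m:
        return None
    return canonical(m.group("src")), canonical(m.group("dst"))

if __name__ == "__main__":
    # One spoken sentence supplies both slots in a single turn.
    print(parse_route_query("Show me the route from Torg to McBryde"))
    # -> ('Torgersen Hall', 'McBryde Hall')
```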
5

Exploring a voice user interface to convey information in an e-commerce website

Liljestam, Christopher January 2019
Screen readers for visually impaired users are poorly optimized for e-commerce websites, which excludes these users from the content and creates a societal need to make e-commerce content accessible to the visually impaired. This study explores how six blindfolded participants could co-design a Voice User Interface (VUI) for an e-commerce website that conveys its information in a way that creates a good user experience for the visually impaired. The results of a co-design workshop using interaction design practices showed that a VUI should be humanlike and convey relevant information. Failed speech recognition and overwhelming amounts of information had a negative impact on the user experience. To cope with these problems, the VUI should give users more control by conveying explicit confirmations and retrospective information from past shopping trips. Because of the difficulty of recruiting visually impaired participants, the design process was not completed, and the ideation requires an additional design process.
6

Context aware voice user interface

Demeter, Nora January 2014
In this thesis I address a non-visual approach to interaction on mobile devices, as an alternative to their existing visual displays in situations where hands-free usage of the device is preferred. The current technology is examined through existing work, with special attention to its limitations, to which user groups currently use any sort of speech recognition or voice command functions, and to the scenarios in which these are most used and most desired. I then examine through interviews why people trust or distrust voice interaction, how they feel about the possibilities and limitations of the technology at hand, how individual users currently use it, and where they see the technology going in the future. Finally, I develop an alternative voice interaction concept and validate it through a set of workshops.
7

Testing Privacy and Security of Voice Interface Applications in the Internet of Things Era

Shafei, Hassan (ORCID 0000-0001-6844-5100) 04 1900
Voice User Interfaces (VUI) are rapidly gaining popularity, revolutionizing user interaction with technology through widespread adoption in devices such as desktop computers, smartphones, and smart home assistants, thanks to significant advances in voice recognition and processing technologies. Over a hundred million users now utilize these devices daily, and smart home assistants have been sold in massive numbers, owing to the ease and convenience of controlling a diverse range of smart devices within the home IoT environment by voice, such as controlling lights and heating systems and setting timers and alarms. VUI enables users to interact with IoT technology and issue a wide range of commands across various services using their voice, bypassing traditional input methods like keyboards or touchscreens. With ease, users can inquire in natural language about the weather, the stock market, and online shopping, and access various other types of general information. However, as VUI becomes more integrated into our daily lives, it brings to the forefront issues of security, privacy, and usability. Concerns such as the unauthorized collection of user data, the potential for recording private conversations, and challenges in accurately recognizing and executing commands across diverse accents, leading to misinterpretations and unintended actions, underscore the need for more robust methods to test and evaluate VUI services. In this dissertation, we delve into voice interface testing: evaluating the privacy and security of VUI applications, assessing the proficiency of VUI in handling diverse accents, and investigating access control in multi-user environments.

We first study privacy violations in the VUI ecosystem. We introduce a definition of the VUI ecosystem, in which users must connect voice apps to corresponding services and mobile apps for them to function properly; the ecosystem can also involve multiple voice apps developed by the same third-party developers. We explore the prevalence of voice apps with corresponding services in the VUI ecosystem, assessing the landscape of privacy compliance among Alexa voice apps and their companion services, and we developed a testing framework for this ecosystem. We present the first study of the Alexa ecosystem focusing specifically on voice apps with account linking. Our framework analyzes both the privacy policies of these voice apps and those of their companion services, or the privacy policies of multiple voice apps published by the same developers. Using machine learning techniques, the framework automatically extracts data types related to data collection and sharing from these privacy policies, allowing for a comprehensive comparison.

Prior researchers have studied voice apps' behavior to conduct privacy violation assessments. This requires an interaction approach in which pre-defined utterances, extracted from the skill's web page on the skill store, are fed to a simulator to simulate user interaction. The accuracy of the analysis, however, depends on the quality of the extracted utterances: an utterance or interaction not captured by the extraction process will not be detected, leading to an inaccurate privacy assessment. We therefore revisited the utterance extraction techniques used by prior work to study skills' behavior for privacy violations, analyzing the effectiveness and limitations of the existing techniques. We propose a new technique that improves on prior extraction techniques by taking their union and adding human interaction: a small set of human interactions is used to record all missing utterances, and this is then expanded to test a more extensive set of voice apps.

We also tested VUI on various accents, designing a testing framework that evaluates VUI across accents to assess how well the VUI implemented in smart speakers caters to a diverse population. Recruiting individuals with different accents and instructing them to interact with a smart speaker while adhering to specific scripts is difficult, so we propose a framework called AudioAcc, which evaluates VUI performance across diverse accents using YouTube videos. The framework uses a filtering algorithm to ensure that the extracted spoken words used to construct composite commands closely resemble natural speech patterns. The framework is scalable, and we conducted an extensive examination of VUI performance across a wide range of accents, encompassing both professional and amateur speakers. Additionally, we introduce a new metric, Consistency of Results (COR), to complement the standard Word Error Rate (WER) metric employed for assessing ASR systems; it enables developers to investigate and rewrite skill code based on the consistency of results, enhancing overall WER performance.

Finally, we look into a special case concerning access control of VUI in multi-user environments. We propose an automated testing framework to explore access control weaknesses and determine whether the accessible data is of consequence, and we use it to assess the effectiveness of voice access control mechanisms within multi-user environments. We show that the convenience of voice systems poses privacy risks, as users' sensitive data becomes accessible, and we identify two significant flaws in the access control mechanisms proposed by the voice system that can expose users' private data. These findings underscore the need for enhanced privacy safeguards and improved access control systems within online shopping. We also offer recommendations to mitigate the risks of unauthorized access, shedding light on securing users' private data within voice systems. / Computer and Information Science
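For context on the metrics mentioned above: Word Error Rate is the standard ASR measure, computed as (substitutions + deletions + insertions) divided by the number of reference words, while Consistency of Results (COR) is this dissertation's own contribution, whose exact definition the abstract does not give. The sketch below computes standard WER via edit distance and pairs it with a generic result-consistency check that only gestures at the COR idea; the example transcripts are hypothetical.

```python
# WER is the standard ASR metric: (substitutions + deletions + insertions) / N,
# computed here with a classic edit-distance DP. The consistency check below is
# NOT the dissertation's COR definition (which is its own contribution) -- it is
# only a rough sketch of measuring agreement across repeated runs.
from collections import Counter

def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate via Levenshtein distance over word tokens."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

def consistency(transcripts: list[str]) -> float:
    """Fraction of repeated runs agreeing with the most common transcript.
    A sketch of result-consistency only, not the COR metric itself."""
    _, count = Counter(t.lower() for t in transcripts).most_common(1)[0]
    return count / len(transcripts)

if __name__ == "__main__":
    # Hypothetical ASR outputs of one voice command spoken with different accents.
    print(wer("turn on the kitchen light", "turn on kitchen lights"))  # 0.4
    print(consistency(["turn on the kitchen light",
                       "turn on the kitchen light",
                       "turn on kitchen lights"]))                     # ~0.67
```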
8

Deep Learning basierte Sprachinteraktion für Social Assistive Robots / Deep learning-based speech interaction for socially assistive robots

Guhr, Oliver 25 September 2024
In this dissertation, a Voice User Interface (VUI) for Socially Assistive Robots (SAR) was designed and developed, with the aim of enabling voice-based interaction in care applications. The work fills a research gap by developing a VUI that operates on natural, everyday German language. The focus was on using advances in deep learning-based speech and language processing to meet the requirements of robotics and of the user groups. Two central research questions were addressed: determining the requirements for a VUI for SARs in care settings, and the design and implementation of such a VUI. The thesis discusses the specific requirements that robotics and users place on a VUI, and describes in detail the planned deployment scenarios and user groups of the developed VUI, including its application in dementia therapy and in assisted-living apartments. The main part of the thesis presents the designed VUI, which is characterized by its offline capability and by the integration of the robot's external sensors and actuators into the VUI. The thesis also covers the central building blocks of the implementation, including speech recognition, processing of transcribed text, sentiment analysis, and text segmentation, and discusses the dialogue management model that was developed as well as an evaluation of current speech synthesis systems. In a user study, the applicability of the VUI was tested without specific training; participants successfully completed 93% of the tasks. Future research and development activities include long-term evaluations of the VUI in robotics and the development of a digital assistant; the integration of LLMs and multimodal models into VUIs is another important research focus, as is improving the efficiency of deep learning models for mobile robots. (Chapter overview: Introduction; Fundamentals; Concept; System architecture; Speech recognition; Language processing; Dialogue management; Evaluation of the framework; Summary and outlook.)
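This abstract describes a pipeline of speech recognition, intent recognition and named entity recognition, dialogue management, and speech synthesis, with interaction extended through modular skills. The Python below is a minimal sketch of what such intent-to-skill dispatch could look like; it assumes nothing about the thesis's actual interface model, and the Skill protocol, intent names, and example skill are hypothetical illustrations.

```python
# Minimal sketch of intent-to-skill dispatch as the abstract describes it
# (speech recognition -> intent recognition -> dialogue management -> response).
# NOT the thesis's actual skill interface model; names here are hypothetical.
from dataclasses import dataclass
from typing import Protocol

@dataclass
class Intent:
    name: str                  # e.g. "set_reminder", produced by the NLU stage
    entities: dict[str, str]   # named entities extracted from the transcript

class Skill(Protocol):
    """Interface every pluggable skill implements."""
    def can_handle(self, intent: Intent) -> bool: ...
    def handle(self, intent: Intent) -> str: ...

class ReminderSkill:
    def can_handle(self, intent: Intent) -> bool:
        return intent.name == "set_reminder"
    def handle(self, intent: Intent) -> str:
        when = intent.entities.get("time", "later")
        what = intent.entities.get("task", "something")
        return f"Okay, I will remind you to {what} at {when}."

class DialogueManager:
    """Routes each recognized intent to the first skill that accepts it."""
    def __init__(self, skills: list[Skill]):
        self.skills = skills
    def respond(self, intent: Intent) -> str:
        for skill in self.skills:
            if skill.can_handle(intent):
                return skill.handle(intent)   # reply text would go to TTS
        return "Sorry, I did not understand that."

if __name__ == "__main__":
    dm = DialogueManager([ReminderSkill()])
    intent = Intent("set_reminder", {"task": "drink water", "time": "3 pm"})
    print(dm.respond(intent))
```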
9

“Do you want to take a short survey?”: Evaluating and improving the UX and VUI of a survey skill in the social robot Furhat: a qualitative case study

Bengtsson, Camilla, Englund, Caroline January 2018
The purpose of this qualitative case study is to evaluate an early-stage survey skill developed for the social robot Furhat and to look into how the user experience (UX) and voice user interface (VUI) of that skill can be improved. Several qualitative methods were used: expert evaluations using heuristics for human-robot interaction (HRI), and user evaluations including observations and interviews, complemented by a quantitative questionnaire (RoSAS, the Robotic Social Attributes Scale). The empirical findings were classified into the USUS Evaluation Framework for Human-Robot Interaction. The user evaluations were performed in two modes: one group of informants talked and interacted with Furhat with the support of a graphical user interface (GUI), the other group without the GUI. A positive user experience was identified in both modes, showing that the informants found interacting with Furhat a fun, engaging, and interesting experience. The mode with the supportive GUI could be suitable for noisy environments and for longer surveys with many response alternatives to choose from, whereas the other mode could work better for less noisy environments and shorter surveys. General improvements that can contribute to a better user experience in both modes were found, such as having the robot adopt a more human-like character in its dialogue, facial expressions, and movements, along with addressing a number of technical and usability issues.
