1 |
Intelligent System for the Classification of Mental State Parameters. Chandrasekharan, Jyotsna, 25 July 2024
Mental health is essential for overall well-being, encompassing emotional, psychological, and social aspects. Assessing and managing mental health requires understanding mental state parameters, including cognitive load, cognitive impairment, and emotional state. Advanced technologies such as eye tracking provide valuable insights into these parameters, transforming mental health evaluation and enabling more targeted interventions and better outcomes. This thesis focuses on developing an intelligent system to monitor mental health through cognitive load, cognitive impairment, and emotional state. The research has three main objectives, including creating four eye-tracking-based unimodal datasets and a multimodal dataset to address the lack of publicly available mental health assessment datasets. Each dataset is designed to study cognitive load, cognitive impairment, and emotional state classification using varied stimuli. Beyond dataset creation, the thesis advances feature extraction, introducing novel features for detecting mental state parameters and improving assessment precision. High-level features such as error rate, scanpath comparison score, and inattentional blindness are incorporated, contributing to the computation of cognitive impairment scores. Five models are developed to detect mental states by separately monitoring the mental state parameters: cognitive load, cognitive impairment, and emotional state. The models employ statistical analysis, machine learning algorithms, fuzzy inference systems, and deep learning techniques to provide detailed insights into an individual's mental state. The first two models, the Eye-Tracking Cognitive Load models (ECL-1 and ECL-2), focus on cognitive load assessment during mathematical assessments and Trail Making Test tasks. The ECL-1 model uses statistical analysis to examine the correlation of eye-tracking features such as pupil diameter and blink frequency with cognitive load during mathematical assessments.
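The ECL-1-style statistical analysis can be sketched as a plain Pearson correlation between an eye-tracking feature and a cognitive-load measure. This is a minimal illustration of the idea; the per-trial values below are invented for the example, not data from the thesis.

```python
# Sketch: Pearson correlation between an eye-tracking feature (pupil diameter)
# and a cognitive-load rating, as in an ECL-1-style analysis.
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-trial means: pupil diameter (mm) vs. task difficulty level.
pupil_diameter = [3.1, 3.4, 3.8, 4.0, 4.3]
load_rating = [1, 2, 3, 4, 5]
r = pearson(pupil_diameter, load_rating)
print(round(r, 3))
```

A correlation near 1 on such data would support using pupil diameter as a cognitive-load feature, which is the pattern the thesis investigates with real eye-tracking recordings.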
With the identification of relevant features during the Trail Making Test (TMT), the ECL-2 model effectively classifies low and high cognitive load states with a notable 94% accuracy, using eye-tracking data and machine learning algorithms. The third model, the ETMT (Eye-Tracking-based Trail Making Test) model, uses a fuzzy inference system and an adaptive neuro-fuzzy inference system to detect mental states associated with cognitive impairment. It provides detailed scores for visual search speed and focused attention, which are important for understanding a patient's specific cognitive deficits. This aids greatly in understanding an individual's cognitive state and addresses deficits in executive functioning, memory, motor function, attentional disengagement, neuropsychological function, processing speed, and visual attention. The fourth model, PredictEYE, uses a deep learning time-series univariate regression model based on Long Short-Term Memory (LSTM) to predict future sequences of each feature. A Random Forest machine learning algorithm is then applied to the predicted features to identify the mental state as calm or stressful based on a person's emotional state. The personalized time-series methodology leverages time-series analysis, identifying patterns and changes in the data over time to enable more precise and individualized mental health assessment and monitoring. Notably, PredictEYE outperforms ARIMA, achieving an accuracy of 86.4%. The fifth model introduced in this study is based on a multimodal dataset incorporating physiological measures such as ECG, GSR, PPG, and respiratory signals, along with eye-tracking data. Two separate models, one based on eye-tracking data and the other on the remaining physiological measures, are developed to understand a person's emotional state.
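The PredictEYE pipeline shape (forecast each feature's future values, then classify the predicted features as calm or stressful) can be sketched as follows. Exponential smoothing stands in for the thesis's LSTM forecaster and a fixed threshold stands in for its Random Forest classifier; the feature names, values, and threshold are all illustrative assumptions.

```python
# Sketch of a PredictEYE-style two-stage pipeline:
#   stage 1: forecast each eye-tracking feature's next value,
#   stage 2: classify the predicted feature vector as calm vs. stressful.

def forecast_next(series, alpha=0.5):
    """One-step-ahead forecast by simple exponential smoothing
    (stand-in for the per-feature LSTM forecaster)."""
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

def classify(features, pupil_threshold=4.0):
    """Toy stand-in classifier: a large predicted pupil diameter
    maps to 'stressful' (the thesis uses a Random Forest instead)."""
    return "stressful" if features["pupil_diameter"] > pupil_threshold else "calm"

history = {
    "pupil_diameter": [3.2, 3.6, 4.1, 4.4, 4.6],  # mm, illustrative
    "blink_rate": [18, 16, 13, 12, 11],           # blinks/min, illustrative
}
predicted = {name: forecast_next(vals) for name, vals in history.items()}
print(classify(predicted))
```

The key design point carried over from the abstract is that classification operates on *predicted* feature values, so the system can flag an emerging stressful state rather than only a current one.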
These models demonstrate comparable performance, with notable proficiency in binary classification based on arousal and valence. In particular, the Binary-Valence model achieves slightly higher accuracy when using eye-tracking data, while the other physiological measures yield stronger classification performance for the Binary-Arousal model. The thesis makes substantial progress in mental health monitoring by providing accurate, non-intrusive evaluations of an individual's mental state. It emphasizes mental state parameters such as cognitive load, cognitive impairment, and emotional state, with AI-based methods incorporated to improve the precision of mental state detection.
|
2 |
Zpracování multimodálních obrazových dat v analýze uměleckých děl / Multimodal Image Processing in Art Investigation. Blažek, Jan, January 2018
ABSTRACT. Title: Multimodal Image Processing in Art Investigation. Author: Jan Blažek. Department: Department of Image Processing, IITA of the CAS. Supervisor: RNDr. Barbara Zitová, PhD., Institute of Information Theory and Automation. Supervisor's e-mail address: zitova@utia.cas.cz. Abstract: Art investigation and digital image processing demarcate the interdisciplinary field of the presented thesis. Over the past 8 years we have published thirteen papers in this field of research. This thesis presents the current state of the art and puts these papers into context. Our research focuses on modalities in the visible and near-infrared parts of the spectrum and addresses various tasks of art investigation. For studying the spectral response of paint materials, we propose a low-cost mobile multi-band acquisition system and a calibration method extended by a light source with an adjustable wavelength. We created the m3art database of the spectral responses of pigments, available for comparison and public use. The central point of our research is underdrawing detection and visualization. For this purpose we have developed: acquisition guidelines based on the optical properties of the topmost non-transparent layer, a visualization technique for comparison of modalities, and a signal separation technique...
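The pigment-matching use of a spectral-response database such as m3art can be sketched as comparing a measured multi-band response against stored reference spectra. The band values and pigment names below are invented for illustration, and cosine similarity stands in for whatever matching metric an investigator might actually use.

```python
# Sketch: match a measured paint spectral response against reference pigment
# spectra, in the spirit of comparisons against an m3art-style database.
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two spectral response vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Reflectance sampled at a few VIS/NIR bands (illustrative values).
reference = {
    "azurite": [0.10, 0.15, 0.40, 0.55, 0.60],
    "ultramarine": [0.12, 0.18, 0.22, 0.30, 0.70],
}
measured = [0.11, 0.16, 0.38, 0.52, 0.62]
best = max(reference, key=lambda name: cosine(measured, reference[name]))
print(best)
```

Multi-band acquisition makes this kind of comparison possible without sampling the artwork: the measured vector is nearly proportional to one reference spectrum, so the match is unambiguous.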
|
3 |
Gaze based weakly supervised localization for image classification : application to visual recognition in a food dataset / Apprentissage faiblement supervisé basé sur le regard : application à la reconnaissance visuelle dans un ensemble de données sur l'alimentation. Wang, Xin, 29 September 2017
In this dissertation, we discuss how to use human gaze data to improve the performance of a weakly supervised learning model in image classification. The background of this topic is the era of rapidly growing information technology. As a consequence, the data to analyze is also growing dramatically.
Since the amount of data that can be annotated by humans cannot keep up with the amount of data itself, current well-developed supervised learning approaches may confront bottlenecks in the future. In this context, the use of weak annotations for high-performance learning methods is worth studying. Specifically, we try to solve the problem from two aspects: one is to propose a more time-saving annotation, human eye-tracking gaze, as an alternative to the traditional time-consuming annotation, e.g. a bounding box. The other is to integrate gaze annotation into a weakly supervised learning scheme for image classification. This scheme benefits from the gaze annotation for inferring the regions containing the target object. A useful property of our model is that it exploits gaze only for training, while the test phase is gaze-free. This property further reduces the demand for annotations. The two isolated aspects are connected in our models, which achieve competitive experimental results.
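The core weak-supervision idea above (gaze fixations select which candidate region likely contains the object at training time, with no gaze needed at test time) can be sketched as follows. The candidate regions, fixation points, and hit-counting rule are illustrative assumptions, not the dissertation's actual scoring.

```python
# Sketch: use gaze fixations to pick the training region most likely to
# contain the target object, replacing a manual bounding-box annotation.

def count_hits(region, fixations):
    """Number of gaze fixations falling inside a (x0, y0, x1, y1) region."""
    x0, y0, x1, y1 = region
    return sum(1 for (x, y) in fixations if x0 <= x <= x1 and y0 <= y <= y1)

def select_training_region(candidates, fixations):
    """Pick the candidate region covering the most gaze fixations."""
    return max(candidates, key=lambda r: count_hits(r, fixations))

# Candidate regions (x0, y0, x1, y1), e.g. from a region-proposal step.
candidates = [(0, 0, 50, 50), (40, 40, 120, 120), (100, 0, 160, 60)]
# Recorded gaze fixations for one training image (illustrative).
fixations = [(60, 70), (75, 88), (90, 101), (45, 12)]
best_region = select_training_region(candidates, fixations)
print(best_region)
```

Because region selection happens only during training, a classifier trained on the selected regions needs no gaze input at test time, which is the annotation-saving property the abstract emphasizes.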
|