Global ETD Search

1	Machine Learning Explainability on Multi-Modal Data using Ecological Momentary Assessments in the Medical Domain / Erklärbarkeit von maschinellem Lernen unter Verwendung multi-modaler Daten und Ecological Momentary Assessments im medizinischen Sektor Allgaier, Johannes January 2024 (has links) (PDF) Introduction. Mobile health (mHealth) integrates mobile devices into healthcare, enabling remote monitoring, data collection, and personalized interventions. Machine Learning (ML), a subfield of Artificial Intelligence (AI), can use mHealth data to confirm or extend domain knowledge by finding associations within the data, with the goal of improving healthcare decisions. In this work, two data collection techniques were used for mHealth data fed into ML systems: Mobile Crowdsensing (MCS), which is a collaborative data gathering approach, and Ecological Momentary Assessments (EMA), which capture real-time individual experiences within the individual’s common environments using questionnaires and sensors. We collected EMA and MCS data on tinnitus and COVID-19. About 15 % of the world’s population suffers from tinnitus. Materials & Methods. This thesis investigates the challenges of ML systems when using MCS and EMA data. It asks: How can ML confirm or broaden domain knowledge? Domain knowledge refers to expertise and understanding in a specific field, gained through experience and education. Are ML systems always superior to simple heuristics, and if yes, how can one reach explainable AI (XAI) in the presence of mHealth data? An XAI method enables a human to understand why a model makes certain predictions. Finally, which guidelines can be beneficial for the use of ML within the mHealth domain? In tinnitus research, ML discerns gender, temperature, and season-related variations among patients. 
In the realm of COVID-19, we collaboratively designed a COVID-19 check app for public education, incorporating EMA data to offer informative feedback on COVID-19-related matters. This thesis uses seven EMA datasets with more than 250,000 assessments. Our analyses revealed a set of challenges: app user over-representation, time gaps, identity ambiguity, and operating-system-specific rounding errors, among others. Our systematic review of 450 medical studies assessed prior utilization of XAI methods. Results. ML models predict gender and tinnitus perception, validating gender-linked tinnitus disparities. Using season and temperature to predict tinnitus shows the association of these variables with tinnitus. Multiple assessments of one app user can constitute a group. Neglecting these groups in data sets leads to model overfitting. In select instances, heuristics outperform ML models, highlighting the need for domain expert consultation to unveil hidden groups or find simple heuristics. Conclusion. This thesis suggests guidelines for mHealth-related data analyses and improves estimates for ML performance. Close communication with medical domain experts to identify latent user subsets and incremental benefits of ML is essential. / Introduction. Mobile health (mHealth) refers to the use of mobile devices such as phones to support healthcare. Physicians can, for example, collect health information, monitor health remotely, and offer personalized treatments. Machine learning (ML) can be used as a system to learn from this health information. The ML system tries to find patterns in the mHealth data to help physicians make better decisions. Two methods were used for data collection: on the one hand, numerous people contributed to collecting comprehensive information with mobile devices (so-called Mobile Crowdsensing); on the other hand, digital questionnaires were sent to participants and sensors such as GPS were used to capture information in an everyday environment (so-called Ecological Momentary Assessments). 
This work uses data from two medical areas: tinnitus and COVID-19. An estimated 15 % of humanity suffers from tinnitus. Materials & Methods. The work examines how ML systems handle mHealth data: How can these systems become more robust or learn new things? Do the new ML systems always work better than simple rules of thumb, and if so, how can we get them to explain why they make certain decisions? Which special rules should also be followed when training ML systems with mHealth data? During the COVID-19 pandemic, we developed an app intended to help people inform themselves about the virus. This app used data on the app users' disease symptoms to give recommendations for further action. Results. ML systems were trained to predict tinnitus and how it might relate to gender-specific differences. Using factors such as season and temperature can help to understand tinnitus and its relationship to these factors. If we do not take into account during training that one app user can fill in multiple records, this leads to overfitting and thus to a deterioration of the ML system. Interestingly, simple rules sometimes lead to more robust and better models than complex ML systems. This shows that it is important to involve experts in the field to avoid overfitting or to find simple rules for prediction. Conclusion. By examining various long-term data, we were able to derive new recommendations for analyzing mHealth data and developing ML systems. 
It is important to involve medical experts in order to avoid overfitting and to improve ML systems step by step. Machine Learning Explainable Artificial Intelligence ddc:000 ddc:610
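The abstract of entry 1 notes that several assessments from one app user form a group, and that ignoring such groups leads to overfitting. The following is a minimal sketch of a group-aware train/test split, not the thesis's own code: the function name `group_train_test_split`, the user IDs, and the split fraction are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def group_train_test_split(user_ids, test_fraction=0.25, seed=0):
    """Split row indices so that every user lands entirely in train or in test."""
    rows_by_user = defaultdict(list)
    for row, user in enumerate(user_ids):
        rows_by_user[user].append(row)

    users = sorted(rows_by_user)  # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(users)
    n_test_users = max(1, round(len(users) * test_fraction))
    held_out = set(users[:n_test_users])

    train = [r for u in users if u not in held_out for r in rows_by_user[u]]
    test = [r for u in users if u in held_out for r in rows_by_user[u]]
    return train, test


# Hypothetical data: user "a" filled in far more assessments than anyone else.
user_ids = ["a"] * 6 + ["b"] * 2 + ["c"] * 2 + ["d"] * 2
train_rows, test_rows = group_train_test_split(user_ids)

train_users = {user_ids[r] for r in train_rows}
test_users = {user_ids[r] for r in test_rows}
print("users appearing in both sets:", train_users & test_users)
```

Because the split is made over users rather than over rows, no person's assessments appear on both sides, so a test score cannot be inflated by the model recognizing a user it already saw during training.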
2	What do you mean? : The consequences of different stakeholders’ logics in machine learning and how disciplinary differences should be managed within an organization Eliasson, Nina January 2022 (has links) This research paper identifies the disciplinary differences of stakeholders and their effects on cross-functional work in the context of machine learning. The study specifically focused on 1) how stakeholders with disciplinary differences interpret a search system, and 2) how the multiple disciplines should be managed in an organization. This was studied through 12 interviews with stakeholders from design disciplines, product management, data science and machine learning engineering, followed by a focus group with a participant from each of the different disciplines. The findings were analyzed through thematic analysis and institutional logics, and concluded that the different logics had a high impact on the stakeholders’ understanding of the search system. The research also concluded that bridging the gap between the multi-disciplinary stakeholders is of high importance in the context of machine learning. Institutional Logics Multi-Disciplinary Organizations Explainable Artificial Intelligence (XAI) Human Computer Interaction
3	CEFYDRA: Cluster-first Explainable FuzzY-based Deep Reorganizing Algorithm Viana, Javier 23 August 2022 (has links) No description available. Artificial Intelligence Explainable Artificial Intelligence Artificial Intelligence Fuzzy Logic Machine Learning Deep Learning Airport
4	Explainable Intrusion Detection Systems using white box techniques Ables, Jesse 08 December 2023 (has links) (PDF) Artificial Intelligence (AI) has found increasing application in various domains, revolutionizing problem-solving and data analysis. However, in decision-sensitive areas like Intrusion Detection Systems (IDS), trust and reliability are vital, posing challenges for traditional black box AI systems. These black box IDS, while accurate, lack transparency, making it difficult to understand the reasons behind their decisions. This dissertation explores the concept of eXplainable Intrusion Detection Systems (X-IDS), addressing the issue of trust in X-IDS. It explores the limitations of common black box IDS and the complexities of explainability methods, leading to the fundamental question of trusting explanations generated by black box explainer modules. To address these challenges, this dissertation presents the concept of white box explanations, which are innately explainable. While white box algorithms are typically simpler and more interpretable, they often sacrifice accuracy. However, this work utilized white box Competitive Learning (CL), which can achieve competitive accuracy in comparison to black box IDS. We introduce Rule Extraction (RE) as another white box technique that can be applied to explain black box IDS. It involves training decision trees on the inputs, weights, and outputs of black box models, resulting in human-readable rulesets that serve as global model explanations. These white box techniques offer the benefits of accuracy and trustworthiness, which are challenging to achieve simultaneously. This work aims to address gaps in the existing literature, including the need for highly accurate white box IDS, a methodology for understanding explanations, small testing datasets, and comparisons between white box and black box models. To achieve these goals, the study employs CL and eclectic RE algorithms. 
CL models offer innate explainability and high accuracy in IDS applications, while eclectic RE enhances trustworthiness. The contributions of this dissertation include a novel X-IDS architecture featuring Self-Organizing Map (SOM) models that adhere to DARPA’s guidelines for explainable systems, an extended X-IDS architecture incorporating three CL-based algorithms, and a hybrid X-IDS architecture combining a Deep Neural Network (DNN) predictor with a white box eclectic RE explainer. These architectures create more explainable, trustworthy, and accurate X-IDS systems, paving the way for enhanced AI solutions in decision-sensitive domains. Intrusion Detection Artificial Intelligence Explainable Artificial Intelligence Explainable Intrusion Detection Systems Competitive Learning Rule Extraction
5	Machine Learning Survival Models : Performance and Explainability Alabdallah, Abdallah January 2023 (has links) Survival analysis is an essential statistics and machine learning field in various critical applications like medical research and predictive maintenance. In these domains, understanding models' predictions is paramount. While machine learning techniques are increasingly applied to enhance the predictive performance of survival models, they simultaneously sacrifice transparency and explainability. Survival models, in contrast to regular machine learning models, predict functions rather than point estimates like regression and classification models. This creates a challenge regarding explaining such models using the known off-the-shelf machine learning explanation techniques, like Shapley Values, Counterfactual examples, and others. Censoring is also a major issue in survival analysis where the target time variable is not fully observed for all subjects. Moreover, in predictive maintenance settings, recorded events do not always map to actual failures, since some components may be replaced because they are considered faulty or about to fail in the future based on an expert's opinion. Censoring and noisy labels create problems in terms of modeling and evaluation that need to be addressed during the development and evaluation of the survival models. Considering the challenges in survival modeling and the differences from regular machine learning models, this thesis aims to bridge this gap by facilitating the use of machine learning explanation methods to produce plausible and actionable explanations for survival models. It also aims to enhance survival modeling and evaluation, revealing a better insight into the differences among the compared survival models. 
In this thesis, we propose two methods for explaining survival models which rely on discovering survival patterns in the model's predictions that group the studied subjects into significantly different survival groups. Each pattern reflects a specific survival behavior common to all the subjects in their respective group. We utilize these patterns to explain the predictions of the studied model in two ways. In the first, we employ a classification proxy model that can capture the relationship between the descriptive features of subjects and the learned survival patterns. Explaining such a proxy model using Shapley Values provides insights into the feature attribution of belonging to a specific survival pattern. In the second method, we addressed the "what if?" question by generating plausible and actionable counterfactual examples that would change the predicted pattern of the studied subject. Such counterfactual examples provide insights into actionable changes required to enhance the survivability of subjects. We also propose a variational-inference-based generative model for estimating the time-to-event distribution. The model relies on a regression-based loss function with the ability to handle censored cases. It also relies on sampling for estimating the conditional probability of event times. Moreover, we propose a decomposition of the C-index into a weighted harmonic average of two quantities, the concordance among the observed events and the concordance between observed and censored cases. These two quantities, weighted by a factor representing the balance between the two, can reveal differences between survival models previously unseen using only the total Concordance index. This can give insight into the performances of different models and their relation to the characteristics of the studied data. Finally, as part of enhancing survival modeling, we propose an algorithm that can correct erroneous event labels in predictive maintenance time-to-event data. 
We adopt an expectation-maximization-like approach utilizing a genetic algorithm to find better labels that would maximize the survival model's performance. Over the iterations, the algorithm builds confidence about the events' assignments, which improves the search in the following iterations until convergence. We performed experiments on real and synthetic data showing that our proposed methods enhance the performance in survival modeling and can reveal the underlying factors contributing to the explainability of survival models' behavior and performance. Survival Analysis Explainable Artificial Intelligence Survival Patterns Counterfactual Explanations Evaluation Metrics Concordance Index Signal Processing Signalbehandling
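Entry 5 describes splitting the C-index into the concordance among observed events and the concordance between observed and censored cases. The sketch below computes those two quantities and the overall Harrell's C-index for a small made-up sample. It does not reproduce the thesis's weighted harmonic decomposition, whose exact formula is not given in the abstract; the function name `concordance_parts` and all data values are hypothetical, and pairs with tied times are simply skipped.

```python
def concordance_parts(times, events, risks):
    """Return (c_total, c_event_event, c_event_censored).

    A pair (i, j) is comparable when subject i has an observed event and a
    strictly shorter time than subject j. It is concordant when the model
    assigns i the higher risk; tied risks count as half.
    """
    score = {"ee": 0.0, "ec": 0.0}
    count = {"ee": 0, "ec": 0}
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] >= times[j]:
                continue  # i must be strictly earlier; tied times are skipped
            kind = "ee" if events[j] else "ec"
            count[kind] += 1
            if risks[i] > risks[j]:
                score[kind] += 1.0
            elif risks[i] == risks[j]:
                score[kind] += 0.5

    def ratio(s, c):
        return s / c if c else None

    c_total = ratio(score["ee"] + score["ec"], count["ee"] + count["ec"])
    return c_total, ratio(score["ee"], count["ee"]), ratio(score["ec"], count["ec"])


# Hypothetical sample: event = 1 means the failure was observed, 0 means censored.
times = [2, 4, 5, 7, 9]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.4, 0.7, 0.5, 0.1]

c_total, c_ee, c_ec = concordance_parts(times, events, risks)
print(f"total={c_total:.3f} event-event={c_ee:.3f} event-censored={c_ec:.3f}")
```

Two models with the same total score can differ in these two components, which is the kind of difference the abstract says the total Concordance index alone does not reveal.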
6	Interpretable Outlier Detection in Financial Data : Implementation of Isolation Forest and Model-Specific Feature Importance Söderström, Vilhelm, Knudsen, Kasper January 2022 (has links) Market manipulation has increased in line with the number of active players in the financial markets. The most common methods for monitoring financial markets are rule-based systems, which are limited to previous knowledge of market manipulation. This work was carried out in collaboration with the company Scila, which provides surveillance solutions for the financial markets. In this thesis, we will try to implement a complementary method to Scila's pre-existing rule-based systems to objectively detect outliers in all available data and present the result on suspect transactions and customer behavior to an operator. Thus, the method needs to detect outliers and show the operator why a particular market participant is considered an outlier. The outlier detection method needs to implement interpretability. This led us to the formulation of our research question as: How can an outlier detection method be implemented as a tool for a market surveillance operator to identify potential market manipulation outside Scila's rule-based systems? Two models, an outlier detection model Isolation Forest, and a feature importance model (MI-Local-DIFFI and its subset Path Length Indicator) were chosen to fulfill the purpose of the study. The study used three datasets, two synthetic datasets, one scattered and one clustered, and one dataset from Scila. The results show that Isolation Forest has an excellent ability to find outliers in the various data distributions we investigated. We used a feature importance model to make Isolation Forest’s scoring of outliers interpretable. Our intention was that the feature importance model would specify how important different features were in the process of an observation being defined as an outlier. 
Our results have a relatively high degree of interpretability for the scattered dataset but worse for the clustered dataset. The Path Length Indicator achieved better performance than MI-Local-DIFFI for both datasets. We noticed that the chosen feature importance model is limited by the process of how Isolation Forest isolates an outlier. Outlier Detection Explainable artificial intelligence XAI Isolation Forest MI-Local-DIFFI Probability Theory and Statistics Sannolikhetsteori och statistik
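Entry 6 relies on the way Isolation Forest isolates an outlier. The idea can be sketched in plain Python: count how many random axis-aligned splits are needed before a point is alone, and average that over many random subsamples. This is a simplified illustration under stated assumptions, not the thesis's implementation and not MI-Local-DIFFI or the Path Length Indicator; the function names, the synthetic data, and the parameter values are hypothetical.

```python
import random


def isolation_depth(point, data, rng, max_depth=20):
    """Count random axis-aligned splits needed to separate `point` from `data`."""
    depth = 0
    while len(data) > 1 and depth < max_depth:
        feature = rng.randrange(len(point))
        values = [row[feature] for row in data]
        low, high = min(values), max(values)
        if low == high:
            break  # simplification: stop if the chosen feature cannot be split
        threshold = rng.uniform(low, high)
        # keep only the rows on the same side of the split as `point`
        if point[feature] < threshold:
            data = [row for row in data if row[feature] < threshold]
        else:
            data = [row for row in data if row[feature] >= threshold]
        depth += 1
    return depth


def mean_isolation_depth(point, data, n_trees=200, sample_size=64, seed=0):
    """Average the isolation depth of `point` over many random subsamples."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trees):
        sample = rng.sample(data, min(sample_size, len(data)))
        if point not in sample:
            sample.append(point)
        total += isolation_depth(point, sample, rng)
    return total / n_trees


# Hypothetical data: one dense cluster plus a single far-away observation.
gen = random.Random(42)
cluster = [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(200)]
outlier = (8.0, 8.0)
data = cluster + [outlier]
inlier = min(cluster, key=lambda p: p[0] ** 2 + p[1] ** 2)  # most central point

outlier_depth = mean_isolation_depth(outlier, data)
inlier_depth = mean_isolation_depth(inlier, data)
print(f"outlier depth={outlier_depth:.2f}, inlier depth={inlier_depth:.2f}")
```

A shorter average depth means the point is easier to isolate, which is what an isolation-based method scores as more anomalous.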
7	Increasing the Trustworthiness of AI-based In-Vehicle IDS using eXplainable AI Lundberg, Hampus January 2022 (has links) An in-vehicle intrusion detection system (IV-IDS) is one of the protection mechanisms used to detect cyber attacks on electric or autonomous vehicles, where anomaly-based IDS solutions have better potential for detecting attacks, especially zero-day attacks. Generally, the IV-IDS generates false alarms (falsely detecting normal data as attacks) because of the difficulty of differentiating between normal and attack data. This can lead to undesirable situations, such as increased laxness towards the system, or uncertainties in the event handling following a generated alarm. With the help of sophisticated Artificial Intelligence (AI) models, the IDS improves the chances of detecting attacks. However, the use of such a model comes at the cost of decreased interpretability, a trait that is argued to be of importance when ascertaining various other valuable desiderata, such as a model’s trust, causality, and robustness. Because of the lack of interpretability in sophisticated AI-based IV-IDSs, it is difficult for humans to trust such systems, let alone know what actions to take when an IDS flags an attack. By using tools found in the area of eXplainable AI (XAI), this thesis aims to explore what kind of explanations could be produced in accord with model predictions, to further increase the trustworthiness of AI-based IV-IDSs. Through a comparative survey, aspects related to trustworthiness and explainability are evaluated on a custom, pseudo-global, visualization-based explanation ("VisExp") and a rule-based explanation. The results show that VisExp increases the trustworthiness and enhances the explainability of the AI-based IV-IDS. Intrusion Detection System In-Vehicle Intrusion Detection System Machine Learning Deep Learning Explainable Artificial Intelligence Trustworthiness Computer Systems Datorsystem
8	ARTIFICIAL INTELLIGENCE APPLICATIONS FOR IDENTIFYING KEY FEATURES TO REDUCE BUILDING ENERGY CONSUMPTION Lakmini Rangana Senarathne (16642119) 07 August 2023 (has links) The International Energy Agency (IEA) estimates that residential and commercial buildings consume 40% of global energy and emit 24% of CO2. A building's design parameters and location significantly impact its energy usage. Adjusting the building parameters and features in an optimum way helps to reduce energy usage and to build energy-efficient buildings. Hence, analyzing the impact of influencing factors is critical to reduce building energy usage. Towards this, artificial intelligence applications, such as Explainable Artificial Intelligence (XAI) and machine learning (ML), identified the key building features to reduce building energy. This is done by analyzing the efficiencies of various building features that impact building energy consumption. For this, the relative importance of input features impacting commercial building energy usage is investigated. The impact of input variables on residential building energy usage is also examined through a parametric analysis. Furthermore, the dependencies and relationships between the design variables of residential buildings were examined. Finally, the study analyzed the impact of location features on cooling energy usage in commercial buildings. For the purpose of energy consumption data analysis, three datasets were utilized: the Commercial Building Energy Consumption Survey (CBECS) datasets gathered in 2012 and 2018, the University of California Irvine (UCI) energy efficiency dataset, and Commercial Load Data (CLD). For this, Python and WEKA were used. Random Forest, Linear Regression, Bayesian Networks, and Logistic Regression predicted energy consumption using the datasets. Moreover, statistical tests, such as the Wilcoxon rank-sum test, were used to check for significant differences between specific datasets. 
Shapash, a Python library, created the feature importance graphs. The results indicated that cooling degree days are the most important feature in predicting cooling load, with contribution values of 34.29% (2018) and 19.68% (2012). Also, analyzing the impact of building parameters on energy usage indicated that a 50% reduction in overall height reduces heating load by 64.56% and cooling load by 57.47%. Also, the Wilcoxon rank-sum test indicated that the location of the building also impacts energy consumption with a 0.05 error margin. The proposed analysis is beneficial for real-world applications and energy-efficient building construction. Energy Efficiency Buildings Shapash Data Analysis Machine Learning
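Entry 8 uses the Wilcoxon rank-sum test to compare energy consumption between datasets. As a rough illustration of what that test computes, the sketch below ranks the pooled values, sums the ranks of the first sample, and converts the sum to a two-sided p-value with the normal approximation (no tie correction of the variance). The function name `rank_sum_test` and the two samples are hypothetical and are not taken from the thesis or its datasets.

```python
import math


def rank_sum_test(x, y):
    """Wilcoxon rank-sum test via the normal approximation.

    Returns (w, z, p): the rank sum of sample x, its z-score, and the
    two-sided p-value. Tied values receive the average of their ranks.
    """
    values = sorted(list(x) + list(y))
    rank_of = {}
    i = 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        rank_of[values[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    n1, n2 = len(x), len(y)
    w = sum(rank_of[v] for v in x)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return w, z, p


# Hypothetical cooling-energy readings for buildings in two locations.
location_a = [12, 15, 14, 18, 20, 17]
location_b = [25, 22, 27, 30, 24, 26]

w, z, p = rank_sum_test(location_a, location_b)
print(f"W={w}, z={z:.3f}, p={p:.4f}")
```

For samples this small an exact test is usually preferable to the normal approximation; the approximation is used here only to keep the sketch short.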
9	Comparison of Logistic Regression and an Explained Random Forest in the Domain of Creditworthiness Assessment Ankaräng, Marcus, Kristiansson, Jakob January 2021 (has links) As the use of AI in society is developing, the requirement of explainable algorithms has increased. A challenge with many modern machine learning algorithms is that they, due to their often complex structures, lack the ability to produce human-interpretable explanations. Research within explainable AI has resulted in methods that can be applied on top of non-interpretable models to motivate their decision bases. The aim of this thesis is to compare an unexplained machine learning model used in combination with an explanatory method, and a model that is explainable through its inherent structure. Random forest was the unexplained model in question and the explanatory method was SHAP. The explainable model was logistic regression, which is explanatory through its feature weights. The comparison was conducted within the area of creditworthiness and was based on predictive performance and explainability. Furthermore, the thesis intends to use these models to investigate what characterizes loan applicants who are likely to default. The comparison showed that no model performed significantly better than the other in terms of predictive performance. Characteristics of bad loan applicants differed between the two algorithms. Three important aspects were the applicant’s age, where they lived and whether they had a residential phone. Regarding explainability, several advantages with SHAP were observed. With SHAP, explanations on both a local and a global level can be produced. Also, SHAP offers a way to take advantage of the high performance in many modern machine learning algorithms, and at the same time fulfil today’s increased requirement of transparency. / As AI is used more and more often to make decisions in society, the demand for explainability has increased. 
A challenge with several modern machine learning models is that, because of their complex structures, they rarely provide human-understandable justifications. Research in explainable AI has led to methods that can be applied on top of non-explainable models to interpret their decision bases. This work aims to compare a non-explainable machine learning model in combination with an explanation method, and a model that is explainable through its structure. The non-explainable model was random forest and the explanation method used was SHAP. The explainable model was logistic regression, which is explanatory through its weights. The comparison was carried out in the area of creditworthiness and was based on predictive performance and explainability. Furthermore, these models were used to investigate which characteristics were typical of borrowers who were not expected to be able to repay their loan. The comparison showed that neither method performed significantly better than the other in terms of predictive performance. The characteristic traits of bad borrowers differed between the methods. Three important aspects were the borrower's age, where they lived, and whether they owned a home phone. Regarding explainability, several advantages of SHAP emerged, including the ability to produce both local and global explanations. It was further noted that SHAP makes it possible to take advantage of the high performance that many modern machine learning methods exhibit while meeting today's increased demands for transparency. Classification Creditworthiness Explainable Artificial Intelligence Logistic Regression Machine Learning Random Forest SHAP XAI Computer and Information Sciences Data- och informationsvetenskap
10	Evolving Rule Based Explainable Artificial Intelligence for Decision Support System of Unmanned Aerial Vehicles Keneni, Blen M., Keneni 14 December 2018 (has links) No description available. Electrical Engineering Computer Engineering
