1 |
Toward Designing Active ORR Catalysts via Interpretable and Explainable Machine Learning. Omidvar, Noushin. 22 September 2022.
The electrochemical oxygen reduction reaction (ORR) is a key catalytic process used directly in carbon-free energy systems such as fuel cells. However, the lack of active, stable, and cost-effective ORR cathode materials has been a major impediment to the broad adoption of these technologies. The challenge for researchers in catalysis is therefore to find catalysts that are electrochemically efficient enough to drive the reaction, made of earth-abundant elements to lower material costs and allow scalability, and stable enough to last.
Most commercial catalysts in use today have been found through trial-and-error techniques that rely on the chemical intuition of experts. This empirical mode of discovery is, however, challenging, slow, and complicated, because catalyst performance depends on a myriad of factors. Researchers have recently turned to machine learning (ML), together with emerging catalysis databases, to find and design heterogeneous catalysts faster. Many of the ML models used in the field to predict performance-relevant catalyst properties, such as adsorption energies of reaction intermediates, are black-box models. Because these black-box models are built on very complicated mathematical functions, it is hard to understand how they work, and the underlying physics of the desired catalyst properties remains hidden. As a way to open up these black boxes and make them easier to understand, more attention is being paid to interpretable and explainable ML. This work aims to speed up the screening and optimization of Pt monolayer alloys for ORR while gaining physical insights. We use a theory-infused machine learning framework in combination with a high-throughput active screening approach to efficiently find promising ORR Pt monolayer catalysts. Furthermore, a game-theoretic explainability approach is employed to identify the electronic factors that control surface reactivity. The novel insights in this study can provide new design strategies that could shape the paradigm of catalyst discovery. / Doctor of Philosophy / The electrochemical oxygen reduction reaction (ORR) is a very important catalytic process that is used directly in carbon-free energy systems like fuel cells. However, the lack of ORR cathode materials that are active, stable, and cheap has made it hard for these technologies to be widely used. Most commercially used catalysts have been found through trial-and-error methods that rely on the chemical intuition of experts. This empirical approach is hard, slow, and complicated, because the performance of the catalyst depends on a variety of factors. Researchers are now using machine learning (ML) and new catalysis databases to find and design heterogeneous catalysts faster. But because black-box ML models are based on very complicated mathematical formulas, it is very hard to figure out how they work, and the physics behind the desired catalyst properties remains hidden.
In recent years, more attention has been paid to interpretable and explainable ML as a way to decode these "black boxes". The goal of this work is to speed up the screening and optimization of Pt monolayer alloys for ORR. We find promising ORR Pt monolayer catalysts by using a theory-based machine learning framework together with a high-throughput active screening method. A game-theory approach is also used to find the electronic factors that control surface reactivity. The insights from this study can lead to new design strategies that could change how researchers discover catalysts.
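The game-theoretic explainability approach mentioned above is commonly realized with Shapley-value attribution, for example via the SHAP library. The sketch below is a hypothetical illustration rather than the thesis's actual pipeline: it assumes a tabular set of electronic descriptors (d-band center, work function, strain; all placeholder names and values) and a surrogate regressor standing in for an adsorption-energy model, then ranks which descriptors drive the predictions.

```python
import numpy as np
import pandas as pd
import shap  # game-theoretic (Shapley-value) attribution
from sklearn.ensemble import RandomForestRegressor

# Hypothetical electronic descriptors for Pt monolayer alloy surfaces; names are placeholders.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "d_band_center": rng.normal(-2.0, 0.3, 500),
    "work_function": rng.normal(5.5, 0.2, 500),
    "strain_pct": rng.normal(0.0, 1.5, 500),
})
# Toy target standing in for an adsorption energy (eV) of a reaction intermediate.
y = 0.8 * X["d_band_center"] - 0.1 * X["strain_pct"] + rng.normal(0, 0.05, 500)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)      # exact Shapley values for tree ensembles
shap_values = explainer.shap_values(X)

# Mean absolute Shapley value = global importance of each electronic factor.
importance = np.abs(shap_values).mean(axis=0)
for name, imp in sorted(zip(X.columns, importance), key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```

In this toy setup the d-band center should dominate the ranking, mirroring how such an analysis would surface the electronic factors that control surface reactivity.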
2 |
Algebraic Learning: Towards Interpretable Information Modeling. Yang, Tong. January 2021.
Thesis advisor: Jan Engelbrecht / Along with the proliferation of digital data collected using sensor technologies and a boost in computing power, Deep Learning (DL) based approaches have drawn enormous attention in the past decade due to their impressive performance in extracting complex relations from raw data and representing valuable information. At the same time, owing to its notorious black-box nature, the value of DL has been highly debated due to its lack of interpretability. On the one hand, DL only utilizes statistical features contained in raw data while ignoring human knowledge of the underlying system, which results in both data inefficiency and trust issues; on the other hand, a trained DL model does not provide researchers any extra insight into the underlying system beyond its output; such insight, however, is the essence of most fields of science, e.g. physics and economics. The interpretability issue has, in fact, been naturally addressed in physics research. Conventional physics theories develop models of matter to describe experimentally observed phenomena. Tasks in DL, instead, can be considered as developing models of information to match collected datasets. Motivated by techniques and perspectives in conventional physics, this thesis addresses the issue of interpretability in general information modeling and endeavors to address the two drawbacks of DL approaches mentioned above. Firstly, instead of relying on an intuition-driven construction of model structures, a problem-oriented perspective is applied to incorporate knowledge into modeling practice, where interesting mathematical properties emerge naturally that cast constraints on modeling. Secondly, given a trained model, various methods can be applied to extract further insights about the underlying system, either through a simplified function approximation of the complex neural network model or by analyzing the model itself as an effective representation of the system. These two pathways are termed guided model design (GuiMoD) and secondary measurements, respectively, and together present a comprehensive framework for investigating interpretability in modern Deep Learning practice. Remarkably, during the study of GuiMoD, a novel scheme emerges for modeling practice in statistical learning: Algebraic Learning (AgLr). Instead of being restricted to any specific model structure or dataset, AgLr starts from the idiosyncrasies of a learning task itself and studies the structure of a legitimate model class in general. This novel modeling scheme demonstrates the noteworthy value of abstract algebra for general artificial intelligence, which has been overlooked in recent progress, and could shed further light on interpretable information modeling by offering practical insights from a formal yet useful perspective. / Thesis (PhD) — Boston College, 2021. / Submitted to: Boston College. Graduate School of Arts and Sciences. / Discipline: Physics.
3 |
Explainable Interactive Projections for Image Data. Han, Huimin. 12 January 2023.
Making sense of large collections of images is difficult. Dimension reduction (DR) methods assist by organizing images in a 2D space based on similarities, but they provide little support for explaining why images were placed together or apart in that space. Additionally, they do not support modifying and updating the 2D space to explore new relationships and organizations of images. To address these problems, we present an interactive DR method for images that uses visual features extracted by a deep neural network to project the images into 2D space and provides visual explanations of the image features that contributed to each 2D location. In addition, it allows people to directly manipulate the 2D projection space to define alternative relationships and explore subsequent projections of the images. Through an iterative cycle of semantic interaction and explainable-AI feedback, people can explore complex visual relationships in image data. Our approach to human-AI interaction integrates visual knowledge from both human mental models and pre-trained deep neural models to explore image data. Two usage scenarios demonstrate that our method is able to capture human feedback and incorporate it into the model. Our visual explanations help bridge the gap between the feature space and the original images to illustrate the knowledge learned by the model, creating a synergy between human and machine that facilitates a more complete analysis experience. / Master of Science / High-dimensional data is everywhere: a spreadsheet with many columns, text documents, images, and so on. Exploring and visualizing high-dimensional data can be challenging. Dimension reduction (DR) techniques can help: high-dimensional data can be projected into 3D or 2D space and visualized as a scatter plot. Additionally, DR tools can be interactive to help users better explore data and understand the underlying algorithms. Designing such an interactive DR tool for images is challenging. To address this problem, this thesis presents a tool that visualizes images in a 2D plot, where data points that are considered similar are projected close to each other and vice versa. Users can manipulate images directly on this scatterplot-like visualization based on their own knowledge to update the display, and saliency maps are provided to reflect the model's re-projection reasoning.
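As a rough illustration of the projection step described above (not the thesis's actual system), the following sketch extracts deep visual features with a pre-trained ResNet-18 and projects them to 2D with t-SNE. The `images` list, the choice of backbone, and the t-SNE parameters are all assumptions made for the example.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from sklearn.manifold import TSNE

# Assumed input: `images`, a list of PIL images to be organized in 2D.
weights = models.ResNet18_Weights.DEFAULT
backbone = models.resnet18(weights=weights)
backbone.fc = torch.nn.Identity()   # drop the classifier head, keep 512-d features
backbone.eval()

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

with torch.no_grad():
    feats = torch.stack([backbone(preprocess(img).unsqueeze(0)).squeeze(0)
                         for img in images])

# 2D layout: images that are similar in deep-feature space land close together.
coords = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(feats.numpy())
```

An interactive tool would then redraw `coords` after user manipulations and attach per-image saliency maps to explain each placement.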
4 |
Feature Relevance Explainers in Tabular Anomaly Detection / Merkmal-Relevanz-Erklärer in tabellarischer Anomalie-Erkennung. Tritscher, Julian. January 2024.
Within companies, ongoing digitization makes the protection of data from unauthorized access and manipulation increasingly relevant. Here, artificial intelligence offers means to automatically detect such anomalous events. However, as the capabilities of these automated anomaly detection systems grow, so does their complexity, making it challenging to understand their decisions. Consequently, many methods to explain these decisions have been proposed in recent research. The most popular techniques in this area are feature relevance explainers, which explain a decision made by an artificial intelligence system by distributing relevance scores across the inputs given to the system, thus highlighting which given information had the most impact on the decision. These explainers, although present in anomaly detection, are not systematically and quantitatively evaluated. This is especially problematic, as explainers are inherently approximations that simplify the underlying artificial intelligence and thus may not always provide high-quality explanations.
This thesis makes a contribution towards the systematic evaluation of feature relevance explainers in anomaly detection on tabular data. We first review the existing literature for available feature relevance explainers and suitable evaluation schemes. We find that multiple feature relevance explainers with different internal functioning are employed in anomaly detection, but that many existing evaluation schemes are not applicable to this domain. As a result, we construct a novel evaluation setup based on ground truth explanations. Since these ground truth explanations are not commonly available for anomaly detection data, we also provide methods to obtain ground truth explanations across different scenarios of data availability, allowing us to generate multiple labeled data sets with ground truth explanations.
Multiple experiments across the aggregated data and explainers reveal that explanation quality varies strongly and that explainers can produce both very high-quality and near-random explanations. Furthermore, high explanation quality does not transfer across different data and anomaly detection models, meaning there is no single best feature relevance explainer that can be applied without performance evaluation.
As evaluation appears necessary to ensure high-quality explanations, we propose a framework that enables the optimization of explainers on unlabeled data through expert simulations. Further, to aid explainers in consistently achieving high-quality explanations in applications where expert simulations are not available, we provide two schemes for setting explainer hyperparameters specifically suitable for anomaly detection. / Within companies, ongoing digitization makes protecting data from unauthorized access and manipulation increasingly relevant. Here, artificial intelligence offers automatic detection of such anomalous events. However, as the capabilities of these automated anomaly detection systems grow, so does their complexity, making it difficult to understand their decisions. Recent research has therefore proposed numerous methods for explaining such decisions. The most popular techniques in this area are feature relevance explainers, which explain a decision made by an artificial intelligence by distributing relevance scores across the inputs given to the system, thereby highlighting which information had the greatest influence on the decision. Although these explainers are used in anomaly detection, they are not evaluated systematically and quantitatively. This is especially problematic because explainers inherently simplify the underlying artificial intelligence and therefore do not automatically provide high-quality explanations.
This thesis contributes to the systematic evaluation of feature relevance explainers in anomaly detection. First, we review the existing literature on available feature relevance explainers and suitable evaluation schemes. We find that several feature relevance explainers with different internal workings are used in anomaly detection, but that many of the existing evaluation schemes are not applicable there. We therefore construct an evaluation setup based on ground-truth explanations. Since such ground-truth explanations are generally not available for anomaly detection data, we also provide methods for obtaining ground-truth explanations under different levels of data availability, which allows us to generate labeled data sets with ground-truth explanations.
Several experiments with the aggregated data and explainers show that explanation quality varies strongly and that explainers can produce both high-quality and near-random explanations. Moreover, high explanation quality does not transfer across different data and anomaly detection models, so no single high-performing explainer exists that can be used without performance evaluation.
Since evaluations appear necessary to ensure high-quality explanations, we propose a framework that enables the optimization of explainers on unlabeled data through expert simulations. To help explainers consistently achieve high-quality explanations in applications where no expert simulation is available, we propose two schemes for setting explainer hyperparameters in anomaly detection.
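To make the ground-truth evaluation idea above concrete, here is a small hypothetical sketch rather than the thesis's actual setup: an Isolation Forest scores an anomaly created by corrupting known features, a Shapley-value explainer attributes the anomaly score to the input features, and explanation quality is measured as the fraction of truly corrupted features recovered among the top-ranked ones. The data, detector, and explainer choices are all placeholders.

```python
import numpy as np
import shap
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))                      # normal tabular data
gt_features = [2, 5]                               # ground truth: features we corrupt
x_anom = rng.normal(size=8)
x_anom[gt_features] += 6.0                         # inject an anomaly on known features

detector = IsolationForest(random_state=0).fit(X)

# Explain the anomaly score with a model-agnostic Shapley-value approximation.
explainer = shap.KernelExplainer(detector.decision_function, shap.sample(X, 100))
relevance = explainer.shap_values(x_anom.reshape(1, -1))[0]

# Quality metric: how many ground-truth features appear among the top-k relevances?
top_k = np.argsort(-np.abs(relevance))[: len(gt_features)]
hit_rate = len(set(top_k) & set(gt_features)) / len(gt_features)
print("top features:", top_k, "hit rate:", hit_rate)
```

Repeating such a measurement across detectors and explainers is one way to quantify how strongly explanation quality varies.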
5 |
Incorporating Domain Experts' Knowledge into Machine Learning for Enhancing Reliability to Human Users / 領域専門家の知識活用によるユーザへの親和性を重視した機械学習. LI, JIARUI. 24 January 2022.
Kyoto University / New-system doctoral program / Doctor of Engineering / Kō No. 23615 / Kōhaku No. 4936 / 新制||工||1771 (University Library) / Department of Mechanical Engineering and Science, Graduate School of Engineering, Kyoto University / (Chief examiner) Professor 椹木 哲夫, Professor 松野 文俊, Professor 藤本 健治 / Qualifies under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Philosophy (Engineering) / Kyoto University / DFAM
6 |
The Effects of System Transparency and Reliability on Drivers' Perception and Performance Towards Intelligent Agents in Level 3 Automated Vehicles. Zang, Jing. 05 July 2023.
In the context of automated vehicles, the transparency of in-vehicle intelligent agents (IVIAs) is an important contributor to drivers' perception, situation awareness (SA), and driving performance. However, the effects of agent transparency on driver performance when the agent is unreliable have not yet been fully examined. The experiments in this Thesis focused on different aspects of IVIAs' transparency, such as interaction modes and information levels, and explored their impact on drivers under different levels of system reliability. Experiment 1 used a 2 x 2 mixed factorial design, with transparency (Push: proactive vs. Pull: on-demand) as a within-subjects variable and reliability (high vs. low) as a between-subjects variable. In a driving simulator, twenty-seven young drivers drove with two types of in-vehicle agents during Level 3 automated driving. Results suggested that participants generally preferred the Push-type agent, as it conveyed a sense of intelligence and competence. The high-reliability agent was associated with higher situation awareness and lower workload than the low-reliability agent.
Although Experiment 1 explored the effects of transparency by changing the interaction mode and the accuracy of the information, it did not establish a theoretical framework for how much information should be conveyed or how unreliable information influences drivers. Thus, Experiment 2 further studied transparency in terms of information level, and the impact of reliability on its effect. A 3 x 2 mixed factorial design was used, with transparency (T1, T2, T3) as a between-subjects variable and reliability (high vs. low) as a within-subjects variable. Fifty-three participants were recruited. Results suggested that transparency influenced drivers' takeover time, lane keeping, and jerk. The high-reliability agent was associated with higher perceived system accuracy and response speed, and longer takeover time, than the low-reliability agent. Participants in the T2 transparency condition showed higher cognitive trust, lower workload, and higher situation awareness only when system reliability was high. The results of this study may have significant implications for the ongoing creation and advancement of intelligent agent design in automated vehicles. / Master of Science / This thesis explores the effects of the transparency and reliability of in-vehicle intelligent agents (IVIAs) on drivers' performance and perception in the context of automated vehicles. Transparency is defined as how much information about the system's functioning is shared with the operator and in what way; reliability refers to the accuracy of the agent's statements. The experiments focused on different aspects of IVIAs' transparency, such as interaction modes (proactive vs. on-demand) and information composition (small vs. medium vs. large), and how they impact drivers under different levels of system reliability. In the experiments, participants drove in a driving simulator and followed voice commands from the IVIAs. A theoretical model, the Situation Awareness-based Agent Transparency model, was adopted to build the agents' interactive scripts.
In Experiment 1, 27 young drivers drove with two types of in-vehicle agents during Level 3 automated driving. Results suggested that participants generally preferred the agent that provided information proactively, as it conveyed a sense of intelligence and competence. Also, when the system's reliability was high, participants had higher situation awareness of the environment and spent less effort on the driving tasks than when the system's reliability was low. Our results also showed that these two factors can jointly influence participants' driving performance when they need to take over control from the automated system.
Experiment 2 further studied transparency in terms of the information composition of the agent's voice prompts and the impact of reliability on its effect. A total of 53 participants were recruited, and the results suggested that transparency influenced drivers' takeover time, lane keeping, and jerk. The high-reliability agent was associated with higher perceived system accuracy and response speed, and a longer time to take over when requested, than the low-reliability agent. Participants in the medium-transparency condition showed higher cognitive trust toward the system, lower perceived workload while driving, and higher situation awareness only when system reliability was high.
Overall, this research highlights the importance of transparency in IVIAs for improving drivers' performance, perception, and situation awareness. The results may have significant implications for the design and advancement of intelligent agents in automated vehicles.
7 |
Generative models meet similarity search: efficient, heuristic-free and robust retrieval. Doan, Khoa Dang. 23 September 2021.
The rapid growth of digital data, especially visual and textual contents, brings many challenges to the problem of finding similar data. Exact similarity search, which aims to exhaustively find all relevant items through a linear scan in a dataset, is impractical due to its high computational complexity. Approximate-nearest-neighbor (ANN) search methods, especially the Learning-to-hash or Hashing methods, provide principled approaches that balance the trade-offs between the quality of the guesses and the computational cost for web-scale databases. In this era of data explosion, it is crucial for the hashing methods to be both computationally efficient and robust to various scenarios such as when the application has noisy data or data that slightly changes over time (i.e., out-of-distribution).
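As a rough, simplified illustration of the hashing idea described above (not the thesis's learned hash functions), the sketch below uses a PCA-plus-sign baseline to map vectors to short binary codes and ranks database items by Hamming distance to a query. The data, embedding dimensionality, and code length are placeholders.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
database = rng.normal(size=(10_000, 128))   # e.g., image or text embeddings
query = rng.normal(size=(1, 128))

# Simple stand-in for a learned hash function: sign of a PCA projection -> 16-bit codes.
pca = PCA(n_components=16).fit(database)
db_codes = pca.transform(database) > 0
q_code = pca.transform(query) > 0

# Hamming distance between the query code and every database code.
hamming = (db_codes != q_code).sum(axis=1)
top10 = np.argsort(hamming)[:10]            # approximate nearest neighbors
print(top10, hamming[top10])
```

Learning-to-hash methods replace the fixed projection with a trained model so that small Hamming distances track true semantic similarity.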
This Thesis focuses on the development of practical generative learning-to-hash methods and explainable retrieval models. We first identify and discuss the various aspects where the framework of generative modeling can be used to improve the model designs and generalization of hashing methods. Then we show that these generative hashing methods enjoy several appealing empirical and theoretical properties of generative modeling. Specifically, the proposed generative hashing models generalize better, with important properties such as low sample requirements and robustness to out-of-distribution and corrupted data. Finally, in domains with structured data such as graphs, we show that the computational methods of generative modeling have an interesting utility beyond estimating the data distribution, and we describe a retrieval framework that can explain its decisions by borrowing the algorithmic ideas developed in these methods.
Two subsets of generative hashing methods and a subset of explainable retrieval methods are proposed. For the first hashing subset, we propose a novel adversarial framework that can be easily adapted to a new problem domain, along with three training algorithms that learn the hash functions without several hyperparameters commonly found in previous hashing methods. The contributions of our work include: (1) Propose novel algorithms, based on adversarial learning, to learn the hash functions; (2) Design Wasserstein-related adversarial approaches with low computational and sample complexity; (3) Conduct extensive experiments on several benchmark datasets in various domains, including computational advertising and text and image retrieval, for performance evaluation. For the second hashing subset, we propose energy-based hashing solutions that improve the generalization and robustness of existing hashing approaches. The contributions of our work for this task include: (1) Propose data-synthesis solutions to improve the generalization of existing hashing methods; (2) Propose energy-based hashing solutions that exhibit better robustness against out-of-distribution and corrupted data; (3) Conduct extensive experiments for performance evaluation on several benchmark datasets in the image retrieval domain.
Finally, for the last subset of explainable retrieval methods, we propose an optimal alignment algorithm that achieves a better similarity approximation for a pair of structured objects, such as graphs, while capturing the alignment between the nodes of the graphs to explain the similarity calculation. The contributions of our work for this task include: (1) Propose a novel optimal alignment algorithm for comparing two sets of bag-of-vectors embeddings; (2) Propose a differentiable computation to learn the parameters of the proposed optimal alignment model; (3) Conduct extensive experiments, for performance evaluation of both the similarity approximation task and the retrieval task, on several benchmark graph datasets. / Doctor of Philosophy / Searching for similar items, or similarity search, is one of the fundamental tasks of this information age, especially given the rapid growth of visual and textual content. For example, in a search engine such as Google, a user searches for images with similar content to a referenced image; in online advertising, an advertiser finds new users, and eventually targets these users with advertisements, where the new users have profiles similar to some referenced users who have previously responded positively to the same or similar advertisements; in the chemical domain, scientists search for proteins with a similar structure to a referenced protein. The practical search applications in these domains often face several challenges, especially when the datasets or databases contain a large number (e.g., millions or even billions) of complex-structured items (e.g., texts, images, and graphs). These challenges can be organized into three central themes: search efficiency (the economical use of resources such as computation and time), model-design effort (the ease of building the search model), and explainability. Beyond search efficiency and model-design effort, it is increasingly a requirement of a search model to possess the ability to explain its search results, especially in scientific domains where the items are structured objects such as graphs.
This dissertation tackles the aforementioned challenges in practical search applications by using computational techniques that learn to generate data. First, we overcome the need to scan an entire large dataset for similar items by considering an approximate similarity search technique called hashing. Then, we propose an unsupervised hashing framework that learns the hash functions with simpler objective functions directly from raw data. The proposed retrieval framework can be easily adapted to new domains with significantly lower model-design effort. When labeled data is available but limited (a common scenario in practical search applications), we propose a hashing network that can synthesize additional data to improve the hash function learning process. The learned model also exhibits significant robustness against data corruption and slight changes in the underlying data. Finally, in domains with structured data such as graphs, we propose a computational approach that can simultaneously estimate the similarity of structured objects and capture the alignment between their substructures, e.g., nodes. The alignment mechanism can help explain why two objects are similar or dissimilar. This is a useful tool for domain experts who not only want to search for similar items but also want to understand how the search model makes its predictions.
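The node-alignment idea can be illustrated with a much simpler stand-in than the thesis's differentiable optimal alignment: the sketch below uses the Hungarian algorithm to find a one-to-one matching between two sets of placeholder node embeddings, yielding both a similarity score and the matched node pairs that explain it.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
nodes_a = rng.normal(size=(5, 32))   # node embeddings of graph A (placeholder)
nodes_b = rng.normal(size=(7, 32))   # node embeddings of graph B (placeholder)

# Pairwise cosine similarities between the two node sets.
a = nodes_a / np.linalg.norm(nodes_a, axis=1, keepdims=True)
b = nodes_b / np.linalg.norm(nodes_b, axis=1, keepdims=True)
sim = a @ b.T

# Optimal one-to-one alignment that maximizes total similarity (Hungarian algorithm).
row, col = linear_sum_assignment(-sim)
graph_similarity = sim[row, col].mean()

# The matched pairs act as an explanation of why the two graphs are (dis)similar.
for i, j in zip(row, col):
    print(f"node {i} of A  <->  node {j} of B  (similarity {sim[i, j]:.2f})")
print("aligned graph similarity:", round(float(graph_similarity), 3))
```

A learned, differentiable version of this alignment is what allows the retrieval model to be trained end to end while remaining explainable.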
8 |
Machine Learning Explainability on Multi-Modal Data using Ecological Momentary Assessments in the Medical Domain / Erklärbarkeit von maschinellem Lernen unter Verwendung multi-modaler Daten und Ecological Momentary Assessments im medizinischen Sektor. Allgaier, Johannes. January 2024.
Introduction.
Mobile health (mHealth) integrates mobile devices into healthcare, enabling remote monitoring, data collection, and personalized interventions. Machine Learning (ML), a subfield of Artificial Intelligence (AI), can use mHealth data to confirm or extend domain knowledge by finding associations within the data, with the goal of improving healthcare decisions. In this work, two data collection techniques were used to gather mHealth data for ML systems: Mobile Crowdsensing (MCS), a collaborative data gathering approach, and Ecological Momentary Assessments (EMA), which capture real-time individual experiences within the individual's everyday environments using questionnaires and sensors. We collected EMA and MCS data on tinnitus and COVID-19. About 15 % of the world's population suffers from tinnitus.
Materials & Methods.
This thesis investigates the challenges ML systems face when using MCS and EMA data. It asks: How can ML confirm or broaden domain knowledge? Domain knowledge refers to expertise and understanding in a specific field, gained through experience and education. Are ML systems always superior to simple heuristics, and if so, how can one achieve explainable AI (XAI) in the presence of mHealth data? An XAI method enables a human to understand why a model makes certain predictions. Finally, which guidelines can be beneficial for the use of ML within the mHealth domain? In tinnitus research, ML discerns gender-, temperature-, and season-related variations among patients. In the realm of COVID-19, we collaboratively designed a COVID-19 check app for public education, incorporating EMA data to offer informative feedback on COVID-19-related matters. This thesis uses seven EMA datasets with more than 250,000 assessments. Our analyses revealed a set of challenges: app-user over-representation, time gaps, identity ambiguity, and operating-system-specific rounding errors, among others. Our systematic review of 450 medical studies assessed prior utilization of XAI methods.
Results.
ML models predict gender and tinnitus perception, validating gender-linked tinnitus disparities. Using season and temperature to predict tinnitus demonstrates the association of these variables with tinnitus. Multiple assessments from one app user constitute a group, and neglecting these groups in data sets leads to model overfitting. In select instances, heuristics outperform ML models, highlighting the need for domain expert consultation to unveil hidden groups or find simple heuristics.
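The group effect described above can be made concrete with a small hypothetical sketch (placeholder data, not the thesis's datasets): when several assessments come from the same app user, an ordinary random split leaks user-specific signal into the test set and inflates performance, whereas a group-aware split does not.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, GroupKFold, cross_val_score

rng = np.random.default_rng(0)
n_users, per_user = 100, 10
users = np.repeat(np.arange(n_users), per_user)          # several EMAs per app user

# Toy data: features carry a strong user-specific offset, labels are fixed per user.
user_offset = rng.normal(size=(n_users, 5))
X = user_offset[users] + rng.normal(scale=0.5, size=(n_users * per_user, 5))
y = rng.integers(0, 2, size=n_users)[users]

model = RandomForestClassifier(n_estimators=100, random_state=0)

naive = cross_val_score(model, X, y, cv=KFold(5, shuffle=True, random_state=0))
grouped = cross_val_score(model, X, y, groups=users, cv=GroupKFold(5))

# The naive estimate looks far better because the model memorizes users, not signal.
print("naive CV accuracy:  ", naive.mean().round(3))
print("grouped CV accuracy:", grouped.mean().round(3))
```

Group-aware evaluation of this kind is one way to expose the overfitting that hidden user groups cause.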
Conclusion.
This thesis suggests guidelines for mHealth-related data analyses and improves estimates of ML performance. Close communication with medical domain experts, to identify latent user subsets and the incremental benefits of ML, is essential. / Introduction.
Mobile health (mHealth) refers to the use of mobile devices such as phones to support healthcare. Physicians can, for example, collect health information, monitor health remotely, and offer personalized treatments. Machine learning (ML) can be used as a system that learns from this health information: the ML system tries to find patterns in the mHealth data to help physicians make better decisions. Two methods were used for data collection: on the one hand, numerous people contributed to gathering comprehensive information with mobile devices (Mobile Crowdsensing); on the other hand, contributors were sent digital questionnaires, and sensors such as GPS were used to capture information in everyday environments (Ecological Momentary Assessments). This work uses data from two medical areas: tinnitus and COVID-19. An estimated 15 % of humanity suffers from tinnitus.
Materials & Methods.
The thesis investigates how ML systems handle mHealth data: How can these systems become more robust or learn new things? Do the new ML systems always work better than simple rules of thumb, and if so, how can we get them to explain why they make certain decisions? What special rules should be followed when training ML systems with mHealth data? During the COVID-19 pandemic, we developed an app intended to help people inform themselves about the virus. This app used data on app users' disease symptoms to give recommendations on how to proceed.
Results.
ML systems were trained to predict tinnitus and how it might relate to gender-specific differences. Using factors such as season and temperature can help in understanding tinnitus and its relationship to these factors. If training does not account for the fact that one app user can fill out several records, this leads to overfitting and thus a degradation of the ML system. Interestingly, simple rules sometimes lead to more robust and better models than complex ML systems. This shows how important it is to involve domain experts in order to avoid overfitting or to find simple predictive rules.
Conclusion.
By examining a variety of longitudinal data, we were able to derive new recommendations for analyzing mHealth data and developing ML systems. It is important to involve medical domain experts in order to avoid overfitting and to improve ML systems step by step.
9 |
Towards Explainable Decision-making Strategies of Deep Convolutional Neural Networks: An exploration into explainable AI and potential applications within cancer detection. Hammarström, Tobias. January 2020.
The influence of Artificial Intelligence (AI) on society is increasing, with applications in highly sensitive and complicated areas. Examples include using Deep Convolutional Neural Networks within healthcare for diagnosing cancer. However, the inner workings of such models are often unknown, limiting the much-needed trust in the models. To combat this, Explainable AI (XAI) methods aim to provide explanations of the models' decision-making. Two such methods, Spectral Relevance Analysis (SpRAy) and Testing with Concept Activation Vectors (TCAV), were evaluated on a deep learning model classifying cat and dog images that contained introduced artificial noise. The task was to assess the methods' capabilities to explain the importance of the introduced noise for the learnt model. The task was constructed as an exploratory step, with the future aim of using the methods on models diagnosing oral cancer. In addition to using the TCAV method as introduced by its authors, this study also utilizes the CAV sensitivity to introduce and perform a sensitivity-magnitude analysis. Both methods proved useful in distinguishing between the model's two decision-making strategies, based on either the animal or the noise. However, greater insight into the intricacies of said strategies is desired. Additionally, the methods provided a deeper understanding of the model's learning, as the model did not seem to properly distinguish between the noise and the animal conceptually. The methods thus accentuated the limitations of the model, thereby increasing our trust in its abilities. In conclusion, the methods show promise regarding the task of detecting visually distinctive noise in images, which could extend to other distinctive features present in more complex problems. Consequently, more research should be conducted on applying these methods to more complex areas with specialized models and tasks, e.g. oral cancer.
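A rough sketch of the TCAV idea referenced above, using placeholder arrays rather than the study's actual model or images: a linear classifier separates a concept's layer activations from random activations, its normalized weight vector serves as the concept activation vector (CAV), and the directional derivatives of the class logit along that vector give the conceptual sensitivities whose positive fraction is the TCAV score and whose magnitudes support a sensitivity-magnitude analysis.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder inputs: intermediate-layer activations for concept images and random
# images, plus gradients of the class logit w.r.t. the same layer for test images.
rng = np.random.default_rng(0)
concept_acts = rng.normal(size=(50, 512))
random_acts = rng.normal(size=(50, 512))
logit_grads = rng.normal(size=(200, 512))

clf = LogisticRegression(max_iter=1000).fit(
    np.vstack([concept_acts, random_acts]),
    np.r_[np.ones(50), np.zeros(50)])
cav = clf.coef_[0] / np.linalg.norm(clf.coef_[0])   # concept activation vector

sensitivities = logit_grads @ cav                   # directional derivatives along the CAV
tcav_score = (sensitivities > 0).mean()             # fraction of positively sensitive inputs
print("TCAV score:", tcav_score, "mean |sensitivity|:", np.abs(sensitivities).mean())
```

With real activations, a "noise" concept scoring highly for one class would expose a decision strategy driven by the introduced artifact rather than the animal.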
10 |
Learning acyclic probabilistic logic programs from data / Aprendizado de programas lógico-probabilísticos acíclicos. Faria, Francisco Henrique Otte Vieira de. 12 December 2017.
To learn a probabilistic logic program is to find a set of probabilistic rules that best fits some data, in order to explain how attributes relate to one another and to predict the occurrence of new instantiations of these attributes. In this work, we focus on acyclic programs, because in this case the meaning of the program is quite transparent and easy to grasp. We propose that the learning process for an acyclic probabilistic logic program should be guided by a scoring function imported from the literature on Bayesian network learning. We suggest novel techniques that lead to orders-of-magnitude improvements over the current state of the art represented by the ProbLog package. In addition, we present novel techniques for learning the structure of acyclic probabilistic logic programs. / Learning a probabilistic logic program consists of finding a set of probabilistic logic rules that best fit the data, in order to explain how the observed attributes are related and to predict the occurrence of new instantiations of these attributes. In this work we focus on acyclic programs, whose meaning is quite clear and easy to interpret. We propose that the learning process for acyclic probabilistic logic programs should be guided by scoring functions imported from the Bayesian network learning literature. This work suggests new parameter-learning techniques that contribute to a significant improvement in the computational efficiency of the state of the art represented by the ProbLog package. In addition, we present new techniques for learning the structure of acyclic probabilistic logic programs.
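To illustrate the kind of Bayesian-network-style scoring function mentioned above, here is a generic BIC-like score on toy binary data; it is not the thesis's actual implementation nor the ProbLog API, and all attribute names and data are placeholders. A structure learner would search over candidate parent sets while keeping the program acyclic, preferring the higher-scoring candidates.

```python
import numpy as np
from itertools import product

def bic_score(data, target, parents):
    """BIC-style score of `target` given a candidate parent set (toy version)."""
    n = len(data)
    log_lik = 0.0
    for assignment in product([0, 1], repeat=len(parents)):
        mask = np.ones(n, dtype=bool)
        for attr, value in zip(parents, assignment):
            mask &= data[:, attr] == value
        rows = data[mask, target]
        if rows.size == 0:
            continue
        ones = rows.sum()
        p1 = (ones + 1) / (rows.size + 2)          # Laplace-smoothed P(target=1 | parents)
        log_lik += ones * np.log(p1) + (rows.size - ones) * np.log(1 - p1)
    n_params = 2 ** len(parents)                   # one Bernoulli parameter per configuration
    return log_lik - 0.5 * np.log(n) * n_params

# Toy data: attribute 2 depends on attribute 0 but not on attribute 1.
rng = np.random.default_rng(0)
a0 = rng.integers(0, 2, 1000)
a1 = rng.integers(0, 2, 1000)
a2 = np.where(rng.random(1000) < 0.9, a0, 1 - a0)
data = np.column_stack([a0, a1, a2])

for parents in ([], [0], [1], [0, 1]):
    print(parents, round(bic_score(data, target=2, parents=parents), 1))
```

On this toy data the parent set [0] should score best, mirroring how such a score guides the choice of rule bodies during structure learning.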