21 |
Improving Visual Question Answering by Leveraging Depth and Adapting Explainability / Förbättring av Visual Question Answering (VQA) genom utnyttjandet av djup och anpassandet av förklaringsförmågan
Panesar, Amrita Kaur, January 2022 (has links)
To produce smooth human-robot interactions, it is important for robots to be able to answer users’ questions accurately and provide a suitable explanation for why they arrive at the answer they provide. However, in the wild, the user may ask the robot questions relating to aspects of the scene that the robot is unfamiliar with, and hence the robot will be unable to answer correctly all of the time. In order to gain trust in the robot and resolve failure cases where an incorrect answer is provided, we propose a method that uses Grad-CAM explainability on RGB-D data. Depth is a critical component in producing more intelligent robots that can respond correctly most of the time, as some questions may rely on spatial relations within the scene, for which 2D RGB data alone would be insufficient. To our knowledge, this work is the first of its kind to leverage depth and an explainability module to produce an explainable Visual Question Answering (VQA) system. Furthermore, we introduce a new dataset for the task of VQA on RGB-D data, VQA-SUNRGBD. We evaluate our explainability method against Grad-CAM on RGB data and find that ours produces better visual explanations. When we compare our proposed model on RGB-D data against the baseline VQN network on RGB data alone, we show that ours outperforms it, particularly on questions relating to depth, such as those about the proximity of objects and the relative positions of objects to one another. / För att skapa smidiga interaktioner mellan människa och robot är det viktigt för robotar att kunna svara på användarnas frågor korrekt och ge en lämplig förklaring till varför de kommer fram till det svar de ger. Men i det vilda kan användaren ställa frågor till roboten som rör aspekter av miljön som roboten är obekant med, och roboten kan därmed inte svara korrekt hela tiden. För att få förtroende för roboten och lösa de misslyckade fall där ett felaktigt svar ges, föreslår vi en metod som använder Grad-CAM-förklarbarhet på RGB-D-data.
Djup är en kritisk komponent för att producera mer intelligenta robotar som kan svara korrekt för det mesta, eftersom vissa frågor kan förlita sig på rumsliga relationer inom scenen, för vilka enbart 2D RGB-data skulle vara otillräcklig. Såvitt vi vet är detta arbete det första i sitt slag som utnyttjar djup och en förklaringsmodul för att producera ett förklarbart Visual Question Answering (VQA)-system. Dessutom introducerar vi ett nytt dataset för uppgiften VQA på RGB-D-data, VQA-SUNRGBD. Vi utvärderar vår förklaringsmetod mot Grad-CAM på RGB-data och finner att vår modell ger bättre visuella förklaringar. När vi jämför vår föreslagna modell för RGB-D-data mot baslinje-VQN-nätverket på enbart RGB-data visar vi att vår modell överträffar den, särskilt i frågor som rör djup, som att fråga om objekts närhet och relativa positioner för objekt gentemot varandra.
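The core Grad-CAM computation the thesis builds on is network-agnostic and can be sketched as follows; the function name and toy shapes are illustrative and not taken from the thesis:

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heatmap from the last conv layer of a CNN.

    activations: (C, H, W) feature maps recorded on the forward pass.
    gradients:   (C, H, W) gradients of the predicted answer's score
                 with respect to those feature maps.
    """
    weights = gradients.mean(axis=(1, 2))             # channel importance: pooled gradients
    cam = np.tensordot(weights, activations, axes=1)  # weighted sum over channels -> (H, W)
    cam = np.maximum(cam, 0.0)                        # ReLU: keep evidence *for* the answer
    if cam.max() > 0:
        cam /= cam.max()                              # normalise to [0, 1] before overlay
    return cam
```

For the RGB-D setting described above, the same computation would presumably be run on conv features of both the RGB and depth streams, and the resulting heatmaps compared against RGB-only Grad-CAM.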
|
22 |
Explainable AI - Visualization of Neuron Functionality in Recurrent Neural Networks for Text Prediction / Förklarande AI - Visualisering av Neuronfunktionalitet i Rekurrenta Neurala Nätverk för Textprediktering
Dahlberg, John, January 2019 (has links)
Artificial Neural Networks are successfully solving a wide range of problems with impressive performance. Nevertheless, often very little or nothing is understood of the workings behind these black-box solutions, as they are hard to interpret, let alone to explain. This thesis proposes a set of complementary interpretable visualization models of neural activity, developed through prototyping, to answer the research question ”How may neural activity of Recurrent Neural Networks for text sequence prediction be represented, transformed and visualized during the inference process to explain interpretable functionality with respect to the text domain of some individual hidden neurons, as well as automatically detect these?”. Specifically, a Vanilla and a Long Short-Term Memory (LSTM) architecture are utilized as testbeds for character and word prediction, respectively. The research method is experimental; causalities between text features triggering neurons and detected patterns of corresponding nerve impulses are investigated. The result reveals not only that there exist neurons with clear and consistent feature-specific patterns of activity, but also that the proposed visualization models may automatically detect and interpretably present some of these. / Artificiella Neurala Nätverk löser framgångsrikt ett brett spektrum av problem med imponerande prestanda. Ändå är det ofta mycket lite eller ingenting som går att förstå bakom dessa svart-låda-lösningar, eftersom de är svåra att tolka och desto svårare att förklara. Den här uppsatsen föreslår en uppsättning komplementerande tolkningsbara visualiseringsmodeller av neural aktivitet, utvecklade genom prototypering, för att besvara forskningsfrågan ”Hur kan användningsprocessen av Rekurrenta Neurala Nätverk för textgenerering visualiseras på ett sätt för att automatiskt detektera och förklara tolkningsbar funktionalitet hos några enskilda dolda neuroner?”.
Specifikt används en standard- och en LSTM-arkitektur (långt korttidsminne) för tecken- respektive ordprediktering som testbäddar. Forskningsmetoden är experimentell; orsakssamband mellan specifika typer av tecken/ord i texten som triggar neuroner, och detekterade mönster av motsvarande nervimpulser undersöks. Resultatet avslöjar inte bara att neuroner med tydliga och konsekventa tecken/ord-specifika aktivitetsmönster existerar, utan också att de utvecklade modellerna för visualisering framgångsrikt kan automatiskt upptäcka och tolkningsbart presentera några av dessa.
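One generic way to automatically detect feature-specific neurons of the kind described above is to correlate each hidden unit's activation trace with a binary text feature; this is a sketch under that assumption, not the thesis's actual detection model:

```python
import numpy as np

def detect_feature_neurons(hidden_states, feature, top_k=3):
    """Rank hidden neurons by correlation with a text feature.

    hidden_states: (T, H) neuron activations recorded while the
                   network reads a text of length T.
    feature:       (T,) indicator of a text property at each step,
                   e.g. "inside a quotation".
    Returns indices of the top_k most feature-specific neurons and
    the per-neuron Pearson correlations.
    """
    h = hidden_states - hidden_states.mean(axis=0)
    f = feature - feature.mean()
    denom = np.sqrt((h ** 2).sum(axis=0) * (f ** 2).sum()) + 1e-12
    corr = (h * f[:, None]).sum(axis=0) / denom     # Pearson r per neuron
    return np.argsort(-np.abs(corr))[:top_k], corr
```

A neuron whose |r| is high and stable across texts is a candidate for the kind of interpretable functionality the thesis visualizes.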
|
23 |
Deep Learning Classification and Model Explainability for Prediction of Mental Health Patients Emergency Department Visit / Emergency Department Resource Prediction Using Explainable Deep Learning
Rashidiani, Sajjad, January 2022 (has links)
The rate of Emergency Department (ED) visits due to mental health and drug abuse among children and youth has been increasing for more than a decade and is projected to become the leading cause of ED visits. Identifying high-risk patients well before an ED visit will enable mental health care providers to better predict ED resource utilization, improve their service, and ultimately reduce the risk of a future ED visit. Many studies in the literature utilized medical history to predict future hospitalization. However, in mental health care, the medical history of new patients is not always available from the first visit, and it is crucial to identify high-risk patients from the beginning, as the drop-out rate in mental health treatment is very high. In this study, a new approach of creating a text representation of questionnaire data for deep learning analysis is proposed. Employing this new text representation has enabled us to use transfer learning and develop a deep Natural Language Processing (NLP) model that estimates the likelihood of an ED visit within 6 months among children and youth using mental health patient-reported outcome measures (PROMs). The proposed method achieved an Area Under the Receiver Operating Characteristic Curve of 0.75 for classification of 6-month ED visits. In addition, a novel method was proposed to identify the words that carry the highest amount of information related to the outcome of the deep NLP models. This measurement of word information using Entropy Gain increases the explainability of the model by providing insight into the model's attention. Finally, the results of this method were analyzed to explain how the deep NLP model achieved a high classification performance. / Dissertation / Master of Applied Science (MASc) / In this document, an Artificial Intelligence (AI) approach for predicting 6-month Emergency Department (ED) visits is proposed.
In this approach, the questionnaires gathered from children and youth admitted to an outpatient or inpatient clinic are converted to a text representation called Textionnaire. Next, AI is utilized to analyze the Textionnaire and predict the possibility of a future ED visit. This method was successful about 75% of the time. In addition to the AI solution, an explainability component is introduced to explain how the natural language processing algorithm identifies the high-risk patients.
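The abstract does not fully specify how Entropy Gain is computed, but one plausible reading is an ablation-style measure: a word is informative if removing it from the text pushes the model back toward an uncertain (higher-entropy) prediction. A minimal sketch under that assumption, with a stand-in `predict` callable:

```python
import math

def entropy(probs):
    # Shannon entropy of a probability distribution (natural log)
    return -sum(p * math.log(p) for p in probs if p > 0)

def entropy_gain(words, predict):
    """Rank words by how much predictive information they carry.

    predict: any callable mapping a list of words to a probability
    distribution over outcomes (here: ED visit vs. no ED visit).
    Removing an informative word raises entropy above the baseline.
    """
    base = entropy(predict(words))
    return {w: entropy(predict(words[:i] + words[i + 1:])) - base
            for i, w in enumerate(words)}
```

Words with the largest gains would then be highlighted as the ones the deep NLP model attends to.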
|
24 |
Explainable Reinforcement Learning for Remote Electrical Tilt Optimization
Mirzaian, Artin, January 2022 (has links)
Controlling antennas’ vertical tilt through Remote Electrical Tilt (RET) is an effective method to optimize network performance. Reinforcement Learning (RL) algorithms such as Deep Reinforcement Learning (DRL) have been shown to be successful for RET optimization. One issue with DRL is that DRL models have a black-box nature, where it is difficult to ’explain’ the decisions made in a human-understandable way. Explanations of a model’s decisions are beneficial for a user not only to understand but also to intervene and modify the RL model. In this work, a state-of-the-art Explainable Reinforcement Learning (XRL) method is evaluated on the RET optimization problem. More specifically, the chosen XRL method is the Embedded Self-Prediction (ESP) model proposed by Lin, Lam, and Fern [16], which can generate contrastive explanations in terms of why one action is preferred over another. The ESP model was evaluated on two different RET optimization scenarios. The first scenario is formulated as a single-agent RL problem in a ’simple’ environment, whereas the second scenario is formulated as a multi-agent RL problem with a more complex environment. In both scenarios, the results show little to no difference in performance compared to a baseline Deep Q-Network (DQN) algorithm. Finally, the explanations of the model were validated by comparing them to action outcomes. The conclusions of this work are that the ESP model offers explanations of its behaviour with no performance decrease compared to a baseline DQN, and that the generated explanations offer value in debugging and understanding the given problem. / Att styra antenners vertikala lutning genom RET är en effektiv metod för att optimera nätverksprestanda. RL-algoritmer som DRL har visat sig vara framgångsrika för RET-optimering. Ett problem med DRL är att DRL-modeller är som en svart låda där det är svårt att ’förklara’ de beslut som fattas på ett sätt som är begripligt för människor.
Förklaringar av en modells beslut är fördelaktiga för en användare inte bara för att förstå utan också för att ingripa och modifiera RL-modellen. I detta arbete utvärderas en toppmodern XRL-metod på RET-optimeringsproblemet. Mer specifikt är den valda XRL-metoden ESP-modellen som föreslagits av Lin, Lam och Fern [16], som kan generera kontrastiva förklaringar i termer av varför en handling föredras framför en annan. ESP-modellen utvärderades på två olika RET-optimeringsscenarier. Det första scenariot är formulerat som ett problem med en enstaka agent i en ’enkel’ miljö medan det andra scenariot är formulerat som ett problem med flera agenter i en mer komplex miljö. I båda scenarierna visar resultaten liten eller ingen skillnad i prestanda jämfört med en DQN-algoritm. Slutligen validerades modellens förklaringar genom att jämföra dem med handlingsresultat. Slutsatserna av detta arbete är att ESP-modellen erbjuder förklaringar av sitt beteende utan prestandaminskning jämfört med en DQN, och att de genererade förklaringarna ger värde för att felsöka och förstå det givna problemet.
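The ESP model predicts human-understandable features of the future (generalized value functions) and scores actions with a linear head on top of them; given such a decomposition, a contrastive "why action a rather than b" explanation reduces to comparing per-feature contributions. A sketch under that assumption, with all shapes and feature names illustrative:

```python
import numpy as np

def contrastive_explanation(gvf_features, weights, preferred, alternative):
    """Why is `preferred` chosen over `alternative`?

    gvf_features: (n_actions, n_features) predicted future feature
                  values per action (e.g. expected throughput).
    weights:      (n_features,) linear head mapping features to Q-values.
    Returns per-feature contribution differences; positive entries are
    the features that favour the preferred action.
    """
    contrib = gvf_features * weights            # (n_actions, n_features)
    delta = contrib[preferred] - contrib[alternative]
    return delta, float(delta.sum())            # delta.sum() == Q(a) - Q(b)
```

In a RET setting, the largest positive entries of `delta` would name the network features (coverage, interference, and so on) driving the tilt recommendation.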
|
25 |
Exploring attribution methods explaining atrial fibrillation predictions from sinus ECGs : Attributions in Scale, Time and Frequency / Undersökning av attributionsmetoder för att förklara förmaksflimmerprediktioner från EKG:er i sinusrytm : Attribution i skala, tid och frekvens
Sörberg, Svante, January 2021 (has links)
Deep Learning models are ubiquitous in machine learning. They offer state-of-the-art performance on tasks ranging from natural language processing to image classification. The drawback of these complex models is their black-box nature. It is difficult for the end-user to understand how a model arrives at its prediction from the input. This is especially pertinent in domains such as medicine, where being able to trust a model is paramount. In this thesis, ways of explaining a model predicting paroxysmal atrial fibrillation from sinus electrocardiogram (ECG) data are explored. Building on the concept of feature attributions, the problem is approached from three distinct perspectives: time, scale, and frequency. Specifically, one method based on the Integrated Gradients framework and one method based on Shapley values are used. By perturbing the data, retraining the model, and evaluating the retrained model on the perturbed data, the degree of correspondence between the attributions and the meaningful information in the data is evaluated. Results indicate that the attributions in scale and frequency are somewhat consistent with the meaningful information in the data, while the attributions in time are not. The conclusion drawn from the results is that the task of predicting atrial fibrillation for the model in question becomes easier as the level of scale is increased slightly, and that high-frequency information is either not meaningful for the task of predicting atrial fibrillation, or that if it is, the model is unable to learn from it. / Djupinlärningsmodeller förekommer på många håll inom maskininlärning. De erbjuder bästa möjliga prestanda i olika domäner såsom datorlingvistik och bildklassificering. Nackdelen med dessa komplexa modeller är deras “svart låda”-egenskaper. Det är svårt för användaren att förstå hur en modell kommer fram till sin prediktion utifrån indatan. Detta är särskilt relevant i domäner såsom sjukvård, där tillit till modellen är avgörande.
I denna uppsats utforskas sätt att förklara en modell som predikterar paroxysmalt förmaksflimmer från elektrokardiogram (EKG) som uppvisar normal sinusrytm. Med utgångspunkt i feature attribution (särdragsattribution) angrips problemet från tre olika perspektiv: tid, skala och frekvens. I synnerhet används en metod baserad på Integrated Gradients och en metod baserad på Shapley-värden. Genom att perturbera datan, träna om modellen, och utvärdera den omtränade modellen på den perturberade datan utvärderas graden av överensstämmelse mellan attributionerna och den meningsfulla informationen i datan. Resultaten visar att attributioner i skala- och frekvensdomänerna delvis stämmer överens med den meningsfulla informationen i datan, medan attributionerna i tidsdomänen inte gör det. Slutsatsen som dras utifrån resultaten är att uppgiften att prediktera förmaksflimmer blir enklare när skalnivån ökas något, samt att högfrekvent information antingen inte är betydelsefull för att prediktera förmaksflimmer, eller att om den är det, så saknar modellen förmågan att lära sig detta.
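The Integrated Gradients framework mentioned above is model-agnostic: attributions are the input-baseline difference times the average gradient along the straight-line path between them. A minimal sketch (the `grad_fn` callable stands in for automatic differentiation through the actual ECG model):

```python
import numpy as np

def integrated_gradients(x, baseline, grad_fn, steps=50):
    """Path-integral attribution from `baseline` to input `x`.

    grad_fn: gradient of the model's output w.r.t. its input,
    e.g. an ECG segment. The midpoint rule keeps the completeness
    axiom tight: attributions sum to f(x) - f(baseline).
    """
    alphas = (np.arange(steps) + 0.5) / steps      # midpoints of the path
    grads = [grad_fn(baseline + a * (x - baseline)) for a in alphas]
    return (x - baseline) * np.mean(grads, axis=0)
```

For the thesis's three perspectives, `x` would be the signal in the time, scale (wavelet), or frequency representation, with attributions computed in that same domain.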
|
26 |
Tools and Methods for Companies to Build Transparent and Fair Machine Learning Systems / Verktyg och metoder för företag att utveckla transparenta och rättvisa maskininlärningssystem
Schildt, Alexandra; Luo, Jenny, January 2020 (has links)
AI has quickly grown from being a vast concept to an emerging technology that many companies are looking to integrate into their businesses, and it is generally considered an ongoing “revolution” transforming science and society altogether. Researchers and organizations agree that AI and the recent rapid developments in machine learning carry huge potential benefits. At the same time, there is an increasing worry that ethical challenges are not being addressed in the design and implementation of AI systems. As a result, AI has sparked a debate about what principles and values should guide its development and use. However, there is a lack of consensus about what values and principles should guide the development, as well as what practical tools should be used to translate such principles into practice. Although researchers, organizations and authorities have proposed tools and strategies for working with ethical AI within organizations, there is a lack of a holistic perspective tying together the tools and strategies proposed in ethical, technical and organizational discourses. The thesis aims to contribute knowledge to bridge this gap by addressing the following purpose: to explore and present the different tools and methods companies and organizations should have in order to build machine learning applications in a fair and transparent manner. The study is of qualitative nature and data collection was conducted through a literature review and interviews with subject matter experts. In our findings, we present a number of tools and methods to increase fairness and transparency. Our findings also show that companies should work with a combination of tools and methods, both outside and inside the development process, as well as in different stages of the machine learning development process.
Tools used outside the development process, such as ethical guidelines, appointed roles, workshops and trainings, have positive effects on alignment, engagement and knowledge while providing valuable opportunities for improvement. Furthermore, the findings suggest that it is crucial to translate high-level values into low-level requirements that are measurable and can be evaluated against. We propose a number of pre-model, in-model and post-model techniques that companies can and should implement at each stage of the development process to increase fairness and transparency in their machine learning systems. / AI har snabbt vuxit från att vara ett vagt koncept till en ny teknik som många företag vill eller är i färd med att implementera. Forskare och organisationer är överens om att AI och utvecklingen inom maskininlärning har enorma potentiella fördelar. Samtidigt finns det en ökande oro för att utformningen och implementeringen av AI-system inte tar de etiska riskerna i beaktning. Detta har triggat en debatt kring vilka principer och värderingar som bör vägleda AI i dess utveckling och användning. Det saknas enighet kring vilka värderingar och principer som bör vägleda AI-utvecklingen, men också kring vilka praktiska verktyg som skall användas för att implementera dessa principer i praktiken. Trots att forskare, organisationer och myndigheter har föreslagit verktyg och strategier för att arbeta med etiskt AI inom organisationer, saknas ett helhetsperspektiv som binder samman de verktyg och strategier som föreslås i etiska, tekniska och organisatoriska diskurser. Rapporten syftar till att överbrygga detta gap med följande syfte: att utforska och presentera olika verktyg och metoder som företag och organisationer bör ha för att bygga maskininlärningsapplikationer på ett rättvist och transparent sätt. Studien är av kvalitativ karaktär och datainsamlingen genomfördes genom en litteraturstudie och intervjuer med ämnesexperter från forskning och näringsliv.
I våra resultat presenteras ett antal verktyg och metoder för att öka rättvisa och transparens i maskininlärningssystem. Våra resultat visar också att företag bör arbeta med en kombination av verktyg och metoder, både utanför och inuti utvecklingsprocessen men också i olika stadier i utvecklingsprocessen. Verktyg utanför utvecklingsprocessen så som etiska riktlinjer, utsedda roller, workshops och utbildningar har positiva effekter på engagemang och kunskap samtidigt som de ger värdefulla möjligheter till förbättringar. Dessutom indikerar resultaten att det är kritiskt att principer på hög nivå översätts till mätbara kravspecifikationer. Vi föreslår ett antal verktyg i pre-model, in-model och post-model som företag och organisationer kan implementera för att öka rättvisa och transparens i sina maskininlärningssystem.
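Translating high-level values into measurable, evaluable requirements, as the thesis recommends, in practice means choosing concrete fairness metrics. One minimal example (the choice of metric here is illustrative, not the thesis's prescription) is the demographic-parity gap:

```python
def demographic_parity_gap(predictions, groups):
    """Absolute difference in positive-prediction rates between groups.

    predictions: iterable of 0/1 model outputs.
    groups:      iterable of protected-attribute values (two groups).
    A gap near 0 is one measurable fairness requirement a post-model
    audit can be evaluated against.
    """
    rates = {}
    for pred, grp in zip(predictions, groups):
        pos, total = rates.get(grp, (0, 0))
        rates[grp] = (pos + pred, total + 1)
    a, b = (pos / total for pos, total in rates.values())
    return abs(a - b)
```

A requirement such as "gap below 0.05 on the validation set" is the kind of low-level, checkable translation of a high-level fairness value the findings call for.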
|
27 |
Exploring User Trust in Natural Language Processing Systems : A Survey Study on ChatGPT Users
Aronsson Bünger, Morgan, January 2024 (has links)
ChatGPT has become a popular technology among people and gained a considerable user base because of its power to effectively generate responses to users’ requests. However, as ChatGPT’s popularity has grown and as other natural language processing systems (NLPs) are being developed and adopted, several concerns have been raised about the technology that could have implications for user trust. Because trust plays a central role in users' willingness to adopt artificial intelligence (AI) systems and there is no consensus in research on what facilitates trust, it is important to conduct more research to identify the factors that affect user trust in artificial intelligence systems, especially modern technologies such as NLPs. The aim of the study was therefore to identify the factors that affect user trust in NLPs. The findings from the literature within trust and artificial intelligence indicated that there may exist a relationship between trust and transparency, explainability, accuracy, reliability, automation, augmentation, anthropomorphism and data privacy. These factors were quantitatively studied together in order to uncover what affects user trust in NLPs. The results from the study indicated that transparency, accuracy, reliability, automation, augmentation, anthropomorphism and data privacy all have a positive impact on user trust in NLPs, which both supported and opposed previous findings from the literature.
|
28 |
Trustworthy AI: Ensuring Explainability and Acceptance
Davinder Kaur (17508870), 03 January 2024 (has links)
<p dir="ltr">In the dynamic realm of Artificial Intelligence (AI), this study explores the multifaceted landscape of Trustworthy AI with a dedicated focus on achieving both explainability and acceptance. The research addresses the evolving dynamics of AI, emphasizing the essential role of human involvement in shaping its trajectory.</p><p dir="ltr">A primary contribution of this work is the introduction of a novel "Trustworthy Explainability Acceptance Metric", tailored for the evaluation of AI-based systems by field experts. Grounded in a versatile distance acceptance approach, this metric provides a reliable measure of acceptance value. Practical applications of this metric are illustrated, particularly in a critical domain like medical diagnostics. Another significant contribution is the proposal of a trust-based security framework for 5G social networks. This framework enhances security and reliability by incorporating community insights and leveraging trust mechanisms, presenting a valuable advancement in social network security.</p><p dir="ltr">The study also introduces an artificial conscience-control module model, innovating with the concept of "Artificial Feeling." This model is designed to enhance AI system adaptability based on user preferences, ensuring controllability, safety, reliability, and trustworthiness in AI decision-making. This innovation contributes to fostering increased societal acceptance of AI technologies. Additionally, the research conducts a comprehensive survey of foundational requirements for establishing trustworthiness in AI. Emphasizing fairness, accountability, privacy, acceptance, and verification/validation, this survey lays the groundwork for understanding and addressing ethical considerations in AI applications. The study concludes with exploring quantum alternatives, offering fresh perspectives on algorithmic approaches in trustworthy AI systems. 
This exploration broadens the horizons of AI research, pushing the boundaries of traditional algorithms.</p><p dir="ltr">In summary, this work significantly contributes to the discourse on Trustworthy AI, ensuring both explainability and acceptance in the intricate interplay between humans and AI systems. Through its diverse contributions, the research offers valuable insights and practical frameworks for the responsible and ethical deployment of AI in various applications.</p>
|
29 |
Brain Tumor Grade Classification in MR images using Deep Learning / Klassificering av hjärntumör-grad i MR-bilder genom djupinlärning
Chatzitheodoridou, Eleftheria, January 2022 (has links)
Brain tumors represent a diverse spectrum of cancer types which can induce grave complications and lead to poor life expectancy. Amongst the various brain tumor types, gliomas are primary brain tumors that compose about 30% of adult brain tumors. They are graded according to the World Health Organization into Grades 1 to 4 (G1-G4), where G4 is the highest grade, with the highest malignancy and poor prognosis. Early diagnosis and classification of brain tumor grade is very important, since it can improve the treatment procedure and (potentially) prolong a patient's life, as life expectancy largely depends on the level of malignancy and the tumor's histological characteristics. While clinicians have diagnostic tools they use as a gold standard, such as biopsies, these are either invasive or costly. A widely used example of a non-invasive technique is magnetic resonance imaging (MRI), due to its ability to produce images with different soft-tissue contrast and high spatial resolution thanks to multiple imaging sequences. However, the examination of such images can be overwhelming for radiologists due to the overall large amount of data. Deep learning approaches, on the other hand, have shown great potential in brain tumor diagnosis and can assist radiologists in the decision-making process. In this thesis, brain tumor grade classification in MR images is performed using deep learning. Two popular pre-trained CNN models (VGG-19, ResNet50) were employed using single MR modalities and combinations of them to classify gliomas into three grades. All models were trained using data augmentation on 2D images from the TCGA dataset, which consisted of 3D volumes from 142 anonymized patients. The models were evaluated based on accuracy, precision, recall, F1-score, and AUC score, as well as the Wilcoxon Signed-Rank test to establish whether one classifier was statistically significantly better than the other.
Since deep learning models are typically 'black box' models and can be difficult to interpret by non-experts, Gradient-weighted Class Activation Mapping (Grad-CAM) was used in order to address model explainability. For single modalities, VGG-19 displayed the highest performance with a test accuracy of 77.86%, whilst for combinations of two and three modalities, T1ce + FLAIR and T2 + T1ce + FLAIR were the best performing ones for VGG-19, with test accuracies of 74.48% and 75.78%, respectively. Statistical comparisons indicated that for single MR modalities and combinations of two MR modalities there was not a statistically significant difference between the two classifiers, whilst for the combination of three modalities, one model was better than the other. However, given the small size of the test population, these comparisons have low statistical power. The use of Grad-CAM for model explainability indicated that ResNet50 was able to localize the tumor region better than VGG-19.
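The AUC score reported in this evaluation can be computed directly from its rank interpretation: the probability that a randomly chosen positive case scores above a randomly chosen negative one. A stdlib-only sketch (two-class case; the thesis's multi-class setup would average such scores per class):

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    labels: 0/1 ground truth; scores: model confidence for class 1.
    Equals the probability that a positive outranks a negative,
    counting ties as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

This pairwise formulation makes explicit why AUC is threshold-free, which matters when comparing classifiers such as VGG-19 and ResNet50 whose score calibrations differ.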
|
30 |
Human-Centered Explainability Attributes In AI-Powered Eco-Driving : Understanding Truck Drivers' Perspective
Gjona, Ermela, January 2023 (has links)
The growing presence of algorithm-generated recommendations in AI-powered services highlights the importance of responsible systems that explain outputs in a human-understandable form, especially in an automotive context. Implementing explainability in recommendations of AI-powered eco-driving is important in ensuring that drivers understand the underlying reasoning behind the recommendations. Previous literature on explainable AI (XAI) has been primarily technology-centered, and only a few studies involve the end-user perspective. There is a lack of knowledge of drivers' needs and requirements for explainability in an AI-powered eco-driving context. This study addresses the attributes that make a “satisfactory” explanation, i.e., a satisfactory interface between humans and AI. This study uses scenario-based interviews to understand the explainability attributes that influence truck drivers' intention to use eco-driving recommendations. The study used thematic analysis to categorize seven attributes into context-dependent (Format, Completeness, Accuracy, Timeliness, Communication) and generic (Reliability, Feedback loop) categories. The study contributes context-dependent attributes along three design dimensions: Presentational, Content-related, and Temporal aspects of explainability. The findings of this study present an empirical foundation into end-users' explainability needs and provide valuable insights for UX and system designers in eliciting end-user requirements.
|