
Towards gradient faithfulness and beyond

The riveting interplay of industrialization, informatization, and exponential technological growth in recent years has shifted attention from classical machine learning techniques to more sophisticated deep learning approaches; yet the intrinsic black-box nature of these models has been impeding their widespread adoption in transparency-critical operations. In this rapidly evolving landscape, where the symbiotic relationship between research and practical applications has never been more interwoven, the contribution of this paper is twofold: advancing the gradient faithfulness of CAM methods and exploring new frontiers beyond it. In the first part, we theorize three novel gradient-based CAM formulations aimed at replacing and superseding traditional Grad-CAM-based methods by tackling the intricate and persistent vanishing- and saturating-gradient problems. Our work accordingly introduces novel enhancements to Grad-CAM that reshape the conventional gradient computation by incorporating a customized technique inspired by the well-established Expected Gradients' difference-from-reference approach. Because our proposed techniques, Expected Grad-CAM, Expected Grad-CAM++, and Guided Expected Grad-CAM, operate directly on the gradient computation rather than on the recombination of the weighting factors, they are designed as a direct and seamless replacement for Grad-CAM and any posterior work built upon it. In the second part, we build on this proposition and devise a novel CAM method that produces both high-resolution and class-discriminative explanations without fusing other methods, while addressing the issues of gradient and CAM methods altogether. Our last and most advanced proposition, Hyper Expected Grad-CAM, challenges the current formulation of visual explanation and faithfulness and produces a new type of hybrid saliency that satisfies the notions of natural encoding and perceived resolution. By rethinking faithfulness and resolution, it becomes possible to generate saliencies that are more detailed, localized, and less noisy, and, most importantly, that are composed only of concepts encoded in the model's layerwise understanding. Both contributions have been quantitatively and qualitatively assessed in a 5-to-10-times larger evaluation study on the ILSVRC2012 dataset against nine of the most recent and best-performing CAM techniques across six metrics. Expected Grad-CAM outperformed not only the original formulation but also more advanced methods, resulting in the second-best explainer with an Ins-Del score of 0.56. Hyper Expected Grad-CAM provided remarkable results across every quantitative metric, yielding a 0.15 increase in insertion over the highest-scoring explainer, PolyCAM, and totaling an Ins-Del score of 0.72.
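
The central technical move described above, replacing Grad-CAM's raw gradient with an expectation of gradients taken over difference-from-reference interpolations in the spirit of Expected Gradients, can be pictured with a short sketch. The PyTorch snippet below is a minimal illustration of that general idea only, not the thesis' actual formulation; the function name expected_grad_cam, the uniform sampling of baselines and interpolation coefficients, the sample count, and the global-average-pooled recombination are all assumptions made for the example.

    import torch
    import torch.nn.functional as F

    def expected_grad_cam(model, target_layer, x, class_idx, references, n_samples=16):
        # Hypothetical sketch: `x` is a (1, C, H, W) input, `references` a (R, C, H, W)
        # batch of baseline images, `target_layer` a conv module inside `model` (eval mode).
        acts, grads = [], []
        fh = target_layer.register_forward_hook(lambda m, i, o: acts.append(o.detach()))
        bh = target_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0].detach()))
        try:
            # Plain forward pass on the real input: its feature maps form the CAM,
            # exactly as in standard Grad-CAM.
            with torch.no_grad():
                model(x)
            feature_maps = acts[-1]                                   # (1, K, u, v)

            # Average gradients at the target layer over random interpolations
            # between sampled references and the input (difference-from-reference).
            for _ in range(n_samples):
                ref = references[torch.randint(len(references), (1,))].to(x.device)
                alpha = torch.rand(1, 1, 1, 1, device=x.device)
                x_interp = (ref + alpha * (x - ref)).requires_grad_(True)
                model.zero_grad(set_to_none=True)
                model(x_interp)[0, class_idx].backward()
            expected_grads = torch.stack(grads).mean(dim=0)           # (1, K, u, v)

            # Grad-CAM-style recombination: pooled gradient weights on the feature maps.
            weights = expected_grads.mean(dim=(2, 3), keepdim=True)   # (1, K, 1, 1)
            cam = F.relu((weights * feature_maps).sum(dim=1, keepdim=True))
            cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
            return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
        finally:
            fh.remove()
            bh.remove()

Because only the gradient term is modified while the recombination step is left untouched, the same averaged gradients could in principle be fed into Grad-CAM++-style or guided variants, which is what would allow such a family of methods to act as a drop-in replacement in Grad-CAM-based pipelines.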

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:hh-51341
Date: January 2023
Creators: Buono, Vincenzo, Åkesson, Isak
Publisher: Högskolan i Halmstad, Akademin för informationsteknologi
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
