Global ETD Search

1	Reliable graph predictions : Conformal prediction for Graph Neural Networks
Bååw, Albin, January 2022
Deep learning algorithms have developed rapidly in recent decades. However, while these algorithms have opened new business areas and driven major advances in many fields, they are usually limited to Euclidean data. Researchers are increasingly finding that the data used in many real-life applications are better represented as graphs. Examples include high-risk domains such as drug development, where the side effects of combining medicines are predicted using a protein-protein network. In high-risk domains, there is a need for trust and transparency in the results returned by deep learning algorithms. In this work, we explore how to quantify uncertainty in Graph Neural Network predictions using conventional methods for conformal prediction as well as novel methods that exploit graph connectivity information. We evaluate the methods on both static and dynamic graphs and find that neither of the novel methods offers any clear benefit over the conventional methods. However, we see indications that using the graph connectivity information can lead to more efficient conformal predictors and lower prediction latency than the conventional methods on large data sets. We propose that future work extend the research on using the connectivity information, specifically the node embeddings, to boost the performance of conformal predictors on graphs.
Conformal prediction; Graph Neural Networks; Dynamic graphs; Distribution shift; Coverage gap; Computer and Information Sciences
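For readers unfamiliar with the conventional baseline this abstract mentions, the following is a minimal sketch of split (inductive) conformal prediction for classification. It is a generic textbook formulation, not the thesis's implementation, and it does not include the graph-aware variants, which the abstract does not specify. The function names and the choice of `1 - p(label)` as the nonconformity score are assumptions made for illustration.

```python
import math


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected quantile of the calibration scores.

    Uses the 1-indexed rank ceil((n + 1) * (1 - alpha)); if that rank
    exceeds n, the threshold is infinite and every label is kept.
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return float("inf")
    return sorted(scores)[rank - 1]


def calibrate(cal_probs, cal_labels, alpha):
    """Compute the score threshold from a held-out calibration set.

    cal_probs: list of per-class probability lists (hypothetical model output).
    cal_labels: list of true class indices.
    """
    # Nonconformity score (an assumed, common choice): 1 - p(true label).
    scores = [1.0 - probs[y] for probs, y in zip(cal_probs, cal_labels)]
    return conformal_quantile(scores, alpha)


def prediction_set(probs, threshold):
    """Return every label whose score 1 - p(label) is within the threshold."""
    return [label for label, p in enumerate(probs) if 1.0 - p <= threshold]
```

Under the usual exchangeability assumption, sets built this way contain the true label with probability at least `1 - alpha`; smaller sets at the same coverage are what the abstract refers to as a more efficient predictor.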
2	Performative prediction : expanding theoretical horizons
Mofakhami, Mehrnaz, 07 1900
This thesis addresses some of the limitations of the performative prediction framework, which involves learning models that influence the data they are meant to predict. I provide solutions that push the boundaries of this framework, exploring and identifying new domains to which it can be extended. The thesis is structured into three chapters, as described below.
Chapter 1 offers a comprehensive background on the performative prediction framework, including a detailed overview of the preliminary notation (Section 1.1) and the concepts necessary for understanding the framework, namely the solution concepts (Section 1.2) and the Repeated Risk Minimization (RRM) algorithm (Section 1.3). The notation in this chapter is taken from the original performative prediction paper to ensure a foundational understanding. Furthermore, Section 1.4 introduces the relationship between performative prediction and variational inequalities, which is discussed further in Chapter 3.
Chapter 2 presents the main contribution of this thesis, analyzing the performative prediction framework in the presence of neural networks with non-convex loss functions. The focus is on finding classifiers that are performatively stable, meaning they are optimal for the data distribution they induce. This chapter introduces new assumptions and significantly stronger convergence guarantees for the RRM method (Section 2.3). These guarantees are the first to demonstrate the applicability of RRM to neural networks, which are challenging to analyze due to their non-convexity. As an illustration, we introduce a resampling procedure that models realistic distribution shifts and show that it satisfies our assumptions (Section 2.4). We support our theory by showing that one can learn performatively stable classifiers with neural networks making predictions about real data that shift according to our proposed procedure (Section 2.5). This work represents a critical step towards bridging the gap between theoretical performative prediction and practical applications.
Chapter 3 concludes the thesis by summarizing the key findings and contributions and outlining future research directions. Notably, it explores leveraging variational inequalities to address and overcome a significant limitation in prior work that governs the strength of performative effects. This research aims to extend the analysis to scenarios with large performative effects and broaden the framework's applicability, paving the way for more comprehensive solutions in performative prediction.
Supervised learning; Performative prediction; Distribution shift; Strategic classification; Neural networks; Variational inequalities
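To make the idea of Repeated Risk Minimization and performative stability concrete, here is a minimal toy sketch. It is not the thesis's setting: it uses a scalar parameter with a convex squared loss instead of a neural network, and the distribution map (`sample_induced_data`, with its `base_mean` and `sensitivity` parameters) is a hypothetical stand-in chosen only so that the stable point can be computed by hand.

```python
import random


def sample_induced_data(theta, n, rng, base_mean=1.0, sensitivity=0.5):
    """Hypothetical distribution map D(theta).

    Deploying theta shifts the data mean to base_mean + sensitivity * theta.
    A sensitivity below 1 plays the role of a weak performative effect.
    """
    return [rng.gauss(base_mean + sensitivity * theta, 1.0) for _ in range(n)]


def minimize_squared_risk(data):
    """The minimizer of the empirical risk sum((z - theta)^2) is the sample mean."""
    return sum(data) / len(data)


def repeated_risk_minimization(theta0=0.0, rounds=30, n=10_000, seed=0):
    """Alternate between deploying theta and retraining on the data it induces."""
    rng = random.Random(seed)
    theta = theta0
    history = [theta]
    for _ in range(rounds):
        data = sample_induced_data(theta, n, rng)  # observe D(theta_t)
        theta = minimize_squared_risk(data)        # theta_{t+1} = argmin risk on D(theta_t)
        history.append(theta)
    return theta, history
```

In this toy model a performatively stable point satisfies `theta = base_mean + sensitivity * theta`, that is `theta = 2.0` for the default values, and the iterates settle near that value up to sampling noise. With a sensitivity of 1 or more the same loop would not contract, which mirrors the abstract's remark that prior analyses depend on the strength of performative effects.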
3	Toward trustworthy deep learning : out-of-distribution generalization and few-shot learning
Gagnon-Audet, Jean-Christophe, 04 1900
Artificial Intelligence (AI) is a rapidly advancing field, with data-driven approaches known as machine learning at the forefront of many recent breakthroughs. However, while machine learning models have shown remarkable performance in tasks such as image recognition and generation, text generation and translation, and speech processing, they are known to fail silently under common conditions. This is because modern AI algorithms inherit biases from the data used to train them, leading to incorrect predictions when they encounter new data that differ from the training data. This problem is known as distribution shift or out-of-distribution (OOD) failure. It makes modern AI untrustworthy and is a significant barrier to the safe, widespread deployment of AI. Failing to address the OOD generalization failure of machine learning could result in situations that put lives in danger or make it impossible to deploy AI in any significant manner. This thesis aims to tackle this issue and proposes solutions to ensure the safe and reliable deployment of modern deep learning models. We present three papers that cover different directions in solving the OOD generalization failure of machine learning. The first paper proposes a direct approach that demonstrates improved performance over the state of the art. The second paper lays the groundwork for future research on OOD generalization in time series, while the third paper provides a straightforward solution for fixing generalization failures of large pretrained models when they are fine-tuned on downstream tasks. These papers make valuable contributions to the field and provide promising avenues for future research in OOD generalization.
machine learning; deep learning; neural networks; representation learning; domain generalization; distribution shift; out-of-distribution generalization; foundation models; few-shot learning

