1 |
FETA: fairness enforced verifying, training, and predicting algorithms for neural networks. Mohammadi, Kiarash, 06 1900.
Algorithmic decision-making driven by neural networks has become very prominent in applications that directly affect people's quality of life. This thesis focuses on the problem of ensuring individual fairness in neural network models during verification, training, and prediction. A popular approach for enforcing fairness is to translate a fairness notion into constraints over the parameters of the model. However, such a translation does not always guarantee fair predictions from the trained neural network model. To address this challenge, we develop a counterexample-guided post-processing technique to provably enforce fairness constraints at prediction time. Contrary to prior work that enforces fairness only on points around test or training data, we are able to enforce and guarantee fairness on all points in the domain. Additionally, we propose a counterexample-guided loss as an in-processing technique that uses fairness as an inductive bias by iteratively incorporating fairness counterexamples into the learning process. We have implemented these techniques in a tool called FETA. Empirical evaluation on real-world datasets indicates that FETA is not only able to guarantee fairness on the fly at prediction time but is also able to train accurate models exhibiting a much higher degree of individual fairness.
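The counterexample-guided loss described above lends itself to a compact illustration. Below is a minimal PyTorch sketch of the general idea rather than FETA itself: for each training point we build a "twin" input that differs only in a stand-in sensitive attribute and penalize the prediction gap between the pair. The toy architecture, the feature layout, the penalty weight `lam`, and the attribute-flipping twin construction are all illustrative assumptions; the thesis obtains its counterexamples from a verification procedure rather than by simply flipping a feature.

```python
import torch
import torch.nn as nn

SENSITIVE_IDX = 0  # stand-in sensitive attribute (assumption for the sketch)

# Toy classifier over 4 features.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))

def counterexample_pairs(x):
    """Build 'twin' inputs that differ only in the sensitive attribute."""
    x_flipped = x.clone()
    x_flipped[:, SENSITIVE_IDX] = 1.0 - x_flipped[:, SENSITIVE_IDX]
    return x_flipped

def fairness_penalty(model, x):
    """Penalize prediction gaps between each point and its sensitive-attribute twin."""
    return (model(x) - model(counterexample_pairs(x))).abs().mean()

# Synthetic batch: random features, binarized group flag, random binary labels.
x = torch.rand(32, 4)
x[:, SENSITIVE_IDX] = (x[:, SENSITIVE_IDX] > 0.5).float()
y = torch.randint(0, 2, (32, 1)).float()

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.BCEWithLogitsLoss()
lam = 1.0  # weight on the fairness term (assumption, not from the thesis)

for _ in range(100):
    optimizer.zero_grad()
    loss = criterion(model(x), y) + lam * fairness_penalty(model, x)
    loss.backward()
    optimizer.step()
```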
|
2 |
Interactive Mitigation of Biases in Machine Learning Models. Kelly M Van Busum (18863677), 03 September 2024.
<p dir="ltr">Bias and fairness issues in artificial intelligence algorithms are major concerns as people do not want to use AI software they cannot trust. This work uses college admissions data as a case study to develop methodology to define and detect bias, and then introduces a new method for interactive bias mitigation.</p><p dir="ltr">Admissions data spanning six years was used to create machine learning-based predictive models to determine whether a given student would be directly admitted into the School of Science under various scenarios at a large urban research university. During this time, submission of standardized test scores as part of a student’s application became optional which led to interesting questions about the impact of standardized test scores on admission decisions. We developed and analyzed predictive models to understand which variables are important in admissions decisions, and how the decision to exclude test scores affects the demographics of the students who are admitted.</p><p dir="ltr">Then, using a variety of bias and fairness metrics, we analyzed these predictive models to detect biases the models may carry with respect to three variables chosen to represent sensitive populations: gender, race, and whether a student was the first in his/her family to attend college. We found that high accuracy rates can mask underlying algorithmic bias towards these sensitive groups.</p><p dir="ltr">Finally, we describe our method for bias mitigation which uses a combination of machine learning and user interaction. Because bias is intrinsically a subjective and context-dependent matter, it requires human input and feedback. Our approach allows the user to iteratively and incrementally adjust bias and fairness metrics to change the training dataset for an AI model to make the model more fair. This interactive bias mitigation approach was then used to successfully decrease the biases in three AI models in the context of undergraduate student admissions.</p>
|
3 |
Adversarial Risks and Stereotype Mitigation at Scale in Generative Models. Jha, Akshita, 07 March 2025.
Generative models have rapidly evolved to produce coherent text, realistic images, and functional code. Yet these remarkable capabilities also expose critical vulnerabilities -- ranging from subtle adversarial attacks to harmful stereotypes -- that pose both technical and societal challenges. This research investigates these challenges across three modalities (code, text, and vision) before focusing on strategies to mitigate biases specifically in generative language models. First, we reveal how programming language (PL) models rely on a 'natural channel' of code, such as human-readable tokens and structure, that adversaries can exploit with minimal perturbations. These attacks expose the fragility of state-of-the-art PL models, highlighting how superficial patterns and hidden assumptions in training data can lead to unanticipated vulnerabilities. Extending this analysis to the textual and visual domains, we show how over-reliance on patterns seen in training data manifests as ingrained biases and harmful stereotypes. To enable more inclusive and globally representative model evaluations, we introduce SeeGULL, a large-scale benchmark of thousands of stereotypes spanning diverse cultures and identity groups worldwide. We also develop ViSAGe, a benchmark for identifying visual stereotypes at scale in text-to-image (T2I) models, illustrating the persistence of stereotypes in generated images even when prompted otherwise. Building on these findings, we propose two complementary approaches to mitigate stereotypical outputs in language models. The first is an explicit method that uses fairness constraints for model pruning, ensuring essential bias-mitigating features remain intact. The second is an implicit bias mitigation framework that makes a crucial distinction between comprehension failures and inherently learned stereotypes. This approach uses instruction tuning on general-purpose datasets and mitigates stereotypes implicitly, without relying on targeted debiasing techniques. Extensive evaluations on state-of-the-art models demonstrate that our methods substantially reduce harmful stereotypes across multiple identity dimensions while preserving downstream performance. / Doctor of Philosophy / AI systems, especially generative models that create text, images, and code, have advanced rapidly. They can write essays, generate realistic pictures, and assist with programming. However, these impressive capabilities also come with vulnerabilities that pose both technical and societal challenges. Some of these models can be subtly manipulated into making errors, while others unknowingly reinforce harmful stereotypes present in their training data. This research examines these challenges across three types of generative models: those that generate code, text, and images. First, we investigate how code-generating models rely on human-readable patterns that attackers can subtly manipulate, revealing hidden weaknesses in even the most advanced models. Extending this analysis to text and image generation, we show how these models often over-rely on patterns from their training data, leading to harmful stereotypes. To systematically study these issues, we introduce two large-scale benchmarks: SeeGULL, a dataset that identifies stereotypes across cultures and identity groups in AI-generated text, and ViSAGe, a dataset that uncovers hidden biases in AI-generated images. Building on these insights, we propose two complementary solutions to reduce biases in generative language models. The first method explicitly removes biased patterns from compressed AI models by introducing filtering techniques that ensure fairness while keeping the model's accuracy intact. The second takes an implicit approach by improving how generative models interpret instructions, making them less likely to generate biased responses in under-informative scenarios. By improving models' general-purpose understanding, this method helps reduce biases without relying on direct debiasing techniques. Our evaluations show that these strategies significantly reduce harmful stereotypes across multiple identity dimensions, making AI systems more fair and reliable while ensuring they remain effective in real-world applications.
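The "explicit" pruning approach mentioned above can be illustrated with a generic saliency-based sketch. The code below is an assumption-laden toy, not the thesis's method: it scores each weight by its gradient-based importance for both a task loss and a crude counterfactual "bias" loss, then prunes only weights that matter for neither, so that bias-mitigating capacity is not removed along with redundant parameters. The tiny feed-forward model, the flipped-feature counterfactual, and the 50% sparsity target are placeholders for illustration only.

```python
import torch
import torch.nn as nn

# Toy stand-in model; in the thesis setting this would be a large language model.
model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
params = list(model.parameters())

x = torch.randn(64, 8)
y = torch.randint(0, 2, (64,))
x_cf = x.clone()
x_cf[:, 0] *= -1.0  # crude counterfactual: flip one stand-in sensitive feature (assumption)

task_loss = nn.functional.cross_entropy(model(x), y)
# "Bias" loss: divergence between predictions on inputs and their counterfactuals.
bias_loss = (model(x).softmax(-1) - model(x_cf).softmax(-1)).abs().mean()

def importance(loss):
    """Per-weight saliency |w * dL/dw| for a given loss."""
    grads = torch.autograd.grad(loss, params, retain_graph=True)
    return [(p * g).abs() for p, g in zip(params, grads)]

task_imp = importance(task_loss)
bias_imp = importance(bias_loss)

sparsity = 0.5  # fraction of weights to remove (illustrative)
with torch.no_grad():
    for p, ti, bi in zip(params, task_imp, bias_imp):
        combined = ti + bi  # keep weights that matter for either objective
        k = int(sparsity * combined.numel())
        if k == 0:
            continue
        threshold = combined.flatten().kthvalue(k).values
        p.masked_fill_(combined <= threshold, 0.0)  # zero out the least important weights
```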
|
4 |
From whistleblowing tools to AI-supported data analysis: A compliance practitioner's view on IT-tools for different aspects of investigations. Endres, Markus, 28 November 2023.
The text discusses the evolving digital workplace, emphasizing the rise of cybercrime and the need for innovative investigative approaches. It explores the surge in web-based whistleblowing tools in Europe, driven by legislation, and delves into the functionalities and challenges of these tools, including issues of anonymity and data protection. The paper also highlights the role of AI-based forensic tools in government agencies, covering their benefits and potential risks. The use of AI in law enforcement is explored, acknowledging its effectiveness but also cautioning against biases and associated risks. The conclusion stresses the importance of balancing opportunities and risks, particularly in the context of legal and ethical considerations.
|
5 |
INVESTIGATING DATA ACQUISITION TO IMPROVE FAIRNESS OF MACHINE LEARNING MODELS. Ekta (18406989), 23 April 2024.
<p dir="ltr">Machine learning (ML) algorithms are increasingly being used in a variety of applications and are heavily relied upon to make decisions that impact people’s lives. ML models are often praised for their precision, yet they can discriminate against certain groups due to biased data. These biases, rooted in historical inequities, pose significant challenges in developing fair and unbiased models. Central to addressing this issue is the mitigation of biases inherent in the training data, as their presence can yield unfair and unjust outcomes when models are deployed in real-world scenarios. This study investigates the efficacy of data acquisition, i.e., one of the stages of data preparation, akin to the pre-processing bias mitigation technique. Through experimental evaluation, we showcase the effectiveness of data acquisition, where the data is acquired using data valuation techniques to enhance the fairness of machine learning models.</p>
|
6 |
Benchmarking bias mitigation algorithms in representation learning through fairness metrics. Reddy, Charan, 07 1900.
The rapid use and success of deep learning models in various application domains have raised significant challenges about the fairness of these models when used in the real world. Recent research has shown the biases incorporated within representation learning algorithms, raising doubts about the dependability of such decision-making systems. As a result, there is growing interest in identifying the sources of bias in learning algorithms and in developing bias-mitigation techniques. Bias-mitigation algorithms aim to reduce the impact of sensitive data aspects on eligibility decisions. Sensitive features are private and protected features of a dataset, such as a person's gender or race, that should not influence output eligibility decisions, i.e., the criteria that determine whether or not an individual is qualified for a particular activity, such as lending or hiring. Bias-mitigation models are designed to make eligibility decisions on dataset samples without bias toward sensitive input data properties. The dataset distribution, which is a function of the potential label and feature imbalance, the correlation of potentially sensitive features with other features in the data, the distribution shift from the training to the development phase, and other factors, determines the difficulty of bias-mitigation tasks. Without evaluating bias-mitigation models in various challenging setups, the merits of deep learning approaches to these tasks remain unclear. As a result, a systematic analysis is required that compares different bias-mitigation procedures under various fairness criteria, so that the reported results can be replicated. To this end, this thesis offers a unified framework for comparing bias-mitigation methods. To better understand how these methods work, we compare alternative fairness algorithms trained with deep neural networks on a common synthetic dataset and a real-world dataset. We train around 3000 distinct models in various setups, including imbalanced and correlated data configurations, to probe the limits of current models and better understand which setups are prone to failure. Our findings show that model bias increases as datasets become more imbalanced or dataset attributes become more correlated, that the level of dominance of correlated sensitive dataset features influences bias, and that sensitive information remains in the latent representation even after bias-mitigation algorithms are applied. In summary, we present a dataset, propose multiple challenging assessment scenarios, rigorously analyse recent promising bias-mitigation techniques in a common framework, and openly release this benchmark as an entry point for fair deep learning.
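For a flavour of how such a benchmark stresses fairness metrics, the sketch below generates synthetic data with a tunable correlation between the sensitive attribute and an informative feature and a tunable positive-label rate, then reports demographic-parity and equal-opportunity gaps for a plain classifier across the grid of configurations. It is an illustration of the evaluation loop only: the data generator, the two metrics, and the logistic-regression baseline are assumptions, not the thesis's benchmark or its roughly 3000 deep models.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

def make_dataset(n, corr, pos_rate):
    """Synthetic data with tunable correlation between the sensitive attribute s and
    an informative feature, plus tunable label imbalance (illustrative stand-in)."""
    s = rng.integers(0, 2, size=n)
    x1 = corr * s + (1.0 - corr) * rng.normal(size=n)  # feature correlated with s
    x2 = rng.normal(size=n)
    score = x1 + x2 + rng.normal(scale=0.5, size=n)
    y = (score > np.quantile(score, 1.0 - pos_rate)).astype(int)  # controls positive rate
    return np.column_stack([x1, x2]), s, y

def fairness_report(clf, X, s, y):
    pred = clf.predict(X)
    dp = abs(pred[s == 1].mean() - pred[s == 0].mean())  # demographic-parity gap
    tpr0 = pred[(s == 0) & (y == 1)].mean()              # per-group true positive rates
    tpr1 = pred[(s == 1) & (y == 1)].mean()
    return dp, abs(tpr1 - tpr0)                          # equal-opportunity gap

for corr in (0.0, 0.4, 0.8):
    for pos_rate in (0.5, 0.2, 0.05):
        X, s, y = make_dataset(5000, corr, pos_rate)
        clf = LogisticRegression(max_iter=1000).fit(X, y)
        dp, eo = fairness_report(clf, X, s, y)
        print(f"corr={corr:.1f} pos_rate={pos_rate:.2f}  DP gap={dp:.3f}  EO gap={eo:.3f}")
```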
|