1.
<b>IMPROVING MACHINE LEARNING FAIRNESS BY REPAIRING MISLABELED DATA</b> Shashank A Thandri (20161635), 15 November 2024 (has links)
<p dir="ltr">As machine learning (ML) and artificial intelligence (AI) become increasingly prevalent in high-stakes decision making, fairness has emerged as a critical societal issue. Individuals belonging to different groups receive different algorithmic outcomes, largely due to errors and biases inherent in the underlying training data, resulting in violations of group fairness.</p><p dir="ltr">This study investigates the problem of restoring group fairness by detecting mislabeled instances in the training data and flipping their labels. Four solutions are proposed to obtain an ordering in which the labels of training instances should be flipped so as to reduce the bias in the predictions of a model trained on the modified data. Through experimental evaluation, we showcase the effectiveness of repairing mislabeled data with mislabel detection techniques to improve the fairness of machine learning models.</p>
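The abstract does not spell out the four proposed orderings, but the core idea, ranking training instances by how much flipping their label would reduce a group-fairness gap, can be sketched. The following is a minimal, hypothetical illustration (function and variable names are the editor's, not the thesis's): a greedy ordering that flips binary labels so as to shrink the demographic parity gap between two groups, measured directly on the labels.

```python
import numpy as np

def parity_gap(y, g):
    """Absolute gap in positive-label rates between group 0 and group 1."""
    return abs(y[g == 0].mean() - y[g == 1].mean())

def flip_order(y, g):
    """Greedy ordering of label flips: repeatedly flip the single label
    that most reduces the demographic parity gap, until no flip helps.
    Returns the indices to flip, most impactful first."""
    y = y.astype(float).copy()
    order = []
    while True:
        base = parity_gap(y, g)
        best_i, best_gap = None, base
        for i in range(len(y)):
            y[i] = 1 - y[i]              # try flipping label i
            gap = parity_gap(y, g)
            y[i] = 1 - y[i]              # undo the trial flip
            if gap < best_gap - 1e-12:
                best_i, best_gap = i, gap
        if best_i is None:               # no flip improves the gap
            break
        y[best_i] = 1 - y[best_i]        # commit the best flip
        order.append(best_i)
    return order

# Toy data: group 0 has a 0.75 positive rate, group 1 has 0.25.
y = np.array([1, 1, 1, 0, 0, 0, 0, 1])   # binary labels
g = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # protected-group membership
print(parity_gap(y.astype(float), g))     # 0.5
order = flip_order(y, g)
```

In practice the thesis pairs such an ordering with mislabel *detection*, so that only instances likely to be labeled incorrectly are flipped; the sketch above ignores that filter and optimizes the gap alone.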
2.
Perception of biases in machine learning in production research: a structured literature review dissecting bias categories. Götte, Gesa, Antons, Oliver, Herzog, Andreas, Arlinghaus, Julia C., 04 November 2024 (has links)
Factories are evolving into cyber-physical production systems, producing vast volumes of data that can be leveraged with computational power. However, a careless integration of machine learning (ML) can lead to overly simplistic or false pattern extraction, i.e. biased ML applications. Especially when models are trained on big data, this poses a significant risk when deploying ML. Research has shown that sources of undesired bias exist across the entire ML life cycle and the feedback loop between humans, data, and the ML model. Methods to detect, mitigate, and prevent these undesired biases in order to achieve "fair" ML solutions have been developed and collected in toolboxes in recent years.
In this article, we use a structured literature review to address the underappreciated biases in ML for production applications and to highlight the ambiguity of the term bias.
The review emphasizes the need for research on ML biases in production and identifies the most relevant blind spots to date. Filling these blind spots with research and guidelines that incorporate bias screening, treatment, and risk assessment into the ML life cycle of industrial applications promises to enhance their robustness, resilience, and trustworthiness.
3.
Machine Learning Approaches for Speech Forensics. Amit Kumar Singh Yadav (19984650), 31 October 2024 (has links)
<p dir="ltr">Several incidents have reported the misuse of synthetic speech for impersonation attacks, spreading misinformation, and supporting financial fraud. To counter such misuse, this dissertation develops methods for speech forensics. First, we present a method to detect compressed synthetic speech; it uses 33 times less information from the compressed bit stream than existing methods while achieving high performance. Second, we present a transformer neural network that uses a 2D spectral representation of speech signals to detect synthetic speech, performing well on both compressed and uncompressed synthetic speech. Third, we present a method for synthetic speech detection based on an interpretable machine learning approach known as disentangled representation learning. Fourth, we present a method for synthetic speech attribution, which identifies the source of a speech signal. If the speech is spoken by a human, we classify it as authentic/bona fide; if it is synthetic, we identify the generation method used to create it. We examine both closed-set and open-set attribution scenarios: in the closed-set scenario, we evaluate our approach only on the speech generation methods present in the training set, while in the open-set scenario we also evaluate it on methods not present in the training set. Fifth, we propose a multi-domain method for synthetic speech localization. It processes multi-domain features obtained from a transformer with a ResNet-style MLP, and with relatively few parameters it outperforms existing methods. Finally, we present a new direction of research in speech forensics, <i>i.e.</i>, the bias and fairness of synthetic speech detectors. By bias, we refer to a detector unfairly targeting a specific demographic group of individuals and falsely labeling their bona fide speech as synthetic.
We show that existing synthetic speech detectors are biased with respect to gender, age, and accent, and that they are also biased against bona fide speech from people with speech impairments such as stuttering. We propose a set of augmentations that simulate stuttering in speech and show that detectors trained with the proposed augmentations exhibit less bias than detectors trained without them.</p>
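The kind of demographic bias described above shows up as an elevated false-positive rate: bona fide speech from one group is flagged as synthetic more often than from another. As a minimal sketch (not the dissertation's evaluation code; names and data are illustrative), one can disaggregate a detector's false-positive rate by group as follows:

```python
import numpy as np

def per_group_fpr(y_true, y_pred, groups):
    """False-positive rate of a synthetic-speech detector per demographic
    group: the fraction of bona fide samples (y_true == 0) that the
    detector labels synthetic (y_pred == 1), computed for each group."""
    rates = {}
    for gname in np.unique(groups):
        mask = (groups == gname) & (y_true == 0)   # bona fide samples in group
        rates[gname] = float(y_pred[mask].mean()) if mask.any() else float("nan")
    return rates

# Toy example: 0 = bona fide, 1 = synthetic.
y_true = np.array([0, 0, 0, 0, 1, 1])
y_pred = np.array([1, 0, 0, 0, 1, 1])              # hypothetical detector output
groups = np.array(["female", "female", "male", "male", "female", "male"])
rates = per_group_fpr(y_true, y_pred, groups)
# A large gap between the groups' rates would indicate the detector is
# biased against the group with the higher false-positive rate.
```

Large gaps between groups (here, e.g., by gender, age, accent, or presence of a speech impairment) are what the dissertation's bias analysis would surface.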