Data-based Explanations of Random Forest using Machine Unlearning

Tree-based machine learning models, such as decision trees and random forests, are among the most widely used machine learning models, primarily because of their predictive power in supervised learning tasks and their ease of interpretation. Despite their popularity and power, these models have been found to produce unexpected or discriminatory behavior. Given their overwhelming success on most tasks, it is of interest to identify the root causes of such unexpected and discriminatory behavior; however, little work has addressed understanding and debugging tree-based classifiers in the context of fairness. We introduce FairDebugger, a system that uses recent advances in machine unlearning research to determine the training data subsets responsible for model unfairness. Given a tree-based model learned on a training dataset, FairDebugger identifies the top-k training data subsets responsible for model unfairness, or bias, by measuring the change in model parameters when parts of the underlying training data are removed. We describe the architecture of FairDebugger and walk through real-world use cases to demonstrate how it detects these patterns and their explanations.
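The abstract's core idea, scoring training-data subsets by how much model unfairness drops when they are removed, can be illustrated with a minimal sketch. Note the thesis uses machine unlearning to avoid retraining from scratch; the sketch below substitutes a naive full refit for clarity, and the fairness metric (statistical parity difference), the helper names, and the candidate-subset masks are all illustrative assumptions, not FairDebugger's actual API.

```python
# Minimal sketch of subset-responsibility scoring, assuming:
#   - a scikit-learn RandomForestClassifier as the tree-based model,
#   - statistical parity difference as the unfairness measure,
#   - candidate subsets given as boolean row masks over the training set.
# FairDebugger itself uses machine unlearning instead of the naive
# refit performed here.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def parity_difference(model, X, sensitive):
    """Statistical parity difference: P(yhat=1 | s=1) - P(yhat=1 | s=0)."""
    preds = model.predict(X)
    return preds[sensitive == 1].mean() - preds[sensitive == 0].mean()

def score_subsets(X, y, candidate_masks, X_test, s_test, k=5):
    """Rank candidate subsets by the drop in |parity difference|
    observed when each subset is removed and the model is refit."""
    base = RandomForestClassifier(random_state=0).fit(X, y)
    base_bias = abs(parity_difference(base, X_test, s_test))
    scores = []
    for mask in candidate_masks:      # mask marks the rows to remove
        keep = ~mask
        model = RandomForestClassifier(random_state=0).fit(X[keep], y[keep])
        bias = abs(parity_difference(model, X_test, s_test))
        scores.append(base_bias - bias)  # > 0: subset contributed to bias
    top_k = np.argsort(scores)[::-1][:k]  # most responsible subsets first
    return top_k, scores
```

Retraining once per candidate subset is what makes the naive version expensive; the appeal of the unlearning-based approach described in the abstract is that it estimates the effect of removing a subset from the model's parameters without paying that retraining cost.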

DOI: 10.25394/pgs.24712758.v1
Identifier: oai:union.ndltd.org:purdue.edu/oai:figshare.com:article/24712758
Date: 03 December 2023
Creators: Tanmay Laxman Surve (17537112)
Source Sets: Purdue University
Detected Language: English
Type: Text, Thesis
Rights: CC BY 4.0
Relation: https://figshare.com/articles/thesis/Data-based_Explanations_of_Random_Forest_using_Machine_Unlearning/24712758