
A Methodology for Direct and Indirect Discrimination Prevention in Data Mining

Published: 01 July 2013

Abstract

Data mining is an increasingly important technology for extracting useful knowledge hidden in large collections of data. There are, however, negative social perceptions about data mining, among which are potential privacy invasion and potential discrimination. The latter consists of unfairly treating people on the basis of their membership in a specific group. Automated data collection and data mining techniques such as classification rule mining have paved the way for automated decisions, such as loan granting/denial and insurance premium computation. If the training data sets are biased with regard to discriminatory (sensitive) attributes such as gender, race, or religion, discriminatory decisions may ensue. For this reason, antidiscrimination techniques, including discrimination discovery and prevention, have been introduced in data mining. Discrimination can be either direct or indirect. Direct discrimination occurs when decisions are made based on sensitive attributes; indirect discrimination occurs when decisions are made based on nonsensitive attributes that are strongly correlated with biased sensitive ones. In this paper, we tackle discrimination prevention in data mining and propose new techniques applicable to direct or indirect discrimination prevention, individually or both at the same time. We discuss how to clean training data sets and outsourced data sets so that direct and/or indirect discriminatory decision rules are converted to legitimate (nondiscriminatory) classification rules. We also propose new metrics to evaluate the utility of the proposed approaches, and we compare these approaches. The experimental evaluations demonstrate that the proposed techniques are effective at removing direct and/or indirect discrimination biases in the original data set while preserving data quality.
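The abstract's notion of direct discrimination in classification rules can be illustrated with the extended lift (elift) measure used in the discrimination-discovery literature: a rule "A, B → C" (sensitive itemset A, context B, decision C) is flagged as α-discriminatory when its confidence exceeds that of the context-only rule "B → C" by a factor of at least α. The sketch below is illustrative only, not the paper's implementation; the data, threshold, and function names are invented for the example.

```python
# Illustrative sketch: flagging a classification rule as alpha-discriminatory
# via extended lift (elift), a measure from the discrimination-discovery
# literature. elift(A,B -> C) = conf(A,B -> C) / conf(B -> C), where A is a
# sensitive itemset (e.g. gender=female) and B a context itemset.

def confidence(records, antecedent, consequent):
    """conf(antecedent -> consequent) over a list of dict records."""
    matches = [r for r in records
               if all(r.get(k) == v for k, v in antecedent.items())]
    if not matches:
        return 0.0
    hits = [r for r in matches
            if all(r.get(k) == v for k, v in consequent.items())]
    return len(hits) / len(matches)

def elift(records, sensitive, context, consequent):
    """Extended lift of the rule (sensitive AND context) -> consequent."""
    base = confidence(records, context, consequent)  # conf(B -> C)
    if base == 0.0:
        return float("inf")
    # conf(A,B -> C) / conf(B -> C)
    return confidence(records, {**sensitive, **context}, consequent) / base

# Toy data: loan decisions with a sensitive attribute.
data = [
    {"gender": "female", "city": "NYC", "loan": "deny"},
    {"gender": "female", "city": "NYC", "loan": "deny"},
    {"gender": "male",   "city": "NYC", "loan": "grant"},
    {"gender": "male",   "city": "NYC", "loan": "deny"},
]

e = elift(data, {"gender": "female"}, {"city": "NYC"}, {"loan": "deny"})
# conf(female,NYC -> deny) = 1.0; conf(NYC -> deny) = 0.75; elift = 1.33
alpha = 1.2
print(f"elift = {e:.2f}; discriminatory at alpha={alpha}? {e >= alpha}")
```

Prevention methods of the kind the abstract describes would then perturb the training data (e.g. relabel or generalize records) until no rule exceeds the α threshold, while minimizing the measured loss of data quality.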



Published In

IEEE Transactions on Knowledge and Data Engineering  Volume 25, Issue 7
July 2013
241 pages

Publisher

IEEE Educational Activities Department

United States

Author Tags

  1. Antidiscrimination
  2. Data engineering
  3. Data mining
  4. Itemsets
  5. Knowledge engineering
  6. Training
  7. Training data
  8. Direct and indirect discrimination prevention
  9. Privacy
  10. Rule generalization
  11. Rule protection

Qualifiers

  • Research-article


Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 0
  • Downloads (last 6 weeks): 0
Reflects downloads up to 17 Oct 2024


Citations

Cited By

  • (2024) FairGAT: Fairness-Aware Graph Attention Networks. ACM Transactions on Knowledge Discovery from Data 18(7), 1–20. DOI: 10.1145/3645096. Online publication date: 12-Feb-2024.
  • (2024) The Conflict Between Algorithmic Fairness and Non-Discrimination: An Analysis of Fair Automated Hiring. Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, 1907–1916. DOI: 10.1145/3630106.3659015. Online publication date: 3-Jun-2024.
  • (2024) Fairness in Machine Learning: A Survey. ACM Computing Surveys 56(7), 1–38. DOI: 10.1145/3616865. Online publication date: 9-Apr-2024.
  • (2024) FairSample: Training Fair and Accurate Graph Convolutional Neural Networks Efficiently. IEEE Transactions on Knowledge and Data Engineering 36(4), 1537–1551. DOI: 10.1109/TKDE.2023.3306378. Online publication date: 1-Apr-2024.
  • (2023) Bias Mitigation for Machine Learning Classifiers: A Comprehensive Survey. ACM Journal on Responsible Computing 1(2), 1–52. DOI: 10.1145/3631326. Online publication date: 1-Nov-2023.
  • (2023) Not So Fair: The Impact of Presumably Fair Machine Learning Models. Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, 297–311. DOI: 10.1145/3600211.3604699. Online publication date: 8-Aug-2023.
  • (2023) Add-Remove-or-Relabel: Practitioner-Friendly Bias Mitigation via Influential Fairness. Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, 736–752. DOI: 10.1145/3593013.3594039. Online publication date: 12-Jun-2023.
  • (2023) 'Choose your Data Wisely': Active Learning based Selection with Multi-Objective Optimisation for Mitigating Stereotypes. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 3768–3772. DOI: 10.1145/3583780.3615261. Online publication date: 21-Oct-2023.
  • (2023) Should Learning Analytics Models Include Sensitive Attributes? Explaining the Why. IEEE Transactions on Learning Technologies 16(4), 560–572. DOI: 10.1109/TLT.2022.3226474. Online publication date: 1-Aug-2023.
  • (2023) Detection and evaluation of bias-inducing features in machine learning. Empirical Software Engineering 29(1). DOI: 10.1007/s10664-023-10409-5. Online publication date: 13-Dec-2023.
