Abstract
Machine learning fairness enhancement methods based on data bias correction usually consist of two processes: the determination of sensitive attributes (such as race and gender) and the correction of data bias. In determining sensitive attributes, existing studies tend to rely too heavily on sociological knowledge and neglect the importance of exploring potential sensitive attributes directly from the data itself. The accuracy of this approach is limited when dealing with data that cannot be fully explained by sociological factors. Regarding data bias correction, existing methods are primarily categorized into causality-based and association-based methods. The former requires a deep understanding of the underlying causal structure of the dataset, which is often difficult to obtain in practice. The latter relates sensitive attributes to algorithmic results through statistical measures, but tends to ignore the impact of sensitive attributes on other attributes. In this paper, we formalize the identification of sensitive attributes as a problem solvable through data analysis, without relying on commonly recognized knowledge in social science. We also propose a data pre-processing method that accounts for the effects of attributes correlated with sensitive attributes, enhancing algorithmic fairness within the association-based bias reduction framework. We evaluated our proposed method on public datasets. The evaluation results indicate that our method can accurately identify sensitive attributes and improve the fairness of machine learning algorithms compared with existing methods.
Introduction
The application of machine learning algorithms has brought significant progress to various public affairs, such as finance, anti-terrorism, taxation, justice, medical care, and insurance, directly impacting the well-being of citizens. In recent years, however, issues of unfairness and discrimination have been caused by widely applied machine learning algorithms in areas such as credit scoring (Khandani et al., 2010), crime prediction (Brennan et al., 2009), and loan evaluation (Mahoney & Mohen, 2007). As a result, the ethics of algorithms, especially the fairness of machine learning algorithms, has gained considerable attention from the public and the government (Kearns & Roth, 2019).
The problem of algorithmic fairness may exacerbate the bias against groups that have historically been discriminated against. For example, in 2014, a team at Amazon developed an automated hiring system to screen the resumes of job applicants. According to Reuters (Dastin, 2018), the hiring system was trained on 10 years of Amazon’s hiring data and gave each job applicant a score from 1 to 5. However, in 2015, the team realized that the system showed a significant gender bias, favoring male candidates over female candidates, due to historical discrimination (bias) in the training data. Although Amazon modified the system to hide gender attributes, there was no guarantee that biases did not persist in other ways, and the project was abandoned entirely in 2017. Similar examples include gender bias in online advertising and in Google image search results for occupations (Kay et al., 2015).
Based on the above examples, it is clear that the analytical judgments supported by machine learning systems may influence decision makers. The discrimination exhibited by these machine learning systems is caused by bias in the training data, and this discrimination is reinforced and legitimized by the increasing deployment of machine learning algorithms. How to avoid perpetuating and amplifying discrimination through machine learning systems has become a critical issue in algorithmic fairness. Methods of discrimination reduction for machine learning systems are divided into three categories: pre-processing methods (Feldman et al., 2015; Kamishima et al., 2012a), in-processing methods (Kilbertus et al., 2017; Russell et al., 2017), and post-processing methods (Hardt et al., 2016; Woodworth et al., 2017), as shown in Figure 1. Pre-processing methods apply certain operations, such as re-weighting or re-labeling, to remove bias from the training dataset; in-processing methods most commonly add tuning constraints to the training model, such as adjusting the hyperparameters of the classifier; post-processing methods remove bias by adjusting the results of prediction and classification.

Figure 1. Strategies for algorithmic fairness.
Data bias correction algorithms, also known as algorithmic fairness pre-processing methods, consist of two main processes: determining sensitive attributes and correcting the biases caused by them (Yan & Kao, 2020). Sensitive attributes are those attributes that may be associated with sensitive characteristics or groups, such as age, gender, and race. These attributes may affect the results of the algorithm and lead to unfair outcomes, and therefore require special treatment. After the sensitive attributes are found, the fairness of the algorithm can be improved by adjusting its input, output, or processing so that it does not discriminate on the basis of these attributes. Regarding the identification of sensitive attributes, we find that many current studies resort to social science methods, such as seeking expert advice, conducting interviews, and surveys (Mehrabi et al., 2021), rather than exploring the dataset itself as the first step to reveal data biases and discover possible sensitive attributes.
Once the sensitive attributes are identified, the next step is bias reduction in the dataset. There are two main families of bias reduction algorithms: causality based methods (Galhotra et al., 2017; Nabi & Shpitser, 2018) and association based methods (Calders & Verwer, 2010; Dwork et al., 2012). Causality based methods require expert knowledge of the underlying causal structure of the dataset, which makes them impractical to apply across domains without such knowledge. Association based methods apply heuristic restrictions in the bias reduction process, without considering the influence of attributes correlated with the sensitive attributes. When performing bias reduction on sensitive attributes, two different strategies can be applied. One is the horizontal approach (Salimi et al., 2019), which operates on the tuples of the dataset; the other is the vertical approach (Salazar et al., 2021), which operates on the attributes of the dataset. The horizontal approach can be considered invasive because it changes the distribution of the dataset. In practice, the common vertical approach is to remove the identified sensitive features directly (Grgic-Hlaca et al., 2016), which ensures fairness without tampering with the dataset. However, multiple attributes are usually correlated with the identified sensitive features. If we do not consider the impact of these indirect sensitive attributes and remove their effects on fairness, the discrimination reduction operation cannot achieve the expected effectiveness: the discrimination problem of the machine learning system will remain, and the problems of discrimination detection and processing will become more complex.
In summary, finding an approach that optimizes the original dataset while maintaining both the accuracy and the fairness of machine learning algorithms is a challenge. In order to reduce the discrimination of machine learning algorithms at the root and increase their fairness, in this paper we combine the method of principal component analysis to determine the sensitive attributes in the dataset, adopt the pre-processing approach to algorithmic fairness, and optimize the correlation-based bias reduction algorithm to improve fairness. Overall, the contributions of this work are as follows.
We demonstrate that sensitive attributes can be identified by prior analysis of the dataset combined with a principal component analysis style algorithm, without directly using sociological methods. We propose an algorithm that combines feature deletion with the removal of correlations between features to improve fairness, and we measure its performance in terms of both fairness and accuracy. We conducted a series of experiments on well-known datasets to demonstrate the steps of our algorithm, including sensitive attribute determination, fairness enhancement, and performance evaluation.
Related Work
In this section, we review relevant work on fairness definitions, the three classes of bias reduction algorithms, and the problems in existing works. In the following sections, we address the shortcomings of the existing methods to realize our approach.
Fairness Definitions and Research
Fair machine learning algorithms need to consider two closely related aspects: first, how fairness is defined in a given social scenario, and second, the level of social acceptability. If sensitive attributes are not used by a machine learning algorithm for classification and prediction, the algorithm satisfies fairness through unawareness (FTU). Individual fairness, on the other hand, was proposed by Dwork et al. (2012) and is achieved when the algorithm predicts the same outcome for similar individuals; in other words, if two individuals are similar according to a certain metric, their predictions should also be similar (Joseph et al., 2016; Zemel et al., 2013). Kim et al. improved on this concept by introducing preference-informed individual fairness, which allows some deviation from individual fairness to accommodate personal preferences and provide more favorable solutions for individuals.
In legal contexts, the fairness of decision-making processes is typically evaluated against two main criteria: disparate treatment and disparate impact. These definitions have inspired various researchers to explore ways to promote fairness in decision-making. For instance, Zafar et al. (2017) investigated how to remove sensitive attributes from decision-making to avoid disparate treatment, and how to add fairness constraints to eliminate disparate impact. They introduced a covariance measure to transform non-convex problems into convex ones, and examined multi-class settings and the analysis of multiple sensitive attributes. Beretta et al. (2019) combined different democratic ideals with the concept of fairness to propose fairness evaluation criteria suitable for different democratic backgrounds, suggesting that counterfactual fairness, unconscious fairness, and group conditional fairness are more suitable for competitive democracy, individual fairness for liberal democracy, and preference-based fairness for egalitarian democracy.
Salimi et al. (2019) introduced a user-centric approach to feature classification by allowing users to categorize features as sensitive, acceptable, or unacceptable. Acceptable features are those that the user allows to influence the classifier’s predictions, while unacceptable features are those that may introduce biases based on sensitive attributes. They also proposed the Capuchin (CA) system, which repairs data that does not conform to the user’s feature classification by adding or removing tuples. The system gives users greater control over the fairness of the model by letting them specify which features are considered sensitive and ensuring that the model is not influenced by them; it also reduces the impact of biases by repairing data that may leak sensitive attribute information.
Bias Reduction Algorithm
To address the problem of algorithmic discrimination, different bias reduction strategies have been proposed; they are categorized into pre-processing methods, in-processing methods, and post-processing methods.
Pre-processing Methods
The unfairness present in the training data is learned by the algorithm; pre-processing fairness can be obtained by making the training algorithm incapable of learning that bias. Such methods fall into two types: first, changing the values of sensitive attributes or the class labels of individual items in the training data; second, mapping the training data into a transformed space where the dependency between sensitive attributes and class labels disappears. Feldman et al. (2015) modified each attribute so that the marginal distributions conditioned on a given subset of sensitive attributes are all equal, without affecting other variables; the transformed data retains most of the feature signal of the non-sensitive attributes, and crossed sensitive attributes are also handled so that the effects of two sensitive attributes are not superimposed. Other approaches include handling binary sensitive attributes in binary classification problems, improvements in pre-processing techniques, suppressing sensitive attributes, adapting the dataset by changing the class labels, and re-weighting or re-sampling the data to eliminate discrimination without re-labeling the instances (Calders, 2012). Calmon et al. proposed a convex optimization for learning data transformations with the objectives of controlling discrimination, limiting the distortion of individual data samples, and preserving utility.
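For illustration, the following is a minimal sketch of the re-weighting idea mentioned above; the DataFrame and column names are assumptions for the example, not the formulation of any particular cited method.

```python
# A minimal re-weighting sketch: give each instance a weight so that the
# sensitive attribute S and the class label Y become statistically
# independent in the weighted data. `df`, `s`, and `y` are illustrative.
import pandas as pd

def reweigh(df: pd.DataFrame, s: str, y: str) -> pd.Series:
    n = len(df)
    p_s = df[s].value_counts(normalize=True)   # P(S = s)
    p_y = df[y].value_counts(normalize=True)   # P(Y = y)
    p_sy = df.groupby([s, y]).size() / n       # P(S = s, Y = y)
    # Weight = probability expected under independence / observed probability.
    return df.apply(lambda r: p_s[r[s]] * p_y[r[y]] / p_sy[(r[s], r[y])], axis=1)

# Usage: clf.fit(X, y_train, sample_weight=reweigh(train, "sex", "income"))
```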
In-processing Methods
The most common modification of a specific machine learning algorithm is to attach constraints to it. Kusner et al. (2017) introduced causal modeling and gave three ways to achieve fairness in different classes of algorithms: (1) modeling with attributes that are neither directly nor indirectly related to the sensitive attributes; (2) modeling with latent variables, which are non-deterministic elements of observable variables; and (3) modeling through deterministic models with latent variables (e.g., additive error models). Grgic-Hlaca et al. (2019) improved logistic regression and support vector machine algorithms under different misclassification rates, assuming an absence of bias in the historical information; this provides a flexible trade-off between fairness and accuracy and works well when sensitive attribute information is not available. Zemel et al. (2013) combined pre-processing and algorithm modification to learn a canonical data representation that achieves efficient classification while remaining independent of sensitive attribute values. Kearns et al. (2017) combined ex ante and ex post fairness: given a set of individual scores, they used the cumulative distribution functions of the candidates’ empirical values to provide confidence intervals, assigned scores to the candidates with the resulting bias bounds, and ran the NoisyTop algorithm to provide approximate fairness. Kamishima et al. (2012a) introduced a fairness-centered regularization term and applied it to a logistic regression classifier. Calders and Verwer (2010) constructed a separate model for each value of a sensitive attribute, selected the appropriate model based on the corresponding input attribute values, and evaluated the fairness of the iteratively combined model under the CV metric. Bose and Hamilton (2019) addressed the inability of existing graph embedding algorithms to handle fairness constraints by introducing an adversarial framework that imposes fairness constraints on graph embeddings, removing sensitive information while ensuring that the learned representations are uncorrelated with the sensitive attributes.
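As a hedged sketch of this family of methods, the snippet below adds a fairness penalty to a logistic loss, here the covariance between the sensitive attribute and the decision scores, in the spirit of the covariance measure of Zafar et al. (2017); the penalty weight `lam` and all names are illustrative, not the exact formulation of any cited method.

```python
import numpy as np
from scipy.optimize import minimize

def fair_logreg(X, y, s, lam=1.0):
    """X: (n, d) features; y: labels in {0, 1}; s: binary sensitive attribute."""
    def objective(w):
        scores = X @ w
        # Numerically stable logistic loss: log(1 + exp(-margin)).
        log_loss = np.mean(np.logaddexp(0.0, -(2 * y - 1) * scores))
        cov = np.mean((s - s.mean()) * scores)   # cov(S, w^T x)
        return log_loss + lam * abs(cov)         # fairness regularization term
    w0 = np.zeros(X.shape[1])
    return minimize(objective, w0, method="Nelder-Mead").x
```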
Post-processing Methods
Hardt et al. (2016) post-process the probability estimates of unfair categories given the sensitive attributes by learning different decision thresholds for different sensitive attribute values and applying these specific thresholds at decision time. Kamishima et al. (2012b) satisfy fairness constraints by modifying the leaf labels of a decision tree after training. Woodworth et al. (2017) study learning non-discriminatory predictors using first-order moments from statistical and computational learning theory, and propose a statistically optimal procedure based on second-order moments, relaxing the non-discrimination requirement to second-order moments so that the predictor becomes easier to learn.
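A minimal sketch of the threshold-based post-processing idea follows, assuming real-valued scores and a binary sensitive attribute; the target rate and names are illustrative, not the exact procedure of Hardt et al. (2016).

```python
import numpy as np

def group_thresholds(scores, y, s, target_tpr=0.8):
    """Pick one decision threshold per sensitive group so that each group's
    true positive rate is approximately `target_tpr`."""
    thresholds = {}
    for g in np.unique(s):
        pos = np.sort(scores[(s == g) & (y == 1)])   # scores of true positives
        k = int((1 - target_tpr) * len(pos))         # share allowed below threshold
        thresholds[g] = pos[min(k, len(pos) - 1)]
    return thresholds

# Decision rule: y_hat_i = (scores_i >= thresholds[s_i])
```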
Problems in Existing Works
Although the existing methods can improve the fairness of algorithms to some extent, some problems remain. There are shortcomings in the definitions of fairness: for example, the unawareness notion discussed by Zafar et al. overemphasizes the constraint on sensitive attributes while ignoring the proxy attributes highly correlated with them, so the resulting model cannot improve fairness well; the individual fairness proposed by Dwork et al. cannot properly quantify the gap between individuals. The most advanced bias reduction algorithm, CA, breaks the causal chain of these attributes by adding and removing tuples; however, this horizontal approach can be considered invasive because it alters the data distribution. The vertical alternative is to remove the sensitive features completely, which ensures fairness without tampering with the data but may compromise the accuracy of machine learning.
Problem Definition
In this section, we introduce several definitions concerning algorithmic fairness, sensitive attributes, and indirect sensitive attributes. The symbols used in the definitions are listed in Table 1.
Table 1. Symbols Used in the Paper.
In contrast to previous works, this paper does not directly use the experience of social science experts to identify sensitive attributes. First, we use data analysis methods, analyzing and comparing the attributes in the dataset, to identify candidate sensitive attributes. Next, we use principal component analysis to validate the sensitive attributes we identified. Finally, to demonstrate the credibility of this method, we compare the identified sensitive attributes with those identified using social science methods. The specific process is shown in Figure 2.

Figure 2. Sensitive attributes identification.
In previous studies, many association based methods only considered the impact of sensitive attributes on the prediction results, while ignoring the attributes correlated with the sensitive attributes. Given a dataset and one of its sensitive attributes, we define the attributes that are highly correlated with that sensitive attribute as indirect sensitive attributes.
We used the pre-processing approach to algorithmic fairness to improve the fairness of machine learning models. Our aim is to mitigate discrimination in machine learning models by reducing bias in the dataset. In practice, we use association based methods among the data bias reduction algorithms to reduce the bias of the dataset. The fairness definitions used for evaluation are introduced in the comparison experiments below.
Experiments
We conducted experiments on two datasets commonly used in algorithmic fairness research; we introduce the two datasets below. The experimental procedure is as follows.
Dataset Introduction
Adult Dataset
The Adult dataset (Dua et al., 2017) contains information from the 1994 U.S. Census. The prediction task is to determine whether a person’s annual income exceeds 50 k dollars. In the following sections, we use the Adult dataset to demonstrate how to identify sensitive attributes and how to improve the fairness of the prediction model.
Correctional Offender Management Profiling for Alternative Sanctions Dataset
Correctional offender management profiling for alternative sanctions (COMPAS) (Larson et al., 2016) is a commonly used crime risk assessment tool that is widely applied in the U.S. criminal justice system. The COMPAS dataset is the dataset associated with this tool for training and evaluating crime risk assessment algorithms. It is often used for crime-related prediction tasks, such as whether an offender will reoffend within two years, whether an offender will commit a violent crime again within two years, and whether a defendant will fail to appear in court. Afterwards, we use the COMPAS dataset to validate our method again.
Sensitive Attributes Identification
Data Exploration and Analyzing
We first perform data exploration on the Adult dataset, which is widely used to predict whether annual income exceeds 50 k dollars; accordingly, we use the income attribute as the target attribute. For each attribute, we examine the probability distribution of the high-income group across its different values, as shown in Table 2.
Table 2. The Probability Distribution of the High-income Group in Different Values.

Figure 3. Group attributes.
Taking the race attribute as an example, Table 2 shows the shares of the high-income group among the different values of the race attribute.
In algorithmic fairness, a sensitive attribute is usually considered to be a binary attribute, because in many instances its values can be grouped into two categories; turning sensitive attributes into binary attributes makes fairness criteria easier to define and measure. In our work, we also define sensitive attributes as binary attributes, so we divide the values of each attribute into two groups: the value with the largest sample size forms one group, and all other values form the other. For example, for the race attribute, White has the largest sample size, so we place White in one group and Black, Asian-Pac-Islander, Amer-Indian-Eskimo, and Other in the other group. The grouping results are shown in Figure 3.
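A sketch of this grouping step on a pandas DataFrame follows; the column names are assumptions for the example.

```python
import pandas as pd

def binarize_attribute(df: pd.DataFrame, attr: str) -> pd.Series:
    """Group the value with the largest sample size against all other values."""
    majority = df[attr].value_counts().idxmax()    # e.g., "White" for race
    return (df[attr] == majority).astype(int)      # 1 = majority group, 0 = the rest
```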
After that, we compute the grouping stability of each attribute; the results are shown in Table 4.
Table 4. The Grouping Stability of Each Attribute.
From Table 4, we can observe that the grouping stability of the race attribute is the highest, followed by the sex attribute, so we identified race and sex as the more desirable binary attributes. After that, we compute the probability distribution of the target attribute within each group of each attribute, as shown in Table 5.
Table 5. The Probability Distribution of the Target Attribute in Each Group.
In this table, the values of each attribute are divided into the two groups described above.
Afterwards, we calculate a score for each attribute based on the difference between the target attribute distributions of its two groups; the larger the difference, the more likely the attribute is to be a sensitive attribute.
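As an illustrative proxy for this scoring step (an assumption for the example, not the paper’s precise formula), one can score each attribute by the gap between the positive-outcome rates of its two groups:

```python
import pandas as pd

def attribute_score(group: pd.Series, target: pd.Series) -> float:
    """group: binary grouping of one attribute; target: binary target attribute.
    A larger gap suggests the attribute is more likely to be sensitive."""
    return abs(target[group == 1].mean() - target[group == 0].mean())
```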
SVD (Hoecker & Kartvelishvili, 1996) is widely used in the field of machine learning, mainly in dimensionality reduction algorithms. SVD can help us represent complex datasets more simply; after performing SVD dimensionality reduction on a dataset, we can observe the relationships in the data more clearly.
To get a clear view of the effects of sensitive attributes on the dataset, we also use the SVD method to reduce the dimensionality of the Adult dataset. First, we use the income attribute as the target attribute. We then remove each candidate sensitive attribute in turn and observe how the projected data change, as shown in Figure 4.

Figure 4. Sensitive attributes verification using singular value decomposition (SVD). (a) Original dataset, (b) remove sex, (c) remove marital-status, and (d) remove race.
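A sketch of this verification step, assuming one-hot encoded feature matrices; the variable names and plotting are illustrative.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

def project_2d(X: np.ndarray) -> np.ndarray:
    """Reduce the (one-hot encoded) dataset to two dimensions via SVD."""
    return TruncatedSVD(n_components=2).fit_transform(X)

# Z_all   = project_2d(X)               # (a) original dataset
# Z_nosex = project_2d(X_without_sex)   # (b) dataset with sex removed
# Scatter-plotting Z coloured by the income label shows how the class
# structure changes when a candidate sensitive attribute is removed.
```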
Fairness Improvement
In this section, we use the pre-processing approach to algorithmic fairness to improve the fairness of machine learning models. Our aim is to mitigate discrimination in machine learning models by reducing bias in the dataset. In practice, we use association based methods among the data bias reduction algorithms to reduce the bias of the dataset. The specific process is as follows.
Indirect Sensitive Attributes Identification
Before identifying indirect sensitive attributes, we need to analyze the correlations (Asuero et al., 2006) between the sensitive attribute and the other attributes in the dataset, as shown in Figure 5.

Figure 5. Correlations among attributes.
After the correlation analysis between the attributes is performed, we determine the indirect sensitive attributes, that is, the attributes that are highly correlated with the sensitive attribute, as shown in Figure 6.

Figure 6. Correlation of other attributes with the sensitive attribute.
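A sketch of this selection step, assuming numerically encoded attributes and an illustrative correlation threshold `tau`:

```python
import pandas as pd

def indirect_sensitive(df: pd.DataFrame, s: str, tau: float = 0.3) -> list:
    """Return the attributes whose absolute correlation with the sensitive
    attribute `s` is at least `tau` (the indirect sensitive attributes)."""
    corr = df.corr(numeric_only=True)[s].drop(s)
    return corr[corr.abs() >= tau].index.tolist()
```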
Having identified the sensitive attribute and its indirect sensitive attributes, we remove the sensitive attribute itself and eliminate the correlations between the indirect sensitive attributes and the sensitive attribute, rather than deleting the indirect sensitive attributes outright.
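One way to realize this combination of feature deletion and correlation removal is sketched below: each indirect sensitive attribute is replaced by the residual of a linear regression on the sensitive attribute, and the sensitive attribute itself is then deleted. This is a sketch under the assumption of numeric encodings, not the paper’s exact procedure.

```python
import pandas as pd

def decorrelate(df: pd.DataFrame, s: str, indirect: list) -> pd.DataFrame:
    out = df.copy()
    sv = df[s].astype(float)
    sv = sv - sv.mean()                                       # centred sensitive attribute
    for a in indirect:
        x = out[a].astype(float)
        beta = (sv * (x - x.mean())).sum() / (sv * sv).sum()  # OLS slope on s
        out[a] = x - beta * sv                                # residual is uncorrelated with s
    return out.drop(columns=[s])                              # finally delete s itself
```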
We use the Adult dataset to train a predictor. The Adult dataset was originally a categorical dataset used to predict whether a person’s annual income would exceed 50 k dollars.
First of all, we divide the Adult dataset into a training set and a test set, and train a predictor on the training set without any fairness treatment. The prediction results are shown in Table 6.
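A sketch of this baseline experiment with the logistic regression model used in our comparisons; the split ratio, label encoding, and column names are assumptions for illustration.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# df: the Adult dataset as a DataFrame; ">50K" is assumed as the positive label.
X = pd.get_dummies(df.drop(columns=["income"]))
y = (df["income"] == ">50K").astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pred = clf.predict(X_te)

# Per-group selection rate and error rate (cf. Table 6).
sex = df.loc[X_te.index, "sex"]
for g in sex.unique():
    m = (sex == g).to_numpy()
    print(g, "selection rate:", pred[m].mean(),
          "error rate:", (pred[m] != y_te.to_numpy()[m]).mean())
```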
Table 6. The Prediction Results of the Predictor that does not Consider Fairness.
Next, we improve the fairness of this model by removing the sensitive attribute of sex from the training data (i.e., FTU). The results are shown in Table 7.
Table 7. The Prediction Results of the Predictor under FTU.
FTU = fairness through unawareness.
From Table 7 we can see that fairness does not improve significantly: the error rate for males is about three times that for females, and, more interestingly, the selection rate for males is also about three times that for females, meaning that males are about three times more likely than females to receive a positive outcome. Although we removed the sensitive attribute of sex from the training data, our predictor still discriminates on the basis of sex. This indicates that simply ignoring a sensitive attribute is not enough, because the indirect sensitive attributes still leak its information; we therefore also remove the correlations between the indirect sensitive attributes and the sensitive attribute, and observe how accuracy and fairness vary, as shown in Figure 7.

Figure 7. The variations of accuracy and fairness with sensitive attributes removed from the Adult dataset. (a) Accuracy and (b) fairness.
Experiment on the COMPAS Dataset
We then demonstrate our method on the COMPAS dataset. Following the procedure described above, we first determined the candidate sensitive attributes by computing the score of each attribute, as shown in Table 8.
Table 8. The Score of Attributes Among the COMPAS Dataset.
COMPAS = correctional offender management profiling for alternative sanctions.
In Table 8, we can see that the c-charge-degree attribute and the race attribute have a high probability of being sensitive attributes. After verification by SVD, we consider the race attribute to be the most likely sensitive attribute in the COMPAS dataset.
After determining the sensitive attribute, we identified its indirect sensitive attributes, applied our bias reduction method, and observed how accuracy and fairness vary, as shown in Figure 8.

Figure 8. The variations of accuracy and fairness with sensitive attributes removed from the COMPAS dataset. (a) Accuracy and (b) fairness.
In previous studies, the race attribute was designated as the sensitive attribute of the COMPAS dataset based on social science methods, which is consistent with the result identified by our method; the comparison is shown in Table 9.
Table 9. Comparison with Social Science Methods.
COMPAS = correctional offender management profiling for alternative sanctions.
To further verify the effectiveness of our method, we compared it with three other association based methods on the UCI Adult dataset and the COMPAS dataset.
In addition, we use three evaluation metrics to compare the fairness of the models. Let $C$ denote the prediction of the classifier (the experiments use a logistic regression model), $Y$ the true label, and $S$ the binary sensitive attribute. The metrics are:

Demographic parity (DP): $P(C = 1 \mid S = 0) = P(C = 1 \mid S = 1)$.

True positive rate balance (TPB): $P(C = 1 \mid Y = 1, S = 0) = P(C = 1 \mid Y = 1, S = 1)$.

True negative rate balance (TNB): $P(C = 0 \mid Y = 0, S = 0) = P(C = 0 \mid Y = 0, S = 1)$.
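A natural way to quantify these criteria is the absolute gap between the two groups, where 0 means the criterion is met exactly; a sketch of this computation, with illustrative array names:

```python
import numpy as np

def rate(pred, mask):
    return pred[mask].mean() if mask.any() else 0.0

def fairness_gaps(pred, y, s):
    """pred, y, s: numpy arrays of predictions, true labels, binary group."""
    dp  = abs(rate(pred, s == 0) - rate(pred, s == 1))
    tpb = abs(rate(pred, (s == 0) & (y == 1)) - rate(pred, (s == 1) & (y == 1)))
    tnb = abs(rate(1 - pred, (s == 0) & (y == 0)) - rate(1 - pred, (s == 1) & (y == 0)))
    return dp, tpb, tnb   # demographic parity, TPR balance, TNR balance gaps
```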
Table 10. Experimental Results of Comparison.
COMPAS = correctional offender management profiling for alternative sanctions; DP = demographic parity; TPB = true positive rate balance; TNB = true negative rate balance.
From Table 10, we can see that our proposed method outperforms the other three methods under different fairness definitions on the two datasets.
Conclusion
In this paper, we first proposed a method for identifying sensitive attributes based on data analysis. Traditional methods for determining sensitive attributes rely heavily on social science and the experience of experts, and are therefore easily influenced by prior knowledge; by converting the determination of sensitive attributes into a data analysis problem, this influence can be avoided, because the focus is placed on the dataset itself. Next, we improved previous association-based methods: we define the attributes that are highly correlated with sensitive attributes as indirect sensitive attributes and remove their correlations with the sensitive attribute, so that fairness is improved without sacrificing too much accuracy.
Although our method achieves good performance and can accurately identify sensitive attributes and improve fairness in the machine learning model, there are still some limitations. Our method cannot perform very well on small datasets. Therefore, in the future, we will explore the space for further development based on some of the ideas presented in this paper.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received the following financial support for the research, authorship, and/or publication of this article: This work was supported by the projects of the Natural Science Foundation of China (No. 61402329, No. 61972456) and the Natural Science Foundation of Tianjin (No. 19JCYBJC15400, No. 21YDTPJC00440).
