Statistical Biases
What are biases? Psychology Today defines a bias as a tendency, inclination, or prejudice toward or against something or someone. Christopher J. Pannucci, MD, and Edwin G. Wilkins describe bias as any tendency that prevents unprejudiced consideration of a question. Biases may occur at any phase of research, including study design, data collection, data analysis, and publication.
According to Jenny Gutbezahl, 'statistical bias is anything leading to a systematic difference between the true parameters of a population and the statistics used to estimate those parameters. Bias refers to a flaw in the experiment design or data collection process, leading to generating results that don’t accurately represent the population'. In business, we often use statistics to support the decision-making process, so biases can have tremendous impacts.
Common types of statistical bias to avoid:
Sampling bias: most data selection methods are not truly random. The World Economic Forum describes sampling bias as a statistical problem where data selected from the population do not reflect the real-world population; the sample ends up skewed towards some subset of the group.
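As a minimal sketch of the problem (the customer-survey scenario and all numbers are hypothetical), the following Python simulation compares an estimate from a non-random sample against one from a truly random sample:

```python
import random

random.seed(42)

# Hypothetical population: 70% online customers, 30% in-store customers,
# with in-store customers spending more on average.
population = (
    [("online", random.gauss(40, 5)) for _ in range(7_000)]
    + [("in-store", random.gauss(70, 5)) for _ in range(3_000)]
)
true_mean = sum(spend for _, spend in population) / len(population)

# Biased sample: we only survey in-store customers.
biased = [spend for channel, spend in population if channel == "in-store"][:500]

# Random sample: every customer has the same chance of selection.
unbiased = [spend for _, spend in random.sample(population, 500)]

print(f"true mean spend: {true_mean:.2f}")                      # ~49
print(f"biased estimate: {sum(biased) / len(biased):.2f}")      # ~70
print(f"random estimate: {sum(unbiased) / len(unbiased):.2f}")  # ~49
```

The biased estimate systematically overshoots the population mean no matter how large the sample gets, because the selection rule is correlated with the quantity being estimated.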
Bias in assignment: in a well-designed experiment where we treat two or more groups differently and then compare them, there must be no pre-existing differences between the groups. Every case in the sample should have the same likelihood of being assigned to each experimental condition. In real-world experiments, however, this is often not the case: when we assign a person to a group, they might not have an equal chance of landing in each experimental group, and we might draw misleading conclusions from the results.
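A minimal sketch of unbiased assignment, using hypothetical participant IDs: shuffling the sample before splitting gives every case the same chance of landing in each condition:

```python
import random

random.seed(7)

# Hypothetical participant IDs.
participants = [f"P{i:02d}" for i in range(1, 21)]

# Shuffle first, then split: each participant is equally likely
# to end up in either condition.
random.shuffle(participants)
half = len(participants) // 2
groups = {"treatment": participants[:half], "control": participants[half:]}

for condition, members in groups.items():
    print(condition, members)
```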
Omitted variables: we might disregard variables crucial to the experiment, including those not accounted for in the experimental design. Two variables may be correlated, but that does not mean one causes the other; additional variables may be the real cause. Correlation does not imply causation. Langat Langs explains this as the situation where the data has many variables and we consider reducing them, since we cannot include every variable in our models. We study each variable and decide its effect on our results. Bias can arise here because we may omit variables that we deem unnecessary but that are, in fact, necessary to determine our outcomes.
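The sketch below (simulated data, hypothetical variable names) shows how omitting a confounder manufactures a spurious effect: "experience" drives both "training hours" and "salary", and a regression that leaves it out attributes its effect to training:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data: experience drives both training hours and salary;
# training itself has zero true effect on salary.
experience = rng.normal(10, 3, n)
training_hours = 2.0 * experience + rng.normal(0, 1, n)
salary = 5.0 * experience + 0.0 * training_hours + rng.normal(0, 1, n)

# Naive fit that omits the confounder: salary ~ training_hours.
slope_naive = np.polyfit(training_hours, salary, 1)[0]

# Fit that includes the omitted variable: salary ~ training_hours + experience.
X = np.column_stack([training_hours, experience, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, salary, rcond=None)

print(f"naive slope on training_hours: {slope_naive:.2f}")  # ~2.4, spurious
print(f"slope with confounder added:   {coef[0]:.2f}")      # ~0.0, true effect
```

The naive slope is pure confounding; once the omitted variable enters the model, the estimated effect of training collapses to its true value of zero.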
Self-serving bias: when asked to self-report, respondents tend to downplay the qualities they perceive to be less than ideal and overemphasize qualities they perceive to be desirable. According to Dale and Michael, a self-serving bias is any cognitive or perceptual process distorted by the need to maintain and enhance self-esteem.
Experimenter expectations: researchers may have pre-existing ideas about the results of a study. They may unintentionally influence the data, even when they are trying to remain objective, and they can subtly influence others. Studies requiring human intervention to gather data therefore usually use blind data collectors who do not know what is being tested.
Biases in machine learning
An AI system will only be as good as the data fed into it. Analytics India Mag lists six common biases in machine learning models:
Automation bias: human decision-makers tend to over-rely on automation. They may follow information from automated systems and ignore contradictory information produced without automation, even when the latter is correct.
Confirmation bias: 'seeking or interpreting evidence in ways that are preferential to existing beliefs, expectations, or hypothesis' (Nickerson, 1998, p. 175). Machine learning practitioners might collect or label data in a way that satisfies their pre-existing beliefs. It can also stem from experimenter bias, where the data scientist keeps training a model until it confirms a previously held hypothesis.
Group attribution bias: we assume that what is true of an individual is also true of the group as a whole. Attributions made this way rarely reflect real-world data.
Out-group homogeneity bias: the haste with which we make assumptions about groups outside our own leads to out-group homogeneity bias. This perceptual tendency means presuming that members of other groups are very much alike, especially in comparison to the perceived diversity of one's own group.
Selection bias: a result of errors in the way we conduct sampling. It can take several forms: coverage bias (the population represented in the dataset does not match the population the machine learning model makes predictions about), sampling bias (the sample is not random or diverse enough), and non-response bias (certain sections of the audience choose not to participate; also known as participation bias).
Reporting bias: a bias that emerges from the way actions are documented. For example, if certain words are more common in the training data, the learning model will conclude that the corresponding actions are more prevalent than others.
Effects of biases on machine learning
According to Tomáš Kliegr, Štěpán Bahník and Johannes Fürnkranz, some unintended cognitive biases may have negative impacts on rule-based machine learning.
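Several of the biases below refer to a rule's support and confidence. As a minimal sketch (the toy transactions and the rule are hypothetical, and exact definitions of support vary slightly across the literature), for a rule antecedent → consequent we can take support as the fraction of records matching the antecedent and confidence as Pr(consequent | antecedent):

```python
# Toy transaction dataset (hypothetical).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]

# Hypothetical rule: {bread, butter} -> {milk}.
antecedent, consequent = {"bread", "butter"}, {"milk"}

# Records matching the antecedent, and those also matching the consequent.
matches_antecedent = [t for t in transactions if antecedent <= t]
matches_both = [t for t in matches_antecedent if consequent <= t]

support = len(matches_antecedent) / len(transactions)      # 3/5 = 0.60
confidence = len(matches_both) / len(matches_antecedent)   # 1/3 ≈ 0.33

print(f"support = {support:.2f}, confidence = {confidence:.2f}")
```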
Conjunction fallacy and representativeness heuristic: using prototypicality to judge probability. People may judge the conjunction of two statements as more probable than one of the statements alone. The representativeness heuristic is the tendency to judge the likelihood of an event by its similarity to a prototype. It may lead to overestimating the probability of a condition that is representative of the rule's consequent.
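The underlying arithmetic makes the fallacy plain: for any two events A and B,

```latex
\Pr(A \wedge B) \;\le\; \min\bigl(\Pr(A),\, \Pr(B)\bigr)
```

so judging a conjunction as more probable than one of its conjuncts is always an error, however representative the conjunction seems.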
Misunderstanding of 'and': we may misread the conjunction 'and', which denotes an intersection of sets, as indicating a union of sets or some other meaning. As a result, people interpret 'and' differently from logical conjunction.
Averaging heuristic: we tend to estimate the probability of a conjunction of two events as the average of the probabilities of the two events. We may thus mistake the probability of the antecedent for the average of its conditions' probabilities.
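A one-line numerical check shows why averaging fails; with hypothetical probabilities Pr(A) = 0.8 and Pr(B) = 0.2,

```latex
\frac{\Pr(A) + \Pr(B)}{2} = 0.5
\qquad\text{but}\qquad
\Pr(A \wedge B) \;\le\; \min(0.8,\, 0.2) = 0.2
```

The average always lies between the two probabilities, while the conjunction can never exceed the smaller one.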
Disjunction fallacy: we tend to judge the probability of an event as higher than the probability of a union of the event with another event. As a result, we prefer more specific conditions over less specific ones.
Base-rate neglect: we tend to underweight evidence provided by base rates. We, therefore, neglect the prior probability of the head of the rule.
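A classic worked example (all numbers hypothetical): a test detects a condition with probability 0.99, has a false-positive rate of 0.05, and the condition's base rate is 0.01. By Bayes' theorem,

```latex
\Pr(D \mid +) =
\frac{0.99 \times 0.01}{0.99 \times 0.01 + 0.05 \times 0.99}
\approx 0.17
```

Neglecting the 1% base rate makes the intuitive answer, something near 0.99, wildly wrong.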
Insensitivity to sample size: we tend to underestimate the benefits of larger samples. Analysts do not realize the increased reliability of confidence estimates with increasing support value.
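The following sketch (a hypothetical rule with a true confidence of 0.70) makes the reliability gain concrete: the spread of confidence estimates shrinks roughly with the square root of the support:

```python
import random
import statistics

random.seed(1)

# Hypothetical rule whose true confidence is 0.70; we estimate it from
# samples of different sizes and watch the spread of estimates shrink.
TRUE_CONFIDENCE = 0.70

def estimate(n_cases: int) -> float:
    """Fraction of n_cases Bernoulli(0.70) trials that succeed."""
    hits = sum(random.random() < TRUE_CONFIDENCE for _ in range(n_cases))
    return hits / n_cases

for n in (10, 100, 1000):
    estimates = [estimate(n) for _ in range(2000)]
    spread = statistics.stdev(estimates)
    print(f"support = {n:5d}: std. dev. of confidence estimate = {spread:.3f}")
```

A 100-fold increase in support shrinks the spread of the estimate roughly ten-fold, which is exactly the reliability gain that insensitivity to sample size discounts.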
Confirmation bias and positive test strategy: we tend to seek supporting evidence for our current hypothesis. Analysts cherry-pick rules confirming their prior hypothesis.
Availability heuristic: we determine the perceived frequency of a class by the ease with which its instances come to mind. We consider rules for which we recall instances more effortlessly as more plausible.
Reiteration effect: the increase of perceived believability following repetition. Presentation of redundant rules or conditions increases plausibility.
Mere exposure effect: the increase of liking following repetition. Repeated exposure to something, even subconscious exposure, increases our preference for it.
Overconfidence and underconfidence: we tend to be overconfident (or underconfident) in our judgments. Consequently, we overrate rules with little support and high confidence.
Recognition heuristic: we use recognition to judge frequency, size, or other attributes. Recognition of a characteristic or its value increases preference.
Information bias: we tend to seek information, even when it is not relevant or helpful. As a result, we believe that more information (rules, conditions) will improve decision making even if it is irrelevant.
Ambiguity aversion: we tend to prefer known risks over unknown ones. Consequently, we favour rules without unknown conditions.
Confusion of the inverse: we confuse the confidence of an implication A → B with that of its inverse B → A. That is, we confuse the rule's confidence, Pr(consequent | antecedent), with Pr(antecedent | consequent).
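Bayes' theorem shows why the two directions differ:

```latex
\Pr(\text{antecedent} \mid \text{consequent})
= \Pr(\text{consequent} \mid \text{antecedent})
\times \frac{\Pr(\text{antecedent})}{\Pr(\text{consequent})}
```

The two confidences coincide only when the antecedent and the consequent have equal marginal probabilities.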
Context and tradeoff contrast: the context of an alternative may affect our preferences among the available options. Our inclination toward a rule may thus be influenced by the other rules presented alongside it.
Negativity bias: we tend to give more weight to negative information than to positive information of the same strength. Words with negative valence in a rule make it appear more important.
Primacy effect: initial information has a disproportionate impact on the final assessment. As a result, the rules or conditions presented to us first have the greatest impact.
Weak evidence effect: a weak argument in favour of a statement can decrease its believability. A condition only weakly perceived as predictive of the target therefore reduces the rule's plausibility.
Unit bias: the tendency to give similar weight to each unit rather than weighting it according to its size. As a result, we perceive all conditions as equally important.
Machine learning models may reflect the unintended biases of the practitioners who build them, and those biases can perpetuate and reinforce biases that already exist. We need to de-bias ourselves before we can de-bias our models. Future high-performing machine learning models are expected to be free of such inequitable biases.