You’ve probably heard the saying, “garbage in, garbage out.”
This is especially true when it comes to data science. If the data you’re using is biased, your conclusions will be too. Unfortunately, most datasets are biased in some way or another, and the problem is worse when it comes to gender imbalance.
A lack of gender diversity among the people who collect and analyze data reduces the reliability of that data and undermines evidence-based policymaking. Access to STEM education must expand, and data science must become a welcoming and inclusive field for all genders.
It is time to accelerate efforts to increase gender diversity in data science. The potential benefits are huge.
Unconscious Bias Makes a Difference
When women, transgender, non-binary and non-conforming people are involved in data science, they can help to ensure that datasets are more representative and that conclusions are based on a fuller understanding of the world. This, in turn, can help to create policies that are more effective and responsive to the needs of everyone.
Data scientists are not purely objective observers who simply report what they see. Whether intentionally or not, they bring their own values, interests and life experiences to their work, and these shape outcomes in line with their own understanding of the world.
Their choices about how data is measured, collected, organized and analyzed affect the insights they gain and can introduce bias at every stage of the data process. In this sense, datasets and algorithms can be seen as containing “encoded sets of values.” And when the people who create and work with data are not representative of the general population, they can inadvertently introduce bias.
This suggests that if we want datasets and algorithms that are less biased, we need more gender diversity in data science. However, the field is still very male-dominated. In the United States, women hold just 18 percent of data science jobs. The numbers are even lower for transgender, non-binary and non-conforming data scientists.
Heavy Consequences from Lack of Diversity in Data Science
The lack of diversity in data science has far-reaching consequences. It increases the risk of bias in datasets and algorithms, which can lead to inaccurate conclusions and bad policy decisions. It also perpetuates gender inequality by making it harder for women, transgender, non-binary and non-conforming people to get ahead in fields that are linked to the digital economy.
The underrepresentation of different genders in data science, in particular, increases the likelihood that data-driven policies will be created and implemented in ways that disadvantage or harm marginalized communities. For instance, as highlighted in Caroline Criado Perez’s book “Invisible Women: Exposing Data Bias in a World Designed for Men,” biased data can harm women and girls in the following ways:
- In the United Kingdom, a 2013 algorithm used to calculate risk scores for heart disease underestimated the risk for women by up to 50 percent. As a result, many women were not eligible for lifesaving treatments.
- A U.S. Department of Health and Human Services study found that medical research trials are more likely to use male animals than female animals, even though sex differences can affect how drugs work in humans. This bias can lead to less effective or even harmful treatments for women.
- Many workplaces are designed with men in mind, resulting in ergonomic designs that overlook the needs of women (such as accommodating breast milk pumps) and safety hazards that disproportionately affect women (such as exposure to chemicals).
These examples illustrate how bias in data can have harmful real-world consequences. They also highlight the need for more gender diversity in data science so that datasets are more representative and policy decisions are based on a fuller understanding of the world.
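One simple way to check whether a dataset is representative is to compare who appears in it with the population it is meant to describe. The short Python sketch below does this for a single field. The function name `representation_gap`, the `gender` field, the sample records and the reference shares are all made up for illustration; real reference figures would have to come from a trusted source such as a census.

```python
from collections import Counter


def representation_gap(records, reference_shares, key="gender"):
    """Return (observed share - reference share) for each group.

    `records` is a list of dicts and `reference_shares` maps each group
    to its expected share of the population. Both are hypothetical inputs.
    """
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("records must not be empty")
    # Include groups that are expected but missing from the data entirely.
    groups = set(reference_shares) | set(counts)
    return {
        group: counts.get(group, 0) / total - reference_shares.get(group, 0.0)
        for group in groups
    }


# Made-up sample: 8 men, 2 women, no non-binary respondents.
sample = [{"gender": "man"}] * 8 + [{"gender": "woman"}] * 2
# Hypothetical reference shares, for illustration only.
reference = {"man": 0.49, "woman": 0.49, "non-binary": 0.02}

gaps = representation_gap(sample, reference)
for group, gap in sorted(gaps.items()):
    print(f"{group}: {gap:+.2f}")
```

A negative gap means a group is underrepresented relative to the reference. In this made-up sample, women and non-binary respondents both fall short of their hypothetical reference shares, which is the kind of imbalance worth catching before any analysis begins.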
Steps That Make a Difference
It’s clear that the gender gap in data science is a problem. But what can be done about it?
Several steps would help close the gap:
- Increase access to STEM education for all people.
- Ensure that data science is a welcoming and inclusive field for all genders.
- Encourage women, transgender, non-binary and non-conforming people to enter data science.
- Provide support for anyone who wants to pursue a career in data science.
These are just a few of the ways that we can address the gender gap in data science. It’s an issue that requires a multi-pronged approach, and action needs to be taken now if we want to see change.
Additional Resources
- The Continuing Fight For African Women in the Crucial Field of Data Science. https://www.linkedin.com/pulse/continuing-fight-african-women-crucial-field-data-science-castle
- Why the World Needs More Women Data Scientists. https://www.cgdev.org/blog/why-world-needs-more-women-data-scientists
- Transgender People Get Worse Healthcare, Even With More Hours Invested. https://ihpi.umich.edu/news/bias-may-affect-providers-knowledge-transgender-health
- Algorithms Still Negatively Affect Transgender and Black People. https://www.dazeddigital.com/science-tech/article/43211/1/trans-algorithm-machine-learning-bias-discrimination-chelsea-manning-edit
- Women in Data Science: Why They’re Critical to the Data Science Workforce. https://datasciencedegree.wisconsin.edu/blog/women-in-data-science-why-theyre-critical-to-the-data-science-workforce/
- Burtch Works Study, 2020 (PDF). https://www.burtchworks.com/wp-content/uploads/2020/08/Burtch-Works-Study_DS-PAP-2020.pdf