Runaway Models
  • Home
  • Bias Pipeline
  • Bias Mitigation
  • About

Scroll to follow how text data, in the form of a prompt to Alexa, travels through the Language Modelling pipeline before Alexa can process it and generate a response.

Click on the orange buttons to learn more technical details about the examples given.

Try me

Big Data

Language Models are first pre-trained on Big Data sources like Google News or Wikipedia. For example, GPT-3 was trained on 570GB of text data. Language Models learn 'meaning' from this data and inherit any flaws, inconsistencies and biases captured in it.

Sample Bias
Untrained GPT-3
Pre-trained GPT-3


Pre-training

Word Embeddings

During pre-training, the model transforms all words and sentences in the data into vectors called Word Embeddings. Word Embeddings capture the meaning of each word as it is used in different contexts. As a result, words that lie closer together in this vector space are semantically similar to each other. For example, ‘bank’, ‘banker’ and ‘banking’ will sit close together in a word embedding vector space.
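To make this concrete, here is a minimal sketch using made-up toy vectors (real embeddings have hundreds of dimensions learned from data, so the numbers below are purely illustrative): words used in similar contexts end up pointing in similar directions, which can be measured with cosine similarity.

    import numpy as np

    # Toy 4-dimensional 'embeddings' with hypothetical values, purely for illustration.
    embeddings = {
        "bank":    np.array([0.81, 0.10, 0.45, 0.02]),
        "banker":  np.array([0.78, 0.15, 0.40, 0.05]),
        "banking": np.array([0.80, 0.12, 0.43, 0.03]),
        "banana":  np.array([0.05, 0.90, 0.10, 0.60]),
    }

    def cosine_similarity(a, b):
        # Cosine of the angle between two vectors: close to 1.0 means similar direction.
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    print(cosine_similarity(embeddings["bank"], embeddings["banker"]))   # high: related words
    print(cosine_similarity(embeddings["bank"], embeddings["banana"]))   # low: unrelated words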

Making Predictions

After Language Models are fine-tuned, they are tested and evaluated on their performance across different natural language processing (NLP) tasks, like Question Answering and Text Generation, and are then utilised to power AI systems like Alexa.
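As an illustrative sketch only (this site does not use this code), here is how a pre-trained model could be tried out on one such task, Question Answering, via the Hugging Face transformers library; the question, context and default downloaded model are all assumptions made for the example.

    from transformers import pipeline

    # Load a pre-trained Question Answering model (downloads a default model the first time).
    qa = pipeline("question-answering")

    result = qa(
        question="What powers virtual assistants like Alexa?",
        context="Virtual assistants such as Alexa are powered by Language Models "
                "that learn patterns of meaning from large amounts of text data.",
    )
    print(result["answer"], result["score"])  # the predicted answer span and its confidence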

Runaway Loops
Racial Bias
Gender Bias
Religious Bias

Click on the pink buttons to learn more about the biases that can be learned, propagated and amplified in this language modelling pipeline.

Try me

Language Models

Virtual Assistants like Alexa or Siri are AI systems that can perform tasks for a user based on voice commands or questions. They are powered by Language Models, which use text data and Neural Networks to learn underlying patterns of meaning in our languages. Language Models also enable computers to write code for websites and turn text captions into images. Examples of Language Models are OpenAI's GPT-3 and Google's BERT.

Pre-Training

Pre-training exposes the Language Model to lots of text data, and the model learns to predict the next word or phrase given a few words in a prompt, such as ‘Alexa, tell me a story.’
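Here is a minimal sketch of this next-word-prediction objective, assuming the freely available pre-trained GPT-2 model accessed through the Hugging Face transformers library; the prompt mirrors the example above and the continuation will differ from run to run.

    from transformers import pipeline

    # Load a small pre-trained model; "gpt2" is used here only because it is freely available.
    generator = pipeline("text-generation", model="gpt2")

    # The model repeatedly predicts the most likely next words after the prompt.
    output = generator("Alexa, tell me a story about", max_new_tokens=20)
    print(output[0]["generated_text"])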

Historical Bias


Explore Gender Bias

Fine Tuning

Language Models are then fine-tuned on a smaller, task-specific dataset to make sure they can carry out a given task, like Text Generation, accurately. This is what allows Alexa to tell you a story about Harry Potter.
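Conceptually, fine-tuning just continues training from the pre-trained weights on new, task-specific data. The sketch below shows only the shape of that procedure: the tiny model and the random tensors standing in for task data are hypothetical, not a real Language Model.

    import torch
    import torch.nn as nn

    # A tiny stand-in model; imagine its weights were learned during pre-training.
    model = nn.Linear(8, 2)

    # A small, task-specific dataset (random placeholders here).
    inputs = torch.randn(16, 8)
    labels = torch.randint(0, 2, (16,))

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    # Fine-tuning: a few more passes of training, starting from the pre-trained weights.
    for epoch in range(3):
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), labels)
        loss.backward()
        optimizer.step()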

Bias Network Effect

Biases in data, in algorithms like Language Models, and in the predictions they make are all interrelated. Human data perpetuates human biases, and as Language Models learn from human data, the result is a 'bias network effect'. Language Models therefore unintentionally learn, propagate and even amplify biases inherent to our society that are captured in the data used to train them.

Algorithmic Bias

Using Language Models in AI systems that learn from biased networks creates systems that can perpetuate harmful stereotypes based on race, gender or religion. These systems in turn can reinforce or amplify injustices and inequities that are already prevalent in society.

Mitigating Bias

Now that we know how biases arise in the Language Modelling pipeline, how can we tackle them?

Discover

You did it!

Great work! You clicked on a pink button. These buttons will provide you with more information on biases.

Continue

You did it!

Great work! You clicked on an orange button. These buttons will take you to external resources to explore on your own.

Continue

Sample Bias and Gender Stereotypes

Sample Bias can occur at this stage if the Big Data used to pre-train the Language Model over-represents or under-represents one group relative to another.

Word Embeddings (numerical representations of words) generated by a language model showed sample bias in an analogy task: asked to complete the sentence "Man is to Computer Programmer as Woman is to _____?", they predicted the word 'Homemaker'.

In the data used to train the model, male computer programmers are over-represented while male homemakers are under-represented, and vice versa for women. This sample bias promotes the gender stereotype that STEM occupations are typically held by men while caregiving and homemaking are typically a woman's job.
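This analogy probe can be reproduced, as a rough sketch, with publicly available word2vec embeddings via the gensim library; this assumes the pre-trained Google News vectors (a large download of over a gigabyte), and the exact top result depends on the embedding set used.

    import gensim.downloader as api

    # Load pre-trained word2vec embeddings trained on Google News (large download).
    wv = api.load("word2vec-google-news-300")

    # "man is to computer_programmer as woman is to ...?"
    print(wv.most_similar(positive=["woman", "computer_programmer"],
                          negative=["man"], topn=3))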

Learn More

Historical Bias leading to Gender Bias

Historical bias can also occur at this stage if the data used to train the Language Model no longer accurately reflects the reality of the environment it will be used in.

A recruiting algorithm once used by Amazon was found to have disproportionately favoured male over female candidates, as the data used to train the algorithm was based on resumes collected over a 10-year period. Because the tech industry was historically dominated by men, a majority of these resumes were submitted by men. The dataset therefore captured this inherent gender bias and passed it on to the recruiting algorithm, thus excluding women from job opportunities.

Learn More

Gender Bias

The Word2vec word embeddings projector visualises the words that are most similar to a chosen word of focus. Try searching for 'Man' and then 'Woman'. Notice the words that are associated with 'Man', e.g. 'commander' or 'manager', compared to the words associated with 'Woman', e.g. 'virgin' or 'pregnant'. These are stereotypical associations based on gender.

Go Back

Religious Bias in Text Generation

OpenAI's GPT-3 Language Model has been found to generate text that tends to associate religious beliefs with various stereotypes. For example, when prompted with “Two Muslims walked into a…”, GPT-3 typically generates sentences with violent descriptions.

Other religion-related prompts produce similar associations: prompts containing 'Christianity' generate text with 'ignorant', while prompts containing 'Atheism' generate text with 'cool' or 'arrogant'.

Learn More

Racial Bias in Predictive Policing

Predictive Policing algorithms are used in the US to predict the areas where crime is most likely to occur, so that police officers can be deployed to those neighbourhoods. The data used to train these models is based on where officers have gone and made arrests, not necessarily where crimes have actually been committed. The model learns this pattern and predicts that crime is likely to occur in neighbourhoods where more police officers have been deployed in the past, typically those with more African American and Latinx residents, over those with less deployment, typically those with more White residents. This runaway feedback loop perpetuates existing racial stereotypes and injustices in America.
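Here is a toy simulation, with entirely hypothetical numbers, of the feedback loop described above: two neighbourhoods have identical true crime rates, but the one that starts with more recorded arrests keeps receiving more patrols and therefore accumulates more recorded arrests.

    # Two neighbourhoods with identical true crime rates...
    true_crime_rate = {"A": 0.10, "B": 0.10}
    # ...but A starts with more recorded arrests, so it receives more patrols.
    recorded_arrests = {"A": 5, "B": 1}

    for year in range(5):
        total = sum(recorded_arrests.values())
        # Patrols are allocated in proportion to past recorded arrests.
        patrol_share = {n: recorded_arrests[n] / total for n in recorded_arrests}
        for n in recorded_arrests:
            # New arrests scale with patrol presence, not with the true crime rate alone.
            recorded_arrests[n] += round(100 * patrol_share[n] * true_crime_rate[n])
        print(year, recorded_arrests)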

Learn More

Gender Bias in Machine Translation

Google's Machine Translation system tends to predict male pronouns for occupations in STEM (Science, Technology, Engineering and Mathematics) because it is pre-trained on data with sample and historical bias. If you translate "She is an engineer. He is a nurse" from English to a non-gendered language like Turkish and back to English, the gender pronouns are swapped, i.e. Google translates it to "He is an engineer. She is a nurse."

Learn More

Runaway Feedback Loops

At this stage, Language Models can amplify existing biases in the data. Given a few words as input, the model predicts which words or phrases it should generate next. These predictions are fed back into the model as inputs for the next round of text generation, creating a runaway feedback loop.
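Here is a minimal sketch of that feedback structure: each predicted word is appended to the prompt and fed straight back in as input. The predict_next_word function is a hypothetical stand-in with canned answers, not a real model.

    def predict_next_word(context):
        # Hypothetical stand-in: a real Language Model would score its whole vocabulary here.
        canned = {"tell me a": "story", "me a story": "about", "a story about": "dragons"}
        return canned.get(" ".join(context.split()[-3:]), "...")

    prompt = "Alexa, tell me a"
    for _ in range(3):
        next_word = predict_next_word(prompt)
        prompt = prompt + " " + next_word   # the model's own output becomes its next input
    print(prompt)   # "Alexa, tell me a story about dragons"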

Go Back