Exercise - probability

In this exercise we will use Bayes' theorem to calculate a simple probability query.

# Prerequisites

Note

For a variable Gender with states Female and Male, the notation P(Gender) = [0.51, 0.49] means that P(Gender = Female) = 0.51 and P(Gender = Male) = 0.49. The values are listed in the same order as the states.

Also note that 0.51 is equivalent to 51%.

### The problem

1. We have two variables: Gender with states {Female, Male}, and Hair Length with states {Short, Medium, Long}.

2. We know the following about the variables.

• P(Gender) = [0.51, 0.49]

• P(Hair Length | Gender = Female) = [0.1, 0.4, 0.5]

• P(Hair Length | Gender = Male) = [0.8, 0.15, 0.05]

3. Calculate P(Gender | Hair Length = Short).

1. Because we know that Hair Length = Short, we only need the first entries from P(Hair Length | Gender = Female) and P(Hair Length | Gender = Male).

Therefore, we can now work with the following probability distributions.

• P(Gender) = [0.51, 0.49]

• P(Hair Length = Short | Gender) = [0.1, 0.8]
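The selection step above can be sketched in Python. This is only an illustration: the variable names (`gender_states`, `hair_states`, `p_hair_given_gender`, `likelihood`) are assumptions made for this example and are not part of the exercise.

```python
# States, listed in the same order as the probabilities in the exercise.
gender_states = ["Female", "Male"]
hair_states = ["Short", "Medium", "Long"]

# P(Hair Length | Gender): one distribution over Hair Length per Gender state.
p_hair_given_gender = {
    "Female": [0.1, 0.4, 0.5],
    "Male": [0.8, 0.15, 0.05],
}

# Evidence: Hair Length = Short, so keep only that entry from each distribution.
short = hair_states.index("Short")
likelihood = [p_hair_given_gender[g][short] for g in gender_states]

print(likelihood)  # [0.1, 0.8]
```

The resulting list is P(Hair Length = Short | Gender), ordered as [Female, Male].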

2. We want to work out P(Gender | Hair Length = Short), but we have a probability defined in terms of P(Hair Length = Short | Gender).

We can use Bayes' theorem here to invert the conditioned variable, and therefore solve our problem.

Using Bayes' theorem, P(Gender | Hair Length = Short) = P(Hair Length = Short | Gender) * P(Gender) / P(Hair Length = Short).

P(Gender | Hair Length = Short) = [0.51 * 0.1, 0.49 * 0.8] / P(Hair Length = Short)

P(Gender | Hair Length = Short) = [0.051, 0.392] / P(Hair Length = Short)

Tip

The term P(Hair Length = Short) is a normalization constant that makes the final distribution sum to 1. It equals the sum of the unnormalized entries, 0.051 + 0.392 = 0.443, so we can simply normalize the distribution [0.051, 0.392].

P(Gender | Hair Length = Short) = [0.051, 0.392] / (0.051 + 0.392)

P(Gender | Hair Length = Short) = [0.051 / 0.443, 0.392 / 0.443]

P(Gender | Hair Length = Short) = [0.115, 0.885]

Note

Note that the final distribution sums to 1.

In other words, P(Gender = Female | Hair Length = Short) is 11.5% and P(Gender = Male | Hair Length = Short) is 88.5%.
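The whole calculation can also be checked with a few lines of Python. This is a minimal sketch using plain lists; the names `prior`, `likelihood`, `joint`, `evidence` and `posterior` are chosen for this example only.

```python
# P(Gender), ordered as [Female, Male].
prior = [0.51, 0.49]

# P(Hair Length = Short | Gender), ordered as [Female, Male].
likelihood = [0.1, 0.8]

# Numerator of Bayes' theorem: P(Hair Length = Short | Gender) * P(Gender).
joint = [p * l for p, l in zip(prior, likelihood)]

# Normalization term: P(Hair Length = Short) is the sum of the numerator entries.
evidence = sum(joint)

# P(Gender | Hair Length = Short).
posterior = [j / evidence for j in joint]

print([round(j, 3) for j in joint])      # [0.051, 0.392]
print(round(evidence, 3))                # 0.443
print([round(p, 3) for p in posterior])  # [0.115, 0.885]
```

The values are rounded only for display; the unrounded posterior sums to 1 up to floating-point error.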

Tip

The Bayesian network in the image below also shows the result.