In this exercise we will use Bayes' theorem to calculate a simple probability query.

Prerequisites

A calculator or spreadsheet software.

Note

For a variable Gender with states Female and Male, the notation P(Gender) = [0.51, 0.49] means that P(Gender = Female) = 0.51 and P(Gender = Male) = 0.49.

Also note that 0.51 is equivalent to 51%.

The problem

  1. We have 2 variables: Gender with states {Female, Male}, and Hair Length with states {Short, Medium, Long}.

  2. We know the following about the variables.

    • P(Gender) = [0.51, 0.49]

    • P(Hair Length | Gender = Female) = [0.1, 0.4, 0.5]

    • P(Hair Length | Gender = Male) = [0.8, 0.15, 0.05]

  3. Calculate P(Gender | Hair Length = Short).

The answer

  1. Because we know that Hair Length = Short, we only need the first entries from P(Hair Length | Gender = Female) and P(Hair Length | Gender = Male).

    Therefore, we can now work with the following probability distributions.

    • P(Gender) = [0.51, 0.49]

    • P(Hair Length = Short | Gender) = [0.1, 0.8]
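The selection step above can be sketched in Python. This is only an illustration: the names hair_states, hair_given_gender, short_index and likelihood_short are assumptions introduced here, and the lists are assumed to be ordered [Short, Medium, Long] as in the problem statement.

```python
# Conditional distributions from the problem statement, keyed by gender.
# Each list is ordered [Short, Medium, Long].
hair_states = ["Short", "Medium", "Long"]
hair_given_gender = {
    "Female": [0.1, 0.4, 0.5],
    "Male": [0.8, 0.15, 0.05],
}

# Position of the observed state (Short) within each list.
short_index = hair_states.index("Short")

# P(Hair Length = Short | Gender), ordered [Female, Male].
likelihood_short = [hair_given_gender[g][short_index] for g in ("Female", "Male")]
print(likelihood_short)  # [0.1, 0.8]
```

Observing a different hair length would only change the index used, not the rest of the calculation.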

  2. We want to work out P(Gender | Hair Length = Short), but we have a probability defined in terms of P(Hair Length = Short | Gender).

    We can use Bayes' theorem here to invert the conditioned variable, and therefore solve our problem.

    Using Bayes' theorem, P(Gender | Hair Length = Short) = P(Hair Length = Short | Gender) * P(Gender) / P(Hair Length = Short).

    P(Gender | Hair Length = Short) = [0.51 * 0.1, 0.49 * 0.8] / P(Hair Length = Short)

    P(Gender | Hair Length = Short) = [0.051, 0.392] / P(Hair Length = Short)

    Tip

    The term P(Hair Length = Short) is simply a normalization constant that makes the final distribution sum to 1. We can therefore just normalize the distribution [0.051, 0.392].
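As a quick check of why normalizing works, P(Hair Length = Short) can be computed directly by summing P(Hair Length = Short | Gender) * P(Gender) over both genders (the law of total probability). The sketch below assumes the [Female, Male] ordering used throughout; the variable names are illustrative only.

```python
prior = [0.51, 0.49]     # P(Gender), ordered [Female, Male]
likelihood = [0.1, 0.8]  # P(Hair Length = Short | Gender)

# Joint terms P(Hair Length = Short, Gender = g) = P(Short | g) * P(g).
joint = [p * l for p, l in zip(prior, likelihood)]

# Summing the joint terms over Gender gives P(Hair Length = Short).
p_short = sum(joint)
print(round(p_short, 3))  # 0.443
```

The sum is the same 0.051 + 0.392 that appears as the denominator in the normalization.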

    P(Gender | Hair Length = Short) = [0.051, 0.392] / (0.051 + 0.392)

    P(Gender | Hair Length = Short) = [0.051 / 0.443, 0.392 / 0.443]

    P(Gender | Hair Length = Short) = [0.115, 0.885]

    Note

    Note that the final distribution sums to 1.

    This is saying that P(Gender = Female | Hair Length = Short) is approximately 11.5% and P(Gender = Male | Hair Length = Short) is approximately 88.5%.
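For readers who prefer code to a calculator or spreadsheet, the whole calculation can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original exercise: the function name posterior and its arguments are hypothetical, and both lists are assumed to be ordered [Female, Male].

```python
def posterior(prior, likelihood):
    """Apply Bayes' theorem: multiply prior by likelihood, then normalize."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(unnormalized)  # P(evidence), the normalization term
    if evidence == 0:
        raise ValueError("evidence has zero probability under the prior")
    return [u / evidence for u in unnormalized]


# P(Gender) and P(Hair Length = Short | Gender), ordered [Female, Male].
gender_given_short = posterior([0.51, 0.49], [0.1, 0.8])
print([round(p, 3) for p in gender_given_short])  # [0.115, 0.885]
```

The zero check guards against evidence that is impossible under the given distributions, in which case the posterior is undefined.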

    Tip

    The Bayesian network in the image below also shows the result.

    [Image: Bayes Theorem Exercise]