A marginal probability distribution is formed from a larger joint probability distribution by summing out the variables we are not interested in.
Consider the joint probability over the variables Raining and Windy shown below:
| Raining | Windy = False | Windy = True |
|---|---|---|
If someone asks us for the probability of rain, we need P(Raining), not the joint P(Raining, Windy).
To calculate P(Raining), we simply sum over the values of Windy, once for Raining = False and once for Raining = True, as shown below.
| Raining | Windy = False | Windy = True | Sum |
|---|---|---|---|
This is known as marginalization, and the result we were after is shown below.
| Raining = False | Raining = True |
|---|---|
For discrete variables we sum, whereas for continuous variables we integrate.
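For the discrete case, marginalization is just a sum along one axis of the joint table. A minimal NumPy sketch (the probability values below are hypothetical placeholders, not the table from the text):

```python
import numpy as np

# Hypothetical joint distribution P(Raining, Windy).
# Rows: Raining = False, Raining = True
# Columns: Windy = False, Windy = True
joint = np.array([[0.55, 0.15],
                  [0.10, 0.20]])

# Marginalize Windy out by summing across columns (axis 1),
# leaving P(Raining).
p_raining = joint.sum(axis=1)
print(p_raining)  # [P(Raining = False), P(Raining = True)]
```

Summing over axis 0 instead would marginalize Raining out and give P(Windy).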
The term marginal is thought to have arisen because joint probability tables were summed along rows or columns, and the result written in the margins of the table.
As a more involved example of calculating a marginal probability, the tables below show the result of marginalizing B and D out of P(A,B,C,D), where A, B, C, D are all discrete variables each with two states. To calculate the first entry, we must sum all the entries in P(A,B,C,D) where A = True and C = True. Therefore P(A = True,C = True) = 0.0036 + 0.0098 + 0.0256 + 0.0432 = 0.0822.
| B | C | D | A = True | A = False |
|---|---|---|---|---|
| C | A = True | A = False |
|---|---|---|
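The same computation can be sketched in NumPy: with the axes of the joint array ordered (A, B, C, D), marginalizing B and D out is a sum over axes 1 and 3. The first assertion checks the worked sum quoted above; the joint values used for the axis-sum demonstration are hypothetical placeholders, not the table from the text.

```python
import numpy as np

# Check the worked example: the four entries of P(A,B,C,D)
# with A = True and C = True sum to 0.0822.
entries = [0.0036, 0.0098, 0.0256, 0.0432]
assert round(sum(entries), 4) == 0.0822

# Generic marginalization over B and D. The joint below is a
# hypothetical normalized array, standing in for P(A,B,C,D).
rng = np.random.default_rng(0)
joint = rng.random((2, 2, 2, 2))
joint /= joint.sum()                  # normalize to a valid distribution

# Axes are ordered (A, B, C, D); summing over axes 1 and 3
# removes B and D, leaving P(A, C) with shape (2, 2).
marginal_ac = joint.sum(axis=(1, 3))
print(marginal_ac.shape)
```

Each entry of `marginal_ac` is the sum of the four joint entries that share the same values of A and C, exactly as in the worked calculation above.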