A marginal probability is a distribution formed by summing out a subset of the variables of a larger (joint) probability distribution.

Consider the joint probability over the variables Raining and Windy shown below:

P(Raining, Windy)

| Raining | Windy = False | Windy = True |
|---------|---------------|--------------|
| False   | 0.64          | 0.16         |
| True    | 0.1           | 0.1          |

If someone asks us what the probability of rain is, we need P(Raining), not P(Raining, Windy).

In order to calculate P(Raining), we simply sum across each row, i.e. over the states of Windy, once for Raining = False and once for Raining = True, as shown below.

P(Raining, Windy)

| Raining | Windy = False | Windy = True | Sum |
|---------|---------------|--------------|-----|
| False   | 0.64          | 0.16         | 0.8 |
| True    | 0.1           | 0.1          | 0.2 |

This is known as marginalization, and the result we were after is shown below.

P(Raining)

| Raining = False | Raining = True |
|-----------------|----------------|
| 0.8             | 0.2            |
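
The same summation can be written in a few lines of code. Below is a minimal Python sketch (plain dictionaries, no particular library assumed) that marginalizes Windy out of the joint table above; the dictionary `joint` and the variable names are illustrative only.

```python
# A minimal sketch of marginalization, assuming the joint distribution is stored
# as a plain Python dictionary keyed by (raining, windy) tuples of booleans.
joint = {
    (False, False): 0.64,  # P(Raining = False, Windy = False)
    (False, True):  0.16,  # P(Raining = False, Windy = True)
    (True,  False): 0.1,   # P(Raining = True,  Windy = False)
    (True,  True):  0.1,   # P(Raining = True,  Windy = True)
}

# Marginalize Windy out: sum over its states for each state of Raining.
p_raining = {}
for (raining, windy), prob in joint.items():
    p_raining[raining] = p_raining.get(raining, 0.0) + prob

print(p_raining)  # {False: 0.8, True: 0.2} (up to floating point rounding)
```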
Tip

For discrete variables we sum, whereas for continuous variables we integrate.
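
Written out, using X for the variable we keep and Y for the variable we sum or integrate out (symbols chosen here purely for illustration):

```latex
P(x) = \sum_{y} P(x, y)            % discrete: sum over the states of Y
\qquad
p(x) = \int p(x, y)\,\mathrm{d}y   % continuous: integrate over Y
```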

Tip

The term marginal is thought to have arisen because joint probability tables were summed along rows or columns, and the result written in the margins of the table.

As a more involved example of calculating a marginal probability, the tables below show the result of marginalizing B and D out of P(A,B,C,D), where A, B, C, and D are all discrete variables, each with two states. To calculate the first entry, we must sum all the entries in P(A,B,C,D) where A = True and C = True. Therefore P(A = True, C = True) = 0.0036 + 0.0098 + 0.0256 + 0.0432 = 0.0822.

Joint probability distribution P(A,B,C,D)

| B     | C     | D     | A = True | A = False |
|-------|-------|-------|----------|-----------|
| True  | True  | True  | 0.0036   | 0.0054    |
| True  | True  | False | 0.0098   | 0.0252    |
| True  | False | True  | 0.0024   | 0.0486    |
| True  | False | False | 0.0042   | 0.1008    |
| False | True  | True  | 0.0256   | 0.0864    |
| False | True  | False | 0.0432   | 0.1728    |
| False | False | True  | 0.0064   | 0.2016    |
| False | False | False | 0.0048   | 0.2592    |

Marginal probability P(A,C) created by marginalizing B and D from P(A,B,C,D)

| C     | A = True | A = False |
|-------|----------|-----------|
| True  | 0.0822   | 0.2898    |
| False | 0.0178   | 0.6102    |
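
As a sanity check on the arithmetic above, here is a minimal Python sketch (again using plain dictionaries, with illustrative variable names) that marginalizes B and D out of P(A,B,C,D) and reproduces the marginal table.

```python
from itertools import product

# Build the joint distribution P(A,B,C,D), keyed by (A, B, C, D) tuples of booleans.
# The rows mirror the joint probability table above.
rows = [
    # (B,     C,     D,     P(A=True), P(A=False))
    (True,  True,  True,  0.0036, 0.0054),
    (True,  True,  False, 0.0098, 0.0252),
    (True,  False, True,  0.0024, 0.0486),
    (True,  False, False, 0.0042, 0.1008),
    (False, True,  True,  0.0256, 0.0864),
    (False, True,  False, 0.0432, 0.1728),
    (False, False, True,  0.0064, 0.2016),
    (False, False, False, 0.0048, 0.2592),
]
joint = {}
for b, c, d, p_a_true, p_a_false in rows:
    joint[(True, b, c, d)] = p_a_true
    joint[(False, b, c, d)] = p_a_false

# Marginalize B and D out: sum over their states for each combination of A and C.
p_ac = {}
for (a, b, c, d), prob in joint.items():
    p_ac[(a, c)] = p_ac.get((a, c), 0.0) + prob

for a, c in product([True, False], repeat=2):
    print(f"P(A={a}, C={c}) = {p_ac[(a, c)]:.4f}")
# e.g. P(A=True, C=True) = 0.0822, matching the marginal table above.
```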