A marginal probability distribution is formed from a larger joint probability distribution by summing out the variables we are not interested in.
Consider the joint probability over the variables Raining and Windy shown below:
| Raining | Windy = False | Windy = True |
|---|---|---|
If someone asks us for the probability of rain, we need P(Raining), not the joint P(Raining, Windy).
To calculate P(Raining), we simply sum over the values of Windy, once for Raining = False and once for Raining = True, as shown below.
| Raining | Windy = False | Windy = True | Sum |
|---|---|---|---|
This is known as marginalization, and the result we were after is shown below.
| Raining = False | Raining = True |
|---|---|
For discrete variables we sum, whereas for continuous variables we integrate.
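For the discrete case, marginalization is just a sum along one axis of the joint table. A minimal NumPy sketch (the probability values below are hypothetical placeholders, not the table from the text):

```python
import numpy as np

# Hypothetical joint distribution P(Raining, Windy).
# Rows: Raining = False, Raining = True
# Columns: Windy = False, Windy = True
joint = np.array([[0.55, 0.15],
                  [0.10, 0.20]])

# Marginalize Windy out by summing across columns (axis 1),
# leaving P(Raining).
p_raining = joint.sum(axis=1)
print(p_raining)  # [P(Raining = False), P(Raining = True)]
```

Summing over axis 0 instead would marginalize Raining out and give P(Windy).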
The term marginal is thought to have arisen because joint probability tables were summed along rows or columns, and the result written in the margins of the table.
As a more involved example of calculating a marginal probability, the tables below show the result of marginalizing B and D out of P(A,B,C,D), where A, B, C, D are all discrete variables each with two states. To calculate the first entry, we must sum all the entries in P(A,B,C,D) where A = True and C = True. Therefore P(A = True,C = True) = 0.0036 + 0.0098 + 0.0256 + 0.0432 = 0.0822.
| B | C | D | A = True | A = False |
|---|---|---|---|---|
| C | A = True | A = False |
|---|---|---|
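The same computation can be sketched in NumPy: with the axes of the joint array ordered (A, B, C, D), marginalizing B and D out is a sum over axes 1 and 3. The first assertion checks the worked sum quoted above; the joint values used for the axis-sum demonstration are hypothetical placeholders, not the table from the text.

```python
import numpy as np

# Check the worked example: the four entries of P(A,B,C,D)
# with A = True and C = True sum to 0.0822.
entries = [0.0036, 0.0098, 0.0256, 0.0432]
assert round(sum(entries), 4) == 0.0822

# Generic marginalization over B and D. The joint below is a
# hypothetical normalized array, standing in for P(A,B,C,D).
rng = np.random.default_rng(0)
joint = rng.random((2, 2, 2, 2))
joint /= joint.sum()                  # normalize to a valid distribution

# Axes are ordered (A, B, C, D); summing over axes 1 and 3
# removes B and D, leaving P(A, C) with shape (2, 2).
marginal_ac = joint.sum(axis=(1, 3))
print(marginal_ac.shape)
```

Each entry of `marginal_ac` is the sum of the four joint entries that share the same values of A and C, exactly as in the worked calculation above.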