Mixture models as Bayesian networks
Mixture models are simple Bayesian networks, and therefore we can represent them
graphically as shown in Image 2.
Image 2 - Bayesian network mixture model
The node named cluster is a discrete variable, with a number of discrete states,
each representing an individual cluster. Each state has a probability associated
with it, which tells us how much support there was for a cluster during learning.
The node named X contains four continuous variables X1, X2, X3 and X4. The distribution
assigned to node X is a multivariate Gaussian distribution, one for each state of
node cluster. Therefore our Mixture model is a mixture (collection) of multivariate
Gaussians. Since this is just a Bayesian network, the probability distribution of
the model is the product of the probabilities of each node given its parents,
i.e. P(Cluster)P(X1,X2,X3,X4|Cluster).
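The factorization above can be sketched in a few lines of Python. This is only an illustration, assuming NumPy is available; the two clusters and all of their parameters are made-up values, not taken from any learned model, and the function names are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate Gaussian N(mean, cov) evaluated at x."""
    d = len(mean)
    diff = x - mean
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm)

# Hypothetical parameters: 2 clusters over X1..X4 (illustrative values only).
p_cluster = np.array([0.6, 0.4])                 # P(Cluster)
means = [np.zeros(4), np.full(4, 3.0)]           # one mean vector per cluster
covs = [
    np.eye(4),                                   # cluster 0: unit variances
    np.full((4, 4), 0.5) + 1.5 * np.eye(4),      # cluster 1: correlated variables
]

def joint_density(x, k):
    """P(Cluster = k) * P(X1, X2, X3, X4 | Cluster = k)."""
    return float(p_cluster[k]) * gaussian_pdf(x, means[k], covs[k])

def mixture_density(x):
    """Density of x under the whole mixture: the joint summed over clusters."""
    return sum(joint_density(x, k) for k in range(len(p_cluster)))

x = np.array([0.5, -0.2, 0.1, 0.3])
print(joint_density(x, 0), joint_density(x, 1), mixture_density(x))
```

Summing the joint over the states of Cluster gives the density of a data point under the mixture as a whole.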
There are other ways a Mixture model can be represented as a Bayesian network. Image
3 shows a model which is equivalent to the model in Image 2, but has only a
single variable per node.
Image 3 - Alternative Bayesian network mixture model
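One way to see why a model with one variable per node can be equivalent is the chain rule: a multivariate Gaussian can be rewritten as a product of univariate Gaussians, each conditioned on the variables before it. The sketch below checks this numerically for a single cluster and only two variables. It assumes the nodes in the alternative model are linked to one another as well as to the cluster; the parameter values and function names are hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def joint_pdf_2d(x1, x2, m1, m2, s11, s12, s22):
    """Bivariate Gaussian density with covariance [[s11, s12], [s12, s22]]."""
    det = s11 * s22 - s12 ** 2
    d1, d2 = x1 - m1, x2 - m2
    quad = (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

def factored_pdf_2d(x1, x2, m1, m2, s11, s12, s22):
    """The same density written as P(X1) * P(X2 | X1)."""
    p_x1 = normal_pdf(x1, m1, s11)
    cond_mean = m2 + (s12 / s11) * (x1 - m1)   # mean of X2 given X1 = x1
    cond_var = s22 - s12 ** 2 / s11            # variance of X2 given X1
    return p_x1 * normal_pdf(x2, cond_mean, cond_var)

# Illustrative parameters for one cluster (not taken from the article).
params = dict(m1=1.0, m2=-2.0, s11=2.0, s12=0.8, s22=1.5)
a = joint_pdf_2d(0.3, -1.1, **params)
b = factored_pdf_2d(0.3, -1.1, **params)
print(a, b)  # the two values agree up to floating-point error
```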
Image 4 shows a Mixture model in which each continuous variable is independent of
the other continuous variables given the cluster. This model has fewer parameters,
but cannot represent the rotations of ellipses shown in Image 1. A model such as
this is termed a diagonal model, because if you constructed a multivariate Gaussian
over the continuous variables for a given cluster, every off-diagonal entry of the
covariance matrix would be zero, leaving only the variances on the diagonal.
Image 4 - Diagonal Bayesian network mixture model
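To make the saving in parameters concrete, the sketch below counts free parameters for a full-covariance mixture and for a diagonal one. The function name and the choice of three clusters are hypothetical; the count assumes one mean vector and one covariance matrix per cluster, plus the cluster probabilities.

```python
def mixture_parameter_count(num_clusters, num_variables, diagonal):
    """Free parameters in a Gaussian mixture with one Gaussian per cluster."""
    # Cluster probabilities sum to 1, so the last one is determined by the rest.
    cluster_params = num_clusters - 1
    mean_params = num_variables
    if diagonal:
        # Only the variances on the diagonal.
        cov_params = num_variables
    else:
        # A symmetric matrix: the diagonal plus one triangle.
        cov_params = num_variables * (num_variables + 1) // 2
    return cluster_params + num_clusters * (mean_params + cov_params)

full = mixture_parameter_count(3, 4, diagonal=False)
diag = mixture_parameter_count(3, 4, diagonal=True)
print(full, diag)  # 44 26
```

The gap widens quickly as variables are added, since the full covariance grows quadratically with the number of variables while the diagonal grows linearly.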
Unlike some clustering techniques where a data point belongs to only a single cluster
(hard clustering), probabilistic mixture models use what is known as soft clustering,
i.e. each point belongs to each cluster with a probability. For any given point,
these probabilities sum to 1.
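The cluster membership probabilities follow from Bayes' rule: P(Cluster | x) is proportional to P(Cluster)P(x | Cluster). The sketch below uses a single continuous variable and two clusters to keep things short; the weights, means and variances are hypothetical values chosen for illustration.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def cluster_posteriors(x, weights, means, variances):
    """P(Cluster = k | x) for every cluster k, via Bayes' rule."""
    joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)  # P(x), the density of x under the mixture
    return [j / total for j in joint]

# Illustrative two-cluster mixture over one variable.
weights = [0.5, 0.5]
means = [0.0, 4.0]
variances = [1.0, 1.0]

near_first = cluster_posteriors(0.0, weights, means, variances)
midway = cluster_posteriors(2.0, weights, means, variances)
print(near_first, sum(near_first))
print(midway, sum(midway))
```

A point close to one cluster's mean gets almost all of its probability from that cluster, while a point midway between two equally weighted clusters with equal variances is shared evenly between them. A hard assignment can still be recovered by picking the cluster with the highest probability.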
The node named cluster is sometimes called a latent node, because we do not have
data associated with it during learning.