Tutorial 2 - Mixture model

In this tutorial we will build a model of the position data shown in the chart below.

The following concepts will be covered:

  • Continuous variables
  • Nodes with multiple variables
  • Creating a Data Connection
  • Learning parameters from data

Mixture model chart

Bayes Server must be installed before starting this tutorial. An evaluation version can be downloaded from the Downloads page.

Companion video (No Audio)

Creating the model

We will create a Bayesian network that is equivalent to a mixture model (also known as a cluster model), as shown below.

Position mixture model

Add the nodes

  • Click New on the File tab to create a new empty network.

  • To add the Cluster node, click Node on the Network tab, Editing group, to create a new node. This will launch the New node window.

  • Enter Cluster in the Name text box.

  • In the Variable section, click Add New on the toolbar to add an additional state, giving three states in total.

  • Rename the states to the following values by clicking the name of each state and typing the new name:

    • Cluster1
    • Cluster2
    • Cluster3

    The New node window should look like this:

    Mixture model - new cluster node

  • Click the OK button to create the new node.

  • To add the Position node, click Node on the Network tab, Editing group, to create a new node. This will launch the New node window.

  • Enter Position in the Name text box.

  • This node will contain two continuous variables, so in the Variable section, click the Multiple tab.

  • Click the Add Continuous toolbar button twice, to add two new continuous variables to the node.

  • Rename the new variables to the following by clicking the name of each new variable and typing the new name:

    • X
    • Y

    The New node window should look like this:

    Mixture model - new position node

  • Click the OK button to create the new node.

Add the link

  • To add a new link, click the Link button on the Network tab, Editing group. This will launch the New link window.
  • Select Cluster as the From node.
  • Select Position as the To node.
  • Click the OK button to create the new link.
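With the link from Cluster to Position in place, the network represents the joint distribution as P(Cluster) multiplied by a density over (X, Y) for each cluster state. The sketch below illustrates that factorization in plain Python, assuming each cluster's (X, Y) density is a bivariate Gaussian. The function names and all parameter values are hypothetical and chosen only for illustration; they are not the values Bayes Server will learn, and this is not Bayes Server's API.

```python
import math

def gaussian2d_pdf(x, y, mean, cov):
    """Density of a bivariate Gaussian at (x, y).

    mean is (mx, my); cov is ((sxx, sxy), (sxy, syy)).
    """
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x - mean[0], y - mean[1]
    # Quadratic form d^T * inverse(cov) * d, using the closed-form 2x2 inverse.
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

def joint_density(k, x, y, weights, means, covs):
    """p(Cluster = k, X = x, Y = y) = P(Cluster = k) * p(x, y | Cluster = k)."""
    return weights[k] * gaussian2d_pdf(x, y, means[k], covs[k])

def mixture_density(x, y, weights, means, covs):
    """p(x, y): the joint density summed over the cluster states."""
    return sum(joint_density(k, x, y, weights, means, covs)
               for k in range(len(weights)))

# Hypothetical parameters for Cluster1, Cluster2 and Cluster3.
weights = [0.5, 0.25, 0.25]
means = [(1.0, 8.0), (2.0, 2.0), (8.0, 3.0)]
covs = [((1.0, 0.0), (0.0, 1.0))] * 3

print(mixture_density(1.0, 8.0, weights, means, covs))
```

Summing the joint density over the three cluster states gives the density of a position on its own, which is why this network is equivalent to a mixture model.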

Learning the distributions

We could now enter the distribution parameters for the nodes manually using the Distribution editor; however, in this tutorial we will learn the parameter values from data.

For convenience, we will use Microsoft Excel as the data source; however, a database can be substituted.

Although Microsoft Excel is a convenient way of storing data, in practice we recommend using a database as the data source.

Adding a data connection

Note: You can skip this step, and instead use the pre-installed Tutorial data connection (Walkthrough Data in earlier versions).

  • Select the data (including the header) in the Data section at the end of this tutorial and copy it to the clipboard (Ctrl+C).
  • Open Microsoft Excel and paste the data into a new Microsoft Excel spreadsheet (Ctrl+V).
  • Save the new spreadsheet.
  • In Bayes Server, click the Data Connections button on the Data tab, Data Sources group. This will launch the Data connection manager.
  • Click the New button on the toolbar. This will launch the Data connection editor.
  • In the list of data providers, select the appropriate Excel Driver for the version of Microsoft Excel you are using.
  • Next to the File Name text box, click the Ellipsis (...) button, and select the Microsoft Excel spreadsheet created in an earlier step.
  • Click the Test Connection button, to ensure the new data connection is working.
  • Click OK to add the new Data Connection.
  • Click OK to close the Data connection manager.
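If you want to check the copied data before pasting it into Excel, the layout is simple: a header row containing X and Y, followed by one pair of numbers per row. The following is a minimal sketch that parses text in that layout. The function name parse_xy is hypothetical, and the sample text is just the first three rows of the Data section.

```python
def parse_xy(text):
    """Parse whitespace-separated text with an 'X Y' header into (x, y) tuples."""
    lines = [line.split() for line in text.strip().splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    if header != ["X", "Y"]:
        raise ValueError(f"unexpected header: {header}")
    # Unpacking raises ValueError if a row does not have exactly two fields.
    return [(float(x), float(y)) for x, y in rows]

# The first three rows of the Data section, as copied to the clipboard.
sample = """X Y
0.176502224 7.640580199
1.308020831 8.321963251
7.841271129 3.34044587
"""

points = parse_xy(sample)
print(len(points), points[0])
```

Note that the data has no Cluster column. The cluster of each position is never observed, which is why it has to be inferred during learning.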

Parameter learning

  • Click the Parameter Learning button, on the Data tab, Network group. This will launch the Data tables window.

  • In the Data Connection drop-down, select the new Data Connection created in an earlier step, or the Tutorial data connection if you skipped that step. This should enable the Data drop-down.

  • In the Data drop-down, select the worksheet that contains the data. (If the data is on the first worksheet, select Sheet1$.) If you are using the pre-installed Tutorial data connection, select Tutorial 2 - Mixture model.

  • Click the OK button. This will launch the Data map window.

  • In the Data map window, ensure that variable X has automatically been mapped to column X, and variable Y has automatically been mapped to column Y.

    The window should look like this:

    Mixture model data map

  • Click the OK button. This will launch the Parameter learning wizard.

  • Click Next in the wizard, accepting all the default settings, until you reach the Run page. Click the Run button to start learning.

  • When learning has completed, click the Finish button on the wizard. This will launch the Candidate Networks window.

  • Click the OK button in the Candidate Networks window.

  • The network distributions have now been learned.

  • Click the All button with the green arrow at the bottom of the network viewer, or press F6, to query all the nodes in the network.
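Because Cluster does not appear in the data, its distribution and the per-cluster distributions of X and Y have to be estimated together. A common way to do this for mixture models is the Expectation Maximization (EM) algorithm. The sketch below is a generic, plain-Python version for bivariate Gaussian clusters. It is not Bayes Server's implementation, the function names are hypothetical, the starting means are hand-picked guesses, and it has no safeguards against degenerate covariances. The sample points are fifteen rows taken from the Data section.

```python
import math

def pdf(p, mean, cov):
    """Bivariate Gaussian density at point p, with cov = (sxx, sxy, syy)."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

def em_step(points, weights, means, covs):
    """One EM iteration. Returns (weights, means, covs, log_likelihood),
    where log_likelihood is evaluated under the parameters passed in."""
    n, k = len(points), len(weights)
    # E-step: probability of each cluster given each point.
    resp, log_lik = [], 0.0
    for p in points:
        joint = [weights[j] * pdf(p, means[j], covs[j]) for j in range(k)]
        total = sum(joint)
        log_lik += math.log(total)
        resp.append([v / total for v in joint])
    # M-step: re-estimate each cluster from its weighted points.
    new_w, new_m, new_c = [], [], []
    for j in range(k):
        r = [row[j] for row in resp]
        nj = sum(r)
        mx = sum(ri * p[0] for ri, p in zip(r, points)) / nj
        my = sum(ri * p[1] for ri, p in zip(r, points)) / nj
        sxx = sum(ri * (p[0] - mx) ** 2 for ri, p in zip(r, points)) / nj
        syy = sum(ri * (p[1] - my) ** 2 for ri, p in zip(r, points)) / nj
        sxy = sum(ri * (p[0] - mx) * (p[1] - my) for ri, p in zip(r, points)) / nj
        new_w.append(nj / n)
        new_m.append((mx, my))
        new_c.append((sxx, sxy, syy))
    return new_w, new_m, new_c, log_lik

def fit(points, weights, means, covs, iterations=50):
    """Run a fixed number of EM iterations, recording the log-likelihood."""
    history = []
    for _ in range(iterations):
        weights, means, covs, log_lik = em_step(points, weights, means, covs)
        history.append(log_lik)
    return weights, means, covs, history

# Fifteen rows from the Data section.
points = [
    (0.176502224, 7.640580199), (1.308020831, 8.321963251),
    (0.292639161, 9.070469416), (1.717525934, 6.509707265),
    (2.623799516, 6.667664279), (2.567398263, 3.338528008),
    (3.258072314, 1.451124598), (1.474811913, 1.942052919),
    (0.061834893, 2.071499561), (2.587010436, 1.599672084),
    (7.841271129, 3.34044587), (8.617288623, 3.319091539),
    (10.18819907, 3.414009144), (9.796154937, 4.335498562),
    (8.793496377, 3.811848391),
]
# Hypothetical starting values: equal weights, guessed means, unit covariances.
init_means = [(1.0, 8.0), (2.0, 2.0), (8.0, 3.0)]
weights, means, covs, history = fit(
    points, [1.0 / 3.0] * 3, init_means, [(1.0, 0.0, 1.0)] * 3)
print(weights)
print(means)
```

Each iteration cannot decrease the log-likelihood of the data, so the history list gives a simple way to check that the sketch is converging.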

Visualizing the network distributions

Creating a Custom Query

  • Click the Custom Query button on the Query tab, Query group. This will launch the Custom query window.

  • Select all three variables (Cluster, X and Y), and click the Add Head button.

    The window should look like this:

    Mixture model custom query

  • Click the Query button. This will launch the Query distribution window.

Charting the distributions with data

  • Click the Plot button. This will launch the Query Plot options window.

  • Ensure the Overlay data check-box is checked.

    The window should look like this:

    Mixture model query plot options

  • Click OK. This will launch the Data tables window.

  • In the Data Connection drop-down, select the new Data Connection created in an earlier step, or the Tutorial data connection if you skipped that step. This should enable the Data drop-down.

  • In the Data drop-down, select the worksheet that contains the data. (If the data is on the first worksheet, select Sheet1$.) If you are using the pre-installed Tutorial data connection, select Tutorial 2 - Mixture model.

  • Click the OK button. This will launch the Data map window.

  • In the Data map window, ensure that variable X has automatically been mapped to column X, and variable Y has automatically been mapped to column Y.

  • Click the OK button. This will launch the Data Plot options window.

  • Click OK to chart the distribution and data.

    The chart should look similar to this:

    Mixture model distribution chart
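A common way to draw a bivariate Gaussian over a scatter of data is as an ellipse of constant Mahalanobis distance from the mean. Whether the Bayes Server chart uses exactly this contour is an assumption. The sketch below computes the outline points for one cluster from its mean and covariance, so that they could be passed to any plotting tool. The function name and the example parameter values are hypothetical.

```python
import math

def covariance_ellipse(mean, cov, n_std=2.0, n_points=60):
    """Points on the contour at Mahalanobis distance n_std from the mean.

    mean is (mx, my); cov is ((sxx, sxy), (sxy, syy)).
    """
    (sxx, sxy), (_, syy) = cov
    # Eigenvalues of a symmetric 2x2 matrix, in closed form.
    half_trace = 0.5 * (sxx + syy)
    root = math.sqrt((0.5 * (sxx - syy)) ** 2 + sxy * sxy)
    lam1, lam2 = half_trace + root, half_trace - root
    # Angle of the major axis relative to the X axis.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    a, b = n_std * math.sqrt(lam1), n_std * math.sqrt(lam2)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    points = []
    for i in range(n_points):
        t = 2.0 * math.pi * i / n_points
        # Point on an axis-aligned ellipse, then rotated by theta.
        ex, ey = a * math.cos(t), b * math.sin(t)
        points.append((mean[0] + ex * cos_t - ey * sin_t,
                       mean[1] + ex * sin_t + ey * cos_t))
    return points

# Hypothetical mean and covariance for one cluster.
outline = covariance_ellipse((1.2, 7.6), ((0.8, -0.6), (-0.6, 0.9)))
print(len(outline), outline[0])
```

A negative covariance between X and Y, as in this example, tilts the ellipse so that it runs from upper left to lower right.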

Data

X Y
0.176502224 7.640580199
1.308020831 8.321963251
7.841271129 3.34044587
2.623799516 6.667664279
8.617288623 3.319091539
0.292639161 9.070469416
1.717525934 6.509707265
0.347388367 9.144193334
4.332228381 0.129103276
0.550570479 9.925610034
10.18819907 3.414009144
9.796154937 4.335498562
4.492011746 0.527572356
8.793496377 3.811848391
0.479689038 8.041976487
0.460045193 10.74481444
3.249955813 5.58667984
1.677468832 8.742639202
2.567398263 3.338528008
8.507535409 3.358378353
8.863647208 3.533757566
-0.612339597 11.27289689
10.38075113 3.657256133
9.443691262 3.561824026
1.589644185 7.936062309
7.680055137 2.541577306
1.047477704 6.382052946
0.735659679 8.029083014
0.489446685 11.40715477
3.258072314 1.451124598
0.140278917 7.78885888
9.237538442 2.647543473
2.28453948 5.836716478
7.22011534 1.51979264
1.474811913 1.942052919
1.674889251 5.601765101
1.30742068 6.137114076
6.957133145 3.957540541
10.87472856 5.598949484
1.110499364 9.241584372
7.233905739 2.322237847
7.474329505 2.920099189
0.455631413 7.356350266
1.234318558 6.592203772
10.72837103 5.371838788
0.655168407 6.713544957
2.001307579 5.30283356
0.061834893 2.071499561
1.86460938 6.013710897
9.35680964 3.719046646
-0.008787992 7.387352578
0.610918535 8.343425847
-0.238965542 9.89893411
1.940925093 6.209752266
1.333199057 7.59848403
8.484655224 3.073253305
1.364358184 5.975527829
10.72748994 4.134446075
2.046614845 7.437682289
1.662951156 6.370669577
3.162551343 4.864600865
2.789107868 6.143289172
2.587010436 1.599672084
1.470218845 8.656125114
1.409410007 0.992888942
0.919912218 7.052651078
8.778925691 3.704669502
1.117567765 1.993522613
-0.144489104 11.53479807
5.284863514 1.489314676
2.663178432 3.177481897
2.011776623 7.897365033
1.464680213 1.528483262
0.158678139 8.908835673
0.214401967 7.292093447
2.402088546 2.362154057
2.378733602 2.551873091
0.827701089 10.69624252
0.395016071 8.305645848
2.121369004 2.815448463
7.169453919 1.905806566
0.601520002 9.785279396
1.586490061 6.726857095
1.439861204 7.014361866
-0.686397699 9.77425866
0.801845144 6.804976671
1.137477302 0.225502899
1.921361226 1.7909808
5.173856001 0.823420025
1.05037942 9.453914186
1.111047436 2.742705875
2.474067445 2.812341558
6.804286117 1.95500379
0.819301796 11.32009105
2.617654071 2.264324097
1.027061887 8.046658989
6.149555065 1.891771827
8.034889849 3.017307147
-0.687339329 9.029022351
5.263367085 1.680024298
1.195130289 6.286967967
1.731321429 7.390609268
1.551609971 6.336140355
-0.520890154 10.19932065
3.428134061 4.078913686
2.572038859 3.089414526
-0.399780876 9.469230934
1.718614257 7.038834914
11.43348179 5.957011929
3.957769968 3.837668973
1.397437302 6.075260837
0.641022085 7.121565252
0.898241907 9.891320102
7.881545649 2.520490475
0.466133925 6.923223388
1.083456697 0.744882546
1.676386764 7.311645991
7.09954842 1.896511497
0.064081268 9.625098882
0.934196198 1.707061134
1.773186249 7.078639821
2.614429517 8.186884596
1.588726807 3.189121762
2.576481413 8.338793925
1.493343882 7.817329126
1.040380815 7.019325225
1.2238645 7.52380108
0.564538219 8.554700627
7.522790642 2.205814269
7.221158478 1.881945286
6.155353437 2.956359604
0.543960458 9.876015061
2.969310521 1.689421147
1.130952545 7.323147618
1.637814935 7.505231003
3.0319218 1.806998212
-0.691891787 10.68314634
-0.172458962 11.0038236
6.382030326 1.529850265
7.081234369 3.018901732
6.420440542 1.710179248
2.567996925 2.175798021
5.484764693 1.249988782
2.169826086 1.457485314
2.666166169 3.006020372
1.255748487 8.172890601
1.110450177 6.909645674
0.64221948 7.115968797
7.382062636 2.885279632
-0.390488356 10.58445538
8.673875474 4.606369236
2.703825246 4.532865095
0.256417369 8.637987542
2.171303599 2.887466856
0.76946757 9.931359151
1.429713914 7.061909133
1.916059822 0.411361527
2.069221406 1.169196508
2.443632587 1.633471641
8.651228033 3.478728796
5.094879523 0.618202099
1.197475474 7.806661104
2.721229947 -1.040833105
0.649135449 6.703355477
0.955899266 10.0812704
-0.05107945 9.412982102
2.09150178 7.85570867
8.496542476 2.72631079
6.129258907 1.391867166
1.415242321 8.69036834
-0.181799846 9.564270677
2.147903749 5.971313108
7.429093289 1.837920789
0.258858273 8.36201855
0.436279242 7.122238994
2.400524268 9.484131132
8.949800461 4.050725157
1.377808913 9.131672137
0.488721438 6.24375667
0.938826647 6.533751708
1.609019133 9.491402761
7.686040142 2.571497086
7.913477158 2.973634152
-0.141689656 7.490501119
1.54214829 1.462388521
8.836690062 3.323118698
1.292553241 7.696934647
1.338461668 0.916163751
2.223196493 7.092454045
7.283823688 2.739494961
3.118964374 5.500739786
0.807728186 6.844431805
0.670279272 10.92590148
9.12996622 3.917329879
3.546742416 1.337113351
0.766419935 7.261302999
12.55210397 5.948973499
0.376685504 9.865645387
0.890836141 6.556401491
7.597140488 1.163621719