Add nodes from data
Nodes can be added to a network based on values in a data source such as a database or spreadsheet.
- Discrete variables
- Continuous variables
- Missing data
- DBN variables
Nodes are added to a network by defining variables.
To change the options for more than one variable at once, select the variables in the grid and click Edit Selected.
By default, each defined variable will generate a new node, but if two or more variable definitions share the same node name, a single node will be generated containing multiple variables. It is also possible to add variables to existing nodes.
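The grouping rule above can be illustrated with a minimal sketch. This is not the product's API: the function `group_variables_by_node` and the `(variable name, node name)` pair layout are hypothetical, chosen only to show how definitions that share a node name end up in one node.

```python
def group_variables_by_node(definitions):
    """Group hypothetical (variable_name, node_name) pairs by node name.

    Returns a dict mapping each node name to the list of variables it
    contains, in the order the definitions were given.
    """
    nodes = {}
    for variable_name, node_name in definitions:
        # A new node is created the first time a node name is seen;
        # later definitions with the same node name join that node.
        nodes.setdefault(node_name, []).append(variable_name)
    return nodes


# Illustrative definitions: X and Y share the node name "Position".
definitions = [("X", "Position"), ("Y", "Position"), ("Color", "Color")]
nodes = group_variables_by_node(definitions)
```

Here three variable definitions produce two nodes: one multi-variable node named "Position" and one single-variable node named "Color".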
Sometimes it is useful to discretize continuous data, generating a discrete variable in which each state represents a continuous interval.
Discrete variables whose state values are intervals can be created manually, but it is often useful to generate them from a data source. To generate a discretized variable, set the Discretization option to one of the discretization algorithms, and update the DiscretizationOptions if required.
The following discretization algorithms are available:
- Clustering - uses a probabilistic clustering algorithm to determine the intervals, based on cluster centers.
- Equal frequencies - defines intervals such that each one represents a similar number of items from the data source.
- Equal ranges - defines intervals by splitting the range of continuous values into intervals of equal width.
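To make the difference between the last two algorithms concrete, the following is a minimal sketch in plain Python. It is an illustration under simple assumptions (no missing values, at least as many items as intervals), not the implementation used by the product; the function names are hypothetical. Clustering is not sketched here.

```python
from bisect import bisect_right


def equal_range_edges(values, bins):
    """Interval edges that split [min, max] into `bins` equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [lo + i * width for i in range(bins)] + [hi]


def equal_frequency_edges(values, bins):
    """Interval edges chosen so each interval holds a similar number of items."""
    ordered = sorted(values)
    n = len(ordered)
    inner = [ordered[(i * n) // bins] for i in range(1, bins)]
    return [ordered[0]] + inner + [ordered[-1]]


def counts_per_interval(values, edges):
    """Count the items falling in each interval defined by `edges`."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        # Intervals are half-open [a, b), except the last, which includes its upper edge.
        i = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    return counts


# Skewed illustrative data: most values are small, a few are large.
values = [1.0, 2.0, 2.5, 3.0, 4.0, 10.0, 20.0, 50.0]

range_edges = equal_range_edges(values, 4)
freq_edges = equal_frequency_edges(values, 4)
range_counts = counts_per_interval(values, range_edges)
freq_counts = counts_per_interval(values, freq_edges)
```

On skewed data such as this, equal ranges tends to place most items in the first interval and can leave some intervals empty, whereas equal frequencies yields narrow intervals where the data is dense and wide intervals where it is sparse.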