Base Non-binary node functionality

Optional: create a conda environment, activate it, and install the package:

conda create -y -n conda_nbnode python=3.8
conda activate conda_nbnode
git clone https://github.com/ggrlab/nbnode
cd nbnode
pip install --upgrade pip
pip install .

The core functionality of the package is to provide non-binary trees, that is, trees whose nodes can have any number of children. The following tree has a root node a with three children a0, a1 and a2. a1 is the only child that has a child of its own, a1a.

a
├── a0
├── a1
│   └── a1a
└── a2

A basic non-binary node (NBNode) consists of four important attributes:

- ``name`` The name of the node. This is the only mandatory attribute.
- ``parent`` The parent node of this node.
- ``decision_name`` The name of the value leading to this node.
- ``decision_value`` The value leading to this node.

The name of a node only has to be unique among the children of its parent node. decision_name and decision_value describe the named value that leads to this node. Note that decision_name must be a string, whereas decision_value can be of any type, such as a string, an integer, or a float.
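The four attributes can be pictured with a small stand-in class. This is a minimal sketch, not the NBNode implementation: the class name SketchNode, its children list, and the duplicate-name check that raises ValueError are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass(eq=False)
class SketchNode:
    """Hypothetical stand-in for illustration; not the NBNode class itself."""

    name: str  # the only mandatory attribute
    parent: Optional["SketchNode"] = None
    decision_name: Optional[str] = None
    decision_value: Any = None  # any type is allowed here
    children: List["SketchNode"] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.parent is None:
            return
        # Assumption of this sketch: reject a second sibling with the same name.
        if any(child.name == self.name for child in self.parent.children):
            raise ValueError(f"duplicate child name: {self.name!r}")
        self.parent.children.append(self)


root = SketchNode("a")
SketchNode("a0", parent=root, decision_name="m1", decision_value=-1)
a1 = SketchNode("a1", parent=root, decision_name="m1", decision_value=1)
SketchNode("a2", parent=root, decision_name="m3", decision_value="another")
# The name "a0" can be reused under a different parent.
SketchNode("a0", parent=a1, decision_name="m2", decision_value="test")

print([child.name for child in root.children])  # ['a0', 'a1', 'a2']
```

Names are checked only against siblings, so the second "a0" under a1 is accepted while a second "a0" directly under the root would be rejected.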

To build the tree above, we can use the following code:

[45]:

from nbnode.nbnode import NBNode
simple_tree = NBNode("a")
NBNode("a0", parent=simple_tree, decision_value=-1, decision_name="m1")
a1 = NBNode("a1", parent=simple_tree, decision_value=1, decision_name="m1")
NBNode("a2", parent=simple_tree, decision_value="another", decision_name="m3")
NBNode("a1a", parent=a1, decision_value="test", decision_name="m2")

[45]:

NBNode('/a/a1/a1a', counter=0, decision_name='m2', decision_value='test')

We can check if the previous tree was built correctly:

[46]:

simple_tree.pretty_print()

a (counter:0)
├── a0 (counter:0)
├── a1 (counter:0)
│   └── a1a (counter:0)
└── a2 (counter:0)

And we can show additional information about each node of the tree:

[53]:

simple_tree.pretty_print("__long__")

a (counter:0, decision_name:None, decision_value:None)
├── a0 (counter:0, decision_name:m1, decision_value:-1)
├── a1 (counter:0, decision_name:m1, decision_value:1)
│   └── a1a (counter:0, decision_name:m2, decision_value:test)
└── a2 (counter:0, decision_name:m3, decision_value:another)

[ ]:

# Alternatively, we prepared the tree already for you:
import nbnode.nbnode_trees as nbtree
simple_tree = nbtree.tree_simple()
simple_tree.pretty_print("__long__")

Finally, we use the tree to predict the end node of a new data point. The data point is supplied as two lists, values and names:

[48]:

single_prediction = simple_tree.predict(
    values=[1, "test", 2], names=["m1", "m2", "m3"]
)
print(single_prediction)

NBNode('/a/a1/a1a', counter=0, decision_name='m2', decision_value='test')

This returns the NBNode object identified by the values. NBNode can additionally handle the following data types:

[49]:

print("\nDictionary")
value_dict = {"m1": 1, "m2": "test", "m3": 2}
print(value_dict)
pred_dict = simple_tree.predict(values=value_dict)
print("Prediction: ")
print(pred_dict)


Dictionary
{'m1': 1, 'm2': 'test', 'm3': 2}
Prediction:
NBNode('/a/a1/a1a', counter=0, decision_name='m2', decision_value='test')

[50]:

print("\nPandas DataFrame")
import pandas as pd
value_df = pd.DataFrame.from_dict([value_dict])
print(value_df)
print("\nPrediction: ")
pred_df = simple_tree.predict(values=value_df)
print(pred_df)


Pandas DataFrame
   m1    m2  m3
0   1  test   2

Prediction:
0    (((NBNode('/a/a1/a1a', counter=0, decision_nam...
dtype: object

[51]:

print("\nNumpy array: Only for numerical values")
import numpy as np
values_np = np.array([[-1, 0, 0]])
print(values_np)
pred_np = simple_tree.predict(values=values_np,  names=["m1", "m2", "m3"])
print(pred_np)


Numpy array: Only for numerical values
[[-1  0  0]]
0    (((NBNode('/a/a0', counter=0, decision_name='m...
dtype: object
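The predictions in this section can be pictured as a walk from the root that keeps following the child whose decision_name and decision_value match the data point. The following is a minimal sketch under stated assumptions: SketchNode, path and predict_sketch are hypothetical names, the first matching child wins, and stopping at a node that still has children raises ValueError. How NBNode itself resolves these cases is not shown here.

```python
from typing import Any, Dict, List, Optional


class SketchNode:
    """Hypothetical stand-in for illustration; not the NBNode class itself."""

    def __init__(
        self,
        name: str,
        parent: Optional["SketchNode"] = None,
        decision_name: Optional[str] = None,
        decision_value: Any = None,
    ) -> None:
        self.name = name
        self.parent = parent
        self.decision_name = decision_name
        self.decision_value = decision_value
        self.children: List["SketchNode"] = []
        if parent is not None:
            parent.children.append(self)

    def path(self) -> str:
        """Return the path from the root to this node, e.g. '/a/a1'."""
        prefix = self.parent.path() if self.parent is not None else ""
        return f"{prefix}/{self.name}"


def predict_sketch(root: SketchNode, values: Dict[str, Any]) -> SketchNode:
    """Follow exactly matching children until a node without children is reached."""
    node = root
    while node.children:
        match = next(
            (
                child
                for child in node.children
                if child.decision_name in values
                and values[child.decision_name] == child.decision_value
            ),
            None,
        )
        if match is None:
            # Assumption: an unfinished walk is an error in this sketch.
            raise ValueError(f"no fitting end node below {node.path()}")
        node = match
    return node


root = SketchNode("a")
SketchNode("a0", root, "m1", -1)
a1 = SketchNode("a1", root, "m1", 1)
SketchNode("a2", root, "m3", "another")
SketchNode("a1a", a1, "m2", "test")

end_node = predict_sketch(root, {"m1": 1, "m2": "test", "m3": 2})
print(end_node.path())  # /a/a1/a1a
```

Values that are not needed on the chosen path, such as m3 here, are simply ignored by the walk.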

NBNode basic methods

NBNode implements a number of basic methods:

[66]:

from nbnode.nbnode import NBNode
import nbnode.nbnode_trees as nbtree
simple_tree = nbtree.tree_simple()

# Print the tree
simple_tree.pretty_print("__long__")
# Print specific attributes of the tree as list
simple_tree.pretty_print(["counter"])
simple_tree.pretty_print(["decision_name", "decision_value"])
simple_tree.__dict__


# Access nodes
# Access a child of any (here root) node
simple_tree.children
a1 = simple_tree.children[1]
print(a1)

# You can also access nodes by their _full_ name
# The full name is the path from the root to the node, not the decision name or the node name
# You can retrieve the full name of a node by
print(a1.get_name_full())
# Mind the "/" ("root") at the beginning of the path
a1_by_name = simple_tree["/a/a1"]
print(a1_by_name)

# We can compare nodes! Here we have the exact same node, so it is identical.
assert a1_by_name == a1

a (counter:0, decision_name:None, decision_value:None)
├── a0 (counter:0, decision_name:m1, decision_value:-1)
├── a1 (counter:0, decision_name:m1, decision_value:1)
│   └── a1a (counter:0, decision_name:m2, decision_value:test)
└── a2 (counter:0, decision_name:m3, decision_value:another)
a (counter:0)
├── a0 (counter:0)
├── a1 (counter:0)
│   └── a1a (counter:0)
└── a2 (counter:0)
a (decision_name:None, decision_value:None)
├── a0 (decision_name:m1, decision_value:-1)
├── a1 (decision_name:m1, decision_value:1)
│   └── a1a (decision_name:m2, decision_value:test)
└── a2 (decision_name:m3, decision_value:another)
NBNode('/a/a1', counter=0, decision_name='m1', decision_value=1)
/a/a1
NBNode('/a/a1', counter=0, decision_name='m1', decision_value=1)
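The relationship between a node and its full name can be sketched as follows. This is an illustration only: SketchNode, full_name and find are hypothetical names, and the sketch makes no claim about how get_name_full or the bracket lookup are implemented in NBNode.

```python
from typing import List, Optional


class SketchNode:
    """Hypothetical stand-in for illustration; not the NBNode class itself."""

    def __init__(self, name: str, parent: Optional["SketchNode"] = None) -> None:
        self.name = name
        self.parent = parent
        self.children: List["SketchNode"] = []
        if parent is not None:
            parent.children.append(self)

    def full_name(self) -> str:
        """Join the names on the way up to the root, starting with '/'."""
        parts = []
        node: Optional["SketchNode"] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/" + "/".join(reversed(parts))

    def find(self, full_name: str) -> "SketchNode":
        """Resolve a full name, starting from this node as the root."""
        parts = full_name.strip("/").split("/")
        if parts[0] != self.name:
            raise KeyError(full_name)
        node = self
        for part in parts[1:]:
            node = next((c for c in node.children if c.name == part), None)
            if node is None:
                raise KeyError(full_name)
        return node


root = SketchNode("a")
SketchNode("a0", root)
a1 = SketchNode("a1", root)
a1a = SketchNode("a1a", a1)

print(a1a.full_name())            # /a/a1/a1a
print(root.find("/a/a1") is a1)   # True
```

Because sibling names are unique, each full name identifies exactly one node, which is what makes lookup by path unambiguous.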

Decision cutoffs

NBNode can also decide on continuous features by first splitting them at a cutoff.

[74]:

continuous_tree = NBNode("a")
NBNode("a0", parent=continuous_tree, decision_value=1, decision_name="m1", decision_cutoff=0.5)
NBNode("a1", parent=continuous_tree, decision_value=-1, decision_name="m1", decision_cutoff=0.5)
continuous_tree.pretty_print("__long__")

a (counter:0, decision_name:None, decision_value:None)
├── a0 (counter:0, decision_name:m1, decision_value:1)
└── a1 (counter:0, decision_name:m1, decision_value:-1)

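The continuous_tree above contains two child nodes, both of which decide on the value of m1, matching either 1 or -1. Additionally, each has a decision cutoff. Until now, NBNode needed an exact match of the decision value. With decision_cutoff, the value of the feature named by decision_name is first compared with the cutoff, which gives:

True if >= 0.5
False if < 0.5

In the output below, values at or above the cutoff end in a0 (decision_value 1) and values below it end in a1 (decision_value -1).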

[75]:

print(continuous_tree.predict(values=[0.6], names=["m1"]))
print(continuous_tree.predict(values=[0.4], names=["m1"]))

print(continuous_tree.predict(values=[1], names=["m1"]))
print(continuous_tree.predict(values=[-1], names=["m1"]))

print(continuous_tree.predict(values=[10], names=["m1"]))
print(continuous_tree.predict(values=[-10], names=["m1"]))

NBNode('/a/a0', counter=0, decision_name='m1', decision_value=1)
NBNode('/a/a1', counter=0, decision_name='m1', decision_value=-1)
NBNode('/a/a0', counter=0, decision_name='m1', decision_value=1)
NBNode('/a/a1', counter=0, decision_name='m1', decision_value=-1)
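The effect of the cutoff can be sketched as a small thresholding step that runs before the usual exact match. The function name apply_cutoff is hypothetical, and the mapping of the comparison result to 1 and -1 is an assumption inferred from the output above rather than taken from the NBNode source.

```python
def apply_cutoff(value: float, cutoff: float) -> int:
    """Map a continuous value to 1 (at or above the cutoff) or -1 (below it).

    Assumption: this 1 / -1 encoding is inferred from the example output.
    """
    return 1 if value >= cutoff else -1


for raw in [0.6, 0.4, 1, -1]:
    # The thresholded value is what gets compared with decision_value.
    print(raw, "->", apply_cutoff(raw, cutoff=0.5))
```

Under this reading, a raw value of 0.6 becomes 1 and therefore matches the node with decision_value=1, while 0.4 becomes -1 and matches the other node.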

Multiple decision values

Some nodes need more than a single value to decide on the end node. With NBNode, a node can decide on any number of features.

[85]:

from nbnode.nbnode import NBNode

mytree = NBNode("a")
# a0 =
NBNode("a0", parent=mytree, decision_value=-1, decision_name="m1")
a1 = NBNode("a1", parent=mytree, decision_value=1, decision_name="m1")
# a2 =
NBNode("a2", parent=mytree, decision_value="another", decision_name="m3")
# a1a =
NBNode("a1a", parent=a1, decision_value="test", decision_name="m2")
NBNode(
    "a3",
    parent=mytree,
    decision_value=["test", 1],
    decision_name=["m2", "m4"],
    decision_cutoff=[None, 0],
)

mytree.pretty_print("__long__")

print("\n\nPredictions")
print(mytree.predict(values=[None, "test", None, 3], names=["m1", "m2", "m3", "m4"]))
try:
    print(mytree.predict(
        values=[None, "NOT_test", None, 3], names=["m1", "m2", "m3", "m4"]
        ))
except ValueError:
    print("ValueError: Could not find a fitting endnode for the data you gave. You also did not allow for part predictions.")

a (counter:0, decision_name:None, decision_value:None)
├── a0 (counter:0, decision_name:m1, decision_value:-1)
├── a1 (counter:0, decision_name:m1, decision_value:1)
│   └── a1a (counter:0, decision_name:m2, decision_value:test)
├── a2 (counter:0, decision_name:m3, decision_value:another)
└── a3 (counter:0, decision_name:['m2', 'm4'], decision_value:['test', 1])


Predictions
NBNode('/a/a3', counter=0, decision_name=['m2', 'm4'], decision_value=['test', 1])
ValueError: Could not find a fitting endnode for the data you gave. You also did not allow for part predictions.
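The matching rule for a node with several decision names can be sketched as "every named feature must match, after applying its cutoff if it has one". This is a minimal illustration under stated assumptions: the function matches_all is hypothetical, a missing or None feature counts as a mismatch, and the cutoff encoding (1 at or above, -1 below) is inferred from the earlier cutoff example.

```python
from typing import Any, Dict, List, Optional


def matches_all(
    values: Dict[str, Any],
    decision_names: List[str],
    decision_values: List[Any],
    decision_cutoffs: List[Optional[float]],
) -> bool:
    """Return True only if every named feature matches its decision value."""
    for name, wanted, cutoff in zip(decision_names, decision_values, decision_cutoffs):
        got = values.get(name)
        if got is None:
            # Assumption: a missing feature cannot match.
            return False
        if cutoff is not None:
            # Assumption: threshold first, then compare exactly.
            got = 1 if got >= cutoff else -1
        if got != wanted:
            return False
    return True


names = ["m2", "m4"]
wanted = ["test", 1]
cutoffs = [None, 0]

print(matches_all({"m2": "test", "m4": 3}, names, wanted, cutoffs))      # True
print(matches_all({"m2": "NOT_test", "m4": 3}, names, wanted, cutoffs))  # False
```

The first call mirrors the successful prediction of a3 above: m2 matches exactly and m4 = 3 lies above its cutoff of 0. The second call fails on m2, which corresponds to the case where no fitting end node is found.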
