nbnode package
Subpackages
- nbnode.apply package
- nbnode.io package
- nbnode.plot package
- nbnode.simulation package
- Submodules
- nbnode.simulation.FlowSimulationTree module
  - BaseFlowSimulationTree
    - BaseFlowSimulationTree.estimate_cell_distributions()
    - BaseFlowSimulationTree.estimate_population_distribution()
    - BaseFlowSimulationTree.generate_populations()
    - BaseFlowSimulationTree.ncells_from_percentages()
    - BaseFlowSimulationTree.remove_population()
    - BaseFlowSimulationTree.reset_populations()
    - BaseFlowSimulationTree.sample()
    - BaseFlowSimulationTree.sample_populations()
    - BaseFlowSimulationTree.set_seed()
  - FlowSimulationTreeDirichlet
    - FlowSimulationTreeDirichlet.alpha_all
    - FlowSimulationTreeDirichlet.estimate_population_distribution()
    - FlowSimulationTreeDirichlet.generate_populations()
    - FlowSimulationTreeDirichlet.mean_leafs
    - FlowSimulationTreeDirichlet.new_pop_mean()
    - FlowSimulationTreeDirichlet.pop_alpha()
    - FlowSimulationTreeDirichlet.pop_leafnode_names()
    - FlowSimulationTreeDirichlet.pop_mean()
    - FlowSimulationTreeDirichlet.precision
    - FlowSimulationTreeDirichlet.remove_population()
- nbnode.simulation.TreeMeanDistributionSampler module
- nbnode.simulation.TreeMeanRelative module
- nbnode.simulation.save_sample module
- nbnode.simulation.sim_proportional module
- nbnode.simulation.sim_target module
- Module contents
- nbnode.specific_analyses package
- nbnode.testutil package
- nbnode.utils package
Submodules
nbnode.nbnode module
- class nbnode.nbnode.NBNode(name: str, parent: NBNode | None = None, decision_value: Any | None = None, decision_name: str | None = None, decision_cutoff: float | None = None, **kwargs)
Bases: Node
Non-binary node class, inherits from anytree.Node.
- apply(fun, input_attribute_name: str = 'data', result_attribute_name: str | None = None, iterator=<class 'anytree.iterators.preorderiter.PreOrderIter'>, *fun_args, **fun_kwargs) -> Dict[NBNode, Any] | None
Apply the given function to the .data property of each node.
- Parameters:
fun – Function to apply on the attribute named input_attribute_name
input_attribute_name – Name of the attribute to apply fun on.
result_attribute_name – If result_attribute_name is given, the return value of fun(node.data) is set to the node’s result_attribute_name-attribute
iterator – How to iterate over the nodes.
- astype_math_node_attribute(dtype, inplace=True) -> NBNode
Replaces all node.math_node_attribute with the given dtype.
- both_iterator(other: NBNode, strict: bool = False) -> Tuple[NBNode, NBNode]
Iterates over self and other simultaneously.
Gives (yields) the same nodes until EITHER tree is at its end.
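The "until EITHER tree is at its end" behaviour can be illustrated with a minimal sketch. It assumes a pre-order traversal (the default iterator of apply(); the order used by both_iterator is not stated here) and uses hypothetical (name, children) tuples instead of NBNode objects. The strict parameter is not modelled.

```python
from typing import Iterator, Tuple

# Hypothetical stand-in for a tree node: (name, list of child nodes).
SketchNode = Tuple[str, list]


def preorder(node: SketchNode) -> Iterator[str]:
    """Yield node names parent-first, then each subtree left to right."""
    name, children = node
    yield name
    for child in children:
        yield from preorder(child)


def both_iterator_sketch(tree_a: SketchNode, tree_b: SketchNode) -> Iterator[Tuple[str, str]]:
    """Pair up nodes of both trees; zip stops when the shorter traversal ends."""
    return zip(preorder(tree_a), preorder(tree_b))


tree_a = ("a", [("a0", []), ("a1", [("a1a", [])]), ("a2", [])])
tree_b = ("a", [("a0", []), ("a1", [])])

pairs = list(both_iterator_sketch(tree_a, tree_b))
print(pairs)
```

Here tree_b has only three nodes, so the pairing stops after the third pair even though tree_a continues.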
- copy_structure() -> NBNode
Copy only the structure of the tree.
This does not copy the data, the ids or the counts. It copies additionally set attributes.
- Returns:
A copy of the tree structure, without the data, the ids or the counts.
- Return type:
NBNode
- count(node_list: List[NBNode] | None = None, reset_counts: bool = True, use_ids: bool = False) -> None
Count ids and save into node.counter.
- Parameters:
node_list – The [usually predicted] nodes. The usual workflow would be:
1. Predict n samples.
2. Get a list of n nodes from these predictions.
But you can insert any node (inside the tree) you want here.
reset_counts – Should all .counter be set to 0?
use_ids – If use_ids==True, do not use node_list to count but just access the length of the node.ids
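The two counting modes can be sketched roughly as follows. SketchNode and count_sketch are hypothetical stand-ins, not the library code, and the sketch only counts the listed nodes themselves; whether the real method also updates ancestors is not stated here.

```python
from typing import Iterable, List, Optional


class SketchNode:
    """Hypothetical minimal node; only the attributes used below."""

    def __init__(self, name: str, ids: Optional[List[int]] = None):
        self.name = name
        self.ids = ids or []
        self.counter = 0


def count_sketch(
    all_nodes: Iterable[SketchNode],
    node_list: Optional[List[SketchNode]] = None,
    reset_counts: bool = True,
    use_ids: bool = False,
) -> None:
    for node in all_nodes:
        if reset_counts:
            node.counter = 0
        if use_ids:
            # Assumption: with use_ids the counter is the number of ids held.
            node.counter = len(node.ids)
    if not use_ids:
        # Otherwise every occurrence in node_list adds one to that node.
        for node in node_list or []:
            node.counter += 1


a0 = SketchNode("a0", ids=[0, 2, 5])
a2 = SketchNode("a2", ids=[1])

count_sketch([a0, a2], node_list=[a0, a2, a0])
from_list = (a0.counter, a2.counter)

count_sketch([a0, a2], use_ids=True)
from_ids = (a0.counter, a2.counter)
print(from_list, from_ids)
```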
- property data: DataFrame
Data of a node for its ids.
root._data contains all data. However, each node only “holds” a subset of the data. To not have to copy the data for each node, we just subset the data for each node by the node’s ids.
Usually you would set the ids by celltree.id_preds(predicted_nodes). You can also set them manually, but you have to be certain that they match the order of the data!
- Returns:
A subset of the root._data corresponding to the node’s ids.
- Return type:
pd.DataFrame
- static do_cutoff(value: float | Any, cutoff: float | None) -> Any
If cutoff is not None, cut the value into 1 or -1.
Otherwise return the value.
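A minimal sketch of this rule, assuming that values at or above the cutoff map to 1 and all others to -1 (the direction and the handling of equality are assumptions, not confirmed here). do_cutoff_sketch is a hypothetical stand-in, not the library function.

```python
from typing import Any, Optional


def do_cutoff_sketch(value: Any, cutoff: Optional[float]) -> Any:
    """Hypothetical stand-in for NBNode.do_cutoff."""
    if cutoff is None:
        # No cutoff defined: the value is passed through unchanged.
        return value
    # Assumption: at or above the cutoff becomes 1, below becomes -1.
    return 1 if value >= cutoff else -1


above = do_cutoff_sketch(0.7, 0.5)
below = do_cutoff_sketch(0.2, 0.5)
unchanged = do_cutoff_sketch("test", None)
print(above, below, unchanged)
```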
- static edge_label_fun(decision_names, decision_values)
Function to label the edges of the tree.
- eq_structure(other: NBNode) -> bool
Check if the structure of two trees is equal.
It only checks node.name, node.decision_name and node.decision_value. It disregards the data, ids, counts or any other attribute.
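The comparison described above can be sketched as a recursive check over the three named attributes. SketchNode is a hypothetical stand-in, and the sketch assumes that children are compared in order.

```python
from typing import Any, List, Optional


class SketchNode:
    """Hypothetical minimal node, not the library's NBNode."""

    def __init__(
        self,
        name: str,
        decision_name: Optional[str] = None,
        decision_value: Any = None,
        children: Optional[List["SketchNode"]] = None,
        counter: int = 0,
    ):
        self.name = name
        self.decision_name = decision_name
        self.decision_value = decision_value
        self.children = children or []
        self.counter = counter  # ignored by the structural comparison


def eq_structure_sketch(a: SketchNode, b: SketchNode) -> bool:
    # Only name, decision_name and decision_value take part in the check.
    key_a = (a.name, a.decision_name, a.decision_value)
    key_b = (b.name, b.decision_name, b.decision_value)
    if key_a != key_b or len(a.children) != len(b.children):
        return False
    # Assumption: children are compared pairwise in their stored order.
    return all(eq_structure_sketch(x, y) for x, y in zip(a.children, b.children))


left = SketchNode("a", children=[SketchNode("a0", "m1", -1, counter=5)])
same = SketchNode("a", children=[SketchNode("a0", "m1", -1, counter=0)])
other = SketchNode("a", children=[SketchNode("a0", "m1", 1)])

same_structure = eq_structure_sketch(left, same)
different_structure = eq_structure_sketch(left, other)
print(same_structure, different_structure)
```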
- export_counts(only_leafnodes: bool = False, node_counts_dtype='int64') -> DataFrame
Export the counts of the predicted celltree to a pd.DataFrame.
Rows are the samples, columns the node names get_name_full().
- Parameters:
- Returns:
Rows are the samples, columns the node names get_name_full().
- Return type:
pd.DataFrame
- export_dot(unique_dot_exporter_kwargs: Dict = 'default') -> str
Convenience wrapper around anytree.DotExporter.
- Parameters:
unique_dot_exporter_kwargs (Dict, optional) –
Arguments to anytree.DotExporter. Defaults to “default”:
    unique_dot_exporter_kwargs = {
        "options": ['node [shape=box, style="filled", color="black"];'],
        "nodeattrfunc": lambda node: 'label="{}", fillcolor="white"'.format(
            node.name
        ),
    }
- Returns:
A string with the exported dot graph in dot format
- Return type:
str
- get_name_full() -> str
Get the full name of the node (including the root node “/”)
- Returns:
Full name of the node.
- Return type:
str
- graph_from_dot(tree: NBNode | None = None, exported_dot_graph: str | None = None, title: str | None = None, fillcolor_node_attribute: str = 'counter', custom_min_max_dict: Dict[str, float] | None = None, minmax: str = 'equal', fillcolor_missing_val: str = '#91FF9D', node_text_attributes: List[str] | Dict[str, str] = 'default', cmap: str = <matplotlib.colors.LinearSegmentedColormap object>)
See NBNode._graph_from_dot.
If no tree is given, self is used.
- id_preds(node_list: List[NBNode], reset_ids: bool = True)
Predict node ids.
Given a list of nodes, enumerate through them and assign this (enumerate(node_list)) number to self.ids.
This is then used to subset self.data for each node.
- Parameters:
node_list (List['NBNode']) – The nodes to enumerate; the position of each node in this list is assigned to it as an id.
reset_ids (bool, optional) – Defaults to True.
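The enumerate-then-subset idea can be sketched without the library. The names below are hypothetical; nodes are represented by plain strings and the data by a list of rows. Whether the real method also assigns ids to ancestor nodes is not covered by this sketch.

```python
from typing import Dict, List


def id_preds_sketch(node_names: List[str]) -> Dict[str, List[int]]:
    """Collect, per node name, the positions at which it was predicted."""
    ids: Dict[str, List[int]] = {}
    for position, name in enumerate(node_names):
        ids.setdefault(name, []).append(position)
    return ids


# One predicted node per data row, in the same order as the rows.
rows = ["cell0", "cell1", "cell2", "cell3"]
predicted = ["a0", "a1a", "a0", "a2"]

ids = id_preds_sketch(predicted)
# A node's "data" is then the subset of rows at its ids.
a0_rows = [rows[i] for i in ids["a0"]]
print(ids, a0_rows)
```

The ids are only meaningful while the order of the predictions matches the order of the data rows.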
- insert_nodes(node_list: List[NodeMixin], copy_list: bool = False)
Insert nodes with the current node as parent.
For every node in node_list, create a child node of self.
- Parameters:
node_list – A list of nodes. All of these nodes get the parent set to the current node
copy_list – If you reuse a list of nodes, e.g. twice, the nodes will not be assigned to both insertion nodes but RE-assigned to ONE of them. To avoid this, copy the nodes first.
- join(other: NBNode, add_source_name: str | None = None, add_source_self: str = 'self', add_source_other: str = 'other', inplace: bool = True) -> NBNode
Join two NBNodes.
The NBNodes must match in structure.
The nodes are added –> math_node_attribute of both NBNodes
The data of other is added to the data of self
The ids of other are added to the ids of self according to the new _data
- Parameters:
other (NBNode) – The other NBNode to join
add_source_name (str, optional) –
If given, an additional column named add_source_name is created when joining the data. It contains either add_source_self or add_source_other, depending on the source of the data. Defaults to None, so the column is not created.
add_source_self (str, optional) –
The value to use in the column add_source_name for data that comes from self. Defaults to “self”.
add_source_other (str, optional) –
The value to use in the column add_source_name for data that comes from other. Defaults to “other”.
inplace (bool, optional) – If True, the NBNode is modified inplace.
- Returns:
A single NBNode with the data of both NBNodes
- Return type:
NBNode
- predict(values: List | Dict | DataFrame | None = None, names: list | None = None, allow_unfitting_data: bool = False, allow_part_predictions: bool = False) -> NBNode | List[NBNode] | Series
See single_prediction, but you can also pass dataframes or ndarrays instead of only a dict or paired value/name lists.
If values is not given or None, the self._data is used.
- Returns:
Returns for each value its NBNode
- Return type:
List[NBNodes]
- prediction_str(nodename: str, split: str = '/') -> NBNode
Return the node that is the prediction for the given nodename.
- Parameters:
nodename (str) – The name of the node to predict. Should match the node.get_name_full() of a node. You have to start with “/” as the root node.
split (str, optional) –
The string that separates the single node names. Defaults to “/”. E.g. “/child1/child2” corresponds to the node in the hierarchy:
    root
    |---child1
    |   |---child2
- Returns:
The node for the given nodename.
- Return type:
NBNode
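The path lookup can be sketched on a nested dict. This is a hypothetical stand-in that follows the “/child1/child2” example above, where the leading separator denotes the root; how the real method treats the root's own name is not shown here.

```python
from typing import Dict, List


def resolve_path(children: Dict[str, dict], nodename: str, split: str = "/") -> List[str]:
    """Walk from the root along nodename and return the visited node names."""
    if not nodename.startswith(split):
        raise ValueError("nodename has to start with the root separator")
    visited = ["root"]
    current = children
    for part in nodename[len(split):].split(split):
        if part == "":
            # Happens for nodename == split, which addresses the root itself.
            continue
        if part not in current:
            raise KeyError(f"no node named {part!r} below {visited[-1]!r}")
        visited.append(part)
        current = current[part]
    return visited


# Children of the root as nested dicts: root -> child1 -> child2.
tree = {"child1": {"child2": {}}}

path = resolve_path(tree, "/child1/child2")
root_only = resolve_path(tree, "/")
print(path, root_only)
```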
- pretty_print(print_attributes: List[str] = '__default__', round_ndigits: int | None = None)
Print the tree in a pretty way.
- Parameters:
- set_DotExporter_ids()
Create unique ids for each node.
DotExporter needs unique ids for each node. I set them to the hex(id(node)) to make sure they are unique.
- single_prediction(values: List | Dict, names: list | None = None, allow_unfitting_data: bool = False, allow_part_predictions: bool = False) -> NBNode | List[NBNode]
Predicts the endnode (leaf) of the tree given the values.
- Parameters:
values – Either a list or a dict of values. If a dict is given, the keys of the dict are used as names. This is used to identify the correct _exact_ value for the decision node defined by self.decision_value.
names – If values is a list, names is a list of the names of the values. This is used to identify the correct value for the decision node defined by self.decision_name.
allow_unfitting_data – If True, returns None if the data you gave was not possible to fit in the tree. If False, raises a ValueError. Useful if decision values only fit partly to the tree but perfectly (completely) to another branch of the tree.
allow_part_predictions – If True, returns all (potentially multiple!) nodes that fit the given values. They do not have to be leaf nodes. If False, returns only the first node that fits the given values.
- Returns:
Either a single NBNode instance (the leaf node) or if multiple leaf nodes fit, all of them as a list.
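The matching of values against decision_name and decision_value can be sketched as a recursive descent. This is a simplified, hypothetical illustration: it handles only scalar decision names, always raises when nothing fits (like allow_unfitting_data=False), and ignores cutoffs and part predictions. The example tree mirrors tree_simple().

```python
from typing import Any, Dict, List, Optional


class SketchNode:
    """Hypothetical minimal node, not the library's NBNode."""

    def __init__(
        self,
        name: str,
        decision_name: Optional[str] = None,
        decision_value: Any = None,
        children: Optional[List["SketchNode"]] = None,
    ):
        self.name = name
        self.decision_name = decision_name
        self.decision_value = decision_value
        self.children = children or []


def matching_leaves(node: SketchNode, values: Dict[str, Any]) -> List[str]:
    """Return the names of all leaf nodes that fit the given values."""
    if not node.children:
        return [node.name]
    # A child fits if the value given for its decision_name equals its decision_value.
    matches = [
        child
        for child in node.children
        if child.decision_name in values
        and values[child.decision_name] == child.decision_value
    ]
    if not matches:
        raise ValueError(f"no child of {node.name!r} fits the given values")
    leaves: List[str] = []
    for child in matches:
        leaves.extend(matching_leaves(child, values))
    return leaves


root = SketchNode("a", children=[
    SketchNode("a0", "m1", -1),
    SketchNode("a1", "m1", 1, children=[SketchNode("a1a", "m2", "test")]),
    SketchNode("a2", "m3", "another"),
])

single = matching_leaves(root, {"m1": 1, "m2": "test"})
multiple = matching_leaves(root, {"m1": -1, "m3": "another"})
print(single, multiple)
```

The second call shows why a list can be returned: both a0 and a2 fit the given values.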
nbnode.nbnode_trees module
- nbnode.nbnode_trees.tree_complete_aligned() -> NBNode
Tree for the aligned (now “rescaled”) T-cell panel data.
Uses decision cutoffs.
- Returns:
See tree_complete_cell(), only the decision cutoffs are different.
- Return type:
NBNode
- nbnode.nbnode_trees.tree_complete_aligned_trunk() -> NBNode
Trunk of the tree_complete_aligned_v2 tree.
- Returns:
NBNode:
AllCells (counter:0, decision_name:None, decision_value:None)
├── DN (counter:0, decision_name:['CD4', 'CD8'], decision_value:[-1, -1])
├── DP (counter:0, decision_name:['CD4', 'CD8'], decision_value:[1, 1])
├── CD4-/CD8+ (counter:0, decision_name:['CD4', 'CD8'], decision_value:[-1, 1])
│   ├── naive (counter:0, decision_name:['CCR7', 'CD45RA'], decision_value:[1, 1])
│   ├── Tcm (counter:0, decision_name:['CCR7', 'CD45RA'], decision_value:[1, -1])
│   ├── Temra (counter:0, decision_name:['CCR7', 'CD45RA'], decision_value:[-1, 1])
│   └── Tem (counter:0, decision_name:['CCR7', 'CD45RA'], decision_value:[-1, -1])
└── CD4+/CD8- (counter:0, decision_name:['CD4', 'CD8'], decision_value:[1, -1])
    ├── naive (counter:0, decision_name:['CCR7', 'CD45RA'], decision_value:[1, 1])
    ├── Tcm (counter:0, decision_name:['CCR7', 'CD45RA'], decision_value:[1, -1])
    ├── Temra (counter:0, decision_name:['CCR7', 'CD45RA'], decision_value:[-1, 1])
    └── Tem (counter:0, decision_name:['CCR7', 'CD45RA'], decision_value:[-1, -1])
- nbnode.nbnode_trees.tree_complete_aligned_v2()
Tree for the aligned (now “rescaled”) T-cell panel data.
Uses decision cutoffs.
- Returns:
See tree_complete_aligned(), only the decision cutoffs are different.
- Return type:
- nbnode.nbnode_trees.tree_complete_cell() -> NBNode
Complete tree for T-cell panel of Beckman Coulter.
- Returns:
NBNode:
AllCells ()
├── not CD45 ()
└── CD45+ ()
    ├── not CD3 ()
    │   ├── not MNC ()
    │   └── MNCs ()
    │       ├── other ()
    │       └── Monocytes ()
    └── CD3+ ()
        ├── DN ()
        ├── DP ()
        ├── CD4-/CD8+ ()
        │   ├── naive ()
        │   │   ├── CD27+/CD28+ ()
        │   │   │   ├── CD57+/PD1+ ()
        │   │   │   ├── CD57+/PD1- ()
        │   │   │   ├── CD57-/PD1+ ()
        │   │   │   └── CD57-/PD1- ()
        │   │   ├── CD27+/CD28- ()
        │   │   │   ├── CD57+/PD1+ ()
        │   │   │   ├── CD57+/PD1- ()
        │   │   │   ├── CD57-/PD1+ ()
        │   │   │   └── CD57-/PD1- ()
        │   │   ├── CD27-/CD28+ ()
        │   │   │   ├── CD57+/PD1+ ()
        │   │   │   ├── CD57+/PD1- ()
        │   │   │   ├── CD57-/PD1+ ()
        │   │   │   └── CD57-/PD1- ()
        │   │   └── CD27-/CD28- ()
        │   │       ├── CD57+/PD1+ ()
        │   │       ├── CD57+/PD1- ()
        │   │       ├── CD57-/PD1+ ()
        │   │       └── CD57-/PD1- ()
        │   ├── Tcm ()
        │   │   ├── CD27+/CD28+ ()
        │   │   │   ├── CD57+/PD1+ ()
        │   │   │   ├── CD57+/PD1- ()
        │   │   │   ├── CD57-/PD1+ ()
        │   │   │   └── CD57-/PD1- ()
        │   │   ├── CD27+/CD28- ()
        │   │   │   ├── CD57+/PD1+ ()
        │   │   │   ├── CD57+/PD1- ()
        │   │   │   ├── CD57-/PD1+ ()
        │   │   │   └── CD57-/PD1- ()
        │   │   ├── CD27-/CD28+ ()
        │   │   │   ├── CD57+/PD1+ ()
        │   │   │   ├── CD57+/PD1- ()
        │   │   │   ├── CD57-/PD1+ ()
        │   │   │   └── CD57-/PD1- ()
        │   │   └── CD27-/CD28- ()
        │   │       ├── CD57+/PD1+ ()
        │   │       ├── CD57+/PD1- ()
        │   │       ├── CD57-/PD1+ ()
        │   │       └── CD57-/PD1- ()
        │   ├── Temra ()
        │   │   ├── CD27+/CD28+ ()
        │   │   │   ├── CD57+/PD1+ ()
        │   │   │   ├── CD57+/PD1- ()
        │   │   │   ├── CD57-/PD1+ ()
        │   │   │   └── CD57-/PD1- ()
        │   │   ├── CD27+/CD28- ()
        │   │   │   ├── CD57+/PD1+ ()
        │   │   │   ├── CD57+/PD1- ()
        │   │   │   ├── CD57-/PD1+ ()
        │   │   │   └── CD57-/PD1- ()
        │   │   ├── CD27-/CD28+ ()
        │   │   │   ├── CD57+/PD1+ ()
        │   │   │   ├── CD57+/PD1- ()
        │   │   │   ├── CD57-/PD1+ ()
        │   │   │   └── CD57-/PD1- ()
        │   │   └── CD27-/CD28- ()
        │   │       ├── CD57+/PD1+ ()
        │   │       ├── CD57+/PD1- ()
        │   │       ├── CD57-/PD1+ ()
        │   │       └── CD57-/PD1- ()
        │   └── Tem ()
        │       ├── CD27+/CD28+ ()
        │       │   ├── CD57+/PD1+ ()
        │       │   ├── CD57+/PD1- ()
        │       │   ├── CD57-/PD1+ ()
        │       │   └── CD57-/PD1- ()
        │       ├── CD27+/CD28- ()
        │       │   ├── CD57+/PD1+ ()
        │       │   ├── CD57+/PD1- ()
        │       │   ├── CD57-/PD1+ ()
        │       │   └── CD57-/PD1- ()
        │       ├── CD27-/CD28+ ()
        │       │   ├── CD57+/PD1+ ()
        │       │   ├── CD57+/PD1- ()
        │       │   ├── CD57-/PD1+ ()
        │       │   └── CD57-/PD1- ()
        │       └── CD27-/CD28- ()
        │           ├── CD57+/PD1+ ()
        │           ├── CD57+/PD1- ()
        │           ├── CD57-/PD1+ ()
        │           └── CD57-/PD1- ()
        └── CD4+/CD8- ()
            ├── naive ()
            │   ├── CD27+/CD28+ ()
            │   │   ├── CD57+/PD1+ ()
            │   │   ├── CD57+/PD1- ()
            │   │   ├── CD57-/PD1+ ()
            │   │   └── CD57-/PD1- ()
            │   ├── CD27+/CD28- ()
            │   │   ├── CD57+/PD1+ ()
            │   │   ├── CD57+/PD1- ()
            │   │   ├── CD57-/PD1+ ()
            │   │   └── CD57-/PD1- ()
            │   ├── CD27-/CD28+ ()
            │   │   ├── CD57+/PD1+ ()
            │   │   ├── CD57+/PD1- ()
            │   │   ├── CD57-/PD1+ ()
            │   │   └── CD57-/PD1- ()
            │   └── CD27-/CD28- ()
            │       ├── CD57+/PD1+ ()
            │       ├── CD57+/PD1- ()
            │       ├── CD57-/PD1+ ()
            │       └── CD57-/PD1- ()
            ├── Tcm ()
            │   ├── CD27+/CD28+ ()
            │   │   ├── CD57+/PD1+ ()
            │   │   ├── CD57+/PD1- ()
            │   │   ├── CD57-/PD1+ ()
            │   │   └── CD57-/PD1- ()
            │   ├── CD27+/CD28- ()
            │   │   ├── CD57+/PD1+ ()
            │   │   ├── CD57+/PD1- ()
            │   │   ├── CD57-/PD1+ ()
            │   │   └── CD57-/PD1- ()
            │   ├── CD27-/CD28+ ()
            │   │   ├── CD57+/PD1+ ()
            │   │   ├── CD57+/PD1- ()
            │   │   ├── CD57-/PD1+ ()
            │   │   └── CD57-/PD1- ()
            │   └── CD27-/CD28- ()
            │       ├── CD57+/PD1+ ()
            │       ├── CD57+/PD1- ()
            │       ├── CD57-/PD1+ ()
            │       └── CD57-/PD1- ()
            ├── Temra ()
            │   ├── CD27+/CD28+ ()
            │   │   ├── CD57+/PD1+ ()
            │   │   ├── CD57+/PD1- ()
            │   │   ├── CD57-/PD1+ ()
            │   │   └── CD57-/PD1- ()
            │   ├── CD27+/CD28- ()
            │   │   ├── CD57+/PD1+ ()
            │   │   ├── CD57+/PD1- ()
            │   │   ├── CD57-/PD1+ ()
            │   │   └── CD57-/PD1- ()
            │   ├── CD27-/CD28+ ()
            │   │   ├── CD57+/PD1+ ()
            │   │   ├── CD57+/PD1- ()
            │   │   ├── CD57-/PD1+ ()
            │   │   └── CD57-/PD1- ()
            │   └── CD27-/CD28- ()
            │       ├── CD57+/PD1+ ()
            │       ├── CD57+/PD1- ()
            │       ├── CD57-/PD1+ ()
            │       └── CD57-/PD1- ()
            └── Tem ()
                ├── CD27+/CD28+ ()
                │   ├── CD57+/PD1+ ()
                │   ├── CD57+/PD1- ()
                │   ├── CD57-/PD1+ ()
                │   └── CD57-/PD1- ()
                ├── CD27+/CD28- ()
                │   ├── CD57+/PD1+ ()
                │   ├── CD57+/PD1- ()
                │   ├── CD57-/PD1+ ()
                │   └── CD57-/PD1- ()
                ├── CD27-/CD28+ ()
                │   ├── CD57+/PD1+ ()
                │   ├── CD57+/PD1- ()
                │   ├── CD57-/PD1+ ()
                │   └── CD57-/PD1- ()
                └── CD27-/CD28- ()
                    ├── CD57+/PD1+ ()
                    ├── CD57+/PD1- ()
                    ├── CD57-/PD1+ ()
                    └── CD57-/PD1- ()
- nbnode.nbnode_trees.tree_complex() -> NBNode
Complex tree to use with yternary.
- Returns:
NBNode:
AllCells (counter:0, decision_name:None, decision_value:None)
├── not CD45 (counter:0, decision_name:CD45, decision_value:-1)
└── CD45+ (counter:0, decision_name:CD45, decision_value:1)
    ├── not CD3 (counter:0, decision_name:CD3, decision_value:-1)
    │   ├── not MNC (counter:0, decision_name:MNC, decision_value:-1)
    │   └── MNCs (counter:0, decision_name:MNC, decision_value:1)
    │       ├── other (counter:0, decision_name:CD4, decision_value:-1)
    │       └── Monocytes (counter:0, decision_name:CD4, decision_value:1)
    └── CD3+ (counter:0, decision_name:CD3, decision_value:1)
        ├── DN (counter:0, decision_name:['CD4', 'CD8'], decision_value:[-1, -1])
        ├── DP (counter:0, decision_name:['CD4', 'CD8'], decision_value:[1, 1])
        ├── CD4-/CD8+ (counter:0, decision_name:['CD4', 'CD8'], decision_value:[-1, 1])
        └── CD4+/CD8- (counter:0, decision_name:['CD4', 'CD8'], decision_value:[1, -1])
- nbnode.nbnode_trees.tree_simple() -> NBNode
Simple tree for testing.
- Returns:
NBNode:
a (counter:0, decision_name:None, decision_value:None)
├── a0 (counter:0, decision_name:m1, decision_value:-1)
├── a1 (counter:0, decision_name:m1, decision_value:1)
│   └── a1a (counter:0, decision_name:m2, decision_value:test)
└── a2 (counter:0, decision_name:m3, decision_value:another)
- nbnode.nbnode_trees.tree_simpleB() -> NBNode
Another simple tree for testing.
- Returns:
NBNode:
a (counter:0, decision_name:None, decision_value:None)
├── a0 (counter:0, decision_name:m1, decision_value:-1)
├── a1 (counter:0, decision_name:m1, decision_value:1)
│   ├── a1a (counter:0, decision_name:m2, decision_value:test)
│   └── a1b (counter:0, decision_name:m2, decision_value:tmp)
└── a2 (counter:0, decision_name:m3, decision_value:another)
- nbnode.nbnode_trees.tree_simple_cutoff() -> NBNode
Simple tree with decision cutoffs, for testing.
- Returns:
NBNode:
a (counter:0, decision_name:None, decision_value:None)
├── a0 (counter:0, decision_name:m1, decision_value:-1)
├── a1 (counter:0, decision_name:m1, decision_value:1)
│   └── a1a (counter:0, decision_name:m2, decision_value:test)
└── a2 (counter:0, decision_name:m3, decision_value:another)
- nbnode.nbnode_trees.tree_simple_cutoff_NOTWORKING() -> NBNode
Not working simple tree with decision cutoffs, only for testing.
- Returns:
NBNode:
a (counter:0, decision_name:None, decision_value:None)
├── a0 (counter:0, decision_name:m1, decision_value:-1)
├── a1 (counter:0, decision_name:m1, decision_value:1)
│   └── a1a (counter:0, decision_name:m2, decision_value:test)
├── a2 (counter:0, decision_name:m3, decision_value:another)
└── a3 (counter:0, decision_name:['m1', 'm4'], decision_value:[0, 1])
- nbnode.nbnode_trees.tree_simple_cutoff_mixed() -> NBNode
Functioning tree with decision cutoffs, testing.
- Returns:
NBNode:
a (counter:0, decision_name:None, decision_value:None)
├── a0 (counter:0, decision_name:m1, decision_value:-1)
├── a1 (counter:0, decision_name:m1, decision_value:1)
│   └── a1a (counter:0, decision_name:m2, decision_value:test)
├── a2 (counter:0, decision_name:m3, decision_value:another)
└── a3 (counter:0, decision_name:['m2', 'm4'], decision_value:['test', 1])
nbnode.nbnode_util module
- nbnode.nbnode_util.frame_cov(dt_frame: Frame) -> DataFrame
Compute the covariance matrix of a datatable frame from all columns.
Similar to pd.DataFrame.cov().
- Parameters:
dt_frame (datatable.Frame) – The datatable frame to compute the covariance matrix from
- Returns:
pd.DataFrame of the covariance matrix
- Return type:
pd.DataFrame
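What "covariance matrix from all columns" means can be shown with a small pure-Python sketch. It uses the N-1 normalisation that pd.DataFrame.cov() applies by default; that frame_cov uses the same normalisation is an assumption based on the "similar to" remark above. covariance_matrix is a hypothetical helper working on plain lists, not on a datatable Frame.

```python
from typing import Dict, List


def covariance_matrix(columns: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Sample covariance (divided by n - 1) between every pair of columns."""
    names = list(columns)
    n = len(columns[names[0]])
    means = {name: sum(vals) / n for name, vals in columns.items()}
    cov: Dict[str, Dict[str, float]] = {}
    for a in names:
        cov[a] = {}
        for b in names:
            # Sum of products of the deviations from the column means.
            products = sum(
                (x - means[a]) * (y - means[b])
                for x, y in zip(columns[a], columns[b])
            )
            cov[a][b] = products / (n - 1)
    return cov


data = {"x": [1.0, 2.0, 3.0], "y": [2.0, 4.0, 6.0]}
cov = covariance_matrix(data)
print(cov)
```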
- nbnode.nbnode_util.per_node_data_fun(x: DataFrame, fun_name: str, include_features: List[int | str] | slice | None = None, *fun_args, **fun_kwargs) -> DataFrame | Any
per_node_data_fun.
To be used in NBNode.apply() to apply a function to the data of each node.
- Parameters:
x (pd.DataFrame) – A dataframe, usually the NBNode.data attribute
include_features (Union[List[Union[str, int]], slice]) – The given function fun_name will be applied to only these features.
fun_name (str) –
Name of the function, is usually retrieved by getattr(x, fun_name). Therefore, if x is e.g. an instance of datatable.Frame, fun_name can be any function applicable to a datatable.Frame, e.g. “mean”, “sum”. If x is e.g. an instance of pd.DataFrame, fun_name can be any function applicable to a pd.DataFrame, e.g. “mean”, “sum”, “cov”.
Special cases:
“cov”: compute the covariance matrix of the given features
- Returns:
Usually, but not necessarily a pd.DataFrame. Depends on the function.
- Return type:
pd.DataFrame
Examples:
    node.apply(
        lambda x: per_node_data_fun(
            x=x, include_features=include_features, fun_name="mean"
        ),
        result_attribute_name="mean",
    )
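The getattr-based dispatch described for fun_name can be sketched without pandas or datatable. TinyFrame and per_node_data_fun_sketch are hypothetical stand-ins; the special handling of "cov" and slice-valued include_features is left out.

```python
from typing import Any, Dict, List, Optional


class TinyFrame:
    """Hypothetical stand-in for a dataframe: column name -> list of numbers."""

    def __init__(self, columns: Dict[str, List[float]]):
        self.columns = columns

    def __getitem__(self, features: List[str]) -> "TinyFrame":
        # Select a subset of columns, like frame[["m1", "m2"]].
        return TinyFrame({name: self.columns[name] for name in features})

    def mean(self) -> Dict[str, float]:
        return {name: sum(vals) / len(vals) for name, vals in self.columns.items()}

    def sum(self) -> Dict[str, float]:
        # Inside the method body, sum() still refers to the builtin.
        return {name: sum(vals) for name, vals in self.columns.items()}


def per_node_data_fun_sketch(
    x: TinyFrame, fun_name: str, include_features: Optional[List[str]] = None
) -> Any:
    if include_features is not None:
        # Restrict the data to the requested features first.
        x = x[include_features]
    # Look the method up by name and call it on the (subset of the) data.
    return getattr(x, fun_name)()


frame = TinyFrame({"m1": [1.0, 2.0, 3.0], "m2": [10.0, 20.0, 30.0]})
mean_m1 = per_node_data_fun_sketch(frame, "mean", include_features=["m1"])
sum_all = per_node_data_fun_sketch(frame, "sum")
print(mean_m1, sum_all)
```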