{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Counting and math\n",
    "\n",
    "Apart from introducing non-binary trees, the power of NBNode comes from its included counting and math mechanisms. \n",
    "Each ``NBNode`` has a ``math_node_attribute`` which is used to calculate math on. This is usually set to ``counter``. \n",
    "\n",
    "In this example, we will use a small test dataset coming with the package. It comes from a flow cytometry experiment with 13 features (columns) of 999 cells (rows). \n",
    "Each cell can further be classified into cell types which we defined with prior biological knowledge as tree given in ``nbtree``.\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Counting\n",
    "\n",
    "I start by introducing how to count. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "/home/gugl/clonedgit/ccc_verse/nbnode/docs/notebooks\n",
      "         FS  FS.0      SS  CD45RA  CCR7  CD28   PD1  CD27   CD4   CD8   CD3   \n",
      "0    197657    94  186372    3.90  6.34  4.97 -1.98  7.51  5.87  3.55  5.83  \\\n",
      "1    180716    92  135447    6.48  6.63  5.17  3.07  7.38  5.49  2.64  5.83   \n",
      "2    134129    90  168268    5.92  6.53  5.39  2.60  7.57  5.70  2.54  5.74   \n",
      "3    239241    94   79262    5.47  6.57  4.68  3.30  7.36  5.75  2.76  6.06   \n",
      "4    246527    89   97635    6.12  6.26  5.22  3.05  7.40  5.70  2.66  6.29   \n",
      "..      ...   ...     ...     ...   ...   ...   ...   ...   ...   ...   ...   \n",
      "994  176236    90  149982    6.48 -1.11  2.85 -1.55  2.28  0.59  1.70  0.39   \n",
      "995  191863    99  115406    6.30  5.19  3.01  2.07 -1.58  0.62  1.02  0.73   \n",
      "996  217752    93  124675    6.35  4.75  0.42  1.89  2.02  0.52  1.48  0.53   \n",
      "997  334174    97  210458    1.90  1.36  1.22  2.52 -0.72  0.59  1.03  0.75   \n",
      "998  308089   103  219747    6.48 -0.42  1.23  2.64  7.07  0.57  1.82  1.72   \n",
      "\n",
      "     CD57  CD45  \n",
      "0    2.62  6.78  \n",
      "1    2.39  6.76  \n",
      "2    1.02  6.46  \n",
      "3    1.14  6.59  \n",
      "4    2.22  6.33  \n",
      "..    ...   ...  \n",
      "994  4.22  6.49  \n",
      "995  2.69  6.22  \n",
      "996  2.92  6.50  \n",
      "997  2.98  5.38  \n",
      "998  2.87  6.03  \n",
      "\n",
      "[999 rows x 13 columns]\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "import re\n",
    "import pandas as pd\n",
    "\n",
    "print(os.getcwd())\n",
    "cellmat = pd.read_csv(\n",
    "    os.path.join(\n",
    "        os.pardir,\n",
    "        os.pardir,\n",
    "        \"tests\",\n",
    "        \"testdata\",\n",
    "        \"flowcytometry\",\n",
    "        \"gated_cells\",\n",
    "        \"cellmat.csv\",\n",
    "    )\n",
    ")\n",
    "# FS TOF (against FS INT which is \"FS\")\n",
    "cellmat.rename(columns={\"FS_TOF\": \"FS.0\"}, inplace=True)\n",
    "cellmat.columns = [re.sub(\"_.*\", \"\", x) for x in cellmat.columns]\n",
    "print(cellmat)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style type='text/css'>\n",
       ".datatable table.frame { margin-bottom: 0; }\n",
       ".datatable table.frame thead { border-bottom: none; }\n",
       ".datatable table.frame tr.coltypes td {  color: #FFFFFF;  line-height: 6px;  padding: 0 0.5em;}\n",
       ".datatable .bool    { background: #DDDD99; }\n",
       ".datatable .object  { background: #565656; }\n",
       ".datatable .int     { background: #5D9E5D; }\n",
       ".datatable .float   { background: #4040CC; }\n",
       ".datatable .str     { background: #CC4040; }\n",
       ".datatable .time    { background: #40CC40; }\n",
       ".datatable .row_index {  background: var(--jp-border-color3);  border-right: 1px solid var(--jp-border-color0);  color: var(--jp-ui-font-color3);  font-size: 9px;}\n",
       ".datatable .frame tbody td { text-align: left; }\n",
       ".datatable .frame tr.coltypes .row_index {  background: var(--jp-border-color0);}\n",
       ".datatable th:nth-child(2) { padding-left: 12px; }\n",
       ".datatable .hellipsis {  color: var(--jp-cell-editor-border-color);}\n",
       ".datatable .vellipsis {  background: var(--jp-layout-color0);  color: var(--jp-cell-editor-border-color);}\n",
       ".datatable .na {  color: var(--jp-cell-editor-border-color);  font-size: 80%;}\n",
       ".datatable .sp {  opacity: 0.25;}\n",
       ".datatable .footer { font-size: 9px; }\n",
       ".datatable .frame_dimensions {  background: var(--jp-border-color3);  border-top: 1px solid var(--jp-border-color0);  color: var(--jp-ui-font-color3);  display: inline-block;  opacity: 0.6;  padding: 1px 10px 1px 5px;}\n",
       "</style>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "AllCells (counter:0, decision_name:None, decision_value:None)\n",
      "├── DN (counter:0, decision_name:['CD4', 'CD8'], decision_value:[-1, -1])\n",
      "├── DP (counter:0, decision_name:['CD4', 'CD8'], decision_value:[1, 1])\n",
      "├── CD4-/CD8+ (counter:0, decision_name:['CD4', 'CD8'], decision_value:[-1, 1])\n",
      "│   ├── naive (counter:0, decision_name:['CCR7', 'CD45RA'], decision_value:[1, 1])\n",
      "│   ├── Tcm (counter:0, decision_name:['CCR7', 'CD45RA'], decision_value:[1, -1])\n",
      "│   ├── Temra (counter:0, decision_name:['CCR7', 'CD45RA'], decision_value:[-1, 1])\n",
      "│   └── Tem (counter:0, decision_name:['CCR7', 'CD45RA'], decision_value:[-1, -1])\n",
      "└── CD4+/CD8- (counter:0, decision_name:['CD4', 'CD8'], decision_value:[1, -1])\n",
      "    ├── naive (counter:0, decision_name:['CCR7', 'CD45RA'], decision_value:[1, 1])\n",
      "    ├── Tcm (counter:0, decision_name:['CCR7', 'CD45RA'], decision_value:[1, -1])\n",
      "    ├── Temra (counter:0, decision_name:['CCR7', 'CD45RA'], decision_value:[-1, 1])\n",
      "    └── Tem (counter:0, decision_name:['CCR7', 'CD45RA'], decision_value:[-1, -1])\n"
     ]
    }
   ],
   "source": [
    "import nbnode.nbnode_trees as nbtree\n",
    "cell_tree = nbtree.tree_complete_aligned_trunk()\n",
    "cell_tree.pretty_print(\"__long__\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's predict the cell type of all cells which returns a list of 999 predicted nodes! "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0      (((NBNode('/AllCells/DP', counter=0, decision_...\n",
      "1      (((NBNode('/AllCells/DP', counter=0, decision_...\n",
      "2      (((NBNode('/AllCells/DP', counter=0, decision_...\n",
      "3      (((NBNode('/AllCells/DP', counter=0, decision_...\n",
      "4      (((NBNode('/AllCells/DP', counter=0, decision_...\n",
      "                             ...                        \n",
      "994    (((NBNode('/AllCells/DP', counter=0, decision_...\n",
      "995    (((NBNode('/AllCells/DP', counter=0, decision_...\n",
      "996    (((NBNode('/AllCells/DP', counter=0, decision_...\n",
      "997    (((NBNode('/AllCells/DP', counter=0, decision_...\n",
      "998    (((NBNode('/AllCells/DP', counter=0, decision_...\n",
      "Length: 999, dtype: object\n"
     ]
    }
   ],
   "source": [
    "cell_preds = cell_tree.predict(cellmat)\n",
    "print(cell_preds)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This by itself did not change anything in the tree. \n",
    "\n",
    "I will introduce another NBNode attribute: ``NBNode.ids``. This is a list of numerical indices indicating which _predicted_ nodes are \"contained\" in a specific node. \n",
    "Naturally, ``root.ids`` should contain _ALL_ ids, and every other node only the list of ids which are (or passed) the node until reaching a endnode. \n",
    "\n",
    "Even after predicting, no ids are set, so this is still an empty list. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[]\n",
      "[]\n"
     ]
    }
   ],
   "source": [
    "print(cell_tree.ids)\n",
    "print(cell_tree[\"/AllCells/DP\"].ids)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To set the ids, you have to actively use the predicted nodes and identify their ids. ``celltree.id_preds`` takes a list of nodes and sorts them within the tree. The numerical index refers to the order in which the predicted nodes occurred!\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n",
      "999\n",
      "[69, 74, 443, 972, 973]\n",
      "[69, 74, 443, 972, 973]\n"
     ]
    }
   ],
   "source": [
    "cell_tree.id_preds(cell_preds)\n",
    "print(cell_tree.ids[0:10])\n",
    "print(len(cell_tree.ids))\n",
    "\n",
    "# With this here we see that nodes [69, 74, 443, 972, 973] are all in /AllCells/CD4-/CD8+\n",
    "# or a node below!\n",
    "print(cell_tree[\"/AllCells/CD4-/CD8+\"].ids[0:10])\n",
    "print(cell_tree[\"/AllCells/CD4-/CD8+\"].ids[0:10])"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "\n",
    "However, it would be interesting to know how many cells are in each node. For this, we can use ``cell_preds.count(cell_preds)``"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "cell_tree.count(cell_preds)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we already set ``NBNode.ids``, we could also not recount but directly use the ``len(every_node.ids)`` which saves us a lot of computation. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "cell_tree.count(cell_preds, use_ids=True)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Internally, this iterates over every predicted node and iterates the tree until reaching the node. \n",
    "Any passed node's ``node.ids`` gets appended by the (numerical) index of the predicted node. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "AllCells (counter:999)\n",
      "├── DN (counter:0)\n",
      "├── DP (counter:973)\n",
      "├── CD4-/CD8+ (counter:5)\n",
      "│   ├── naive (counter:5)\n",
      "│   ├── Tcm (counter:0)\n",
      "│   ├── Temra (counter:0)\n",
      "│   └── Tem (counter:0)\n",
      "└── CD4+/CD8- (counter:21)\n",
      "    ├── naive (counter:20)\n",
      "    ├── Tcm (counter:0)\n",
      "    ├── Temra (counter:1)\n",
      "    └── Tem (counter:0)\n"
     ]
    }
   ],
   "source": [
    "cell_tree.pretty_print()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We see that now the printed ``counter`` became filled, and the majority of cells are ``/AllCells/DP`` (which are double positive T-cells, but that does not matter for our examples). \n",
    "\n",
    "\n",
    "Finally, we can export the counts per node, but we should set ``.data`` for it, see another jupyter notebook for further explanation. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Counts for every sample, only leaf (=end) nodes:\n",
      "Sample                       0\n",
      "/AllCells/DN                 0\n",
      "/AllCells/DP               973\n",
      "/AllCells/CD4-/CD8+/naive    5\n",
      "/AllCells/CD4-/CD8+/Tcm      0\n",
      "/AllCells/CD4-/CD8+/Temra    0\n",
      "/AllCells/CD4-/CD8+/Tem      0\n",
      "/AllCells/CD4+/CD8-/naive   20\n",
      "/AllCells/CD4+/CD8-/Tcm      0\n",
      "/AllCells/CD4+/CD8-/Temra    1\n",
      "/AllCells/CD4+/CD8-/Tem      0\n",
      "\n",
      "\n",
      "Counts for every sample, leaf AND intermediate nodes:\n",
      "Sample                       0\n",
      "/AllCells/DN                 0\n",
      "/AllCells/DP               973\n",
      "/AllCells/CD4-/CD8+/naive    5\n",
      "/AllCells/CD4-/CD8+/Tcm      0\n",
      "/AllCells/CD4-/CD8+/Temra    0\n",
      "/AllCells/CD4-/CD8+/Tem      0\n",
      "/AllCells/CD4-/CD8+          5\n",
      "/AllCells/CD4+/CD8-/naive   20\n",
      "/AllCells/CD4+/CD8-/Tcm      0\n",
      "/AllCells/CD4+/CD8-/Temra    1\n",
      "/AllCells/CD4+/CD8-/Tem      0\n",
      "/AllCells/CD4+/CD8-         21\n",
      "/AllCells                  999\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/home/gugl/.conda_envs/nbnode_pyscaffold/lib/python3.8/site-packages/nbnode/nbnode.py:353: UserWarning: self.ids was an empty list, subset an empty dataframe. Did you call celltree.id_preds(predicted_nodes)? Can also be a node with no cells.\n",
      "  warnings.warn(\n",
      "/home/gugl/.conda_envs/nbnode_pyscaffold/lib/python3.8/site-packages/nbnode/nbnode.py:353: UserWarning: self.ids was an empty list, subset an empty dataframe. Did you call celltree.id_preds(predicted_nodes)? Can also be a node with no cells.\n",
      "  warnings.warn(\n",
      "/home/gugl/.conda_envs/nbnode_pyscaffold/lib/python3.8/site-packages/nbnode/nbnode.py:353: UserWarning: self.ids was an empty list, subset an empty dataframe. Did you call celltree.id_preds(predicted_nodes)? Can also be a node with no cells.\n",
      "  warnings.warn(\n",
      "/home/gugl/.conda_envs/nbnode_pyscaffold/lib/python3.8/site-packages/nbnode/nbnode.py:353: UserWarning: self.ids was an empty list, subset an empty dataframe. Did you call celltree.id_preds(predicted_nodes)? Can also be a node with no cells.\n",
      "  warnings.warn(\n",
      "/home/gugl/.conda_envs/nbnode_pyscaffold/lib/python3.8/site-packages/nbnode/nbnode.py:353: UserWarning: self.ids was an empty list, subset an empty dataframe. Did you call celltree.id_preds(predicted_nodes)? Can also be a node with no cells.\n",
      "  warnings.warn(\n",
      "/home/gugl/.conda_envs/nbnode_pyscaffold/lib/python3.8/site-packages/nbnode/nbnode.py:353: UserWarning: self.ids was an empty list, subset an empty dataframe. Did you call celltree.id_preds(predicted_nodes)? Can also be a node with no cells.\n",
      "  warnings.warn(\n",
      "/home/gugl/.conda_envs/nbnode_pyscaffold/lib/python3.8/site-packages/nbnode/nbnode.py:353: UserWarning: self.ids was an empty list, subset an empty dataframe. Did you call celltree.id_preds(predicted_nodes)? Can also be a node with no cells.\n",
      "  warnings.warn(\n",
      "/home/gugl/.conda_envs/nbnode_pyscaffold/lib/python3.8/site-packages/nbnode/nbnode.py:353: UserWarning: self.ids was an empty list, subset an empty dataframe. Did you call celltree.id_preds(predicted_nodes)? Can also be a node with no cells.\n",
      "  warnings.warn(\n",
      "/home/gugl/.conda_envs/nbnode_pyscaffold/lib/python3.8/site-packages/nbnode/nbnode.py:353: UserWarning: self.ids was an empty list, subset an empty dataframe. Did you call celltree.id_preds(predicted_nodes)? Can also be a node with no cells.\n",
      "  warnings.warn(\n",
      "/home/gugl/.conda_envs/nbnode_pyscaffold/lib/python3.8/site-packages/nbnode/nbnode.py:353: UserWarning: self.ids was an empty list, subset an empty dataframe. Did you call celltree.id_preds(predicted_nodes)? Can also be a node with no cells.\n",
      "  warnings.warn(\n",
      "/home/gugl/.conda_envs/nbnode_pyscaffold/lib/python3.8/site-packages/nbnode/nbnode.py:353: UserWarning: self.ids was an empty list, subset an empty dataframe. Did you call celltree.id_preds(predicted_nodes)? Can also be a node with no cells.\n",
      "  warnings.warn(\n",
      "/home/gugl/.conda_envs/nbnode_pyscaffold/lib/python3.8/site-packages/nbnode/nbnode.py:353: UserWarning: self.ids was an empty list, subset an empty dataframe. Did you call celltree.id_preds(predicted_nodes)? Can also be a node with no cells.\n",
      "  warnings.warn(\n"
     ]
    }
   ],
   "source": [
    "cell_tree.data = cellmat\n",
    "print(\"\\nCounts for every sample, only leaf (=end) nodes:\")\n",
    "print(cell_tree.export_counts(only_leafnodes=True).transpose())\n",
    "print(\"\\n\\nCounts for every sample, leaf AND intermediate nodes:\")\n",
    "print(cell_tree.export_counts(only_leafnodes=False).transpose())"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Math on one NBNode\n",
    "\n",
    "After we now have a usefull number assigned to each node, we can do quite a bit of math. \n",
    "Each ``NBNode`` has a ``math_node_attribute`` which is used to calculate math on. This is usually set to ``counter``. \n",
    "\n",
    "We can then use usual math to add, subtract, multiply, etc. nodes with numerics.  \n",
    "\n",
    "Note that this is then not backed up by ``NBNode.ids`` anymore!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "AllCells (counter:1099)\n",
      "├── DN (counter:100)\n",
      "├── DP (counter:1073)\n",
      "├── CD4-/CD8+ (counter:105)\n",
      "│   ├── naive (counter:105)\n",
      "│   ├── Tcm (counter:100)\n",
      "│   ├── Temra (counter:100)\n",
      "│   └── Tem (counter:100)\n",
      "└── CD4+/CD8- (counter:121)\n",
      "    ├── naive (counter:120)\n",
      "    ├── Tcm (counter:100)\n",
      "    ├── Temra (counter:101)\n",
      "    └── Tem (counter:100)\n",
      "None\n",
      "AllCells (counter:999)\n",
      "├── DN (counter:0)\n",
      "├── DP (counter:973)\n",
      "├── CD4-/CD8+ (counter:5)\n",
      "│   ├── naive (counter:5)\n",
      "│   ├── Tcm (counter:0)\n",
      "│   ├── Temra (counter:0)\n",
      "│   └── Tem (counter:0)\n",
      "└── CD4+/CD8- (counter:21)\n",
      "    ├── naive (counter:20)\n",
      "    ├── Tcm (counter:0)\n",
      "    ├── Temra (counter:1)\n",
      "    └── Tem (counter:0)\n",
      "None\n"
     ]
    }
   ],
   "source": [
    "# Math operations do not happen inplace\n",
    "added_tree = cell_tree + 100\n",
    "print(added_tree.pretty_print())\n",
    "\n",
    "print(cell_tree.pretty_print())"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Re-counting by using the ids RESETS all math operations and overwrites the counter with the length of the ids!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "AllCells (counter:999)\n",
      "├── DN (counter:0)\n",
      "├── DP (counter:973)\n",
      "├── CD4-/CD8+ (counter:5)\n",
      "│   ├── naive (counter:5)\n",
      "│   ├── Tcm (counter:0)\n",
      "│   ├── Temra (counter:0)\n",
      "│   └── Tem (counter:0)\n",
      "└── CD4+/CD8- (counter:21)\n",
      "    ├── naive (counter:20)\n",
      "    ├── Tcm (counter:0)\n",
      "    ├── Temra (counter:1)\n",
      "    └── Tem (counter:0)\n"
     ]
    }
   ],
   "source": [
    "added_tree.count(use_ids=True)\n",
    "added_tree.pretty_print()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To focus on the important math, we will only print the root node from now on, but could use pretty_print() everytime to show that the operations happen on every node. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "NBNode('/AllCells', counter=1099, decision_name=None, decision_value=None)\n",
      "NBNode('/AllCells', counter=1089, decision_name=None, decision_value=None)\n",
      "NBNode('/AllCells', counter=2178, decision_name=None, decision_value=None)\n"
     ]
    }
   ],
   "source": [
    "new_tree = cell_tree + 100\n",
    "print(new_tree)\n",
    "\n",
    "new_tree = new_tree - 10\n",
    "print(new_tree)\n",
    "\n",
    "new_tree = new_tree *2\n",
    "print(new_tree)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Sometimes it is important which type is used. There are two options to do that:\n",
    "\n",
    " 1. Change the math operation such that it is appropriate\n",
    " 2. Modify the type of the tree"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "NBNode('/AllCells', counter=242.0, decision_name=None, decision_value=None)\n",
      "NBNode('/AllCells', counter=242.0, decision_name=None, decision_value=None)\n",
      "NBNode('/AllCells', counter=242, decision_name=None, decision_value=None)\n"
     ]
    }
   ],
   "source": [
    "try: \n",
    "    new_tree = new_tree /3\n",
    "except TypeError as e:\n",
    "    print(\"TypeError: descriptor '__truediv__' requires a 'float' object but received a 'int'\")\n",
    "\n",
    "new_tree = new_tree /3.0\n",
    "# Note that the error did NOT happen in the rootnode, so it might be that some math \n",
    "# operations have already been done!\n",
    "print(new_tree)\n",
    "\n",
    "# With astype_math_node_attribute we can change the type of the math node attribute\n",
    "# from all nodes in the tree\n",
    "print(new_tree.astype_math_node_attribute(float))\n",
    "print(new_tree.astype_math_node_attribute(int))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "NBNode('/AllCells', counter=16.133333333333333, decision_name=None, decision_value=None)\n",
      "NBNode('/AllCells', counter=16, decision_name=None, decision_value=None)\n",
      "NBNode('/AllCells', counter=2, decision_name=None, decision_value=None)\n",
      "NBNode('/AllCells', counter=968, decision_name=None, decision_value=None)\n",
      "NBNode('/AllCells', counter=60, decision_name=None, decision_value=None)\n"
     ]
    }
   ],
   "source": [
    "print(new_tree / 15)\n",
    "print(new_tree // 15)\n",
    "print(new_tree % 15)\n",
    "print(new_tree << 2)\n",
    "print(new_tree >> 2)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Equalities\n",
    "\n",
    "When introducing counters, suddenly the same \"trees\" are not identical anymore: \n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "True\n",
      "False\n"
     ]
    }
   ],
   "source": [
    "print(cell_tree == cell_tree)\n",
    "print(cell_tree == cell_tree + 100)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Therefore we introduce the difference between \"structural\" and \"complete\" identity. \n",
    "Two trees are structurally equal if their node.name, node.decision_name and node.decision_value are equal, everything else can be different. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "False\n",
      "True\n",
      "a (counter:0)\n",
      "├── a0 (counter:0)\n",
      "├── a1 (counter:0)\n",
      "│   └── a1a (counter:0)\n",
      "├── a2 (counter:0)\n",
      "└── ADDITIONAL_NODE (counter:0)\n",
      "a (counter:0)\n",
      "├── a0 (counter:0)\n",
      "├── a1 (counter:0)\n",
      "│   └── a1a (counter:0)\n",
      "└── a2 (counter:0)\n",
      "False\n",
      "False\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/home/gugl/.conda_envs/nbnode_pyscaffold/lib/python3.8/site-packages/nbnode/nbnode.py:364: UserWarning: data is no pandas.DataFrame, converting it via pd.DataFrame(data).\n",
      "  warnings.warn(\n"
     ]
    }
   ],
   "source": [
    "from nbnode.nbnode import NBNode\n",
    "import nbnode.nbnode_trees as nbtree\n",
    "\n",
    "original_tree = nbtree.tree_simple()\n",
    "new_tree = nbtree.tree_simple()\n",
    "print(new_tree == new_tree  + 100)\n",
    "print(new_tree.eq_structure(new_tree + 100))\n",
    "\n",
    "NBNode(\"ADDITIONAL_NODE\", parent=new_tree)\n",
    "\n",
    "new_tree.pretty_print()\n",
    "original_tree.pretty_print()\n",
    "\n",
    "print(original_tree == original_tree + 100)\n",
    "print(original_tree == new_tree)\n",
    "\n",
    "# You can generate a new tree by only copying the structure, then counts and data are not copied: \n",
    "new_tree = original_tree.copy_structure()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Math with multiple NBNodes\n",
    "\n",
    "We can then use usual math to add, subtract, multiply, etc. nodes with each other. Explicitely, this traverses all nodes in both trees simultaneously and does the mathematical operation using both ``math_node_attribute``. The result is then saved in the ``math_node_attribute``, but no tree is changed inplace. \n",
    "\n",
    "Note that this is then not backed up by ``NBNode.ids`` anymore!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "AllCells (counter:999)\n",
      "├── DN (counter:0)\n",
      "├── DP (counter:973)\n",
      "├── CD4-/CD8+ (counter:5)\n",
      "│   ├── naive (counter:5)\n",
      "│   ├── Tcm (counter:0)\n",
      "│   ├── Temra (counter:0)\n",
      "│   └── Tem (counter:0)\n",
      "└── CD4+/CD8- (counter:21)\n",
      "    ├── naive (counter:20)\n",
      "    ├── Tcm (counter:0)\n",
      "    ├── Temra (counter:1)\n",
      "    └── Tem (counter:0)\n",
      "AllCells (counter:1)\n",
      "├── DN (counter:1)\n",
      "├── DP (counter:1)\n",
      "├── CD4-/CD8+ (counter:1000)\n",
      "│   ├── naive (counter:1)\n",
      "│   ├── Tcm (counter:1)\n",
      "│   ├── Temra (counter:1)\n",
      "│   └── Tem (counter:1)\n",
      "└── CD4+/CD8- (counter:1)\n",
      "    ├── naive (counter:1)\n",
      "    ├── Tcm (counter:1)\n",
      "    ├── Temra (counter:1)\n",
      "    └── Tem (counter:1)\n"
     ]
    }
   ],
   "source": [
    "import copy\n",
    "import nbnode.nbnode_trees as nbtree\n",
    "cell_tree = nbtree.tree_complete_aligned_trunk()\n",
    "cell_tree.id_preds(cell_tree.predict(cellmat))\n",
    "cell_tree.count(use_ids=True)\n",
    "cell_tree.pretty_print()\n",
    "\n",
    "cell_tree_2 = copy.deepcopy(cell_tree)\n",
    "# Reset the counts of the nodes\n",
    "cell_tree_2.reset_counts()\n",
    "cell_tree_2 = cell_tree_2 + 1\n",
    "\n",
    "# You can set the counter values manually. \n",
    "# Keep in mind that setting an intermediate node (like this one)\n",
    "#  might not make any sense biologically as every cell must reach a leaf node\n",
    "cell_tree_2[\"/AllCells/CD4-/CD8+\"].counter = 1000\n",
    "cell_tree_2.pretty_print()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "AllCells (counter:1000)\n",
      "├── DN (counter:1)\n",
      "├── DP (counter:974)\n",
      "├── CD4-/CD8+ (counter:1005)\n",
      "│   ├── naive (counter:6)\n",
      "│   ├── Tcm (counter:1)\n",
      "│   ├── Temra (counter:1)\n",
      "│   └── Tem (counter:1)\n",
      "└── CD4+/CD8- (counter:22)\n",
      "    ├── naive (counter:21)\n",
      "    ├── Tcm (counter:1)\n",
      "    ├── Temra (counter:2)\n",
      "    └── Tem (counter:1)\n",
      "NBNode('/AllCells', counter=999, decision_name=None, decision_value=None)\n",
      "NBNode('/AllCells', counter=1, decision_name=None, decision_value=None)\n"
     ]
    }
   ],
   "source": [
    "# Add the two trees\n",
    "(cell_tree + cell_tree_2).pretty_print()\n",
    "print(cell_tree)\n",
    "print(cell_tree_2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "AllCells (counter:998)\n",
      "├── DN (counter:-1)\n",
      "├── DP (counter:972)\n",
      "├── CD4-/CD8+ (counter:-995)\n",
      "│   ├── naive (counter:4)\n",
      "│   ├── Tcm (counter:-1)\n",
      "│   ├── Temra (counter:-1)\n",
      "│   └── Tem (counter:-1)\n",
      "└── CD4+/CD8- (counter:20)\n",
      "    ├── naive (counter:19)\n",
      "    ├── Tcm (counter:-1)\n",
      "    ├── Temra (counter:0)\n",
      "    └── Tem (counter:-1)\n"
     ]
    }
   ],
   "source": [
    "(cell_tree - cell_tree_2).pretty_print()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "AllCells (counter:999)\n",
      "├── DN (counter:0)\n",
      "├── DP (counter:973)\n",
      "├── CD4-/CD8+ (counter:5000)\n",
      "│   ├── naive (counter:5)\n",
      "│   ├── Tcm (counter:0)\n",
      "│   ├── Temra (counter:0)\n",
      "│   └── Tem (counter:0)\n",
      "└── CD4+/CD8- (counter:21)\n",
      "    ├── naive (counter:20)\n",
      "    ├── Tcm (counter:0)\n",
      "    ├── Temra (counter:1)\n",
      "    └── Tem (counter:0)\n"
     ]
    }
   ],
   "source": [
    "(cell_tree * cell_tree_2).pretty_print()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "AllCells (counter:999.0)\n",
      "├── DN (counter:0.0)\n",
      "├── DP (counter:973.0)\n",
      "├── CD4-/CD8+ (counter:0.005)\n",
      "│   ├── naive (counter:5.0)\n",
      "│   ├── Tcm (counter:0.0)\n",
      "│   ├── Temra (counter:0.0)\n",
      "│   └── Tem (counter:0.0)\n",
      "└── CD4+/CD8- (counter:21.0)\n",
      "    ├── naive (counter:20.0)\n",
      "    ├── Tcm (counter:0.0)\n",
      "    ├── Temra (counter:1.0)\n",
      "    └── Tem (counter:0.0)\n"
     ]
    }
   ],
   "source": [
    "\n",
    "(cell_tree / cell_tree_2).pretty_print()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "AllCells (counter:0)\n",
      "├── DN (counter:0)\n",
      "├── DP (counter:0)\n",
      "├── CD4-/CD8+ (counter:5)\n",
      "│   ├── naive (counter:0)\n",
      "│   ├── Tcm (counter:0)\n",
      "│   ├── Temra (counter:0)\n",
      "│   └── Tem (counter:0)\n",
      "└── CD4+/CD8- (counter:0)\n",
      "    ├── naive (counter:0)\n",
      "    ├── Tcm (counter:0)\n",
      "    ├── Temra (counter:0)\n",
      "    └── Tem (counter:0)\n"
     ]
    }
   ],
   "source": [
    "\n",
    "(cell_tree % cell_tree_2).pretty_print()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.16"
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}