# Split Up: dtreeviz (Part 3)

## Goal

This post breaks down the dtreeviz module step by step to fully understand how it is implemented. Once I understand it, I would like to contribute to the module and submit a pull request.

I really like this module and would like to see it work with other tree-based libraries such as XGBoost or LightGBM. I found an open issue requesting exactly this (issue 15) on GitHub, so I hope to contribute to it.

This post is the 3rd part: breaking down ShadowDecTree.

## ShadowDecTreeNode class

### Source (GitHub)

In :
import numpy as np
import pandas as pd
from collections import defaultdict
from collections.abc import Sequence  # importing Sequence from collections fails on Python 3.10+
from typing import Mapping, List, Tuple
from numbers import Number
from sklearn.utils import compute_class_weight

class ShadowDecTreeNode:
    """
    A node in a shadow tree.  Each node has left and right
    pointers to child nodes, if any.  As part of tree construction process, the
    samples examined at each decision node or at each leaf node are
    saved into field node_samples.
    """
    def __init__(self, shadow_tree, id, left=None, right=None):
        self.shadow_tree = shadow_tree  # used below via self.shadow_tree.tree_model
        self.id = id
        self.left = left
        self.right = right

    def split(self) -> (int,float):
        ...  # body not shown in this listing

    def feature(self) -> int:
        ...  # body not shown in this listing

    def feature_name(self) -> (str,None):
        # the feature-name lookup that precedes this fallback is not shown
        return None

    def samples(self) -> List[int]:
        """
        Return a list of sample indexes associated with this node. If this is a
        leaf node, it indicates the samples used to compute the predicted value
        or class.  If this is an internal node, it is the number of samples used
        to compute the split point.
        """
        ...  # body not shown in this listing

    def nsamples(self) -> int:
        """
        Return the number of samples associated with this node. If this is a
        leaf node, it indicates the samples used to compute the predicted value
        or class. If this is an internal node, it is the number of samples used
        to compute the split point.
        """
        return self.shadow_tree.tree_model.tree_.n_node_samples[self.id] # same as len(self.node_samples)

    def split_samples(self) -> Tuple[np.ndarray, np.ndarray]:
        """
        Return the list of indexes to the left and the right of the split value.
        """
        samples = np.array(self.samples())
        # node_X_data (this node's values of the split feature) is set on a line not shown here
        split = self.split()
        # np.nonzero returns a tuple of arrays; [0] extracts the index array
        left = np.nonzero(node_X_data < split)[0]
        right = np.nonzero(node_X_data >= split)[0]
        return left, right

    def isleaf(self) -> bool:
        return self.left is None and self.right is None

    def isclassifier(self):
        ...  # body not shown in this listing

    def prediction(self) -> (Number,None):
        """
        If this is a leaf node, return the predicted continuous value, if this is a
        regressor, or the class number, if this is a classifier.
        """
        if not self.isleaf(): return None
        if self.isclassifier():
            # counts (the per-class values at this node) is set on a line not shown here
            predicted_class = np.argmax(counts)
            return predicted_class
        else:
            ...  # regressor branch not shown in this listing

    def prediction_name(self) -> (str,None):
        """
        If the tree model is a classifier and we know the class names,
        return the class name associated with the prediction for this leaf node.
        Return prediction class or value otherwise.
        """
        if self.isclassifier():
            ...  # class-name lookup not shown in this listing
        return self.prediction()

    def class_counts(self) -> (List[int],None):
        """
        If this tree model is a classifier, return a list with the count
        associated with each class.
        """
        if self.isclassifier():
            ...  # count lookup not shown in this listing
        else:
            return None

    def __str__(self):
        if self.left is None and self.right is None:
            return "<pred={value},n={n}>".format(value=round(self.prediction(),1), n=self.nsamples())
        else:
            return "({f}@{s} {left} {right})".format(f=self.feature_name(),
                                                     s=round(self.split(),1),
                                                     left=self.left if self.left is not None else '',
                                                     right=self.right if self.right is not None else '')

### Instantiate class objects

#### Create a tree model with scikit-learn

In :
import numpy as np
import graphviz
from sklearn import tree

X = np.array([[0, 0], [1, 1]])
Y = np.array([0, 1])
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X, Y)
# feature_names and class_names are passed as strings, one per feature / class
dot_data = tree.export_graphviz(clf, out_file=None,
                                feature_names=['0', '1'],
                                class_names=['0', '1'],
                                filled=True, rounded=True,
                                special_characters=True)
graph = graphviz.Source(dot_data)
graph

Out:
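
If graphviz is not available, scikit-learn's `tree.export_text` gives a text rendering of the same tree. This is a minimal sketch; the feature names `x0` and `x1` are placeholders chosen here, and `random_state=0` is only set to make the run repeatable.

```python
import numpy as np
from sklearn import tree

X = np.array([[0, 0], [1, 1]])
Y = np.array([0, 1])
clf = tree.DecisionTreeClassifier(random_state=0).fit(X, Y)

# feature_names needs one entry per column of X
text = tree.export_text(clf, feature_names=["x0", "x1"])
print(text)
```

The printed tree has one split at 0.50 and two leaves, one per class.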

### Create a ShadowDecTreeNode

ShadowDecTreeNode `__init__`

• L222-226: store the input arguments as instance attributes
• L228-308: define accessors that mirror the scikit-learn tree object (split, feature, etc.) along with utility methods
In :
# instantiate ShadowDecTree

In :
# instantiate ShadowDecTreeNode

Out:
<__main__.ShadowDecTreeNode at 0x120eda908>
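
The `left` and `right` pointers that ShadowDecTreeNode takes can be filled in by walking scikit-learn's `children_left` and `children_right` arrays from the root. The sketch below does this with hand-written arrays for a three-node tree like the one fitted above. `Node` and `build` are hypothetical helpers for illustration and are not taken from dtreeviz.

```python
from dataclasses import dataclass
from typing import Optional

TREE_LEAF = -1  # scikit-learn's marker for "no child"

# Hand-written stand-ins for tree_.children_left / tree_.children_right:
# node 0 is the root, nodes 1 and 2 are leaves.
children_left = [1, TREE_LEAF, TREE_LEAF]
children_right = [2, TREE_LEAF, TREE_LEAF]


@dataclass
class Node:
    id: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def isleaf(self) -> bool:
        return self.left is None and self.right is None


def build(node_id: int) -> Node:
    """Recursively link the node with the given id to its children."""
    node = Node(node_id)
    if children_left[node_id] != TREE_LEAF:
        node.left = build(children_left[node_id])
    if children_right[node_id] != TREE_LEAF:
        node.right = build(children_right[node_id])
    return node


root = build(0)
print(root.isleaf(), root.left.id, root.right.id)
```

A node created on its own with the default `left=None, right=None` therefore reports itself as a leaf, whatever its id.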

### Methods under ShadowDecTreeNode

In :
# L228 split

Out:
0.5
In :
# L231 feature

Out:
1
In :
# L239 samples

Out:
[0, 1]
In :
# L248 nsamples

Out:
2
In :
# L257 split_samples

Out:
(array([0]), array([1]))
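
To see where the two index arrays come from, here is a small sketch of the same partition on the toy data, assuming the split is on feature 1 at threshold 0.5 as in the outputs above. One detail worth noting: `np.nonzero` returns a tuple with one array per dimension, so `[0]` is needed to get a plain index array.

```python
import numpy as np

X = np.array([[0, 0], [1, 1]])
samples = np.array([0, 1])   # sample indexes that reached the node
feature, split = 1, 0.5      # assumed from the outputs shown earlier

node_X_data = X[samples, feature]            # split-feature values at this node
left = np.nonzero(node_X_data < split)[0]    # positions that go to the left child
right = np.nonzero(node_X_data >= split)[0]  # positions that go to the right child

print(left, right)
```

The returned values are positions within `samples`; `samples[left]` and `samples[right]` map them back to row indexes of `X` when the node does not hold every row.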
In :
# L268 isleaf

Out:
True
In :
# L271 isclassifier

Out:
array([ True])
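
The result is an array rather than a plain bool. A plausible explanation, assuming isclassifier compares scikit-learn's `tree_.n_classes` with 1, is that `n_classes` holds one entry per output. The sketch below mimics this with a hand-written array and shows one way to reduce the comparison to a single bool.

```python
import numpy as np

# Stand-in for tree_.n_classes of a single-output, two-class classifier
n_classes = np.array([2])

elementwise = n_classes > 1              # array with one flag per output
is_classifier = bool(elementwise.all())  # single truth value

print(elementwise, is_classifier)
```

A one-element array still works in an `if` statement, which is why the array result causes no problem in the methods above.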
In :
# L287 prediction_name

Out:
0
In :
# L298 class_counts

Out:
array([1, 1])
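
Both results can be traced back to the per-class values stored at a node. The sketch below uses a hand-written array shaped like scikit-learn's `tree_.value` (nodes × outputs × classes) for a three-node tree and applies the `np.argmax` step from prediction. The array contents and the `class_names` list are assumptions for illustration. Depending on the scikit-learn version, `tree_.value` may hold class proportions instead of counts, which leaves the argmax unchanged.

```python
import numpy as np

# Stand-in for tree_.value: one row per node, one block per output, one entry per class
value = np.array([[[1.0, 1.0]],   # root: one sample of each class
                  [[1.0, 0.0]],   # left leaf: class 0 only
                  [[0.0, 1.0]]])  # right leaf: class 1 only
class_names = ["0", "1"]          # hypothetical labels


def prediction(node_id: int) -> int:
    counts = value[node_id][0]     # per-class values for the first (only) output
    return int(np.argmax(counts))  # ties resolve to the lowest class index


print([prediction(i) for i in range(3)])
print(class_names[prediction(2)])
```

The tie at the root resolves to class 0, which matches the `0` returned by prediction_name above for a node holding one sample of each class.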