Split-Up: dtreeviz (Part 1)

Goal

This post walks through each function in the dtreeviz module to fully understand what is implemented. Once I understand it, I would like to contribute to the module and submit a pull request.

I really like this module and would like to see it work with other tree-based libraries such as XGBoost or LightGBM. I found an issue asking for exactly this (issue 15) on GitHub, so I hope I can contribute to it.

You would just have to get ShadowDecisionTree wrappers for those trees.

Based on this comment, I first need to understand the ShadowDecisionTree class, which appears in the code as ShadowDecTree.


Understanding the folder structure

In this post, we will take a deep dive into the core module, dtreeviz.

This module comprises four Python files.

__init__.py is empty, so we can skip it.

Let's go through them one by one.

dtreeviz Module

shadow.py

The shadow module contains two classes:

  • ShadowDecTree
  • ShadowDecTreeNode


Since ShadowDecTreeNode takes a shadow_tree object as one of its input arguments, we will look at ShadowDecTree first.
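To make that dependency concrete, here is a minimal sketch of a node that keeps a back-reference to its tree. It is not dtreeviz's actual code: ToyShadowTree, ToyShadowNode, split_features and feature_name are hypothetical names invented for illustration. The only thing taken from the source is that the node receives a shadow_tree argument.

```python
class ToyShadowTree:
    """Hypothetical stand-in for ShadowDecTree: holds tree-wide information."""

    def __init__(self, feature_names, split_features):
        self.feature_names = feature_names
        # split_features[i] is the index of the feature that node i splits on
        # (an assumption made for this sketch).
        self.split_features = split_features


class ToyShadowNode:
    """Hypothetical stand-in for ShadowDecTreeNode."""

    def __init__(self, shadow_tree, node_id):
        self.shadow_tree = shadow_tree  # back-reference to the owning tree
        self.id = node_id

    def feature_name(self):
        # The node stores only its id; the name is looked up through the tree.
        index = self.shadow_tree.split_features[self.id]
        return self.shadow_tree.feature_names[index]


tree = ToyShadowTree(["age", "income"], split_features=[1, 0])
node = ToyShadowNode(tree, node_id=0)
print(node.feature_name())  # income
```

Because the node delegates lookups like this to the tree, reading the tree class first makes the node class easier to follow.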


ShadowDecTree

Attributes

ShadowDecTree has 12 instance attributes.


tree_model, feature_names, class_names, X_train, and y_train are passed straight through from the input arguments, while class_weight is read from the decision tree object itself (e.g., a scikit-learn decision tree). class_names might be updated when there are more than two classes.

        self.tree_model = tree_model
        self.feature_names = feature_names
        self.class_names = class_names
        self.class_weight = tree_model.class_weight
        # (some lines omitted)
        self.X_train = X_train
        self.y_train = y_train
  • root is the root ShadowDecTreeNode
  • internal and leaves are lists of ShadowDecTreeNode objects generated by the function walk.
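The idea of a walk that sorts nodes into internal and leaves can be sketched as follows. This is a simplified illustration under stated assumptions, not the dtreeviz implementation: ToyNode and build_nodes are hypothetical names, and plain lists stand in for scikit-learn's tree_.children_left and tree_.children_right arrays, where -1 marks a leaf.

```python
# Plain lists mimic scikit-learn's tree_ layout: children_left[i] == -1
# means node i is a leaf.
TREE_LEAF = -1


class ToyNode:
    """Hypothetical stand-in for ShadowDecTreeNode."""

    def __init__(self, node_id, left=None, right=None):
        self.id = node_id
        self.left = left
        self.right = right

    def isleaf(self):
        return self.left is None and self.right is None


def build_nodes(children_left, children_right):
    """Walk the tree from node 0 and sort nodes into internal and leaves."""
    internal, leaves = [], []

    def walk(node_id):
        if children_left[node_id] == TREE_LEAF:
            node = ToyNode(node_id)
            leaves.append(node)
            return node
        # Build both subtrees first, then the node that joins them.
        left = walk(children_left[node_id])
        right = walk(children_right[node_id])
        node = ToyNode(node_id, left, right)
        internal.append(node)
        return node

    root = walk(0)
    return root, internal, leaves


# Node 0 splits into 1 (leaf) and 2; node 2 splits into leaves 3 and 4.
children_left = [1, -1, 3, -1, -1]
children_right = [2, -1, 4, -1, -1]
root, internal, leaves = build_nodes(children_left, children_right)
print([n.id for n in internal])  # [2, 0]
print([n.id for n in leaves])    # [1, 3, 4]
```

In this sketch the internal list is filled bottom-up because each node is created only after its children. The ordering in the real module may differ.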

node_to_samples and unique_target_values

To be continued....
