I'm running some classification experiments with decision trees (specifically the rpart package in R). Having set the depth of the decision tree to 10, I expected to get a small tree, but it is in fact quite large: its size is 7650. What exactly is the definition of size (and depth) for decision trees?

PS: my dataset is quite large.

Raphael
user

1 Answer

The depth of a decision tree is the length of the longest path from the root to a leaf, measured in edges.

The size of a decision tree is the number of nodes in the tree.

Note that if each node of the decision tree makes a binary decision, the size can be as large as $2^{d+1}-1$, where $d$ is the depth. If some nodes have more than 2 children (e.g., they make a ternary decision instead of a binary decision), then the size can be even larger. For $d = 10$, the binary bound is $2^{11}-1 = 2047$, so a size of 7650 is possible only if some nodes make ternary or multi-way decisions; a purely binary tree of depth 10 cannot be that large.

D.W.