I am a pure math person doing some ML self-study and I am pretty lost.
I am trying to solve the following exercises on decision trees:
Exercise 1. Consider the following training set where $X_1,X_2,X_3,X_4$ are the attributes and $Y$ is the class variable. $$ \begin{matrix} Y & X_1 & X_2 & X_3 & X_4 \\ 1 & 0 & 1 & 0 & 1 \\ 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 1 & 1 & 0 \\ 1 & 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 & 0 \\ -1& 0 & 0 & 1 & 1 \\ -1& 0 & 0 & 0 & 0 \\ -1& 0 & 0 & 1 & 0 \\ -1& 1 & 0 & 0 & 0 \\ -1& 0 & 0 & 1 & 1 \\ \end{matrix} $$
- Learn a decision tree using the ID3 algorithm.
- Draw a decision tree having only 4 leaf nodes, 3 internal nodes and depth bounded by 2, that has 100% accuracy on the given dataset.
Exercise 2. Let $x$ be a vector of $n$ Boolean variables $\{X_1,\dots,X_n\}$ and let $k$ be an integer less than $n$. Let $f_k$ be a target concept which is a disjunction consisting of $k$ literals. State the size of the smallest possible consistent decision tree (namely a decision tree that correctly classifies all possible examples) for $f_k$ in terms of $n$ and $k$ and describe its shape.
Now, I've done the first part of Exercise 1 as follows: first I noticed that the branch $X_2 = 1$ is pure (every training example with $X_2 = 1$ has $Y = 1$), so I chose $X_2$ as the root; then I used information gain to choose the remaining internal nodes. The second question confuses me: my understanding was that ID3 already gives the smallest tree, so how am I supposed to make it smaller? Or, if I am wrong about that, can anybody help me clarify?
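To double-check the root choice, I wrote a small Python sketch that computes the information gain of each attribute on the training set (the indexing convention, where column 0 is $Y$ and columns 1–4 are $X_1,\dots,X_4$, is just my own):

```python
from collections import Counter
from math import log2

# Training set from Exercise 1, rows as (Y, X1, X2, X3, X4)
data = [
    ( 1, 0, 1, 0, 1), ( 1, 1, 0, 1, 0), ( 1, 1, 1, 1, 0),
    ( 1, 0, 0, 0, 1), ( 1, 1, 1, 1, 0), (-1, 0, 0, 1, 1),
    (-1, 0, 0, 0, 0), (-1, 0, 0, 1, 0), (-1, 1, 0, 0, 0),
    (-1, 0, 0, 1, 1),
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(data, attr):
    """Information gain of splitting on column `attr` (1..4 = X1..X4)."""
    labels = [row[0] for row in data]
    gain = entropy(labels)
    for v in (0, 1):
        subset = [row[0] for row in data if row[attr] == v]
        if subset:
            gain -= len(subset) / len(data) * entropy(subset)
    return gain

for a in range(1, 5):
    print(f"Gain(X{a}) = {info_gain(data, a):.4f}")
```

This prints $\mathrm{Gain}(X_2) \approx 0.3958$ as the maximum (with $X_3$ and $X_4$ at gain $0$), which agrees with picking $X_2$ as the root.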
For the second problem, I really don't know where to start, so any hint would be appreciated.