Questions tagged [cart]
16 questions
8
votes
1 answer
Decision Trees - C4.5 vs CART - rule sets
When I read the scikit-learn user manual about Decision Trees, they mentioned that
CART (Classification and Regression Trees) is very similar to C4.5,
but it differs in that it supports numerical target variables
(regression) and does not…
emperorspride188
- 181
- 1
- 2
4
votes
3 answers
List of samples that each tree in a random forest is trained on in Scikit-Learn
In Scikit-learn's random forest, you can set bootstrap=True and each tree would select a subset of samples to train on. Is there a way to see which samples are used in each tree?
I went through the documentation about the tree estimators and all the…
theonionring0127
- 43
- 5
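One possible answer sketch (this relies on scikit-learn's *internal* bootstrap behavior, not a documented API, so treat it as an assumption): each fitted tree stores the integer `random_state` that seeded its bootstrap draw, and sklearn draws the sample indices with `randint(0, n_samples, n_samples)`, so the indices can be regenerated:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=3, bootstrap=True, random_state=0)
forest.fit(X, y)

n_samples = X.shape[0]
for i, tree in enumerate(forest.estimators_):
    # Assumption: sklearn seeds each tree's bootstrap with tree.random_state
    # and draws indices via randint(0, n_samples, n_samples) internally.
    rng = np.random.RandomState(tree.random_state)
    indices = rng.randint(0, n_samples, n_samples)
    print(f"tree {i}: first 10 bootstrap indices -> {indices[:10]}")
```

Because this mirrors a private implementation detail, it can break between sklearn versions; verify against your installed version before relying on it.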
3
votes
1 answer
Make a random forest estimator exactly the same as a decision tree
The idea is to make one of the trees of a Random Forest be built exactly equal to a Decision Tree.
First, we load all libraries, fit a decision tree and plot it.
import numpy as np
import pandas as pd
import matplotlib.pyplot as…
Carlos Mougan
- 6,430
- 2
- 20
- 51
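A sketch of the usual recipe for this (my assumption, not taken from the question): disable the forest's two sources of randomness, bootstrapping and feature subsampling, and use a single estimator. Both unpruned trees then memorize the training data, so their training-set predictions coincide:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# A single, fully grown decision tree.
dt = DecisionTreeClassifier(random_state=0).fit(X, y)

# A "forest" of one tree: bootstrap=False means it sees all samples,
# max_features=None means it considers all features at every split.
rf = RandomForestClassifier(
    n_estimators=1, bootstrap=False, max_features=None, random_state=0
).fit(X, y)

print(np.array_equal(dt.predict(X), rf.predict(X)))
```

Note that exact node-for-node equality of the two trees is not guaranteed in general, because the forest passes a *derived* seed to its inner tree, which can break ties between equally good splits differently.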
2
votes
1 answer
Problems with decision tree labeling of nodes
Decision trees, as we know, assign a label to each leaf node based on majority class voting. I am curious what problems could arise with such a labeling scheme. Does it lead to overfitting the data?
christopher
- 153
- 4
1
vote
0 answers
Looking for CART/ML model that works with relative data
I am a beginner at AI and ML. I have been given a dataset where I have noticed the columns are relative to one another. Is there any CART or ML model that can work with relative data?
For example, in a Decision Tree, the tree looks like:
if…
BannerG
- 11
- 1
1
vote
1 answer
Random selection of variables in each run of python sklearn decision tree (regression)
When I set random_state=None and run a decision tree for regression in Python sklearn, it takes different variables to build the tree each time.
Shouldn't there be only a few top variables used for splitting, which should give me similar trees…
Mighty
- 153
- 4
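A likely explanation (my reading, not from the question): with `random_state=None`, ties between equally good splits are broken differently on each run, so correlated features can swap places between trees. Fixing the seed makes repeated fits identical, a sketch:

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.tree import DecisionTreeRegressor

X, y = load_diabetes(return_X_y=True)

# With a fixed random_state, repeated fits break ties between equally
# good splits the same way, so the fitted trees are identical.
a = DecisionTreeRegressor(random_state=0).fit(X, y)
b = DecisionTreeRegressor(random_state=0).fit(X, y)

# Compare the split-feature array of the two fitted trees node by node.
print(np.array_equal(a.tree_.feature, b.tree_.feature))
```

With `random_state=None` the same comparison across runs can come out `False` even though both trees are equally good fits.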
1
vote
0 answers
Prediction in CART Decision Trees
I was studying the algorithm of CART (classification and regression trees), but the formula for the prediction is confusing me.
First we have the following definition:
Let $X := \{x_1, \dots, x_N\} \subset \mathbb{R}^d$ be a set of data points and $B$ the smallest…
Code Pope
- 121
- 3
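The excerpt's definition is truncated, so as a sketch here is the standard textbook form of the CART predictor (notation assumed, not taken from the question): the tree partitions the input space into leaf regions $R_1, \dots, R_M$, and the prediction is constant on each region.

```latex
% CART prediction as a piecewise-constant function over leaf regions
f(x) = \sum_{m=1}^{M} c_m \, \mathbf{1}\{x \in R_m\},
\qquad
c_m =
\begin{cases}
  \dfrac{1}{|\{i : x_i \in R_m\}|} \displaystyle\sum_{i :\, x_i \in R_m} y_i
    & \text{(regression: leaf mean)} \\[2ex]
  \operatorname*{arg\,max}_{k} \; |\{i : x_i \in R_m,\; y_i = k\}|
    & \text{(classification: leaf majority class)}
\end{cases}
```

So for a new point $x$, the prediction is simply the stored constant of the single leaf region containing $x$.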
1
vote
0 answers
What is the difference between a decision tree and something called "subgroup discovery algorithms"?
I'm reading a paper which states that subgroup discovery is:
Subgroup discovery is a data mining technique whose goal is to detect interesting subgroups into a population with respect to a property of interest
The paper goes on to make the…
Anton
- 111
- 1
1
vote
0 answers
CART classification for imbalanced datasets with R
Hey guys, I need your help for a university project. The main task is to analyze the effects of over/under-sampling on an imbalanced dataset. But before we can even start with that, our task sheet says that we 1) have to find/create imbalanced…
mingabua
- 11
- 2
1
vote
1 answer
XGB predict_proba estimates don't match sum of leaves
When using an XGB model in the context of binary classification, I observed that the test estimates given by predict_proba were close but not equal to the results I obtained by summing the outputs of the corresponding leaves for each observation and…
1
vote
1 answer
N-ary decision tree with categorical features
I want to build an n-ary decision tree with categorical features.
I am using the ordinary ID3 algorithm to build the tree.
Let's take the following dataset as a training dataset for building a decision…
dzi
- 111
- 2
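The core of ID3's n-ary split on a categorical feature can be sketched in a few lines (toy data and function names are mine, purely illustrative): pick the feature with the highest information gain, where the split creates one child branch per distinct category.

```python
import numpy as np
from collections import Counter

def entropy(labels):
    """Shannon entropy of a sequence of class labels."""
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def information_gain(labels, feature_values):
    """ID3 gain of an n-ary split on a categorical feature:
    one child branch per distinct category."""
    n = len(labels)
    remainder = 0.0
    for v in set(feature_values):
        subset = [l for l, fv in zip(labels, feature_values) if fv == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# Toy example: a 3-valued outlook feature vs. play/no-play labels.
labels  = ["yes", "yes", "no", "no", "yes", "no"]
outlook = ["sun", "sun", "rain", "rain", "sun", "overcast"]
print(round(information_gain(labels, outlook), 3))
```

ID3 computes this gain for every remaining categorical feature at a node, splits on the best one, and recurses into each category's branch.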
1
vote
0 answers
How to perform bootstrap validation on CART decision tree?
I have a relatively small dataset n = 500 for which I am training a CART decision tree.
My dataset has about 30 variables and the outcome has 3 classes.
I am using CART for interpretability purposes, as what I am interested in is sharing and…
Eric Yamga
- 11
- 2
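A minimal sketch of one common recipe (dataset, replicate count, and out-of-bag scoring are my assumptions, not from the question): resample the data with replacement, fit the tree on each bootstrap sample, and evaluate on the observations left out of that sample:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.RandomState(0)
n = len(X)

oob_scores = []
for _ in range(20):                            # number of bootstrap replicates
    boot = rng.randint(0, n, n)                # n indices drawn with replacement
    oob = np.setdiff1d(np.arange(n), boot)     # out-of-bag rows (~36.8% of data)
    tree = DecisionTreeClassifier(random_state=0).fit(X[boot], y[boot])
    oob_scores.append(tree.score(X[oob], y[oob]))

print(f"bootstrap-validated accuracy: {np.mean(oob_scores):.3f} "
      f"(+/- {np.std(oob_scores):.3f})")
```

With only n = 500 this gives a more honest accuracy estimate than a single train/test split, at the cost of refitting the tree once per replicate.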
0
votes
1 answer
scikit-learn target variable reversed (DecisionTreeClassifier)
I created a Decision Tree Classifier using sklearn, defined the target variable:
#extract features and target variables
x = df.drop(columns="target_column")
y = df["target_column"]
#save the feature name and target variables
feature_names =…
Godgory
- 3
- 1
0
votes
1 answer
How are split decisions for observations (not features) made in decision trees?
I have read a lot of articles about decision trees, and every one of them only focused on telling how a feature/column is considered for a split, based on criteria like Gini index, entropy, chi-square, and information gain. But not one talked about…
Naveen Reddy Marthala
- 325
- 2
- 16
0
votes
1 answer
How are the split points in a decision tree within a Random Forest selected? (Based on which criteria?)
I checked many posts to figure out how the random forest (RF) learning algorithm (an ensemble of many decision trees (DT) built with bagging) selects split points at each node. There are some close questions which are…
Mario
- 571
- 1
- 6
- 24
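A sketch of the criterion itself (toy data and function names are mine): at each node, CART-style trees score candidate thresholds, midpoints between sorted distinct feature values, by the weighted impurity of the two children, and keep the threshold that minimizes it. The one thing a Random Forest adds, omitted here, is that each node only considers a random subset of features (`max_features`).

```python
import numpy as np

def gini(labels):
    """Gini impurity of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - (p ** 2).sum()

def best_split(x, y):
    """Score every candidate threshold (midpoints between sorted distinct
    values of one feature) and return the one minimizing the weighted
    child impurity -- the node-level criterion of CART-style trees."""
    best_t, best_imp = None, np.inf
    values = np.unique(x)
    for t in (values[:-1] + values[1:]) / 2:
        left, right = y[x <= t], y[x > t]
        imp = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if imp < best_imp:
            best_t, best_imp = t, imp
    return best_t, best_imp

# Toy 1-D feature: a threshold at 2.5 separates the classes perfectly,
# giving zero weighted child impurity.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([0, 0, 1, 1])
print(best_split(x, y))
```

The full tree-growing loop just applies this search at every node over the (possibly subsampled) features, splits on the winner, and recurses until a stopping rule fires.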