Questions tagged [lightgbm]

108 questions
24
votes
1 answer

LightGBM vs XGBoost vs CatBoost

I've seen that in Kaggle competitions people are using LightGBM where they used to use XGBoost. My question is: when would you rather use XGBoost instead of LightGBM? What about CatBoost?
David Masip • 6,136
20
votes
3 answers

L1 & L2 Regularization in Light GBM

This question pertains to L1 & L2 regularization parameters in Light GBM. As per official documentation: reg_alpha (float, optional (default=0.)) – L1 regularization term on weights. reg_lambda (float, optional (default=0.)) – L2 regularization term…
Vikrant Arora • 466
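A minimal sketch of how these two parameters are passed through the scikit-learn wrapper (toy data, not from the question):

```python
import numpy as np
import lightgbm as lgb

# Toy data, just to make the sketch runnable.
rng = np.random.default_rng(0)
X, y = rng.random((500, 10)), rng.random(500)

# reg_alpha adds an L1 penalty on leaf output values; reg_lambda adds an
# L2 penalty. Larger values shrink leaf outputs toward zero and make
# low-gain splits less attractive.
model = lgb.LGBMRegressor(reg_alpha=1.0, reg_lambda=1.0, n_estimators=200)
model.fit(X, y)
```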
19
votes
3 answers

What is the proper way to use early stopping with cross-validation?

I am not sure of the proper way to use early stopping with cross-validation for a gradient boosting algorithm. For a simple train/valid split, we can use the valid dataset as the evaluation dataset for early stopping, and when refitting we…
amine456 • 191
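One hedged sketch of the pattern the question asks about, assuming lightgbm >= 4 (toy data; the refit convention shown is a common one, not the only one):

```python
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
X, y = rng.random((1000, 20)), rng.random(1000)
dtrain = lgb.Dataset(X, label=y)

params = {"objective": "regression", "metric": "l2", "verbosity": -1}

# Each CV fold's held-out split acts as the evaluation set; boosting
# stops once the mean metric has not improved for 50 rounds.
cv_results = lgb.cv(
    params,
    dtrain,
    num_boost_round=5000,
    nfold=5,
    stratified=False,  # stratification only makes sense for classification
    callbacks=[lgb.early_stopping(stopping_rounds=50)],
)

# One common convention: refit on all the data with the round count that
# survived early stopping (the length of any returned metric series).
best_rounds = len(next(iter(cv_results.values())))
final_model = lgb.train(params, dtrain, num_boost_round=best_rounds)
```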
19
votes
5 answers

How to make LightGBM suppress output?

I have tried for a while to figure out how to "shut up" LightGBM. In particular, I would like to suppress LightGBM's output during training (i.e. the feedback on the boosting steps). My model: params = { 'objective': 'regression', …
Peter • 7,896
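A minimal sketch of the usual combination, assuming a recent lightgbm: verbosity=-1 in params silences the library's internal logs, and zeroing (or omitting) the log_evaluation callback suppresses the per-iteration metric lines:

```python
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
X, y = rng.random((500, 10)), rng.random(500)
dtrain = lgb.Dataset(X, label=y)

params = {
    "objective": "regression",
    "verbosity": -1,  # silences LightGBM's internal info/warning logs
}

# With log_evaluation(period=0) no per-iteration metric lines are printed.
booster = lgb.train(
    params,
    dtrain,
    num_boost_round=100,
    valid_sets=[dtrain],
    callbacks=[lgb.log_evaluation(period=0)],
)
```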
15
votes
4 answers

LightGBM gives different results (metrics) depending on the columns order

I have two nearly identical datasets A and B which differ only in column ordering. I then train a LightGBM model on each of the two datasets with the following steps: Divide each dataset into training and testing (use the same random seed…
Duy Bui • 261
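Tie-breaking between equally good splits, and feature bundling, can depend on column position, so one defensive workaround is to fix a canonical column order before training. A sketch, where train_a and y_a are hypothetical stand-ins for the question's data:

```python
import pandas as pd
import lightgbm as lgb

def canonicalize(df: pd.DataFrame) -> pd.DataFrame:
    # Enforce one fixed column order so datasets A and B produce
    # identical inputs regardless of how the columns arrived.
    return df.reindex(sorted(df.columns), axis=1)

# deterministic=True asks LightGBM for stable results across runs
# (the docs suggest pairing it with force_col_wise or force_row_wise).
model = lgb.LGBMClassifier(random_state=42, deterministic=True)
model.fit(canonicalize(train_a), y_a)
```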
14
votes
2 answers

SHAP value analysis gives different feature importance on train and test set

Should SHAP value analysis be done on the train or test set? What does it mean if the feature importance based on mean |SHAP value| is different between the train and test set of my lightgbm model? I intend to use SHAP analysis to identify how each…
pbk • 143
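A sketch of the comparison itself, assuming a fitted regression model and hypothetical X_train / X_test frames; a large gap between the two importance vectors can point to overfitting or a shift between the splits:

```python
import numpy as np
import shap

explainer = shap.TreeExplainer(model)

# Mean |SHAP| per feature, computed separately on each split.
imp_train = np.abs(explainer.shap_values(X_train)).mean(axis=0)
imp_test = np.abs(explainer.shap_values(X_test)).mean(axis=0)

# Features whose importance differs sharply between splits deserve a
# closer look before trusting the explanation.
print(imp_train, imp_test)
```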
11
votes
1 answer

Random Forest VS LightGBM

Can somebody explain in detail the differences between Random Forest and LightGBM, and how the algorithms work under the hood? As per my understanding from the documentation: LightGBM and RF differ in the way the trees are…
Pluviophile • 4,203
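One concrete way to see the difference is that LightGBM can emulate a random forest directly; a sketch (parameter values are illustrative):

```python
import lightgbm as lgb

# boosting_type="rf" fits each tree on a bagged sample instead of on
# the previous trees' residuals, which is the random-forest recipe.
# RF mode requires bagging to actually be enabled.
rf_like = lgb.LGBMClassifier(
    boosting_type="rf",
    subsample=0.8,         # row bagging per tree (must be < 1.0 in rf mode)
    subsample_freq=1,      # re-draw the bag at every iteration
    colsample_bytree=0.8,  # column subsample per tree
    n_estimators=300,
)
```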
11
votes
1 answer

Differences between class_weight and scale_pos_weight in LightGBM

I have a very imbalanced dataset with the ratio of the positive samples to the negative samples being 1:496. The scoring metric is the f1 score and my desired model is LightGBM. I am using the sklearn implementation of LightGBM. I have read the…
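A minimal sketch of the two options for the 1:496 ratio described above; they act on the loss in similar ways, so they are normally used one at a time:

```python
from lightgbm import LGBMClassifier

# scale_pos_weight multiplies the weight of positive samples by a fixed
# factor, typically the negative:positive ratio.
clf_ratio = LGBMClassifier(scale_pos_weight=496)

# class_weight="balanced" rescales each class by
# n_samples / (n_classes * class_count), the sklearn convention.
clf_balanced = LGBMClassifier(class_weight="balanced")
```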
10
votes
1 answer

How do GBM algorithms handle missing data?

How do GBM algorithms, such as XGBoost or LightGBM, handle NaN values? I know that they learn how to replace NaN values with other values, but my question is: how exactly do they do it?
user10296606 • 1,906
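A small runnable sketch of the behavior in LightGBM's case (toy data): no imputation step is needed, because each split learns a default direction for missing values during training:

```python
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
X = rng.random((1000, 5))
X[rng.random(X.shape) < 0.1] = np.nan  # inject ~10% missing values
y = rng.random(1000)

# At each split, NaNs are routed to whichever child gave the better
# loss during training, and follow that direction at prediction time.
model = lgb.LGBMRegressor(n_estimators=100).fit(X, y)
preds = model.predict(X)
```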
7
votes
2 answers

What is the best way (cheapest / fastest option) to train a model on a massive dataset (400GB+, 100m rows x 200 columns)?

I have a 400GB data set that I want to train a model on. What is the cheapest method to train this model? The options I can think of so far are: AWS instance with massive RAM and train on CPU (slow, but instances are cheap). AWS instance with many…
lara_toff • 221
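For the LightGBM option specifically, a memory-conscious sketch (file names are hypothetical, and the label is assumed to be the first CSV column): let the library read the file itself with two-round loading, then cache the binned representation so later runs skip the raw file entirely:

```python
import lightgbm as lgb

# two_round trades a second pass over the file for a much smaller
# memory footprint during Dataset construction.
dtrain = lgb.Dataset("train.csv", params={"two_round": True, "header": True})

booster = lgb.train(
    {"objective": "regression", "verbosity": -1},
    dtrain,
    num_boost_round=500,
)

# Reload later with lgb.Dataset("train.bin") instead of re-binning 400GB.
dtrain.save_binary("train.bin")
```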
7
votes
1 answer

How is the "base value" of SHAP values calculated?

I'm trying to understand how the base value is calculated, so I used an example from SHAP's GitHub notebooks, Census income classification with LightGBM. Right after I trained the LightGBM model, I applied explainer.shap_values() on each row of the…
David293836 • 217
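A sketch of one sanity check, assuming the fitted classifier from that notebook and a hypothetical X_train frame: the base value is the model's average raw output over the background data, which for a binary classifier lives in log-odds space, not probability space:

```python
import shap

explainer = shap.TreeExplainer(model)

# Mean raw (log-odds) prediction over the training data.
raw_mean = model.predict(X_train, raw_score=True).mean()

# The two numbers should roughly agree.
print(explainer.expected_value, raw_mean)
```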
7
votes
2 answers

Correct interpretation of summary_plot shap graph

While going through the various resources online to understand SHAP plots, I ended up slightly confused. Find below my interpretation of the overall plot given in examples: a SHAP value of 0 for a feature corresponds to the average prediction using all…
Sanchez_P • 101
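A minimal sketch of producing the plot, with hypothetical model and X_test names; the comments spell out what each dot encodes:

```python
import shap

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Each dot is one sample for one feature: its x-position is the SHAP
# value (how far it pushes that prediction above or below the base
# value) and its color is the feature's raw value. x = 0 means the
# feature contributed nothing to that sample relative to the base
# value, not "the average prediction".
shap.summary_plot(shap_values, X_test)
```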
7
votes
2 answers

Math Behind GOSS (Gradient-Based One Side Sampling)?

As per my understanding from books & Google searches, GOSS (Gradient-Based One Side Sampling) is a novel sampling method that downsamples instances on the basis of their gradients. As we know, instances with small gradients are well trained (small…
Pluviophile • 4,203
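A toy NumPy sketch of the sampling step as described in the LightGBM paper (not the library's internal implementation):

```python
import numpy as np

def goss_sample(grad: np.ndarray, a: float = 0.2, b: float = 0.1):
    """Toy GOSS step: keep the top a*100% of rows by |gradient|, sample
    b*100% of the rest uniformly, and up-weight the sampled
    small-gradient rows by (1 - a) / b so the information-gain estimate
    stays approximately unbiased."""
    n = len(grad)
    order = np.argsort(-np.abs(grad))  # largest gradients first
    top_k, rand_k = int(a * n), int(b * n)
    top = order[:top_k]
    sampled = np.random.choice(order[top_k:], size=rand_k, replace=False)
    idx = np.concatenate([top, sampled])
    weights = np.ones(idx.size)
    weights[top_k:] = (1.0 - a) / b  # amplification factor from the paper
    return idx, weights
```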
7
votes
2 answers

Light GBM Regressor, L1 & L2 Regularization and Feature Importances

I want to know how L1 & L2 regularization works in Light GBM and how to interpret the feature importances. The scenario: I used LGBMRegressor with RandomizedSearchCV (cv=3, iterations=50) on a dataset of 400,000 observations & 160 variables. In order…
Vikrant Arora • 466
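A sketch of reading both importance flavors from the scikit-learn wrapper, with hypothetical X, y data:

```python
import lightgbm as lgb

# "split" counts how often a feature is used in a tree; "gain" sums the
# loss reduction it produced. With L1/L2 regularization active, low-gain
# splits are suppressed, so gain-based importances are usually the more
# faithful view.
model = lgb.LGBMRegressor(reg_alpha=0.5, reg_lambda=0.5,
                          importance_type="gain")
model.fit(X, y)

print(model.feature_importances_)                  # gain-based
print(model.booster_.feature_importance("split"))  # raw split counts
```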
6
votes
1 answer

What is Pruning & Truncation in Decision Trees?

Pruning & Truncation. As per my understanding, truncation means stopping the tree while it is still growing so that it does not end up with leaves containing very few data points. One way to do this is to set a minimum number of training inputs to use on each…
Pluviophile • 4,203
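In LightGBM's case, trees are grown leaf-wise and controlled by pre-pruning (truncation) constraints rather than post-hoc pruning; a sketch of the relevant parameters (values are illustrative):

```python
import lightgbm as lgb

# Each of these stops tree growth early instead of pruning afterwards.
model = lgb.LGBMRegressor(
    min_child_samples=50,  # no leaf may hold fewer than 50 rows
    max_depth=6,           # hard depth cap
    min_split_gain=0.01,   # a split must improve the loss at least this much
    num_leaves=31,         # cap on leaves per tree
)
```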