
When using the Python / sklearn API of xgboost, are the probabilities obtained via the predict_proba method "real probabilities", or do I have to use binary:logitraw and manually apply the sigmoid function?

I wanted to experiment with different cutoff points. Currently, using binary:logistic via the sklearn XGBClassifier, the probabilities returned from the predict_proba method resemble two hard classes rather than a continuous distribution where changing the cut-off point would impact the final scoring.

Is this the right way to obtain probabilities for experimenting with the cutoff value?


Georg Heiler

2 Answers


Curious, Georg, whether you ran across this article in your pursuit of generating probabilities. It is worth noting that binary:logistic and multi:softprob return the predicted probability of each data point belonging to each class.

You can look here to see how predict_proba is implemented: XGBoost Predict_Proba Code
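As a sanity check on the relation between the two objectives: binary:logistic returns probabilities directly, while binary:logitraw returns raw margin scores that map to the same probabilities through the sigmoid. A minimal sketch:

```python
import numpy as np

def sigmoid(margin):
    # Map a raw margin (what binary:logitraw outputs) to a probability,
    # which is what binary:logistic / predict_proba return directly.
    return 1.0 / (1.0 + np.exp(-margin))

# Illustrative raw margins; a margin of 0 corresponds to probability 0.5.
raw = np.array([-2.0, 0.0, 2.0])
probs = sigmoid(raw)
```

So with binary:logistic there is no need to apply the sigmoid yourself.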


The LightGBM issue tracker had the answer ;) https://github.com/Microsoft/LightGBM/issues/272#issuecomment-276168493

  • Apparently, my model fits the data very well, i.e. it is very confident about the class probabilities.
  • I did not expect such clear-cut boundaries and was therefore confused about whether the output could be correct.
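One way to check whether such confident scores are trustworthy is a reliability (calibration) curve; a minimal sketch on hypothetical synthetic data (the arrays here only stand in for real labels and predicted probabilities):

```python
import numpy as np
from sklearn.calibration import calibration_curve

# Illustrative labels and predicted probabilities that cluster
# near 0 and 1, like the clear-cut scores described above.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
y_prob = np.clip(y_true + rng.normal(0.0, 0.05, size=1000), 0.0, 1.0)

# Fraction of positives per bin vs. mean predicted probability per bin;
# a well-calibrated model tracks the diagonal.
frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=5)
```

If the curve stays close to the diagonal, confident probabilities near 0 and 1 are genuinely justified by the data rather than an artifact.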