
When using the Python / sklearn API of xgboost, are the probabilities obtained via the predict_proba method "real probabilities", or do I have to use binary:logitraw and manually apply the sigmoid function?

I wanted to experiment with different cutoff points. Currently, using binary:logistic via the sklearn XGBClassifier, the probabilities returned from the predict_proba method resemble two hard classes rather than a continuous distribution where changing the cut-off point would impact the final scoring.

Is this the right way to obtain probabilities for experimenting with the cutoff value?


Georg Heiler

2 Answers


Curious, Georg, whether you ran across this article in your pursuit of generating probabilities. It is worth noting that binary:logistic and multi:softprob return the predicted probability of each data point belonging to each class.

You can look here to see how predict_proba is implemented: XGBoost Predict_Proba Code
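As a sanity check on the relation between the two objectives: binary:logistic returns probabilities directly, while binary:logitraw returns raw margin scores that map to the same probabilities through the sigmoid. A minimal sketch:

```python
import numpy as np

def sigmoid(margin):
    # Map a raw margin (what binary:logitraw outputs) to a probability,
    # which is what binary:logistic / predict_proba return directly.
    return 1.0 / (1.0 + np.exp(-margin))

# Illustrative raw margins; a margin of 0 corresponds to probability 0.5.
raw = np.array([-2.0, 0.0, 2.0])
probs = sigmoid(raw)
```

So with binary:logistic there is no need to apply the sigmoid yourself.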


The LightGBM issue tracker had the answer ;) https://github.com/Microsoft/LightGBM/issues/272#issuecomment-276168493

  • Apparently, my model fits the data very well, i.e. it is very confident about the class probabilities.
  • I did not expect such clear-cut boundaries and was therefore confused about whether the output could be correct.
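One way to check whether such confident scores are trustworthy is a reliability (calibration) curve; a minimal sketch on hypothetical synthetic data (the arrays here only stand in for real labels and predicted probabilities):

```python
import numpy as np
from sklearn.calibration import calibration_curve

# Illustrative labels and predicted probabilities that cluster
# near 0 and 1, like the clear-cut scores described above.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
y_prob = np.clip(y_true + rng.normal(0.0, 0.05, size=1000), 0.0, 1.0)

# Fraction of positives per bin vs. mean predicted probability per bin;
# a well-calibrated model tracks the diagonal.
frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=5)
```

If the curve stays close to the diagonal, confident probabilities near 0 and 1 are genuinely justified by the data rather than an artifact.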