10

How do GBM algorithms such as XGBoost or LightGBM handle NaN values? I know that they learn how to replace NaN values with other values, but my question is: how exactly do they do it?

Hagbard
user10296606

1 Answer

7

LightGBM ignores missing values while searching for a split threshold, then assigns them to whichever side of the split reduces the loss the most. https://github.com/microsoft/LightGBM/issues/2921
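To make the idea concrete, here is a minimal sketch of a split search that learns a default direction for missing values. It is not LightGBM's actual code (which works on histograms). The function name `best_split_with_missing`, the regularization term `lam`, and the toy data are all assumptions made for illustration. For each candidate threshold, it computes the gain twice, once with the NaN rows added to the left child and once with them added to the right, and keeps the better option.

```python
import math

def best_split_with_missing(x, grad, hess, lam=1.0):
    """Return (gain, threshold, missing_goes_left) for a single feature.

    Hypothetical helper: thresholds are chosen from the non-missing values
    only, and the NaN rows are tried on both sides of each threshold.
    """
    # Non-missing rows, sorted by feature value.
    present = sorted((v, g, h) for v, g, h in zip(x, grad, hess) if not math.isnan(v))
    # Gradient/hessian sums of the rows where the feature is missing.
    g_miss = sum(g for v, g in zip(x, grad) if math.isnan(v))
    h_miss = sum(h for v, h in zip(x, hess) if math.isnan(v))
    g_total, h_total = sum(grad), sum(hess)

    def score(g, h):
        # Standard second-order leaf score used by gradient boosting.
        return g * g / (h + lam)

    best = (-math.inf, None, None)
    g_left = h_left = 0.0
    for i in range(len(present) - 1):
        v, g, h = present[i]
        g_left += g
        h_left += h
        nxt = present[i + 1][0]
        if nxt == v:
            continue  # no threshold between equal values
        threshold = (v + nxt) / 2
        for missing_left in (True, False):
            gl = g_left + (g_miss if missing_left else 0.0)
            hl = h_left + (h_miss if missing_left else 0.0)
            gain = (score(gl, hl)
                    + score(g_total - gl, h_total - hl)
                    - score(g_total, h_total))
            if gain > best[0]:
                best = (gain, threshold, missing_left)
    return best

# Toy data: squared-error loss with an initial prediction of 0,
# so grad = prediction - target and hess = 1.
nan = float("nan")
x = [1.0, 2.0, 3.0, 4.0, nan, nan]
y = [0.0, 0.0, 10.0, 10.0, 10.0, 10.0]
grad = [0.0 - t for t in y]
hess = [1.0] * len(y)
gain, threshold, missing_left = best_split_with_missing(x, grad, hess)
print(threshold, missing_left)  # 2.5 False: NaN rows are sent to the right child
```

In this toy data the NaN rows have the same targets as the high-valued rows, so sending them right gives the larger gain. At prediction time, a row with a missing value simply follows the direction stored with the split.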

There are some parameters you can set to change this: `use_missing=false` disables the special handling of missing values, and `zero_as_missing=true` makes LightGBM treat zeros as missing values too. GitHub
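As a rough illustration, these two options are spelled `use_missing` and `zero_as_missing` in LightGBM's parameter list, and they can be placed in an ordinary parameter dictionary. The dictionary name `params` and the `objective` value are placeholders chosen for this example. The values shown for the two missing-value options are the library defaults.

```python
# Hypothetical parameter dictionary; only the two missing-value options
# are the point here, the rest is a placeholder.
params = {
    "objective": "regression",
    "use_missing": True,       # False turns off special handling of missing values
    "zero_as_missing": False,  # True treats zeros as missing values as well
}

# With LightGBM installed, the dictionary would typically be passed along as
#   lightgbm.train(params, train_set)
# where train_set is a lightgbm.Dataset built from your data.
```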

Noah Weber