I am reading a presentation that recommends against leave-one-out encoding but is fine with one-hot encoding. I thought the two were the same. Can anyone describe the differences between them?
1 Answer
They are probably using "leave one out encoding" to refer to Owen Zhang's strategy.
From here:
The encoded column is not a conventional dummy variable, but instead is the mean response over all rows for this categorical level, excluding the row itself. This gives you the advantage of having a one-column representation of the categorical while avoiding direct response leakage
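
To make the contrast concrete, here is a minimal sketch of both encodings on a toy column. The data, the variable names, and the `leave_one_out` helper are all made up for illustration, and the fallback to the global mean for a level that appears only once is an assumption of this sketch, not something the quoted description specifies.

```python
from collections import defaultdict

# Toy data; the level names and responses are made up for illustration.
levels = ["a", "a", "a", "b", "b", "c"]
y = [1, 0, 1, 1, 0, 1]

# One-hot encoding: one 0/1 indicator column per level; the response y is not used.
categories = sorted(set(levels))
one_hot = [[int(level == c) for c in categories] for level in levels]

# Leave-one-out encoding: a single column holding the mean of y over the
# *other* rows that share the same level.
level_sum = defaultdict(float)
level_count = defaultdict(int)
for level, target in zip(levels, y):
    level_sum[level] += target
    level_count[level] += 1

global_mean = sum(y) / len(y)


def leave_one_out(level, target):
    """Mean response for this level, excluding the current row (hypothetical helper)."""
    n_others = level_count[level] - 1
    if n_others == 0:
        # A level that occurs once has no other rows; falling back to the
        # global mean is an assumption of this sketch.
        return global_mean
    return (level_sum[level] - target) / n_others


loo = [leave_one_out(level, target) for level, target in zip(levels, y)]

print(one_hot)
print(loo)
```

One-hot encoding produces three columns here and never looks at `y`. Leave-one-out encoding produces a single numeric column that does depend on `y`, so two rows with the same level can get different values.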
