How can I introduce bias for a decision tree model while building an ML application?

For example, if I am building a stock trading recommendation algorithm, I would want to recommend a stock only when the model detects a probability of a swing (upturn or downturn). However, for a set of stocks that I have defined as volatile, I would like the model to recommend them only when the probability of a swing is above a certain value. Can I define this as bias? How can I introduce it into a model?

Can I:

  • Introduce a categorical variable that flags a certain stock as volatile and then fit?

or

  • Assign such a stock a categorical value and then fit?

Apologies that I am not able to explain my question better, but essentially I want to introduce bias into a model. What is the correct approach?

PyNoob

1 Answer

In general, it's a bad idea to try to force a model to do something: ML is supposed to be data-driven, so if the data doesn't show the particular pattern you want, then either there's a good reason for that (i.e. the pattern is not as relevant as one thinks it is) or the data is not suitable for the task (for example, it is noisy or incomplete).

You don't give any detail about the current model, so there's no way to know whether introducing a variable will change the model the way you want. It depends on how the ranking is calculated (assuming there's a ranking involved).

Keep in mind that there's no reason to make the model do everything itself, especially if the behaviour you want is not based on the data. It might make sense to do some rule-based pre- or post-processing. In the case you mention, it would be simple to post-process the prediction: if the stock is volatile and the probability is lower than the threshold, then ignore this stock.
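As a rough illustration of that post-processing rule, here is a minimal sketch in plain Python. It assumes the model has already produced a swing probability per stock (for instance from a classifier's predicted probabilities). The function name `filter_recommendations`, the tickers, and both threshold values are hypothetical and chosen only for the example.

```python
def filter_recommendations(swing_probs, volatile,
                           default_threshold=0.5, volatile_threshold=0.8):
    """Keep only the stocks whose swing probability clears their threshold.

    swing_probs: dict mapping ticker -> predicted swing probability
    volatile: set of tickers you have labelled volatile yourself
    Both thresholds are illustrative assumptions, not recommended values.
    """
    recommended = []
    for ticker, prob in swing_probs.items():
        # Volatile stocks must clear a stricter, hand-picked threshold.
        if ticker in volatile:
            threshold = volatile_threshold
        else:
            threshold = default_threshold
        # Ignore the stock when its probability is below the threshold.
        if prob >= threshold:
            recommended.append(ticker)
    return recommended


# Made-up probabilities, standing in for real model output.
probs = {"AAA": 0.62, "BBB": 0.75, "CCC": 0.91}
volatile = {"BBB", "CCC"}
print(filter_recommendations(probs, volatile))  # ['AAA', 'CCC']
```

Here "BBB" is dropped because it is volatile and its probability is under the stricter threshold. The model itself is left untouched, so the rule can be changed without retraining.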

Erwan