Best practice for linear regression: if the training data contains entries that do not need predictions, is it common to remove them?

For example, suppose you are predicting a fare amount, but some fares are flat fees (predetermined, so they never need to be predicted). Is it best practice to remove these rows from the sampled data set before training, or does removing them bias the data?
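To make the setup concrete, here is a minimal sketch with made-up numbers. The trip data, the `is_flat_fee` flag, the flat fee of 52.0, and the `fit_line` helper are all hypothetical and not taken from a real data set. It fits a one-feature least-squares line twice, once on every row and once with the flat-fee rows removed, so the two sets of coefficients can be compared.

```python
# Hypothetical toy data: (distance_km, fare, is_flat_fee). All values are invented.
# Metered fares follow fare = 2.5 + 2.0 * distance; flat-fee fares are fixed at 52.0.
trips = [
    (1.0, 4.5, False),
    (2.0, 6.5, False),
    (3.0, 8.5, False),
    (5.0, 12.5, False),
    (8.0, 18.5, False),
    (20.0, 52.0, True),
    (25.0, 52.0, True),
    (30.0, 52.0, True),
]


def fit_line(points):
    """Ordinary least squares with one feature; returns (intercept, slope)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Option A: train on every row, including the predetermined flat fees.
all_points = [(d, f) for d, f, _ in trips]
intercept_all, slope_all = fit_line(all_points)

# Option B: drop the flat-fee rows first, then train on metered trips only.
metered_points = [(d, f) for d, f, is_flat in trips if not is_flat]
intercept_metered, slope_metered = fit_line(metered_points)

print(f"all rows:     intercept={intercept_all:.3f}, slope={slope_all:.3f}")
print(f"metered only: intercept={intercept_metered:.3f}, slope={slope_metered:.3f}")
```

In this toy case the metered-only fit recovers the metered pricing rule, while the fit on all rows is pulled toward the flat-fee points. A model trained this way would only be meant for trips that are known in advance to be metered.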
This is a shortened version of the question I asked here: "Best practice advice for known target values before training a linear regression model?"