
I am trying to evaluate a regression model (random forests). My understanding is that R^2 (the coefficient of determination) is not a good measure of fit, since my dataset is non-linear. It looks like RMSE is the usual choice, but how do I know what counts as a 'good' value? Furthermore, it seems that RMSE is sensitive to the scale of the data? Unfortunately I don't have a baseline model to compare against.

edit: perhaps I should use learning curves to determine whether the model is underfit or overfit?
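For reference, this is the sort of check I had in mind (a minimal sketch, assuming scikit-learn and a synthetic dataset in place of my real one):

```python
# Sketch of a learning-curve check, assuming scikit-learn; data here is synthetic.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import learning_curve

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

# Negative RMSE on train vs. validation folds as the training set grows.
sizes, train_scores, val_scores = learning_curve(
    RandomForestRegressor(n_estimators=200, random_state=0),
    X, y,
    cv=5,
    scoring="neg_root_mean_squared_error",
    train_sizes=np.linspace(0.1, 1.0, 5),
)

for n, tr, va in zip(sizes, -train_scores.mean(axis=1), -val_scores.mean(axis=1)):
    print(f"n={n:4d}  train RMSE={tr:7.2f}  val RMSE={va:7.2f}")
# A large, persistent gap between the two curves suggests overfitting;
# two high, converged curves suggest underfitting.
```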

Shawn

1 Answer


You didn’t tell us about the use case or business domain for your problem. For example, if you were modeling battery energy consumption in noise canceling headphones, root mean squared error would be a natural loss function for your model; it falls out of the power equations.

Figure out what matters to the business. Write it down. Then pick a loss function that steers the model in the direction the business cares about.


All you have told us about the problem you're solving is that it involves a “non-linear” dataset. It's not obvious that RMSE is a natural measure for your problem: you did not describe nonlinear equations of motion, or give any other relevant description of the situation you're examining.

Often it can be convenient to precondition inputs with a nonlinear transform. For example, if you were looking at impact velocity or crater size in some trebuchet observations, SQRT might be a natural fit. If you’re looking at market cap of firms in a given vertical, or home prices, or salaries, distributions will skew toward large figures, and a LOG transform may prove useful for taming that long tail.
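As a concrete sketch of that preconditioning idea (assuming scikit-learn; the synthetic skewed target and the log1p/expm1 pair are just illustrations), something like this fits a forest on the transformed scale and reports error on the original scale:

```python
# Sketch: fitting on a log-transformed target to tame a long right tail.
# Assumes scikit-learn; the data and transform choice are illustrative only.
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = np.exp(X[:, 0] + 0.5 * rng.normal(size=1000))   # heavily right-skewed target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# log1p compresses the tail during training; expm1 maps predictions back.
model = TransformedTargetRegressor(
    regressor=RandomForestRegressor(n_estimators=200, random_state=0),
    func=np.log1p,
    inverse_func=np.expm1,
)
model.fit(X_tr, y_tr)
rmse = np.sqrt(mean_squared_error(y_te, model.predict(X_te)))
print(f"RMSE on the original scale: {rmse:.3f}")
```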

In the problems I’ve worked on, interpreting root mean square error has never been an issue. Usually it corresponds to heat load dissipated by a resistor, or the size of a power supply in a control system. If the error magnitude suggests we could exceed the rated load of the component, then we look for another model solution, or buy a bigger component.

Clearly we can rank order models by RMSE. But if that’s not interpretable enough for your use case, and there’s not some obvious feature in the input space that suggests a ratio against the error output, then maybe RMSE isn’t appropriate to the problem. Adopting a metric “because everyone else is using it” doesn’t sound like a principled approach.
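If it helps, that rank-ordering is a one-liner per candidate; a minimal sketch, assuming scikit-learn and placeholder models:

```python
# Sketch: rank-ordering candidate models by cross-validated RMSE.
# Assumes scikit-learn; the models and data are placeholders.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=500, n_features=10, noise=15.0, random_state=0)

candidates = {
    "ridge": Ridge(),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="neg_root_mean_squared_error")
    print(f"{name:14s} mean RMSE = {-scores.mean():.2f}")
```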


“predict housing prices .... What would be an appropriate loss function?”

Well now it's pretty obvious what matters to the business. For a large firm it's just profit, or capital at risk. So MAE, possibly with asymmetric skewing so a "loss" surprise weighs more heavily than a windfall "profit" surprise. Sounds like a pretty linear measure.
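One way to encode that asymmetry is a weighted absolute error, where an over-prediction (overpaying, a “loss” surprise) costs more than an under-prediction. A minimal sketch, assuming NumPy; the 3:1 weighting is purely illustrative:

```python
import numpy as np

def asymmetric_mae(y_true, y_pred, loss_weight=3.0, gain_weight=1.0):
    """Mean absolute error where over-predicting (a "loss" surprise) is
    penalized more heavily than under-predicting (a windfall "profit").
    The 3:1 weighting is an illustrative assumption, not a recommendation."""
    err = y_pred - y_true
    return np.mean(np.where(err > 0, loss_weight * err, gain_weight * -err))

y_true = np.array([300_000, 450_000, 520_000])
y_pred = np.array([320_000, 440_000, 500_000])
print(asymmetric_mae(y_true, y_pred))
```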

Predicting the error bars around an estimate might be more valuable than the actual estimate. In the face of large uncertainty, choose not to transact.
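A sketch of one way to get those error bars, assuming scikit-learn's gradient boosting with a quantile loss; the 10th/90th percentile band and the “decline if the band is too wide” cutoff are illustrative assumptions, not recommendations:

```python
# Sketch: predicting an interval instead of a point, then declining to act
# when the interval is too wide. Assumes scikit-learn; thresholds illustrative.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=8, noise=25.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

lo = GradientBoostingRegressor(loss="quantile", alpha=0.10).fit(X_tr, y_tr)
hi = GradientBoostingRegressor(loss="quantile", alpha=0.90).fit(X_tr, y_tr)

band = hi.predict(X_te) - lo.predict(X_te)
too_uncertain = band > np.percentile(band, 75)   # arbitrary cutoff for the sketch
print(f"declining to transact on {too_uncertain.mean():.0%} of cases")
```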

For a small firm, existential risk ("we can't make payroll next month") may be the more interesting measure. It's very non-linear: either we're in business next month or we aren't. So training a model to identify low-variance predicted transactions could be the focus. High recall might not matter if the market is large enough to be choosy, as long as we have fairly high precision on the deals we choose to participate in. A loss function like RMSE can be helpful here, in the sense that it discourages large errors more than MAE would. I can't offer a principled theory for why we should square such dollar errors instead of, say, cubing them. If we're going for "time value of money", then maybe EXP plays nicely with compound interest and with the opportunity cost of alternative investments we didn't make.
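To see the "discourages large errors" point numerically, compare two error vectors with the same MAE but different spread (the numbers are made up purely for illustration):

```python
import numpy as np

steady = np.array([10.0, 10.0, 10.0, 10.0])   # consistent small misses
spiky  = np.array([ 1.0,  1.0,  1.0, 37.0])   # mostly tiny, one big surprise

for name, err in [("steady", steady), ("spiky", spiky)]:
    mae  = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    print(f"{name}: MAE={mae:5.2f}  RMSE={rmse:5.2f}")
# Both have MAE = 10, but the spiky errors get a much larger RMSE,
# which is the behavior you want when a single large miss is existential.
```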

I have worked with models of house price and of an owner's propensity to sell. I can tell you that nailing it within one standard deviation of ± $5k, in the U.S. market, is essentially impossible. There are a lot of things happening in the market, not all of them observable by a model.

J_H