Unexpected behaviour of Scikit-Learn SVR

Question

I'm using Scikit-learn to fit a support vector regression on a really simple dataset of car stopping distances vs car speed.

My code for applying SVR to this dataset is:

# Import libraries
from sklearn.svm import SVR
import numpy as np 
import matplotlib.pyplot as plt
# Training and prediction dataset
X_train = np.array([ 4.,  4.,  7.,  7.,  8.,  9., 10., 10., 10., 11., 11., 12., 12.,
           12., 12., 13., 13., 13., 13., 14., 14., 14., 14., 15., 15., 15.,
           16., 16., 17., 17., 17., 18., 18., 18., 18., 19., 19., 19., 20.,
           20., 20., 20., 20., 22., 23., 24., 24., 24., 24., 25.])[:, None]
y_train = np.array([  2.,  10.,   4.,  22.,  16.,  10.,  18.,  26.,  34.,  17.,  28.,
            14.,  20.,  24.,  28.,  26.,  34.,  34.,  46.,  26.,  36.,  60.,
            80.,  20.,  26.,  54.,  32.,  40.,  32.,  40.,  50.,  42.,  56.,
            76.,  84.,  36.,  46.,  68.,  32.,  48.,  52.,  56.,  64.,  66.,
            54.,  70.,  92.,  93., 120.,  85.])
X_test =  np.linspace(0, 60, 100)[:, None]
# Fit SVR
mysvr = SVR(kernel = "rbf")
mysvr.fit(X_train, y_train)
pred = mysvr.predict(X_test)
# Plot results
fig, ax = plt.subplots()
ax.scatter(X_train[:, 0], y_train, label="data")
ax.plot(X_test, pred, label="SVR", color = 'r')
plt.xlabel("Speed")
plt.ylabel("Distance")
plt.xlim((0, 60))
ax.legend()
ax.grid(True)
plt.show()

The data and the SVR fit are shown below:

From my understanding of how the "rbf" kernel works, the prediction should go to zero if we extrapolate from the training data points. So why does the prediction tend towards distance = 40?

MuhammedYunus StopTheGenocide · Accepted Answer · 2024-06-10T16:09:07.227

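SVR() has an independent intercept term, exposed after fitting as the attribute intercept_. Far from the training data the RBF kernel terms vanish, so the prediction tends to this intercept rather than to zero. Subtracting it from the prediction (i.e. pred - mysvr.intercept_) brings the baseline of the curve to zero.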

Section 1.4 (Support Vector Machines) of the scikit-learn User Guide mentions the intercept in various places, and towards the end clarifies that $b$ is an independent intercept term.
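As a quick check, here is a minimal sketch on a small synthetic dataset (the data and the names X_demo, y_demo and X_far are made up for illustration, they are not the question's data). With an RBF kernel the fitted function has the form $f(x) = \sum_i (\alpha_i - \alpha_i^*) K(x_i, x) + b$, so far away from every training point each kernel value is essentially zero and only $b$ remains:

```python
import numpy as np
from sklearn.svm import SVR

# Hypothetical data, roughly shaped like the speed/distance example
rng = np.random.default_rng(0)
X_demo = np.linspace(4, 25, 50)[:, None]
y_demo = 4.0 * X_demo[:, 0] - 17.0 + rng.normal(scale=10.0, size=50)

svr = SVR(kernel="rbf").fit(X_demo, y_demo)

# A point far outside the training range: all RBF kernel values are ~0 here
X_far = np.array([[1000.0]])
pred_far = svr.predict(X_far)

print("far-field prediction:", pred_far)
print("intercept_:          ", svr.intercept_)
print("difference:          ", pred_far - svr.intercept_)
```

The two printed values should agree, which is the plateau seen in the plot.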

Response to comments.

As far as I've seen, the intercept term is usually there, but might be hidden/implied by $w$ (usually $w_0$ with a new feature $x_0=1$) rather than being explicitly stated as a separate bias term $b$.

To fit a model without an intercept, you can use LinearSVR(fit_intercept=False, ...). This is a linear SVR; to make it RBF-like you would need to put an RBFSampler (or another kernel approximator from sklearn.kernel_approximation) in front of the model, for example in a pipeline.
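A rough sketch of that idea, assuming synthetic data and hand-picked values for gamma, n_components and max_iter (none of these come from the question, and they would need tuning). Note that RBFSampler is a random approximation of the RBF kernel, so the far-field prediction is only approximately zero and the fit will not match SVR(kernel="rbf") exactly:

```python
import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVR

# Hypothetical data, roughly shaped like the speed/distance example
rng = np.random.default_rng(0)
X_demo = np.linspace(4, 25, 50)[:, None]
y_demo = 4.0 * X_demo[:, 0] - 17.0 + rng.normal(scale=10.0, size=50)

# Approximate RBF feature map followed by a linear SVR with no intercept.
# gamma, n_components and max_iter are illustrative guesses, not tuned values.
model = make_pipeline(
    RBFSampler(gamma=0.03, n_components=200, random_state=0),
    LinearSVR(fit_intercept=False, max_iter=10000, random_state=0),
)
model.fit(X_demo, y_demo)

X_grid = np.linspace(0, 60, 100)[:, None]
pred_grid = model.predict(X_grid)

# With fit_intercept=False the linear model has no bias term
print("intercept_:", model[-1].intercept_)
```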

A simpler way might be to centre (de-mean) the data, in particular the target y, and then fit the usual SVR(kernel="rbf") on it.
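A minimal sketch of the centring approach, again on made-up data (X_demo, y_demo and the other names are illustrative). One caveat: SVR still fits its own intercept_ on the centred target, so in centred units the far-field prediction tends to that intercept, which is not guaranteed to be exactly zero:

```python
import numpy as np
from sklearn.svm import SVR

# Hypothetical data, roughly shaped like the speed/distance example
rng = np.random.default_rng(0)
X_demo = np.linspace(4, 25, 50)[:, None]
y_demo = 4.0 * X_demo[:, 0] - 17.0 + rng.normal(scale=10.0, size=50)

# Centre the target, fit on the centred values
y_mean = y_demo.mean()
y_centred = y_demo - y_mean
svr_c = SVR(kernel="rbf").fit(X_demo, y_centred)

# Predict in centred units, then add the mean back for original units
X_grid = np.linspace(0, 60, 100)[:, None]
pred_centred = svr_c.predict(X_grid)
pred_original = pred_centred + y_mean

# Far from the data the centred prediction equals the centred model's intercept
pred_far_c = svr_c.predict(np.array([[1000.0]]))
print("centred intercept_:", svr_c.intercept_)
print("far-field (centred):", pred_far_c)
```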
