I have an OHLCV dataset that starts on 01-01-2000 and ends on 31-12-2003 and I want to evaluate a model, say an SVM regressor.
What is the correct routine to evaluate the model's performance over the period starting 01-01-2003?
These are the steps I performed:
For testday in [01-01-2003, 31-12-2003]:
1. X_train = data[01-01-2000, testday-1]
2. X_test = data[testday]
3. scaler_X = MinMaxScaler().fit(X_train)
4. X_scaled_train = scaler_X.transform(X_train)
5. X_scaled_test = scaler_X.transform(X_test)
6. model = svm.SVR()
7. model.fit(X_scaled_train, y_train)
8. res[testday]['pred'] = model.predict(X_scaled_test)[0]
9. res[testday]['real'] = y_test[0]
Finally, I get the accuracy with:
accuracy_score(res['real'], res['pred'])*100
Is this routine, training on an increasing number of days and testing on the next day, correct?
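For concreteness, here is the routine as a minimal runnable sketch. Everything outside the loop is an assumption made only so the example runs: the data is synthetic, the date range is shortened to keep it fast, the target is taken to be the next day's close, and `walk_forward` is a hypothetical helper name. The sketch reports `mean_absolute_error` as a stand-in metric, because the target here is continuous and `accuracy_score` expects class labels.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_error
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR


def walk_forward(X, y, test_start):
    """Expanding window: for each test day, fit on all strictly earlier days."""
    rows = []
    for day in X.index[X.index >= test_start]:
        train = X.index < day                       # rows before the test day only
        scaler = MinMaxScaler().fit(X.loc[train])   # scaler never sees the test row
        model = SVR().fit(scaler.transform(X.loc[train]), y.loc[train])
        pred = model.predict(scaler.transform(X.loc[[day]]))[0]
        rows.append({"day": day, "real": y.loc[day], "pred": pred})
    return pd.DataFrame(rows).set_index("day")


# Synthetic OHLCV data (assumption: stands in for the real dataset).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2002-11-01", "2003-01-31")
close = 100 + rng.normal(0, 1, len(dates)).cumsum()
X = pd.DataFrame(
    {
        "open": close + rng.normal(0, 0.5, len(dates)),
        "high": close + 1.0,
        "low": close - 1.0,
        "close": close,
        "volume": rng.integers(1_000, 5_000, len(dates)),
    },
    index=dates,
)

# Assumed target: next day's close; the last row has no target and is dropped.
y = X["close"].shift(-1)
X, y = X.iloc[:-1], y.iloc[:-1]

test_start = pd.Timestamp("2003-01-01")
res = walk_forward(X, y, test_start)
print(mean_absolute_error(res["real"], res["pred"]))
```

Refitting the scaler and the model inside the loop is what keeps each prediction free of information from the test day or later; it also means one full SVR fit per test day, which can get slow on the full 2000-2003 range.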