How to predict a certain time span into the future with recurrent neural networks in Keras

Question

I have the following code for time series predictions with RNNs and I would like to know whether for the testing I predict one day in advance:

# -*- coding: utf-8 -*-
"""
Time Series Prediction with  RNN
"""
import pandas as pd
import numpy as np
from tensorflow import keras
#%%  Configure parameters
epochs = 5
batch_size = 50
steps_backwards = int(1* 4 * 24)
steps_forward = int(1* 4 * 24)
split_fraction_trainingData = 0.70
split_fraction_validatinData = 0.90
#%%  "Reading the data"
dataset = pd.read_csv('C:/User1/Desktop/TestValues.csv', sep=';', header=0, low_memory=False, infer_datetime_format=True, parse_dates={'datetime':[0]}, index_col=['datetime'])
df = dataset
data = df.values
indexWithYLabelsInData = 0
data_X = data[:, 0:2]
data_Y = data[:, indexWithYLabelsInData].reshape(-1, 1)
#%%   Prepare the input data for the RNN
series_reshaped_X =  np.array([data_X[i:i + (steps_backwards+steps_forward)].copy() for i in range(len(data) - (steps_backwards+steps_forward))])
series_reshaped_Y =  np.array([data_Y[i:i + (steps_backwards+steps_forward)].copy() for i in range(len(data) - (steps_backwards+steps_forward))])
timeslot_x_train_end = int(len(series_reshaped_X)* split_fraction_trainingData)
timeslot_x_valid_end = int(len(series_reshaped_X)* split_fraction_validatinData)
X_train = series_reshaped_X[:timeslot_x_train_end, :steps_backwards] 
X_valid = series_reshaped_X[timeslot_x_train_end:timeslot_x_valid_end, :steps_backwards] 
X_test = series_reshaped_X[timeslot_x_valid_end:, :steps_backwards]
indexWithYLabelsInSeriesReshapedY = 0
lengthOfTheYData = len(data_Y)-steps_backwards -steps_forward
Y = np.empty((lengthOfTheYData, steps_backwards, steps_forward))

for step_ahead in range(1, steps_forward + 1):

   Y[..., step_ahead - 1] =   series_reshaped_Y[..., step_ahead:step_ahead + steps_backwards, indexWithYLabelsInSeriesReshapedY]
Y_train = Y[:timeslot_x_train_end] 
Y_valid = Y[timeslot_x_train_end:timeslot_x_valid_end] 
Y_test = Y[timeslot_x_valid_end:]
#%%  Build the model and train it
model = keras.models.Sequential([
    keras.layers.SimpleRNN(90, return_sequences=True, input_shape=[None, 2]),
    keras.layers.SimpleRNN(60, return_sequences=True),
    keras.layers.TimeDistributed(keras.layers.Dense(steps_forward))
    #keras.layers.Dense(steps_forward)
])
model.compile(loss="mean_squared_error", optimizer="adam", metrics=['mean_absolute_percentage_error'])
history = model.fit(X_train, Y_train, epochs=epochs, batch_size=batch_size,
                    validation_data=(X_valid, Y_valid))
#%%    #Predict the test data
Y_pred = model.predict(X_test)
prediction_lastValues_list=[]
for i in range (0, len(Y_pred)):
  prediction_lastValues_list.append((Y_pred[i][0][steps_forward-1]))
#%% Create thw dataframe for the whole data
wholeDataFrameWithPrediciton = pd.DataFrame((X_test[:,0]))
wholeDataFrameWithPrediciton.rename(columns = {indexWithYLabelsInData:'actual'}, inplace = True)
wholeDataFrameWithPrediciton.rename(columns = {1:'Feature 1'}, inplace = True)
wholeDataFrameWithPrediciton['predictions'] = prediction_lastValues_list
wholeDataFrameWithPrediciton['difference'] = (wholeDataFrameWithPrediciton['predictions'] - wholeDataFrameWithPrediciton['actual']).abs()
wholeDataFrameWithPrediciton['difference_percentage'] = ((wholeDataFrameWithPrediciton['difference'])/(wholeDataFrameWithPrediciton['actual']))*100

So I define eps_forward = int(1* 4 * 24) which is basically one full day (in 15 minutes resolution which makes 1 * 4 *24 = 96 time stamps). I predict the test data by using Y_pred = model.predict(X_test) and I create a list with the predicted values by using for i in range (0, len(Y_pred)): prediction_lastValues_list.append((Y_pred[i][0][steps_forward-1]))

As for me the input and output data of RNNs is quite confusing I am not sure whether for the test dataset I predict one day in advance meaning 96 time steps into the future. Actually what I want is to read historic data and then predict the next 96 time steps based on the historic 96 time steps. Can anyone of you tell me whether I am doing this by using this code or not?

Here I have a link to some test data that I just created randomly. Do not care about the actual values but just on the structure of the prediction: Download Test Data

Reminder: My bountry is expiring soon and I have not received an answer to my basic question so far. I have uploaded a minimal reproducible example and even some test data. So I'd be quite happy if you could answer my basic question on whether I am forecasting 96 steps in advance with the given code. I'll highly appreciate it. If you need some further information, you can tell me.

score 1 · Answer 1 · answered Oct 22 '21 at 15:29

Usually with NN you would use LSTM layers to deal with time. Time steps can be a little confusing with TF/Keras. However, there is a great tutorial using the Jena data. Maybe this helps: https://blogs.rstudio.com/ai/posts/2017-12-20-time-series-forecasting-with-recurrent-neural-networks/

score 0 · Answer 2 · answered Oct 25 '21 at 20:30

Your code implements timeseries split from scratch. Implementing from scratch has the potential to introduce subtle bugs. Another option would be to use an established package. Examples include scikit-learn's TimeSeriesSplit and Keras' TimeseriesGenerator.

How to predict a certain time span into the future with recurrent neural networks in Keras

2 Answers2