I have a bit of self-taught knowledge of machine learning algorithms (the basic Random Forest and Linear Regression type stuff). I decided to branch out and begin learning RNNs with Keras. Looking at most of the examples, which usually involve stock predictions, I haven't been able to find any basic example of multiple features being used: it's always one column for the date and one for the output. Is there a key fundamental thing I'm missing or something?

If anyone has an example I would greatly appreciate it.

Thanks!

Ethan

1 Answer

Recurrent neural networks (RNNs) are designed to learn sequence data. As you guessed, they can definitely take multiple features as input! Keras's RNNs take 2D inputs of shape (T, F), where T is the number of timesteps and F is the number of features (I'm ignoring the batch dimension here).

However, you don't always need or want the intermediate timesteps t = 1, 2, ..., T - 1. Therefore, Keras flexibly supports both modes. To have it output all T timesteps, pass return_sequences=True to your RNN (e.g., LSTM or GRU) at construction. If you only want the last timestep, t = T, then use return_sequences=False (this is the default if you don't pass return_sequences to the constructor).
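To see the shape difference without training anything, here's a minimal sketch (a quick check I'm adding for illustration; it assumes the same inputs=/outputs= keyword names used in the examples below):

import keras.layers as L
import keras.models as M

inp = L.Input(shape=(2, 3))  # T = 2 timesteps, F = 3 features

# Mode 1: return the whole sequence.
seq_model = M.Model(inputs=inp, outputs=L.LSTM(4, return_sequences=True)(inp))
# Mode 2: return only the last timestep (the default).
last_model = M.Model(inputs=inp, outputs=L.LSTM(4, return_sequences=False)(inp))

print(seq_model.output_shape)   # (None, 2, 4) -- all T timesteps
print(last_model.output_shape)  # (None, 4)    -- last timestep only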

Below are examples of both of these modes.

Example 1: Learning the sequence

Here's a quick example of training an LSTM (a type of RNN) which keeps the entire sequence around. In this example, each input data point has 2 timesteps, each with 3 features; the output data has 2 timesteps (because return_sequences=True), each with 4 features (because that is the size I pass to LSTM).

import keras.layers as L
import keras.models as M

import numpy

# The inputs to the model.
# We will create two data points, just for the example.
data_x = numpy.array([
    # Datapoint 1
    [
        # Input features at timestep 1
        [1, 2, 3],
        # Input features at timestep 2
        [4, 5, 6]
    ],
    # Datapoint 2
    [
        # Features at timestep 1
        [7, 8, 9],
        # Features at timestep 2
        [10, 11, 12]
    ]
])

# The desired model outputs.
# We will create two data points, just for the example.
data_y = numpy.array([
    # Datapoint 1
    [
        # Target features at timestep 1
        [101, 102, 103, 104],
        # Target features at timestep 2
        [105, 106, 107, 108]
    ],
    # Datapoint 2
    [
        # Target features at timestep 1
        [201, 202, 203, 204],
        # Target features at timestep 2
        [205, 206, 207, 208]
    ]
])

# Each input data point has 2 timesteps, each with 3 features.
# So the input shape (excluding batch_size) is (2, 3), which
# matches the shape of each data point in data_x above.
model_input = L.Input(shape=(2, 3))

# This RNN will return timesteps with 4 features each.
# Because return_sequences=True, it will output 2 timesteps, each
# with 4 features. So the output shape (excluding batch size) is
# (2, 4), which matches the shape of each data point in data_y above.
model_output = L.LSTM(4, return_sequences=True)(model_input)

# Create the model.
model = M.Model(inputs=model_input, outputs=model_output)

# You need to pick appropriate loss/optimizers for your problem.
# I'm just using these to make the example compile.
model.compile('sgd', 'mean_squared_error')

# Train
model.fit(data_x, data_y)
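
To sanity-check the output, here's a quick sketch (my addition, not part of the training example): predicting on data_x should give one 4-feature vector per timestep per data point.

# Shape is (batch, timesteps, features) = (2, 2, 4), matching data_y.
print(model.predict(data_x).shape)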

Example 2: Learning the last timestep

If, on the other hand, you want to train an LSTM which only outputs the last timestep in the sequence, then you need to set return_sequences=False (or just remove it from the constructor entirely, since False is the default). And then your output data (data_y in the example above) needs to be rearranged, since you only need to supply the last timestep. So in this second example, each input data point still has 2 timesteps, each with 3 features. The output data, however, is just a single vector for each data point, because we have flattened everything down to a single timestep. Each of these output vectors still has 4 features, though (because that is the size I pass to LSTM).

import keras.layers as L
import keras.models as M

import numpy

# The inputs to the model.
# We will create two data points, just for the example.
data_x = numpy.array([
    # Datapoint 1
    [
        # Input features at timestep 1
        [1, 2, 3],
        # Input features at timestep 2
        [4, 5, 6]
    ],
    # Datapoint 2
    [
        # Features at timestep 1
        [7, 8, 9],
        # Features at timestep 2
        [10, 11, 12]
    ]
])

# The desired model outputs.
# We will create two data points, just for the example.
data_y = numpy.array([
    # Datapoint 1
    # Target features at timestep 2
    [105, 106, 107, 108],
    # Datapoint 2
    # Target features at timestep 2
    [205, 206, 207, 208]
])

# Each input data point has 2 timesteps, each with 3 features.
# So the input shape (excluding batch_size) is (2, 3), which
# matches the shape of each data point in data_x above.
model_input = L.Input(shape=(2, 3))

# This RNN will return timesteps with 4 features each.
# Because return_sequences=False, it will output only the last
# timestep, with 4 features. So the output shape (excluding batch
# size) is (4,), which matches the shape of each data point in
# data_y above.
model_output = L.LSTM(4, return_sequences=False)(model_input)

# Create the model.
model = M.Model(inputs=model_input, outputs=model_output)

# You need to pick appropriate loss/optimizers for your problem.
# I'm just using these to make the example compile.
model.compile('sgd', 'mean_squared_error')

# Train
model.fit(data_x, data_y)
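
Again, a quick sanity check (my addition): with return_sequences=False the time axis is gone, so each prediction is a single 4-feature vector.

# Shape is (batch, features) = (2, 4), matching data_y.
print(model.predict(data_x).shape)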