Multivariate and multi-series LSTM

Question

I am trying to create a pollution prediction LSTM. I've seen an example on the web to cater for a Multivariate LSTM to predict the pollution levels for one city (Beijing), but what about more than one city? I don't really want a separate network for every city, I'd like a single generalised model/network for all x cities. But how do I feed that data into the LSTM?

Say I have the same data for each city, do I...

1) Train on all data for one city, then the next city, and so on until all cities are done.

2) Train data for all cities on date t, then data for all cities on t+1, then t+2 etc.

3) Something completely different.

Any thoughts?

score 2 · Accepted Answer · answered Jun 20 '18 at 08:07

I would say that option 1 will not work out too well: in my experience, the model will end up being good only for the first or the last city you train on, depending on how much freedom you give the algorithm to change its weights as training goes on (e.g. via the learning rate).

You really need to decide what you are going to be predicting. Is it the pollution level for a single city? Which features do you have for each city?

It can make sense to train on all cities at the same time if the features you have are general ones that can really explain the target variable. So if you have temperature, humidity, some transport statistics for that city, etc., then training everything together could make sense.

I would think about each sample leading to one target pollution level, and if that sample has enough information (based on the features) to distinguish itself from samples of the other cities, the model should pick up on and leverage those subtleties in the data.
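To make the "each sample leads to one target" idea concrete, here is a minimal sketch of how the data for several cities could be pooled into one training set. It assumes each city's series is a list of per-timestep feature lists with the pollution level in position 0. The function names, the toy cities and the one-hot city indicator are all hypothetical; the indicator is just one possible way to let samples distinguish themselves. Windows never cross a city boundary, and the pooled samples are shuffled so that no city is seen only at the start or the end of training.

```python
import random


def make_windows(series, n_steps):
    """Slice one city's series into (window, next-step target) pairs.

    Assumption for this sketch: each timestep is a list of features
    and the pollution level is the first entry.
    """
    pairs = []
    for start in range(len(series) - n_steps):
        window = series[start:start + n_steps]
        target = series[start + n_steps][0]
        pairs.append((window, target))
    return pairs


def pooled_dataset(city_series, n_steps, seed=0):
    """Pool the windows of every city and shuffle them together."""
    cities = sorted(city_series)
    samples = []
    for idx, city in enumerate(cities):
        # Hypothetical one-hot city indicator, appended to every timestep.
        one_hot = [1.0 if i == idx else 0.0 for i in range(len(cities))]
        for window, target in make_windows(city_series[city], n_steps):
            tagged = [step + one_hot for step in window]
            samples.append((tagged, target))
    random.Random(seed).shuffle(samples)
    X = [window for window, _ in samples]
    y = [target for _, target in samples]
    return X, y


# Toy data: two made-up cities, features = [pollution, temperature].
toy = {
    "city_a": [[10.0, 1.0], [11.0, 1.5], [12.0, 2.0], [13.0, 2.5], [14.0, 3.0]],
    "city_b": [[50.0, 9.0], [49.0, 8.5], [48.0, 8.0], [47.0, 7.5], [46.0, 7.0]],
}
X, y = pooled_dataset(toy, n_steps=3)
```

The nested lists in `X` follow the (samples, timesteps, features) layout that LSTM layers usually expect, so they can be converted to an array and fed to the network as-is.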

score 1 · Answer 2 · answered Jun 20 '18 at 08:17

I can think of two alternatives:

1) Multiple Inputs / Multiple Outputs model: If the cities are close to each other, the pollution in one of them might affect the pollution in the others. In this case, it makes sense to capture that mutual influence by giving the measurements of each city as a separate input to your LSTM-RNN. You train the network on these time series, and at test time you feed in the pollution of each city at time t and the network predicts the pollution of all cities at t+n (n is an arbitrary horizon; the longer it is, the lower the accuracy of the prediction).

2) Single Input / Single Output model: Another way is to use the training data of every city to train a single-input network, assuming that all the data come from the same source (or at least from similar sources). This network is trained to output the pollution prediction at t+n given the pollution at t, for any city. This only works if your network generalizes well. To help your LSTM-RNN generalize, you should consider adding Dropout during training. See this and this.
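For the first alternative, the data layout could look roughly like the sketch below. It assumes one pollution reading per city per timestep, with all cities aligned on the same timestamps. The function name `joint_windows` and the toy values are hypothetical. Each timestep becomes one row with a column per city, each input window covers `n_steps` rows, and the target is the row `horizon` steps after the last observed one (the n in t+n).

```python
def joint_windows(city_pollution, n_steps, horizon):
    """Build multi-city input windows and multi-city targets.

    Assumption for this sketch: every city has one pollution value
    per timestep and the series are aligned in time.
    """
    cities = sorted(city_pollution)
    length = min(len(city_pollution[c]) for c in cities)
    # One row per timestep, one column per city.
    rows = [[city_pollution[c][t] for c in cities] for t in range(length)]
    X, Y = [], []
    for start in range(length - n_steps - horizon + 1):
        end = start + n_steps
        X.append(rows[start:end])          # observations up to time t
        Y.append(rows[end + horizon - 1])  # all cities at time t + horizon
    return X, Y


# Toy data: two made-up cities with six aligned readings each.
toy = {
    "city_a": [1, 2, 3, 4, 5, 6],
    "city_b": [10, 20, 30, 40, 50, 60],
}
X, Y = joint_windows(toy, n_steps=3, horizon=2)
```

For the second alternative you would instead build windows per city and pool them, so each sample holds one city's history and one target value.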
