I have time series data arriving at 10-second intervals from a passenger counter on a bus: [10,10,10,10,9,9,9,5,5,5,10,10 ...]. I need to estimate the total number of passengers carried in 1 hour. When the count decreases, it means one or more people got off. When it increases, it means new people got on.

2 Answers

First you will need to aggregate your data by the hour so that it is in the right format. It should be in the following format (t, c, x) where t is the hour-timestamp, c is the passenger count for that hour, and x is any other feature you might have that you think can help better estimate/predict the count. x can also be empty, i.e., none.
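As a rough illustration of that aggregation step, here is a minimal sketch using only the Python standard library. It assumes each reading is a (Unix timestamp in seconds, count) pair, and it takes the hourly value c to be the maximum occupancy seen in that hour. Both of these are assumptions made for the example; the function name `to_hourly_rows` and the `aggregate` parameter are hypothetical, and you may well want a different summary for c.

```python
from collections import defaultdict


def to_hourly_rows(readings, aggregate=max):
    """Group (unix_seconds, count) readings into (t, c, x) rows, one per hour.

    t is the start of the hour in Unix seconds, c is aggregate(counts) for
    that hour (max by default, which is an assumption), and x is None
    because no extra feature is available in this sketch.
    """
    buckets = defaultdict(list)
    for ts, count in readings:
        hour_start = ts - ts % 3600  # truncate the timestamp to the hour
        buckets[hour_start].append(count)
    return [(hour, aggregate(counts), None)
            for hour, counts in sorted(buckets.items())]


# Two readings in the first hour, two in the second.
rows = to_hourly_rows([(0, 10), (10, 9), (3600, 5), (3610, 7)])
print(rows)
```

Passing a different callable as `aggregate` (for example `sum` or a custom function) changes what c means without touching the grouping logic.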

Then, you have a myriad of algorithms that you can apply. See this Wikipedia list for an example. You can choose the algorithm based on (1) your expertise and (2) the data's statistical properties.

I have personally used autoregressive moving average models and their many variants, and some deep learning models, like this tutorial here. I've learned that the correct choice of model/algorithm depends on the problem itself, so my suggestion is to try out some of the algorithms presented in these links and see what works for you. If you have a problem applying a particular algorithm, then you can ask a more specific question.

Stefan Popov

Maybe I'm missing something, but it seems to me that, to know the total number of people that have been in a bus during an hour, you just need to start with the initial value of people for that hour and add all the increments (not the decrements) over that hour.

For instance, if during one hour we had the following counter values:

10, 10, 10, 10,  9,  9,  9,  5,  5,  5, 10

We would first compute the successive differences (starting at the first value):

10,  0,  0,  0, -1,  0,  0, -4,  0,  0, +5

And then we would add only the positive values together: 10 + 5 = 15
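The same calculation can be sketched in a few lines of Python. The function name `total_passengers` is hypothetical, and the sketch assumes the readings for the hour are already in a plain list in time order.

```python
def total_passengers(counts):
    """Initial occupancy plus the sum of all positive increments."""
    if not counts:
        return 0
    total = counts[0]  # passengers already on board at the start of the hour
    for prev, curr in zip(counts, counts[1:]):
        diff = curr - prev
        if diff > 0:  # only increases mean new passengers boarded
            total += diff
    return total


readings = [10, 10, 10, 10, 9, 9, 9, 5, 5, 5, 10]
print(total_passengers(readings))  # 15
```

Note that this only sees the net change between two readings. If one person gets off and another gets on within the same 10-second interval, the counter value does not change, so the result is best treated as a lower bound.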

Please clarify if my understanding of the problem is not correct.

noe