I'm looking for references on how to estimate the probability of an event happening when the table contains "temporal data".

My data is basically:

  • hex_id: id of the object
  • date: monthly date (2018-01 to 2022-12)
  • var1: binary variable
  • var2: binary variable
  • event: binary variable (can happen more than once)

So I have about 160 IDs, with 50+ entries for each ID (one row per month).

I would like to understand what I should try for this kind of problem, because I'm a little worried to see that there is almost nothing about predicting categorical data on "time series"/temporal data.

Any reference or answer will be welcome.

1 Answer

You can start by creating new features that you believe might be related to the occurrence of events, such as month or season (day of the week or weekend/workday would only make sense with daily dates, not the monthly ones you have). Once the features exist, fit a classification model such as Random Forest, XGBoost, LightGBM, or KNN, and use its predicted class probabilities as the event probability. Definitely try multiple models. You can also experiment with less straightforward features, such as the time since the last event, the number of previous events on a given object, or whether neighboring IDs have events.
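
To make this more concrete, here is a minimal sketch of the history features and a classifier, assuming pandas and scikit-learn. The data is randomly generated just to have something to run on, the helper name `add_history_features` and the feature column names are made up for illustration, and the 2022 hold-out year and the `-1` fill for missing history are arbitrary choices, not recommendations.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier


def add_history_features(df):
    """Add a calendar feature and per-ID history features that only use past months."""
    df = df.sort_values(["hex_id", "date"]).reset_index(drop=True)
    key = df["hex_id"]
    df["month"] = df["date"].dt.month
    # Number of events on this ID strictly before the current month.
    df["prev_events"] = df["event"].groupby(key).cumsum() - df["event"]
    # Event indicator of the previous month (NaN for the first month of each ID).
    df["event_lag1"] = df["event"].groupby(key).shift(1)
    # Months since the most recent earlier event (NaN until the ID has had one).
    pos = df.groupby("hex_id").cumcount()
    last_pos = pos.where(df["event"] == 1).groupby(key).ffill()
    df["months_since_event"] = (pos - last_pos).groupby(key).shift(1) + 1
    return df


# Synthetic stand-in for the real table: 20 IDs x 60 months (assumption).
rng = np.random.default_rng(0)
dates = pd.date_range("2018-01-01", "2022-12-01", freq="MS")
index = pd.MultiIndex.from_product([range(20), dates], names=["hex_id", "date"])
data = index.to_frame(index=False)
data["var1"] = rng.integers(0, 2, size=len(data))
data["var2"] = rng.integers(0, 2, size=len(data))
data["event"] = (rng.random(len(data)) < 0.2).astype(int)

feats = add_history_features(data)
cols = ["var1", "var2", "month", "prev_events", "event_lag1", "months_since_event"]
X = feats[cols].fillna(-1)  # simple placeholder for "no history yet"
y = feats["event"]

# Split by time, not at random, so the model is always tested on later months.
is_train = feats["date"] < "2022-01-01"
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X[is_train], y[is_train])

# Estimated probability of an event for each ID-month in the hold-out year.
proba = model.predict_proba(X[~is_train])[:, 1]
```

The same `X` and `y` can be passed to other classifiers for comparison. Whichever model you try, keep the split chronological so that no feature or training row leaks information from the future.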