
I have two datasets, both in wide format (one column per week).

Dataset #1: The columns are weeks (202301, 202302, ..., 202352). The index is the item ID. The values are unit sales (counts).

Dataset #2: The columns are weeks (202301, 202302, ..., 202352). The index is the promo ID. The values are binary flags (0 or 1), and the 1 values are sparse.
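To make the layout concrete, here is a minimal sketch of the two tables with toy values, assuming both are pandas DataFrames. The IDs (`item_A`, `promo_X`, and so on), the numbers, and the variable names are hypothetical placeholders, not my real data. The last step shows how one item/promo pair lines up week by week.

```python
import pandas as pd

weeks = [202301, 202302, 202303, 202304]  # toy subset of the 52 weeks

# Dataset #1: item IDs as index, weeks as columns, unit sales as values.
sales = pd.DataFrame(
    [[10, 12, 30, 11], [5, 4, 6, 15]],
    index=["item_A", "item_B"],  # hypothetical item IDs
    columns=weeks,
)

# Dataset #2: promo IDs as index, weeks as columns, sparse 0/1 flags as values.
promo = pd.DataFrame(
    [[0, 0, 1, 0], [0, 0, 0, 1]],
    index=["promo_X", "promo_Y"],  # hypothetical promo IDs
    columns=weeks,
)

# One item/promo pair, aligned on week: one row per week, two columns.
pair = pd.DataFrame({"units": sales.loc["item_A"], "flag": promo.loc["promo_X"]})
pair.index.name = "week"
print(pair)
```

Each of the item/promo combinations would produce one such two-column table with 52 rows.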

For each item, I want to find which promo ID has the most influence on unit sales. In other words, for an influential promo, weeks with flag = 1 should show higher sales than weeks with flag = 0.

I imagine I will have to run a regression on every single item/promo combination (roughly 100 million separate regressions).

I plan on using Poisson regression in Python. Is there a statistical method better suited to my needs? I am currently less concerned with predictive accuracy than with explainability, i.e., which promos most strongly influence which items.
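For reference, this is roughly the single-pair fit I have in mind, written as a minimal sketch in plain NumPy rather than with a statistics library. The function name `fit_poisson`, the iteration count, and the toy data are all hypothetical. It fits log E[units] = b0 + b1 * flag by Newton's method, so exp(b1) is the estimated ratio of sales in promo weeks to sales in non-promo weeks.

```python
import numpy as np

def fit_poisson(units, flag, n_iter=25):
    """Fit log E[units] = b0 + b1 * flag by Newton's method.

    Assumes the flag takes both values 0 and 1 at least once and that
    sales are not all zero; otherwise the system below is singular.
    """
    y = np.asarray(units, dtype=float)
    X = np.column_stack([np.ones_like(y), np.asarray(flag, dtype=float)])
    beta = np.array([np.log(y.mean()), 0.0])  # start at the intercept-only fit
    for _ in range(n_iter):
        mu = np.exp(X @ beta)               # fitted means under current beta
        grad = X.T @ (y - mu)               # score vector
        hess = X.T @ (X * mu[:, None])      # Fisher information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Toy data for one item/promo pair (hypothetical values).
units = np.array([10, 12, 30, 11, 9, 28])
flag = np.array([0, 0, 1, 0, 0, 1])

b0, b1 = fit_poisson(units, flag)
rate_ratio = np.exp(b1)  # multiplicative lift in sales when flag = 1
print(b0, b1, rate_ratio)
```

With a single binary regressor, the fitted means should match the two group means, so `rate_ratio` here is just mean sales in promo weeks divided by mean sales in non-promo weeks. Repeating this fit for every item/promo pair is what leads to the roughly 100 million regressions.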

feonyte
