I have three columns: ['date'], which holds the date; ['id'], which holds product ids; and ['rating'], which holds each product's rating on each date. I want to create a dummy variable ['threshold'] that equals 1 when, within the same value of ['id'], the rating went from anywhere above 5 to anywhere below 6. My code uses a for loop as follows:

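import numpy as np
df['threshold'] = np.zeros(df.shape[0])
# Start at 1: row 0 has no previous row (i-1 would wrap around to the last row).
for i in range(1, df.shape[0]):
    if df.iloc[i]['id'] == df.iloc[i-1]['id'] and df.iloc[i-1]['rating'] > 5 and df.iloc[i]['rating'] < 6:
        # Assign through a single indexer; df.iloc[i]['threshold'] = 1 writes to a copy.
        df.iloc[i, df.columns.get_loc('threshold')] = 1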

Is there a way to perform this without using a for loop?

Olivier
    Please include a small sample of your data along with your desired results. Take a look at [how-to-make-good-reproducible-pandas-examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – Shubham Sharma Jun 21 '20 at 07:27

1 Answer

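Use Series.shift to get the previous row's values, check that the id is unchanged with Series.eq, and convert the resulting boolean mask to integers 0/1 with Series.view. This assumes the rows are sorted by id and then by date: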

df['threshold']= (df['id'].eq(df['id'].shift()) & 
                  df['rating'].shift().gt(5) & 
                  df['rating'].lt(6)).view('i1')
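
For reference, here is a minimal, self-contained sketch of the same idea on made-up data. The sample values are hypothetical (the question did not include any), and the rows are assumed to be sorted by id and then by date. It swaps in two variations that are not part of the snippet above: groupby('id').shift() instead of the explicit id comparison, and astype(int) instead of Series.view, which recent pandas releases deprecate.

```python
import pandas as pd

# Hypothetical sample data, sorted by id and then by date.
df = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03",
                            "2020-01-01", "2020-01-02"]),
    "id": [1, 1, 1, 2, 2],
    "rating": [7, 4, 8, 3, 5],
})

# Previous rating within the same id; NaN on the first row of each id.
prev_rating = df.groupby("id")["rating"].shift()

# NaN > 5 evaluates to False, so the first row of each id is never flagged.
df["threshold"] = (prev_rating.gt(5) & df["rating"].lt(6)).astype(int)

print(df)
```

Only the second row is flagged here (rating 7 followed by 4 within id 1). The first row of id 2 is not flagged even though the row before it has a rating of 8, because that row belongs to a different id.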
jezrael