After searching for similar questions I found this and this question. Unfortunately neither of them works for me.
The first works on all the columns; the second does not work on my column of True and False values and returns an error (I also have not understood it completely).
Here's a description of the problem:
I am working with a dataframe of ~54k rows. Here's an example of 24 values:
+----+---------------------+---------------------+----------------------+--------------------+-------+
| | date | omegasr | omega | omegass | isday |
+----+---------------------+---------------------+----------------------+--------------------+-------+
| 1 | 2012-03-27 00:00:00 | -1.5707963267948966 | -3.32335035194977 | 1.5707963267948966 | False |
| 2 | 2012-03-27 01:00:00 | -1.5707963267948966 | -3.0615509641506207 | 1.5707963267948966 | False |
| 3 | 2012-03-27 02:00:00 | -1.5707963267948966 | -2.799751576351471 | 1.5707963267948966 | False |
| 4 | 2012-03-27 03:00:00 | -1.5707963267948966 | -2.5379521885523215 | 1.5707963267948966 | False |
| 5 | 2012-03-27 04:00:00 | -1.5707963267948966 | -2.2761528007531724 | 1.5707963267948966 | False |
| 6 | 2012-03-27 05:00:00 | -1.5707963267948966 | -2.014353412954023 | 1.5707963267948966 | False |
| 7 | 2012-03-27 06:00:00 | -1.5707963267948966 | -1.7525540251548732 | 1.5707963267948966 | False |
| 8 | 2012-03-27 07:00:00 | -1.5707963267948966 | -1.4907546373557239 | 1.5707963267948966 | True |
| 9 | 2012-03-27 08:00:00 | -1.5707963267948966 | -1.2289552495565745 | 1.5707963267948966 | True |
| 10 | 2012-03-27 09:00:00 | -1.5707963267948966 | -0.9671558617574253 | 1.5707963267948966 | True |
| 11 | 2012-03-27 10:00:00 | -1.5707963267948966 | -0.7053564739582756 | 1.5707963267948966 | True |
| 12 | 2012-03-27 11:00:00 | -1.5707963267948966 | -0.44355708615912615 | 1.5707963267948966 | True |
| 13 | 2012-03-27 12:00:00 | -1.5707963267948966 | -0.1817576983599767 | 1.5707963267948966 | True |
| 14 | 2012-03-27 13:00:00 | -1.5707963267948966 | 0.08004168943917273 | 1.5707963267948966 | True |
| 15 | 2012-03-27 14:00:00 | -1.5707963267948966 | 0.34184107723832213 | 1.5707963267948966 | True |
| 16 | 2012-03-27 15:00:00 | -1.5707963267948966 | 0.6036404650374716 | 1.5707963267948966 | True |
| 17 | 2012-03-27 16:00:00 | -1.5707963267948966 | 0.8654398528366211 | 1.5707963267948966 | True |
| 18 | 2012-03-27 17:00:00 | -1.5707963267948966 | 1.127239240635771 | 1.5707963267948966 | True |
| 19 | 2012-03-27 18:00:00 | -1.5707963267948966 | 1.3890386284349199 | 1.5707963267948966 | True |
| 20 | 2012-03-27 19:00:00 | -1.5707963267948966 | 1.6508380162340692 | 1.5707963267948966 | False |
| 21 | 2012-03-27 20:00:00 | -1.5707963267948966 | 1.9126374040332188 | 1.5707963267948966 | False |
| 22 | 2012-03-27 21:00:00 | -1.5707963267948966 | 2.174436791832368 | 1.5707963267948966 | False |
| 23 | 2012-03-27 22:00:00 | -1.5707963267948966 | 2.4362361796315177 | 1.5707963267948966 | False |
| 24 | 2012-03-27 23:00:00 | -1.5707963267948966 | 2.698035567430667 | 1.5707963267948966 | False |
+----+---------------------+---------------------+----------------------+--------------------+-------+
omega is the solar hour angle in radians. Over one day it spans roughly -pi to +pi between 00:00 and 24:00 (pi/12 per hour, as in the table above). At solar midday its value is 0.
omegass is the hour angle at which the sunset occurs. Due to the symmetry of the sun-earth system, omegasr = -omegass. These values are constant within one day, but change from day to day.
The column isday is the result of a conditional expression: when omegasr < omega < omegass it's day, and further calculations can be made.
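For illustration, the isday condition could be computed along these lines. This is only a minimal sketch: it assumes a pandas DataFrame named df with the column names used in the table, and the three rows (06:00, 07:00 and 19:00) are copied from it.

```python
import pandas as pd

# Three sample rows copied from the table above (06:00, 07:00, 19:00).
df = pd.DataFrame({
    "omegasr": [-1.5707963267948966] * 3,
    "omega": [-1.7525540251548732, -1.4907546373557239, 1.6508380162340692],
    "omegass": [1.5707963267948966] * 3,
})

# It is day when omega lies strictly between sunrise and sunset hour angles.
df["isday"] = (df["omega"] > df["omegasr"]) & (df["omega"] < df["omegass"])
```

On these rows only the 07:00 measure is flagged as day.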
In order to do further calculations I need to associate with each hour the midpoint of the time span that the measure covers. So, for example, the midday measure was recorded at 12:00, but in order to represent all of that hour I want to have the hour angle of 12:30. Therefore I need:
omegam[i] = (omega[i] + omega[i+1]) / 2
where i represents the row index.
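As a rough sketch of this plain midpoint (ignoring sunrise and sunset for now), pairing each row with the following one can be done with shift. The Series name omega and the three values (rows 12 to 14 of the table) are only illustrative.

```python
import pandas as pd

# omega for rows 12, 13 and 14 of the table (11:00, 12:00, 13:00).
omega = pd.Series(
    [-0.44355708615912615, -0.1817576983599767, 0.08004168943917273],
    index=[12, 13, 14],
)

# shift(-1) aligns omega[i+1] with row i; the last row has no successor (NaN).
omegam = (omega + omega.shift(-1)) / 2
```

The last row stays NaN because there is no following measure to average with.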
But here a new problem arises: if the sunrise occurs, let's say, at 6:40 am, then the midpoints for the sunrise and sunset hours have to be calculated like this:
omegam[i] = (omegasr[i] + omega[i+1]) / 2  # sunrise
omegam[i] = (omega[i] + omegass[i+1]) / 2  # sunset
Thus the hour angle for the sunrise row will correspond to 6:50 am. I created the column isday to help perform this task, but unfortunately I can't really use it.
Thank you.
EDIT:
The solution proposed by @Mabel Villaba is not correct, because the new_omega column only has sunrise and sunset values.
A correct new_omega column would be:
new_omega
...
7 #here the mean is between omegasr and omega[8], therefore this new_omega value can't have a correct value, according to the proposed solution.
8 -1.2289552495565745 # = omega[9]
9 omega[10] #
10 omega[11]
...
17 omega[18]
18 omega[19]
19 1.570796 #omegass
...
I hope this is clear enough.
EDIT2:
Thank you again, but the mean values are still calculated incorrectly. I have calculated the correct values manually and post them here:
omegam
...
7 -1.530775
8 -1.359855
9 -1.098058
...
13 -0.05256705
...
19 1.47992
...
EDIT3:
I think the column df['isday'] obtained thanks to the boolean mask might be misleading.
In fact, the sunrise always occurs between two rows, row1 and row2, whose hour angles I will call omega1 and omega2 respectively. The same happens with the sunset, but with omega3 and omega4 (row3 and row4). The correct omegam of row1 is calculated as:
omegam(row1) = (omegasr + omega2)/2
but row1 has a False attribute in the isday column.
For the sunset it's the opposite: occurring between row3 and row4 it is calculated as:
omegam(row3) = (omega3 + omegass)/2
and row3 has a True attribute.
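To make the rule above concrete, here is one way it could be sketched without relying on isday: clip both ends of each hourly interval to [omegasr, omegass] and average what is left. This is an assumption-laden sketch, not a tested solution for the full dataframe. The helper name midpoint_hour_angle is hypothetical, and the sample day is rebuilt from the first omega value of the table assuming a step of pi/12 per hour.

```python
import numpy as np
import pandas as pd

# Rebuild the 24 sample rows: omega starts at the first table value and
# advances by pi/12 per hour (assumption inferred from the table).
df = pd.DataFrame(
    {
        "omegasr": -np.pi / 2,
        "omega": -3.32335035194977 + np.arange(24) * np.pi / 12,
        "omegass": np.pi / 2,
    },
    index=range(1, 25),
)


def midpoint_hour_angle(frame):
    """Hypothetical helper: midpoint of the daylight part of each interval."""
    nxt = frame["omega"].shift(-1)  # omega[i+1]; NaN on the last row
    # Clip both ends of the interval [omega[i], omega[i+1]] to the daylight range.
    lower = frame["omega"].clip(lower=frame["omegasr"], upper=frame["omegass"])
    upper = nxt.clip(lower=frame["omegasr"], upper=frame["omegass"])
    # Keep the midpoint only where some daylight remains in the interval.
    return ((lower + upper) / 2).where(upper > lower)


df["omegam"] = midpoint_hour_angle(df)
```

On this sample day the sketch gives about -1.530775 for row 7, -1.359855 for row 8 and 1.47992 for row 19, in line with the manual values in EDIT2. Intervals that lie entirely in the night, and the last row, come out as NaN.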