I am trying to measure political ideology on Twitter (using rtweet). I now have a data frame containing the user IDs of more than 100 politicians, along with two ideal point scores, 'factor 1' and 'factor 2' (both factors range from 1 to 4). The data frame is called `kandidat` and looks like this:
| Navne | Faktor 1 | Faktor 2 |
|---|---|---|
| "Politician1" | 3.5 | 1.0 |
| "Politician2" | 2.0 | 4.0 |
| Etc... | X | X |
I would then like to detect whether random Twitter users follow one or more of the politicians in my dataset. If a user follows, for example, two of them ("Politician1" and "Politician2"), I want to assign the user the mean of the two politicians' ideal point scores on each factor. A Twitter user following these two politicians would then get factor 1 = (3.5+2.0)/2 = 2.75 and factor 2 = (1.0+4.0)/2 = 2.50.
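To make the averaging step concrete, here is a minimal sketch using only the two example rows from the table above. The toy data frame `kandidat_toy` and the column names `faktor1` and `faktor2` are assumptions made for illustration; the real `kandidat` has many more rows and columns.

```r
# Toy version of kandidat with only the two example rows (values from the table above).
kandidat_toy <- data.frame(
  Navne   = c("Politician1", "Politician2"),
  faktor1 = c(3.5, 2.0),
  faktor2 = c(1.0, 4.0)
)

# A user who follows both politicians gets the column-wise mean of their scores.
user_score <- colMeans(kandidat_toy[, c("faktor1", "faktor2")])
print(user_score)  # faktor1 = 2.75, faktor2 = 2.50
```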
To test this, I wrote a simplified loop over a data frame called `testusers`, which contains only two journalists from Twitter who both follow a large share of the politicians in my dataset. For each journalist, the loop should check whether they follow one or more of the politicians. If they do, it should assign the means described above; if not, the journalist should be removed from the dataset. The loop below runs, but it produces the wrong output (see the table below the code):
```r
### loop ###
for(i in 1:ncol(testusers)){
  #pick politician1 of dataset
  politician1_friends <- get_friends(testusers$Navne[1])
  #intersect with candidate data
  ids_intersect = intersect(politician1_friends$user_id, kandidat$user_id)
  if(length(ids_intersect == 0)){
    testusers[i, "anyFriends"] <- FALSE #user has no friends in the politicians df
  } else {
    #assign values to user based on intersect
    politicians_friends = kandidat[kandidat$user_id %in% ids_intersect,]
    s1_mean <- mean(politicians_friends$faktor1, na.rm=TRUE)
    s2_mean <- mean(politicians_friends$faktor2, na.rm=TRUE)
    testusers[i, "faktor1"] <- s1_mean
    testusers[i, "faktor2"] <- s2_mean
    testusers[i, "anyFriends"] <- TRUE #user has friends in the politicians dataset
  }
  # etc.
}
```
The code above gives me this output:
| Navne | anyFriends |
|---|---|
| "Politician1" | FALSE |
| "Politician2" | NA |
The structure of `testusers` is: `structure(list(Navne = c("Politician1", "Politician2"), anyFriends = c(FALSE, NA)), row.names = 1:2, class = "data.frame")`. I can't post the whole structure of `kandidat` because it is too big, but it is a data frame of politicians with all the information returned by rtweet's `lookup_users()` (user_id, screen_name, text, etc.).
I guess the code needs some minor changes, but I haven't figured them out yet. Ideally, the output data frame should consist of only three columns: 1) user ID/name, 2) Faktor1, 3) Faktor2.
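For reference, here is a hedged sketch of what the loop may be intended to do. It assumes three things about the loop above that could explain the output: it iterates over columns (`ncol`) rather than rows, it always looks up `testusers$Navne[1]` instead of row `i`, and `length(ids_intersect == 0)` takes the length of a logical vector instead of comparing the length to zero. To keep the sketch runnable offline, `get_friends_stub()` is a hypothetical stand-in for rtweet's `get_friends()`, and the user IDs, journalist names, and toy `kandidat` rows are made up.

```r
# Hypothetical stand-in for rtweet::get_friends() so the sketch runs offline.
# It returns a data frame with a user_id column; all IDs here are made up.
get_friends_stub <- function(user) {
  friends <- list(
    Journalist1 = c("11", "22", "99"),  # follows both toy politicians
    Journalist2 = c("98", "99")         # follows none of them
  )
  data.frame(user_id = friends[[user]], stringsAsFactors = FALSE)
}

# Toy candidate data; the real kandidat has many more rows and columns.
kandidat <- data.frame(
  user_id = c("11", "22"),
  faktor1 = c(3.5, 2.0),
  faktor2 = c(1.0, 4.0),
  stringsAsFactors = FALSE
)

testusers <- data.frame(Navne = c("Journalist1", "Journalist2"),
                        stringsAsFactors = FALSE)
testusers$faktor1 <- NA_real_
testusers$faktor2 <- NA_real_

for (i in seq_len(nrow(testusers))) {               # iterate over rows, not columns
  friends <- get_friends_stub(testusers$Navne[i])   # look up user i, not user 1
  ids_intersect <- intersect(friends$user_id, kandidat$user_id)
  if (length(ids_intersect) > 0) {                  # compare the length itself
    followed <- kandidat[kandidat$user_id %in% ids_intersect, ]
    testusers$faktor1[i] <- mean(followed$faktor1, na.rm = TRUE)
    testusers$faktor2[i] <- mean(followed$faktor2, na.rm = TRUE)
  }
}

# Drop users who follow none of the politicians and keep the three wanted columns.
result <- testusers[!is.na(testusers$faktor1), c("Navne", "faktor1", "faktor2")]
print(result)
```

With real data, `get_friends_stub()` would be replaced by the actual friend lookup, and rate limits of the Twitter API would need to be handled separately.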