Background
I've got a dataframe:
d <- data.frame(ID = c("a","a","b","b", "c","c"),
event = c(0,1,0,0,1,1),
event_date = as.Date(c("2011-01-01","2012-08-21","2011-12-23","2011-12-31","2013-03-14","2015-07-12")),
stringsAsFactors=FALSE)
It's got some people (ID), a binary indicator of an event, and an event_date for that event.
The Problem
I want to subset this dataframe to return the rows for those ID's who have never had the event -- in other words, for ID's whose every event cell, no matter how many they have, equals zero. No exceptions.
In other words, I'd like to get something that looks like this:
ID event event_date
b 0 2011-12-23
b 0 2011-12-31
What I've tried
I've got a couple of ways of trying to do this, one in dplyr and the other in data.table (I'm more familiar with dplyr; the latter example is adapted from an answer to an older question I posted). But both get me the same result: the inclusion of ID=b, which is good, but also the inclusion of ID=a, who has one row of event=1 -- which is not good.
Attempt 1:
d_test <- subset(d, ID %in% unique(ID[event == 0])) %>%
arrange(ID, event_date)
ID event event_date
1 a 0 2011-01-01
2 a 1 2012-08-21
3 b 0 2011-12-23
4 b 0 2011-12-31
Attempt 2:
setDT(d)
d_test <- d[ID %in% d[,event==0, by=ID][V1==TRUE, ID]]
ID event event_date
1: a 0 2011-01-01
2: a 1 2012-08-21
3: b 0 2011-12-23
4: b 0 2011-12-31
Bottom line: I can't express NEVER or NONE in my code.