I have a data frame, df, that I read in with readRDS(). It contains many rows with cities and states. I keep only the rows for the state of California and store them as df_ca.
df_ca has 100 columns, and I keep only a few categorical ones in a new data frame called df_cat. I want to loop over the categorical columns and get the frequencies for each with the table() function. Ignoring the loop for troubleshooting, I set var to "city" and call table(), storing the result in a new data frame called cat_freq. However, cat_freq contains every city from df rather than only those in df_ca; the cities outside California appear with a Freq of 0. Why do they show up at all if they were filtered out? I am new to R but have a Python background.
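# Read the saved object and make sure it is a data frame
df <- as.data.frame(readRDS('some.data.5140'))
# Keep only the California rows
df_ca <- df[df$car.state == "ca", ]
# Keep only the categorical columns of interest
cat_col <- c('color', 'city', 'deliver', 'type')
df_cat <- df_ca[, cat_col]
# Frequency table for one column (the loop is left out for troubleshooting)
var <- "city"
cat_freq <- data.frame(table(df_cat[var]))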
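
For reference, here is a minimal sketch that seems to reproduce the behavior with made-up data. It assumes the city column is stored as a factor, which I have not confirmed for the real file (readRDS() restores whatever column types were saved). The toy data frame and its values are hypothetical. As far as I can tell, subsetting rows keeps all of the original factor levels, and droplevels() appears to remove the unused ones.

```r
# Toy data standing in for the real file (hypothetical values).
toy <- data.frame(
  car.state = c("ca", "ca", "ny", "tx"),
  city = factor(c("fresno", "fresno", "albany", "austin")),
  stringsAsFactors = FALSE
)

# Same row filter as in the question.
toy_ca <- toy[toy$car.state == "ca", ]

# The subset still carries every level of the original factor.
levels(toy_ca$city)  # "albany" "austin" "fresno"

# table() counts every level, so the filtered-out cities get a count of 0.
freq_all <- data.frame(table(toy_ca["city"]))

# droplevels() discards levels that no longer occur in the subset.
freq_used <- data.frame(table(droplevels(toy_ca)["city"]))
```

In this sketch freq_all has three rows (two of them with Freq 0), while freq_used has only the row for the one city that is actually present in the subset.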