I have a regression model that I want to use to make predictions based on values I will get from an end user.
In my dataset, I have one categorical variable, region, which I one-hot encoded; this generated 53 new columns (54 regions).
Now my data has the shape 1000x72. I then split into training and testing sets and my model is working fine.
But I'm confused about how my model would predict on new values. Since I will only be getting one value for region from the end user, one-hot encoding that single value will produce an input that no longer fits the shape the model was trained on: it will have the shape 1x18. I'm really confused about how I would feed it into the model this way. Do I just make 53 other columns and put a dummy 0 in each one?
Sorry if this is a trivial question, I'm very much a beginner at this and any help would be greatly appreciated!
from sklearn.preprocessing import OneHotEncoder

region_ohe = OneHotEncoder(categories="auto", handle_unknown="ignore")
X_encoded = region_ohe.fit_transform(df["region"].values.reshape(-1, 1)).toarray()
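To make the shape question concrete, here is a minimal sketch of what I understand the usual approach to be: fit the encoder once on the training data, then call `transform` (not `fit_transform`) on the single value from the user, so the output keeps the column layout learned during fitting. The four region names and the `train_regions`, `new_region` and `unseen_region` variables are made up for illustration; they only stand in for `df['region'].values` and the user's input.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for df['region'].values (4 distinct regions instead of 54).
train_regions = np.array(["north", "south", "east", "west", "south"], dtype=object)

region_ohe = OneHotEncoder(categories="auto", handle_unknown="ignore")
X_encoded = region_ohe.fit_transform(train_regions.reshape(-1, 1)).toarray()

# Prediction time: reuse the fitted encoder, do NOT fit it again.
new_region = np.array([["south"]], dtype=object)  # single value from the end user
new_encoded = region_ohe.transform(new_region).toarray()

# A region never seen during fitting becomes an all-zero row,
# because handle_unknown="ignore" was set.
unseen_region = np.array([["central"]], dtype=object)
unseen_encoded = region_ohe.transform(unseen_region).toarray()

print(X_encoded.shape)       # one column per region seen during fitting
print(new_encoded.shape)     # one row, same number of columns as X_encoded
print(unseen_encoded.sum())  # 0.0
```

With this, the zeros for the other regions are filled in by the encoder itself rather than by hand. The encoded row would still need to be joined with the remaining feature columns in the same order used for training before calling `predict`.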