I’m still learning data science and trying to improve my understanding of statistical tests. Right now, I’m working with a dataset where I have a categorical feature (e.g., “School Type” with values like Public, Private…) and a numeric target (e.g., student scores). However, the numeric target is not normally distributed.
What are the best statistical tests to measure the association between a categorical variable and a non-normally distributed numeric target? I’ve seen tests like one-way ANOVA (which assumes roughly normal residuals within each group) and Kruskal-Wallis (which is non-parametric), but I’m not sure which is the better choice in different scenarios. Are there other tests I should consider?
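For reference, this is roughly how I would run the two tests I mentioned. It is only a minimal sketch on made-up data: the column names `school_type` and `score`, the "Charter" category, and the generated values are placeholders, not my real dataset. I also added an epsilon-squared effect size for Kruskal-Wallis (computed as H / (n - 1)), though I'm not sure it is the right measure to report.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic, right-skewed placeholder data (assumption: NOT my real dataset).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "school_type": np.repeat(["Public", "Private", "Charter"], 50),
    "score": np.concatenate([
        rng.gamma(shape=2.0, scale=10.0, size=50),
        rng.gamma(shape=2.0, scale=14.0, size=50),
        rng.gamma(shape=2.0, scale=11.0, size=50),
    ]),
})

# One array of scores per category.
groups = [g["score"].to_numpy() for _, g in df.groupby("school_type")]

# Kruskal-Wallis H test: rank-based, does not assume normality.
h_stat, p_kw = stats.kruskal(*groups)

# One-way ANOVA on the same groups, for comparison only.
f_stat, p_anova = stats.f_oneway(*groups)

# Epsilon-squared effect size for Kruskal-Wallis: H / (n - 1), between 0 and 1.
n = len(df)
eps_sq = h_stat / (n - 1)

print(f"Kruskal-Wallis: H={h_stat:.3f}, p={p_kw:.4f}, epsilon^2={eps_sq:.3f}")
print(f"One-way ANOVA:  F={f_stat:.3f}, p={p_anova:.4f}")
```

Both tests only tell me whether the groups differ somewhere, not which categories differ or in which direction.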
Once I’ve measured the association, how can I determine how each category affects the target? For example, how do I find out which categories are associated with higher or lower student scores? Should I compare medians, use effect sizes, or apply another method?
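To make the question concrete, here is a rough sketch of what I have in mind: per-category medians compared with the overall median, then pairwise Mann-Whitney U tests with a rank-biserial effect size and a Holm correction. Again, the data, the column names `school_type` and `score`, and the "Charter" category are made up for illustration, and I'm assuming SciPy 1.7 or later, where `mannwhitneyu` returns the U statistic of the first sample.

```python
from itertools import combinations

import numpy as np
import pandas as pd
from scipy import stats

# Synthetic, right-skewed placeholder data (assumption: NOT my real dataset).
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "school_type": np.repeat(["Public", "Private", "Charter"], 50),
    "score": np.concatenate([
        rng.gamma(shape=2.0, scale=10.0, size=50),
        rng.gamma(shape=2.0, scale=14.0, size=50),
        rng.gamma(shape=2.0, scale=11.0, size=50),
    ]),
})

# Median per category and its shift relative to the overall median.
overall_median = df["score"].median()
summary = df.groupby("school_type")["score"].median().to_frame("median")
summary["diff_from_overall"] = summary["median"] - overall_median

# Pairwise Mann-Whitney U tests between categories.
scores = {name: g["score"].to_numpy() for name, g in df.groupby("school_type")}
rows = []
for a, b in combinations(scores, 2):
    u, p = stats.mannwhitneyu(scores[a], scores[b], alternative="two-sided")
    # Rank-biserial correlation in [-1, 1]; positive means `a` tends to
    # score higher than `b` (assumes u is the U statistic of the first sample).
    r_rb = 2.0 * u / (len(scores[a]) * len(scores[b])) - 1.0
    rows.append({"a": a, "b": b, "p_raw": p, "rank_biserial": r_rb})

# Sort by raw p-value so the Holm step-down correction can be applied in order.
pairs = pd.DataFrame(rows).sort_values("p_raw").reset_index(drop=True)
m = len(pairs)
holm = (m - np.arange(m)) * pairs["p_raw"].to_numpy()
pairs["p_holm"] = np.minimum(np.maximum.accumulate(holm), 1.0)

print(summary)
print(pairs)
```

My understanding is that the sign of `diff_from_overall` or `rank_biserial` would indicate direction, and the Holm-adjusted p-values would indicate which pairs differ, but I don't know whether this is the recommended approach.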
I’d really appreciate any insights or recommendations.