I have a dataset of cars with many features, including 'acceleration', 'horsepower', and 'mpg'.
I am supposed to check which of these features is most similar to a normal distribution, so I made a histogram of each feature. Acceleration was definitely the most visually similar.
But I am also supposed to support my answer by using a quantitative measure.
First I tried the skewness measure, but can it indicate which feature is "most normal" if all these features measure different things?
I also considered the Shapiro-Wilk test, where acceleration's p-value came closest to 0.05. But is this really an indication that it is the most similar to a normal distribution?
The following are the measures I got for each feature:
acceleration: Skewness = 0.2788, Shapiro-Wilk Test - Statistic = 0.9924, p-value = 0.0399
horsepower: Skewness = 1.1062, Shapiro-Wilk Test - Statistic = 0.9024, p-value = 0.0000
mpg: Skewness = 0.4571, Shapiro-Wilk Test - Statistic = 0.9680, p-value = 0.0000
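
Measures of this kind can be computed along the following lines. This is only a sketch, assuming Python with NumPy and SciPy (`scipy.stats.skew` and `scipy.stats.shapiro`). The arrays are synthetic placeholders, not the actual car data, and the helper name `normality_summary` is made up for illustration.

```python
import numpy as np
from scipy import stats


def normality_summary(values):
    """Return sample skewness and Shapiro-Wilk results for a 1-D sample."""
    x = np.asarray(values, dtype=float)
    x = x[~np.isnan(x)]  # drop missing values before testing
    w_stat, p_value = stats.shapiro(x)
    return {
        "skewness": float(stats.skew(x)),
        "shapiro_w": float(w_stat),
        "p_value": float(p_value),
    }


# Synthetic placeholder data (NOT the real car dataset).
rng = np.random.default_rng(0)
features = {
    "roughly_normal": rng.normal(loc=15.0, scale=2.5, size=400),
    "right_skewed": rng.exponential(scale=40.0, size=400) + 50.0,
}

summaries = {name: normality_summary(col) for name, col in features.items()}
for name, s in summaries.items():
    print(
        f"{name}: Skewness = {s['skewness']:.4f}, "
        f"Shapiro-Wilk Test - Statistic = {s['shapiro_w']:.4f}, "
        f"p-value = {s['p_value']:.4f}"
    )
```

Both skewness and the Shapiro-Wilk statistic W are unitless, so their values do not change when a feature is rescaled to different units.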
