Questions tagged [feature-scaling]

Feature scaling is a data pre-processing step where the range of variable values is standardized. Standardization of datasets is a common requirement for many machine learning algorithms. Popular feature scaling types include scaling the data to have zero mean and unit variance, and scaling the data between a given minimum and maximum value.
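For concreteness, here is a minimal sketch of the two scaling types named above, assuming scikit-learn and a small made-up feature matrix:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

# Made-up feature matrix: two columns on very different scales.
X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])

# Standardization: each column ends up with zero mean and unit variance.
X_std = StandardScaler().fit_transform(X)

# Min-max scaling: each column is rescaled to the [0, 1] range.
X_minmax = MinMaxScaler().fit_transform(X)

print(X_std.mean(axis=0), X_std.std(axis=0))       # ~[0 0] and [1 1]
print(X_minmax.min(axis=0), X_minmax.max(axis=0))  # [0 0] and [1 1]
```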

264 questions
36 votes • 4 answers

What is a good way to transform Cyclic Ordinal attributes?

I have an 'hour' field as my attribute, but it takes cyclic values. How can I transform the feature to preserve the information that hour '23' and hour '0' are close rather than far apart? One way I can think of is the transformation min(h, 23-h). Input: [0 1…
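One encoding that commonly comes up for questions like this (not necessarily the accepted answer here) maps the cyclic value onto the unit circle with sine and cosine; a minimal sketch, assuming a 24-hour clock and made-up data:

```python
import numpy as np

hours = np.array([0, 1, 6, 12, 18, 23])

# Two features on the unit circle instead of the raw hour.
hour_sin = np.sin(2 * np.pi * hours / 24)
hour_cos = np.cos(2 * np.pi * hours / 24)

# Hours 23 and 0 now map to nearby points, so distance-based
# models treat them as close rather than 23 units apart.
```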
33 votes • 1 answer

Ways to deal with longitude/latitude feature

I am working on a fictional dataset with 25 features. Two of the features are the latitude and longitude of a place, and the others are pH, elevation, windSpeed, etc., with varying ranges. I can perform normalization on the other features, but how do I…
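One option that is often suggested for latitude/longitude (again, not necessarily the accepted answer to this question) is to convert the angles to 3D Cartesian coordinates on a unit sphere, which keeps nearby places numerically close across the ±180° seam; a sketch with made-up coordinates:

```python
import numpy as np

# Made-up sample points (degrees).
lat_deg = np.array([40.7, 51.5, -33.9])
lon_deg = np.array([-74.0, -0.1, 151.2])

lat, lon = np.radians(lat_deg), np.radians(lon_deg)

# Three bounded features in [-1, 1] replacing the two angles.
x = np.cos(lat) * np.cos(lon)
y = np.cos(lat) * np.sin(lon)
z = np.sin(lat)
```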
31 votes • 1 answer

Should one-hot vectors be scaled with numerical attributes?

When I have a combination of categorical and numerical attributes, I usually convert the categorical attributes to one-hot vectors. My question is: do I leave those vectors as they are and scale the numerical attributes through…
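A minimal sketch of one way to do this with scikit-learn's ColumnTransformer, scaling only the numeric columns and one-hot encoding the categorical ones; the column names are hypothetical:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical mixed-type data.
df = pd.DataFrame({
    "color": ["red", "blue", "red"],   # categorical
    "height": [150.0, 180.0, 165.0],   # numeric
})

pre = ColumnTransformer([
    ("onehot", OneHotEncoder(), ["color"]),   # left as 0/1 indicators
    ("scale", StandardScaler(), ["height"]),  # standardized
])

X = pre.fit_transform(df)
```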
30 votes • 3 answers

Why do we convert skewed data into a normal distribution

I was going through a solution to the Housing Prices competition on Kaggle (Human Analog's kernel on House Prices: Advanced Regression Techniques) and came across this part: # Transform the skewed numeric features by taking log(feature + 1). # This…
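The quoted step is essentially np.log1p applied to skewed numeric columns. A minimal sketch under the assumption that, as in many House Prices kernels, columns with skewness above some cutoff (0.75 here, an arbitrary choice) are transformed; the data values are made up:

```python
import numpy as np
import pandas as pd
from scipy.stats import skew

# Made-up values for two numeric columns.
df = pd.DataFrame({"LotArea": [8450, 9600, 11250, 215000],
                   "YearBuilt": [2003, 1976, 2001, 1998]})

numeric_cols = df.dtypes[df.dtypes != "object"].index
skewness = df[numeric_cols].apply(lambda s: skew(s.dropna()))
skewed_cols = skewness[skewness > 0.75].index   # arbitrary cutoff

# log(feature + 1), which tolerates zeros and compresses the right tail.
df[skewed_cols] = np.log1p(df[skewed_cols])
```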
25 votes • 2 answers

Feature Transformation on Input data

I was reading about the solution to this OTTO Kaggle challenge, and the first-place solution seems to use several transforms of the input data X, for example log(X + 1), sqrt(X + 3/8), etc. Is there a general guideline on when to apply which kind…
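Written out, the two transforms quoted in the excerpt look like the sketch below; sqrt(x + 3/8) is related to the Anscombe variance-stabilizing transform for count data (classically 2·sqrt(x + 3/8)), while log(x + 1) compresses heavy right tails. The matrix is made up:

```python
import numpy as np

# Made-up non-negative count-like features.
X = np.array([[0.0, 3.0, 120.0],
              [1.0, 8.0, 4500.0]])

X_log = np.log1p(X)            # log(X + 1)
X_sqrt = np.sqrt(X + 3.0 / 8)  # sqrt(X + 3/8)
```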
18 votes • 3 answers

When should I use StandardScaler and when MinMaxScaler?

I have a feature vector with one-hot-encoded features and with continuous features. How can I decide which data I should scale with StandardScaler and which with MinMaxScaler? I think I do not have to scale the one-hot-encoded features anyway…
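A small sketch contrasting the two scalers on a single continuous feature; the one-hot columns would typically be passed through unchanged (for example via ColumnTransformer). Values are made up:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

age = np.array([[18.0], [35.0], [60.0], [90.0]])

print(StandardScaler().fit_transform(age).ravel())  # centered, unit variance
print(MinMaxScaler().fit_transform(age).ravel())    # squeezed into [0, 1]
```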
17 votes • 4 answers

How to scale an array of signed integers to range from 0 to 1?

I'm using Brain to train a neural network on a feature set that includes both positive and negative values. But Brain requires input values between 0 and 1. What's the best way to normalize my data?
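A minimal sketch of min-max scaling a signed array into [0, 1], which is the usual way to meet this kind of range requirement: x' = (x − min) / (max − min). The data is made up:

```python
import numpy as np

x = np.array([-40.0, -5.0, 0.0, 7.0, 120.0])

# The most negative value maps to 0.0, the largest value maps to 1.0.
x_scaled = (x - x.min()) / (x.max() - x.min())
```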
16 votes • 3 answers

Zero Mean and Unit Variance

I'm studying data scaling, and in particular the standardization method. I understand the math behind it, but it's not clear to me why it's important to give the features zero mean and unit variance. Can you explain?
Qwerto • 705 • 1 • 8 • 15
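For reference, standardization written out by hand: z = (x − mean) / std. The data is made up, and np.std's default ddof=0 matches what StandardScaler uses:

```python
import numpy as np

x = np.array([10.0, 12.0, 14.0, 20.0])
z = (x - x.mean()) / x.std()

print(z.mean(), z.std())  # ~0.0 and 1.0
```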
11 votes • 6 answers

When should I NOT scale features

Feature scaling can be crucial when using distance-, variance- or gradient-based methods (KNN, PCA, neural networks...), because, depending on the case, it can improve the quality of the results or reduce the computational effort. In some cases…
Romain Reboulleau • 1,387 • 9 • 26
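One case where scaling is typically unnecessary is tree-based models, since trees split on thresholds and are invariant to monotonic rescaling of a feature; a small sketch with random made-up data:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X[:, 0] + rng.normal(scale=0.1, size=200)

tree_raw = DecisionTreeRegressor(random_state=0).fit(X, y)
tree_scaled = DecisionTreeRegressor(random_state=0).fit(X * 1000.0, y)

# The predictions should agree up to floating-point effects.
print(np.allclose(tree_raw.predict(X), tree_scaled.predict(X * 1000.0)))
```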
11 votes • 3 answers

Data scaling before or after PCA

I have seen senior data scientists scale the data either before or after applying PCA. Which is the right approach, and why?
Outcast • 1,117 • 3 • 14 • 29
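The common ordering is to standardize first and then apply PCA, so that high-variance features do not dominate the components; a sketch with random made-up data:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Five features with wildly different variances.
X = np.random.default_rng(0).normal(size=(100, 5)) * [1, 10, 100, 1000, 10000]

pca = make_pipeline(StandardScaler(), PCA(n_components=2))
X_2d = pca.fit_transform(X)
```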
11 votes • 2 answers

Linear Regression and scaling of data

The following plot shows the coefficients obtained with linear regression (with mpg as the target variable and all others as predictors) for the mtcars dataset (here and here), both with and without scaling the data. How do I interpret these results? The…
rnso • 1,608 • 3 • 19 • 35
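A sketch of the relationship that usually explains such plots: scaling changes the coefficients but not the fitted model, and each coefficient on standardized data equals the raw coefficient times that feature's standard deviation. The data below is random and made up, not the mtcars dataset:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) * [1.0, 10.0, 100.0]
y = X @ [3.0, 0.5, 0.02] + rng.normal(size=100)

raw = LinearRegression().fit(X, y)
scaled = LinearRegression().fit(StandardScaler().fit_transform(X), y)

print(raw.coef_)                   # depends on each feature's units
print(scaled.coef_)                # comparable across features
print(raw.coef_ * X.std(axis=0))   # ≈ scaled.coef_
```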
11 votes • 2 answers

Consequence of Feature Scaling

I am currently using SVM and scaling my training features to the range of [0,1]. I first fit/transform my training set and then apply the same transformation to my testing set. For example: ### Configure transformation and apply to training set …
mike1886 • 933 • 9 • 17
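The pattern described in the excerpt, in sketch form: fit the scaler on the training set only and reuse the fitted transformation on the test set. The data and shapes are made up:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train, X_test = rng.normal(size=(80, 4)), rng.normal(size=(20, 4))
y_train = rng.integers(0, 2, size=80)

scaler = MinMaxScaler(feature_range=(0, 1))
X_train_s = scaler.fit_transform(X_train)  # learn min/max from training data only
X_test_s = scaler.transform(X_test)        # reuse the same min/max on test data

clf = SVC().fit(X_train_s, y_train)
```

Note that the scaled test values can fall slightly outside [0, 1]; that is expected and usually harmless.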
10 votes • 1 answer

Should I rescale tfidf features?

I have a dataset which contains both text and numeric features. I have encoded the text ones using the TfidfVectorizer from sklearn. I would now like to apply logistic regression to the resulting dataframe. My issue is that the numeric features…
ignoring_gravity • 793 • 4 • 15
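A sketch of one way to combine TF-IDF text features with scaled numeric features before logistic regression, assuming a pandas DataFrame with hypothetical columns "text" and "price":

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical data.
df = pd.DataFrame({"text": ["cheap red shoes", "luxury watch", "red watch"],
                   "price": [10.0, 900.0, 250.0]})
y = [0, 1, 1]

pre = ColumnTransformer([
    ("tfidf", TfidfVectorizer(), "text"),    # a single text column
    ("scale", StandardScaler(), ["price"]),  # numeric columns
])

model = make_pipeline(pre, LogisticRegression()).fit(df, y)
```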
8 votes • 1 answer

How to handle preprocessing (StandardScaler, LabelEncoder) when using data generator to train?

So, I have a dataset that is too big to load into memory all at once. Therefore I want to use a generator to load batches of data to train on. In this scenario, how do I go about performing scaling of the features using LabelEncoder +…
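One piece of that puzzle is that StandardScaler supports partial_fit, so its statistics can be accumulated batch by batch before training starts; the LabelEncoder part is omitted here, and batch_iterator() is a hypothetical stand-in for reading chunks from disk:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

def batch_iterator():
    # Hypothetical stand-in for streaming batches from disk.
    rng = np.random.default_rng(0)
    for _ in range(10):
        yield rng.normal(size=(32, 5))

scaler = StandardScaler()
for batch in batch_iterator():
    scaler.partial_fit(batch)  # accumulate running mean and variance

# Later, inside the training generator, each batch would be transformed:
# yield scaler.transform(batch), labels
```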
8 votes • 6 answers

Do Clustering algorithms need feature scaling in the pre-processing stage?

Is feature scaling useful for clustering algorithms? And which types of features (numeric, categorical, etc.) work best for clustering?
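For distance-based clustering the usual precaution is to standardize numeric features first, so that no single feature dominates the Euclidean distances; a sketch with random made-up data:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# The third feature would dominate the distances without scaling.
X = np.random.default_rng(0).normal(size=(200, 3)) * [1.0, 1.0, 1000.0]

km = make_pipeline(StandardScaler(),
                   KMeans(n_clusters=3, n_init=10, random_state=0))
labels = km.fit_predict(X)
```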
Page 1 of 18