Questions tagged [feature-construction]
77 questions
34 votes, 6 answers
Are there any tools for feature engineering?
Specifically, I am looking for tools with functionality specific to feature engineering. I would like to be able to easily smooth, visualize, fill gaps, etc. Something similar to MS Excel, but that has R as the underlying…
John
29 votes, 3 answers
How to combine categorical and continuous input features for neural network training
Suppose we have two kinds of input features, categorical and continuous. The categorical data may be represented as a one-hot code A, while the continuous data is just a vector B in N-dimensional space. It seems that simply using concat(A, B) is not a…
JunjieChen
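For readers unfamiliar with the setup in this question, here is a minimal sketch of what concat(A, B) looks like in practice. The function name `combine_features` and the sample values are hypothetical, and standardizing B before concatenation is an assumption added for illustration (it keeps the continuous columns on a scale comparable to the 0/1 entries of A); it is not taken from the question or its answers.

```python
import numpy as np

def combine_features(category_ids, num_categories, continuous):
    """Concatenate one-hot codes (A) with standardized continuous features (B)."""
    category_ids = np.asarray(category_ids)
    continuous = np.asarray(continuous, dtype=float)

    # A: one row per sample, a single 1 in the column of its category.
    one_hot = np.eye(num_categories)[category_ids]

    # B: standardize each column (assumption, see lead-in).
    mean = continuous.mean(axis=0)
    std = continuous.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    scaled = (continuous - mean) / std

    # concat(A, B): one input vector per sample.
    return np.concatenate([one_hot, scaled], axis=1)

# Hypothetical data: 3 samples, 3 categories, 2 continuous features.
X = combine_features([0, 2, 1], 3, [[10.0, 200.0], [20.0, 400.0], [30.0, 600.0]])
print(X.shape)
```

Whether plain concatenation is adequate for a given network is exactly what the question asks, so treat this only as the baseline being discussed.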
18 votes, 2 answers
List of feature engineering techniques
Is there any resource with a list of feature engineering techniques? A mapping of type of data, model and feature engineering technique would be a gold mine.
icm
14 votes, 2 answers
What to do when testing data has fewer features than training data?
Let's say we are predicting the sales of a shop and my training data has two sets of features:
One about the store sales with the dates (the field "Store" is not unique)
One about the store types (the field "Store" is unique here)
So the matrix…
alvas
10 votes, 4 answers
Is this a good practice of feature engineering?
I have a practical question about feature engineering... say I want to predict house prices by using logistic regression and used a bunch of features including zip code. Then by checking the feature importance, I realize zip is a pretty good…
user3768495
7 votes, 3 answers
How to deal with categorical feature of very high cardinality?
I would like to train a binary classifier on feature vectors.
One of the features is a categorical string feature: the zip codes of a country.
Typically, there are thousands of zip codes, and in my case they are strings.
How can I convert this…
Rami
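One common way to turn a high-cardinality string feature such as a zip code into a single numeric column is frequency encoding. The sketch below is an illustration only: the helper `frequency_encode` and the sample zip codes are hypothetical, and this technique is one option among several (hashing and target encoding are others), not necessarily what the answers to this question recommend.

```python
import pandas as pd

def frequency_encode(train_values, new_values):
    """Replace each category with its relative frequency in the training data."""
    # Share of training rows carrying each zip code.
    freq = pd.Series(train_values).value_counts(normalize=True)
    # Categories never seen in training map to NaN; encode them as 0.0.
    return pd.Series(new_values).map(freq).fillna(0.0)

# Hypothetical zip codes, kept as strings as in the question.
train_zips = ["10001", "10001", "94105", "60601"]
encoded = frequency_encode(train_zips, ["10001", "94105", "99999"])
print(encoded.tolist())
```

Computing the frequencies on the training split only avoids leaking information from the evaluation data into the feature.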
5 votes, 1 answer
Predictive models with class value belonging to a set of observations
I would like to know whether it's possible to build a predictive model where I could define a set of rows with their attributes, and a class belonging to that set of rows, instead of having the typical model one observation - one class.
What I'm…
Hibai
5 votes, 1 answer
How to transform raw data to fixed-frequency time series?
For example, I have the following raw data in a DataFrame:
                     A    B
2017-01-01 00:01:01  0  100
2017-01-01 00:01:10  1  200
2017-01-01 00:01:16  2  300
2017-01-01…
mahnunchik
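A minimal sketch of one way to do this with pandas, using only the three rows visible in the excerpt above. The 10-second interval, the choice of keeping the last observation in each bin, and the forward fill for empty bins are all assumptions made for illustration; the question's actual target frequency and aggregation are not shown in the excerpt.

```python
import pandas as pd

# The three rows visible in the excerpt; the remaining rows are elided there.
raw = pd.DataFrame(
    {"A": [0, 1, 2], "B": [100, 200, 300]},
    index=pd.to_datetime(
        ["2017-01-01 00:01:01", "2017-01-01 00:01:10", "2017-01-01 00:01:16"]
    ),
)

# Group the irregular timestamps into fixed 10-second bins (assumed interval),
# keep the last observation in each bin, and carry values forward into
# bins that received no observation.
fixed = raw.resample(pd.Timedelta(seconds=10)).last().ffill()
print(fixed)
```

Swapping `.last()` for `.mean()` or `.sum()` changes how several raw rows falling into one bin are combined.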
4 votes, 1 answer
Is there any difference between feature extraction and feature learning?
It appears to me that "feature extraction" and "feature learning" are equivalent concepts; however, there are two separate Wikipedia articles dedicated to them that are notably different. In particular, only in the Feature Learning article Neural…
Tnatsissa H Craeser
4 votes, 2 answers
Approach to creating a user profile in music web application
I am working on a use case, and I'm unsure of the best way to proceed: in order to analyze the behavior of users of a web-based music application, we retain all songs each has played since 2009. We store this information in flat files, each…
user17241
4 votes, 2 answers
Combining Latitude/Longitude position into single feature
I have been playing with two-dimensional machine learning using pandas (trying to do something like this), and I would like to combine Lat/Long into a single numerical feature, ideally in a linear fashion. Is there a "best practice" to do this?
mainstringargs
3 votes, 1 answer
Finding if an outcome is predictable
Suppose we are asked to predict something given a set of features. How do we know if that target is actually predictable? That is, how do we know if there is actually some relation between the dependent and independent features or there are some…
Bharathi A
3 votes, 1 answer
How to handle a feature vector that could be variable length?
I would like to train a machine learning model with several features as input, X[], and one output, Y. For example, every sample has a data frame like this: X[0], X[1], X[2], X[3], X[4], Y
Let's say that for one sample the following data is only one…
Crazy9
3 votes, 2 answers
How to treat the undefined values which make sense?
I'm currently trying to create a few features to improve the performance of a model. One of those features that I would like to create corresponds to the difference in days between a customer's purchase and their previous one. To create this feature is…
qwertzuiop
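The excerpt is cut off, but the feature it describes (days between a purchase and the previous one) is undefined for a customer's first purchase. A minimal sketch of one way to keep that information, assuming pandas: store the gap together with an explicit indicator column. The column names, the sample data, and the sentinel value -1 are hypothetical choices for illustration, not the approach from the question's answers.

```python
import pandas as pd

# Hypothetical purchase log.
purchases = pd.DataFrame({
    "customer": ["a", "a", "b", "a"],
    "date": pd.to_datetime(["2020-01-01", "2020-01-11", "2020-01-05", "2020-01-14"]),
})
purchases = purchases.sort_values(["customer", "date"])

# Days since the same customer's previous purchase; NaN for a first purchase.
gap = purchases.groupby("customer")["date"].diff().dt.days

# Keep the "undefined" case as its own feature instead of hiding it.
purchases["is_first_purchase"] = gap.isna().astype(int)
# Arbitrary sentinel for the undefined gap; the indicator column above
# lets a model tell it apart from a real value.
purchases["days_since_previous"] = gap.fillna(-1)
print(purchases)
```

Tree-based models usually tolerate such a sentinel, while linear models tend to rely more on the indicator column.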
3 votes, 3 answers
How to evaluate feature quality for a decision tree model
Most tutorials assume that the features are known before generating the model and give no way to select 'good' features and discard 'bad' ones.
The naive method is to test the model with new features and see how the new results change…
Bertrand