Highest Voted 'classifier' Questions - Data Science Stack Exchange

18

votes

3 answers

When should I use StandardScaler and when MinMaxScaler?

I have a feature vector with one-hot-encoded features and continuous features. How can I decide which data to scale with StandardScaler and which with MinMaxScaler? I think I do not have to scale the one-hot-encoded features anyway…

asked Jan 14 '19 at 13:58

jochen6677

611
2
5
10
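
For the StandardScaler vs. MinMaxScaler question above, it may help to see what each transformation computes. The following is a minimal sketch in plain Python, not scikit-learn's implementation; the helper names `standard_scale` and `min_max_scale` and the sample columns are hypothetical. It assumes the population standard deviation and a target range of [0, 1], and it leaves the one-hot column untouched because its values are already 0 or 1.

```python
from statistics import mean, pstdev


def standard_scale(values):
    """Center to zero mean and divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Constant column: avoid division by zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def min_max_scale(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical data: one continuous column and one one-hot column.
ages = [20.0, 30.0, 40.0, 50.0]
is_red = [1, 0, 0, 1]  # already in {0, 1}, so it is left unscaled here

ages_standard = standard_scale(ages)  # zero mean, unit variance
ages_min_max = min_max_scale(ages)    # bounded to [0, 1]
```

In scikit-learn, the same per-column choice is usually expressed with `ColumnTransformer`, which applies different transformers to different column subsets.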

16

votes

1 answer

Train Accuracy vs Test Accuracy vs Confusion matrix

After I developed my predictive model using Random Forest I get the following metrics: Train Accuracy :: 0.9764634601043997 Test Accuracy :: 0.7933284397683713 Confusion matrix [[28292 1474] …

python predictive-modeling accuracy confusion-matrix classifier

asked Feb 28 '18 at 21:07

Pedro Alves

367
2
3
11

15

votes

2 answers

How do I get the feature importance for an MLPClassifier?

I use the MLPClassifier from scikit-learn. I have about 20 features. Is there a scikit-learn method to get the feature importance? I found clf.feature_importances_ but it seems that it only exists for decision trees.

neural-network scikit-learn classifier mlp

asked Jan 28 '19 at 13:17

jochen6677

611
2
5
10
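
One model-agnostic option for the MLPClassifier question above is permutation importance: shuffle one feature at a time and measure how much the score drops. The sketch below is a plain-Python illustration under assumptions, not the asker's setup: the `predict` function is a hypothetical stand-in for a fitted model, the data is made up, and accuracy is assumed as the score.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows where the prediction equals the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical stand-in for a fitted model: it only looks at feature 0.
def predict(row):
    return int(row[0] > 0.5)


X = [[0.1, 5.0], [0.9, 3.0], [0.2, 4.0], [0.8, 1.0], [0.3, 2.0], [0.7, 6.0]]
y = [0, 1, 0, 1, 0, 1]

importances = permutation_importance(predict, X, y)
```

Here the unused feature gets an importance of zero, while shuffling feature 0 lowers accuracy. scikit-learn ships a function of the same name in `sklearn.inspection` that works with any fitted estimator.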

6

votes

1 answer

XGBoost skews towards minority class

I have a dataset with 85k positive labels and 53k negative labels. For this use case, I am trying to focus my efforts on the negative class (accurately identify true negatives, and minimize false negatives). Currently, I am able to train a…

machine-learning python xgboost overfitting classifier

asked Sep 21 '20 at 02:55

Nick Bohl

95
4

6

votes

3 answers

Classifier that optimizes performance on only a subset of the data?

I'm working on a machine learning problem where I'm only interested in getting high accuracy within a narrow band of my predicted likelihoods. Specifically, I want an algorithm that will score very accurately when it predicts a likelihood above a…

machine-learning classification loss-function classifier

asked May 03 '18 at 18:18

Mike

61
1

5

votes

3 answers

How to handle "unknown" category in machine learning classification problems?

Tutorial problems come in the form of binary or multi-class classification where the data are all properly labelled. In real-life applications, there are incoming data that do not belong to any category and cannot be classified. How can we handle these…

machine-learning classification class-imbalance classifier

asked Sep 02 '18 at 09:08

user781486

1,455
2
17
20

4

votes

1 answer

Selecting a boundary on a binary classifier for optimal precision and recall

I have a logistic regression classifier that shows differing levels of performance for precision and recall at different probability boundaries as follows: The default threshold for the classifier to decide which class something belongs to is 0.5.…

scikit-learn logistic-regression classifier

asked Jan 14 '21 at 22:13

Sandy Lee

267
2
9
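
For the threshold question above, one way to compare probability boundaries is to compute precision and recall at each candidate and pick the one that maximizes a chosen criterion. The sketch below uses F1 as that criterion, which is an assumption; the question does not say which trade-off is wanted. The labels, probabilities, and helper names are hypothetical.

```python
def precision_recall_at(threshold, y_true, y_prob):
    """Precision and recall when predicting positive for prob >= threshold."""
    tp = fp = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
    # Convention assumed here: precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def best_f1_threshold(y_true, y_prob, thresholds):
    """Return the candidate threshold with the highest F1 score."""
    def f1(threshold):
        p, r = precision_recall_at(threshold, y_true, y_prob)
        return 2 * p * r / (p + r) if p + r else 0.0
    return max(thresholds, key=f1)


# Hypothetical labels and predicted probabilities of the positive class.
y_true = [0, 0, 1, 0, 1, 1]
y_prob = [0.10, 0.35, 0.40, 0.55, 0.70, 0.90]
thresholds = [0.3, 0.4, 0.5, 0.6]

best = best_f1_threshold(y_true, y_prob, thresholds)
```

On this toy data the default 0.5 boundary is not the F1-optimal one. With scikit-learn, `sklearn.metrics.precision_recall_curve` returns precision and recall for every distinct score, which avoids choosing the candidate grid by hand.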

4

votes

2 answers

What is a discrimination threshold of a binary classifier?

With respect to ROC can anyone please tell me what the phrase "discrimination threshold of binary classifier system" means? I know what a binary classifier is.

classification graphs classifier roc

asked Feb 03 '15 at 08:52

girl101

1,161
2
11
26

4

votes

1 answer

Distinguishing features of linear vs. non-linear machine learning models (algorithms)

What are some examples of linear and non-linear machine learning models (algorithms) for purposes of comparison between the two categories? Which are the parameters (or scalars in a linear algebraic sense) and which are the predictors/factors (or…

machine-learning deep-learning classifier

asked Apr 15 '18 at 12:36

Ashish Vankudre

49
1
2

4

votes

1 answer

One hot encoding of target space

I had a face-to-face interview for a data scientist job a few days ago. One of the questions I was asked was: in the case of a classifier predicting the brand of TV from some features (price, size, specs, ...) out of 4 possible brands, how do you…

data-cleaning preprocessing classifier

asked Jan 12 '18 at 19:04

Learning is a mess

646
1
8
16

4

votes

1 answer

How should I calculate AUROC if my (TPR, FPR) curve doesn't reach (1,1)? Should it be just the area under the curve, or should I include 1 and calculate?

I am running a model that generates song detections with a confidence value. I then validate it against an annotated dataset. I then plot the values of TPR and FPR at each confidence threshold, from 0 to 1 in steps of 0.01. This…

data-science-model classifier roc

asked Jan 26 '25 at 18:22

Aditya_Panigrahy

41
1
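
The two options in the AUROC question above can be compared numerically with the trapezoidal rule. This is only an illustrative sketch: the (FPR, TPR) points are made up, and closing the curve with a straight segment to (1, 1) is an assumption about how the missing part might be filled, not a statement about which option is appropriate for the asker's detector.

```python
def trapezoid_area(points):
    """Area under a piecewise-linear curve given (x, y) points."""
    pts = sorted(points)
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(pts, pts[1:])
    )


# Hypothetical (FPR, TPR) pairs; the measured curve stops at FPR = 0.5.
roc = [(0.0, 0.0), (0.1, 0.5), (0.3, 0.8), (0.5, 0.9)]

# Option 1: area under the measured part of the curve only.
partial_area = trapezoid_area(roc)

# Option 2: append (1, 1) and integrate over the full FPR range.
extended_area = trapezoid_area(roc + [(1.0, 1.0)])
```

The partial area is not on the usual 0 to 1 AUROC scale, because it covers only part of the FPR axis, so the two numbers are not directly comparable.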

3

votes

1 answer

What do these points mean in Naive Bayes?

I have two conceptual questions about Naïve Bayes. Naïve Bayes is robust to irrelevant features. What does this mean? Can anyone give an example of how irrelevant features cancel out, and of what the irrelevant features are? It is…

machine-learning naive-bayes-classifier classifier

asked Feb 01 '20 at 01:18

akshit bhatia

127
5

3

votes

3 answers

Image Classification on non-real images

I was wondering how image classifier networks perform on images that are not photographs. For example, if you were to feed a drawing of a car or a face to an image classifier that was only trained on photos would the network still be able to…

image-classification image-recognition image classifier

asked Mar 06 '19 at 02:45

dadrake

51
4

2

votes

1 answer

Question about reshaping array size for KNN Classifiers

I keep trying to run a new set of data through my KNN Classifier but receive the message: ValueError: query data dimension must match training data dimension. It then used: x_new = pd.read_csv('NewFeaturePractice.csv', names=attributes) …

python k-nn classifier

asked Jun 20 '20 at 05:06

LeeAnn Capistran

21
1
2

2

votes

1 answer

Approach to text mining and preparing tokens, irrelevant words, low accuracy

For a fairly large project, I am doing text mining on some documents. My steps are quite common: converting everything to lower case, tokenization, applying a stop list and stop words, lemmatization, stemming, and some other steps like removing symbols. Then I prepare bag of…

classification text-mining naive-bayes-classifier classifier

asked Nov 23 '19 at 10:40

heisenberg7584

131
3

Questions tagged [classifier]