Most Popular

1500 questions
71 votes, 2 answers

Are Support Vector Machines still considered "state of the art" in their niche?

This question is in response to a comment I saw on another question. The comment was about the Machine Learning course syllabus on Coursera, and was along the lines of "SVMs are not used so much nowadays". I have only just finished the relevant…
Neil Slater
  • 29,388
  • 5
  • 82
  • 101
71 votes, 4 answers

What is the use of torch.no_grad in pytorch?

I am new to PyTorch and started with this GitHub code. I do not understand the comment on lines 60-61 of the code: "because weights have requires_grad=True, but we don't need to track this in autograd". I understood that we mention requires_grad=True…
mausamsion
  • 1,312
  • 1
  • 10
  • 14
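
For readers skimming this one, a minimal sketch of what torch.no_grad() does: operations run inside the block are not recorded by autograd, which is why manual weight updates are typically wrapped in it. The tensor names below are illustrative, not taken from the linked repository.

```python
import torch

w = torch.randn(3, requires_grad=True)  # a leaf tensor that autograd tracks

# Inside torch.no_grad(), operations are not recorded in the autograd graph,
# so the result carries no grad_fn and has requires_grad=False.
with torch.no_grad():
    y = w * 2
print(y.requires_grad)  # False

z = w * 2               # outside the block the same op is tracked again
print(z.requires_grad)  # True

# Typical use: a manual weight update that should not become part of the graph.
with torch.no_grad():
    w -= 0.1 * torch.ones_like(w)  # stand-in for `lr * w.grad` in a real loop
```
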
71 votes, 5 answers

Adding Features To Time Series Model LSTM

I have been reading up a bit on LSTMs and their use for time series, and it's been interesting but difficult at the same time. One thing I have had difficulty understanding is the approach to adding additional features to what is already a list…
Rjay155
  • 1,235
  • 2
  • 12
  • 9
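
The usual answer to this kind of question is that extra features become additional entries in the last axis of the (samples, timesteps, features) input tensor; only the size of that axis changes, the rest of the model stays the same. A minimal, hedged tf.keras sketch with made-up toy data:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Toy data: 100 sequences, 10 time steps, 3 features per step
# (e.g. the series itself plus two additional features).
X = np.random.rand(100, 10, 3)
y = np.random.rand(100, 1)

model = Sequential([
    LSTM(32, input_shape=(10, 3)),  # timesteps=10, features=3
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=16, verbose=0)
```
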
71 votes, 5 answers

Why is mini-batch size better than one single "batch" with all training data?

I often read that, in the case of deep learning models, the usual practice is to apply mini-batches (generally small ones, 32/64) over several training epochs. I cannot really fathom the reason behind this. Unless I'm mistaken, the batch size is the…
Hendrik
  • 8,767
  • 17
  • 43
  • 55
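
For context, the practice the question refers to looks roughly like the sketch below: instead of one gradient step on the full dataset, the data is shuffled each epoch and the weights are updated one small slice at a time. A toy linear-regression example with illustrative names:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((1000, 20))          # toy dataset
y = X @ rng.random(20) + 0.1        # toy linear target
w = np.zeros(20)                    # model weights
batch_size, lr = 32, 0.01

for epoch in range(5):
    # Shuffle once per epoch, then walk through the data in mini-batches.
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        X_b, y_b = X[idx], y[idx]
        # Gradient of the mean squared error on this mini-batch only.
        grad = 2 * X_b.T @ (X_b @ w - y_b) / len(idx)
        w -= lr * grad
```
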
70 votes, 2 answers

Sparse_categorical_crossentropy vs categorical_crossentropy (keras, accuracy)

Which is better for accuracy or are they the same? Of course, if you use categorical_crossentropy you use one hot encoding, and if you use sparse_categorical_crossentropy you encode as normal integers. Additionally, when is one better than the…
Master M
  • 803
  • 1
  • 7
  • 5
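
A minimal sketch of the difference the excerpt describes, assuming tf.keras: both losses compute the same cross-entropy, they just expect different label formats (integer class indices vs one-hot vectors).

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical

X = np.random.rand(200, 8)
y_int = np.random.randint(0, 3, size=200)        # integer labels: 0, 1, 2
y_onehot = to_categorical(y_int, num_classes=3)  # one-hot labels

def build():
    return Sequential([Dense(16, activation="relu", input_shape=(8,)),
                       Dense(3, activation="softmax")])

# Integer labels -> sparse_categorical_crossentropy
m1 = build()
m1.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
m1.fit(X, y_int, epochs=2, verbose=0)

# One-hot labels -> categorical_crossentropy
m2 = build()
m2.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
m2.fit(X, y_onehot, epochs=2, verbose=0)
```
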
70 votes, 11 answers

Why should the data be shuffled for machine learning tasks?

In machine learning tasks it is common to shuffle the data and normalize it. The purpose of normalization is clear (so that features have the same range of values). But, after struggling a lot, I did not find any valuable reason for shuffling the data. I have read…
Green Falcon
  • 14,308
  • 10
  • 59
  • 98
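
A small illustration of why shuffling matters: if the raw data is ordered (for example by class), an unshuffled split or the first mini-batches only ever see part of the distribution. Toy data, scikit-learn's shuffle helper assumed:

```python
import numpy as np
from sklearn.utils import shuffle

# Toy data that is sorted by class, as many raw datasets are.
X = np.arange(10).reshape(-1, 1)
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

# Without shuffling, a naive split or the first mini-batches
# would contain only class 0.
X_shuf, y_shuf = shuffle(X, y, random_state=42)
print(y_shuf)  # classes are now interleaved
```
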
69 votes, 5 answers

In a softmax classifier, why use the exp function to do normalization?

Why use softmax as opposed to standard normalization? In the comments on the top answer to this question, @Kilian Batzner raised 2 questions which also confuse me a lot. It seems no one gives an explanation apart from the numerical benefits. I get the…
Hans
  • 793
  • 1
  • 6
  • 5
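
For reference, a quick numpy comparison of softmax against plain division by the sum ("standard normalization"): the exp sharpens differences between scores, handles negative scores, and is invariant to shifting all scores by a constant, which plain division is not.

```python
import numpy as np

def softmax(z):
    z = z - z.max()               # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def standard_norm(z):
    return z / z.sum()

scores = np.array([1.0, 2.0, 3.0])
print(standard_norm(scores))      # ~[0.167 0.333 0.500]
print(softmax(scores))            # ~[0.090 0.245 0.665] - exp sharpens differences

# Softmax gives the same answer if every score is shifted by a constant,
# and still works for negative scores, where plain division breaks down.
print(softmax(scores + 10.0))     # same output as softmax(scores)
```
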
67 votes, 9 answers

Clustering geolocation coordinates (lat, long pairs)

What is the right approach and clustering algorithm for geolocation clustering? I'm using the following code to cluster geolocation coordinates: import numpy as np import matplotlib.pyplot as plt from scipy.cluster.vq import kmeans2,…
rokpoto.com
  • 813
  • 1
  • 7
  • 6
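
The excerpt runs kmeans2 on raw lat/long values; a commonly suggested alternative for geographic points is DBSCAN with the haversine metric, which clusters by great-circle distance instead of Euclidean distance in degrees. A hedged sketch with made-up coordinates:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy latitude/longitude pairs in degrees (illustrative values only).
coords = np.array([
    [52.52, 13.40], [52.51, 13.41], [52.53, 13.39],   # around Berlin
    [48.85, 2.35],  [48.86, 2.34],                     # around Paris
])

kms_per_radian = 6371.0088
eps_km = 5.0  # points within ~5 km of each other join the same cluster

db = DBSCAN(eps=eps_km / kms_per_radian, min_samples=2,
            algorithm="ball_tree", metric="haversine").fit(np.radians(coords))
print(db.labels_)  # e.g. [0 0 0 1 1]
```
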
66 votes, 5 answers

How to get accuracy, F1, precision and recall for a Keras model?

I want to compute the precision, recall and F1-score for my binary KerasClassifier model, but can't find any solution. Here's my actual code: # Split dataset in train and test data X_train, X_test, Y_train, Y_test = train_test_split(normalized_X,…
ZelelB
  • 1,067
  • 2
  • 11
  • 15
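
One common route, sketched below with toy data standing in for the question's split: get probabilities from the fitted Keras model, threshold them, and hand the result to scikit-learn's metric functions.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Toy binary-classification data in place of the question's train/test split.
X_train, X_test = np.random.rand(200, 5), np.random.rand(50, 5)
Y_train = np.random.randint(0, 2, 200)
Y_test = np.random.randint(0, 2, 50)

model = Sequential([Dense(8, activation="relu", input_shape=(5,)),
                    Dense(1, activation="sigmoid")])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X_train, Y_train, epochs=2, verbose=0)

# Predict probabilities, threshold at 0.5, then use scikit-learn's metrics.
y_pred = (model.predict(X_test).ravel() > 0.5).astype(int)
print("accuracy :", accuracy_score(Y_test, y_pred))
print("precision:", precision_score(Y_test, y_pred, zero_division=0))
print("recall   :", recall_score(Y_test, y_pred, zero_division=0))
print("F1       :", f1_score(Y_test, y_pred, zero_division=0))
```
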
66 votes, 4 answers

Does batch_size in Keras have any effect on the results' quality?

I am about to train a big LSTM network with 2-3 million articles and am struggling with Memory Errors (I use AWS EC2 g2x2large). I found out that one solution is to reduce the batch_size. However, I am not sure if this parameter is only related to…
hipoglucido
  • 1,200
  • 1
  • 10
  • 19
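
batch_size here is simply the argument to model.fit(): it sets how many samples are processed per gradient step, so a smaller value reduces memory per step while making the updates more frequent and noisier. A toy tf.keras sketch with illustrative shapes:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Toy sequence data standing in for the articles in the question.
X = np.random.rand(500, 20, 8)   # (samples, timesteps, features)
y = np.random.randint(0, 2, 500)

model = Sequential([LSTM(16, input_shape=(20, 8)),
                    Dense(1, activation="sigmoid")])
model.compile(optimizer="adam", loss="binary_crossentropy")

# batch_size controls how many samples are loaded and back-propagated per step:
# a smaller value needs less memory per step, at the cost of more updates.
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```
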
65 votes, 6 answers

When is a Model Underfitted?

Logic often states that by underfitting a model, its capacity to generalize is increased. That said, clearly at some point underfitting causes a model to become worse regardless of the complexity of the data. How do you know when your model has…
blunders
  • 1,932
  • 2
  • 15
  • 19
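
One standard diagnostic is a learning curve: an underfit (high-bias) model scores low on both the training and the validation data, with the two curves close together. A hedged scikit-learn sketch using a deliberately over-regularised model as the underfitting candidate:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# A deliberately weak (strongly regularised) classifier.
sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(C=1e-4, max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

# If both curves plateau at a low score and stay close together,
# the model is likely underfitting (high bias).
print(train_scores.mean(axis=1))
print(val_scores.mean(axis=1))
```
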
63 votes, 9 answers

Tools and protocol for reproducible data science using Python

I am working on a data science project using Python. The project has several stages. Each stage consists of taking a data set, using Python scripts, auxiliary data, configuration and parameters, and creating another data set. I store the code in…
Yuval F
  • 761
  • 1
  • 6
  • 7
63 votes, 11 answers

How to deal with version control of large amounts of (binary) data

I am a PhD student of Geophysics and work with large amounts of image data (hundreds of GB, tens of thousands of files). I know svn and git fairly well and have come to value a project history, combined with the ability to easily work together and have…
Johann
  • 741
  • 1
  • 5
  • 5
63 votes, 4 answers

Difference between OrdinalEncoder and LabelEncoder

I was going through the official documentation of scikit-learn after going through a book on ML and came across the following: the documentation describes sklearn.preprocessing.OrdinalEncoder(), whereas in the book it was given…
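
The practical difference, in a small sketch: OrdinalEncoder is written for 2-D feature matrices (it encodes each column separately), while LabelEncoder is meant for a single 1-D target array.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder, LabelEncoder

# OrdinalEncoder works on a 2-D array of features (one column per feature).
X = np.array([["red", "S"], ["green", "M"], ["blue", "L"]])
print(OrdinalEncoder().fit_transform(X))
# [[2. 2.] [1. 1.] [0. 0.]]  -- categories sorted per column

# LabelEncoder works on a single 1-D array, intended for the target y.
y = np.array(["cat", "dog", "cat", "bird"])
print(LabelEncoder().fit_transform(y))  # [1 2 1 0]
```
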
63 votes, 5 answers

Is it always better to use the whole dataset to train the final model?

A common technique after training, validating and testing the machine learning model of preference is to use the complete dataset, including the testing subset, to train a final model to deploy, e.g. in a product. My question is: Is it always…
pcko1
  • 4,030
  • 2
  • 17
  • 30
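
The workflow the question describes, sketched with scikit-learn: estimate performance with cross-validation, then refit the same configuration on all of the data for deployment, keeping the cross-validation score as the quoted performance figure.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=100, random_state=0)

# 1. Estimate generalisation performance on held-out folds.
scores = cross_val_score(model, X, y, cv=5)
print("CV accuracy:", scores.mean())

# 2. Refit the chosen configuration on the complete dataset for deployment;
#    the CV estimate above remains the reported performance.
final_model = model.fit(X, y)
```
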