Most Popular
1500 questions
295
votes
8 answers
Micro Average vs Macro average Performance in a Multiclass classification setting
I am trying out a multiclass classification setting with 3 classes. The class distribution is skewed with most of the data falling in 1 of the 3 classes. (class labels being 1,2,3, with 67.28% of the data falling in class label 1, 11.99% data in…
SHASHANK GUPTA
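To make the micro/macro distinction in this question concrete, here is a minimal sketch in plain Python. The helper name `micro_macro_recall` and the toy labels are made up for illustration and are not taken from the question. Macro averaging takes the mean of per-class scores, while micro averaging pools the counts over all samples first, so the majority class dominates it.

```python
from collections import Counter


def micro_macro_recall(y_true, y_pred):
    """Return (micro, macro) recall for single-label multiclass predictions."""
    support = Counter(y_true)  # number of true samples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class

    # Macro: average the per-class recalls, so every class counts equally.
    per_class = [hits[c] / support[c] for c in sorted(support)]
    macro = sum(per_class) / len(per_class)

    # Micro: pool the counts first, so every sample counts equally.
    micro = sum(hits.values()) / len(y_true)
    return micro, macro


# Hypothetical skewed data: class 1 is the majority class.
y_true = [1, 1, 1, 1, 1, 1, 2, 2, 3, 3]
y_pred = [1, 1, 1, 1, 1, 1, 1, 2, 1, 3]
micro, macro = micro_macro_recall(y_true, y_pred)
print(f"micro recall = {micro:.3f}, macro recall = {macro:.3f}")
# prints: micro recall = 0.800, macro recall = 0.667
```

In the single-label case micro recall equals plain accuracy, which is why it looks better than the macro figure when the minority classes are predicted poorly.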
286
votes
12 answers
What are deconvolutional layers?
I recently read Fully Convolutional Networks for Semantic Segmentation by Jonathan Long, Evan Shelhamer, Trevor Darrell. I don't understand what "deconvolutional layers" do / how they work.
The relevant part is
3.3. Upsampling is backwards strided…
Martin Thoma
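As a rough illustration of the "backwards strided convolution" idea, the sketch below implements a 1D transposed convolution in plain Python. It assumes that "deconvolutional layer" is meant in the transposed-convolution sense; the function name `transposed_conv1d` and the fixed kernel are illustrative only (in a real network the kernel would be learned).

```python
def transposed_conv1d(x, kernel, stride=1):
    """Scatter each input value, scaled by the kernel, into a longer output."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, value in enumerate(x):
        for j, w in enumerate(kernel):
            # Input position i writes to output positions i*stride .. i*stride+k-1.
            out[i * stride + j] += value * w
    return out


upsampled = transposed_conv1d([1.0, 2.0, 3.0], kernel=[1.0, 1.0], stride=2)
print(upsampled)
# prints: [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
```

With stride 2 and a kernel of ones, the layer reproduces nearest-neighbour upsampling; other kernel values would give other interpolation schemes.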
267
votes
10 answers
How to set class weights for imbalanced classes in Keras?
I know that Keras lets you pass a class_weight dictionary when fitting, but I couldn't find any example. Would somebody be so kind as to provide one?
By the way, in this case the appropriate practice is simply to weight up the…
Hendrik
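One way such a dictionary might be built is sketched below in plain Python. The weighting rule `n_samples / (n_classes * count)` is an assumption borrowed from the common "balanced" heuristic, not something the question specifies, and `balanced_class_weights` is a hypothetical helper name.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Map each class to n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical imbalanced label vector with integer class ids 0, 1, 2.
labels = [0] * 6 + [1] * 3 + [2] * 1
class_weight = balanced_class_weights(labels)
print(class_weight)

# Assuming a compiled Keras model and training arrays exist, the dictionary
# could then be passed at fit time:
# model.fit(X_train, y_train, class_weight=class_weight)
```

Rare classes get weights above 1 and the majority class a weight below 1, so each class contributes roughly equally to the loss.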
250
votes
10 answers
What's the difference between fit and fit_transform in scikit-learn models?
I do not understand the difference between the fit and fit_transform methods in scikit-learn. Can anybody explain simply why we might need to transform data?
What does it mean to fit a model on training data and then transform the test data? Does it…
Kaggle
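The fit / transform / fit_transform contract can be sketched with a toy scaler. `TinyStandardScaler` below is a hypothetical stand-in written for illustration, not scikit-learn's implementation: `fit` learns statistics from the training data, `transform` applies them, and `fit_transform` does both on the same data.

```python
import math


class TinyStandardScaler:
    """Toy scaler mimicking the fit/transform split (illustrative only)."""

    def fit(self, values):
        # Learn the statistics from the data passed in (normally the training set).
        self.mean_ = sum(values) / len(values)
        variance = sum((v - self.mean_) ** 2 for v in values) / len(values)
        self.scale_ = math.sqrt(variance) or 1.0  # avoid dividing by zero
        return self

    def transform(self, values):
        # Apply the previously learned statistics; nothing is re-estimated here.
        return [(v - self.mean_) / self.scale_ for v in values]

    def fit_transform(self, values):
        # Shorthand for fit(values) followed by transform(values).
        return self.fit(values).transform(values)


scaler = TinyStandardScaler()
train_scaled = scaler.fit_transform([1.0, 2.0, 3.0])  # statistics come from train
test_scaled = scaler.transform([2.0, 5.0])            # test reuses train statistics
print(train_scaled, test_scaled)
```

Calling `transform` rather than `fit_transform` on the test data keeps test-set statistics from leaking into preprocessing.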
206
votes
18 answers
Train/Test/Validation Set Splitting in Sklearn
How could I randomly split a data matrix and the corresponding label vector into X_train, X_test, X_val, y_train, y_test, y_val with scikit-learn?
As far as I know, sklearn.model_selection.train_test_split is only capable of splitting into two, not…
Hendrik
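A three-way split can be sketched without any library by shuffling indices once and slicing them. The function `train_val_test_split`, its default fractions, and the seed are assumptions made for this example; with scikit-learn, a common alternative is to call train_test_split twice in succession.

```python
import random


def train_val_test_split(X, y, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle indices once, then slice them into test, validation and train."""
    if len(X) != len(y):
        raise ValueError("X and y must have the same length")
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility

    n_test = int(round(len(X) * test_frac))
    n_val = int(round(len(X) * val_frac))
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]

    def take(seq, idx):
        return [seq[i] for i in idx]

    return (take(X, train_idx), take(X, val_idx), take(X, test_idx),
            take(y, train_idx), take(y, val_idx), take(y, test_idx))


X = [[i] for i in range(10)]
y = [i * 10 for i in range(10)]
X_train, X_val, X_test, y_train, y_val, y_test = train_val_test_split(X, y)
print(len(X_train), len(X_val), len(X_test))
# prints: 6 2 2
```

Because the same index lists are used for X and y, each row stays paired with its label in every subset.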
203
votes
35 answers
Publicly Available Datasets
One of the common problems in data science is gathering data from various sources in a reasonably clean (semi-structured) format and combining metrics from various sources to make a higher-level analysis. Looking at other people's efforts,…
Amir Ali Akbari
202
votes
13 answers
K-Means clustering for mixed numeric and categorical data
My data set contains a number of numeric attributes and one categorical.
Say, NumericAttr1, NumericAttr2, ..., NumericAttrN, CategoricalAttr,
where CategoricalAttr takes one of three possible values: CategoricalAttrValue1, CategoricalAttrValue2 or…
IgorS
200
votes
6 answers
What is the "dying ReLU" problem in neural networks?
Referring to the Stanford course notes on Convolutional Neural Networks for Visual Recognition, a paragraph says:
"Unfortunately, ReLU units can be fragile during training and can
"die". For example, a large gradient flowing through a ReLU…
tejaskhot
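The quoted behaviour can be illustrated with a single ReLU neuron in plain Python. The weights, bias and inputs below are made-up values, and the very negative bias simply stands in for the result of one oversized gradient update.

```python
def relu(z):
    return max(0.0, z)


def neuron_forward_backward(w, b, xs):
    """Return the outputs and the gradients of sum(outputs) w.r.t. w and b."""
    outputs, grad_w, grad_b = [], 0.0, 0.0
    for x in xs:
        z = w * x + b
        outputs.append(relu(z))
        local = 1.0 if z > 0 else 0.0  # derivative of ReLU at z
        grad_w += local * x
        grad_b += local
    return outputs, grad_w, grad_b


xs = [0.5, 1.0, 2.0]
healthy = neuron_forward_backward(w=1.0, b=0.0, xs=xs)
# Hypothetical state after a huge update pushed the bias far below zero.
dead = neuron_forward_backward(w=1.0, b=-100.0, xs=xs)
print(healthy)  # prints: ([0.5, 1.0, 2.0], 3.5, 3.0)
print(dead)     # prints: ([0.0, 0.0, 0.0], 0.0, 0.0)
```

Once the pre-activation is negative for every input, both gradients are zero, so gradient descent has no signal with which to bring the unit back.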
200
votes
7 answers
How to draw Deep learning network architecture diagrams?
I have built my model. Now I want to draw the network architecture diagram for my research paper. An example is shown below:
Muhammad Ali
196
votes
2 answers
Difference between isna() and isnull() in pandas
I have been using pandas for quite some time, but I don't understand the difference between isna() and isnull() or, more importantly, which one to use when identifying missing values in a DataFrame.
What is the basic underlying difference…
Vaibhav Thakur
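A quick way to compare the two methods is to run both on the same frame. The small DataFrame below is invented for the check and assumes pandas and NumPy are installed.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value per column.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", None, "z"]})

# Compare the boolean masks produced by the two methods.
same = df.isna().equals(df.isnull())
missing_per_column = df.isna().sum().to_dict()

print(same)                # prints: True
print(missing_per_column)
```

The pandas documentation describes isnull as an alias for isna, so the choice between them is a matter of style rather than behaviour.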
185
votes
21 answers
How do you visualize neural network architectures?
When writing a paper or giving a presentation about neural networks, one usually visualizes the network's architecture.
What are good / simple ways to visualize common architectures automatically?
Martin Thoma
184
votes
6 answers
When to use GRU over LSTM?
The key difference between a GRU and an LSTM is that a GRU has two gates (reset and update gates) whereas an LSTM has three gates (namely input, output and forget gates).
Why do we make use of GRU when we clearly have more control over the network…
Sayali Sonawane
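One practical consequence of the gate counts is model size, which the rough sketch below estimates. It assumes the textbook formulations, in which each gate and the candidate state have their own input weights, recurrent weights and a single bias vector; library implementations may add extra biases. The helper name and layer sizes are illustrative.

```python
def recurrent_params(input_dim, hidden_dim, n_blocks):
    """Parameter count for n_blocks weight blocks of a recurrent cell."""
    per_block = hidden_dim * (input_dim + hidden_dim) + hidden_dim  # W, U and bias
    return n_blocks * per_block


input_dim, hidden_dim = 10, 20
lstm_params = recurrent_params(input_dim, hidden_dim, n_blocks=4)  # 3 gates + candidate
gru_params = recurrent_params(input_dim, hidden_dim, n_blocks=3)   # 2 gates + candidate
print(lstm_params, gru_params)
# prints: 2480 1860
```

Under these assumptions a GRU layer has three quarters of the parameters of an LSTM layer of the same width, which is one reason it can be cheaper to train.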
177
votes
4 answers
When to use One Hot Encoding vs LabelEncoder vs DictVectorizer?
I have been building models with categorical data for a while now, and in this situation I basically default to using scikit-learn's LabelEncoder to transform the data prior to building a model.
I understand the difference between OHE,…
anthr
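The difference between the first two encodings can be sketched in plain Python. The helpers `label_encode` and `one_hot_encode` are hypothetical stand-ins written for this example, not scikit-learn's classes, and the colour values are invented.

```python
def label_encode(values):
    """Map each category to an integer id, using sorted order of the categories."""
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    return [index[v] for v in values], classes


def one_hot_encode(values):
    """Map each category to a 0/1 indicator vector with one slot per category."""
    codes, classes = label_encode(values)
    rows = [[1 if code == k else 0 for k in range(len(classes))] for code in codes]
    return rows, classes


colors = ["red", "green", "blue", "green"]
codes, classes = label_encode(colors)
onehot, _ = one_hot_encode(colors)
print(classes)  # prints: ['blue', 'green', 'red']
print(codes)    # prints: [2, 1, 0, 1]
print(onehot)   # prints: [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Integer codes imply an ordering (blue < green < red) that a model may treat as meaningful, whereas the indicator vectors carry no such ordering.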
153
votes
6 answers
The cross-entropy error function in neural networks
In the MNIST For ML Beginners tutorial, cross-entropy is defined as
$$H_{y'} (y) := - \sum_{i} y_{i}' \log (y_i)$$
$y_i$ is the predicted probability value for class $i$ and $y_i'$ is the true probability for that class.
Question 1
Isn't it a problem that…
Martin Thoma
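The formula in this question can be evaluated directly. The sketch below uses plain Python with invented probabilities, and it assumes the usual convention that terms with a true probability of zero contribute nothing to the sum.

```python
import math


def cross_entropy(y_true, y_pred):
    """H_{y'}(y) = -sum_i y'_i * log(y_i), skipping terms where y'_i is 0."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)


# Hypothetical one-hot target (class 1) and predicted distribution.
y_true = [0.0, 1.0, 0.0]
y_pred = [0.1, 0.7, 0.2]
loss = cross_entropy(y_true, y_pred)
print(round(loss, 4))
# prints: 0.3567
```

With a one-hot target the sum reduces to the negative log of the probability assigned to the correct class, here -log(0.7).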
151
votes
13 answers
Why do people prefer Pandas to SQL?
I've been using SQL since 1996, so I may be biased. I've used MySQL and SQLite 3 extensively, but have also used Microsoft SQL Server and Oracle.
The vast majority of the operations I've seen done with Pandas can be done more easily with SQL. This…
vy32