Highest Voted Questions - Data Science Stack Exchange

8

votes

4 answers

Data science and MapReduce programming model of Hadoop

What are the different classes of data science problems that can be solved using the MapReduce programming model?

apache-hadoop map-reduce

asked Jul 28 '14 at 16:17

10land

369
3
10
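
For readers unfamiliar with the model the question refers to, here is a minimal sketch of the map, shuffle, and reduce steps using word counting. It is a pure-Python imitation that runs in one process, not Hadoop's API; the function names `map_phase`, `shuffle`, and `reduce_phase` and the sample documents are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


# Hypothetical input: each string stands in for one input split.
documents = ["big data big models", "data science"]

pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(counts)
```

Problems that fit this shape are those where each record can be processed independently and the partial results can be merged per key, such as counting, summing, and other per-key aggregations.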

8

votes

1 answer

MLflow real world experience

Can someone provide a summary of the real world deployment experience of MLflow? We have a few ML models (e.g., LightGBM, TensorFlow v2, etc.) and want to avoid frameworks like SageMaker (due to customer requirement). So we are looking into various…

mlflow mlops

asked Nov 24 '20 at 19:04

David293836

217
1
2
6

8

votes

1 answer

Who invented the concept of over-fitting?

I list the references that I found so far. In short, the first appearance of the term was in 1670, the first appearance with a close meaning was in 1827, the first appearance in a biological paper was in 1923, and the first appearance in statistics was in…

machine-learning overfitting terminology reference-request history

asked Nov 24 '20 at 06:30

DaL

2,663
13
13

8

votes

2 answers

What is the difference between GPT blocks and Transformer Decoder blocks?

I know GPT is a Transformer-based Neural Network, composed of several blocks. These blocks are based on the original Transformer's Decoder blocks, but are they exactly the same? In the original Transformer model, Decoder blocks have two attention…

deep-learning transformer language-model

asked Nov 16 '20 at 09:54

Leevo

6,445
3
18
52

8

votes

1 answer

Cosine Distance > 1 in scipy

I am working on a recommendation engine, and I have chosen to use SciPy's cosine distance as a way of comparing items. I have two vectors: a = [2.7654870801855078, 0.35995355443076027, 0.016221679989074141, -0.012664358453398751,…

python distance cosine-distance

asked Oct 13 '15 at 22:23

redgem

183
1
1
4
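
As background to the title of this question, here is a minimal sketch assuming the usual definition of cosine distance, 1 − cos θ, which is also how `scipy.spatial.distance.cosine` is documented. Under that definition the value ranges from 0 to 2, so a result above 1 just means the angle between the vectors exceeds 90 degrees. The helper `cosine_distance` and the sample vectors are hypothetical and are not the vectors from the question.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(angle between u and v); assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical vectors chosen to hit the three landmark cases.
same_direction = cosine_distance([1.0, 2.0], [2.0, 4.0])    # close to 0
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])        # close to 1
opposite = cosine_distance([1.0, 2.0], [-1.0, -2.0])        # close to 2

print(same_direction, orthogonal, opposite)
```

Vectors with negative components, like the one quoted in the excerpt, can point in roughly opposite directions, which is when the distance rises above 1.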

8

votes

5 answers

Filling missing data with other than mean values

What are all the options available for filling in missing data? One obvious choice is the mean, but if the percentage of missing data is large, it will decrease the accuracy. So how do we deal with missing values if there are a lot of them?

data-mining missing-data

asked Oct 06 '15 at 10:51

mach

367
1
4
9
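
One common alternative to the mean is the median, which is less sensitive to outliers. The sketch below is only an illustration under simple assumptions: missing entries are represented as `None`, and the helper `impute_median` and the sample column are hypothetical.

```python
import statistics


def impute_median(column):
    """Replace None entries with the median of the observed values."""
    observed = [x for x in column if x is not None]
    fill_value = statistics.median(observed)
    return [fill_value if x is None else x for x in column]


# Hypothetical column with two missing entries.
ages = [22, None, 35, 41, None, 29]
filled = impute_median(ages)
print(filled)
```

Like mean imputation, this shrinks the variance of the column when many values are missing, so it does not by itself resolve the concern raised in the question.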

8

votes

4 answers

One Hot encoding for large number of values

How do we use one hot encoding if the number of values which a categorical variable can take is large? In my case it is 56 values. So as per the usual method I would have to add 56 columns (56 binary features) in the training dataset which will…

machine-learning data-mining classification dataset categorical-data

asked Oct 03 '15 at 18:37

mach

367
1
4
9
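
To make the column count in this question concrete, here is a minimal sketch of plain one-hot encoding: one binary column per distinct value, so a variable with 56 values yields 56 columns. The helper `one_hot` and the sample data are hypothetical, and the sorted column order is an arbitrary choice for the example.

```python
def one_hot(values):
    """Return (categories, rows) with one binary column per distinct value."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        rows.append(row)
    return categories, rows


# Hypothetical categorical column with three distinct values.
colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot(colors)
print(categories)
print(encoded)
```

Each encoded row has exactly one 1, and its width equals the number of distinct values in the input.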

8

votes

2 answers

Can learning algorithms take in data along with their uncertainty? (chaining ML algorithms along with errors)

How to chain statistical methods (estimators or classifiers) taking into account the uncertainty (error) of the previous step? Ex: Consider a pipeline, where housing prices are estimated from census and geographical data and are fed into another…

machine-learning

asked Oct 10 '20 at 07:09

duggi

131
4

8

votes

2 answers

Difference between training and test data distributions

A basic assumption in machine learning is that training and test data are drawn from the same population, and thus follow the same distribution. But, in practice, this is highly unlikely. Covariate shift addresses this issue. Can someone clear the…

machine-learning classification dataset image-classification

asked Sep 25 '15 at 00:47

Daniel Wonglee

191
1
4

8

votes

3 answers

feature importance after classification

I have time series data with roughly 200 features for each sample, and I used a recurrent neural network for the binary classification task. After the classification I would like to know which features contribute most to one of the targets (let's say…

classification rnn

asked Sep 16 '20 at 09:35

Rick0

105
4

8

votes

2 answers

How does word2vec handle the input word being in the context?

If word2vec encounters the same word multiple times in the same window, what occurs? Obviously it is meaningless to decrease the distance between the vectors for the input word and the target word. But will the repetition strengthen the…

machine-learning nlp word-embeddings

asked Sep 17 '15 at 21:02

jamesmf

3,117
1
18
25

8

votes

2 answers

How should I use BERT embeddings for clustering (as opposed to fine-tuning BERT model for a supervised task)

First of all, I want to say that I am asking this question because I am interested in using BERT embeddings as document features to do clustering. I am using Transformers from the Hugging Face library. I was thinking of averaging all of the Word…

machine-learning deep-learning nlp word-embeddings bert

asked Aug 21 '20 at 02:00

fractalnature

825
6
19

8

votes

4 answers

Does reinforcement learning require the help of other learning algorithms?

Can't reinforcement learning be used without the help of other learning algorithms like SVM and MLP back propagation? I consulted two papers (Paper 1 and Paper 2), and both used other machine learning methods in the inner loop.

machine-learning reinforcement-learning algorithms

asked Sep 07 '15 at 08:29

girl101

1,161
2
11
26

8

votes

3 answers

Are there any machine learning techniques to identify points on plots/ images?

I have data for each vehicle's lateral position over time and lane number as shown in these 3 plots in the image and sample data below.

> a
  Frame.ID   xcoord Lane
1      452 27.39400    3
2      453 27.38331    3
3      454 27.42999    3
4 …

machine-learning r

asked Sep 06 '15 at 02:04

umair durrani

344
1
2
8

8

votes

3 answers

What are bias and variance in machine learning?

I am studying machine learning, and I have encountered the concept of bias and variance. I am a university student and in the slides of my professor, the bias is defined as: $bias = E[error_s(h)]-error_d(h)$ where $h$ is the hypothesis and…

machine-learning dataset variance bias

asked Aug 12 '20 at 08:10

J.D.

941
6
20
33
