Most Popular
1500 questions
8
votes
4 answers
Data science and MapReduce programming model of Hadoop
What are the different classes of data science problems that can be solved using the MapReduce programming model?
10land
- 369
- 3
- 10
8
votes
1 answer
MLflow real world experience
Can someone provide a summary of their real-world deployment experience with MLflow? We have a few ML models (e.g., LightGBM, TensorFlow v2, etc.) and want to avoid frameworks like SageMaker (due to a customer requirement). So we are looking into various…
David293836
- 217
- 1
- 2
- 6
8
votes
1 answer
Who invented the concept of over-fitting?
I list the references that I found so far.
In short, the first appearance of the term was in 1670, the first appearance in a close meaning was in 1827, the first appearance in a biological paper was in 1923, and the first appearance in statistics was in…
DaL
- 2,663
- 13
- 13
8
votes
2 answers
What is the difference between GPT blocks and Transformer Decoder blocks?
I know GPT is a Transformer-based Neural Network, composed of several blocks. These blocks are based on the original Transformer's Decoder blocks, but are they exactly the same?
In the original Transformer model, Decoder blocks have two attention…
Leevo
- 6,445
- 3
- 18
- 52
8
votes
1 answer
Cosine Distance > 1 in scipy
I am working on a recommendation engine, and I have chosen to use SciPy's cosine distance as a way of comparing items.
I have two vectors:
a = [2.7654870801855078, 0.35995355443076027, 0.016221679989074141, -0.012664358453398751,…
redgem
- 183
- 1
- 1
- 4
8
votes
5 answers
Filling missing data with other than mean values
What are all the options available for filling in missing data?
One obvious choice is the mean, but if the percentage of missing data is large, it will decrease the accuracy.
So how do we deal with missing values if there are a lot of them?
mach
- 367
- 1
- 4
- 9
8
votes
4 answers
One Hot encoding for large number of values
How do we use one-hot encoding if the number of values that a categorical variable can take is large?
In my case it is 56 values. So, as per the usual method, I would have to add 56 columns (56 binary features) to the training dataset, which will…
mach
- 367
- 1
- 4
- 9
8
votes
2 answers
Can learning algorithms take in data along with their uncertainty? (chaining ML algorithms along with errors)
How to chain statistical methods (estimators or classifiers) taking into account the uncertainty (error) of the previous step?
Ex: Consider a pipeline, where housing prices are estimated from census and geographical data and are fed into another…
duggi
- 131
- 4
8
votes
2 answers
Difference between training and test data distributions
A basic assumption in machine learning is that training and test data are drawn from the same population, and thus follow the same distribution. But, in practice, this is highly unlikely. Covariate shift addresses this issue. Can someone clear the…
Daniel Wonglee
- 191
- 1
- 4
8
votes
3 answers
feature importance after classification
I have time series data with roughly 200 features for each sample, and I used a recurrent neural network for the binary classification task.
After the classification, I would like to know which features contribute most to one of the targets (let's say…
Rick0
- 105
- 4
8
votes
2 answers
How does word2vec handle the input word being in the context?
If word2vec encounters the same word multiple times in the same window, what occurs? Obviously it is meaningless to decrease the distance between the vectors for the input word and the target word. But will the repetition strengthen the…
jamesmf
- 3,117
- 1
- 18
- 25
8
votes
2 answers
How should I use BERT embeddings for clustering (as opposed to fine-tuning BERT model for a supervised task)
First of all, I want to say that I am asking this question because I am interested in using BERT embeddings as document features to do clustering. I am using Transformers from the Hugging Face library. I was thinking of averaging all of the Word…
fractalnature
- 825
- 6
- 19
8
votes
4 answers
Does reinforcement learning require the help of other learning algorithms?
Can't reinforcement learning be used without the help of other learning algorithms like SVM and MLP back propagation? I consulted two papers:
Paper 1
Paper 2
Both used other machine learning methods in the inner loop.
girl101
- 1,161
- 2
- 11
- 26
8
votes
3 answers
Are there any machine learning techniques to identify points on plots/ images?
I have data for each vehicle's lateral position over time and lane number as shown in these 3 plots in the image and sample data below.
> a
Frame.ID xcoord Lane
1 452 27.39400 3
2 453 27.38331 3
3 454 27.42999 3
4 …
umair durrani
- 344
- 1
- 2
- 8
8
votes
3 answers
What are bias and variance in machine learning?
I am studying machine learning, and I have encountered the concept of bias and variance. I am a university student and in the slides of my professor, the bias is defined as:
$bias = E[error_s(h)]-error_d(h)$
where $h$ is the hypothesis and…
J.D.
- 941
- 6
- 20
- 33