Questions tagged [validation]
65 questions
4 votes · 2 answers
Validation loss zigzagging
I'm training a speech recognition model using the Nvidia NeMo framework. Even with just the small FastConformer model and two dozen iterations, the results are pretty good; for my data I would say they are quite amazing.
However, I have noticed something…
comodoro · 143
3 votes · 1 answer
Does validation data have any effect on training, or does it act solely without affecting the training?
When using the Keras library in Python, we pass validation data along with the training data while training our model. In every epoch, we get a validation accuracy. Does this validation accuracy have any effect on training in the next epoch?
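A minimal Keras sketch of the distinction, on synthetic data (array shapes and the early-stopping setup are illustrative assumptions, not from the question): the validation set never enters the gradient updates, but a callback watching val_loss can still change the course of training.

import numpy as np
from tensorflow import keras

# Synthetic stand-in data (an assumption for illustration).
X_train, y_train = np.random.rand(200, 10), np.random.randint(0, 2, 200)
X_val, y_val = np.random.rand(50, 10), np.random.randint(0, 2, 50)

model = keras.Sequential([
    keras.layers.Input(shape=(10,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# validation_data is only scored after each epoch; it does not feed
# backpropagation. EarlyStopping, however, reads val_loss and may halt training.
model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          epochs=50,
          callbacks=[keras.callbacks.EarlyStopping(monitor="val_loss", patience=3)],
          verbose=0)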
Rawnak Yazdani · 133
3 votes · 1 answer
Why do machine learning engineers insist on training with more data than they use for validation?
Among my colleagues I have noticed a curious insistence on training with, say, 70% or 80% of the data and validating on the remainder. The reason it is curious to me is the lack of any theoretical reasoning, and it smacks of influence from a five-fold…
tdMJN6B2JtUe · 200
3 votes · 2 answers
What is the conclusion from this Accuracy / Loss plot for Train and Validation?
What is the conclusion from this accuracy/loss plot for train and validation?
It seems that the best results for validation are reached after a few (5) epochs.
Also, I'm not comfortable with how the loss and accuracy look for validation.
Michael D · 159
2 votes · 0 answers
ValueError: Input contains NaN, infinity or a value too large for dtype('float32'). Any advice?
The error above appeared when attempting to fit a PCA.
My code:
# Excerpt of scikit-learn's internal finiteness check (_assert_all_finite),
# which is what raises this error:
is_float = X.dtype.kind in 'fc'
if is_float and (np.isfinite(_safe_accumulator_op(np.sum, X))):
    pass
elif is_float:
    msg_err = "Input contains {}…
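The usual cause is NaN or inf values in the input matrix. A minimal sketch of checking and cleaning before fitting (the variable names and the drop-rows remedy are illustrative assumptions, not the asker's data):

import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(100, 5)   # hypothetical feature matrix
X[0, 0] = np.nan             # simulate one bad value

# Find rows containing NaN or inf before fitting.
bad_rows = ~np.isfinite(X).all(axis=1)
print(bad_rows.sum(), "rows contain NaN or inf")

# One simple remedy is to drop those rows; imputation is an alternative.
X_clean = X[~bad_rows]
PCA(n_components=2).fit(X_clean)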
Macgregorfb · 21
2 votes · 0 answers
Validation set after hyperparameter tuning
Let's say I'm comparing a few models, and for my dataset I'm using a train/validation/test split, not cross-validation. Let's say I'm completely done with parameter tuning for one of them and want to evaluate on the test set. Will I train a new…
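A sketch of the usual practice (the arrays, model, and chosen hyperparameters below are assumptions for illustration): once tuning is finished, refit the winning configuration on train plus validation combined, then touch the test set exactly once.

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Stand-ins for an existing train/validation/test split.
X_train, y_train = rng.random((300, 4)), rng.integers(0, 2, 300)
X_val, y_val = rng.random((100, 4)), rng.integers(0, 2, 100)
X_test, y_test = rng.random((100, 4)), rng.integers(0, 2, 100)

best_params = {"C": 1.0, "kernel": "rbf"}  # assumed outcome of the finished tuning

# Refit on train + validation, then evaluate once on the held-out test set.
final_model = SVC(**best_params).fit(
    np.vstack([X_train, X_val]),
    np.concatenate([y_train, y_val]),
)
print("test accuracy:", final_model.score(X_test, y_test))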
Oz0234 · 21
2 votes · 1 answer
Using the whole dataset for testing (not validation) in case of small datasets
For an object detection task, I created a small dataset to train an object detector. The class frequency is more or less balanced; however, I defined some additional attributes with environmental conditions for each image, which results in a rather…
P. Leibner · 123
2 votes · 1 answer
Validation and training loss of a model are not stable
Below, I have a trained model, and the loss on both the training dataset (blue) and validation dataset (orange) is shown. From my understanding, the ideal case is that both validation and training loss should converge and stabilize in order to tell…
Avv · 231
2 votes · 1 answer
How to build a model when we have three separate train, validation, and test sets?
I have a data set which should be divided into train, test, and validation sets.
set.seed(98274) # Creating example data
y <- sample(c(0,1), replace=TRUE, size=500)
x1 <- rnorm(500) + 0.2 * y
x2 <- rnorm(500) + 0.2 * x1 +…
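A Python sketch mirroring the R setup above (the 60/20/20 proportions, simplified feature construction, and logistic-regression model are assumptions for illustration): fit on train, compare on validation, report once on test.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(98274)
y = rng.integers(0, 2, size=500)
X = rng.normal(size=(500, 2)) + 0.2 * y[:, None]

# 60/20/20 split: carve off the test set first, then the validation set.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=1)

model = LogisticRegression().fit(X_train, y_train)
print("validation accuracy:", model.score(X_val, y_val))
print("test accuracy:", model.score(X_test, y_test))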
ebrahimi · 1,305
2 votes · 3 answers
What exactly is the difference between validation data and testing data?
I asked this question on Stack Overflow and was told that this is a better place for it.
I am confused by the terms validation and testing. Is validating the model the same as testing it? Is it possible to use testing data for validation?
What even…
besa · 33
2 votes · 2 answers
Dataset and why use evaluate()?
I am starting out in machine learning, and I have doubts about some concepts. I've read that we need to split our dataset into training, validation and test sets. I'll ask four questions related to them.
1 - Training set: it is used in .fit() for our model…
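A minimal Keras sketch of where each set enters the workflow (the data, shapes, and architecture are illustrative assumptions): fit() learns from the training set while only monitoring the validation set, and evaluate() scores the untouched test set once training is done.

import numpy as np
from tensorflow import keras

# Synthetic stand-ins for the three splits.
X_train, y_train = np.random.rand(300, 8), np.random.randint(0, 2, 300)
X_val, y_val = np.random.rand(100, 8), np.random.randint(0, 2, 100)
X_test, y_test = np.random.rand(100, 8), np.random.randint(0, 2, 100)

model = keras.Sequential([
    keras.layers.Input(shape=(8,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Training uses X_train; X_val is only scored after each epoch.
model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=5, verbose=0)

# evaluate() gives the final estimate on data never seen during training.
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
print("test accuracy:", round(test_acc, 3))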
Murilo · 125
1 vote · 3 answers
Splitting Training, Test and Validation for an Image Dataset
I have 600 images in the training folder, 200 images in the validation folder, and 200 images in the test folder. Suppose I fit the training data generator and validation data generator for some epochs for learning purposes -…
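A sketch of wiring folder-based generators to the three splits with Keras' ImageDataGenerator (the directory paths, image size, and batch size are assumptions; each folder is expected to contain one sub-folder per class):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

gen = ImageDataGenerator(rescale=1.0 / 255)

# Hypothetical directory layout: data/train, data/val, data/test.
train_it = gen.flow_from_directory("data/train", target_size=(224, 224), batch_size=32)
val_it = gen.flow_from_directory("data/val", target_size=(224, 224), batch_size=32)
test_it = gen.flow_from_directory("data/test", target_size=(224, 224),
                                  batch_size=32, shuffle=False)

# Typical use: model.fit(train_it, validation_data=val_it, epochs=10),
# followed by model.evaluate(test_it) after training.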
User · 36
1 vote · 1 answer
Measure performance of classification model for training on different snapshots
I am trying to do binary classification on some chronological data. Let's assume we have weekly data from the first week of 2017 through the last week of 2020. Now we have found out that 26 weeks of training data might be sufficient for doing…
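One way to evaluate training on different chronological snapshots is scikit-learn's TimeSeriesSplit with a capped training window; a sketch on synthetic weekly data (the feature matrix and applying the 26-week cap here are assumptions based on the question):

import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# ~209 weeks of data, first week of 2017 through last week of 2020 (synthetic).
X = np.random.rand(209, 4)
y = np.random.randint(0, 2, size=209)

# Rolling-origin evaluation with the training window capped at 26 weeks.
tscv = TimeSeriesSplit(n_splits=5, max_train_size=26)
for train_idx, test_idx in tscv.split(X):
    print(f"train weeks {train_idx[0]}-{train_idx[-1]}, "
          f"test weeks {test_idx[0]}-{test_idx[-1]}")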
Ricky · 189
1 vote · 1 answer
Using Z-test score to evaluate model performance
I think I know the answer to this question but I am looking for a sanity check here: Is it appropriate to use z-test scores in order to evaluate the performance of my model?
I have a binary model that I have developed with a NN in Keras. I know the…
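If the goal is to compare two accuracies, the standard tool is a two-proportion z-test; a sketch with statsmodels (the counts below are illustrative assumptions, not the asker's results):

import numpy as np
from statsmodels.stats.proportion import proportions_ztest

# Correct predictions for the model vs. a baseline, each on 1,000 examples.
successes = np.array([870, 820])
trials = np.array([1000, 1000])

z_stat, p_value = proportions_ztest(successes, trials)
print(f"z = {z_stat:.3f}, p = {p_value:.4f}")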
I_Play_With_Data · 2,129
1 vote · 1 answer
Does it make sense to repeatedly calculate AUC for a logistic regression?
I have a question regarding logistic regression models and testing their skill.
I am not quite sure whether I understand correctly how the ROC curve is established.
When calculating the ROC curve, is a train-test split happening and then the skill of a…
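Repeating the split-and-score cycle is exactly what cross-validated AUC does; a minimal sketch on synthetic data (the dataset and fold count are assumptions):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)  # stand-in data

# Each fold holds out a different slice, fits the model, and computes AUC;
# the spread across folds shows how stable the estimate is.
aucs = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                       cv=5, scoring="roc_auc")
print("AUC per fold:", np.round(aucs, 3), "mean:", round(aucs.mean(), 3))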
DataVader · 25