Questions tagged [cross-entropy]
24 questions
3
votes
1 answer
Maximum Likelihood, Cross-Entropy, and Conditional Empirical Distributions for Conditional Models
I came across this article: “MSE is Cross Entropy at Heart: Maximum Likelihood Estimation Explained” which states:
"When training a neural network, we are trying to find the parameters of a probability distribution which is as close as possible…
spie227
- 101
- 4
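For readers landing here, the quoted article's claim has a standard one-line justification. If the conditional model is Gaussian with fixed variance, $p(y \mid x; \theta) = \mathcal{N}(y;\, f_\theta(x), \sigma^2)$, then the per-example negative log-likelihood is

$$-\log p(y \mid x; \theta) = \frac{(y - f_\theta(x))^2}{2\sigma^2} + \frac{1}{2}\log(2\pi\sigma^2),$$

so maximizing likelihood (equivalently, minimizing the cross-entropy between the empirical conditional distribution and the model) is exactly minimizing MSE up to constants.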
3
votes
1 answer
Using keras metrics BinaryCrossentropy for a binary model
I'm trying to implement a binary classification model using TensorFlow Keras and stumbled over a problem that I cannot grasp.
My model should classify images of houses into the two classes "old/antique" and "new/modern". I used transfer learning using…
Ada
- 33
- 5
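A minimal sketch of the metric's expected inputs (made-up labels and predictions, not the asker's model), since the usual stumbling point is the from_logits flag: pass probabilities with from_logits=False, or raw logits with from_logits=True.

import tensorflow as tf

y_true = [0.0, 1.0, 1.0, 0.0]   # binary labels
y_pred = [0.1, 0.8, 0.6, 0.3]   # sigmoid outputs, hence from_logits=False

m = tf.keras.metrics.BinaryCrossentropy(from_logits=False)
m.update_state(y_true, y_pred)
print(m.result().numpy())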
2
votes
1 answer
Why do MSE and cross-entropy losses have the same gradient?
I'm a data science student, and while I was learning to derive the logistic regression loss function (cross-entropy loss), I found that the gradient is exactly the same as the least-squares gradient for linear regression, even though the two…
Ammar
- 23
- 4
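One short chain-rule computation behind this: for linear regression with $\hat{y} = w^\top x$ and squared loss, $\partial L / \partial w = (\hat{y} - y)\,x$. For logistic regression with $\hat{y} = \sigma(w^\top x)$ and cross-entropy loss $L = -y\log\hat{y} - (1-y)\log(1-\hat{y})$, the identity $\sigma'(z) = \sigma(z)(1-\sigma(z))$ cancels the denominators:

$$\frac{\partial L}{\partial w} = \left(\frac{\hat{y} - y}{\hat{y}(1-\hat{y})}\right)\hat{y}(1-\hat{y})\,x = (\hat{y} - y)\,x.$$

Both models are generalized linear models paired with their canonical link, which is why the gradients share this form.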
2
votes
0 answers
Why is cross-entropy increasing with accuracy?
I'm implementing softmax regression and I'm struggling to understand why the value of the cross-entropy, $H(y, p) = -\sum_{i=1}^{C} y_i \log(p_i)$, increases along with an increasing accuracy:
This is utmost…
JoshJohnson
- 21
- 4
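A toy example (made-up numbers) showing how the two metrics can rise together: accuracy only checks the argmax, while cross-entropy also punishes confident mistakes.

import numpy as np

def cross_entropy(probs, labels):
    # mean negative log-probability assigned to the true class
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

def accuracy(probs, labels):
    return np.mean(probs.argmax(axis=1) == labels)

labels = np.array([0, 0, 0])  # true class is 0 for every example
epoch1 = np.array([[0.45, 0.55], [0.45, 0.55], [0.60, 0.40]])
epoch2 = np.array([[0.55, 0.45], [0.55, 0.45], [0.01, 0.99]])

print(accuracy(epoch1, labels), cross_entropy(epoch1, labels))  # 0.33, ~0.70
print(accuracy(epoch2, labels), cross_entropy(epoch2, labels))  # 0.67, ~1.93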
1
vote
1 answer
Multi-label multi-class binary numerically encoded output cannot be assigned a weight in torch.nn.functional.cross_entropy
I have a fairly unusual multi-label multi-class problem. I have a neural network that outputs 6 logits. The number of classes we are trying to predict is 2^6, i.e. I am encoding my output as a binary number. The reason for this is that if I just…
ptushev
- 21
- 1
- 6
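One common workaround for this setup (a sketch, not necessarily a fix for the asker's constraint): treat the 6 logits as 6 independent binary targets and use binary_cross_entropy_with_logits, whose pos_weight argument takes one weight per bit, instead of cross_entropy, whose weight argument expects one weight per class index.

import torch
import torch.nn.functional as F

batch, n_bits = 4, 6
logits = torch.randn(batch, n_bits)                     # one logit per bit
targets = torch.randint(0, 2, (batch, n_bits)).float()  # binary-encoded labels
pos_weight = torch.ones(n_bits) * 2.0                   # hypothetical per-bit weights
loss = F.binary_cross_entropy_with_logits(logits, targets, pos_weight=pos_weight)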
1
vote
0 answers
Analysis of relationship between accuracy and total loss (or cost) during training with logistic loss function and threshold 0.5
I'm trying to understand the relationship between training accuracy and training loss in classification tasks, specifically using logistic regression. When using logistic loss as the loss function and with the threshold set to $0.5$, I can see…
Tran Khanh
- 155
- 7
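Part of the difficulty is that the two quantities are different functions of the predicted probability $\hat{p}$. With $\hat{y} = \mathbf{1}[\hat{p} \ge 0.5]$, accuracy is a step function of the thresholded prediction, while log loss is smooth in $\hat{p}$ itself:

$$\ell_{0/1} = \mathbf{1}[\hat{y} \ne y], \qquad \ell_{\log} = -y\log\hat{p} - (1-y)\log(1-\hat{p}),$$

so the log loss acts as a convex surrogate for the 0/1 loss and the two need not decrease in lockstep.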
1
vote
1 answer
PyTorch CrossEntropyLoss and LogSoftmax + NLLLoss give different results
As per the PyTorch documentation, CrossEntropyLoss() is a combination of LogSoftmax() and NLLLoss(). However, calling CrossEntropyLoss() gives different results than calling LogSoftmax() followed by NLLLoss(), as seen from the output of the given…
cbelwal
- 113
- 3
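A minimal check of the documented equivalence (random inputs and class-index targets, not the asker's code); when the two calls disagree, a common cause is applying LogSoftmax twice or taking the softmax along the wrong dimension.

import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(3, 5)        # batch of 3, 5 classes
target = torch.tensor([1, 0, 4])  # class indices

a = F.cross_entropy(logits, target)
b = F.nll_loss(F.log_softmax(logits, dim=1), target)
print(torch.allclose(a, b))       # True, up to float precision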
1
vote
1 answer
Why is the calculated cross-entropy not zero?
import torch
import torch.nn.functional as F

logits = torch.Tensor([0, 1])
counts = logits.exp()                  # unnormalized probabilities
probs = counts / counts.sum()          # equivalent to softmax
loss = F.cross_entropy(logits, probs)  # soft (probability) targets
Here, loss is roughly equal to 0.5822.
However, I would expect it to…
Tsiolkovsky
- 13
- 4
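The number has a clean interpretation: with soft targets, the cross-entropy of a distribution with itself is its entropy $H(p) = -\sum_i p_i \log p_i$, which is zero only when $p$ is one-hot. A quick check under that reading:

import torch

probs = torch.tensor([0.0, 1.0]).softmax(dim=0)  # [0.2689, 0.7311]
entropy = -(probs * probs.log()).sum()
print(entropy)                                   # ~0.5822, matching the loss above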
1
vote
1 answer
Connecting timeseries quantities to CDF
In the following paper,
Ponce-Flores, M., Frausto-Solís, J., Santamaría-Bonfil, G., Pérez-Ortega, J., & González-Barbosa, J. J. (2020). Time series complexities and their relationship to forecasting performance. Entropy, 22(1), 89.
several…
Omar Shehab
- 11
- 4
1
vote
1 answer
Loss on whole sequences in Causal Language Model
I'd like to know, from an implementation point of view, whether, when training a causal Transformer such as GPT-2, making predictions on the whole sequence at once and computing the loss on the whole sequence is something standard?
When going across examples…
Valentin Macé
- 137
- 5
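For reference, computing the loss over the whole sequence at once is the usual pattern; a sketch with made-up shapes (position $t$ predicts token $t+1$, so logits and labels are shifted against each other):

import torch
import torch.nn.functional as F

batch, seq_len, vocab = 2, 8, 100
logits = torch.randn(batch, seq_len, vocab)           # model output per position
input_ids = torch.randint(0, vocab, (batch, seq_len))

shift_logits = logits[:, :-1, :]  # predictions for positions 0..T-2
shift_labels = input_ids[:, 1:]   # the tokens those positions should predict
loss = F.cross_entropy(shift_logits.reshape(-1, vocab), shift_labels.reshape(-1))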
1
vote
0 answers
Loss function for classification problem
I'm working on a classification problem: I used convolutional neural networks to classify grayscale ECG beat images of dimension 200x200 (around 4000 training images for each of 4 classes); the model is shown below:
I'm…
imene
- 23
- 1
- 3
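For four mutually exclusive classes, the standard choice is categorical cross-entropy; a minimal Keras sketch of the loss/label pairing (the model here is a hypothetical stand-in, not the asker's CNN, and integer labels 0..3 go with the sparse variant):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(200, 200, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])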
1
vote
0 answers
Shannon Information Content related to Uncertainty?
I'm a data science student currently writing my master's thesis, which revolves around the Cross-Entropy (CE) loss function for neural networks. From my understanding, the CE is based on the Entropy, which in turn is based on the Shannon Information…
xflashx
- 11
- 1
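The usual chain of definitions connecting the two: the Shannon information content (surprisal) of an outcome is $I(x) = -\log p(x)$; entropy is its expectation, which is precisely a measure of uncertainty; and cross-entropy is the expected surprisal when outcomes drawn from $p$ are scored under a model $q$:

$$H(p) = -\sum_x p(x)\log p(x), \qquad H(p, q) = -\sum_x p(x)\log q(x).$$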
1
vote
2 answers
Why is cross entropy based on Bernoulli or Multinoulli probability distribution?
When we use logistic regression, we use cross-entropy as the loss function. However, based on my understanding and https://machinelearningmastery.com/cross-entropy-for-machine-learning/, cross-entropy evaluates whether two or more distributions are…
Feng Chen
- 207
- 1
- 10
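The link is that binary cross-entropy is the Bernoulli negative log-likelihood. With $p(y \mid x) = \hat{y}^{\,y}(1-\hat{y})^{1-y}$ for $y \in \{0, 1\}$,

$$-\log p(y \mid x) = -y\log\hat{y} - (1-y)\log(1-\hat{y}),$$

and the multinoulli (categorical) case gives the multi-class form $-\sum_c y_c \log \hat{y}_c$, which is why these two distributions appear in the definition.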
0
votes
0 answers
How to find the optimal features and rewards to get a deep learning AI based on the cross entropy method to learn well?
I am a beginner in programming, but managed to get a little Pong game done. For my studies I had to understand an AI that solved the LunarLander-v2 environment of the Gymnasium API, which used deep learning and the cross-entropy method. It…
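For context, a minimal sketch of one iteration of the generic cross-entropy method (names such as score_fn, which would return e.g. an episode's total reward, are hypothetical), separating the method itself from the feature/reward design the question asks about: sample candidates, keep the elite fraction, refit the sampling distribution.

import numpy as np

def cem_step(mean, std, score_fn, pop_size=50, elite_frac=0.2):
    # Sample candidate parameter vectors around the current distribution.
    samples = np.random.randn(pop_size, mean.size) * std + mean
    scores = np.array([score_fn(s) for s in samples])
    # Keep the best-scoring fraction and refit mean/std to those elites.
    n_elite = max(1, int(pop_size * elite_frac))
    elites = samples[np.argsort(scores)[-n_elite:]]
    return elites.mean(axis=0), elites.std(axis=0)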
0
votes
0 answers
Predicting pregnancy codes with transformer
I'm trying to predict pregnancy codes with a basic transformer model architecture. These pregnancy codes are like the following: prg001, prg002, …, prg030. prg001 would be antenatal screening and prg030 would be maternal outcome of delivery.
The source is…
NatalieL
- 101
- 1