
I have an unusual multi-label multi-class problem. I have a neural network that outputs 6 logits, and the number of classes we are trying to predict is 2^6 = 64, i.e. I am encoding my output as a binary number. The reason for this is that if I make my last layer a torch.nn.Linear layer with 64 neurons, my model becomes too big. I am also working with a very unbalanced dataset, where some labels are more frequent than others. I have a weight torch.Tensor of size 64 that I try to pass to the weight argument of torch.nn.functional.cross_entropy, but I get an error:

RuntimeError: cross_entropy: weight tensor should be defined either for all 6 classes or no classes but got weight tensor of shape: [64]

How do I assign a weight to each of the 2^6 combinations of the output?
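For reference, a minimal sketch that reproduces the error (batch size and target values are made up; cross_entropy sees the 6 logits as 6 classes, so a size-64 weight tensor fails its shape check):

    import torch
    import torch.nn.functional as F

    logits = torch.randn(8, 6)           # batch of 8 samples, 6 output logits
    targets = torch.randint(0, 6, (8,))  # cross_entropy treats the 6 logits as 6 classes
    weights = torch.rand(64)             # one weight per 2^6 label combination

    # RuntimeError: cross_entropy: weight tensor should be defined either
    # for all 6 classes or no classes but got weight tensor of shape: [64]
    loss = F.cross_entropy(logits, targets, weight=weights)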

ptushev

1 Answer


PyTorch's cross_entropy loss is designed for single-label multi-class classification, where each input sample belongs to exactly one class and each output logit represents a separate class. It therefore expects one weight per class, i.e. per logit (6 in your case), so cross_entropy cannot directly take weights for all 64 class combinations.

Therefore, if your problem really is single-label multi-class classification, you should design your neural network to output 64 logits instead of 6. If you're worried that 64 outputs will make the model too big for your resource constraints, you can reduce the number and width of the hidden layers and add regularization such as dropout or weight decay; see the sketch below.
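A rough sketch of that trade-off (the input size of 128 and hidden width of 32 are placeholder values, not taken from your setup):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Smaller hidden layer plus dropout to offset the cost of the 64-logit head.
    model = nn.Sequential(
        nn.Linear(128, 32),   # assumed input dimension of 128
        nn.ReLU(),
        nn.Dropout(p=0.2),
        nn.Linear(32, 64),    # one logit per class (2^6 = 64)
    )

    logits = model(torch.randn(8, 128))
    targets = torch.randint(0, 64, (8,))  # one class index in [0, 63] per sample
    class_weights = torch.rand(64)        # your size-64 weight tensor now matches C=64
    loss = F.cross_entropy(logits, targets, weight=class_weights)

Weight decay would go on the optimizer rather than the loss, e.g. torch.optim.AdamW(model.parameters(), weight_decay=1e-2).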

If your problem is actually a multi-label problem, where one input sample can be assigned multiple class labels at once, then you should use PyTorch's binary_cross_entropy_with_logits loss instead.
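In that case each of the 6 logits is an independent binary decision, so class imbalance is handled per label (6 weights) rather than per combination (64). A sketch using the pos_weight argument, with purely illustrative weight values:

    import torch
    import torch.nn.functional as F

    logits = torch.randn(8, 6)                     # 6 logits, one per binary label
    targets = torch.randint(0, 2, (8, 6)).float()  # multi-hot targets, one bit per label

    # pos_weight takes one weight per label (shape [6]); larger values up-weight
    # the positive term of rarer labels. These numbers are only an example.
    pos_weight = torch.tensor([1.0, 3.0, 1.5, 10.0, 2.0, 1.2])
    loss = F.binary_cross_entropy_with_logits(logits, targets, pos_weight=pos_weight)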

cinch