
I am looking for a way to quantify the performance of multi-class labelers (models) so that I can compare them. I want to account for the fact that some classes are 'closer' than others: for example, a 'car' is closer to a 'truck' than it is to a 'flower'. So a labeler that classifies a car as a truck should be penalized less than one that classifies the car as a flower. I am considering using a Jaccard similarity score. Will this do what I want?

Tavi

1 Answer


There is no commonly established metric to do that. You'll have to write custom code based on manually specifying a rank-ordered preference among misclassifications, i.e. deciding by hand which mistakes count as worse than others.
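One way such custom code could look is sketched below. It is only an illustration: the class names, the cost values, and the function name `mean_misclassification_cost` are all assumptions chosen by hand, not an established metric. Each (true, predicted) pair gets a penalty, and a labeler's score is the average penalty over all samples, so lower is better.

```python
# Hypothetical, hand-chosen penalties (assumptions, not a standard):
# COST[true_label][predicted_label] = penalty for that prediction.
# A correct label costs 0, a "near miss" costs little, a "far miss" costs 1.
COST = {
    "car":    {"car": 0.0, "truck": 0.2, "flower": 1.0},
    "truck":  {"car": 0.2, "truck": 0.0, "flower": 1.0},
    "flower": {"car": 1.0, "truck": 1.0, "flower": 0.0},
}


def mean_misclassification_cost(y_true, y_pred, cost=COST):
    """Average penalty over all samples; lower means a better labeler."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one sample")
    total = sum(cost[t][p] for t, p in zip(y_true, y_pred))
    return total / len(y_true)


# Two hypothetical labelers that each make exactly one mistake on the first sample.
y_true = ["car", "car", "flower", "truck"]
labeler_a = ["truck", "car", "flower", "truck"]   # car -> truck (near miss)
labeler_b = ["flower", "car", "flower", "truck"]  # car -> flower (far miss)

score_a = mean_misclassification_cost(y_true, labeler_a)
score_b = mean_misclassification_cost(y_true, labeler_b)
print(score_a, score_b)
```

Both labelers have the same plain accuracy here, but labeler A gets the lower average cost because its single mistake is the cheaper one. The ranking depends entirely on the penalties you choose, so they need to reflect your own judgment of which confusions matter.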

Brian Spiering