I have a multiclass classification problem and I want to compare two classifiers using McNemar's test to find out whether the difference between them is statistically significant. Do I have to run the test for every class (a one-vs-all approach), or is there another way?
Thanks in advance, because I'm at a dead end.
By trying it for every class I get the following:
For class: Angry McNemar table: [[1261. 53.] [ 54. 72.]] chi-squared: 0.0 p-value: 1.0
For class: Calm McNemar table: [[1263. 46.] [ 32. 99.]] chi-squared: 2.1666666666666665 p-value: 0.14103164052071213
For class: Disgust McNemar table: [[1239. 58.] [ 37. 106.]] chi-squared: 4.2105263157894735 p-value: 0.040173870288512055
For class: Fearful McNemar table: [[1201. 60.] [ 70. 109.]] chi-squared: 0.6230769230769231 p-value: 0.4299061750659041
For class: Happy McNemar table: [[1172. 76.] [ 72. 120.]] chi-squared: 0.060810810810810814 p-value: 0.8052189802694247
For class: Neutral McNemar table: [[1280. 43.] [ 34. 83.]] chi-squared: 0.8311688311688312 p-value: 0.36193476692119253
For class: Sad McNemar table: [[1129. 76.] [ 87. 148.]] chi-squared: 0.6134969325153374 p-value: 0.43347418315950703
For class: Surprised McNemar table: [[1304. 39.] [ 33. 64.]] chi-squared: 0.3472222222222222 p-value: 0.5556897902827946
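For reference, the numbers above appear consistent with the continuity-corrected McNemar statistic, (|b - c| - 1)^2 / (b + c), where b and c are the off-diagonal cells of each table. The following is a minimal sketch of that calculation using only the standard library; the function name `mcnemar_corrected` is made up for illustration, and it assumes the tables are laid out as [[a, b], [c, d]].

```python
from math import erfc, sqrt


def mcnemar_corrected(table):
    """Continuity-corrected McNemar test on a 2x2 table [[a, b], [c, d]].

    Only the discordant cells b and c enter the statistic.
    """
    b = table[0][1]
    c = table[1][0]
    if b + c == 0:
        # No discordant pairs: nothing to distinguish the classifiers.
        return 0.0, 1.0
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Upper tail of a chi-squared distribution with 1 degree of freedom:
    # P(X > x) = erfc(sqrt(x / 2))
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Two of the tables from the output above.
tables = {
    "Angry": [[1261, 53], [54, 72]],
    "Disgust": [[1239, 58], [37, 106]],
}
for name, table in tables.items():
    stat, p = mcnemar_corrected(table)
    print(f"{name}: chi-squared={stat:.4f}, p-value={p:.4f}")
```
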
Should I now take the average of these values, a weighted average, or do something else entirely?
Also, as you can see, only the class Disgust gives a statistically significant result (p ≈ 0.04); none of the other classes do.
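One alternative I can think of, sketched here only as a possibility, is to skip the per-class split and build a single 2x2 table over all samples, counting whether each classifier got the sample right regardless of class. The helper `correctness_table` and the toy labels below are hypothetical and not taken from my data; the cell layout (rows for the first classifier, columns for the second, index 0 meaning correct) is also an assumption.

```python
def correctness_table(y_true, pred_a, pred_b):
    """Build one 2x2 table over all samples, ignoring which class is involved.

    Rows: classifier A correct (0) / wrong (1).
    Columns: classifier B correct (0) / wrong (1).
    """
    table = [[0, 0], [0, 0]]
    for truth, a, b in zip(y_true, pred_a, pred_b):
        row = 0 if a == truth else 1
        col = 0 if b == truth else 1
        table[row][col] += 1
    return table


# Toy labels, purely illustrative.
y_true = ["Angry", "Calm", "Sad", "Happy", "Sad", "Calm"]
pred_a = ["Angry", "Calm", "Happy", "Happy", "Sad", "Sad"]
pred_b = ["Angry", "Sad", "Sad", "Happy", "Calm", "Sad"]

table = correctness_table(y_true, pred_a, pred_b)
print(table)
```

The resulting table would then go through McNemar's test once, giving a single p-value for the two classifiers instead of eight.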