
In contrast to questions like here, where slow SVM training results from a large number of samples, I only have around 500 samples. Still, a single training fold during cross-validation sometimes takes several minutes - 100x to 1000x longer than other folds. This only happens with kernel='poly' and fewer than 10 features (5 features being significantly worse than 10); with 12 or more features the problem disappears completely. Maybe I'm missing something, but why would such a low-dimensional feature space have such a negative impact with the poly kernel? Perhaps interesting to know: these super-slow runs take many iterations - if I set, for example, max_iter=int(1e6), I get results in about 10s, but with a warning about premature stopping.

*The input is normalized. The problem only occurs if gamma > 0.1, and especially if gamma > 1; however, that is not always the case: some runs work fine with gamma = 2, while others take forever. There is no consistency or run-to-run reproducibility.
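For reference, a minimal sketch of the setup described above. The data here is synthetic and purely hypothetical (the real dataset is ~500 samples with 5-12 features); only the SVC parameters (kernel='poly', a gamma in the problematic range, and the max_iter cap) come from the question.

```python
# Hypothetical reproduction sketch - synthetic stand-in data,
# not the actual dataset from the question.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))                # ~500 samples, 5 features
y = (X[:, 0] + X[:, 1] ** 2 > 0).astype(int)

X = StandardScaler().fit_transform(X)        # input is normalized

# Capping max_iter avoids the multi-minute runs, at the cost of a
# possible warning about stopping before full convergence.
clf = SVC(kernel='poly', gamma=2, max_iter=int(1e6))
clf.fit(X, y)
print(clf.score(X, y))
```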
