I have logs of activities without labels indicating whether an activity is normal or not. Assuming that normal behavior follows a Gaussian distribution, I fit a Gaussian distribution to the dataset and used it to generate a synthetic dataset containing abnormal patterns. Then, following the literature, I generated labels using confidence intervals. After generating the labels, I split the data into train/test sets, used the pdf of the Gaussian distribution as a feature, and trained a simple decision tree.
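To make the setup concrete, here is a minimal sketch of the pipeline as I understand it. All the names, distribution parameters, and the 99% interval are hypothetical stand-ins for my actual data and settings:

```python
import numpy as np
from scipy.stats import norm
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(0)

# Stand-in for the unlabeled activity logs (assumed normal behavior)
normal = rng.normal(loc=10.0, scale=2.0, size=5000)

# Fit a Gaussian to the observed data
mu, sigma = normal.mean(), normal.std(ddof=1)

# Synthetic abnormal samples, generated away from the fitted distribution
abnormal = rng.normal(loc=20.0, scale=2.0, size=250)
x = np.concatenate([normal, abnormal])

# Label by a confidence interval of the fitted Gaussian (1 = abnormal)
lo, hi = norm.interval(0.99, loc=mu, scale=sigma)
y = ((x < lo) | (x > hi)).astype(int)

# Use the Gaussian pdf value as the single feature
pdf = norm.pdf(x, loc=mu, scale=sigma).reshape(-1, 1)

X_train, X_test, y_train, y_test = train_test_split(
    pdf, y, test_size=0.3, random_state=0, stratify=y)

clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print(precision_score(y_test, y_pred), recall_score(y_test, y_pred))
```

Note that in this 1-D sketch the label is, by construction, a threshold on |x - mu|, and the pdf feature is a monotone function of the same quantity, which may be why the tree separates the classes so easily.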
Depending on the dataset, this approach sometimes achieves 99.5% precision and recall, i.e. almost perfect classification.
I feel like the synthetic data generation, the labelling, and the feature extraction are all related in some sense, so I am not sure how much I can rely on this trained classifier. If I did anything incorrectly, could anyone suggest how to build more confidence in the classifier?