Questions tagged [unsupervised-learning]

Finding hidden (statistical) structure in unlabelled data, including clustering and feature extraction for dimensionality reduction.

Because the items are unlabelled, there's nothing that points toward the "correct" labels, as there is with supervised learning. Unsupervised learning uses methods like clustering and principal components analysis to discover structure.
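As a rough illustration of what "discovering structure" means for clustering, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. The toy points, the `kmeans` helper name, and the naive "first k points" initialisation are assumptions made for this example only; in practice a library implementation such as scikit-learn's `KMeans`, with a better initialisation, would normally be used.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, max_iter=100):
    """Minimal Lloyd's algorithm; returns (labels, centroids).

    Illustrative only: centroids start at the first k points, which is
    a simplification chosen to keep the example deterministic.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Two well-separated groups of unlabelled 2-D points (made-up data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

No labels are supplied at any point: the grouping comes only from the distances between the points themselves.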

Reference:
Wikipedia - Unsupervised learning

439 questions
29
votes
1 answer

Word2Vec vs. Sentence2Vec vs. Doc2Vec

I recently came across the terms Word2Vec, Sentence2Vec and Doc2Vec and am kind of confused, as I am new to vector semantics. Can someone please explain the differences between these methods in simple words? What are the most suitable tasks for each…
25
votes
2 answers

What kinds of learning problems are suitable for Support Vector Machines?

What are the hallmarks or properties that indicate that a certain learning problem can be tackled using support vector machines? In other words, what is it that, when you see a learning problem, makes you go "oh I should definitely use SVMs for…
17
votes
3 answers

Intuition Behind Restricted Boltzmann Machine (RBM)

I went through Geoff Hinton's Neural Networks course on Coursera and also through an introduction to restricted Boltzmann machines, but I still don't understand the intuition behind RBMs. Why do we need to compute energy in this machine? And what is the…
Born2Code
15
votes
4 answers

How word2vec can be used to identify unseen words and relate them to already trained data

I was working on a word2vec gensim model and found it really interesting. I am interested in finding out how an unknown/unseen word, when checked against the model, will be able to get similar terms from the trained model. Is this possible? Can word2vec be…
gaurus
13
votes
3 answers

How can autoencoders be used for clustering?

Suppose I have a set of time-domain signals with absolutely no labels. I want to cluster them in 2 or 3 classes. Autoencoders are unsupervised networks that learn to compress the inputs. So given an input $x^{(i)}$, weights $W_1$ and $W_2$, biases…
13
votes
3 answers

How to use GAN for unsupervised feature extraction from images?

I understand how a GAN works, with two networks (generative and discriminative) competing with each other. I have built a DCGAN (GAN with convolutional discriminator and de-convolutional generator) which now successfully generates handwritten…
exAres
12
votes
1 answer

What is the difference between topic modeling and clustering?

I know that topic modeling and clustering are related but not identical techniques. Can anyone explain what the main differences are?
sara
12
votes
2 answers

Does it make sense to train a CNN as an autoencoder?

I work on analyzing EEG data, which will eventually need to be classified. However, obtaining labels for the recordings is somewhat expensive, which has led me to consider unsupervised approaches, to better utilize our quite large amounts of…
12
votes
2 answers

Clustering high dimensional data

TL;DR: Given a big image dataset (around 36 GiB of raw pixels) of unlabeled data, how can I cluster the images (based on the pixel values) without knowing the number of clusters K to begin with? I am currently working on an unsupervised learning…
sunside
10
votes
3 answers

Isolation forest sklearn contamination param

I am working on an unsupervised anomaly detection task on time series data using an isolation forest algorithm. I am developing it in Python, more specifically with scikit-learn. I have found a lot of examples of this, but what is not very clear is how to…
10
votes
1 answer

Robustness of ML Model in question

While trying to emulate an ML model similar to the one described in this paper, I seemed to eventually get good clustering results on some sample data after a bit of tweaking. By "good" results, I mean that each observation was put in a cluster with…
10
votes
1 answer

Confused about how to apply KMeans on my dataset with features extracted

I am trying to make basic use of the scikit-learn KMeans clustering package to create different clusters that I could use to identify a certain activity. For example, in my dataset below, I have different usage events (0,...,11), and each event…
Gary
9
votes
1 answer

Gaussian Mixture Models as a classifier?

I'm learning the GMM clustering algorithm. I don't understand how it can be used as a classifier. Here are my thoughts: 1) GMM is an unsupervised ML algorithm. At least that's how sklearn categorizes it. 2) Unsupervised methods can cluster data, but…
8
votes
6 answers

Clustering algorithms for high dimensional binary sparse data

I have a dataset with 10,000 genes like below:

person  gene1  gene2  ...  gene10000  ethnic
1       0      1      ...  1          asian
2       1      0      ...  1          European

Each row indicates whether a person has a gene in their DNA or not. We are…
8
votes
1 answer

Ideas for prospect scoring model

I have to think about a model to identify prospects (companies) that have a high chance of being converted into clients, and I'm looking for advice on what kind of model could be of use. The databases I will have are, as far as I know (I don't have…