9

I have used the t-SNE algorithm to visualize my high-dimensional data. However, I was wondering whether it is also a practical method for inference.

Shayan Shafiq
smw

4 Answers

2

t-SNE is a dimensionality reduction algorithm, not an inference method. Inference is the problem of determining the values, or labels, that best fit a given input once the model parameters have been learned, or estimated.

Emre
1

Answering late, so here is just a sketch of a strategy.

I think you can also use t-SNE for semi-supervised classification.

It might only work if you have few labels to predict, and if you have well-separated clusters with simple decision boundaries.

Determine the centroids of the clusters that t-SNE gave you, and then use a procedure similar to a nearest-neighbor search to classify new data instances according to their distance to the cluster centroids.
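The strategy above could be sketched roughly as follows. This is only an illustration under several assumptions that are not part of the answer: the data is synthetic (`make_blobs`), clusters in the embedding are found with k-means, and the centroids are computed in the original feature space, because scikit-learn's `TSNE` only offers `fit_transform` and cannot place unseen points into an existing embedding.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE

# Synthetic, well-separated data (an assumption made for illustration only).
X, y = make_blobs(n_samples=300, centers=3, n_features=10, random_state=0)
X_train, X_new = X[:250], X[250:]
y_train = y[:250]

# 1. Embed the training data and find clusters in the 2D embedding.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X_train)
n_clusters = 3
cluster_ids = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(embedding)

# 2. Compute each cluster's centroid in the ORIGINAL feature space,
#    since new points cannot be projected into the fitted embedding.
centroids = np.stack(
    [X_train[cluster_ids == k].mean(axis=0) for k in range(n_clusters)]
)

# 3. Semi-supervised step: pretend only every 10th training label is known
#    and name each cluster by the majority label among its labeled members.
labeled_idx = np.arange(0, len(X_train), 10)
cluster_to_label = {}
for k in range(n_clusters):
    members = labeled_idx[cluster_ids[labeled_idx] == k]
    # -1 marks a cluster that happens to contain no labeled point.
    cluster_to_label[k] = np.bincount(y_train[members]).argmax() if len(members) else -1

def classify(points):
    """Assign each point the label of its nearest cluster centroid."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return np.array([cluster_to_label[k] for k in dists.argmin(axis=1)])

predictions = classify(X_new)
print(predictions[:10])
```

As the answer notes, this is only likely to behave well when the clusters are cleanly separated; with overlapping clusters the k-means step and the centroid distances both become unreliable.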

knb
1

If by 'inference' you mean cluster analysis, I have a trick that may be helpful: mapping the input properties back onto the t-SNE output.

For example, let's say you apply t-SNE to a customer data set, and it outputs a few cleanly separated clusters. Since you know where each individual customer lies on the t-SNE plot, you can identify the customers in each t-SNE cluster. You can then look up the attributes that describe those customers and get a sense of the customer characteristics of each cluster.

This trick can help you validate whether the t-SNE output makes business sense before you apply more interpretable clustering algorithms.
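A minimal sketch of that trick, assuming a made-up customer table (the column names `age`, `annual_spend`, and `visits_per_month` are hypothetical) and assuming the clusters in the t-SNE plot are picked out with k-means rather than by eye:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Hypothetical customer data with two obvious segments (illustration only).
rng = np.random.default_rng(0)
customers = pd.DataFrame({
    "age": np.concatenate([rng.normal(25, 3, 100), rng.normal(55, 4, 100)]),
    "annual_spend": np.concatenate([rng.normal(500, 50, 100), rng.normal(3000, 200, 100)]),
    "visits_per_month": np.concatenate([rng.normal(8, 1, 100), rng.normal(2, 0.5, 100)]),
})
feature_cols = list(customers.columns)

# Scale the features so no single column dominates the distances, then embed.
scaled = StandardScaler().fit_transform(customers[feature_cols])
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(scaled)

# Tag every customer with the cluster it falls into on the t-SNE plot.
customers["tsne_cluster"] = KMeans(
    n_clusters=2, n_init=10, random_state=0
).fit_predict(embedding)

# Map the input properties back onto the clusters: per-cluster averages and sizes.
profile = customers.groupby("tsne_cluster")[feature_cols].mean()
sizes = customers["tsne_cluster"].value_counts().sort_index()
print(profile)
print(sizes)
```

If the per-cluster averages describe recognizable customer segments, that is some evidence the t-SNE picture reflects real structure rather than an artifact of the embedding.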

0

I quote Hands-On Machine Learning with Scikit-Learn and TensorFlow:

"t-SNE reduces dimensionality while trying to keep similar instances close and dissimilar instances apart. It is mostly used for visualization, in particular to visualize clusters of instances in high-dimensional space (e.g., to visualize the MNIST images in 2D)."

When visualizing your data, you also have to take the curse of dimensionality into consideration: https://en.wikipedia.org/wiki/Curse_of_dimensionality

Carlos Mougan