UMAP videos

Clustering Data

UMAP and t-SNE reduce dimensionality while preserving much of the topology of the data. The trade-off is that they are much slower than PCA or SVD. To speed them up you can use NVIDIA's RAPIDS libraries (more in Kaggle Book p. 338 43%).

Vincent is a big fan of UMAP because it largely preserves the distances between points, which is extremely useful for clustering use cases. See his explanation here.

How to use UMAP properly?

Both UMAP and t-SNE have to be used carefully, because it is easy to see clusters and patterns where there are none. The following articles offer advice on using both techniques properly:

UMAP Parameters