Algorithms

<aside> 💡

Don’t mix clstering with Dimensionality Reduction !

</aside>

K-means clustering : This is a centroid-based algorithm, where the goal is to minimize the sum of distances between points and their respective cluster centroid.
Hierarchical Clustering : This method creates a tree of clusters. It is subdivided into Agglomerative (bottom-up approach) and Divisive (top-down approach).
DBSCAN (Density-Based Spatial Clustering of Applications with Noise): This algorithm defines clusters as areas of high density separated by areas of low density.
Mean Shift Clustering: It is a centroid-based algorithm, which updates candidates for centroids to be the mean of points within a given region.
Gaussian Mixture Models (GMM): This method uses a probabilistic model to represent the presence of subpopulations within an overall population without requiring to assign each data point to a cluster.
Spectral Clustering: It uses the eigenvalues of a similarity matrix to reduce dimensionality before applying a clustering algorithm, typically K-means.
OPTICS (Ordering Points To Identify the Clustering Structure): Similar to DBSCAN, but creates a reachability plot to determine clustering structure.
Affinity Propagation: It sends messages between pairs of samples until a set of exemplars and corresponding clusters gradually emerges.
BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies): Designed for large datasets, it incrementally and dynamically clusters incoming multi-dimensional metric data points.
CURE (Clustering Using Representatives): It identifies clusters by shrinking each cluster to a certain number of representative points rather than the centroid.