Kmeans Algorithm
Steps:
- Given a dataset of size N, randomly initialize k centroids if you feel the dataset can be clustered into k different clusters
\(C = \Set{c_1,..,c_K}\)  | Set of K Centroids 
\(S = \Set{s_1,..,s_K}\)  | Set of K Clusters 
Intra-cluster variance is the objective function which has to be minimized in k-means algorithm:
    
- 
Class assignment  
- 
Update centroids  
- 
Repeat the process until the cluster centroids change any further, i.e. repeat until convergence 
Complexity in one iteration: - k n t k = K clustures | n = No of samples | t = time taken
It is sensitive to the randomly chosen centroids
Clustering performance
Silhouette Coefficient
a : the mean distance between a datapoint and the other points from the same cluster
b : the mean distance between a datapoint and all the other points in the next nearest cluster.