gap statistic

“A method of heuristically selecting the number of means (clusters) to use when clustering data. The number of clusters begins at K=1 and the total within-cluster variance is computed. As K is increased, this value drops. Plotting the total value against K often reveals a break point, presumably indicating the natural number of clusters to be used. Various gap statistics have been defined to formalize this change and highlight the number of clusters to choose for the K-means process.”
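As a rough illustration of the procedure described above, the following Python sketch computes the total within-cluster sum of squares W_k for K = 1, 2, ... and compares its logarithm with the average obtained on uniformly random reference data, in the spirit of the gap statistic of Tibshirani, Walther and Hastie: Gap(K) = mean(log W_K on reference data) - log W_K. Everything named here is an assumption made for the example and not part of the definition: the helper names (`kmeans_wk`, `gap_statistic`, `make_blobs`), the farthest-first initialisation, the bounding-box reference distribution, and the synthetic three-cluster data.

```python
import math
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_wk(points, k, rng, iters=50):
    """Run a basic Lloyd's k-means and return the total within-cluster
    sum of squared distances to the cluster centers (W_k)."""
    # Farthest-first initialisation: an assumption chosen for stability,
    # not something the glossary definition prescribes.
    centers = [rng.choice(points)]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        # Recompute each center as its cluster mean; keep the old center
        # if a cluster happens to be empty.
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def gap_statistic(points, k_max, n_refs=10, seed=0):
    """Return {K: Gap(K)} using uniform reference data drawn from the
    bounding box of the input points (one common choice of reference)."""
    rng = random.Random(seed)
    lows = [min(col) for col in zip(*points)]
    highs = [max(col) for col in zip(*points)]
    gaps = {}
    for k in range(1, k_max + 1):
        log_wk = math.log(kmeans_wk(points, k, rng))
        ref_logs = []
        for _ in range(n_refs):
            ref = [
                tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
                for _ in points
            ]
            ref_logs.append(math.log(kmeans_wk(ref, k, rng)))
        gaps[k] = sum(ref_logs) / n_refs - log_wk
    return gaps


def make_blobs(seed=1):
    """Hypothetical test data: three well-separated 2-D Gaussian blobs."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
    return [
        (rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
        for cx, cy in centers
        for _ in range(30)
    ]


if __name__ == "__main__":
    data = make_blobs()
    for k, gap in gap_statistic(data, k_max=5).items():
        print(k, round(gap, 3))
```

On data with clear cluster structure, the gap value would be expected to jump at the true number of clusters, which is the formal counterpart of the "break point" in the plot of W_k against K. The original method also uses the standard deviation of the reference values to pick the smallest suitable K; that step is omitted here to keep the sketch short.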

February 1st, 2019