Differences
This shows you the differences between two versions of the page.
other:inspect3d:documentation:knowledge_discovery:k-means_clustering [2024/12/20 16:06] – wikisysop | other:inspect3d:documentation:knowledge_discovery:k-means_clustering [2024/12/20 16:08] (current) – wikisysop | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== K-Means Clustering ====== | ====== K-Means Clustering ====== | ||
- | The k-means clustering algorithm is a commonly used method for grouping //n// individual data points into //k// clusters. is a multi-variate statistical analysis that reduces the high-dimensional matrix of correlated, time-varying signals into a low-dimensional and statistically uncorrelated set of principal components (PCs). These PCs explain the variance found in the original signals and represent the most important features of the data, e.g., the overall magnitude or the shape of the time series at a particular point in the stride cycle. The value of each particular subject’s score for the individual PCs represents how strongly that feature was present in the data. | + | The k-means clustering algorithm is a commonly used method for grouping //n// individual data points into //k// clusters. is a multi-variate statistical analysis that reduces the high-dimensional matrix of correlated, time-varying signals into a low-dimensional and statistically uncorrelated set of sift: |
==== The utility of clustering ==== | ==== The utility of clustering ==== | ||
Line 11: | Line 11: | ||
Inspect3D allows users to apply the k-means clustering algorithm to the results of PCA. The dimensionality of the data space is the number of principal components and the user specifies the number of clusters to be found - this is the parameter //k//. | Inspect3D allows users to apply the k-means clustering algorithm to the results of PCA. The dimensionality of the data space is the number of principal components and the user specifies the number of clusters to be found - this is the parameter //k//. | ||
- | * In the {{: | + | * In the {{: |
* Selecting {{: | * Selecting {{: | ||
* The **K-Means** tab allows the user to specify parameter values for the algorithm and then run it on PCA results. | * The **K-Means** tab allows the user to specify parameter values for the algorithm and then run it on PCA results. | ||
Line 23: | Line 23: | ||
**Abstract** | **Abstract** | ||
+ | |||
The k-means method is a widely used clustering technique that seeks to minimize the average squared distance between points in the same cluster. Although it offers no accuracy guarantees, its simplicity and speed are very appealing in practice. By augmenting k-means with a simple, randomized seeding technique, we obtain an algorithm that is O(log k)-competitive with the optimal clustering. Experiments show our augmentation improves both the speed and the accuracy of k-means, often quite dramatically. | The k-means method is a widely used clustering technique that seeks to minimize the average squared distance between points in the same cluster. Although it offers no accuracy guarantees, its simplicity and speed are very appealing in practice. By augmenting k-means with a simple, randomized seeding technique, we obtain an algorithm that is O(log k)-competitive with the optimal clustering. Experiments show our augmentation improves both the speed and the accuracy of k-means, often quite dramatically. | ||
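
The seeding idea in the abstract, and the step of clustering PCA results with a user-chosen //k//, can be illustrated with a minimal NumPy sketch. It is not Inspect3D's implementation: the function names `kmeans_pp_seeds` and `kmeans`, the `max_iter` and `seed` parameters, and the synthetic `scores` array (standing in for a subjects-by-principal-components score matrix) are all assumptions made for illustration. The sketch also assumes the data contain at least //k// distinct points.

```python
import numpy as np


def kmeans_pp_seeds(points, k, rng):
    """Pick k initial centres by distance-squared-weighted random seeding."""
    n = points.shape[0]
    # First centre: chosen uniformly at random from the data.
    centres = [points[rng.integers(n)]]
    # Squared distance from each point to its nearest centre chosen so far.
    d2 = np.sum((points - centres[0]) ** 2, axis=1)
    for _ in range(1, k):
        # Next centre: sampled with probability proportional to d2.
        idx = rng.choice(n, p=d2 / d2.sum())
        centres.append(points[idx])
        d2 = np.minimum(d2, np.sum((points - points[idx]) ** 2, axis=1))
    return np.array(centres)


def kmeans(points, k, max_iter=100, seed=0):
    """Standard assign/update iterations starting from the seeds above."""
    rng = np.random.default_rng(seed)
    centres = kmeans_pp_seeds(points, k, rng)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centre.
        dists = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centre moves to the mean of its members
        # (an empty cluster keeps its previous centre).
        new_centres = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels, centres


# Hypothetical PC scores: 40 subjects, 2 principal components, two groups.
rng = np.random.default_rng(1)
scores = np.vstack([
    rng.normal(loc=(-5.0, 0.0), scale=0.5, size=(20, 2)),
    rng.normal(loc=(5.0, 0.0), scale=0.5, size=(20, 2)),
])
labels, centres = kmeans(scores, k=2)
```

Here each row of `scores` is one subject and each column one principal component, so the dimensionality of the clustered space equals the number of components retained. Which integer label each group receives depends on the random seeding, so only the grouping itself is meaningful.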
other/inspect3d/documentation/knowledge_discovery/k-means_clustering.1734710815.txt.gz · Last modified: 2024/12/20 16:06 by wikisysop