Mahalanobis Distance and SPE Dialog
The Mahalanobis Distance and the Squared Prediction Error (SPE) are common measures used to identify outliers in a data sample.
- The Mahalanobis distance can be conceptualized as the distance from a point to the centroid of a data set, taking into account correlations in the data set. The Mahalanobis distance method can be used on PCA results by measuring the distance of each point to the centroid in the transformed PCA space.
- The SPE can be understood as the (squared) distance from an original data point to its PCA-reduced reconstruction (the true measurement versus the model prediction), i.e. the distance from the original point to its projection onto the PCA hyperplane.
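The two measures described above can be sketched in a few lines of NumPy. This is only an illustration of the general technique, not Sift's implementation: the function name `pca_outlier_measures`, the use of a single centroid over all rows, and the demo data are assumptions made for this example.

```python
import numpy as np

def pca_outlier_measures(X, n_pcs):
    """Return (squared Mahalanobis distance in PC space, SPE) per row of X.

    Hypothetical helper for illustration; assumes one centroid for all rows.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center the data
    # Rows of Vt are the principal directions of the centered data.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_pcs].T                             # loadings, shape (features, n_pcs)
    scores = Xc @ P                              # coordinates in PCA space
    var = s[:n_pcs] ** 2 / (X.shape[0] - 1)      # variance of each retained PC
    # PC scores are uncorrelated with zero mean, so the Mahalanobis distance
    # to the centroid reduces to a variance-scaled sum of squares.
    d2 = np.sum(scores ** 2 / var, axis=1)
    # SPE: squared length of the part of each point the PCs do not reconstruct.
    residual = Xc - scores @ P.T
    spe = np.sum(residual ** 2, axis=1)
    return d2, spe

# Demo data (made up): five points on a line, one far along it, one off it.
X = np.array([[-2, -2], [-1, -1], [0, 0], [1, 1], [2, 2], [10, 10], [2, -2]])
d2, spe = pca_outlier_measures(X, n_pcs=1)
print("largest Mahalanobis distance at row", int(np.argmax(d2)))
print("largest SPE at row", int(np.argmax(spe)))
```

In this demo the point that lies far along the main trend stands out in Mahalanobis distance, while the point that lies off the trend stands out in SPE, which is why the two measures complement each other.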
As covered in our documentation on Outlier Detection Methods, the Mahalanobis Distance and SPE complement each other well, so we combined them into a single dialog, making both methods easier to use.
The Mahalanobis Distance and SPE dialog is found on the toolbar and under 'Outlier Detecting Using PCA' in the Analysis menu.
Dialog
- Grouping to Search: The kind of grouping used to determine the centroid: Combined Groups, Groups, or Workspaces.
- Auto-exclude results: If checked, any outliers found will automatically be removed.
- Number of Passes: How many times the test is run. Removing an outlier may alter the centroid, exposing more outliers.
- Find All Outliers: If checked, the Number of Passes parameter is ignored and the test is run until no more outliers are found.
- Determine Number of PCs Using Variance Explained: If checked, PCs Variance Explained is displayed instead of Number of PCs.
- Number of PCs: How many principal components are considered in the test.
- PCs Variance Explained: Instead of selecting the number of PCs directly, select the amount of variance the PCs must explain.
- Outlier alpha value: The threshold used to determine whether a point is an outlier.
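How these parameters interact can be illustrated with a minimal sketch. Several details below are assumptions, because this page does not specify them: the alpha value is turned into a cutoff through an approximate chi-square quantile (Wilson-Hilferty), the variance explained is given as a fraction between 0 and 1, and each pass recomputes the centroid and covariance from the points not yet flagged. The function names are hypothetical and are not part of Sift.

```python
import numpy as np
from statistics import NormalDist

def n_pcs_for_variance(explained_variance, target):
    """Smallest number of PCs whose cumulative variance fraction reaches target."""
    ev = np.asarray(explained_variance, dtype=float)
    ratio = np.cumsum(ev) / ev.sum()
    # Small tolerance so a target equal to a cumulative value is counted as reached.
    return min(int(np.searchsorted(ratio, target - 1e-12)) + 1, len(ratio))

def chi2_cutoff(alpha, k):
    """Approximate upper-alpha chi-square quantile with k degrees of freedom.

    Wilson-Hilferty approximation; an assumed way to map alpha to a cutoff.
    """
    z = NormalDist().inv_cdf(1.0 - alpha)
    c = 2.0 / (9.0 * k)
    return k * (1.0 - c + z * np.sqrt(c)) ** 3

def mahalanobis_passes(scores, alpha=0.05, n_passes=1, find_all=False):
    """Flag outliers in PC scores (rows = points, columns = PCs) over several passes."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    cutoff = chi2_cutoff(alpha, k)
    keep = np.ones(len(scores), dtype=bool)
    done = 0
    while find_all or done < n_passes:
        done += 1
        if keep.sum() <= k + 1:          # too few points left for a covariance
            break
        kept = scores[keep]
        centroid = kept.mean(axis=0)     # centroid of the points still included
        cov = np.atleast_2d(np.cov(kept, rowvar=False))
        diff = scores - centroid
        d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        new = keep & (d2 > cutoff)       # newly exposed outliers in this pass
        if not new.any():
            break
        keep &= ~new
    return ~keep                         # True where a point was flagged

# Demo scores (made up): a tight cluster, a moderate outlier and an extreme one.
scores = np.array([-1, -0.75, -0.5, -0.25, 0, 0.25, 0.5, 0.75, 1, 3, 20.0]).reshape(-1, 1)
print(np.flatnonzero(mahalanobis_passes(scores, n_passes=1)))
print(np.flatnonzero(mahalanobis_passes(scores, find_all=True)))
```

In the demo, the extreme point inflates the centroid and spread during the first pass and so masks the moderate one. A further pass, or running until no outliers remain, exposes it.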
SPE
Since the Squared Prediction Error compares a single predicted point to its original value, some of the parameters in the dialog do not apply. These are:
- Grouping to Search: SPE is not grouped.
- Number of Passes/Find All Outliers: Removing one outlier does not affect whether another point's SPE is an outlier.
Results
The Mahalanobis Distance and SPE results appear upon completion of the test.
sift/principal_component_analysis/mahalanobis_distance_and_spe_dialog.txt · Last modified: 2024/11/15 15:22 by wikisysop