Sift - Mahalanobis Distance/SPE Dialog

Mahalanobis Distance and Squared Prediction Error (SPE) are common measures used to detect outliers in a data sample. The Mahalanobis distance can be thought of as the distance from a point to the centroid of a data set, taking the correlations in the data set into account. It can be applied to PCA results by measuring the distance from each point to the centroid in the transformed PCA space. The SPE, in contrast, can be understood as the (squared) distance between an original data point and its PCA-reduced reconstruction (the model prediction versus the actual measurement), i.e. the distance from the original point to its projection onto the PCA hyperplane.
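The two measures can be illustrated with a minimal NumPy sketch. This is not Sift's implementation; the function name `pca_outlier_scores` and its parameters are hypothetical, and the sketch assumes each row of `X` is one observation and that the PCA model is fitted by SVD on the mean-centred data.

```python
import numpy as np

def pca_outlier_scores(X, n_pcs):
    """Return (squared Mahalanobis distance in PCA space, SPE) for each row of X.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # centre the data on its centroid
    # SVD of the centred data: the rows of Vt are the principal directions
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_pcs].T                  # shape (n_features, n_pcs)
    scores = Xc @ loadings                   # coordinates in PCA space
    variances = s[:n_pcs] ** 2 / (len(X) - 1)  # variance of each retained PC
    # Squared Mahalanobis distance to the centroid, measured in PCA space
    d2 = np.sum(scores ** 2 / variances, axis=1)
    # SPE: squared distance from each point to its projection onto the PCA hyperplane
    residual = Xc - scores @ loadings.T
    spe = np.sum(residual ** 2, axis=1)
    return d2, spe
```

In this sketch a point far from the centroid but close to the PCA hyperplane gets a large Mahalanobis distance and a small SPE, while a point that lies off the hyperplane gets a large SPE regardless of its distance to the centroid. The retained variances must be non-zero for the division to be valid.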

As covered in our documentation about Outlier Detection Methods, Mahalanobis Distance and SPE complement each other well, so we combined them into a single dialog to make both methods easier to use.

The Mahalanobis Distance and SPE dialog is found on the toolbar and under 'Outlier Detecting Using PCA' in the Analysis menu.

Dialog

  • Grouping to Search: The kind of grouping used to determine the centroid: Combined Groups, Groups, or Workspaces
  • Auto-exclude results: If checked, any outliers that are found are automatically removed
  • Number of Passes: How many times the test is run. Removing an outlier may shift the centroid and expose more outliers
  • Find All Outliers: If checked, the Number of Passes parameter is ignored and the test is repeated until no more outliers are found
  • Determine Number of PCs Using Variance Explained: If checked, PCs Variance Explained is displayed instead of Number of PCs
  • Number of PCs: How many principal components are considered for the test
  • PCs Variance Explained: Instead of selecting the number of PCs directly, select the amount of variance that the PCs must explain
  • Outlier alpha value: The threshold used to determine an outlier
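How the pass-related and PC-related parameters interact can be sketched in Python with NumPy. This is an illustrative sketch under stated assumptions, not Sift's implementation: all function and parameter names are hypothetical, and because the text does not say how the alpha value is converted into a distance threshold, the sketch takes a ready-made `cutoff` on the squared Mahalanobis distance instead.

```python
import numpy as np

def n_pcs_for_variance(X, variance_explained):
    """Smallest number of PCs whose cumulative explained variance (0 to 1) reaches the target."""
    X = np.asarray(X, dtype=float)
    s = np.linalg.svd(X - X.mean(axis=0), compute_uv=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(cumulative, variance_explained)) + 1
    return min(k, len(s))

def mahalanobis_sq_in_pca_space(X, n_pcs):
    """Squared Mahalanobis distance of each row to the centroid, in PCA space."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_pcs].T
    variances = s[:n_pcs] ** 2 / (len(X) - 1)
    return np.sum(scores ** 2 / variances, axis=1)

def find_outliers(X, n_pcs, cutoff, number_of_passes=1, find_all=False):
    """Repeat the test, removing flagged rows between passes.

    `cutoff` is an assumed threshold on the squared distance; it stands in
    for whatever the dialog derives from the outlier alpha value.
    Returns (original row indices of outliers, number of passes run).
    """
    X = np.asarray(X, dtype=float)
    remaining = np.arange(len(X))        # original indices still in the test
    outliers = []
    passes_run = 0
    while find_all or passes_run < number_of_passes:
        if len(remaining) < n_pcs + 2:   # too few rows left to fit the model
            break
        d2 = mahalanobis_sq_in_pca_space(X[remaining], n_pcs)
        flagged = d2 > cutoff
        passes_run += 1
        if not flagged.any():            # nothing new found: stop early
            break
        outliers.extend(remaining[flagged].tolist())
        remaining = remaining[~flagged]  # the centroid is recomputed without them
    return outliers, passes_run
```

With `find_all=True` the loop ignores `number_of_passes` and stops only when a pass finds nothing, mirroring the Find All Outliers option. Each pass refits the centroid and the PCs on the remaining rows, which is why removing one outlier can expose another.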

SPE

Because Squared Prediction Error compares a single predicted point to its original value, many of the parameters in the dialog do not apply. The only parameters relevant to SPE are:

  • Auto-exclude results: If checked, any outliers that are found are automatically removed
  • Number of PCs: How many principal components should be considered for the test
  • Outlier alpha value: The threshold used to determine an outlier

Results

The Mahalanobis Distance and SPE results appear upon completion of the test.
