Package dev.nm.stat.factor.pca
Class PCAbySVD
- java.lang.Object
  - dev.nm.stat.factor.pca.PCAbySVD
All Implemented Interfaces: PCA
public class PCAbySVD extends Object
This class performs Principal Component Analysis (PCA) on a data matrix using Singular Value Decomposition (SVD), the preferred method. PCA essentially rotates the set of points around their mean so that the axes align with the principal components. This orthogonal transformation moves as much of the variance as possible into the first few dimensions. The values in the remaining dimensions therefore tend to be small and may be dropped with minimal loss of information. The equivalent R function is prcomp.
See Also:
- K. V. Mardia, J. T. Kent and J. M. Bibby, "Multivariate Analysis," London, Academic Press, 1979.
- W. N. Venables and B. D. Ripley, "Modern Applied Statistics with S," New York, Springer-Verlag, 2002.
- Wikipedia: Principal component analysis
-
-
Constructor Summary
Constructors
- PCAbySVD(Matrix data): Performs Principal Component Analysis, using the preferred SVD method, on a centered and scaled data matrix.
- PCAbySVD(Matrix data, boolean centered, boolean scaled): Performs Principal Component Analysis, using the preferred SVD method, on a data matrix (possibly centered and/or scaled).
- PCAbySVD(Matrix data, Vector mean, Vector scale): Performs Principal Component Analysis, using the preferred SVD method, on a data matrix with (optional) mean vector and scaling vector provided.
-
Method Summary
Methods (return type, method, description)
- DenseVector cumulativeProportionVar(): Gets the cumulative proportion of overall variance explained by the principal components.
- ImmutableMatrix data(): Gets the original data matrix.
- Vector loading(int i): Gets the loading vector of the i-th principal component.
- Matrix loadings(): Gets the matrix of variable loadings.
- Vector mean(): Gets the sample means that were subtracted.
- int nFactors(): Gets the number of variables in the original data.
- int nObs(): Gets the number of observations in the original data; sample size.
- Vector proportionVar(): Gets the proportion of overall variance explained by each of the principal components.
- double proportionVar(int i): Gets the proportion of overall variance explained by the i-th principal component.
- Vector scale(): Gets the scalings applied to each variable.
- Matrix scores(): Gets the scores of supplied data on the principal components.
- double sdPrincipalComponent(int i): Gets the standard deviation of the i-th principal component.
- DenseVector sdPrincipalComponents(): Gets the standard deviations of the principal components (i.e., the square roots of the eigenvalues of the correlation (or covariance) matrix, though the calculation is actually done with the singular values of the data matrix).
- SVD svd(): Gets the Singular Value Decomposition (SVD) of matrix X.
- Matrix X(): Gets the (possibly centered and/or scaled) data matrix X used for the PCA.
-
-
-
Constructor Detail
-
PCAbySVD
public PCAbySVD(Matrix data, Vector mean, Vector scale)
Performs Principal Component Analysis, using the preferred SVD method, on a data matrix with (optional) mean vector and scaling vector provided.
Parameters:
- data: a matrix that represents the original data
- mean: an optional mean vector (of length equal to nFactors) to be subtracted regardless of the flag centered
- scale: an optional scaling vector (of length equal to nFactors) by which the variables are divided regardless of the flag scaled
-
PCAbySVD
public PCAbySVD(Matrix data, boolean centered, boolean scaled)
Performs Principal Component Analysis, using the preferred SVD method, on a data matrix (possibly centered and/or scaled).
Parameters:
- data: a matrix that represents the original data
- centered: a logical value indicating whether the variables should be shifted to be zero-centered
- scaled: a logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place (N.B. scaling is generally advisable; however, it should only be used if no variable is constant)
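The effect of the two flags can be sketched in plain Java. This is only an illustration under stated assumptions, not the library's implementation: the class ColumnStandardizer and its methods are hypothetical, the data is a plain double[][] with one row per observation, and scaling always divides by the sample standard deviation about the column mean (divisor n - 1), which may differ from what the library does when centering is switched off. It also shows why a constant variable is a problem: its standard deviation is zero, so it cannot be scaled to unit variance.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of the library): centers and scales the columns of a data matrix. */
final class ColumnStandardizer {

    /** Returns the mean of each column (rows are observations, columns are variables). */
    static double[] columnMeans(double[][] data) {
        int n = data.length, p = data[0].length;
        double[] mean = new double[p];
        for (double[] row : data) {
            for (int j = 0; j < p; j++) {
                mean[j] += row[j] / n;
            }
        }
        return mean;
    }

    /** Returns the sample standard deviation (divisor n - 1) of each column. */
    static double[] columnSds(double[][] data, double[] mean) {
        int n = data.length, p = data[0].length;
        double[] sd = new double[p];
        for (double[] row : data) {
            for (int j = 0; j < p; j++) {
                double d = row[j] - mean[j];
                sd[j] += d * d;
            }
        }
        for (int j = 0; j < p; j++) {
            sd[j] = Math.sqrt(sd[j] / (n - 1));
        }
        return sd;
    }

    /** Returns a copy of data with each column optionally centered and/or scaled. */
    static double[][] standardize(double[][] data, boolean centered, boolean scaled) {
        double[] mean = columnMeans(data);
        double[] sd = columnSds(data, mean);
        double[][] x = new double[data.length][];
        for (int i = 0; i < data.length; i++) {
            x[i] = data[i].clone();
            for (int j = 0; j < x[i].length; j++) {
                if (centered) {
                    x[i][j] -= mean[j];
                }
                if (scaled) {
                    if (sd[j] == 0.0) {
                        // A constant variable has zero variance and cannot be scaled.
                        throw new IllegalArgumentException("constant variable in column " + j);
                    }
                    x[i][j] /= sd[j];
                }
            }
        }
        return x;
    }

    public static void main(String[] args) {
        double[][] data = {{1, 10}, {2, 20}, {3, 30}};
        System.out.println(Arrays.deepToString(standardize(data, true, true)));
    }
}
```

After standardizing, both columns of the sample data carry the same values even though their original units differ by a factor of ten, which is the usual reason for scaling before a PCA.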
-
PCAbySVD
public PCAbySVD(Matrix data)
Performs Principal Component Analysis, using the preferred SVD method, on a centered and scaled data matrix.
Parameters:
- data: a matrix that represents the original data
-
-
Method Detail
-
mean
public Vector mean()
Description copied from interface: PCA
Gets the sample means that were subtracted.
-
scale
public Vector scale()
Description copied from interface: PCA
Gets the scalings applied to each variable.
-
svd
public SVD svd()
Gets the Singular Value Decomposition (SVD) of matrix X.
Returns:
- the Singular Value Decomposition (SVD) of matrix X
-
sdPrincipalComponents
public DenseVector sdPrincipalComponents()
Gets the standard deviations of the principal components (i.e., the square roots of the eigenvalues of the correlation (or covariance) matrix, though the calculation is actually done with the singular values of the data matrix).
Returns:
- the standard deviations of the principal components
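The link between singular values and standard deviations can be sketched as follows. The eigenvalues of the sample covariance matrix of X are the squared singular values of X divided by n - 1, so each standard deviation is a singular value divided by the square root of n - 1. The n - 1 divisor is an assumption borrowed from R's prcomp; this page does not state which divisor the library uses. The class SdFromSingularValues is hypothetical and is not part of the library.

```java
/** Hypothetical helper (not a library class): turns singular values of X into component standard deviations. */
final class SdFromSingularValues {

    /**
     * Divides each singular value by sqrt(nObs - 1).
     * The n - 1 divisor is an assumption (the convention of R's prcomp).
     */
    static double[] sdev(double[] singularValues, int nObs) {
        double divisor = Math.sqrt(Math.max(1, nObs - 1));
        double[] sd = new double[singularValues.length];
        for (int i = 0; i < sd.length; i++) {
            sd[i] = singularValues[i] / divisor;
        }
        return sd;
    }

    public static void main(String[] args) {
        // Five observations, singular values 4 and 2.
        double[] sd = sdev(new double[]{4.0, 2.0}, 5);
        System.out.println(sd[0] + ", " + sd[1]);
    }
}
```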
-
loadings
public Matrix loadings()
Description copied from interface: PCA
Gets the matrix of variable loadings. The signs of the columns of the loading matrix are arbitrary.
Returns:
- the matrix of variable loadings
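To see what the arbitrary sign means in practice, the sketch below computes scores as the product of X and the loading matrix, which is the standard PCA relationship, and then flips the sign of one loading column. Everything here is illustrative: the class ScoresSketch and its methods are hypothetical, plain arrays stand in for the library's Matrix type, and the page does not confirm how scores() is computed internally.

```java
import java.util.Arrays;

/** Hypothetical helper (not a library class): computes PCA scores as X times the loadings. */
final class ScoresSketch {

    /** Multiplies X (nObs x nFactors) by the loadings (nFactors x nComponents). */
    static double[][] scores(double[][] x, double[][] loadings) {
        int n = x.length, p = loadings.length, k = loadings[0].length;
        double[][] s = new double[n][k];
        for (int i = 0; i < n; i++) {
            for (int c = 0; c < k; c++) {
                for (int j = 0; j < p; j++) {
                    s[i][c] += x[i][j] * loadings[j][c];
                }
            }
        }
        return s;
    }

    /** Returns a copy of the loadings with the sign of one column flipped. */
    static double[][] flipColumn(double[][] loadings, int column) {
        double[][] flipped = new double[loadings.length][];
        for (int j = 0; j < loadings.length; j++) {
            flipped[j] = loadings[j].clone();
            flipped[j][column] = -flipped[j][column];
        }
        return flipped;
    }

    public static void main(String[] args) {
        double r = Math.sqrt(0.5);
        double[][] x = {{-1, -1}, {0, 0}, {1, 1}}; // already centered
        double[][] v = {{r, -r}, {r, r}};          // orthonormal loading columns
        System.out.println(Arrays.deepToString(scores(x, v)));
        System.out.println(Arrays.deepToString(scores(x, flipColumn(v, 0))));
    }
}
```

Flipping a loading column negates the corresponding score column and leaves its variance unchanged, so two correct PCA results may differ in sign column by column.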
-
data
public ImmutableMatrix data()
Gets the original data matrix.
Returns:
- the original data matrix
-
nObs
public int nObs()
Description copied from interface: PCA
Gets the number of observations in the original data; sample size.
-
nFactors
public int nFactors()
Description copied from interface: PCA
Gets the number of variables in the original data.
-
X
public Matrix X()
Description copied from interface: PCA
Gets the (possibly centered and/or scaled) data matrix X used for the PCA.
-
sdPrincipalComponent
public double sdPrincipalComponent(int i)
Description copied from interface: PCA
Gets the standard deviation of the i-th principal component.
Specified by:
- sdPrincipalComponent in interface PCA
Parameters:
- i: an index, counting from 1
Returns:
- the standard deviation of the i-th principal component
-
loading
public Vector loading(int i)
Description copied from interface: PCA
Gets the loading vector of the i-th principal component.
-
proportionVar
public Vector proportionVar()
Description copied from interface: PCA
Gets the proportion of overall variance explained by each of the principal components.
Specified by:
- proportionVar in interface PCA
Returns:
- the proportion of overall variance explained by each of the principal components
-
proportionVar
public double proportionVar(int i)
Description copied from interface: PCA
Gets the proportion of overall variance explained by the i-th principal component.
Specified by:
- proportionVar in interface PCA
Parameters:
- i: an index, counting from 1
Returns:
- the proportion of overall variance explained by the i-th principal component
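As a rough illustration of the quantity returned by the two proportionVar methods, the proportion for component i is its variance (the squared standard deviation) divided by the sum of the variances of all components. The class ProportionVarSketch below is hypothetical and works on a plain array of standard deviations; it is a sketch of the formula, not the library's code.

```java
/** Hypothetical helper (not a library class): proportion of variance from component standard deviations. */
final class ProportionVarSketch {

    /** Returns sdev[i]^2 / sum_j sdev[j]^2 for every component. */
    static double[] proportionVar(double[] sdev) {
        double total = 0.0;
        for (double s : sdev) {
            total += s * s;
        }
        double[] prop = new double[sdev.length];
        for (int i = 0; i < sdev.length; i++) {
            prop[i] = sdev[i] * sdev[i] / total;
        }
        return prop;
    }

    /** Same quantity for one component; i counts from 1, as in the documentation above. */
    static double proportionVar(double[] sdev, int i) {
        return proportionVar(sdev)[i - 1];
    }

    public static void main(String[] args) {
        double[] sdev = {2.0, 1.0}; // variances 4 and 1
        System.out.println(proportionVar(sdev, 1));
        System.out.println(proportionVar(sdev, 2));
    }
}
```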
-
cumulativeProportionVar
public DenseVector cumulativeProportionVar()
Description copied from interface: PCA
Gets the cumulative proportion of overall variance explained by the principal components.
Specified by:
- cumulativeProportionVar in interface PCA
Returns:
- the cumulative proportion of overall variance explained by the principal components
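The cumulative proportion is a running sum of the per-component proportions, and it is the usual basis for deciding how many trailing dimensions can be dropped with little loss of information. The sketch below assumes the proportions are already available as a plain array; the class CumulativeProportionSketch and the helper componentsNeeded are hypothetical and are not part of the library.

```java
/** Hypothetical helper (not a library class): cumulative proportion of variance. */
final class CumulativeProportionSketch {

    /** Returns the running sum of the per-component proportions. */
    static double[] cumulative(double[] proportionVar) {
        double[] cum = new double[proportionVar.length];
        double running = 0.0;
        for (int i = 0; i < proportionVar.length; i++) {
            running += proportionVar[i];
            cum[i] = running;
        }
        return cum;
    }

    /**
     * Returns the smallest number of leading components whose cumulative
     * proportion reaches the threshold, or all of them if it is never reached.
     */
    static int componentsNeeded(double[] proportionVar, double threshold) {
        double[] cum = cumulative(proportionVar);
        for (int k = 0; k < cum.length; k++) {
            if (cum[k] >= threshold) {
                return k + 1;
            }
        }
        return cum.length;
    }

    public static void main(String[] args) {
        double[] prop = {0.6, 0.3, 0.1};
        System.out.println(componentsNeeded(prop, 0.8));
    }
}
```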
-
-