Interface PCA

  • All Known Implementing Classes:
    PCAbyEigen, PCAbySVD

    public interface PCA
    Principal Component Analysis (PCA) is a mathematical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of uncorrelated variables called principal components. The number of principal components is less than or equal to the number of original variables. This transformation is defined in such a way that the first principal component has as high a variance as possible (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to (uncorrelated with) the preceding components. Principal components are guaranteed to be independent only if the data set is jointly normally distributed.
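The properties described above can be checked by hand for two variables, where the eigenvalues and eigenvectors of the covariance matrix have a closed form. The following is a minimal plain-Java sketch, not this library's implementation; the class name, the method names, and the sample data are all made up for illustration, and the eigenvector helper assumes the two variables have non-zero covariance.

```java
// Illustrative sketch only: closed-form PCA for two variables in plain Java.
// All names here are hypothetical and are not part of the PCA interface.
final class TwoVariablePcaSketch {

    /** Sample covariance of two equally long arrays (divisor n - 1). */
    static double cov(double[] a, double[] b) {
        int n = a.length;
        double meanA = 0, meanB = 0;
        for (int k = 0; k < n; k++) {
            meanA += a[k];
            meanB += b[k];
        }
        meanA /= n;
        meanB /= n;
        double sum = 0;
        for (int k = 0; k < n; k++) {
            sum += (a[k] - meanA) * (b[k] - meanB);
        }
        return sum / (n - 1);
    }

    /** Eigenvalues of the symmetric matrix [[a, b], [b, c]], largest first. */
    static double[] eigenvalues(double a, double b, double c) {
        double mid = (a + c) / 2;
        double radius = Math.hypot((a - c) / 2, b);
        return new double[] {mid + radius, mid - radius};
    }

    /** Unit eigenvector of [[a, b], [b, c]] for eigenvalue lambda; assumes b != 0. */
    static double[] eigenvector(double a, double b, double lambda) {
        double norm = Math.hypot(b, lambda - a);
        return new double[] {b / norm, (lambda - a) / norm};
    }

    /** Scores of the centered observations (x[k], y[k]) along the direction v. */
    static double[] project(double[] x, double[] y, double[] v) {
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int k = 0; k < n; k++) {
            meanX += x[k];
            meanY += y[k];
        }
        meanX /= n;
        meanY /= n;
        double[] scores = new double[n];
        for (int k = 0; k < n; k++) {
            scores[k] = (x[k] - meanX) * v[0] + (y[k] - meanY) * v[1];
        }
        return scores;
    }

    public static void main(String[] args) {
        // Made-up sample data: ten observations of two correlated variables.
        double[] x = {2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1};
        double[] y = {2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9};
        double a = cov(x, x), b = cov(x, y), c = cov(y, y);
        double[] lambda = eigenvalues(a, b, c);
        double[] pc1 = project(x, y, eigenvector(a, b, lambda[0]));
        double[] pc2 = project(x, y, eigenvector(a, b, lambda[1]));
        System.out.println("variance of PC1 = " + cov(pc1, pc1));
        System.out.println("variance of PC2 = " + cov(pc2, pc2));
        System.out.println("covariance(PC1, PC2) = " + cov(pc1, pc2));
    }
}
```

Up to rounding, the variance of each component equals the corresponding eigenvalue, the first variance is the larger of the two, and the covariance between the two components is zero.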
    See Also:
    • K. V. Mardia, J. T. Kent and J. M. Bibby, "Multivariate Analysis," London, Academic Press, 1979.
    • W. N. Venables and B. D. Ripley, "Modern Applied Statistics with S," New York, Springer-Verlag, 2002.
    • Wikipedia: Principal component analysis
    • Method Summary

      Modifier and Type Method Description
      Vector cumulativeProportionVar()
      Gets the cumulative proportion of overall variance explained by the principal components.
      Vector loading(int i)
      Gets the loading vector of the i-th principal component.
      Matrix loadings()
      Gets the matrix of variable loadings.
      Vector mean()
      Gets the sample means that were subtracted.
      int nFactors()
      Gets the number of variables in the original data.
      int nObs()
      Gets the number of observations in the original data; sample size.
      Vector proportionVar()
      Gets the proportion of overall variance explained by each of the principal components.
      double proportionVar(int i)
      Gets the proportion of overall variance explained by the i-th principal component.
      Vector scale()
      Gets the scalings applied to each variable.
      Matrix scores()
      Gets the scores of supplied data on the principal components.
      double sdPrincipalComponent(int i)
      Gets the standard deviation of the i-th principal component.
      Vector sdPrincipalComponents()
      Gets the standard deviations of the principal components (i.e., the square roots of the eigenvalues of the correlation (or covariance) matrix).
      Matrix X()
      Gets the (possibly centered and/or scaled) data matrix X used for the PCA.
    • Method Detail

      • nObs

        int nObs()
        Gets the number of observations in the original data; sample size.
        Returns:
        nObs, the number of observations in the original data
      • nFactors

        int nFactors()
        Gets the number of variables in the original data.
        Returns:
        nFactors, the number of variables in the original data
      • mean

        Vector mean()
        Gets the sample means that were subtracted.
        Returns:
        the sample means of each variable in the original data
      • scale

        Vector scale()
        Gets the scalings applied to each variable.
        Returns:
        the scalings applied to each variable in the original data
      • X

        Matrix X()
        Gets the (possibly centered and/or scaled) data matrix X used for the PCA.
        Returns:
        the (possibly centered and/or scaled) data matrix X
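How mean(), scale() and X() relate can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the library's code: it assumes the scaling of each variable is its sample standard deviation with divisor n - 1, which this page does not confirm, and every name in the block is hypothetical.

```java
// Illustrative sketch only: centering and scaling the columns of a data matrix.
// Assumption: the scale of a column is its sample standard deviation (divisor n - 1).
final class CenterScaleSketch {

    /** Column means of a data matrix whose rows are observations. */
    static double[] columnMeans(double[][] data) {
        int n = data.length, p = data[0].length;
        double[] means = new double[p];
        for (double[] row : data) {
            for (int j = 0; j < p; j++) {
                means[j] += row[j] / n;
            }
        }
        return means;
    }

    /** Column sample standard deviations (divisor n - 1). */
    static double[] columnSds(double[][] data, double[] means) {
        int n = data.length, p = means.length;
        double[] sds = new double[p];
        for (double[] row : data) {
            for (int j = 0; j < p; j++) {
                double d = row[j] - means[j];
                sds[j] += d * d;
            }
        }
        for (int j = 0; j < p; j++) {
            sds[j] = Math.sqrt(sds[j] / (n - 1));
        }
        return sds;
    }

    /** Returns a new matrix: each column has its mean subtracted, then is divided by its scale. */
    static double[][] standardize(double[][] data, double[] means, double[] scale) {
        double[][] out = new double[data.length][means.length];
        for (int i = 0; i < data.length; i++) {
            for (int j = 0; j < means.length; j++) {
                out[i][j] = (data[i][j] - means[j]) / scale[j];
            }
        }
        return out;
    }
}
```

Passing a scale vector of ones gives a centered but unscaled matrix, which corresponds to a PCA on the covariance matrix; scaling by the standard deviations corresponds to a PCA on the correlation matrix.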
      • sdPrincipalComponents

        Vector sdPrincipalComponents()
        Gets the standard deviations of the principal components (i.e., the square roots of the eigenvalues of the correlation (or covariance) matrix).
        Returns:
        the standard deviations of the principal components
      • sdPrincipalComponent

        double sdPrincipalComponent(int i)
        Gets the standard deviation of the i-th principal component.
        Parameters:
        i - an index, counting from 1
        Returns:
        the standard deviation of the i-th principal component
      • loadings

        Matrix loadings()
        Gets the matrix of variable loadings. The signs of the columns of the loading matrix are arbitrary.
        Returns:
        the matrix of variable loadings
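The sign ambiguity follows from the eigenvector equation: if v satisfies M v = lambda v, so does -v. The sketch below, in plain Java with hypothetical names, checks this numerically and shows one possible convention for making two loading vectors comparable (flip each vector so that its largest-magnitude entry is positive). That convention is an assumption for illustration; this page does not say the library applies it.

```java
// Illustrative sketch only: why loading signs are arbitrary, and one way to compare them.
// The class and method names are hypothetical and are not part of the PCA interface.
final class LoadingSignSketch {

    /** Largest absolute entry of M v - lambda v; near zero when v is an eigenvector for lambda. */
    static double residual(double[][] m, double[] v, double lambda) {
        double worst = 0;
        for (int r = 0; r < m.length; r++) {
            double s = -lambda * v[r];
            for (int c = 0; c < v.length; c++) {
                s += m[r][c] * v[c];
            }
            worst = Math.max(worst, Math.abs(s));
        }
        return worst;
    }

    /** Returns v or -v, whichever makes the largest-magnitude entry positive. */
    static double[] canonicalSign(double[] v) {
        int arg = 0;
        for (int k = 1; k < v.length; k++) {
            if (Math.abs(v[k]) > Math.abs(v[arg])) {
                arg = k;
            }
        }
        double sign = v[arg] < 0 ? -1 : 1;
        double[] out = new double[v.length];
        for (int k = 0; k < v.length; k++) {
            out[k] = sign * v[k];
        }
        return out;
    }
}
```

Applying canonicalSign to each column before comparing two loading matrices avoids reporting a difference that is only a sign flip.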
      • loading

        Vector loading(int i)
        Gets the loading vector of the i-th principal component.
        Parameters:
        i - an index, counting from 1
        Returns:
        the loading vector of the i-th principal component
      • proportionVar

        Vector proportionVar()
        Gets the proportion of overall variance explained by each of the principal components.
        Returns:
        the proportion of overall variance explained by each of the principal components
      • proportionVar

        double proportionVar(int i)
        Gets the proportion of overall variance explained by the i-th principal component.
        Parameters:
        i - an index, counting from 1
        Returns:
        the proportion of overall variance explained by the i-th principal component
      • cumulativeProportionVar

        Vector cumulativeProportionVar()
        Gets the cumulative proportion of overall variance explained by the principal components.
        Returns:
        the cumulative proportion of overall variance explained by the principal components
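The proportions can be derived from the standard deviations of the principal components, assuming the usual definition: the variance of a component divided by the sum of the variances of all components. The following plain-Java sketch uses hypothetical names and 0-based arrays, whereas the indexed methods of this interface count from 1.

```java
// Illustrative sketch only: proportion and cumulative proportion of variance.
// Assumption: proportion k = sd[k]^2 / sum of all sd[j]^2. Arrays here are 0-based.
final class VarianceProportionSketch {

    /** Proportion of overall variance for each component, given the component standard deviations. */
    static double[] proportionVar(double[] sd) {
        double total = 0;
        for (double s : sd) {
            total += s * s;
        }
        double[] p = new double[sd.length];
        for (int k = 0; k < sd.length; k++) {
            p[k] = sd[k] * sd[k] / total;
        }
        return p;
    }

    /** Running sum of the proportions; the last entry is 1 up to rounding. */
    static double[] cumulative(double[] proportions) {
        double[] out = new double[proportions.length];
        double running = 0;
        for (int k = 0; k < proportions.length; k++) {
            running += proportions[k];
            out[k] = running;
        }
        return out;
    }
}
```

For standard deviations 2, 1 and 1 the variances are 4, 1 and 1, so the first component accounts for 4/6 of the overall variance and the first two together for 5/6.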
      • scores

        Matrix scores()
        Gets the scores of supplied data on the principal components. The signs of the columns of the scores are arbitrary.
        Returns:
        the scores of supplied data on the principal components
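Under the usual definition, the score matrix is the product of the (centered and possibly scaled) data matrix and the loading matrix. The plain-Java sketch below assumes that definition, which this page does not spell out, and uses hypothetical names. It also shows that flipping the sign of a loading column flips the sign of the matching score column and leaves the other columns unchanged.

```java
// Illustrative sketch only: scores as the product of the data matrix and the loadings.
// Assumption: scores = X * loadings, with rows of X as observations.
final class ScoresSketch {

    /** Plain matrix product; columns of loadings are the principal component directions. */
    static double[][] scores(double[][] x, double[][] loadings) {
        int n = x.length, p = loadings.length, q = loadings[0].length;
        double[][] out = new double[n][q];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < q; j++) {
                for (int k = 0; k < p; k++) {
                    out[i][j] += x[i][k] * loadings[k][j];
                }
            }
        }
        return out;
    }

    /** Copy of m with column j multiplied by -1. */
    static double[][] flipColumn(double[][] m, int j) {
        double[][] out = new double[m.length][];
        for (int i = 0; i < m.length; i++) {
            out[i] = m[i].clone();
            out[i][j] = -out[i][j];
        }
        return out;
    }
}
```

Because of this, score columns from two runs or two implementations should be compared up to sign.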