To use the principal components of a variable matrix X as predictors in a linear regression on a response y, and to fit that regression with the fitlm function in MATLAB, follow the steps below:
Perform principal component analysis on X using the pca function in MATLAB, capturing the coeff, score, and explained outputs.
Here, X is the input matrix (each row represents an observation, and each column represents a variable). The output variables coeff and score represent the principal component loadings and scores, respectively.
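A minimal sketch of this step follows. The data matrix is hypothetical random data, and the call assumes the Statistics and Machine Learning Toolbox is installed; the output names simply match the ones used in the text.

```matlab
% Hypothetical example data: 100 observations (rows) of 5 variables (columns)
rng(0);                 % fix the random seed so the run is repeatable
X = randn(100, 5);

% coeff:     loadings, one column per component (5-by-5)
% score:     component scores, one row per observation (100-by-5)
% explained: percentage of total variance per component (5-by-1)
[coeff, score, ~, ~, explained] = pca(X);
```

The two skipped outputs (marked with ~) are the component variances and Hotelling's T-squared statistic, which are not needed here.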
Select the number of principal components to keep based on the explained variance. For example, you can keep the first k components that together explain a chosen share (e.g., 95%) of the total variance. The explained output of the pca function gives the percentage of the total variance explained by each component.
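One way to pick k from the explained output is to take the cumulative sum and find the first component at which it reaches the target. This is a sketch under assumptions: the data are hypothetical, and the 95% threshold is only the example value mentioned above.

```matlab
% Hypothetical correlated data so that a few components dominate
rng(0);
X = randn(100, 5) * randn(5, 5);

[~, ~, ~, ~, explained] = pca(X);

threshold = 95;                               % assumed target, in percent
cumExplained = cumsum(explained);             % running total of explained variance
k = find(cumExplained >= threshold, 1);       % smallest k that reaches the target
```

Other selection rules, such as inspecting a scree plot, are equally valid; the cumulative threshold is just the simplest to automate.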
Use the selected principal component score vectors (the first k columns of score) as predictors in a linear regression model fitted with fitlm.
Here, k is the number of principal components to keep, y is the response variable, and mdl is the fitted linear regression model.
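The regression step might look like the sketch below. The data, the response, and the value k = 3 are all illustrative assumptions, and Xpc is a hypothetical helper name.

```matlab
% Hypothetical predictors and response
rng(0);
X = randn(100, 5);
y = X * [1; -2; 0.5; 0; 3] + 0.1 * randn(100, 1);

[~, score] = pca(X);

k = 3;                   % assumed number of components to keep
Xpc = score(:, 1:k);     % first k score vectors as predictors
mdl = fitlm(Xpc, y);     % fitlm adds an intercept term by default

disp(mdl)                % coefficient table and fit statistics
```

Because the score columns are uncorrelated with each other, adding or dropping a component does not change the estimated coefficients of the others.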
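Note: pca centers the columns of X by default but does not rescale them, so variables measured on larger scales dominate the leading components. If your variables are on different scales, standardize them (for example with zscore) before performing principal component analysis and linear regression.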
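As a sketch of the standardization step, assuming zscore is an acceptable choice for your data, the variables can be rescaled before calling pca. The data here are hypothetical, and the element-wise scaling relies on implicit expansion (R2016b or later).

```matlab
% Hypothetical variables on very different scales
rng(0);
X = randn(100, 3) .* [1 10 100];

Xz = zscore(X);          % each column now has mean 0 and standard deviation 1

% PCA on the standardized data, so no single variable dominates by scale alone
[coeff, score, ~, ~, explained] = pca(Xz);
```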
Keep in mind that the principal component loadings and the regression coefficients describe different relationships. The loadings relate the original variables to the principal components, while the regression coefficients relate the principal components to the response variable.
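To make the distinction concrete, the sketch below combines the two sets of numbers to recover the slopes that the model implies for the original variables. It assumes pca was run with its default centering; the data, k = 2, and the names betaOrig and b0Orig are hypothetical.

```matlab
% Hypothetical predictors and response
rng(0);
X = randn(100, 4);
y = X * [2; -1; 0.5; 0] + 0.1 * randn(100, 1);

[coeff, score, ~, ~, ~, mu] = pca(X);    % mu: column means removed by pca
k = 2;                                   % assumed number of components
mdl = fitlm(score(:, 1:k), y);

b0 = mdl.Coefficients.Estimate(1);       % intercept
b  = mdl.Coefficients.Estimate(2:end);   % one slope per retained component

betaOrig = coeff(:, 1:k) * b;            % implied slopes on the original variables
b0Orig   = b0 - mu * betaOrig;           % implied intercept on the original scale

% Both parameterizations should produce the same fitted values
yhatPC   = predict(mdl, score(:, 1:k));
yhatOrig = b0Orig + X * betaOrig;
```

The loadings in coeff do the translation: each regression coefficient on a component is spread across the original variables according to that component's loading vector.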