Finding Y-relevant part of X by use of PCR and PLSR model reduction methods
Original version
Journal of Chemometrics 21(2007), No. 12, p. 537-546 http://dx.doi.org/10.1002/cem.1062
Abstract
The paper considers the following question: using principal component regression (PCR) or partial least squares regression (PLSR), how much data can be removed from X while retaining the original ability to predict Y? Two model reduction methods using similarity transformations are discussed, one giving projections of original loadings onto the column space of the fitted response matrix Ŷ (essentially the orthogonal signal correction (OSC) methods), and one giving projections of original scores onto the column space of the coefficient matrix B̂ (essentially the net analyte signal (NAS) methods). The loading projection method gives model residuals that are orthogonal to Y and Ŷ, which is valuable in certain applications. The score projection method, on the other hand, gives model residuals that are orthogonal to B̂, which is essential in other applications. It is shown that the reduced matrix X from the score projection method is a subset of the reduced matrix X from the loading projection method. The score projection result therefore has the smallest Frobenius norm, and thus the smallest total column variance, assuming centered data.
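The two projections can be illustrated numerically. The following is a minimal sketch, not the paper's similarity-transformation derivation: it assumes a single response y, a PCR model with an arbitrarily chosen number of components, synthetic random data, and residuals taken relative to the PCR reconstruction of X. The variable names (`X_load`, `X_score`, `E_load`, `E_score`) are hypothetical and chosen for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, A = 50, 8, 3            # samples, variables, retained components (arbitrary)

# Synthetic, column-centred data (assumption: not data from the paper)
X = rng.normal(size=(n, p))
X -= X.mean(axis=0)
y = X @ rng.normal(size=p) + 0.1 * rng.normal(size=n)
y -= y.mean()

# PCR: loadings P from the SVD of X, scores T = X P
_, _, Vt = np.linalg.svd(X, full_matrices=False)
P = Vt[:A].T                  # p x A, orthonormal columns
T = X @ P                     # n x A scores
X_hat = T @ P.T               # PCR reconstruction of X
b = P @ np.linalg.solve(T.T @ T, T.T @ y)   # PCR coefficient vector
y_hat = X @ b                 # fitted response

# Loading projection (OSC-like): project X_hat onto the column space of y_hat
X_load = np.outer(y_hat, y_hat @ X_hat) / (y_hat @ y_hat)

# Score projection (NAS-like): project the rows of X_hat onto the direction of b
X_score = np.outer(X_hat @ b, b) / (b @ b)

# Residuals relative to the PCR reconstruction (assumption of this sketch)
E_load = X_hat - X_load
E_score = X_hat - X_score

print("||X_load||_F  =", np.linalg.norm(X_load))
print("||X_score||_F =", np.linalg.norm(X_score))
```

In this sketch both reduced matrices reproduce the fitted response when multiplied by `b`, `E_load` is orthogonal to `y` and `y_hat`, and `E_score` is orthogonal to `b`. Projecting `X_load` onto the direction of `b` returns `X_score`, which is one way to read the statement that the score projection result is contained in the loading projection result.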