partial least squares regression

synonym: partial least squares
initialism: PLS
https://doi.org/10.1351/goldbook.10155

Multivariate calibration which finds factors that maximise covariance between two blocks of data.

Notes:
  1. PLS finds factors (latent variables) in the observed variables \(\boldsymbol{X}\) that explain the maximum variance in the variable(s) \(\boldsymbol{c}\), using the simultaneous decomposition of the two. It removes redundant information from the regression, i.e. factors that describe large amounts of variance in the observed data but do not correlate with the variable(s) being predicted.
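  2. The decompositions are \(\boldsymbol{X} = \boldsymbol{T}_{k}\boldsymbol{P}_{k}^{\rm{T}} + \boldsymbol{E}\) and \(\boldsymbol{C} = \boldsymbol{U}_{k}\boldsymbol{Q}_{k}^{\rm{T}} + \boldsymbol{F}\), and the generalised inverse used in the regression is \(\boldsymbol{X}^{+} = \boldsymbol{W}_{k}(\boldsymbol{P}_{k}^{\rm{T}}\boldsymbol{W}_{k})^{-1} (\boldsymbol{T}_{k}^{\rm{T}}\boldsymbol{T}_{k})^{-1}\boldsymbol{T}_{k}^{\rm{T}}\), where \(\boldsymbol{T}_{k}\) and \(\boldsymbol{U}_{k}\) are scores, \(\boldsymbol{P}_{k}\) and \(\boldsymbol{Q}_{k}\) are loadings, \(\boldsymbol{E}\) and \(\boldsymbol{F}\) are residuals after \(k\) factors, and \(\boldsymbol{W}_{k}\) are weights that maintain orthogonal scores.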
  3. PLS1 refers to PLS for a single "c" variable. PLS2 is PLS that simultaneously obtains values for two or more "c" variables. Therefore in the equations of Note 2, PLS1 has vectors \(\boldsymbol{c}\) and \(\boldsymbol{q}\) and PLS2 has matrices \(\boldsymbol{C}\) and \(\boldsymbol{Q}\).
  4. When used for multivariate calibration, evaluation data or cross-validation may be used to choose the number of PLS factors and to assess the accuracy of the prediction, but the same data must not be used for both. Keeping the two separate guards against overfitting.
  5. PLS may also be used in classification. See partial least squares discriminant analysis.
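
The relations in Notes 2 to 4 can be checked numerically with a minimal sketch. It assumes a NIPALS-style PLS1 with deflation on mean-centred data, which is one common way to obtain the decomposition and not necessarily the algorithm the recommendation has in mind. The function names `pls1_nipals` and `loo_rmse`, the synthetic data, and the choice of leave-one-out cross-validation are illustrative assumptions, and NumPy is assumed to be available.

```python
import numpy as np

def pls1_nipals(X, c, k):
    """PLS1 by NIPALS with deflation (illustrative sketch).

    X (n x p) and c (n,) are assumed to be mean-centred already.
    Returns weights W, X loadings P, scores T and c loadings q.
    """
    X = X.astype(float).copy()
    c = c.astype(float).copy()
    n, p = X.shape
    W, P, T, q = np.zeros((p, k)), np.zeros((p, k)), np.zeros((n, k)), np.zeros(k)
    for a in range(k):
        w = X.T @ c
        w /= np.linalg.norm(w)       # direction of maximum covariance with c
        t = X @ w                    # scores for factor a
        tt = t @ t
        p_a = X.T @ t / tt           # X loadings
        q_a = c @ t / tt             # c loading
        X -= np.outer(t, p_a)        # deflate X
        c -= q_a * t                 # deflate c
        W[:, a], P[:, a], T[:, a], q[a] = w, p_a, t, q_a
    return W, P, T, q

def loo_rmse(X, c, k):
    """Leave-one-out prediction error for k factors (used here only to choose k)."""
    errs = []
    for i in range(len(c)):
        keep = np.arange(len(c)) != i
        xm, cm = X[keep].mean(axis=0), c[keep].mean()
        W, P, T, q = pls1_nipals(X[keep] - xm, c[keep] - cm, k)
        b = W @ np.linalg.solve(P.T @ W, q)      # regression vector
        errs.append(c[i] - (cm + (X[i] - xm) @ b))
    return float(np.sqrt(np.mean(np.square(errs))))

# Synthetic data, made up purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))
c = X @ np.array([1.0, -2.0, 0.5, 0.0, 0.0, 0.0]) + 0.1 * rng.normal(size=20)
Xc, cc = X - X.mean(axis=0), c - c.mean()

k = 3
W, P, T, q = pls1_nipals(Xc, cc, k)

# Generalised inverse of Note 2: X+ = W (P^T W)^-1 (T^T T)^-1 T^T
X_plus = W @ np.linalg.inv(P.T @ W) @ np.linalg.inv(T.T @ T) @ T.T
b = X_plus @ cc                      # regression vector for centred data
E = Xc - T @ P.T                     # residual of the X decomposition

# Note 4: compare candidate numbers of factors by cross-validation.
cv = {a: loo_rmse(X, c, a) for a in range(1, 5)}
best_k = min(cv, key=cv.get)
```

In this sketch the product \(\boldsymbol{P}_{k}^{\rm{T}}\boldsymbol{W}_{k}\) is a square \(k \times k\) matrix, which is why it can be inverted, and \(\boldsymbol{T}_{k}^{\rm{T}}\boldsymbol{T}_{k}\) is diagonal because the scores are orthogonal. The cross-validated error is used only to pick the number of factors; in line with Note 4, a separate set of data would be needed to assess the accuracy of the final model.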
Source:
PAC, 2016, 88, 407. (Vocabulary of concepts and terms in chemometrics (IUPAC Recommendations 2016)) on page 435