
I am reading The Elements of Statistical Learning and came across the following claim (page 93, Section 3.7):

Least squares fitting is usually done via the Cholesky decomposition of the matrix $\mathbf{X}^T\mathbf{X}$ or a QR decomposition of $\mathbf{X}$. With $N$ observations and $p$ features, the Cholesky decomposition requires $p^3 + Np^2/2$ operations, while the QR decomposition requires $Np^2$ operations.

I understand the Cholesky and QR decompositions individually, but I do not see where these operation counts come from. Can someone explain how they are derived?
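
For concreteness, here is a minimal numpy sketch of the two solution paths as I understand them (the problem sizes and variable names are my own illustration, not from the book):

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 1000, 10                      # N observations, p features
X = rng.standard_normal((N, p))
y = rng.standard_normal(N)

# Route 1: Cholesky decomposition of X^T X (the normal equations).
G = X.T @ X                          # form the p x p Gram matrix
c = X.T @ y
L = np.linalg.cholesky(G)            # G = L L^T, L lower triangular
z = np.linalg.solve(L, c)            # forward substitution
beta_chol = np.linalg.solve(L.T, z)  # back substitution

# Route 2: QR decomposition of X directly.
Q, R = np.linalg.qr(X)               # reduced QR: Q is N x p, R is p x p
beta_qr = np.linalg.solve(R, Q.T @ y)

print(np.allclose(beta_chol, beta_qr))  # True: both solve the same problem
```

My guess is that the $Np^2/2$ term corresponds to forming $\mathbf{X}^T\mathbf{X}$ (only one triangle needs to be computed, by symmetry), but I cannot account for the remaining terms in either count.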

