Suppose \( Ax=b,\; A\in\mathbb{R}^{n\times n} \) has an exact solution. We indulge in some GE on \(\lbrack A| b\rbrack\) (matrix \(A\) concatenated with \(b\) so that it is the last column). This GE takes \(\frac{2}{3}n^3+O(n^2)\) ops. But we are not done: we still need to take this new matrix \([U|c]\) and solve \(Ux=c\), which takes \(n^2\) ops because we are just back substituting.
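The two stages can be sketched in NumPy (a minimal sketch assuming a square \(A\) that never produces a zero pivot, so no pivoting is needed; `solve_ge` is a made-up name):

```python
import numpy as np

def solve_ge(A, b):
    """Gaussian elimination on the augmented matrix [A|b],
    then back substitution on the resulting [U|c]."""
    n = len(b)
    M = np.hstack([A.astype(float), b.astype(float).reshape(-1, 1)])  # [A|b]
    # Forward elimination: ~(2/3) n^3 ops.
    for k in range(n):
        for i in range(k + 1, n):
            factor = M[i, k] / M[k, k]      # assumes M[k, k] != 0
            M[i, k:] -= factor * M[k, k:]
    U, c = M[:, :n], M[:, n]
    # Back substitution on Ux = c: ~n^2 ops.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (c[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x
```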

Now suppose that \( Ax=b,\; A\in\mathbb{R}^{m\times n} \) does not have an exact solution, which happens most of the time. How does one then produce something close to a solution? The answer is to solve \( A^TAx=A^Tb \) instead. In fact this is the go-to whenever \(m > n\). As an example, let's take linear regression. If we have a bunch of points \( P_1(x_1, y_1), \dots, P_m(x_m, y_m) \) and we want to draw a line closest to them all, we take a 2D line equation \(y=kx+n\) and construct: \(x = \begin{bmatrix} k \\ n \end{bmatrix} \), \(b = \begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix} \) and finally \(A=\begin{bmatrix}x_1 & 1 \\ \vdots & \vdots \\ x_m & 1 \end{bmatrix}\), transposing to get \(A^T=\begin{bmatrix}x_1 & \dots & x_m \\ 1&\dots&1 \end{bmatrix}\). If you had a \( y=ax^2+bx+c \) situation, then \(x=\begin{bmatrix}a\\b\\c\end{bmatrix}\), \(b\) remains the same and \(A=\begin{bmatrix}x_1^2 & x_1 & 1 \\ \vdots & \vdots & \vdots \\ x_m^2 & x_m & 1 \end{bmatrix}\), and so on.
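The line-fitting case can be sketched like this (the sample points are made up for illustration, not from the text):

```python
import numpy as np

# Sample points (x_i, y_i) scattered roughly around a line.
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([1.1, 2.9, 5.2, 6.8])

A = np.column_stack([xs, np.ones_like(xs)])  # rows [x_i, 1]
b = ys

# Solve the 2x2 normal equations A^T A x = A^T b for x = [k, n].
k, n = np.linalg.solve(A.T @ A, A.T @ b)
```

For the quadratic case you would just use `np.column_stack([xs**2, xs, np.ones_like(xs)])` instead.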


However, we quickly learn that there are better ways computationally, provided we have an exact solution. First consider \(A = LU\). The idea is: let's split the square matrix \(A\) into two triangles, \(L\) (lower) and \(U\) (upper), and then solve \(LUx=b\) as two triangular solves: first \(Ly=b\), then \(Ux=y\). There are two ways to build the split: without pivoting and with partial pivoting.

Here is how the version without pivoting works. You start with \(L=I\) and \(\lbrack U|b\rbrack = \lbrack A|b\rbrack\). Now suppose that in GE on \(\lbrack U|b\rbrack\) you do this operation: subtract 4 times row 1 from row 2. Even though it says subtract, you actually store positive 4 in \(L\) at row 2, column 1 (if it said add, then you would store the negative). So the corner of \(L\) now looks like \(\begin{bmatrix}1&0&0&\cdots\\4&1&0&\cdots \\ 0&0&1&\cdots \\ \vdots&\vdots&\vdots&\ddots \end{bmatrix}\). We repeat this process until \(U\) is upper triangular. At that point \(L\) is lower triangular and our \(A=LU\) is done. So what is partial pivoting? It is the same procedure with one key difference: before we eliminate below a pivot, we scout the remaining rows for the one with the largest entry (in absolute value) in the pivot column, swap it into the pivot position, and only then subtract, say, 4 times row 1 from row 2. This results in different matrices \(L\) and \(U\); more precisely, we get \(PA=LU\), where the permutation matrix \(P\) records the swaps. Note that each swap applies to the rows of \(U\) and \(b\), and also to the multipliers already stored below the diagonal of \(L\). After we are done, solving \(Ly=b\) by forward substitution takes \(n^2-n\) ops (no divisions, since \(L\) has ones on the diagonal), and solving \(Ux=y\) by back substitution takes \(n^2\) ops.
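Both halves of this, the \(PA=LU\) factorization with partial pivoting and the two triangular solves, can be sketched in NumPy (a sketch under the conventions above; `lu_partial_pivot` and `lu_solve` are made-up names):

```python
import numpy as np

def lu_partial_pivot(A):
    """Factor PA = LU with partial pivoting."""
    n = A.shape[0]
    U = A.astype(float).copy()
    L = np.eye(n)
    P = np.eye(n)
    for k in range(n - 1):
        # Pick the row with the largest |entry| in column k, rows k..n-1.
        p = k + np.argmax(np.abs(U[k:, k]))
        if p != k:
            U[[k, p], k:] = U[[p, k], k:]   # swap rows of U
            P[[k, p]] = P[[p, k]]           # record the swap in P
            L[[k, p], :k] = L[[p, k], :k]   # swap already-stored multipliers
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]     # "subtract 4x row k" -> store +4
            U[i, k:] -= L[i, k] * U[k, k:]
    return P, L, U

def lu_solve(L, U, b):
    """Solve LUx = b: forward substitution for Ly = b,
    then back substitution for Ux = y."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):                      # forward: ~n^2 - n ops
        y[i] = b[i] - L[i, :i] @ y[:i]      # no division: L has unit diagonal
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):          # back: ~n^2 ops
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x
```

Since the factorization gives \(PA=LU\), solving \(Ax=b\) means calling `lu_solve(L, U, P @ b)`.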