
Let $E: ax+by+cz=0$ be a plane in ${\Bbb R}^3$, so its normal vector is $\vec{n} = (a,b,c)$. To find the orthogonal projection matrix that maps every point of ${\Bbb R}^3$ onto $E$, we may write the projection matrix as

$$ P = I - \frac{\vec{n} \, \vec{n}^{T} }{\| \vec{n} \|_2^{2} }$$

where $I$ denotes the identity matrix for ${\Bbb R}^3$. Why is that, intuitively? Why does the matrix multiplication in the numerator yield something that projects onto the plane?
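As a quick numerical sanity check of the formula (a NumPy sketch not from the question itself; the plane coefficients and test vector are arbitrary choices):

```python
import numpy as np

# Example plane E: 2x + 3y - z = 0, so n = (2, 3, -1) (arbitrary choice).
n = np.array([[2.0], [3.0], [-1.0]])       # column vector, shape (3, 1)
P = np.eye(3) - (n @ n.T) / (n.T @ n)      # P = I - n n^T / ||n||^2

v = np.array([[1.0], [4.0], [2.0]])        # arbitrary test point
Pv = P @ v

print(n.T @ Pv)   # ~0: the projected point satisfies the plane equation
print(P @ n)      # ~0: the normal direction is annihilated
print(P @ P - P)  # ~0: P is idempotent, as a projection must be
```

Note that `n @ n.T` is the $3 \times 3$ outer product while `n.T @ n` is the scalar $\|\vec n\|^2$, which is exactly the distinction the question is asking about.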



Samuel

3 Answers


If $v \in \Bbb R^3$, then $v = Pv+w$, where $Pv \in E$ and $w$ is orthogonal to $E$. Now:

  • $Pv \in E$ means that $Pv$ is orthogonal to $n$, hence $n^\top Pv=0$.
  • $w$ being orthogonal to every element of $E$ implies that $w=cn$ for some scalar $c$.

Hence, $$ 0 = n^\top Pv = n^\top(v-w) = n^\top v-n^\top w = n^\top v-n^\top (cn) = n^\top v-c\|n\|^2 $$ and then $$ c = \frac{n^\top v}{\|n\|^2} \quad \text{i.e.} \quad w = \frac{n^\top v}{\|n\|^2} n = \frac{nn^\top}{\|n\|^2}v. $$ Finally: $$ Pv = v-w = v-\frac{nn^\top}{\|n\|^2}v = \left( I-\frac{nn^\top}{\|n\|^2} \right)v. $$
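The decomposition above can be verified numerically (a sketch with an arbitrary normal $n$ and vector $v$, not part of the original answer):

```python
import numpy as np

n = np.array([1.0, -2.0, 2.0])   # arbitrary normal vector
v = np.array([3.0, 1.0, 5.0])    # arbitrary vector to decompose

c = (n @ v) / (n @ n)            # c = n^T v / ||n||^2
w = c * n                        # component of v along n
Pv = v - w                       # component of v in the plane E

print(np.dot(n, Pv))             # ~0: Pv is orthogonal to n, i.e. Pv lies in E
print(np.allclose(Pv + w, v))    # True: v = Pv + w, as claimed
```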

azif00
  • Very elegant, as always. Thanks a lot. – Samuel Apr 19 '25 at 07:29
  • To really understand that, could you please elaborate on how you swap the order of multiplication in the numerator when you manipulate the expression for $w$? Why can we swap the transpose of $n$ with $n$ and $v$? I am aware that matrix multiplication is not commutative. Thank you, I really appreciate your brilliant work! – Samuel Apr 19 '25 at 12:47
  • 2
    @complexlogarithm Of course. Note that $(nn^\top)v = n(n^\top v)$. Now, since $n^\top v$ is just a number, we have that $n(n^\top v) = (n^\top v)n$. Thus: $(n^\top v)n = (nn^\top)v$. That's why $$\frac{n^\top v}{|n|^2} n = \frac{nn^\top}{|n|^2}v.$$ – azif00 Apr 19 '25 at 18:59

Note that the plane passes through the origin. In order to find the projection matrix that projects onto the line orthogonal to the plane (i.e., the line spanned by the vector $\bf n$) and passing through the origin, consider the following $1$-dimensional least-squares problem$^\color{magenta}{\star}$

$$ t_{\min} := \arg\min_{t \in {\Bbb R}} \| {\bf x} - t \, {\bf n} \|_2^2 = \dots = \frac{{\bf n}^\top {\bf x}}{{\bf n}^\top {\bf n}}$$

and, thus,

$$ t_{\min} \, {\bf n} = \left( \frac{{\bf n}^\top {\bf x}}{{\bf n}^\top {\bf n}} \right) {\bf n} = {\bf n} \left( \frac{{\bf n}^\top {\bf x}}{{\bf n}^\top {\bf n}} \right) = \underbrace{\left( \frac{\,\,{\bf n} \, {\bf n}^\top }{{\bf n}^\top {\bf n}} \right)}_{=: {\bf P}} {\bf x} = {\bf P} \, {\bf x} $$

is the point on the line that is closest (in the Euclidean norm) to a general ${\bf x} \in {\Bbb R}^d$, i.e., it is the orthogonal projection of $\bf x$ onto the line. Since we have a matrix-vector multiplication, $\bf P$ is the projection matrix that projects onto the aforementioned line and

$$ {\bf I}_d - \dfrac{\,\,{\bf n} \, {\bf n}^\top}{{\bf n}^\top {\bf n}} $$

is the projection matrix that projects onto the orthogonal complement of the aforementioned line, i.e., the plane itself.


$\color{magenta}{\star}$ Note that one does not even need calculus to find $t_{\min}$. Expanding $ \| {\bf x} - t \, {\bf n} \|_2^2 = {\bf x}^\top {\bf x} - 2 t \,{\bf n}^\top {\bf x} + t^2 \,{\bf n}^\top {\bf n} $ yields an upward-opening parabola in $t$, whose vertex is at $t_{\min} = \frac{{\bf n}^\top {\bf x}}{{\bf n}^\top {\bf n}}$.
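The closed-form $t_{\min}$ can be cross-checked against a generic least-squares solver (a NumPy sketch with an arbitrary $\bf n$ and $\bf x$, not part of the original answer):

```python
import numpy as np

n = np.array([2.0, -1.0, 3.0])   # arbitrary normal vector
x = np.array([1.0, 4.0, 0.0])    # arbitrary point

# Closed-form minimizer of ||x - t n||^2 from the answer.
t_closed = (n @ x) / (n @ n)

# The same 1-dimensional least-squares problem solved by lstsq:
# treat n as a 3x1 design matrix and x as the target.
t_lstsq, *_ = np.linalg.lstsq(n.reshape(-1, 1), x, rcond=None)

print(np.isclose(t_closed, t_lstsq[0]))   # True: both give the same t_min
```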

  • What a wonderful solution. Thank you! Involving calculus is brilliant. What I don’t quite understand is how you deduce the projection matrix once you found the minimal t. What is the underlying reasoning? – Samuel Apr 19 '25 at 12:44
  • @complexlogarithm I have refined my answer. Is the reasoning clear now? – Rodrigo de Azevedo Apr 19 '25 at 13:02
  • 1
    Yes! Thank you, that is insightful and very creative. Arg denotes for what t the expression holds, correct? – Samuel Apr 19 '25 at 13:05

Here's a variation on this answer, with a slightly different sequence of operations.

First, the formula can be simplified by defining $\hat n = \dfrac{\vec n}{\lVert \vec n \rVert}.$ Then $$ P = I - \frac{\vec n \vec n^T }{\lVert \vec n \rVert_2^2 } = I - \hat n \hat n^T. $$

Now let $\vec v$ be any vector in $\mathbb R^3.$ Then $$ P \vec v = \left(I - \hat n \hat n^T\right)\vec v = \vec v - \left(\hat n \hat n^T\right)\vec v = \vec v - \hat n \left( \hat n^T\vec v \right). $$

Now recognize that the scalar value of $\hat n^T\vec v$ is the magnitude of the component of $\vec v$ orthogonal to the plane $E$, and $\hat n$ is a unit vector in the direction of that component. The multiplication of the $3\times 1$ matrix $\hat n$ by the $1\times 1$ matrix $\hat n^T\vec v$ gives us a $3\times 1$ matrix equal to $c \hat n$ where $c$ is the scalar value of $\hat n^T\vec v,$ that is, it's the component of $\vec v$ orthogonal to $E.$
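This interpretation is easy to see numerically with a plane whose projection is obvious (a sketch, not from the original answer; the plane $z=0$ and the test vector are arbitrary choices):

```python
import numpy as np

n = np.array([0.0, 0.0, 2.0])      # normal of the plane z = 0 (arbitrary scale)
n_hat = n / np.linalg.norm(n)      # unit normal, here (0, 0, 1)
v = np.array([3.0, -1.0, 7.0])     # arbitrary test vector

c = n_hat @ v                      # scalar: signed length of the normal component
w = c * n_hat                      # component of v orthogonal to E
Pv = v - w                         # (I - n_hat n_hat^T) v, computed step by step

print(Pv)                          # [ 3. -1.  0.]: v with its z-component removed
```

As expected, projecting onto the plane $z=0$ simply zeroes out the $z$-coordinate, which is exactly what subtracting the normal component $c\,\hat n$ does.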

It's possibly not the most rigorous approach (is $\hat n^T\vec v$ a matrix or a scalar, really?) but to the extent that you can identify a linear transformation with a matrix, I find it helps my intuition.

David K