
Definition of the adjoint operator: A linear operator $T$ on an inner product space $V$ is said to have an adjoint operator $T^{*}$ on $V$ if $\langle T(u),v \rangle= \langle u,T^{*}(v) \rangle$ for all $u, v \in V$.

Question: Why did people come up with that definition? It does not sound intuitive to me. Is $T^{*}$ the conjugate transpose of $T$? And does that definition follow from the definition of an inner product space?

3 Answers


The point of the definition is to extend the notion of the "conjugate transpose" so that it makes sense on an arbitrary inner product space. I'm not sure what you mean by "does that definition follow from the definition of an inner product space". However, I think it might be helpful to see why, if $V = \Bbb C^n, W = \Bbb C^m$ with the usual inner product and $T:V \to W$ is the linear map defined by $T(x) = Ax$ for an $m \times n$ matrix $A$, then the adjoint operator $T^*: W \to V$ is $T^*(x) = A^*x$. In other words, taking the adjoint is "the same as" taking the conjugate transpose.

Let $A'$ denote the conjugate-transpose of $A$. Recall that the usual inner product on $\Bbb C^n$ is given by $$ \langle x,y\rangle = y'x = \sum_{k=1}^n x_k \bar y_k. $$ If we define $T(x) = Ax$ and $S(x) = A'x$, then we find that for $x \in V$ and $y \in W$, we have $$ \langle T(x),y \rangle = y'(Ax) = (y'A)x = (A'y)'x = \langle x,S(y) \rangle. $$ So, $S$ is indeed the adjoint operator to $T$.
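
If a numerical sanity check helps, here is a minimal NumPy sketch of this identity (the random $A$, $x$, $y$ and the `inner` helper are purely illustrative; the inner product is taken linear in the first slot, as above):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4

# T(x) = A x for a random complex m x n matrix A
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(m) + 1j * rng.standard_normal(m)

# usual inner product, linear in the first slot: <a, b> = sum_k a_k conj(b_k)
inner = lambda a, b: np.sum(a * np.conj(b))

S = A.conj().T                      # conjugate transpose A', so S(y) = A' y
lhs = inner(A @ x, y)               # <T(x), y>
rhs = inner(x, S @ y)               # <x, S(y)>
print(np.isclose(lhs, rhs))         # True
```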

Ben Grossmann

The adjoint on inner product spaces comes from a more general construction. If $X$ and $Y$ are Banach spaces and $T : X \to Y$ is a bounded linear operator, then $T$ induces a map from the dual of $Y$ to the dual of $X$, that is, a map $T^*:Y^*\to X^*$ defined by

$(T^*y^*)(x)=y^*(T(x))\tag 1$

So, if $\mathbb F$ is the scalar field of the spaces $X$ and $Y$, we have that $T^*$ sends an arbitrary $y^*:Y\to \mathbb F$ to a $T^*y^*:X\to \mathbb F$, which acts on an arbitrary $x\in X$ as in $(1).$
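
To make $(1)$ concrete, here is a minimal finite-dimensional sketch (assuming NumPy, with $X=\Bbb R^4$, $Y=\Bbb R^3$, $T(x)=Ax$ for a random $A$, and a functional represented simply as a Python function; none of these specific choices come from the answer itself):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))          # T : R^4 -> R^3, T(x) = A x
T = lambda x: A @ x

# a bounded linear functional y* in Y* = (R^3)*, represented by a vector c
c = rng.standard_normal(3)
y_star = lambda y: c @ y

# definition (1): T* y* is the functional x |-> y*(T(x)) on X = R^4
T_star_y_star = lambda x: y_star(T(x))

x = rng.standard_normal(4)
# in coordinates, T* y* is represented by A^T c, i.e. T* acts as the transpose
print(np.isclose(T_star_y_star(x), (A.T @ c) @ x))   # True
```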

The reason this definition is useful is that knowledge of the properties of the dual space often provides answers to questions about the space itself.

Of course, one has to check that $T^*y^*$ is a bounded linear functional, i.e., an element of $X^*$. Linearity is immediate, and boundedness follows from the calculation

$|(T^*y^*)(x)| = |y^*(T(x))| \leq \| y^* \|\, \| T(x) \| \leq \| y^* \|\, \| T \|\, \| x \| \tag2$

To specialize this to your case, suppose $X=Y=V$ is an inner product space (complete, so that the Riesz representation theorem applies) and $T:V\to V$ is a bounded linear operator. By the Riesz theorem, there is a bijection

$v\leftrightarrow \langle \cdot,v\rangle\ \text{between the elements of}\ V\ \text{and those of}\ V^*\tag 3$

Let $y,w\in V$ be the elements corresponding to $y^*$ and $T^*y^*$ under $(3)$, respectively. Then, for every $v \in V$, $\langle T(v),y\rangle = y^*(T(v)) = (T^*y^*)(v) = \langle v,w\rangle$. Since $T^*$ sends $y^*$ to $T^*y^*$, applying the correspondence $(3)$ (and writing $T^*$ also for the induced map on $V$) gives $T^*y=w$, from which it follows that

$\langle T(v),y\rangle=\langle v,T^*y\rangle \tag4$
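
Here is a short numerical illustration of this specialization, under the assumption that $V = \Bbb C^n$ with the standard inner product and $T(v) = Av$ for a matrix $A$ (so the Riesz representer of $T^*y^*$ should come out as $A^*y$); this is only a sketch using NumPy:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))   # T(v) = A v on V = C^n
inner = lambda a, b: np.sum(a * np.conj(b))    # <a, b>, linear in the first slot

y = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y_star = lambda v: inner(v, y)                 # the functional <. , y> corresponding to y via (3)

# the Riesz representer w of T* y* (i.e. of v |-> y*(T v)) should be w = A* y
w = A.conj().T @ y

v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
print(np.isclose(y_star(A @ v), inner(v, w)))  # (4): <T(v), y> = <v, T* y>
```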

Matematleta

Maybe some concrete computations would provide a useful way to think about this. Consider two subspaces $\mathcal{U}$ and $\mathcal{V}$ of $\mathbb{R}^{l}$, the first of dimension $n=\dim\mathcal{U}$ and the second of dimension $m=\dim\mathcal{V}$. For example, in 3D you can imagine one of the spaces being a line passing through the origin, and the other a plane passing through the origin (not necessarily parallel or orthogonal). You can measure angles and lengths on those subspaces by inheriting the standard inner product from $\mathbb{R}^l$: $\langle a,b\rangle = a^Tb$.

In order to reduce the representation of vectors from the spaces to their effective dimensions (useful, e.g., for computations) you can introduce two bases for the spaces: $U\in\mathbb{R}^{l\times n}$ and $V\in\mathbb{R}^{l\times m}$. Then any vector $u\in\mathcal{U}$ can be written uniquely as $u=U[u]_U$, and similarly $v=V[v]_V$. You can then define inner products wrt the bases as follows: $$\langle u, x\rangle = \langle U[u]_U,U[x]_U\rangle = [u]_U^T(U^TU)[x]_U = \langle [u]_U, [x]_U\rangle_{(U^TU)}$$ Similarly you can define $\langle [v]_V, [y]_V\rangle_{(V^TV)}$ for the coordinate representations wrt $V$ of vectors in $\mathcal{V}$.

Now assume you are given some linear operator $T:\mathcal{U}\to\mathcal{V}$. A natural question arises: what would be a corresponding operator $T^*:\mathcal{V}\to\mathcal{U}$ that preserves the inner product restricted to the two subspaces: $$\langle Tu, v\rangle|_{\mathcal{V}} = \langle u, T^*v\rangle|_{\mathcal{U}}.$$

We can then also use the coordinate representation of $T$ and $T^*$ wrt $U$ and $V$ to explicitly compute the relation: \begin{align*} \langle Tu, v\rangle|_{\mathcal{V}}&=\langle [T]_{V,U}[u]_U, [v]_V\rangle_{(V^TV)}\\ &= [u]_U^T[T]_{V,U}^T(V^TV)[v]_V \stackrel{!}{=} [u]_U^T(U^TU)[T^*]_{U,V}[v]_V \\ &= \langle [u]_U, [T^*]_{U,V}[v]_V\rangle_{(U^TU)} = \langle u,T^*v\rangle|_{\mathcal{U}}\end{align*}

If we wish for the equality to hold regardless of the choice of $u$ and $v$ we need that: $$[T]^T_{V,U}(V^TV) = (U^TU)[T^*]_{U,V} \iff [T^*]_{U,V} = (U^TU)^{-1} [T]^T_{V,U} (V^TV)$$

So, for example, if $Tu$ and $v$ are orthogonal, the operator $T^*$ makes it so that $u$ and $T^*v$ are orthogonal as well. You can therefore think of $T^*$ geometrically as the operator corresponding to $T$ that preserves the inner product.
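
Here is a small NumPy sketch of the coordinate formula above (the bases $U$, $V$ and the matrix $[T]_{V,U}$ are random placeholders, chosen only to verify $\langle Tu, v\rangle = \langle u, T^*v\rangle$ numerically):

```python
import numpy as np

rng = np.random.default_rng(3)
l, n, m = 6, 2, 3

U = rng.standard_normal((l, n))      # basis of an n-dimensional subspace of R^l (columns)
V = rng.standard_normal((l, m))      # basis of an m-dimensional subspace of R^l
T_VU = rng.standard_normal((m, n))   # coordinate matrix [T]_{V,U} of some T : U -> V

# adjoint in coordinates: [T*]_{U,V} = (U^T U)^{-1} [T]^T (V^T V)
T_star_UV = np.linalg.solve(U.T @ U, T_VU.T @ (V.T @ V))

# check <T u, v> = <u, T* v> with the inner product inherited from R^l
cu = rng.standard_normal(n)                 # coordinates [u]_U
cv = rng.standard_normal(m)                 # coordinates [v]_V
lhs = (V @ (T_VU @ cu)) @ (V @ cv)          # <T u, v> computed in R^l
rhs = (U @ cu) @ (U @ (T_star_UV @ cv))     # <u, T* v> computed in R^l
print(np.isclose(lhs, rhs))                 # True
```

(Using `np.linalg.solve` just avoids forming $(U^TU)^{-1}$ explicitly.)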

More generally, the spaces $\mathcal{U}$ and $\mathcal{V}$ need not be subspaces of $\mathbb{R}^l$. You can for instance pick $\mathcal{U}$ to be the space of polynomials of degree at most $n-1$: $\mathcal{P}_{n-1}(\mathbb{C})$, and $\mathcal{V} = \mathbb{C}^m$. You can define some inner products for the two spaces, e.g. $\langle p, q\rangle_{\mathcal{U}} = \int_a^b \overline{p(x)}q(x)h(x)\,dx$ and $\langle v,w\rangle_{\mathcal{V}} = \overline{v}^TGw$. Now you can take $T$ to be, for instance, some derivative operators followed by evaluation at specific points: $$T=\begin{bmatrix}\delta_{a_1} \circ \frac{d^{k_1}}{dx^{k_1}}\\ \vdots \\ \delta_{a_m}\circ\frac{d^{k_m}}{dx^{k_m}}\end{bmatrix}.$$

You can again introduce bases for $\mathcal{U}$ (e.g. the monomial or Lagrange basis) and for $\mathcal{V}$ (e.g. the standard basis) and compute the adjoint. This time around, however, your coordinate inner products would involve the Gram matrices $\overline{U}^THU$ and $\overline{V}^TGV$, because the inner products are not the standard ones.

But as before $[T]_{V,U}\in\mathbb{C}^{m\times n}$ and $$[T^*]_{U,V} = (\overline{U}^THU)^{-1}\overline{[T]_{V,U}}^T(\overline{V}^TGV)$$
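
As a final sanity check of this general formula, here is a sketch that draws random Hermitian positive-definite stand-ins for the Gram matrices $\overline{U}^THU$ and $\overline{V}^TGV$ (only the Gram matrices enter the formula, so this suffices for a numerical check; the coordinate inner products are conjugate-linear in the first slot, as above):

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 4, 3

def random_hpd(k):
    # random Hermitian positive-definite matrix, standing in for a Gram matrix
    B = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
    return B.conj().T @ B + k * np.eye(k)

HU = random_hpd(n)        # stands in for  conj(U)^T H U
GV = random_hpd(m)        # stands in for  conj(V)^T G V
T = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))   # [T]_{V,U}

# [T*]_{U,V} = (conj(U)^T H U)^{-1}  conj([T])^T  (conj(V)^T G V)
T_star = np.linalg.solve(HU, T.conj().T @ GV)

# coordinate inner products, conjugate-linear in the first slot
ipU = lambda a, b: a.conj() @ (HU @ b)
ipV = lambda a, b: a.conj() @ (GV @ b)

u = rng.standard_normal(n) + 1j * rng.standard_normal(n)
v = rng.standard_normal(m) + 1j * rng.standard_normal(m)
print(np.isclose(ipV(T @ u, v), ipU(u, T_star @ v)))   # True
```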

lightxbulb