
Context

I am studying quantum field theory. I have continually struggled to understand tensor notation. There seem to be different styles of writing it. For example, I have seen $$ \Lambda_\mu^\nu,~ {\Lambda_\mu}^\nu,~{\Lambda^\mu}_\nu. $$

Question 1. How are these three different from each other?

Further, and in particular, let $$ {x'}^\mu = {a^\mu}_\nu \,{x}^\nu, $$ where ${a^\mu}_\nu$ are the elements of a Lorentz transformation.

Bethe and Jackiw expound on the Dirac equation and its formal theory [1]. In so doing, there is a detail that I do not follow, namely $$ {a_\mu}^\nu\,{a^\mu}_\lambda = a^{\mu\nu}\,a_{\mu\lambda} = a^{\nu\mu}\,a_{\lambda\mu} = {\delta^\nu}_\lambda \qquad\qquad (\text{for}~{a^\mu}_\nu~\text{real}). $$

The metric tensor here is $g$, where $-g_{00} = g_{11} = g_{22} = g_{33} = 1$ and the other elements are zero, and the inverse metric tensor is written as $g^{\mu\nu}$.

I know that for any Lorentz transformation $\Lambda$ [2]
$$ \Lambda~g~\Lambda^\top~g^{-1} = I. \tag{1} $$
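(As a quick numerical sanity check, Eq. (1) does hold for a simple boost along $x$; below is a minimal NumPy sketch I put together, where the boost matrix and the value $\beta = 0.6$ are just an arbitrary illustrative choice.)

```python
import numpy as np

beta = 0.6                                   # arbitrary boost speed
gamma = 1.0 / np.sqrt(1.0 - beta**2)

# Metric with -g00 = g11 = g22 = g33 = 1, as above.
g = np.diag([-1.0, 1.0, 1.0, 1.0])

# Standard boost along x: rows are the upper index, columns the lower index.
Lam = np.array([[ gamma,      -gamma*beta, 0.0, 0.0],
                [-gamma*beta,  gamma,      0.0, 0.0],
                [ 0.0,         0.0,        1.0, 0.0],
                [ 0.0,         0.0,        0.0, 1.0]])

# Eq. (1): Lambda g Lambda^T g^{-1} = I
print(np.allclose(Lam @ g @ Lam.T @ np.linalg.inv(g), np.eye(4)))   # True
```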

Question 2. Can you kindly show in a step-by-step fashion how to write Eq. (1) as a tensor equation?

Question 3. Can you kindly show in step-by-step fashion that your answer to Question 2 simplifies to $$ {a_\mu}^\nu\,{a^\mu}_\lambda = {\delta^\nu}_\lambda \qquad\qquad (\text{for}~{a^\mu}_\nu~\text{real})? $$

Question 4. Can you kindly show in step-by-step fashion that your answer to Question 2 simplifies to $$ a^{\mu\nu}\,a_{\mu\lambda} = {\delta^\nu}_\lambda \qquad\qquad (\text{for}~{a^\mu}_\nu~\text{real})? $$

Question 5. Can you kindly show in step-by-step fashion that your answer to Question 2 simplifies to $$ a^{\nu\mu}\,a_{\lambda\mu} = {\delta^\nu}_\lambda \qquad\qquad (\text{for}~{a^\mu}_\nu~\text{real})? $$

Question 6. I have several books on tensors and yet I still feel lost about these notational details. So, lastly, can you kindly provide a reference to a book that goes into these notational details?

Bibliography

[1] Bethe and Jackiw, “Intermediate Quantum Mechanics”, p. 361.

[2] How is the Lorentz group, $\text{O}(1,3)$, defined using set theoretic notation?

Michael Levy
  • In brief, the spacing is arbitrary and irrelevant — BUT if you are going to be raising and lowering indices, you want to keep track of the order, and hence you introduce the spacing so that a lowered (raised) index stays in the correct slot. – Ted Shifrin May 06 '25 at 18:21

2 Answers


The key phrase for tensor notation on Riemannian manifolds is 'northwest-southeast' notation (Wheeler) for the Jacobian, the basis transformation of the tangent spaces.

$$\partial_{x^\mu} y(x)^{a} ={\Lambda(x)^a}_{\mu} $$

$$\partial_{y^a} x(y)^\mu ={\left(\Lambda(y)^{-1}\right)^\mu}_{\ a} $$

Wheeler exchanges the object-class designator character between the invariant objects and their indices, so that $x$ is the point in any chart, $\Lambda(x,y) = \Lambda(y,x)^{-1}$ is a chart-to-chart map, and the charts in use are denoted by the character class used in the index notation.

$$\partial_{x^a} x^\mu= {\Lambda(x)^\mu}_{\ a} $$

In short, in Einstein-Wheeler notation,

$${x^a}_{,\,\mu} = {\Lambda^a}_{\ \mu}.$$
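As a concrete illustration (the charts and the point are my own arbitrary choice, not Wheeler's), here is a small NumPy sketch of this bookkeeping for a Cartesian/polar chart change:

```python
import numpy as np

r, phi = 2.0, 0.7                      # arbitrary point in the polar chart

# Lambda^a_mu = d y^a / d x^mu, with y the Cartesian chart and x = (r, phi):
# rows = upper (Cartesian) index a, columns = lower (polar) index mu.
J = np.array([[np.cos(phi), -r*np.sin(phi)],
              [np.sin(phi),  r*np.cos(phi)]])

# (Lambda^{-1})^mu_a = d x^mu / d y^a: rows = polar index, columns = Cartesian index.
J_inv = np.linalg.inv(J)

# The two Jacobians contract to the Kronecker delta in either order.
print(np.allclose(J @ J_inv, np.eye(2)), np.allclose(J_inv @ J, np.eye(2)))
```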

Let $\sigma:(1,\dots,d) \to (n_1=\mu, n_2=\nu ,\dots,n_d)$ be an enumeration fixing the orientation, $e(x)_\mu,\dots$ an oriented tangent basis, and $$g(x)_{\mu\nu} \ = \ e(x)_\mu \cdot e(x)_\nu$$ the local matrix of inner products of the local tangent basis.

Now consider linear functionals on the tangent basis. This class of tensor objects displays Wheeler's notation much more naturally and explicitly:

$$\mathbf d f(x) = \sum_\mu f_{,\mu } \mathbf d x^\mu .$$

A tangent basis is called holonomic if the linearized functions $\mathbf d x^\mu$ of the coordinates are the projections onto the components of the tangent space

$$\mathbf d x^\mu\left(\sum_\nu v(x)^\nu \ e(x)_\nu \right) = v(x)^\mu$$ which is accomplished by

$$\mathbf d x^\mu \left(e_\nu\right) \ = \ {\delta^\mu}_{\ \nu} $$

Now the rest of the tensor algebra is easy: the metric on the differentials is the inverse

$$\mathbf d x^\mu \cdot \mathbf d x^\nu = g^{\mu \ \nu} =\left(g^{-1}\right)_{\mu\ \nu} ,$$ so

$$g(x)^{\lambda \mu} \ g_{\mu \nu} = {\delta^\lambda}_{\ \nu}.$$
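A minimal numerical sketch of this inverse relation, with an arbitrarily chosen holonomic polar tangent basis:

```python
import numpy as np

r, phi = 2.0, 0.7                                    # arbitrary point
e_r   = np.array([np.cos(phi),    np.sin(phi)])      # e(x)_r
e_phi = np.array([-r*np.sin(phi), r*np.cos(phi)])    # e(x)_phi

basis   = np.stack([e_r, e_phi])                     # rows are the basis vectors
g_lower = np.einsum('mi,ni->mn', basis, basis)       # g_{mu nu} = e_mu . e_nu
g_upper = np.linalg.inv(g_lower)                     # g^{mu nu}

# g^{lambda mu} g_{mu nu} = delta^lambda_nu
print(np.allclose(np.einsum('lm,mn->ln', g_upper, g_lower), np.eye(2)))   # True
```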

Any general tensor of rank $n$ is an ordered linear combination of tensor products of the tangent basis:

$$T(x) = \sum_{\nu_1,\dots, \nu_n} \ t(x)^{\nu_1,\dots, \nu_n} \ e(x)_{\nu_1} \otimes \ \dots \otimes e(x)_{\nu_n}$$

The tensor product is simply generated by collecting all scalar factors as a product in its leftmost slot, i.e. in the tensor product the common scalar factor, its component in the basis product, is treated as a commuting tensor of rank 0.

$$1 \otimes e(x)_{\nu_1} \otimes \ \dots \otimes c \ e(x)_{\nu_k} \otimes \dots \otimes e(x)_{\nu_n} = c\ \otimes e(x)_{\nu_1} \otimes \ \dots \otimes e(x)_{\nu_k} \otimes \dots \otimes e(x)_{\nu_n}$$
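As an illustration of this expansion and of the scalar-collection rule (with arbitrary components and the standard basis as a stand-in for $e(x)_\nu$):

```python
import numpy as np

e = np.eye(2)                                # stand-in basis vectors e_0, e_1
t = np.array([[1.0, 2.0],
              [0.5, -1.0]])                  # arbitrary components t^{mu nu}

# T = sum_{mu,nu} t^{mu nu} e_mu (x) e_nu
T = np.einsum('mn,mi,nj->ij', t, e, e)
print(np.allclose(T, t))                     # True: components come back in this basis

# c e_0 (x) e_1 equals (c e_0) (x) e_1: the scalar commutes through the product.
c = 3.0
print(np.allclose(c * np.einsum('i,j->ij', e[0], e[1]),
                  np.einsum('i,j->ij', c * e[0], e[1])))   # True
```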

Since

$$\sum_\nu \mathbf d x^\lambda \left ( \ g^{\mu \nu} \ e_\nu \right) = g^{\mu \nu}\,{\delta ^\lambda}_{\ \nu} = \ g^{\mu \lambda} $$

we can identify the basis with index raised by the inverse metric with the basis of linear forms, and likewise in reverse:

$$e_\mu = g_{\mu \nu} \ \mathbf d x^\nu.$$

Now any tensor has a complete representation set where each index position, lower or upper, denotes the position in the mixed tensor product bases.
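A short sketch of this raising and lowering of an index slot, with an arbitrary Minkowski-type metric and arbitrary vector components:

```python
import numpy as np

g  = np.diag([-1.0, 1.0, 1.0, 1.0])      # g_{mu nu}
gi = np.linalg.inv(g)                    # g^{mu nu}

v_up   = np.array([3.0, 1.0, -2.0, 0.5])         # v^mu (arbitrary)
v_down = np.einsum('mn,n->m', g,  v_up)           # v_mu = g_{mu nu} v^nu
back   = np.einsum('mn,n->m', gi, v_down)         # g^{mu nu} v_nu

print(v_down, np.allclose(back, v_up))            # raising undoes lowering
```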

In physics, the point of view is generally that of a set of invariant objects presented in different charts, because the underlying algebra cannot be fixed axiomatically (out of ignorance) and remains open to deformations.

In pure mathematics, on the other hand, algebraic symbols denote fixed structural objects that change their symbol under any algebraic action.

Roland F

A very good reference for the questions you are asking is Naber, “The Geometry of Minkowski Spacetime”.

The basic point here is that for any mixed tensor $$ {T_a}^b:={T^d}_c g_{ad}g^{bc}. $$ For a tensor like $\delta^b_a$ $$ {\delta_a}^b={\delta^d}_c g_{ad}g^{bc}=g_{ac}g^{bc}={\delta^b}_a=\delta^b_a $$ and so the position of the covariant and the contravariant index is not important. However if we start with something like $$ \def\tb#1{{\scriptstyle #1}} {T^b}_a=\begin{array}{cc} & \begin{matrix} \tb{a=0 }& \tb{a=1 } & \tb{a=2 }& \tb{a=3 }\end{matrix} \\ \begin{matrix} \tb{b=0}\\ \tb{b=1}\\ \tb{b=2}\\ \tb{b=3} \end{matrix} & \begin{bmatrix} 1 & -1 & 2 & 0 \\ 1 & -1 & 0 & 2 \\ 1 & 0 & -1 & 2 \\ 0 & 1 & -1 &2 \end{bmatrix} \end{array} $$ then $$ {T_a}^b={T^d}_c g_{ad}g^{bc}= \begin{array}{cc} & \begin{matrix} \tb{a=0 }& \tb{a=1 } & \tb{a=2 }& \tb{a=3 }\end{matrix} \\ \begin{matrix} \tb{b=0}\\ \tb{b=1}\\ \tb{b=2}\\ \tb{b=3} \end{matrix} & \begin{bmatrix} 1 & -1 & -1 & 0 \\ 1 & -1 & 0 & 1 \\ -2 & 0 & -1 & -1 \\ 0 & 2 & 2 &2 \end{bmatrix} \end{array}. $$ Since the two matrices are not equal it makes sense to distinguish them by writing the indices as above.
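For readers who like to check this numerically, here is the same computation done with `numpy.einsum`; the arrays are stored with the displayed labelling, row = contravariant index $b$, column = covariant index $a$:

```python
import numpy as np

g  = np.diag([-1.0, 1.0, 1.0, 1.0])      # g_{ab}
gi = np.linalg.inv(g)                    # g^{ab}

T_ud = np.array([[1., -1.,  2., 0.],     # T^b_a : row b, column a
                 [1., -1.,  0., 2.],
                 [1.,  0., -1., 2.],
                 [0.,  1., -1., 2.]])

# T_a^b = T^d_c g_{ad} g^{bc}, stored again with row b, column a.
T_du = np.einsum('ad,bc,dc->ba', g, gi, T_ud)
print(T_du)                              # reproduces the second matrix above
print(np.allclose(T_du, T_ud))           # False: the two arrangements differ
```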

A Lorentz transformation has the property of preserving vector norms $$ (\pmb \Lambda \pmb x)^\top \pmb g (\pmb \Lambda \pmb x) = \pmb x^\top \pmb g \pmb x $$ for all $\pmb x \in \mathcal{M}$ ($\mathcal{M}$ is the pseudo-Euclidean vector space known as the Minkowski spacetime). Therefore $$ \pmb \Lambda^\top \pmb g \pmb \Lambda = \pmb g. $$ Multiply both sides by $\pmb g^{-1}$ to get $$ \pmb g^{-1}\pmb \Lambda^\top \pmb g \pmb \Lambda = \pmb I, $$ from which we conclude that $$ \pmb \Lambda^{-1}=\pmb g^{-1}\pmb \Lambda^\top \pmb g. $$ If we start with a matrix equation like $$ \pmb \Lambda\pmb \Lambda^{-1}=\pmb \Lambda \pmb g^{-1} \pmb \Lambda^\top \pmb g=\pmb I $$ we can use the standard convention for matrix multiplication to write it as $$ \sum_{j,k,l}\Lambda_{ij} g^{-1}_{jk} \Lambda^\top_{kl} g_{lm}=I_{im}. $$ Since $\Lambda^\top_{kl} =\Lambda_{lk}$, if we use the convention that contravariant indices represent matrix rows while covariant indices represent matrix columns, we can rewrite the last expression as $$ \sum_{j,k,l}{\Lambda^{i}}_j g^{jk} {\Lambda^l}_k g_{lm}=\delta^i_m, $$ where we have used the fact that $g^{-1}$ is a (2,0) tensor and $g$ is a (0,2) tensor. If we use the Einstein summation convention we can write $$ {\Lambda^{i}}_j g^{jk} {\Lambda^l}_k g_{lm}=\delta^i_m. \tag{*} $$ $\Lambda$ represents a Lorentz transformation and so it is a (1,1) tensor. The transpose of a (1,1) tensor is given by $$ ({\Lambda^b}_a)^\top=(\Lambda_{ca}\, g^{cb})^\top=g^{bc}(\Lambda_{ca})^\top =g^{bc}\Lambda_{ac}={\Lambda_a}^b. $$ So $(*)$ can be written as $$ {\Lambda^{i}}_j {\Lambda_m}^j=\delta^i_m, $$ i.e. the inverse of a Lorentz transformation is its transpose. Note that the transpose of a Lorentz transformation is not the transpose of the matrix. For example, if $$ {\Lambda^b}_a=\begin{array}{cc} & \begin{matrix} \tb{a=0 }& \tb{a=1 } & \tb{a=2 }& \tb{a=3 }\end{matrix} \\ \begin{matrix} \tb{b=0}\\ \tb{b=1}\\ \tb{b=2}\\ \tb{b=3} \end{matrix} & \begin{bmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \end{array} $$ then $$ \Lambda^\top ={\Lambda_a}^b={\Lambda^d}_c g_{ad}g^{cb} = \begin{array}{cc} & \begin{matrix} \tb{a=0 }& \tb{a=1 } & \tb{a=2 }& \tb{a=3 }\end{matrix} \\ \begin{matrix} \tb{b=0}\\ \tb{b=1}\\ \tb{b=2}\\ \tb{b=3} \end{matrix} & \begin{bmatrix} \gamma & \gamma\beta & 0 & 0 \\ \gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \end{array}. $$ Since $\gamma^2=1/(1-\beta^2)$ we have $$ {\Lambda^{i}}_j {\Lambda_m}^j= \begin{bmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \gamma & \gamma\beta & 0 & 0 \\ \gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}= \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. $$
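Numerically (a sketch, with $\beta = 0.6$ an arbitrary choice):

```python
import numpy as np

beta  = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)
g  = np.diag([-1.0, 1.0, 1.0, 1.0])      # g_{ab}
gi = np.linalg.inv(g)                    # g^{ab}

L_ud = np.array([[ gamma,      -gamma*beta, 0.0, 0.0],    # Lambda^b_a : row b, column a
                 [-gamma*beta,  gamma,      0.0, 0.0],
                 [ 0.0,         0.0,        1.0, 0.0],
                 [ 0.0,         0.0,        0.0, 1.0]])

# Defining property: Lambda^T g Lambda = g.
print(np.allclose(L_ud.T @ g @ L_ud, g))                            # True

# Lambda_a^b = Lambda^d_c g_{ad} g^{cb}, stored with row b, column a.
L_du = np.einsum('ad,cb,dc->ba', g, gi, L_ud)

# Lambda^i_j Lambda_m^j = delta^i_m
print(np.allclose(np.einsum('ij,jm->im', L_ud, L_du), np.eye(4)))   # True
```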

From this point getting alternative expressions involves index juggling. For example, \begin{alignat*}{2} {\Lambda^{i}}_j {\Lambda_m}^j &=\Lambda^{ik} g_{kj} \Lambda_{ml} g^{lj} \quad && {\Lambda^{i}}_j =\Lambda^{ik} g_{kj};\; {\Lambda_m}^j=\Lambda_{ml} g^{lj}\\ & =\Lambda^{ik}\Lambda_{ml} \delta^l_k \quad && g_{kj}g^{lj}=\delta^l_k \\ &=\Lambda^{ik}\Lambda_{mk} \quad && \Lambda_{mk}=\Lambda_{ml}\delta^l_k. \end{alignat*}
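The same check for the fully raised and lowered components (again a self-contained sketch with an arbitrary $\beta$):

```python
import numpy as np

beta  = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)
g  = np.diag([-1.0, 1.0, 1.0, 1.0])
gi = np.linalg.inv(g)

L_ud = np.array([[ gamma,      -gamma*beta, 0.0, 0.0],    # Lambda^i_j
                 [-gamma*beta,  gamma,      0.0, 0.0],
                 [ 0.0,         0.0,        1.0, 0.0],
                 [ 0.0,         0.0,        0.0, 1.0]])

L_uu = np.einsum('ij,jk->ik', L_ud, gi)   # Lambda^{ik} = Lambda^i_j g^{jk}
L_dd = np.einsum('ml,lk->mk', g,  L_ud)   # Lambda_{mk} = g_{ml} Lambda^l_k

# Lambda^{ik} Lambda_{mk} = delta^i_m
print(np.allclose(np.einsum('ik,mk->im', L_uu, L_dd), np.eye(4)))   # True
```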

Ted Black