19

I am horribly confused by the cluster of terminology and operations surrounding "change of basis". Finding alternate references on this topic only seems to add to the confusion, as there doesn't appear to be a consistent approach to defining and notating these operations. Perhaps someone will be able to clarify just one simple aspect of this, which is as follows:

Let $u = \{u_1, \dots, u_n \}$ and $w = \{w_1, \dots, w_n\}$ be bases for a vector space $V$. Then, necessarily, there exists a unique linear operator $T:V \rightarrow V$ such that $T(u_i) = w_i$. Now, the most natural thing in the world to call the matrix of this operator is the change of basis matrix from $u$ to $w$. Give this operator a vector in $u$ and it spits out a vector in $w$. Now, whether it is correct I don't know, but I've seen the matrix of this operator called the change of basis matrix from $w$ to $u$, reversing the target and source bases. This latter interpretation makes no sense because it takes vectors in $u$ and produces vectors in $w$! I've seen this interpretation in more than one place, so it can't just be a fluke. So...which is it?

ItsNotObvious
  • 11,263
  • 1
the change of basis is also a bijection, so its inverse also defines $T$ uniquely. I agree that such terminology is quite unnecessary. –  Aug 23 '11 at 22:29
  • Please state which sources call it "...from w to u". I've not seen that reversal of terminology in standard texts. – Shaun Ault Aug 23 '11 at 22:29
  • 1
    I've seen it in more than one place, but here is a concrete example: http://math.stanford.edu/~conrad/diffgeomPage/handouts/orient.pdf – ItsNotObvious Aug 23 '11 at 22:33
  • I've seen some people say that if $T$ is a linear operator represented by a matrix $A$ in some basis, then $T(x)=x^TA$ and some people use $T(x)=Ax$. If you use the first way, then maybe it makes more sense to use the "backwards" wording? Just a guess. I don't have any idea if this actually coheres with reality. – Matt Aug 23 '11 at 23:49
  • 2
    In which basis are you writing the "matrix of this operator"? – Willie Wong Aug 24 '11 at 00:23
  • 1
    A personal opinion: The most convenient way to define a basis of an $n$-dimensional $K$-vector space $V$ is to say that it's a linear isomorphism from $K^n$ to $V$. The categorical viewpoint is really helpful: try to express everything in terms of objects and arrows, and things get easier... – Pierre-Yves Gaillard Aug 24 '11 at 01:28
  • I know that Qiaochu answered a question almost identical to this one. I'm going to look for it... – Arturo Magidin Aug 24 '11 at 03:19
  • 1
    Here it is. Would this be a duplicate, then? – Arturo Magidin Aug 24 '11 at 03:21
  • @Arturo: Funny, my first thought on reading the question was that Qiaochu might answer it :-) – joriki Aug 24 '11 at 04:00
  • @Arturo: Yes, I've voted to close as a duplicate of that question. By the way, does anyone know why the automatic "possible duplicate" notice no longer appears when you do that? – joriki Aug 24 '11 at 04:15
  • Also related perhaps (although it's about confusion between transformation of basis vectors and of coordinates, rather than about choice of terminology): http://math.stackexchange.com/questions/36387/confusion-on-different-representations-of-2d-rotation-matrices/36399#36399 – Hans Lundmark Aug 24 '11 at 05:43
After looking at all of the comments and examining the "duplicate" posts, my question still stands: Is $T$ as it is defined above the "change of basis matrix from $u$ to $w$" or the "change of basis matrix from $w$ to $u$"? – ItsNotObvious Aug 24 '11 at 11:37
  • 2
    Willie Wong's question still stands: in which basis are you writing the "matrix of this operator"? You go from linear operator $T$ to matrix of this operator as if there was only one such thing, but the matrix representing $T$ represents it with respect to some particular basis and until you tell us which basis you're using your question makes no sense. See also Arturo's answer. – Gerry Myerson Aug 25 '11 at 06:22
  • I agree. This whole change of basis concept is so horribly described in most mathematical texts and documents. – Your neighbor Todorovich Jan 15 '24 at 15:39

5 Answers

17

Maybe looking at the one-dimensional case will clarify the point of confusion. "seconds" and "minutes" are both units of time and can be taken to be bases of a one-dimensional real vector space representing time.

If I ask what factor takes me from the basis {seconds} to the basis {minutes}, the answer is 60. (The (1 by 1) matrix consisting of the number 60 is the $T$ of the question.)

However, if I ask how many minutes 120 seconds is equal to, the factor I need to apply is 1/60.

In either case, I am "going from seconds to minutes", but in the first case I am changing the basis elements themselves, from the basis {seconds} to the basis {minutes}, while in the other case, I am converting a fixed unit of time from seconds to minutes. The matrices in the two cases are inverses of each other.

The difference in terminology depends on which of these procedures you think should be called the "change of basis matrix".
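For concreteness, here is a minimal numpy sketch of this one-dimensional case (the 1-by-1 matrices are just the factors discussed above):

```python
import numpy as np

# The new basis vector, expressed in the old one: 1 minute = 60 seconds,
# so the matrix taking the basis {seconds} to the basis {minutes} is [60].
T = np.array([[60.0]])

# Converting the *coordinates* of a fixed amount of time goes the other way:
# 120 seconds should come out as 2 minutes.
coords_in_seconds = np.array([120.0])
coords_in_minutes = np.linalg.inv(T) @ coords_in_seconds
print(coords_in_minutes)  # [2.] -- multiply by 1/60, the inverse of T
```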

Ted
  • 35,732
  • 2
    Ok, this has been the clearest, most direct and to-the-point explanation. Thanks! – Ilya Sep 24 '11 at 01:28
8

The "change of basis matrix from $\beta$ to $\gamma$" or "change of coordinates matrix from $\beta$-coordinates to $\gamma$-coordinates" is the matrix $A$ with the property that for every vector $v\in V$, $$A[v]_{\beta} = [v]_{\gamma},$$ where $[x]_{\alpha}$ is the coordinate vector of $x$ relative to $\alpha$. This matrix $A$ is obtained by considering the coordinate matrix of the identity linear transformation, from $V$-with-basis-$\beta$ to $V$-with-basis-$\gamma$; i.e., $[\mathrm{I}_V]_{\beta}^{\gamma}$.

Now, you say you want to take $T\colon V\to V$ that sends $v_i$ to $w_i$, and consider "the matrix of this linear transformation". Which matrix? With respect to what basis? The matrix of $T$ relative to $\beta$ and $\gamma$, $[T]_{\beta}^{\gamma}$, is just the identity matrix. So not that one.

Now, if you take $[T]_{\beta}^{\beta}$; i.e., you express the vectors $w_i$ in terms of the vectors $v_i$, what do you get? You get the matrix that takes $[x]_{\gamma}$ and gives you $[x]_{\beta}$; that is, you get the change-of-coordinates matrix from $\gamma$ to $\beta$. To see this, note, for example, that $[w_1]_{\gamma} = (1,0,0,\ldots)^t$, so $[T]_{\beta}^{\beta}[w_1]_\gamma$ is the first column of $[T]_{\beta}^{\beta}$, which is how you express $w_1$ in terms of $\beta$.

Which is why it would be the "change of basis matrix from $\gamma$ to $\beta$". Because, as Qiaochu mentions in the answer I linked to, the "translation" of coordinate vectors achieved by this matrix goes "the other way": it translates from $\gamma$-coordinates to $\beta$-coordinates, even though you "defined" $T$ as "going" from $\beta$ to $\gamma$.
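A small numeric sketch of this last point (both bases below are made-up examples): building $[T]_{\beta}^{\beta}$ column by column produces exactly the matrix that converts $\gamma$-coordinates into $\beta$-coordinates.

```python
import numpy as np

# Columns are basis vectors in standard coordinates of R^2 (my own choices).
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])      # beta  = {v1, v2}
G = np.array([[2.0, 1.0],
              [1.0, 3.0]])      # gamma = {w1, w2}

# [T]_beta^beta: column i holds [T(v_i)]_beta = [w_i]_beta = B^{-1} w_i.
T_bb = np.linalg.solve(B, G)

# Claim: this same matrix converts gamma-coordinates to beta-coordinates.
x = np.array([5.0, -1.0])                   # arbitrary vector, standard coords
x_beta = np.linalg.solve(B, x)              # [x]_beta
x_gamma = np.linalg.solve(G, x)             # [x]_gamma
print(np.allclose(T_bb @ x_gamma, x_beta))  # True: [T]_b^b [x]_gamma = [x]_beta
```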

Arturo Magidin
  • 417,286
3

If $(u_1,\ldots,u_n)$ and $(w_1,\ldots,w_n)$ are bases of $V$ then there is indeed a unique linear transformation $T:\ V\to V$ such that $T(u_i)=w_i$ $(1\leq i\leq n)$, but this transformation is of no help in understanding what is going on here.

What is at stake is the following: Any vector $x\in V$ has some coordinates $(x_1,\ldots, x_n)$ with respect to the "old" basis $(u_1,\ldots,u_n)$ and another set of coordinates $(\bar x_1,\ldots, \bar x_n)$ with respect to the "new" basis $(w_1,\ldots,w_n)$. The vectors $x$ do not move, but you want to know the connection between the $x_k$ and the $\bar x_i$.

The data about this coordinate transformation are stored in a matrix $T=(t_{ki})_{1\leq k\leq n,\ 1\leq i\leq n}$ in the following way: Any "new" basis vector $w_i$ is a linear combination of the old basis vectors $u_k$; therefore there are (given) numbers $t_{ki}$ such that $$w_i=\sum_{k=1}^n t_{ki} u_k\ .$$ This is to say that in the columns of $T$ we see the "old coordinates" of the "new" basis vectors. Now any vector $x\in V$ has "new coordinates" $\bar x_i$. Writing this out we have $$x=\sum_{i=1}^n \bar x_i w_i= \sum_{i,k} \bar x_i t_{ki} u_k= \sum_{k=1}^n \Bigl(\sum_{i=1}^n {t_{ki} \bar x_i}\Bigr) u_k\ ,$$ and we see that the "old coordinates" $x_k$ of the same vector $x\in V$ are given by $$x_k\ =\ \sum_{i=1}^n t_{ki}\bar x_i\ .$$ If we write our "coordinate vectors" as column vectors, we therefore have the formula $x=T\ \bar x$.
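As a sanity check, here is a tiny numpy sketch of the formula $x = T\,\bar x$, with made-up coefficients $t_{ki}$:

```python
import numpy as np

# Work in old coordinates throughout. The columns of T hold the old
# coordinates of the new basis vectors w_1, w_2.
T = np.array([[1.0, 1.0],
              [2.0, -1.0]])     # w1 = u1 + 2 u2,  w2 = u1 - u2

x_bar = np.array([3.0, 1.0])    # new coordinates of some vector x
x_old = T @ x_bar               # old coordinates of the same vector
print(x_old)                    # [4. 5.]: x = 3 w1 + 1 w2 = 4 u1 + 5 u2
```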

One has to get accustomed to the fact that the symbol $x$ denotes at the same time the "geometric object" $x$ and its "coordinate vector" with respect to the "old basis".

  • It's not quite true that $T$ is "of no help", since the coordinate matrix of $T$, viewed as a linear transformation $(V,(u_1,\ldots,u_n))\to (V,(w_1,\ldots,w_n))$, is precisely the matrix that transforms "coordinates" relative to $(w_1,\ldots,w_n)$ into coordinates relative to $(u_1,\ldots,u_n)$. – Arturo Magidin Aug 24 '11 at 18:54
  • "One has to get accustomed to the fact that the symbol $x$ denotes ..." I think that is why most introductory texts give some variant of the notation used by Arturo in his post below/above, where $x$ is the geometric vector and $[v]_\alpha$ is its coordinate representation in a coordinate system $\alpha = (e_1, \ldots, e_n)$. – Willie Wong Aug 24 '11 at 18:56
0

Now, the most natural thing in the world to call the matrix of this operator is the change of basis matrix from $u$ to $w$.

It seems the most natural thing in the world to you. Interesting.

(And let me abruptly say that calling it the change of basis matrix from $w$ to $u$, reversing the target and source bases, seems to me completely illogical!)

But wait: what matrix, exactly? As long as $V$ is an abstract vector space (v.s.), $T$ is an endomorphism which admits one possible matrix representation for each choice of a pair of bases of $V$. Which one are you picking?

(If $V$ were $R^3$, instead, then there would exist one intrinsic matrix representation of $T$. The idea is that the vector $(1,2,3)\in R^3$ has one intrinsic “name”, $(1,2,3)$, and infinitely many “nicknames,” one for each possible basis of $R^3$, given by the components of $(1,2,3)$ in the considered basis. And the same applies to $T$.)

Actually, the name change of basis matrix from $u$ to $w$ should be reserved for something completely different.

Even though $V$ is a completely abstract v.s., there actually is just one matrix that serves as a change of basis. It's definable by observing that $$ w_i=c_{ji}\,u_j~, $$ (please, not the other way round, as someone claims: that would be complete nonsense, because in mathematics new things are defined in terms of old ones, not vice versa!) where in the r.h.s. I assumed an implied summation over the repeated index $j$ (as in the Einstein notation). Since it's clear that we have $n^2$ coefficients $c_{ji}$, it's also clear that they can be arranged in an $n\times n$ matrix, paying attention to the fact that the first index, $j$, must be a row index, while the second, $i$, a column index.

Doing so, you end up with the matrix $C=[c_{ji}]$. There you have it: the change of basis matrix from $u$ to $w$.

N.B.: Only in the case $V=F^n$ ($F$ being a field, e.g. $R$ or $C$) may the previous equation be put in matrix form. (But $C$ is a matrix in any case.) Calling $U$ the matrix obtained by putting side by side the $n$ vectors $u_i$ (we can do that because, in this case, the $u_i$ are $n$-tuples!), and $W$ the matrix obtained by putting side by side the $n$ vectors $w_i$, the previous equation assumes the form $$ W=U\,C~, $$ in this order.
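For what it's worth, a short numpy sketch of $W = U\,C$ with invented numbers:

```python
import numpy as np

U = np.array([[1.0, 0.0],
              [1.0, 1.0]])      # columns: old basis vectors u1, u2
C = np.array([[2.0, 1.0],
              [0.0, 1.0]])      # c_ji: column i = old coordinates of w_i

W = U @ C                       # columns: new basis vectors w1, w2
# Column i of W is indeed sum_j c_ji u_j:
print(np.allclose(W[:, 0], 2 * U[:, 0] + 0 * U[:, 1]))  # True
print(np.allclose(W[:, 1], 1 * U[:, 0] + 1 * U[:, 1]))  # True
```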

Advocates of what I called nonsense, please consider the following example.

Suppose you have an object of length 10 cm. Now I ask you to measure it in terms of a new unit: the "pippolo." (This sounds like a very funny name to an Italian ear.) You'd ask me, in turn, "how long is a pippolo, in cm?" Right? (Would you really ask me, instead, "how many pippolos is a cm long?" Come on!) Well: 1 pp (the new unit) = 2 cm (the old unit).

Now, since 1 pp = 2 cm, you know that the object is 10/2 = 5 pp.

OK: the multiplication by 2, in this case, is the matrix of change of basis, while the division by 2 (its inverse!) is the matrix of change of representation.

This example is a silly one, but the inverse relationship it illustrates is the core feature of this problem.

0

Multiplying by the change of basis matrix from the standard basis $\alpha$ to another basis $\beta$ maps a vector to the vector that has those same coordinates according to $\beta$.

So it makes sense if you think of it not as mapping a vector to some coordinates for that vector, but rather as mapping a vector to a different vector that plays the same role in the new basis.
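A minimal numpy sketch of this reading (the basis below is an arbitrary example of mine): the matrix $P$ whose columns are the $\beta$-vectors sends a coordinate tuple to the vector having those coordinates according to $\beta$.

```python
import numpy as np

# Columns of P are the beta-basis vectors in standard coordinates.
P = np.array([[1.0, 1.0],
              [0.0, 2.0]])      # beta = {(1,0), (1,2)}

c = np.array([3.0, 1.0])        # read c as the standard coords of a vector...
v = P @ c                       # ...then v is the vector with beta-coords c
print(v)                                      # [4. 2.] = 3*(1,0) + 1*(1,2)
print(np.allclose(np.linalg.solve(P, v), c))  # True: [v]_beta = c
```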