4

I was studying quantum mechanics and the meaning of the bra-ket notation. We define the following ket that belongs to the Hilbert space:

$$|\psi\rangle \in \mathcal{H} \tag{1}$$

We define two vectors $|\alpha\rangle$ and $|\beta\rangle$ as:

$$ |\alpha\rangle\rightarrow \boldsymbol{a} = \begin{bmatrix} a_{1} \\ a_{2} \\ \vdots \\ a_{N} \end{bmatrix} $$

$$ |\beta\rangle\rightarrow \boldsymbol{b} = \begin{bmatrix} b_{1} \\ b_{2} \\ \vdots \\ b_{N} \end{bmatrix} $$

The classical way of writing the inner product of two such vectors is:

$$ \langle\alpha|\beta\rangle = a_1^{\ast}b_1+a_2^{\ast}b_2+...+a_N^{\ast}b_N \tag{2} $$
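
As a minimal numerical sketch of eq. (2) for a small finite $N$ (the vectors below are arbitrary illustrative values; NumPy's `vdot` conjugates its first argument):

```python
import numpy as np

# Two small complex column vectors standing in for |alpha> and |beta> (N = 3 here).
a = np.array([1 + 2j, 0.5 + 0j, -1j])
b = np.array([2 + 0j, 1 - 1j, 3 + 0.5j])

# <alpha|beta> = a_1* b_1 + a_2* b_2 + ... ; np.vdot conjugates its first argument.
print(np.vdot(a, b))
print(np.sum(np.conj(a) * b))   # the same sum written out explicitly
```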

Now, 'Introduction to Quantum Mechanics' by David Griffiths states that vectors in quantum mechanics are, for the most part, functions living in infinite-dimensional spaces. The notation above then becomes awkward, because the inner product just defined does not always exist as $N\rightarrow\infty$.

To represent a physical state the wavefunction $\Psi$ must be normalized:

$$\int_{-\infty}^{\infty}|\Psi|^2dx=1$$

The set of all square-integrable functions is defined as:

$$ f(x) \;\; \text{such that} \;\; \int_a^b|f(x)|^2dx<\infty \tag{3} $$
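
As a sanity check of these two conditions, here is a minimal sketch (assuming SciPy is available) that integrates $|\Psi|^2$ for the normalized Gaussian $\Psi(x)=\pi^{-1/4}e^{-x^2/2}$, my own illustrative choice of a square-integrable function:

```python
import numpy as np
from scipy.integrate import quad

# A normalized Gaussian wavefunction: Psi(x) = pi^(-1/4) * exp(-x^2 / 2).
def psi(x):
    return np.pi ** (-0.25) * np.exp(-x ** 2 / 2)

# Integrate |Psi|^2 over the whole real line: finite, and in fact equal to 1.
norm_sq, _ = quad(lambda x: abs(psi(x)) ** 2, -np.inf, np.inf)
print(norm_sq)   # ~1.0, so Psi is square-integrable and normalized
```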

The set of square-integrable functions constitutes a much smaller vector space. Mathematicians call it $L^2(a,b)$, while physicists call it Hilbert space. In another book, 'Quantum Mechanics: A Modern Development' by Leslie E. Ballentine, I have seen the Hilbert space defined for a set of vectors $\{\psi_i\}$ by:

$$\lim_{i\to\infty}\|\psi_i-\chi\|=0 \tag{4}$$

where $\psi_i=\sum_{n=1}^{i}c_n\phi_n$. So the Hilbert space of vectors is built from linear combinations of an infinite set of vectors that converge towards a vector $\chi$. Here $\phi_n,\ n=1,2,\dots$, so we can construct a vector $\psi=\sum_n c_n\phi_n$ as a linear combination of an infinite set of functions. This must mean that the constructed vector is 'a vector with infinitely many elements', in line with the vectors $|\alpha\rangle$ and $|\beta\rangle$; or, more generally, that $\psi$ is a vector expressed in a basis of infinitely many vectors and is thus a vector of infinite dimension. This is how I understand the Hilbert space of vectors.

The condition (3) involves the norm of a $\textit{function}$, so why is this applicable to $\textit{vectors}$ as well? Also, why does this condition ensure that the inner product of equation (2) converges? Lastly, how does (4) show the same thing as (3)?

I want to point out that I know how to show that the set in eq. (3) is a vector space, but I struggle to see how this vector space describes the vector given in eq. (1). The concept of Hilbert space confuses me a lot, and I have not found a satisfactory explanation.

Edit: Maybe we can only say that the Hilbert space defined as in eq. (3) guarantees that the inner product in eq. (2) converges, but then why is eq. (1) true? It does not describe an inner product.

Maybe the definition of Hilbert space for functions is completely separate from the definition of Hilbert space for vectors. This is my interpretation thus far.

  • 1
    Might be worth reading some math textbook explanations of Hilbert space. – littleO Jul 06 '23 at 17:59
  • You are definitely right. Do you have any recommendations? – Rasmus Andersen Jul 06 '23 at 18:03
  • 1
    Maybe Sheldon Axler’s measure theory book. On the other hand, it’s easy to get bogged down in all the math theory while a few basic ideas might be enough for you to get by. For example, you can have a vector space whose elements are functions rather than $n$-tuples. You can define an inner product for functions, so your vector space consisting of functions can be an inner product space. – littleO Jul 06 '23 at 19:49
  • Thank you. I think I understand it a little better now. The Hilbert space can be defined for functions as you mention, but the definition of Hilbert space for vectors requires a completely different procedure. 'Quantum Mechanics: A Modern Development' by Leslie E. Ballentine describes this procedure pretty well, and, for now, I'm satisfied with the derivation. Of course, if people have more input, I would love to hear their answers, so I will keep the question up. – Rasmus Andersen Jul 06 '23 at 20:47
  • That being said, I think the book you mention does a great job explaining the topic. I will definitely give it a more in-depth read. – Rasmus Andersen Jul 06 '23 at 20:55

3 Answers

3

First, let's make it clear that a Hilbert space is not a particular vector space, but rather any vector space that has a well-defined inner product and is complete w.r.t. the norm derived from it. Complete here means that every sequence of vectors whose elements get arbitrarily close to one another (such a sequence is called fundamental, or Cauchy) actually has a limit in the space.

If you think about it, a sequence whose elements get arbitrarily close (the closeness being measured by the norm: $\|a_{n} - a_{m}\|\to 0$ as $n,m\to\infty$) is a perfect candidate for one that has a limit. Unfortunately, in the infinite-dimensional case, not every fundamental sequence need have a limit, hence we need the notion of "completeness".

That said, $L^2(a,b)$ is a particular vector space which happens to fulfill the criteria for being a Hilbert space. Simpler examples include every finite-dimensional vector space - they're Hilbert spaces, too, with the inner product defined by your eq. 2 (w.r.t. the particular basis you've chosen to represent your vectors as columns).

Let's look at your questions now.

The function (3) defines the norm of a function, so why is this applicable to the vectors as well?

I think you're accustomed to vectors being matrices of size $n\times 1$. That's true, but it is only part of the story. More generally, the term vector describes an element of any vector space. When you get to know infinite-dimensional spaces, functions can become vectors, too - even if you cannot write them as finite column matrices!
Any set in which you can take (finite) linear combinations is a vector space, and its elements are called vectors. E.g. for $f,g:(a,b)\to\mathbb R$ we can always form another function $\lambda f+\mu g$ for any real $\lambda,\mu$, hence such functions form a vector space and we call $f,g$ vectors.

Also, why does this equation make sure that the inner product of equation (2) converges?

The two equations - 2 and 3 - are unrelated. They refer to different vector spaces. For finite-dimensional spaces, an inner product can be defined with your eq. 2, while for the function space $L^2$, the inner product is $\left\langle f,g\right\rangle :=\int_{a}^{b}f(x)g(x)\text{dx}$ (if your functions are complex-valued, you take $f^*$).

You ask why the integral $\int_{a}^{b}fg\,\text{dx}$ converges, given that $f,g$ are square-integrable, and that is a very reasonable question. An easy answer uses the fact that $|uv|\le \frac 1 2 (u^2 + v^2)$ (expand $0\le(|u|-|v|)^2$ to see this): $$\int_{a}^{b}fg\le\int_{a}^{b}\left|fg\right|\le\frac{1}{2}\left(\int_{a}^{b}f^{2}+\int_{a}^{b}g^{2}\right)<\infty$$ (omitting the $x$ argument here for brevity). The last inequality follows from the square integrability of $f$ and $g$.
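
Here is a rough numerical check of that bound with two arbitrary square-integrable functions on $(a,b)=(0,1)$ (my own illustrative choices, assuming SciPy is available):

```python
import numpy as np
from scipy.integrate import quad

a, b = 0.0, 1.0
f = lambda x: np.sin(3 * x)       # any square-integrable choices will do
g = lambda x: np.exp(-x)

int_abs_fg = quad(lambda x: abs(f(x) * g(x)), a, b)[0]            # int |f g|
bound = 0.5 * (quad(lambda x: f(x) ** 2, a, b)[0]
               + quad(lambda x: g(x) ** 2, a, b)[0])              # (int f^2 + int g^2) / 2
print(int_abs_fg, bound, int_abs_fg <= bound)   # the bound holds, so <f, g> is finite
```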

Lastly, how does (4) show the same as (3)?

Actually, it does not. Your eq. 4 defines the limit of the sequence $\psi_i$, not a vector space. Ballentine constructs a Hilbert space formally (read "artificially") from a countable set of "abstract" elements $\psi_i$, which we'll consider an orthonormal basis.

Given all the $\psi_i$, Ballentine constructs a vector space $V$ of all formal finite linear combinations of the $\psi_i$. It is an inner product space by your eq. 2: two finite combinations of $\psi_i$s are multiplied component-wise, and this induces a norm on every such vector. This space, though, is not yet a Hilbert space, because it is not complete (remember that a Hilbert space is one that is both complete and has an inner product). For example, the sequence with terms $v_{n}=\sum_{i=1}^{n}\frac{1}{i}\psi_{i}$ is fundamental ($\|v_n-v_{m}\|^2=\sum_{i=n+1}^m\frac 1 {i^2}\to 0$ sufficiently fast), but has no limit in $V$ (each next term is linearly independent of the span of all previous terms, so its limit would be an infinite linear combination, which does not exist in $V$ by construction). Hence Ballentine "artificially" includes all such limit points, calling the resulting space $\mathcal{H}$, to make it complete. Thus we get "artificially" some infinite linear combinations, apart from the finite ones (this process is called completion; see the Wikipedia article on the completion of a metric space).
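
To make the example concrete, here is a small sketch (my own illustration, not from Ballentine's book) of $v_n=\sum_{i=1}^{n}\frac{1}{i}\psi_{i}$ written as coefficient vectors: the tail sums $\sum_{i=n+1}^{m}1/i^2$ shrink, so the sequence is fundamental, yet each term adds a new nonzero coefficient, so no finite linear combination can be its limit.

```python
import numpy as np

def v(n, dim=2000):
    """Coefficient vector of v_n = sum_{i=1}^n (1/i) psi_i, truncated to `dim` slots."""
    c = np.zeros(dim)
    c[:n] = 1.0 / np.arange(1, n + 1)
    return c

# The pairwise distances keep shrinking: the sequence is fundamental (Cauchy).
for n, m in [(10, 20), (100, 200), (500, 1000)]:
    print(n, m, np.linalg.norm(v(n) - v(m)))

# But v_n has exactly n nonzero coefficients, so the would-be limit (1, 1/2, 1/3, ...)
# is an infinite linear combination, which is not an element of V by construction.
print(np.count_nonzero(v(10)), np.count_nonzero(v(100)))
```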

What you can ask now is, how the heck is this space (constructed from countably many $\psi_i$s with your eq. 4) related to the square-integrable functions $f:(a,b)\to \mathbb R$ (your eq. 3)? It is a very good question; allow me to answer it in the next section.

I want to point out that I know how to show that eq. (3) is a vector space, but I struggle to see how this vector space describes the vector given in eq. (1).

I think you're confused because you don't see how the space Ballentine constructs from the $\psi_i$s is the same as the space of square-integrable functions. It is not a priori clear which functions the $\psi_i$ are, and the question is highly non-trivial.

First you need to realize that the elements of the $\mathcal{H}$ described above (from Ballentine's book), $v_{n}=\sum_{i=1}^{n}c_{i}\psi_{i}$, are completely determined by the coefficients $c_i$ in front of the $\psi_i$s. That is, its elements are in fact square-summable sequences, and this space is called $l^2$. To get a high-level overview of why this sequence space (whose inner product is essentially your eq. 2) is the same as the space of square-integrable functions, allow me to throw some links:

  1. First, the space $l^2$ is a Hilbert one: Show that $l^2$ is a Hilbert space.
  2. Second, a special type of Hilbert spaces - called separable - are isometrically isomorphic to that space, see here: http://mathonline.wikidot.com/separable-hilbert-spaces-are-isometrically-isomorphic-to-2
  3. And third, the space $L^2$ is separable: Elegant proof that $L^2([a,b])$ is separable

You don't need to read through all these, just make sure to understand the logic flow:

  • the Ballentine space is the same as $l^2$;
  • $L^2$ and $l^2$ are separable and Hilbert;
  • all separable Hilbert spaces are isometrically isomorphic (the same for all practical purposes);
  • hence $L^2$ is isomorphic to the artificially constructed (in Ballentine's book) space $\mathcal H$.

And if you're interested in which square-integrable functions from $L^2$ correspond to the $\psi_i$s from the artificially built $l^2$, look closely at the answer by Jess Madnick in the first question linked. It may give you an intuition about how such a big space as $L^2$ may be constructed from only countably many functions.
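
As a hedged numerical illustration of that correspondence (my own example, using the sine functions $\phi_n(x)=\sqrt{2}\sin(n\pi x)$, which form an orthonormal basis of $L^2(0,1)$): the coefficients $c_n=\langle\phi_n,f\rangle$ of a square-integrable $f$ form a square-summable sequence, and Parseval's identity $\sum_n|c_n|^2=\|f\|^2$ is what the isometric isomorphism between $L^2$ and $l^2$ means in practice.

```python
import numpy as np
from scipy.integrate import quad

f = lambda x: x * (1 - x)                              # a square-integrable function on (0, 1)
phi = lambda n, x: np.sqrt(2) * np.sin(n * np.pi * x)  # orthonormal sine basis of L^2(0, 1)

# The map f -> (c_1, c_2, ...) sends an L^2 function to an l^2 sequence.
c = np.array([quad(lambda x: phi(n, x) * f(x), 0, 1)[0] for n in range(1, 200)])

norm_f_sq = quad(lambda x: f(x) ** 2, 0, 1)[0]
print(np.sum(c ** 2), norm_f_sq)   # Parseval: sum |c_n|^2 ~ ||f||^2, here both ~ 1/30
```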

I'm not sure whether you expected such an answer; it's just that the truth is not so simple :) As the comments to your question say, you can try reading some books on these topics, just try not to get too distracted from your main goal. I had a similar experience with differential geometry and classical mechanics, and there's a risk you drift away from your target if you dive too deep into the details. I hope my answer gives you a way to move forward without needing to spend another month reading functional analysis books (which is a wonderful experience on its own, don't get me wrong :)).

Al.G.
  • 1
    Small caveat: $\lVert a_n-a_{n-1}\rVert \to 0$ does not imply $a_n$ converges even in a complete space (consider $a_n=\sum_{k=1}^n \frac{1}{k}$). The correct condition is the sequence being Cauchy. – Mor A. Jul 07 '23 at 11:49
  • Oh yes, but it'd be correct to write $|a_{n}-a_{m}|\xrightarrow[m\to\infty]{}0$ for each $n$, right? – Al.G. Jul 07 '23 at 13:29
  • No, that would imply the sequence is constant, the way people often shorten the full description of Cauchy is as $\lVert a_n - a_m \rVert \xrightarrow[m,n \to \infty]{} 0$ (with the meaning being exactly the definition of "Cauchy sequence") – Mor A. Jul 07 '23 at 13:42
  • 1
    Thank you very much for your answer. You state that elements of the Hilbert space (defined for vectors), $\psi_n=\sum_{i=1}^nc_i\psi_i$, are completely determined by the coefficients $c_i$ in front of $\psi_i$. We require that $\sum_n|c_n|^2<\infty$. By this understanding I see a connection between the Hilbert space defined for vectors [eq. (4)] and the Hilbert space defined for functions [eq. (3)]. You also emphasize that the vector space must be complete (every Cauchy sequence must converge) in order to be a Hilbert space, which is an important point. Very satisfying answer! – Rasmus Andersen Jul 07 '23 at 15:36
  • @MorA. thank you for the corrections, I understand it now. It seems I had a wrong "wordy-intuitive" understanding of Cauchy sequences. Their terms come not just close to each other consecutively (like the $\sum\frac 1 k$ example), but the entire tail has to stay close together. – Al.G. Jul 08 '23 at 19:00
3

It is possibly helpful to step back a little bit and think about things you probably know by now. Remember the concept of a vector space from linear algebra. It basically constitutes a set with two operations: sum (of elements of the set) and multiplication by scalar. These operations satisfy some properties, which are listed in any linear algebra book.

This definition states the requirements which a set must meet to be called a vector space: it must have two such operations which satisfy the listed properties. Now, as you may know, there are a lot of vector spaces out there. For instance, the set of real numbers $\mathbb{R}$ and the set of complex numbers $\mathbb{C}$ are vector spaces, but so are the Cartesian products $\mathbb{R}^{n}$ and $\mathbb{C}^{n}$. Another example is the set of all real/complex-valued $n\times n$ matrices, with the standard sum and multiplication by scalar.

In short, there are a lot of different examples of vector spaces, and they are not necessarily related to each other.

This is what happens in the present case of Hilbert spaces as well. The following definitions are important.

Definition (Inner Product Space): An inner product space (also called a pre-Hilbert space) is a complex vector space $V$ together with an inner product, which is a map $\langle \cdot,\cdot \rangle: V \times V \to \mathbb{C}$ which satisfies the following properties:

  • $\langle x, y \rangle = \overline{\langle y,x\rangle}$ for every $x,y \in V$. Here the overline means complex conjugate.
  • $\langle x, ay+bz\rangle = a\langle x,y\rangle + b\langle x,z\rangle$ for every $x,y,z \in V$ and $a,b \in \mathbb{C}$ (linearity in the second argument, matching the convention used in the $L^{2}$ example below).
  • $\langle x, x\rangle \ge 0$ and $\langle x, x \rangle = 0$ if, and only if $x=0$.

You can check that the product you defined in (2) satisfies these properties and is a particular case of an inner product.

If $V$ is an inner product space, it is common to write $\langle x, x\rangle$ as $\|x\|^{2}$. This is because the inner product defines a norm via $\|x\| := \sqrt{\langle x, x\rangle}$, which is another related concept (search for normed spaces). A norm allows us to consider distances, by defining the distance between points $x,y \in V$ as $d(x,y) :=\|x-y\|$. Distances allow us to talk about convergence.

Definition: A sequence $\{x_{n}\}_{n\in \mathbb{N}}$ of elements of an inner product space $V$ is said to converge to $x \in V$ if for every $\varepsilon > 0$ there exists some $n_{0} \in \mathbb{N}$ such that $d(x_{n},x) = \|x_{n}-x\| \le \varepsilon$ holds for every $n \ge n_{0}$.

The key concept in the theory of Hilbert (and also Banach) spaces is the notion of a Cauchy sequence.

Definition: A sequence $\{x_{n}\}_{n\in \mathbb{N}}$ of elements of an inner product space $V$ is called a Cauchy sequence if for every $\varepsilon > 0$ there exists $n_{0} \in \mathbb{N}$ such that $d(x_{n},x_{m}) = \|x_{n}-x_{m}\| \le \varepsilon$ whenever $n,m \ge n_{0}$.

Roughly speaking, a Cauchy sequence is a sequence whose elements all become arbitrarily close to one another far enough along the sequence.

You may ask: is every convergent sequence a Cauchy sequence? The answer is yes! On the other hand, is every Cauchy sequence a convergent sequence? The answer is: not necessarily! This leads to the important definition.

Definition (Hilbert Space): A Hilbert space is an inner product space $V$ in which every Cauchy sequence is convergent.
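
To see why the last requirement has teeth, here is a minimal sketch (my own illustrative example): the continuous ramp functions $f_n(x)=\min(\max(nx,-1),1)$ on $(-1,1)$, with the $L^2$ norm, form a Cauchy sequence, but their limit is the discontinuous sign function, so the inner product space of continuous functions on $(-1,1)$ is not complete; $L^2(-1,1)$, which contains the limit, is a Hilbert space.

```python
import numpy as np

x = np.linspace(-1, 1, 200001)            # fine grid on (-1, 1)
dx = x[1] - x[0]
f = lambda n: np.clip(n * x, -1.0, 1.0)   # continuous ramps approaching sign(x)

def dist(n, m):
    """Approximate L^2 distance ||f_n - f_m|| via a Riemann sum on the grid."""
    return np.sqrt(np.sum((f(n) - f(m)) ** 2) * dx)

for n, m in [(5, 10), (20, 40), (80, 160)]:
    print(n, m, dist(n, m))   # the distances keep shrinking: the sequence is Cauchy

# The pointwise limit is the discontinuous sign function, so this Cauchy sequence has
# no limit among the *continuous* functions: that inner product space is incomplete.
```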

Once again, you have different examples of Hilbert spaces, and they are not necessarily related.

Example: Consider $L^{2}(\mathbb{R})$ which is the space of functions $f: \mathbb{R}\to \mathbb{C}$ which are square integrable, that is: $$\int_{\mathbb{R}}|f(x)|^{2}dx < +\infty$$ This is a vector space when equipped with the following operations: $$(f+g)(x) := f(x)+g(x) \quad \mbox{and} \quad (a f)(x) = af(x)$$ for $f,g \in L^{2}(\mathbb{R})$ and $a \in \mathbb{C}$. We can now define an inner product on this space, which is a space of functions: $$\langle f, g\rangle_{L^{2}} := \int_{\mathbb{R}}\overline{f(x)}g(x)dx$$ You can check that this inner product is well-defined in the sense that it does satisfy the listed properties of an inner product. Finally, one can prove that every Cauchy sequence in this space converges, so it is a legit Hilbert space!

Another example of a Hilbert space is a finite-dimensional vector space with the inner product defined as in your formula (2).

In short, in your post you basically introduced two different inner product/Hilbert spaces, which are not related and appear in quantum mechanics in different situations. This is because the first basic postulate of quantum mechanics states that to every quantum system there is an associated Hilbert space, and the state of a constituent of this system is described by a vector (an element) of this Hilbert space. It tells nothing about which Hilbert space one needs to consider, and different realizations are considered for different situations, according to the properties of the system.

  • Thank you a lot for your answer. It definitely makes sense to view the Hilbert space as different, separate vector spaces that are defined by the necessary properties of the given vectors or functions. This clarifies a lot of the confusion I had. – Rasmus Andersen Jul 07 '23 at 15:06
1

A Hilbert space is basically a (potentially) infinite-dimensional (complex) vector space equipped with a scalar product. The continuous analog of the inner product you mention is then simply given by $\langle\alpha|\beta\rangle = \int\mathrm{d}x\ \alpha^*(x)\beta(x)$.

It is also to be noted that physicists usually work in spaces bigger than the space of square-integrable functions, a.k.a. $L^2$, since wavefunctions corresponding to plane waves, for instance, do not belong to $L^2$.
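
For instance, a quick sketch (illustrative numbers of my own choosing) of why a plane wave $e^{ikx}$ is not square-integrable: its norm over a window $(-L, L)$ grows without bound as $L$ increases.

```python
import numpy as np

k = 2.0                                    # an arbitrary wave number
for L in [10, 100, 1000, 10000]:
    x = np.linspace(-L, L, 200001)
    plane_wave = np.exp(1j * k * x)
    norm_sq = np.sum(np.abs(plane_wave) ** 2) * (x[1] - x[0])   # ~ int_{-L}^{L} |e^{ikx}|^2 dx
    print(L, norm_sq)   # grows like 2L, so the integral over all of R diverges
```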

Abezhiko
  • Thank you. This explanation does not directly address the specific questions I had, but you make a good point. You say that physicists usually work in larger spaces than $L^2$. Can you mention some of these vector spaces? To me Hilbert space seems like a very broad space, and I struggle to imagine a larger space necessary for physical theories. – Rasmus Andersen Jul 07 '23 at 15:12
  • 1
    @RasmusAndersen Physicists usually don't bother with specifying their Hilbert spaces; they find solutions to a given equation and then work implicitly in a sufficiently large space to contain them. Nonetheless, those are often distributional spaces, since states/wavefunctions such as Dirac deltas or other "divergent" expressions will be considered. – Abezhiko Jul 08 '23 at 07:10