13

Let $a_1, a_2, \ldots, a_n$ be real numbers such that $a_1 + \cdots + a_n = 0$ and $a_1^2 + \cdots +a_n^2 = 1$. What is the maximum value of $a_1a_2 + a_2a_3 + \cdots + a_{n - 1}a_n + a_na_1$?

I'd like to emphasize that this was found in a random linear algebra exercise sheet so one might expect that there exists a clever solution based on matrix manipulation...

Personally I tried to conceive the problem geometrically and analytically. Notably $n=1$ has no meaning, $n=2$ gives $-1$, $n=3$ gives $-1/2$. This did not reveal much about the general case, except that the Lagrangian approach (system of partial derivatives and all that jazz) seems to imply that ANY combination satisfying the constraints gives the same value (the value I'm trying to maximize) - but this needs some further checking.
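For what it's worth, the constancy for $n=3$ is easy to check numerically; here is a quick Python sketch (the helper names are my own):

```python
import math
import random

def cyclic_sum(a):
    # a_1 a_2 + a_2 a_3 + ... + a_{n-1} a_n + a_n a_1
    n = len(a)
    return sum(a[i] * a[(i + 1) % n] for i in range(n))

def random_feasible(n, rng):
    # random point with sum 0 and Euclidean norm 1
    a = [rng.gauss(0, 1) for _ in range(n)]
    mean = sum(a) / n
    a = [x - mean for x in a]                      # project onto the sum-zero hyperplane
    norm = math.sqrt(sum(x * x for x in a))
    return [x / norm for x in a]

rng = random.Random(0)
vals = [cyclic_sum(random_feasible(3, rng)) for _ in range(1000)]
print(min(vals), max(vals))   # both are -0.5: for n = 3 the objective is constant
```

For $n=3$ this is forced: squaring $a_1+a_2+a_3=0$ and using $\sum a_i^2=1$ gives $2(a_1a_2+a_2a_3+a_3a_1)=-1$. Running the same experiment for larger $n$ produces a spread of values, so the constancy is special to $n\le 3$.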

Back to the linear algebra: I see traces of matrices, but I need to somehow simply express $A$ and $B$ (see below) in terms of one another before anything useful can be done...

$$A = \operatorname{diag}(a_1,a_2,\ldots,a_{n-1},a_n);$$ $$B = \operatorname{diag}(a_2,a_3,\ldots,a_{n-1},a_n,a_1);$$

P.S. The problem was originally taken from this very forum, but it's quite an old post and I don't seem to be able to leave a comment there.

Pranasas
  • 1,390
  • By the symmetry of the problem, and the fact that if $a_i$ give the maximum then the set of $-a_i$ gives the exact same value and also comply with the conditions, I'd guess the value is 0. – vonbrand Mar 08 '13 at 16:13
  • The set $a_1 = - a_3 = \sqrt{2} / 2$, $a_2 = a_4 = \ldots = a_n = 0$ complies with the conditions, and gives 0 as a sum for all $n > 3$, so $-1$ definitely isn't the maximal value. – vonbrand Mar 08 '13 at 16:30
  • @vonbrand So for all $n>3$ the maximum value in question is non-negative. Thanks for this remark. Now if only I could prove that it is non-positive... ;) – Pranasas Mar 08 '13 at 17:00
  • I'm fooling around with $(a_1 + \ldots + a_n)^2$, which combines many of the ingredients of the problem, but no real progress still... – vonbrand Mar 08 '13 at 17:24
  • For $n = 6$ consider $a_1 = a_2 = - a_4 = - a_5 = \frac{1}{2}$, $a_3 = a_6 = 0$. Then the sum is $\frac{1}{2}$. – vonbrand Mar 08 '13 at 18:50
  • @vonbrand This is a nice remark. Also, following the very same logic it means that for $n=6k,\; k \in \mathbb{N}$, we can construct $$a_{1+6(i-1)}=a_{2+6(i-1)}=-a_{4+6(i-1)}=-a_{5+6(i-1)}=\frac{1}{2 \sqrt{k}},$$ $$a_{3+6(i-1)}=a_{6+6(i-1)}=0, \quad i \in \{1, \ldots , k\},$$ and the sum is still $\frac{1}{2}$! – Pranasas Mar 08 '13 at 19:15
  • Why so much complication... the values I gave work for all $n \ge 6$ as long as $a_k = 0$ for $k \ge 6$. It looks like this is the maximum... – vonbrand Mar 08 '13 at 19:22
  • @vonbrand Maybe a maximum. Or maybe not. I am still intrigued as to why this is a linear algebra exercise though. – Pranasas Mar 08 '13 at 19:40
  • @Pranasas It's intriguing indeed because this is a nonconvex optimization problem and thus one that isn't solved with linear algebra only. However, the unit ball constraint makes me think of eigenvectors/eigenvalues or singular vectors/singular values. I'm not sure what the connection is though. – Dominique Mar 08 '13 at 21:08
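The examples from the comments above can be verified numerically; a small Python sketch (helper names are mine):

```python
import math

def cyclic_sum(a):
    # a_1 a_2 + ... + a_{n-1} a_n + a_n a_1
    n = len(a)
    return sum(a[i] * a[(i + 1) % n] for i in range(n))

# a_1 = -a_3 = sqrt(2)/2, the rest 0 (shown here for n = 5): value 0
a = [math.sqrt(2) / 2, 0.0, -math.sqrt(2) / 2, 0.0, 0.0]
print(sum(a), sum(x * x for x in a), cyclic_sum(a))   # 0, 1, 0 (up to rounding)

# the n = 6 example: value 1/2
b = [0.5, 0.5, 0.0, -0.5, -0.5, 0.0]
print(sum(b), sum(x * x for x in b), cyclic_sum(b))   # 0, 1, 0.5
```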

6 Answers

7

EDIT: So here we go with a complete rewrite in the language of linear algebra.

In $V=\mathbb C^n$ with standard scalar product $\langle e_j,e_k\rangle=\delta_{jk}$ consider the linear map $$\begin{matrix}\phi\colon &V&\longrightarrow &V\\ &(x_1,x_2,\ldots,x_{n-1},x_n)&\longmapsto&(x_2,x_3,\ldots,x_n,x_1), \end{matrix}$$ that is $e_1\mapsto e_n$ and $e_k\mapsto e_{k-1}$ for $1<k\le n$. The problem statement asks us to maximize $\langle \phi a,a\rangle$ subject to the conditions

$$\begin{align}\tag{c1}\langle a,v_n\rangle &= 0,\\ \tag{c2}\langle a,a\rangle&=1,&\text{and}\\ \tag{c3}a&\in \mathbb R^n.\end{align}$$

Let $\zeta\in\mathbb C$ be a primitive $n$th root of unity, for example $\zeta=e^{\frac{2\pi i}n}=\cos\frac{2\pi}{n}+i\sin\frac{2\pi}{n}$. For $0\le k<n$ let $$v_k=\sum_{\nu=1}^n \zeta^{\nu k}e_\nu=(\zeta^k,\zeta^{2k},\ldots ,\zeta^{nk}).$$ Then $$\langle v_k,v_k\rangle = n,\qquad\langle v_k,v_j\rangle=0\quad\text{for }j\ne k,\qquad \phi(v_k)=\zeta^kv_k,$$i.e. the $v_k$ are an orthogonal eigenvector basis of $V$. Thus if $$\tag1a=\sum_{k=1}^n c_kv_k$$ with $c_k\in\mathbb C$, then $$\langle a,a\rangle = n\sum_{k=1}^n|c_k|^2\qquad\text{and}\qquad\langle \phi a,a\rangle = n\sum_{k=1}^n \zeta^k|c_k|^2.$$
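These claims about the $v_k$ can be checked numerically with `cmath`; a short Python sketch (all names are my own):

```python
import cmath
import math

n = 5
zeta = cmath.exp(2j * math.pi / n)   # primitive n-th root of unity

def v(k):
    # v_k = (zeta^k, zeta^{2k}, ..., zeta^{nk})
    return [zeta ** (nu * k) for nu in range(1, n + 1)]

def phi(x):
    # cyclic shift (x_1, ..., x_n) -> (x_2, ..., x_n, x_1)
    return x[1:] + x[:1]

def herm(x, y):
    # standard Hermitian scalar product on C^n
    return sum(a * b.conjugate() for a, b in zip(x, y))

for k in range(n):
    assert abs(herm(v(k), v(k)) - n) < 1e-9                       # <v_k, v_k> = n
    assert all(abs(herm(v(k), v(j))) < 1e-9
               for j in range(n) if j != k)                       # orthogonality
    assert all(abs(p - zeta ** k * w) < 1e-9
               for p, w in zip(phi(v(k)), v(k)))                  # phi(v_k) = zeta^k v_k
print("v_0, ..., v_{n-1} form an orthogonal eigenbasis")
```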

Condition (c$3$) implies in particular that $\langle \phi a,a\rangle \in\mathbb R$ and condition (c$1$) is simply that $c_n=0$ in $(1)$. From this we obtain the bound $$\begin{align}\langle \phi a,a\rangle& =n\sum_{k=1}^{n} \zeta^k|c_k|^2\\& =n\sum_{k=1}^{n-1} \zeta^k|c_k|^2\\&=n\Re\left(\sum_{k=1}^{n-1} \zeta^k|c_k|^2\right)\\&=n\sum_{k=1}^{n-1} \Re(\zeta^k)|c_k|^2\\\tag2&\le \max\bigl\{\Re(\xi)\mid \xi^n=1,\xi\ne1\bigr\}\cdot n\sum_{k=1}^{n-1}|c_k|^2\\&=\cos\frac{2\pi}{n}\cdot\langle a,a\rangle.\end{align}$$ Note that the special choice $a=\frac 1{\sqrt {2n}} (v_1+v_{n-1})=\frac 1{\sqrt {2n}} (v_1+\overline{v_1})$ yields $a\in \mathbb R^n$, $\langle a,a\rangle=1$ and equality in $(2)$. Therefore $$\max\bigl\{\langle \phi a,a\rangle\mid a\in\mathbb R^n,\langle a,a\rangle=1,\langle a,v_n\rangle=0\bigr\}= \cos\frac{2\pi}{n} .$$ Note that for $n=2$ and $n=3$, we recover the results $\cos\pi=-1$ and $\cos\frac{2\pi}3=-\frac12$, confirming what you found.
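As a numerical sanity check: the real part of $v_1$, rescaled to unit norm, i.e. $a_\nu=\sqrt{2/n}\cos(2\pi\nu/n)$, satisfies the constraints and attains $\cos\frac{2\pi}{n}$ for $n\ge3$. A Python sketch (names my own):

```python
import math

def cyclic_sum(a):
    # a_1 a_2 + ... + a_{n-1} a_n + a_n a_1
    n = len(a)
    return sum(a[i] * a[(i + 1) % n] for i in range(n))

# n >= 3 only: for n = 2 one has v_1 = v_{n-1}, so the normalization differs
for n in (3, 4, 5, 6, 12, 100):
    a = [math.sqrt(2.0 / n) * math.cos(2 * math.pi * nu / n)
         for nu in range(1, n + 1)]
    assert abs(sum(a)) < 1e-9                                   # (c1): sum is 0
    assert abs(sum(x * x for x in a) - 1) < 1e-9                # (c2): unit norm
    assert abs(cyclic_sum(a) - math.cos(2 * math.pi / n)) < 1e-9
print("maximum cos(2*pi/n) attained for all tested n")
```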

6

The problem can be solved in terms of linear algebra.

There is an orthonormal basis $(e_0,e_1,\ldots,e_{n-1})$ of $\Bbb R^n$ that diagonalizes the quadratic form $$f(x):=\sum_i x_i x_{i+1}\ .$$ When $(\xi_0,\ldots,\xi_{n-1})$ are the corresponding coordinates, the form $f$ appears as $$\tilde f(\xi)=\sum_{i=0}^{n-1}\lambda_i \xi_i^2\ .$$ The $e_i$ are eigenvectors of the circulant matrix $$A:={1\over2}\left[\matrix{ 0&1&0&0&0&1 \cr 1&0&1&0&0&0 \cr 0&1&0&1&0&0 \cr 0&0&1&0&1&0 \cr 0&0&0&1&0&1 \cr 1&0&0&0&1&0 \cr}\right]\ ,$$ drawn here for the case $n=6$, and the $\lambda_i$ are the corresponding (real) eigenvalues. Of course $e_0={1\over\sqrt{n}}(1,1,\ldots,1)$. As all $e_i$ with $i\geq1$ are orthogonal to $e_0$, these $e_i$ automatically satisfy the linear constraint "$\sum_k a_k=0$". It follows that $$\max\bigl\{f(a)\ \bigm|\ {\rm ``constraints''}\bigr\}=\max\left\{\sum_{i=1}^{n-1}\lambda_i \xi_i^2\ \biggm|\ \sum_{i=1}^{n-1} \xi_i^2=1\right\}=\max_{1\leq i\leq n-1}\lambda_i\ .$$ To compute the $\lambda_i$ note that the $n$ complex vectors $$c_j:=(1,\omega^j,\omega^{2j},\ldots,\omega^{(n-1)j})\qquad(0\leq j\leq n-1)\ ,$$ where $\omega:=e^{2\pi i/n}$, are linearly independent eigenvectors of $A$. The eigenvalues are now immediate; the largest apart from $\lambda_0=1$ is $\alpha:=\cos{2\pi\over n}$. This value $\alpha$ is the solution to our problem; it is $>0$ for all $n\geq5$.

  • Why would the complex vectors $c_j$ be eigenvectors of the real symmetric matrix $A$? Surely this matrix has real eigenvectors. In addition I think you need to consider $2A$ instead of $A$. – Dominique Mar 09 '13 at 15:32
  • Sorry, your $A$ is correct. – Dominique Mar 09 '13 at 15:44
  • @Dominique: $A$ can be considered as a complex matrix. The eigenvectors $c_j$ and $c_{n-j}$ belong to the same eigenvalue, and out of these you could construct real eigenvectors. But we don't need them; it's enough to know the eigenvalues. – The $f$ as given by the OP corresponds to my $A$. – Christian Blatter Mar 09 '13 at 15:47
  • It is bothersome (and somewhat inelegant in my opinion) to have to resort to complex arithmetic to solve a real symmetric eigenvalue problem. I read your answer on Wikipedia but I'd rather see a real-only answer still. – Dominique Mar 09 '13 at 16:01
  • @ChristianBlatter So eigenvectors have to be orthogonal for this to work. $c_k \cdot c_l = (\omega^{(k+l)n} -1)/(\omega^{(k+l)} -1)$ (sum of geometric progression). How is it non-zero for all $k=l$? I need to figure this out although I'm pretty sure your reasoning is correct. – Pranasas Mar 20 '13 at 16:25
  • @Pranasas: The $c_j$ are linearly independent. Furthermore $c_j$ and $c_{n-j}$ belong to the same (real) eigenvalue $\lambda_j$. Putting $a_j:=(c_j+c_{n-j})/2$, $\ b_j:=(c_j-c_{n-j})/(2i)$ you get two real eigenvectors corresponding to $\lambda_j$, and hopefully they are orthogonal. Finally $a_j$'s and $b_j$'s belonging to different $\lambda_j$ are orthogonal anyway. – Christian Blatter Mar 20 '13 at 16:42
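To back up the discussion above, one can verify directly that the $c_j$ are eigenvectors of $A$ with eigenvalues $\cos(2\pi j/n)$; a Python sketch for $n=6$ (all names are mine):

```python
import cmath
import math

n = 6
# the circulant matrix A from the answer: 1/2 on the two cyclic off-diagonals
A = [[0.5 if (j - i) % n in (1, n - 1) else 0.0 for j in range(n)]
     for i in range(n)]

omega = cmath.exp(2j * math.pi / n)
for j in range(n):
    c = [omega ** (k * j) for k in range(n)]      # c_j = (1, w^j, ..., w^{(n-1)j})
    Ac = [sum(A[i][k] * c[k] for k in range(n)) for i in range(n)]
    lam = math.cos(2 * math.pi * j / n)           # claimed eigenvalue
    assert all(abs(Ac[i] - lam * c[i]) < 1e-9 for i in range(n))

# largest eigenvalue apart from lambda_0 = 1:
print(max(math.cos(2 * math.pi * j / n) for j in range(1, n)))   # 0.5 for n = 6
```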
3

EDIT: Here is an answer that uses a minimum amount of complex arithmetic.

Consider $$ H := \frac{1}{2} \begin{bmatrix} 0 & 1 & & & & & 1 \\ 1 & 0 & 1 & & & & \\ & 1 & 0 & 1 & & & \\ & & \ddots & \ddots & \ddots & & \\ & & & 1 & 0 & 1 & \\ & & & & 1 & 0 & 1 \\ 1 & & & & & 1 & 0 \end{bmatrix}. $$ Then the problem may be rewritten $$ \max_v \ v^T H v \quad \text{subject to} \ e^T v = 0, \ \|v\|_2 = 1, $$ where $e = (1, 1, \ldots, 1)$. The problem amounts to finding the largest eigenvalue of $H$ and one of its associated eigenvectors in the subspace $e^T v = 0$. It's easy to check that when $n > 2$, $v_1 := \frac{1}{\sqrt{n}} (1, 1, \ldots, 1)$ is always an eigenvector of $H$ associated to the eigenvalue $1$.

But $H$ is a circulant matrix and it is real and symmetric. When $n=2$, $H$ is $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ whose largest eigenvalue is $+1$. For any $n > 2$, properties of circulant matrices imply that the largest eigenvalue of $H$ is always $1$. So now the problem amounts to finding the second largest eigenvalue of $H$ and an associated eigenvector.

According to Gray the eigenvalues of a circulant matrix are the discrete Fourier transform (DFT) of its first row. The standard notation for the first row of a circulant matrix is $$ \begin{bmatrix} c_0 & c_{n-1} & c_{n-2} & \cdots & c_1 \end{bmatrix}. $$ Here $c_1 = c_{n-1} = 1/2$ and all others are zero. The DFT of this sequence is given by $$ \lambda_k = \sum_{j=0}^{n-1} c_j \, e^{-i j k 2\pi / n} = \tfrac{1}{2} \cdot 2 \cos(2 k \pi / n) = \cos(2 k \pi / n), \quad k = 0, \ldots, n-1, $$ i.e., a real sequence (thankfully, because $H$ is symmetric!). The largest eigenvalue, obtained for $k=0$, is indeed $1$. The second largest is $\cos(2\pi/n)$.
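The DFT computation can be replayed numerically; a Python sketch for $n=8$ (names are my own):

```python
import cmath
import math

n = 8
# first row [c_0, c_{n-1}, ..., c_1] of H: only c_1 = c_{n-1} = 1/2 are nonzero
c = [0.0] * n
c[1] = c[n - 1] = 0.5

# eigenvalues as the DFT of the sequence: lambda_k = sum_j c_j exp(-i j k 2 pi / n)
lams = [sum(c[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
        for k in range(n)]
for k, lam in enumerate(lams):
    assert abs(lam.imag) < 1e-12                                  # real, as expected
    assert abs(lam.real - math.cos(2 * math.pi * k / n)) < 1e-12  # = cos(2 pi k / n)

vals = sorted((l.real for l in lams), reverse=True)
print(vals[0], vals[1])   # 1.0 and cos(2*pi/8) ≈ 0.707
```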

Dominique
  • 3,224
2

A partial solution, but needs more math than fits in a comment :-(

From an idea in Lohwater's "Inequalities":

We have, applying the AM-GM inequality ad nauseam: $$ \begin{align*} a_1^2 + a_2^2 &\ge 2 a_1 a_2 \\ a_2^2 + a_3^2 &\ge 2 a_2 a_3 \\ \vdots\\ a_{n - 1}^2 + a_n^2 &\ge 2 a_{n - 1} a_n \\ a_n^2 + a_1^2 &\ge 2 a_n a_1 \end{align*} $$ Adding up the whole mess and dividing by 2: $$ a_1^2 + a_2^2 + \ldots + a_n^2 \ge a_1 a_2 + a_2 a_3 + \ldots + a_n a_1 $$ This proves that the sum is at most 1.

Another idea to get a tighter bound: Let $n = 2 k$, where $k \ge 2$, and take $a_i = \frac{1}{\sqrt{n}}$ if $1 \le i \le k$, $a_i = - \frac{1}{\sqrt{n}}$ if $k < i \le 2 k$. Then the sums: $$ \begin{align*} a_1 a_2 + \ldots + a_{k - 1} a_k &= \frac{k - 1}{n} \\ a_{k + 1} a_{k + 2} + \ldots + a_{2 k - 1} a_{2 k} &= \frac{k - 1}{n} \\ a_{2 k} a_1 = a_k a_{k + 1} &= - \frac{1}{n} \end{align*} $$ Adding up gives $a_1 a_2 + \ldots + a_n a_1 = \frac{n - 4}{n}$.

The same idea works for all even $n \ge 4$ (the zero-sum constraint forces the two stretches to be equally long, so $k = n - k$; odd $n$ needs unequal magnitudes, see the other answer): a stretch of $k$ positives and a stretch of $k$ negatives give two steps changing signs wrapping around. The stretches give $(k - 1) + (k - 1) = n - 2$ pairs that multiply to $\frac{1}{n}$ and two products of $- \frac{1}{n}$ for the steps, for a total of $\frac{n - 4}{n}$.
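A quick Python check of the $\frac{n-4}{n}$ value (helper names are mine; the code sticks to even $n$, since the zero-sum constraint needs equally long stretches):

```python
import math

def cyclic_sum(a):
    # a_1 a_2 + ... + a_{n-1} a_n + a_n a_1
    n = len(a)
    return sum(a[i] * a[(i + 1) % n] for i in range(n))

for n in range(4, 21, 2):
    k = n // 2
    a = [1 / math.sqrt(n)] * k + [-1 / math.sqrt(n)] * (n - k)
    assert abs(sum(a)) < 1e-12                       # sum-zero constraint
    assert abs(sum(x * x for x in a) - 1) < 1e-12    # unit-norm constraint
    assert abs(cyclic_sum(a) - (n - 4) / n) < 1e-12  # value (n - 4)/n
print("cyclic sum equals (n - 4)/n for even n up to 20")
```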

vonbrand
  • 28,394
2

Another partial answer. I think the key is to think geometrically about this. Your constraints describe the intersection of the unit sphere with the vector subspace orthogonal to $(1, 1, \ldots, 1)$.

For $n=2$, the problem is $$ \min \ -2xy \quad \text{subject to} \ x+y=0, \ x^2 + y^2 = 1. $$ Introducing Lagrange multipliers, the first-order optimality conditions can be written \begin{align*} -2y - \lambda_1 - 2 x \lambda_2 & = 0, \\ -2x - \lambda_1 - 2 y \lambda_2 & = 0, \\ x + y & = 0, \\ x^2 + y^2 & = 1. \end{align*} Adding the first two equations together and using $x+y=0$, we find $\lambda_1 = 0$. This means that the objective function naturally takes its maximum value on the plane $x+y=0$, so this constraint could be removed without changing the solution. Now we see that $\lambda_2 = 1$ and $x=-y=\pm 1 / \sqrt{2}$. The optimal objective value is $+1$ (i.e. $-1$ if we return to the original maximization problem).

For $n = 3$, in the same way as above, we find $\lambda_1 = 0$ and $\lambda_2 = 1/2$. So as you say, any $(x,y,z)$ satisfying the constraints gives the same objective value. As you say, this value is $-1/2$.

I guess the point of the exercise is to show that the objective is constant along certain ellipses.

For general $n$, the optimality conditions are \begin{align*} -a_2 - a_n - \lambda_1 - 2 a_1 \lambda_2 & = 0, \\ -a_{i+1} - a_{i-1} - \lambda_1 - 2 a_i \lambda_2 & = 0, \\ -a_{n-1} - a_1 - \lambda_1 - 2 a_n \lambda_2 & = 0, \\ \sum_{i=1}^n a_i & = 0, \\ \sum_{i=1}^n a_i^2 & = 1. \end{align*} Adding together the first three sets of equations, we have again $\lambda_1 = 0$. After removing $\lambda_1$ from the first three sets of equations, squaring them and adding them together, I get $$ \lambda_2 = \pm \frac{1}{\sqrt{2}} \sqrt{1 - f(a)}, $$ where $f(a) = -\sum_{i=1}^{n-1} a_i a_{i+1} - a_n a_1$ is the objective being minimized. I'll be happy to complete my answer when I discover more but this may already put you on track.
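As a check on these conditions: at the candidate maximizer $a_\nu \propto \cos(2\pi\nu/n)$ (my own choice, not taken from this answer; likewise the multiplier value $\lambda_2 = -\cos(2\pi/n)$ below is my computation for the minimization form), the stationarity equations hold with $\lambda_1 = 0$:

```python
import math

n = 7
# candidate maximizer: unit-norm, zero-sum vector a_nu = sqrt(2/n) cos(2 pi nu / n)
a = [math.sqrt(2.0 / n) * math.cos(2 * math.pi * nu / n) for nu in range(1, n + 1)]

lam1 = 0.0                            # as derived above
lam2 = -math.cos(2 * math.pi / n)     # assumed multiplier for the minimization form

# residuals of  -a_{i+1} - a_{i-1} - lam1 - 2 a_i lam2 = 0  (indices mod n)
res = [-a[(i + 1) % n] - a[(i - 1) % n] - lam1 - 2 * a[i] * lam2 for i in range(n)]
print(max(abs(r) for r in res))       # ~ 0: the first-order conditions hold
```

This uses $a_{i+1}+a_{i-1} = 2\cos(2\pi/n)\,a_i$ for such cosine vectors, so each residual cancels exactly.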

Dominique
  • 3,224
0

I think I got it now. In my previous answer I prove that given the restrictions the sum $a_1 a_2 + a_2 a_3 + \ldots + a_{n - 1} a_n + a_n a_1 \le 1$. This bound is rather sharp, in that there is a way of getting very near. Note that the sum to maximize and the restrictions are cyclic, and the restrictions are invariant to sign change of all $a_i$. Call the maximal value $S_n$.

First, for $n = 1$ the problem makes no sense. For $n = 2$ the only option (up to sign) is $a_1 = - a_2 = \frac{1}{\sqrt{2}}$, which gives $S_2 = -1$. For $n = 3$ every feasible point, e.g. $a_1 = - a_2 = \frac{1}{\sqrt{2}}$ and $a_3 = 0$, gives $S_3 = -\frac{1}{2}$. For $n = 4$ an option is $a_1 = - a_3 = \frac{1}{\sqrt{2}}$ and $a_2 = a_4 = 0$, which shows $S_4 \ge 0$. For $n = 5$, extending the case $n = 4$ with $a_5 = 0$ gives again $S_5 \ge 0$.

Take now $n \ge 6$. The AM-GM inequality indicates we should have two stretches as long as possible with the same values (one positive, one negative), with as high an absolute value as possible. To avoid subtractions, separate the stretches with 0. There are two cases to consider, $n$ even or odd.

First consider the case $n = 2k$. Set two stretches of length $k - 1$ (one positive, one negative), each separated from the next by a single $0$, of absolute value $\frac{1}{\sqrt{2 (k - 1)}}$. Each stretch adds $k - 2$ products, which gives: $$ S_{2 k} \ge \frac{2 k - 4}{2 k - 2} = 1 - \frac{1}{k - 1} $$
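A quick Python check of this even-$n$ construction (helper names are mine):

```python
import math

def cyclic_sum(a):
    # a_1 a_2 + ... + a_{n-1} a_n + a_n a_1
    n = len(a)
    return sum(a[i] * a[(i + 1) % n] for i in range(n))

for k in range(3, 12):
    v = 1 / math.sqrt(2 * (k - 1))
    # (k-1) positives, a 0, (k-1) negatives, a 0: n = 2k entries in total
    a = [v] * (k - 1) + [0.0] + [-v] * (k - 1) + [0.0]
    assert abs(sum(a)) < 1e-12
    assert abs(sum(x * x for x in a) - 1) < 1e-12
    assert abs(cyclic_sum(a) - (1 - 1 / (k - 1))) < 1e-12
print("S_{2k} >= 1 - 1/(k-1) witnessed for k = 3..11")
```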

For $n = 2 k + 1$, take a stretch of length $k - 1$ of values $\frac{1}{\sqrt{2 k - 2}}$, which adds $k - 2$ products, and a stretch of length $k$ of values $-\frac{1}{\sqrt{2 k}}$, which adds $k - 1$ products; separate by 0. The sum turns out: $$ S_{2 k + 1} \ge \frac{k - 2}{2 k - 2} + \frac{k - 1}{2 k} = \frac{2 k^2 - 4 k + 1}{2 k (k - 1)} = 1 - \frac{2 k - 1}{2 k (k - 1)} $$ (Caveat: as stated the two stretches do not quite sum to zero; the magnitudes must be rescaled, keeping the signs, so that $(k - 1)$ times the positive value equals $k$ times the negative value. This changes the bound only in lower-order terms, and it still tends to $1$.)

vonbrand
  • 28,394