
I have a nonlinear system of equations $$ \left(\mathbf{K}_{\mathbf{L}}+\mathbf{K}_{\mathbf{NL}}(\mathbf{X})\right) \mathbf{X}=\mathbf{F} $$ in which $\mathbf{K}_{\mathbf{NL}}(\mathbf{X})$ is the nonlinear stiffness matrix, which depends on $\mathbf{X}$. I'm solving it with Picard iteration like this (a sketch of the loop follows the list):

  1. First ignore the nonlinear stiffness matrix and solve the linear system for $\mathbf{X}$.
  2. Substitute the resulting $\mathbf{X}$ into the nonlinear stiffness matrix and solve the full equation for $\mathbf{X}$ again.
  3. Check convergence and repeat step 2 if convergence is not satisfied.
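For concreteness, here is roughly what that loop looks like in MATLAB (a minimal sketch, not my exact code; `assembleKL` and `assembleKNL` are hypothetical routines that build $\mathbf{K}_{\mathbf{L}}$ and $\mathbf{K}_{\mathbf{NL}}(\mathbf{X})$, and `F` is the assembled load vector):

```matlab
% Minimal Picard (fixed-point) loop -- a sketch, not my exact code.
KL  = assembleKL();            % constant linear stiffness (hypothetical routine)
X   = KL \ F;                  % step 1: drop K_NL and solve the linear system
tol = 1e-8;  maxit = 500;
for k = 1:maxit
    KNL  = assembleKNL(X);     % step 2: rebuild K_NL at the current X
    Xnew = (KL + KNL) \ F;     %         and solve the full system again
    if norm(Xnew - X) <= tol*norm(Xnew)   % step 3: convergence check
        X = Xnew;  break
    end
    X = Xnew;
end
```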

The problem I have here is that when the force vector $\mathbf{F}$ is small, the nonlinear equation solves very fast, but when I increase the force beyond some threshold it takes ages to converge. I have tried to solve it using MATLAB's fsolve function with algorithms like 'trust-region' and 'levenberg-marquardt', but the same thing happens with large force vectors.
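For reference, the residual form I hand to fsolve looks roughly like this (a sketch, reusing `KL`, `F`, and the hypothetical `assembleKNL` from above):

```matlab
% The same system posed as a root-finding problem R(X) = 0 for fsolve (sketch).
res  = @(X) (KL + assembleKNL(X))*X - F;
opts = optimoptions('fsolve', 'Algorithm', 'levenberg-marquardt');
X    = fsolve(res, zeros(size(F)), opts);
```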

Is there any way I can improve the convergence speed?

P.S. Here's a GIF of the result vector $\mathbf{X}$ inside the convergence loop, with a force vector slightly over the threshold.

[GIF: the solution vector $\mathbf{X}$ oscillating back and forth between iterations of the convergence loop]

Edit (more details): My problem is the bending of a nonlinear Timoshenko beam, which has the three governing equations below: $$ -\frac{d}{d x}\left\{A_{x x}\left[\frac{d u}{d x}+\frac{1}{2}\left(\frac{d w}{d x}\right)^{2}\right]+B_{x x} \frac{d \phi_{x}}{d x}\right\}=0 $$ $$ -\frac{d}{d x}\left\{A_{x x} \frac{d w}{d x}\left[\frac{d u}{d x}+\frac{1}{2}\left(\frac{d w}{d x}\right)^{2}\right]+B_{x x} \frac{d w}{d x} \frac{d \phi_{x}}{d x}\right\}-\frac{d}{d x}\left[S_{x x}\left(\frac{d w}{d x}+\phi_{x}\right)\right]=q $$ $$ -\frac{d}{d x}\left\{D_{x x} \frac{d \phi_{x}}{d x}+B_{x x}\left[\frac{d u}{d x}+\frac{1}{2}\left(\frac{d w}{d x}\right)^{2}\right]\right\}+S_{x x}\left(\frac{d w}{d x}+\phi_{x}\right)=0 $$ Along with the proper boundary conditions, and discretized using finite differences, these assemble into: $$ \left(\mathbf{K}_{\mathbf{L}}+\mathbf{K}_{\mathbf{NL}}(\mathbf{X})\right) \mathbf{X}=\mathbf{F} $$

  • Have you tried ode23s and others? – user10354138 Jul 26 '20 at 09:56
  • @user10354138 What's the difference between fsolve and ode23s? – omgtheykilledkenny Jul 26 '20 at 10:23
  • The ordinary Picard algorithm converges linearly (notice that the scheme revolves around a fixed-point iteration), so it is not "superfast" even if you are close to the solution; this can be a nightmare (or a night with eyes open until the end of the computation) for some large systems. Still, you may want to check whether the conditions for using the scheme hold, i.e. that the function and its derivative are continuous at the point where you start your iterations (typically the origin). – Basco Jul 28 '20 at 21:23
  • The function is continuous, but I didn't really consider the derivative. Do you mean the derivative with respect to time or to $X$ itself? Because $X$ has three separate variables $\{u; w; s\}$ – omgtheykilledkenny Jul 29 '20 at 12:19
  • Your numerical question sounds like a conditioning issue. The ode23s solver might resolve the issue, since it is designed for solving "stiff" problems. I've got a few questions: 1) What do you mean by a nonlinear matrix? 2) Also, what is the error tolerance your solver is using? Lowering your error tolerance may yield faster "convergence" – Zim Jul 29 '20 at 19:37
  • This approach is reasonable only if $\mathbf K_{NL} \ll \mathbf K_L$. Otherwise you should use a linearization of $\mathbf K_{NL}$, either full (Newton's method) or partial. The smaller the nonlinear term, the faster the convergence will be. – uranix Jul 29 '20 at 20:25
  • Thanks for your suggestion @Zim, I edited my question to answer your first question. About your second question: I don't think lowering the error tolerance is going to fix it, because it is obvious from the GIF I provided that the solution is not converging; it is oscillating with a very big difference. I think @uranix is right, because as I increase the force, $\mathbf{K}_{\mathbf{NL}}\left(\mathbf{X}\right)$ gets bigger and the approach gets inefficient. But for Newton's method I have used MATLAB's fsolve, and from the documentation it uses Newton's method, yet the same thing happens. – omgtheykilledkenny Jul 30 '20 at 09:54
  • It may help if you put more details in. Best, just let us know what $K_L$ and $K_{NL}(X)$ are. Right now the question is too general to offer anything better than wild shots (and, from what you are saying, it looks like they don't help much) – fedja Jul 30 '20 at 16:19
  • Okay, I will add more details to the post. Thanks @fedja – omgtheykilledkenny Jul 30 '20 at 19:53
  • Are $A,B,D,S$ constants and $u,w,\phi_x$ unknown functions? If so, one thing that strikes me is that your first equation allows one to eliminate $u$ and all nonlinearity from the second two if we introduce the constant parameter $C=A_{xx}\left[\frac{du}{dx}+\frac{1}{2}\left(\frac{dw}{dx}\right)^{2}\right]+B_{xx}\frac{d\phi_{x}}{dx}$ and then quickly solve the resulting linear system, solving for $u$ in the end (another linear equation). So it looks like you are down to a one-variable problem $H(C)=\text{something}$, where $H$ is obtained by solving a purely linear system and the "something", apparently, is determined by the boundary conditions. Am I missing anything? – fedja Jul 30 '20 at 21:08
  • @fedja Yes, you are right about the constants and unknown functions, but I am not understanding your method for eliminating $u$ and the nonlinear parts. Could you elaborate on that some more? Thank you – omgtheykilledkenny Jul 31 '20 at 05:33
  • According to the first equation, the full expression in braces is constant. Denote that constant by $C$. Then $\frac{du}{dx}+\frac 12(\frac{dw}{dx})^2=(C-B_{xx}\frac {d\phi_x}{dx})/A_{xx}$. Now plug this into the second and the third equations and treat $C$ as a known parameter. You'll get two linear ODEs for $w$ and $\phi_x$ that you can solve as quickly as you can solve your linear systems in general. Then, when you have your $w$ and $\phi_x$, you can solve the first equation for $u$ in no time. The result will satisfy all 3 equations, so only the boundary conditions may be hurt. (contd) – fedja Jul 31 '20 at 14:55
  • What exactly are the boundary conditions, by the way? Can you post them too? – fedja Jul 31 '20 at 14:55
  • If none of these suggestions help, one thing that you can of course always do is throw more compute at it. In particular, if you have access to a fast graphics card, GPU computing can provide pretty substantial gains for linear algebra problems. Also, automatic differentiation libraries can help in providing accurate derivatives. You could try, for example, PyTorch's torch.optim.LBFGS optimizer. – Hyperplane Aug 04 '20 at 10:58

3 Answers


Here's a simple idea you could try with very little extra effort: your GIF shows that the iteration is oscillating back and forth, a phenomenon that can also occur in classical gradient descent algorithms when the problem is ill-conditioned. A very popular and powerful method to alleviate this kind of problem is called momentum, which basically consists of averaging over previous iterates.

So instead of throwing away all the previous iterates, you can do something like

$$ x_{k+1} = (1-\beta)g(x_{k}) + \beta x_k$$

Note that when $\beta=0$, we recover the standard fixed-point iteration. Consider a simple fixed-point problem like $x=\cos(x)$, which exhibits the oscillatory phenomenon. Then, starting from the same seed, here are the residuals $|x_*-x_k|$ for different values of $\beta$:

$$ \small\begin{array}{lllllll} k & \beta=0 & \beta=0.1 &\beta=0.2 &\beta=0.3 &\beta=0.4 &\beta=0.5 \\\hline 0 & 5.45787 &5.45787 &5.45787 &5.45787 &5.45787 &5.45787 \\1 & 0.2572 & 0.777267 & 1.29733 & 1.8174 & 2.33747 & 2.85754 \\2 & 0.19566 & 0.538475 & 0.690985 & 0.555697 & 0.107195 & 0.610102 \\3 & 0.116858 & 0.162927 & 0.0696096 & 0.00419339 & 0.00218156 & 0.0454083 \\4 & 0.0835784 & 0.0908543 & 0.0249916 & 0.000723828 & 8.0351e-06 & 0.0070347 \\5 & 0.053654 & 0.0431759 & 0.00828335 & 0.000124022 & 3.34983e-08 & 0.0011389 \\6 & 0.0371882 & 0.0224696 & 0.00282738 & 2.12772e-05 & 1.39595e-10 & 0.000185622 \\7 & 0.0245336 & 0.0112062 & 0.000955803 & 3.64953e-06 & 5.81757e-13 & 3.02859e-05 \\8 & 0.0167469 & 0.00571477 & 0.000324182 & 6.26001e-07 & 2.44249e-15 & 4.94232e-06 \\9 & 0.0111768 & 0.00288222 & 0.000109831 & 1.07377e-07 & 1.11022e-16 & 8.06552e-07 \end{array} $$
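A few lines of MATLAB reproduce this kind of experiment (a sketch; the seed below is arbitrary, not necessarily the one behind the table):

```matlab
% Damped fixed-point iteration x_{k+1} = (1-beta)*g(x_k) + beta*x_k (sketch).
g     = @(x) cos(x);                  % toy map with fixed point x* ~ 0.739085
beta  = 0.4;                          % momentum weight; beta = 0 is the plain iteration
xstar = fzero(@(t) cos(t) - t, 0.7);  % reference solution for the residual
x     = -4.7;                         % an arbitrary seed
for k = 1:9
    x = (1 - beta)*g(x) + beta*x;     % momentum update
    fprintf('k = %d, |x - x*| = %g\n', k, abs(x - xstar));
end
```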

A well-chosen momentum can speed up convergence tremendously! A variant of this idea specific to fixed-point iterations appears to be known as Anderson Acceleration.
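For the curious, a minimal Anderson step with memory $m=1$ might look like this (my sketch of the standard scheme, not something from the original answer; `g` is the fixed-point map, here the toy $\cos$ map again):

```matlab
% Anderson acceleration with memory m = 1 for a fixed-point map g (sketch).
g  = @(x) cos(x);                    % stand-in map; replace with the Picard update
x0 = -4.7;  g0 = g(x0);  f0 = g0 - x0;   % f(x) = g(x) - x is the residual
x1 = g0;                             % the first step is a plain fixed-point step
for k = 1:10
    g1 = g(x1);  f1 = g1 - x1;
    d  = f1 - f0;
    if norm(d) < eps, break, end              % residuals stalled; stop
    gamma = (f1(:)'*d(:)) / norm(d)^2;        % least-squares mixing weight
    xnew  = g1 - gamma*(g1 - g0);             % extrapolate the two latest g-values
    x0 = x1;  g0 = g1;  f0 = f1;  x1 = xnew;
end
```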

Hyperplane
  • I like the proposal, but although I cannot confirm it from the information in the question as posed by @omgtheykilledkenny, it seems that those oscillations are the result of plotting the solution for the linear stiffness and the update considering the nonlinear part. If using damping factors, it may be necessary to consider dissipation in some problems (physical systems may not conserve energy under some of these schemes). – Basco Jul 29 '20 at 00:50
  • Thank you for your proposal @Hyperplane. I am going to try this method and will let you know how it works. If I got it correctly, the function $g(x)$ gives the result of my system of nonlinear equations, right? – omgtheykilledkenny Jul 29 '20 at 12:14
  • Isn't $g(x_k)$ supposed to be $x_{k-1}$? Otherwise you are just using the previous iterate. – Mick Aug 02 '20 at 13:24
  • @Mick $g$ returns the solution to the linear system with stiffness matrix evaluated at $x_k$, i.e. $g(x_k) = \text{solve}(K_L+K_{NL}(x_k),\, F)$ – Hyperplane Aug 04 '20 at 10:37

I assume that $K_{NL}(0) = 0$.

Currently you are using the iteration $$ (K_L + K_{NL}(X_n))X_{n+1} = F $$ with $X_0 = 0$. Instead, first solve $$ (K_L + K_{NL}(X_\sigma))X_\sigma = \sigma F $$ for small $\sigma$, using this method; as you noticed, this converges quickly. Then solve $$ (K_L + K_{NL}(X_{\sigma'}))X_{\sigma'} = \sigma' F $$ for some $\sigma' > \sigma$, using the same iteration but now starting with the previously found $X_\sigma$ rather than the zero vector. And so on, until the right-hand side is $F$.

For example, choose $\sigma = N^{-1}, \, \sigma' = 2 N^{-1}$ and so on, for sufficiently large $N$.
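In code, this outer loading loop could look roughly as follows (a sketch, reusing the hypothetical `assembleKNL`, `KL`, and `F` from the question):

```matlab
% Incremental loading ("continuation") wrapped around the Picard loop (sketch).
N = 10;                              % number of load steps
X = zeros(size(F));                  % warm start: solution at zero load
for step = 1:N
    sigma = step/N;                  % load factor ramps 1/N, 2/N, ..., 1
    for k = 1:500                    % inner Picard iteration at this load level
        Xnew = (KL + assembleKNL(X)) \ (sigma*F);
        if norm(Xnew - X) <= 1e-8*norm(Xnew), X = Xnew; break, end
        X = Xnew;
    end                              % X now warm-starts the next load step
end
```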

Of course Anderson acceleration is also a good idea here :)

Hans Engler
  • Oh, so you mean that I should start solving the nonlinear equations for small forces, and from their solution keep increasing the force and converging at each load? Hmm, seems like a reasonable idea, let me try that. I have read the Anderson acceleration papers, but it looks so complicated; however, I tried the suggestion of @Hyperplane, but it gives different answers for different beta ratios. – omgtheykilledkenny Aug 03 '20 at 07:59

You can try some of the convergence acceleration algorithms, which can work very well for fixed-point (Picard-type) iterations. If you are using R, there is a package called SQUAREM which implements a reliable convergence acceleration scheme. It is based on the paper Varadhan and Roland, "Simple and Globally Convergent Methods for Accelerating the Convergence of Any EM Algorithm", Scandinavian Journal of Statistics, 2008. EM algorithms are essentially Picard-like algorithms: they are contraction mappings which converge slowly.
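Even outside R, the core squared-extrapolation step is only a few lines. Here is my reading of it as a MATLAB sketch (`G` is the fixed-point map and `x` the current iterate; the steplength is the paper's "S3" choice, and the details should be checked against the paper or the package):

```matlab
% One SQUAREM-style squared-extrapolation step for a fixed-point map G (sketch).
x1 = G(x);   x2 = G(x1);     % two ordinary fixed-point evaluations
r  = x1 - x;                 % first difference
v  = (x2 - x1) - r;          % second difference
a  = -norm(r)/norm(v);       % steplength (the paper's "S3" choice)
x  = x - 2*a*r + a^2*v;      % extrapolated update
x  = G(x);                   % stabilizing fixed-point step
```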

user67724