Abstract
Lanczos iterative methods for solving a large sparse linear systems typically face the latent breakdown which strikes every time these methods are deployed. A number of approaches to deal with this issue have been investigated. One of them is by switching between the solvers preemptively breakdown. However, the problem is not fully solved yet. Here, we propose switching models combined with particular Lanczos iterative methods. The first model is by using the last iterate as the switching point with an unlimited number of iterations, the second model is by using the iterate with the minimum residual norm as the initial point and the third model is by using the iterate with the minimum of minimum residual norms as the switching point. These three models lead algorithms of SLULast, SLUMinRes, and SLUMoM, respectively. The parallel version of the proposed algorithms is also provided to speed up their convergence. In this case, we constructed the parallel of SLUMoM and we call it pSLUMoM. The numerical results showed that our switching models performed better than the existing switching strategy in terms of robustness and efficiency. In fact, under a parallel framework, pSLUMoM showed a performance gain of up to 50% in our experiments.
Introduction
Today most iterative methods for solving a large sparse linear system are based on Krylov subspace methods. Consider a nonsingular system
The improvement of the original Lanczos has been made for decades. One interesting was proposed by Brezinski,5,3 by implementing the theory of formal orthogonal polynomial (FOP) as follows. Define the residual vector
Although theoretically, the Lanczos methods terminates after some
Breakdown in the Lanczos methods, or other non-stationary iterative methods, is a disorder that manifests itself whenever such methods are deployed. It mainly occurs when the orthogonal polynomials
A number of approaches have been proposed to handle the breakdown of Lanczos-types. The famous one, proposed in the late Nineteen Nineties, by jumping over the non-existing orthogonal was called the look-ahead method or the recursive zoom (MRZ). 3 Other techniques along with restarting the equal set of algorithms and switching among one kind of Lanczos-type algorithms had been taken into consideration by Farooq and Salhi 7 and Maharani and Salhi. 8 These approaches had been proven to carry out higher than MRZ in the context of robustness. Furthermore, the works which attempted to speed up the convergence of Lanczos-types by embedding the interpolation and extrapolation model had been recommended by Maharani et al.,9,10 including also the parallel version and their implementation on cloud computing. 11 Recently, Lanczos-types are hybridized with the machine learning technology: as can be seen by Thalib et al., 12 the authors have explored the support vector machine (SVM) combined with Lanczos algorithms to solve the nonlinear problems, hence implemented it into the engineering problems. 13 Finally, the enhanced of Lanczos-types were summarized by Maharani et al. 14
In this article, we focus on switching between Lanczos-type algorithms by considering quality point to switch with. We also avoid fixing the number of iterations but monitoring the solution process for signs of breakdown. Note that the existing switching strategy, 7 uses a fixed number of iterations to switch. This, obviously, has the disadvantages explained below.
The switching strategy allows us to switch between one algorithm to another algorithm before the first one breaks down. By switching, the polynomial basis of the Krylov subspace is changed. Therefore, the breakdown can be avoided. Since we do not know when a breakdown is likely to occur, fixing the number of iterations is arbitrary and may be counter-productive. Indeed, if it is too low, then we switch well before breakdown which means that the solution process is halted unnecessarily early losing potentially opportunity for more progress in the solution process. If, on the other hand, it is too large, then breakdown would occur well before the number of iterations has been reached which means that we now have to reset the process from scratch and restart it which is time consuming and may require manual intervention. A preemptive switch based on monitoring the progress of the solution process is desirable since one only switch near the point of breakdown. In some situations, it may well be that there is no need for switching at all. Strategies with no fixed number of iterations to switch are, therefore, attractive. Note that it is possible to monitor the progress of the solution process by checking the quality of the iterates and the denominators of some components of the iterates. When these become too small, orthogonality is lost and the quality of the iterates deteriorates from the point of view of the residual norm, just before the breakdown. 8 Therefore, switching to another Lanczos-type algorithm once this has been observed is recommended. This is what is implemented in this article.
Note, before we reach a breakdown, some good points have been already generated. The goodness or quality of points is measured by their residual norm. The one with the lowest residual norm can be used as a switching point. Switching from this point is also a good one since we use the best iterate amongst the iterates generated by the Lanczos algorithm. This particular rule of switching is adapted from the restarting strategy discussed by Maharani and Salhi. 8 Furthermore, another model of switching is also proposed in this article, namely switching based on the best collection of iterates with the lowest residual norms. This model is promising since it can reduce the computational time, moreover, it is suitable for the parallel version.
This article is organized as follows. The “Introduction” section is the introduction, the “Switching between Lanczos-Type algorithms: Traditional switching strategy” section consists of materials and methods that explains about review of the derivation of Lanczos-type algorithms via the Krylov subspace method, the breakdown phenomenon in Lanczos-type algorithms, and the switching models. The “Proposed switching models” section presents both the theoretical and numerical results of our proposed switching models. Finally, the discussion and conclusion are put in the “Proposed switching models” and “Results and discussion” sections, respectively.
Switching between Lanczos-type algorithms: Traditional switching strategy
The switching approach was introduced for the first time by Farooq and Salhi,
7
to combat breakdown in Lanczos-type algorithm. By using the Brezinski’s scheme, Lanczos Orthores has the following formula
Proposed switching models
Switching model based on last iterate with unfixed number of iterations
We improved the fixed number of iterations of the switching method by considering the unfixed one, as illustrated as follows. Consider other Lanczos types, say Orthodir, for thus we switch after the Orthores algorithm breaks down. The Orthodir algorithm is formed from the combination of two Lanczos formulas, called A
Consider Lanczos algorithms A
All procedures above are put in a frame algorithm called, SLULast, as displayed in Algorithm 1, namely switching Lanczos-type algorithms for the unlimited number of iterations from the last iterate immediately prior to breakdown.
The SLULast algorithm
Switching model based on iterate with the minimum residual norm for unfixed number of iterations
We propose algorithm SLUMinRes (switching Lanczos types using the iterate with the minimum residual norm for the unfixed number of iterations), presented in Algorithm 3. As explained in the previous section, we switch between Lanczos types, particularly A
The MinRes algorithm
The SLUMinRes algorithm
Switching model from the minimum of the minimum residual norms
This model adopts the SLUMinRes algorithm, where it uses the iterate with the minimum of the minimum residual norms of some Lanczos types. We consider some Lanczos types, A
The SLUMoM algorithm
Results and discussion
The theoretical results
Our proposed models, SLULast, SLUMinRes, and SLUMoM, not only yield the good approximate solutions, but also can cut the iterations from the traditional switching. However, we need to justify theoretically that our models consistently meet the lowest residual norms. We adopt8,25 to prove that switching always converges and thus yields an iterate with the residual norm which is smaller than a certain tolerance.
Suppose we solve an SLE using SLUMinRes. Denote
According to the definition of the Krylov subspace given by Saad
1
, iterate
We put the theoretical view above in the following theorem.
Suppose we solve an SLE using SLULast or SLMinRes algorithm. Below are the characteristics of the iterates generated by the algorithm.
The sequence of iterates generated by SLULast or SLUMinRes algorithm, The associated residual norms of the iterates generated by SLULast or SLUMinRes algorithm, decreases as the number of cycles increased, that is, for any
Numerical results
We implemented all the switching models including SLULast, SLUMinRes, and SLUMoM algorithms on several SLEs, ranging from dimensions
The model problem that was used for testing our algorithms is based on the Poisson equation on the open unit square
Comparison of traditional switching and the proposed models
In this testing problem, three switching models are compared for solving system (39). To test the validation of the models, we use two exact solutions,

Performance of SLUMinRes and SLUMoM on SLEs of dimensions 1000–5000 for

Performance of SLUMinRes and SLUMoM on SLEs of dimensions 1000–10,000 for
The comparison of GMRES, traditional switching and switching models SLULast, SLUMinRes, and SLUMoM, with
The comparison of switching SLULast, and switching models SLUMinRes and SLUMoM, with
Parallel of switching models
The switching models for finding the solution of large-scale SLEs, is still time-consuming, particularly on single processor machines. Therefore, the parallel environment is needed to cope with this situation. In general, one sequential program will be split into several sub-programs in a parallel machine and they are executed in several nodes. According to Amdahl’s Law, the parallel machine uses to finish the whole of the task is as fast time as its slowest core. 15 The previous MPI and open MPI is a well-known parallel algorithm for solving 1D heat equation was described. 16 For solving high dimensional problems of SLEs particularly, the distribution of subsets of the equation can be done by sending each into a number of computing machines or cores. One has been done was by using the accelerated projection-based consensus. 17 Other methods, by utilizing the parallel iterative methods, for instance, such as by Sultanov et al., 18 Ma et al., 19 and Anzt et al. 20 The recent parallel works on GMRES running on NVIDIA GPGPU accelerator for solving systems of linear equations based on the sparse matrices, were discussed by Minin et al. 21
In this study, we focus on parallel using the parallel computing toolbox (PCT) provided by MatLab. Particularly, we use parfor-loop which is a built-function available at PCT. The parfor-loop is useful in situations where many loop iterations of a simple calculation are required. It divides the loop iterations into groups so that each worker executes some portion of the total number of iterations. The parfor-loop is also useful when the loop iterations take a long time to execute because the workers can execute iterations simultaneously. In principle, when using the parfor-loop, the data is sent from the client to workers, and the results are sent back to the client and pieced together. This is illustrated in Figure 3.

The concept of parallel tasks on several CPUs. 22
In relation to our switching models, the most suitable model for using the parfor-loop principal is the SLUMoM algorithm. It is because there are some sub-programs (A

The parallel scheme of pSLUMoM algorithm.
Comparison of SLUMoM and pSLUMoM
Using the same testing problems of SLEs, we present the numerical results of the switching model SLUMoM and its parallel version, pSLUMoM. We calculated the speedup of the parallel scheme could reach against the sequential one when solving several SLEs. It is visualized clearly in Figures 5 and 6 for, respectively, the true solution

Comparison time between switching model the minimum of minimum residual norms and its parallel version for solving

Comparison time between switching model the minimum of minimum residual norms and its parallel version for solving
Discussion
Overall, the proposed switching models, SLULast, SLUMinRes, and SLUMoM, performed better compared with the traditional switching, in terms of accuracy, efficiency, and robustness. As we can see from Tables 1 and 2, although all switching models obtained the approximate solutions with the residual norms of
According to Tables 3 and 4, the pSLUMoM algorithm could reduce the computational time up to 50% from the sequential SLUMoM. This is a good start as we just used the parfor parallel provided in Matlab, with the limited number of workers, namely only 4. We might get better results when we used other parallel platforms and several workers.
The comparison of switching model SLUMoM and pSLUMoM with
The comparison of switching model SLUMoM and pSLUMoM with
Conclusion
The new fashions of switching approach take into account the iterate with the minimum residual norm and the iterate with the minimum of minimum residual norms preceding breakdown. They additionally do unlimiting the number of iterations before switching, which is different from the conventional one. These proposed procedures result in new algorithms, specifically SLULast, SLUMinRes, and SLUMoM.
The results showed that the performances of SLUMinRes and SLUMoM in solving several SLEs are better than SLULast and the conventional switching, in terms of efficiency and accuracy. It is understandable that the iterate with the lowest residual norm is the best point for switching. For the SLUMoM algorithm particularly, the parallel version was created using the parfor function to reduce the computational time.
There are new open questions in this study, unfortunately, the performance of SLUMinRes doesn’t seem to follow the theoretical results. It may be visible from the outcomes that on a few cycles of SLUMinRes, the residual norms are nonetheless fluctuating rather than decreasing. Our point of view in this case is that the orthogonal polynomials,
Footnotes
Acknowledgements
We would like to say thanks to Prof. Abdellah Salhi for his idea of modified switching. We would also thanks to Universiti Malaysia Terengganu for all facilities provided during this research.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This project is supported financially by Research Intensified Grant Scheme (RIGS) Universiti Malaysia Terengganu Phase 1/2019, Number RIGS/1/2018/ICT04/UMT/02/1.
