This paper develops a block diagonal preconditioned Uzawa splitting (BDP-US) method for solving saddle-point problems by generalizing the Uzawa splitting iteration method proposed by Li and Ma (Numer Math Theory Methods Appl 2018; 11: 235–246). A sufficient condition is then provided to ensure the convergence of the BDP-US method. In addition, a preconditioner based on the BDP-US method is proposed, the spectral properties of the preconditioned matrix are analyzed, and the choice of the parameters for this matrix splitting iteration method is discussed. Numerical results support the theoretical findings and demonstrate the effectiveness of the BDP-US method as well as the corresponding preconditioner.
Consider the linear system of the block structure:
where is symmetric and positive definite, with , , , , and denotes the mesh size. is the transpose of , and is the zero matrix of proper dimension. It is known that the linear system has a unique solution when is of full column rank (see Benzi et al.,1 Bai and Bai,2 Bai and Pan3).
Linear systems of the form (1) are called saddle-point problems, and arise in many important applications in various fields of science and engineering, including computational fluid dynamics, optimization, optimal control, and constrained and weighted least-squares estimation, among many others.4–8 Designing effective algorithms to solve them has therefore attracted considerable attention. In general, for a large-scale problem (1), only iterative methods are computationally feasible because of excessive storage and/or computational cost. By exploiting algebraic properties and the structure of the coefficient matrix, many iterative methods have been developed to solve (1), such as Uzawa-type methods,9–12 SOR (successive overrelaxation)-type methods,13–16 HSS (Hermitian and skew-Hermitian splitting)-type methods and their accelerated variants,17–20,28 and Krylov subspace methods with suitable preconditioners.21–24 In particular, Bai28 discussed the optimal parameters in HSS-like methods for saddle-point problems and provided quasi-optimal parameters for the HSS iteration method.
Recently, based on the shift-splitting iteration method21 and the classical Uzawa method,9 Li and Ma25 proposed the following Uzawa splitting (US) iteration method
to solve the singular saddle-point problem (1) with and being the rank of the matrix, where is a symmetric positive definite matrix, and and are two positive constants. Even though this method is a special case of the parameterized inexact Uzawa (PIU) method,15 it performs better than the Uzawa and Uzawa-SOR methods.26 Moreover, this method is convergent when
where is the largest eigenvalue of the matrix . However, the US method may converge very slowly as the scale of the problem increases: in this case, the value of may become larger, which makes too large or too small. For example, when , the values of given in Li and Ma25 are , , and , respectively. It should also be mentioned that no strategy for selecting the iteration parameters is given in Li and Ma,25 and that each iteration of the US method requires the exact solution of a linear system with coefficient matrix . However, it is usually impractical to solve this system directly, and it may be expensive to solve it by the conjugate gradient method without preconditioning when is large.
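The two-step structure just described (a shift-splitting step for the first block of unknowns followed by an Uzawa update for the second) can be illustrated with a small NumPy sketch. Since the displayed formulas are not reproduced above, the concrete update rules, the random test matrices, and the parameter values below are illustrative assumptions rather than the exact setting of Li and Ma;25 the sketch only verifies that the exact solution of the saddle-point system is a fixed point of such a sweep.

```python
import numpy as np

# NumPy sketch of a US-type sweep: a shift-splitting step on the (1,1)
# block followed by an Uzawa update. The update rules, test matrices,
# and parameters are illustrative assumptions, not the exact formulas
# of Li and Ma.
rng = np.random.default_rng(0)
n, m = 20, 8
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)        # symmetric positive definite
B = rng.standard_normal((n, m))    # full column rank (generically)
f, g = rng.standard_normal(n), rng.standard_normal(m)
Q = np.eye(m)                      # SPD matrix in the Uzawa step
alpha, tau = 1.0, 0.05             # positive iteration parameters

def us_sweep(x, y):
    # shift-splitting step: (1/2)(alpha I + A) x' = (1/2)(alpha I - A) x + f - B y
    x = np.linalg.solve(0.5 * (alpha * np.eye(n) + A),
                        0.5 * (alpha * np.eye(n) - A) @ x + f - B @ y)
    # Uzawa step: y' = y + tau Q^{-1} (B^T x' - g)
    y = y + tau * np.linalg.solve(Q, B.T @ x - g)
    return x, y

# the exact solution of the saddle-point system is a fixed point of the sweep
K = np.block([[A, B], [B.T, np.zeros((m, m))]])
z = np.linalg.solve(K, np.concatenate([f, g]))
xs, ys = z[:n], z[n:]
xn, yn = us_sweep(xs, ys)
print(np.linalg.norm(xn - xs), np.linalg.norm(yn - ys))  # both essentially zero
```

With parameters in a convergent range, iterating `us_sweep` from any starting guess drives the residual of the saddle-point system toward zero.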
To overcome these shortcomings, in this paper we (a) develop a BDP-US method for solving the saddle-point problem (1) by generalizing the US method; (b) provide a sufficient condition to ensure the convergence of the BDP-US method; (c) present a splitting preconditioner and analyze the spectral properties of the preconditioned matrix; and (d) discuss the choice of the parameters for the BDP-US method. The proposed method includes the US method of Li and Ma25 as a special case. Numerical results demonstrate that the BDP-US method is effective and feasible for solving the singular saddle-point problem (1). Moreover, the splitting preconditioner can improve the convergence rate of the generalized minimal residual (GMRES) method.
The remainder of this paper is organized as follows. In Section 2, the BDP-US iteration method is presented. The convergence properties of the BDP-US method and the spectral properties of the preconditioned matrix are analyzed in Sections 3 and 4, respectively. The choice of the parameters for the BDP-US method is discussed in Section 5, and numerical experiments are provided in Section 6. Finally, we draw some conclusions in Section 7.
In the following, we denote by the identity matrix of order , by the real part of a complex number, by the superscript the transpose of a matrix or vector, by the spectral radius of a matrix, by the vector , and by the largest eigenvalue of a symmetric positive definite matrix.
The BDP-US iteration method
In this section, we shall develop the BDP-US iteration method for solving the linear system (1).
Let
where and are positive definite matrices. Then, multiplying the linear system (1) on the left and right by and , respectively, we obtain
which can be written as
Similar to Bai et al.,21 we can define the corresponding shift-splitting of as
where is a positive constant. For the available approximation , , the following iteration scheme is first applied to the first equations in (2) to compute the update :
Then the Uzawa iteration method is used to the last equations in (2) to compute the update :
where is symmetric and positive definite, and is a positive constant.
For simplicity, we denote , , and . Then the results in (4) and (5) suggest the following BDP-US method for the linear system (2):
or
where the iteration matrix
with
and
On the other hand, let
and
then we obtain a splitting , and
Overall, the BDP-US method can be performed as follows: given an initial guess with and , and two positive parameters and , for , compute by the iteration scheme (6) until the iteration sequence converges. We thereby obtain the converged sequence with and .
Clearly, this BDP-US method reduces to the US method in Li and Ma25 by setting and , and different , and can be selected in actual computation.
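The effect of the block-diagonal transformation can be sketched under one plausible reading, a symmetric scaling of the two blocks by SPD matrices (the paper's exact transformation is given by the displayed formulas, so the scaling form and the concrete choices below are assumptions): the scaled system retains the saddle-point structure with an SPD (1,1) block, and its solution maps back to that of the original system.

```python
import numpy as np

# One plausible reading of the block-diagonal transformation: symmetric
# scaling of the two blocks by SPD matrices P1 and P2 (both chosen here
# for illustration only). The scaled system keeps the saddle-point
# structure, and its solution maps back to the original one.
rng = np.random.default_rng(1)
n, m = 12, 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)                 # SPD (1,1) block
B = rng.standard_normal((n, m))
f, g = rng.standard_normal(n), rng.standard_normal(m)

P1 = np.diag(np.diag(A))                    # illustrative SPD choice: diagonal of A
P1h = np.diag(1.0 / np.sqrt(np.diag(P1)))   # P1^{-1/2}
P2h = np.eye(m)                             # P2^{-1/2} with P2 = I (illustrative)

A_hat, B_hat = P1h @ A @ P1h, P1h @ B @ P2h  # scaled blocks, A_hat still SPD
f_hat, g_hat = P1h @ f, P2h @ g

K = np.block([[A, B], [B.T, np.zeros((m, m))]])
K_hat = np.block([[A_hat, B_hat], [B_hat.T, np.zeros((m, m))]])
z = np.linalg.solve(K, np.concatenate([f, g]))
z_hat = np.linalg.solve(K_hat, np.concatenate([f_hat, g_hat]))
x_back, y_back = P1h @ z_hat[:n], P2h @ z_hat[n:]
print(np.linalg.norm(x_back - z[:n]), np.linalg.norm(y_back - z[n:]))
```

A diagonal scaling like this equilibrates the (1,1) block, which is one way such a transformation can keep the shift parameter from degenerating as the problem size grows.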
Convergence analysis of the BDP-US method
In this section, we shall analyze the convergence of the BDP-US iteration method, and present a sufficient condition for its convergence. To this end, we need the following result.
Lemma 3.1.27 Both roots of the real quadratic equation are less than one in modulus if and only if and .
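Lemma 3.1 is Young's classical root condition for a real quadratic. Writing the quadratic as λ² − bλ + c = 0 (a sign convention assumed here, since the displayed equation is not reproduced), both roots have modulus less than one if and only if |c| < 1 and |b| < 1 + c, which can be checked numerically:

```python
import numpy as np

def roots_inside_unit_circle(b, c):
    """Moduli of both roots of lambda^2 - b*lambda + c = 0 are < 1."""
    return bool(np.all(np.abs(np.roots([1.0, -b, c])) < 1.0))

def young_criterion(b, c):
    """Young's condition: |c| < 1 and |b| < 1 + c."""
    return abs(c) < 1.0 and abs(b) < 1.0 + c

rng = np.random.default_rng(2)
agree = total = 0
for _ in range(2000):
    b = rng.uniform(-3.0, 3.0)
    c = rng.uniform(-2.0, 2.0)
    # skip samples too close to the criterion boundary, where
    # floating-point root finding could flip a strict comparison
    if abs(abs(c) - 1.0) < 1e-6 or abs(abs(b) - (1.0 + c)) < 1e-6:
        continue
    total += 1
    agree += roots_inside_unit_circle(b, c) == young_criterion(b, c)
print(agree, total)  # the two tests agree on every sample
```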
It is well known that the BDP-US method (7) is convergent if and only if . Let be an eigenvalue of and let with and be the corresponding eigenvector; then
Namely,
This, together with (8) and (9), implies that
or
Lemma 3.2. Assume that and are positive definite, and are symmetric positive definite, is of full column rank, and . If is an eigenvalue of and is the corresponding eigenvector, then and .
Proof. If , then (11) becomes
By the assumptions, the coefficient matrix in (12) is nonsingular. Thus and , which contradicts the fact that , since is an eigenvector of . Therefore, we must have .
Suppose that . Then (11) becomes
Thus by , and the positive definiteness of . This is impossible since . Therefore, our result holds.
Now, we state and prove the convergence result of the BDP-US method.
Theorem 3.1. Suppose that and are positive definite, and are symmetric positive definite, is of full column rank, and . Then the BDP-US method is convergent if the parameters and satisfy the following condition:
where
In particular, when , the BDP-US iteration method is convergent if .
Proof. Let be an eigenvalue of and be the corresponding eigenvector. Then and by Lemma 3.2, and
by the second equation of (11). In the following, we establish the relationship between the eigenvalue and the parameters and in two cases.
Case 1: . Then by (15), and the first equation of (11) becomes
Using and multiplying both sides of the above equation by , we get
where by the positive definiteness of . Thus, by .
Case 2: . Then substituting (15) into the first equation of (11) yields
where is defined in (14). Multiplying both sides of the above equation by yields
where due to the positive definiteness of .
From (17) and Lemma 3.1, we know that if and only if
That is,
Since , (13) implies (18). Thus our results are obtained.
Spectral properties of the preconditioned matrix
It is worth pointing out that the BDP-US iteration method, a stationary iteration scheme, provides the preconditioner defined in (8), which can accelerate the convergence of the GMRES method when applied to the linear system (2). In this section, we analyze the spectral properties of the preconditioned matrix .
As is known, when the preconditioner is applied to accelerate the convergence of Krylov subspace methods, one must solve an intermediate linear system
at each iteration, where and . This system consists of two smaller systems and is solved as follows: First, solve the linear system
then solve . Noting that and , (20) can be solved either by a (sparse) Cholesky factorization or by the preconditioned conjugate gradient method, and the much smaller system can be solved by a Cholesky factorization.
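Both solution options mentioned above, a (sparse) Cholesky factorization and the conjugate gradient method, can be sketched with SciPy on an assumed SPD model matrix (a 1-D discrete Laplacian standing in for the actual SPD coefficient matrix of (20)):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.sparse import diags
from scipy.sparse.linalg import cg

# SPD model matrix: the 1-D discrete Laplacian (an illustrative
# stand-in for the SPD system arising in the preconditioner solve)
n = 200
T = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

# option 1: Cholesky factorization, reusable across right-hand sides
c, low = cho_factor(T.toarray())
x_chol = cho_solve((c, low), b)

# option 2: conjugate gradient, matrix-free and memory-friendly
x_cg, info = cg(T, b, maxiter=5000)

print(info, np.linalg.norm(x_chol - x_cg) / np.linalg.norm(x_chol))
```

In practice the factorization pays off when the same preconditioner is applied at every Krylov iteration, while CG avoids storing a factor of a large sparse block.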
The theorem below gives the spectral properties of the preconditioned matrix .
Theorem 4.1. Assume that and are positive definite, and are symmetric positive definite, and is of full column rank. Then the real part of any eigenvalue of is positive; that is, the preconditioned matrix is positive stable. Furthermore, if , where is defined by (14).
Proof. Let be an eigenvalue of and be the corresponding eigenvector. Then
by (9) and (10). This implies that is an eigenvalue of , and (17) holds at by the proof of Theorem 3.1, that is,
where , and since by Lemma 3.2. Thus
This, together with and , implies that the real part of satisfies when , and when , since .
Furthermore, from (21) and Lemma 3.1, if and only if and , that is, . Thus the assertion holds by and .
The parameter choice
In general, the effectiveness of an iteration method depends largely on the choice of its parameters. Since the optimal parameters are usually very hard to find, it is important to have a good strategy for approximating them. In particular, Bai28 provided quasi-optimal parameters of HSS iteration methods for solving large sparse saddle-point problems. Following an analysis similar to that of Bai,28 we give a strategy for selecting the iteration parameters of the BDP-US method for solving the linear system (2).
Let be an eigenvalue of and be a corresponding eigenvector. When is of full column rank, and by Lemma 3.2, and
by (17), where and . In the following, we analyze the choices for the parameters in two cases.
Case 1: . Then , or and , or and . So
a) . Then by and . Due to the linearity of with respect to and , at when , and at when .
b) . Then
by , since . Since is increasing and decreasing with respect to and , respectively, and attain their minimum when , which leads to .
c) . Then , and at .
d) . Let . Then
This implies that is decreasing and increasing with respect to and , respectively. Thus, at when ; and at when .
Case 2: . That is, . Then
In summary, at with , and at . Therefore, the best choices of and are .
Numerical experiments
In this section, a numerical example is provided to support the theoretical results and illustrate the effectiveness of the BDP-US method. All experiments are conducted in MATLAB (R2015a) and terminated when the current residual
is less than or the number of iteration steps exceeds the prescribed maximum . In our computations, the initial guess is set to zero, and the right-hand side is chosen such that the exact solution of the saddle-point problem (2) is .
Example 6.1.18 Consider the following linear system, where
is the discretization mesh size, ⊗ denotes the Kronecker product, and .
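The displayed matrices of Example 6.1 are not reproduced above. Test problems of this Kronecker-product type (cf. Bai et al.18) are commonly built as follows; the concrete stencils and scalings below are therefore an assumed reconstruction, not necessarily the paper's exact data:

```python
import numpy as np

# Assumed reconstruction of a classical Kronecker-product saddle-point
# test problem (cf. Bai, Golub, and Pan [18]); treat the constants below
# as illustrative, not the paper's exact matrices.
p = 8
h = 1.0 / (p + 1)                                  # discretization mesh size
I = np.eye(p)
T = (1.0 / h**2) * (2 * np.eye(p) - np.eye(p, k=1) - np.eye(p, k=-1))
F = (1.0 / h) * (np.eye(p) - np.eye(p, k=-1))

L = np.kron(I, T) + np.kron(T, I)                  # 2-D Laplacian, SPD
A = np.block([[L, np.zeros((p * p, p * p))],
              [np.zeros((p * p, p * p)), L]])      # SPD (1,1) block of order 2p^2
B = np.vstack([np.kron(I, F), np.kron(F, I)])      # size 2p^2 x p^2

# sanity checks: A symmetric positive definite, B of full column rank
print(np.allclose(A, A.T),
      np.linalg.eigvalsh(A).min() > 0,
      np.linalg.matrix_rank(B) == p * p)
```

Note how the 1/h² scaling of T makes the spectrum of A spread as the mesh is refined, which is exactly the regime in which parameter selection becomes delicate.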
To show the advantages of the BDP-US method and the preconditioner , we compare them with three iteration methods (Uzawa-SOR,26 Uzawa-Low,29 and the one-parameter variant of the preconditioned Uzawa (OVPU)30 method) and their corresponding preconditioners. In all experiments, the parameter in the OVPU method is set to its optimal value, while the parameters in the other methods are set to experimentally computed optimal ones, that is, those yielding the fewest iteration steps.
Take , , and as shown in Table 1, and . Tables 2–4 report the numerical results obtained by these methods for the two cases, where "IT" denotes the number of iteration steps and "CPU" the elapsed CPU time in seconds. From Tables 2–4, we see that the BDP-US method outperforms the other three methods, and that BDP-US-I performs worse than BDP-US-II. Thus, the BDP-US method with suitable parameters , , and is more efficient.
Table 1. Choices of the matrices , , and for Cases I and II.
Table 2. Numerical results for Example 6.1 ().

Cases          Parameters        IT    CPU     RES
Uzawa-SOR-I    (1.38, –, 0.10)   159   0.0313  9.87E-08
Uzawa-SOR-II   (1.20, –, 0.17)   128   0.0288  9.85E-08
Uzawa-Low-I    (1.50, –, 0.13)   139   0.0307  8.88E-08
Uzawa-Low-II   (1.52, –, 0.26)    87   0.0199  9.90E-08
OVPU-I         (0.4136, –, –)     81   0.0192  9.60E-08
OVPU-II        (0.4136, –, –)     79   0.0190  9.99E-08
BDP-US-I       (–, 118, 0.248)    68   0.0187  9.83E-08
BDP-US-II      (–, 0.50, 0.50)    49   0.0084  8.39E-08
Table 3. Numerical results for Example 6.1 ().

Cases          Parameters          IT    CPU     RES
Uzawa-SOR-I    (1.41, –, 0.032)    436   3.5673  9.80E-08
Uzawa-SOR-II   (1.25, –, 0.056)    366   2.1211  9.69E-08
Uzawa-Low-I    (1.48, –, 0.038)    418   2.3622  9.73E-08
Uzawa-Low-II   (1.48, –, 0.075)    272   2.3017  9.78E-08
OVPU-I         (0.1529, –, –)      277   1.7060  9.82E-08
OVPU-II        (0.1529, –, –)      266   1.6631  9.27E-08
BDP-US-I       (–, 237, 0.137)     112   0.6171  9.30E-08
BDP-US-II      (–, 0.308, 0.308)    86   0.4954  7.90E-08
Table 4. Numerical results for Example 6.1 ().

Cases          Parameters          IT    CPU      RES
Uzawa-SOR-I    (1.45, –, 0.015)    831   22.8820  9.89E-08
Uzawa-SOR-II   (1.23, –, 0.028)    711   18.5213  9.92E-08
Uzawa-Low-I    (1.46, –, 0.019)    809   22.6989  9.22E-08
Uzawa-Low-II   (1.45, –, 0.041)    520   17.2806  9.11E-08
OVPU-I         (0.0764, –, –)      602   14.0528  9.82E-08
OVPU-II        (0.0764, –, –)      569   12.4945  9.73E-08
BDP-US-I       (–, 348, 0.093)     154    5.1158  9.48E-08
BDP-US-II      (–, 0.238, 0.238)   114    2.7059  8.95E-08
Moreover, we accelerate the convergence of the GMRES(10) method by using the , Uzawa-SOR, Uzawa-Low, and OVPU preconditioners (denoted by , , , and , respectively). Tables 5 and 6 list the outer (inner) iteration numbers IT, the CPU time, and the RES for these preconditioned GMRES(10) runs in the two cases when , where IT, CPU, and RES are as in Table 2. From Tables 5 and 6, we see that GMRES(10) with the preconditioner requires fewer iterations and less CPU time than with the other three preconditioners, and that Case II outperforms Case I. Thus, GMRES with the preconditioner achieves a faster convergence rate.
Table 5. Outer (inner) iterations, CPU(s), and RES of GMRES(10) for Case I when .

Preconditioners  IT     CPU     RES
Uzawa-SOR        13(9)  2.7271  9.46E-08
Uzawa-Low        13(7)  2.5116  9.34E-08
OVPU             18(9)  3.2805  7.94E-08
BDP-US            5(3)  0.8190  7.93E-08
Table 6. Outer (inner) iterations, CPU(s), and RES of GMRES(10) for Case II when .

Preconditioners  IT     CPU     RES
Uzawa-SOR        12(8)  2.3917  9.50E-08
Uzawa-Low        11(9)  2.2694  9.36E-08
OVPU             21(8)  3.8201  8.01E-08
BDP-US            4(7)  0.6848  7.72E-08
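The preconditioned GMRES(10) runs above can be mimicked with SciPy by passing the preconditioner action as a LinearOperator. For illustration, this sketch uses an exact sparse LU factorization of the saddle-point matrix itself as the preconditioner, an idealized stand-in for the BDP-US, Uzawa-SOR, Uzawa-Low, and OVPU preconditioners compared above, so restarted GMRES converges almost immediately:

```python
import numpy as np
from scipy.sparse import bmat, csc_matrix
from scipy.sparse.linalg import LinearOperator, gmres, splu

rng = np.random.default_rng(3)
n, m = 30, 10
M = rng.standard_normal((n, n))
A = csc_matrix(M @ M.T + n * np.eye(n))        # SPD (1,1) block
B = csc_matrix(rng.standard_normal((n, m)))
K = bmat([[A, B], [B.T, None]], format="csc")  # saddle-point matrix
b = rng.standard_normal(n + m)

# preconditioner action z = P^{-1} r; here P is the exact LU of K itself,
# an idealized stand-in for a practical splitting preconditioner
lu = splu(K)
P_inv = LinearOperator(K.shape, matvec=lu.solve)

x, info = gmres(K, b, restart=10, maxiter=200, M=P_inv)
print(info, np.linalg.norm(K @ x - b) / np.linalg.norm(b))
```

A practical preconditioner only approximates this inverse action with two cheap block solves per application, trading a few extra outer iterations for a far cheaper setup.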
To see this more clearly, Figures 1 and 2 depict the eigenvalue distributions of the corresponding preconditioned matrices for the two cases when . Clearly, the real parts of are all nonnegative, which agrees with Theorem 4.1, and the eigenvalues of are more clustered than those of the other three preconditioned matrices. Therefore, the preconditioner with suitable parameters is more effective than , , and when applied to large sparse saddle-point problems.
Eigenvalue distributions of the corresponding preconditioned matrices for Case I when : (a) Uzawa-SOR, (b) Uzawa-Low, (c) OVPU, and (d) BDP-US.
Eigenvalue distributions of the corresponding preconditioned matrices for Case II when : (a) Uzawa-SOR, (b) Uzawa-Low, (c) OVPU, and (d) BDP-US.
Conclusions
This paper has developed a block-diagonally preconditioned US (BDP-US) method for solving saddle-point problems by generalizing the US method of Li and Ma25 and fully exploiting the block diagonal preconditioning technique, and has provided a sufficient condition to ensure its convergence. A splitting preconditioner has then been presented, the spectral properties of the preconditioned matrix analyzed, and the choice of the parameters for this iteration method discussed. Finally, numerical experiments demonstrate that the BDP-US method performs better than some existing methods and that the preconditioner can accelerate the corresponding Krylov subspace methods. Further work should focus on finding optimal parameters of the BDP-US method and on giving a more accurate description of the spectral distribution of the preconditioned matrix.
Acknowledgements
The authors would like to thank anonymous reviewers for their valuable comments and suggestions for revising the paper.
Handling Editor: Chenhui Liang
Author contributions
The authors confirm contribution to the paper as follows: Bo Wu created the idea, performed the formulations and prepared this paper under the supervision of Xing-Bao Gao. All authors reviewed the results and approved the final version of the manuscript.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
ORCID iD
Bo Wu
Data accessibility statement
All the data used in this paper can be obtained upon request to the corresponding author.
References
1. Benzi M, Golub GH, Liesen J. Numerical solution of saddle point problems. Acta Numer 2005; 14: 1–137.
2. Bai ZJ, Bai ZZ. On nonsingularity of block two-by-two matrices. Linear Algebra Appl 2013; 439: 2388–2404.
3. Bai ZZ, Pan JY. Matrix analysis and computations. Philadelphia, PA: SIAM, 2021.
4. Elman HC. Preconditioners for saddle point problems arising in computational fluid dynamics. Appl Numer Math 2002; 43: 75–89.
5. Gould NIM, Hribar ME, Nocedal J. On the solution of equality constrained quadratic programming problems arising in optimization. SIAM J Sci Comput 2001; 23: 1375–1394.
6. Betts JT. Practical methods for optimal control using nonlinear programming. Philadelphia, PA: SIAM, 2001.
7. Bjorck A. Numerical methods for least squares problems. Philadelphia, PA: SIAM, 1996.
8. Bai ZZ. Structured preconditioners for nonsingular matrices of block two-by-two structures. Math Comput 2005; 75: 791–815.
9. Arrow K, Hurwicz L, Uzawa H. Studies in nonlinear programming. Stanford, CA: Stanford University Press, 1958.
10. Liang ZZ, Zhang GF. On block-diagonally preconditioned accelerated parameterized inexact Uzawa method for singular saddle point problems. Appl Math Comput 2013; 221: 89–101.
11. Li JC, Wang NN, Kong X. Uzawa-Low method and preconditioned Uzawa-Low method for three-order block saddle point problem. Appl Math Comput 2015; 269: 626–636.
12. Miao SX. A new Uzawa-type method for saddle point problems. Appl Math Comput 2017; 300: 95–102.
13. Golub GH, Wu X, Yuan JY. SOR-like methods for augmented systems. BIT Numer Math 2001; 41: 71–85.
14. Bai ZZ, Parlett BN, Wang ZQ. On generalized successive overrelaxation methods for augmented linear systems. Numer Math 2005; 102: 1–38.
15. Bai ZZ, Wang ZQ. On parameterized inexact Uzawa methods for generalized saddle point problems. Linear Algebra Appl 2008; 428: 2900–2932.
16. Li C, Ma C. The modified ASSOR-like method for saddle point problems. J Appl Anal Comput 2021; 11: 1718–1730.
17. Bai ZZ, Golub GH, Ng MK. Hermitian and skew-Hermitian splitting methods for non-Hermitian positive definite linear systems. SIAM J Matrix Anal Appl 2003; 24: 603–626.
18. Bai ZZ, Golub GH, Pan JY. Preconditioned Hermitian and skew-Hermitian splitting methods for non-Hermitian positive semidefinite linear systems. Numer Math 2004; 98: 1–32.
19. Bai ZZ, Golub GH. Accelerated Hermitian and skew-Hermitian splitting iteration methods for saddle-point problems. IMA J Numer Anal 2007; 27: 1–23.
20. Bai ZZ, Benzi M. Regularized HSS iteration methods for saddle-point linear systems. BIT Numer Math 2017; 57: 287–311.
22. Bai ZZ. On spectral clustering of HSS preconditioner for generalized saddle-point matrices. Linear Algebra Appl 2018; 555: 285–300.
23. Cao Y, Ren ZR, Yao LQ. Improved relaxed positive-definite and skew-Hermitian splitting preconditioners for saddle point problems. J Comput Math 2019; 37: 95–111.
24. Zhu JL, Wu YJ, Yang AL. A two-parameter block triangular preconditioner for double saddle point problem arising from liquid crystal directors modeling. Numer Algorithms 2022; 89: 987–1006.
25. Li JT, Ma CF. Semi-convergence analysis of Uzawa splitting iteration method for singular saddle point problems. Numer Math Theory Methods Appl 2018; 11: 235–246.
26. Zhang J, Shang J. A class of Uzawa-SOR methods for saddle point problems. Appl Math Comput 2010; 216: 2163–2168.
27. Young DM. Iterative solution of large linear systems. New York, NY: Academic Press, 1971.
28. Bai ZZ. Optimal parameters in the HSS-like methods for saddle-point problems. Numer Linear Algebra Appl 2009; 16: 447–479.
29. Yun JH. Variants of the Uzawa method for saddle point problem. Comput Math Appl 2013; 65: 1037–1046.
30. Yun JH. Fast one-parameter variant of preconditioned Uzawa method with a scaled preconditioner for saddle point problems. Appl Math Sci 2016; 10: 109–117.