18
Different Techniques to Solve Monotone Inclusion Problems

Tanmoy Som1, Pankaj Gautam1, Avinash Dixit1, and D. R. Sahu2

1Department of Mathematical Sciences, Indian Institute of Technology (Banaras Hindu University), Varanasi, Uttar Pradesh, 221005, India

2Department of Mathematics, Banaras Hindu University, Varanasi, Uttar Pradesh, 221005, India

18.1 Introduction

Consider a monotone operator $T : \mathcal{H} \to 2^{\mathcal{H}}$. A point $x \in \mathcal{H}$ is a zero of $T$ if $0 \in T(x)$. The set of all zeros of $T$ is denoted by zer$(T)$. A wide class of problems can be reduced to the fundamental problem of nonlinear analysis of finding a zero of a maximal monotone operator $T$ in a real Hilbert space $\mathcal{H}$:

(18.1) find $x \in \mathcal{H}$ such that $0 \in T(x)$.

A more general problem that we can consider here is to find $x \in \mathcal{H}$ such that $y \in T(x)$, for some given $y \in \mathcal{H}$. This problem has many important applications in scientific fields such as signal and image processing [1], inverse problems [2,3], convex optimization [4], and machine learning [5].

Suppose $T$ is the gradient of a differentiable convex function $f$, i.e. $T = \nabla f$. The simplest approach to solve Problem 18.1 is then the gradient method, given by

$x_{n+1} = x_n - \lambda_n \nabla f(x_n),$

where $\lambda_n > 0$ are step sizes. The operator $I - \lambda_n \nabla f$ is called the forward operator, and the above scheme is nothing other than the classical method of steepest descent.

In case $f$ is a nondifferentiable function, the gradient method generalizes to the subgradient method, given by

$x_{n+1} = x_n - \lambda_n g_n, \qquad g_n \in \partial f(x_n),$

where $\partial f$ denotes the subdifferential of $f$.
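For concreteness, the following is a minimal Python sketch of the two iterations above; the constant step size for the gradient method and the diminishing rule $\lambda_n = 1/(n+1)$ for the subgradient method are illustrative choices, not prescriptions from the text.

```python
import numpy as np

def gradient_method(grad_f, x0, lam=0.1, steps=200):
    """Gradient (steepest descent) iteration x_{n+1} = x_n - lam * grad_f(x_n),
    with an assumed constant step size."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x - lam * grad_f(x)
    return x

def subgradient_method(subgrad_f, x0, steps=200):
    """Subgradient iteration with the standard diminishing step lam_n = 1/(n+1);
    subgrad_f(x) must return some element of the subdifferential at x."""
    x = np.asarray(x0, dtype=float)
    for n in range(steps):
        x = x - (1.0 / (n + 1)) * subgrad_f(x)
    return x

# Example: f(x) = ||x||_1 is nondifferentiable; np.sign(x) is a subgradient.
x_min = subgradient_method(np.sign, x0=[2.0, -3.0])
```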

For a general monotone operator $T$, the classical approach to solve Problem 18.1 is the proximal point method, which converts the maximal monotone operator inclusion problem into a fixed point problem of a firmly nonexpansive mapping via the resolvent operator. The proximal point algorithm [6,7] is one of the most influential methods for solving Problem 18.1 and has been studied extensively both in theory and practice [8–10]. The proximal point algorithm is given by

$x_{n+1} = J_{\lambda_n}^{T}(x_n) = (I + \lambda_n T)^{-1}(x_n),$

where $\lambda_n > 0$ is a regularization parameter.

The operator $J_{\lambda}^{T} = (I + \lambda T)^{-1}$ is called the resolvent operator, introduced by Moreau [11]; it is also known as the backward operator. The method was extended to general monotone operators by Rockafellar [10], who also studied the weak convergence behavior of the proximal point algorithm under some mild assumptions. The aim of this chapter is to study the convergence behavior of the proximal point algorithm and its various modified forms.
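As an illustration, the following minimal Python sketch realizes the proximal point iteration for the affine monotone operator $T(x) = Qx + c$ (an assumed example), whose resolvent reduces to a linear solve.

```python
import numpy as np

def resolvent_affine(Q, c, lam, x):
    """Resolvent J_lam^T = (I + lam*T)^{-1} for T(x) = Qx + c with Q symmetric
    positive semidefinite: solve (I + lam*Q) y = x - lam*c for y."""
    n = len(x)
    return np.linalg.solve(np.eye(n) + lam * Q, x - lam * c)

def proximal_point(Q, c, x0, lam=1.0, steps=50):
    """Proximal point iteration x_{n+1} = J_lam^T(x_n); the limit is a zero
    of T, i.e. a solution of Qx + c = 0."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = resolvent_affine(Q, c, lam, x)
    return x
```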

In Section 18.2, we collect some definitions and results used in the convergence analysis of the algorithms. In Section 18.3, we discuss the convergence behavior of the proximal point algorithm. In Section 18.4, we discuss splitting algorithms for solving monotone inclusion problems. Section 18.5 deals with inertial methods for solving monotone inclusion problems. A numerical example is also given to compare the convergence speed of different algorithms based on the inertial technique.

18.2 Preliminaries

Throughout this chapter, $\mathcal{H}$ denotes a real Hilbert space.

An operator $T : \mathcal{H} \to 2^{\mathcal{H}}$ is said to be uniformly monotone on a subset $C \subseteq \mathcal{H}$ if there exists an increasing function $\phi : [0, \infty) \to [0, \infty]$ vanishing only at $0$ such that, for all $x, y \in C$, $u \in T(x)$, and $v \in T(y)$,

$\langle x - y, \, u - v \rangle \geq \phi(\|x - y\|).$

The set of fixed points of an operator $T$ is denoted by $\mathrm{Fix}(T)$.

18.3 Proximal Point Algorithm

18.4 Splitting Algorithms

In many cases, it is difficult to evaluate the resolvent of the operator $T$; interestingly, doing so can be as difficult as the original problem itself. To overcome this, the operator $T$ is split into the sum of two maximal monotone operators $A$ and $B$ whose resolvents are easier to compute than the resolvent of $A + B$. Based on splitting techniques, many iterative methods have been designed to solve Problem 18.1. Some popular methods are the Peaceman–Rachford splitting algorithm [15], the Douglas–Rachford splitting algorithm [16], and the forward–backward algorithm [17].

18.4.1 Douglas–Rachford Splitting Algorithm

The Douglas–Rachford splitting algorithm solves a general class of monotone inclusion problems in which $A$ and $B$ are maximally monotone. Let $x_0 \in \mathcal{H}$ and let $(\lambda_n)$ be a sequence in $[0, 2]$. Then the algorithm is given as follows:
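For concreteness, the following Python sketch shows one common form of the relaxed Douglas–Rachford iteration, with the resolvents $J_{\gamma A}$ and $J_{\gamma B}$ (for a fixed $\gamma > 0$) supplied as functions; the precise scheme and parameter conditions of the chapter's algorithm may differ in detail.

```python
def douglas_rachford(JA, JB, x0, lam_seq, steps=200):
    """One common form of the relaxed Douglas-Rachford iteration:
        b_n = JB(x_n)
        a_n = JA(2*b_n - x_n)
        x_{n+1} = x_n + lam_n * (a_n - b_n),  with lam_n in [0, 2].
    Under standard conditions, the shadow sequence JB(x_n) converges
    (weakly) to a zero of A + B."""
    x = x0
    for n in range(steps):
        b = JB(x)
        a = JA(2 * b - x)
        x = x + lam_seq(n) * (a - b)
    return JB(x)
```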

The convergence behavior of the Douglas–Rachford splitting algorithm can be summarized as follows:

18.4.2 Forward–Backward Algorithm

The splitting method discussed above is somewhat involved; the simplest splitting scheme is the forward–backward method. The forward–backward algorithm, the composition of a forward step with respect to $B$ followed by a backward step with respect to $A$, is given by

(18.6) $x_{n+1} = J_{\lambda_n}^{A}\big(x_n - \lambda_n B(x_n)\big).$

The forward–backward splitting method has been studied extensively in [18–20]. For general monotone operators $A$ and $B$, the weak convergence of Algorithm 18.6 has been studied under some restrictions on the step size.
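A minimal Python sketch of this pattern follows, assuming a constant step size and a $\beta$-cocoercive $B$ (the usual restriction being $\lambda \in (0, 2\beta)$).

```python
import numpy as np

def forward_backward(JA, B, x0, lam=0.1, steps=300):
    """Forward-backward iteration x_{n+1} = JA(x_n - lam * B(x_n)):
    a forward (explicit) step on B followed by a backward (resolvent)
    step on A."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = JA(x - lam * B(x))
    return x

# Example resolvent: for A = subdifferential of mu*||.||_1, JA is the
# soft-thresholding map, and the scheme recovers the classical ISTA.
soft = lambda z, t: np.sign(z) * np.maximum(np.abs(z) - t, 0.0)
```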

Passty [17] studied the convergence of the forward–backward method, which can be summarized as follows.

18.5 Inertial Methods

The classical proximal point algorithm converts the maximal monotone operator inclusion problem into a fixed point problem of a firmly nonexpansive mapping via resolvent operators. Suppose $f$ is a differentiable convex function. The proximal point algorithm can be interpreted as a one-step discretization method for the ordinary differential equation

(18.8) $\dot{x}(t) + \nabla f(x(t)) = 0,$

where $\dot{x}$ denotes the derivative of $x$ and $\nabla f$ is the gradient of $f$. The iterative methods mentioned above are one-step methods, i.e. each new iterate depends only on its immediate predecessor. Multistep methods have been proposed to accelerate the convergence speed of the algorithm; they arise as discretizations of the second-order ordinary differential equation

(18.9) $\ddot{x}(t) + \gamma \dot{x}(t) + \nabla f(x(t)) = 0,$

where $\gamma > 0$. System 18.9 roughly describes the motion of a heavy ball rolling under its own inertia over the graph of $f$ until friction stops it at a stationary point of $f$. The inertial force, the friction force, and the gravity force are the three forces at play in System 18.9, which is why it is called the heavy-ball with friction (HBF) system. The energy function is given by

(18.10) $E(t) = \frac{1}{2}\|\dot{x}(t)\|^2 + f(x(t)),$

which is always decreasing along trajectories, since $\frac{d}{dt}E(t) = \langle \dot{x}(t), \ddot{x}(t) + \nabla f(x(t)) \rangle = -\gamma \|\dot{x}(t)\|^2 \leq 0$, with equality only where $\dot{x}$ vanishes. Convexity of $f$ ensures that the trajectory attains a minimum point. The solution trajectories of System 18.9 converge faster than those of System 18.8. Polyak [21] first introduced a multistep method using the heavy-ball property of System 18.9 to minimize a convex smooth function $f$. Polyak's multistep algorithm is given by

$x_{n+1} = x_n - \lambda_n \nabla f(x_n) + \alpha_n (x_n - x_{n-1}),$

where $\alpha_n$ is a momentum parameter and $\lambda_n$ is a step-size parameter; the momentum term significantly improves the performance of the scheme. Nesterov [22] improved on Polyak's idea by evaluating the gradient at the inertial (extrapolated) point rather than at the previous iterate. The extrapolation parameter $\alpha_n$ is chosen to obtain the optimal convergence rate of the scheme. The scheme is given by

$y_n = x_n + \alpha_n (x_n - x_{n-1}), \qquad x_{n+1} = y_n - \lambda_n \nabla f(y_n),$

where $\alpha_n = \frac{t_n - 1}{t_{n+1}}$, with $t_1 = 1$ and $t_{n+1} = \frac{1 + \sqrt{1 + 4 t_n^2}}{2}$.
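The two schemes can be sketched in Python as follows; the constant step size and the starting value $t_1 = 1$ are illustrative assumptions.

```python
import numpy as np

def heavy_ball(grad_f, x0, lam=0.05, alpha=0.9, steps=300):
    """Polyak's heavy-ball (multistep) iteration:
    x_{n+1} = x_n - lam * grad_f(x_n) + alpha * (x_n - x_{n-1})."""
    x_prev = x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x_next = x - lam * grad_f(x) + alpha * (x - x_prev)
        x_prev, x = x, x_next
    return x

def nesterov(grad_f, x0, lam=0.05, steps=300):
    """Nesterov's scheme: the gradient is evaluated at the extrapolated
    point y_n = x_n + alpha_n*(x_n - x_{n-1}), with the classical t_n rule."""
    x_prev = x = np.asarray(x0, dtype=float)
    t = 1.0
    for _ in range(steps):
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x + ((t - 1.0) / t_next) * (x - x_prev)  # inertial point
        x_prev, x, t = x, y - lam * grad_f(y), t_next
    return x
```

Note the design difference: heavy-ball adds momentum after the gradient step at $x_n$, whereas Nesterov's scheme takes the gradient step from the already-extrapolated point $y_n$.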

18.5.1 Inertial Proximal Point Algorithm

In 2001, by combining the ideas of the heavy-ball method and the proximal point method, Alvarez and Attouch [13] proposed the inertial proximal point algorithm, which can be written as

$x_{n+1} = J_{\lambda_n}^{T}\big(x_n + \alpha_n (x_n - x_{n-1})\big),$

where $T$ is a set-valued maximal monotone operator and $(\alpha_n)$ and $(\lambda_n)$ are sequences satisfying some conditions.
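A minimal Python sketch of this iteration, with the resolvent and the inertial sequence supplied by the caller:

```python
def inertial_proximal_point(J_lam, x0, x1, alpha_seq, steps=200):
    """Alvarez-Attouch inertial proximal point iteration:
        x_{n+1} = J_lam( x_n + alpha_n * (x_n - x_{n-1}) ),
    where J_lam is the resolvent of T and alpha_seq(n) returns the
    inertial parameter (typically alpha_n in [0, 1))."""
    x_prev, x = x0, x1
    for n in range(steps):
        y = x + alpha_seq(n) * (x - x_prev)
        x_prev, x = x, J_lam(y)
    return x
```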

18.5.2 Splitting Inertial Proximal Point Algorithm

In 2003, Moudafi and Oliny [23] modified the inertial proximal point algorithm by splitting $T$ into maximal monotone operators $A$ and $B$, where $B$ is a single-valued, Lipschitz continuous, and cocoercive operator. The proposed algorithm is given as follows:
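A hedged Python sketch of this scheme, assuming constant parameters and the common form in which the forward step on $B$ is evaluated at $x_n$ while the inertial extrapolation enters only the backward step:

```python
def splitting_inertial_proximal(JA, B, x0, x1, lam=0.1, alpha=0.3, steps=300):
    """Sketch of a splitting inertial proximal step in the spirit of [23]:
        x_{n+1} = JA( x_n + alpha*(x_n - x_{n-1}) - lam * B(x_n) ),
    where JA is the resolvent of A; the parameter conditions from [23]
    are omitted here."""
    x_prev, x = x0, x1
    for _ in range(steps):
        x_prev, x = x, JA(x + alpha * (x - x_prev) - lam * B(x))
    return x
```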

18.5.3 Inertial Douglas–Rachford Splitting Algorithm

This section is dedicated to the formulation of an inertial Douglas–Rachford splitting algorithm, which approaches the set of zeros of the sum of two maximally monotone operators, and to the investigation of its convergence properties.
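As a sketch, the following Python code implements an inertial Douglas–Rachford step in the spirit of Bot et al. [26]; the exact relaxation rule and parameter conditions of the chapter's scheme are omitted, and the constant $\alpha$ and $\lambda$ are illustrative.

```python
def inertial_douglas_rachford(JA, JB, x0, x1, alpha=0.2, lam=1.0, steps=300):
    """Hedged sketch of an inertial Douglas-Rachford iteration:
        y_n = x_n + alpha*(x_n - x_{n-1})
        b_n = JB(y_n)
        a_n = JA(2*b_n - y_n)
        x_{n+1} = y_n + lam*(a_n - b_n).
    JA, JB are resolvents of A and B for a fixed gamma > 0."""
    x_prev, x = x0, x1
    for _ in range(steps):
        y = x + alpha * (x - x_prev)
        b = JB(y)
        a = JA(2 * b - y)
        x_prev, x = x, y + lam * (a - b)
    return JB(x)
```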

18.5.4 Pock and Lorenz's Variable Metric Forward–Backward Algorithm

Recently, inspired by Nesterov's accelerated gradient method, Lorenz and Pock [24] proposed a modification of forward–backward splitting to solve monotone inclusion problems given by the sum of a monotone operator whose resolvent is easy to compute and a cocoercive monotone operator. They introduced a symmetric, positive definite map $M$, which serves as a preconditioner or variable metric. The algorithm is given as follows:

where $\alpha_n$ is an extrapolation factor and $\lambda_n$ is a step-size parameter.
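The following Python sketch illustrates the idea for the simplified case of a diagonal metric $M = \mathrm{diag}(d)$ and constant parameters; this is an assumed simplification for illustration, not the exact scheme of [24].

```python
import numpy as np

def variable_metric_fb(d, prox_metric, B, x0, x1, alpha=0.3, lam=1.0, steps=300):
    """Hedged sketch of a preconditioned inertial forward-backward step:
        y_n = x_n + alpha*(x_n - x_{n-1})
        x_{n+1} = (I + lam*M^{-1}A)^{-1}( y_n - lam * M^{-1} B(y_n) ),
    with M = diag(d). prox_metric(z, tau_vec) must evaluate the resolvent
    of A in the M-metric; e.g. for A = subdifferential of mu*||.||_1 it is
    elementwise soft-thresholding with per-coordinate thresholds mu*lam/d_i."""
    x_prev, x = np.asarray(x0, float), np.asarray(x1, float)
    for _ in range(steps):
        y = x + alpha * (x - x_prev)
        # forward step in the M-metric: M^{-1} B(y) = B(y)/d elementwise
        x_prev, x = x, prox_metric(y - lam * B(y) / d, lam / d)
    return x
```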

Recently, some inertial iterative algorithms have been proposed that replace condition (iii) of Theorem 18.5 with milder conditions, which makes the algorithms easier to use for real-world problems. In 2015, Bot et al. [26] proposed the inertial Mann algorithm for nonexpansive operators and studied its weak convergence in the real Hilbert space framework. The inertial Mann algorithm, for some initial points $x_0, x_1$, reads:

(18.27) $y_n = x_n + \alpha_n (x_n - x_{n-1}), \qquad x_{n+1} = (1 - \lambda_n)\, y_n + \lambda_n T(y_n),$

where $(\alpha_n)$ and $(\lambda_n)$ are suitable real sequences, $T : M \to M$ is nonexpansive, and $M$ is a nonempty, closed, and affine subset of $\mathcal{H}$.
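A direct Python transcription of iteration (18.27):

```python
def inertial_mann(T, x0, x1, alpha_seq, lam_seq, steps=500):
    """Inertial Mann iteration (18.27) for a nonexpansive mapping T:
        y_n = x_n + alpha_n*(x_n - x_{n-1})
        x_{n+1} = (1 - lam_n)*y_n + lam_n*T(y_n)."""
    x_prev, x = x0, x1
    for n in range(steps):
        y = x + alpha_seq(n) * (x - x_prev)
        x_prev, x = x, (1 - lam_seq(n)) * y + lam_seq(n) * T(y)
    return x
```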

In 2018, Dong et al. [27] proposed a generalized form of the inertial Mann algorithm by including an extra inertial term. The algorithm is given as follows:

where the parameter sequences satisfy suitable conditions.
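A hedged Python sketch of Algorithm 18.43, assuming the two-extrapolation form with sequences $(a_n)$, $(b_n)$, and $(\lambda_n)$ (the names and the exact form are illustrative):

```python
def general_inertial_mann(T, x0, x1, a_seq, b_seq, lam_seq, steps=500):
    """Assumed sketch of the general inertial Mann iteration of [27]:
        y_n = x_n + a_n*(x_n - x_{n-1})
        z_n = x_n + b_n*(x_n - x_{n-1})
        x_{n+1} = (1 - lam_n)*y_n + lam_n*T(z_n),
    i.e. two (possibly different) inertial extrapolations are used."""
    x_prev, x = x0, x1
    for n in range(steps):
        d = x - x_prev
        y = x + a_seq(n) * d
        z = x + b_seq(n) * d
        x_prev, x = x, (1 - lam_seq(n)) * y + lam_seq(n) * T(z)
    return x
```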

The convergence analysis can be summarized as follows:

In 2019, Dixit et al. [28] proposed the inertial normal algorithm and studied its weak convergence. The algorithm is given as follows:

18.5.5 Numerical Example

In order to compare the convergence speed of Algorithms 18.27, 18.43, and 18.44 in a real Hilbert space with the Euclidean norm, we consider the nonexpansive mapping

equation

To show that $T$ is a nonexpansive mapping, we have, for all $x$ and $y$,

(18.45)equation

This implies that $\|T(x) - T(y)\| \leq \|x - y\|$; thus $T$ is nonexpansive. For comparison, we choose the sequences $(\alpha_n)$, $(\beta_n)$, and $(\lambda_n)$ accordingly. The convergence behaviors of the algorithms are shown in Figure 18.1.

Figure 18.1 Semilog graph between the number of iterations and the logarithm of the norm of the iterate values.

Figure 18.1 shows that Algorithms 18.27 and 18.43 have nearly the same convergence speed, which is greater than that of the Mann algorithm, while the convergence speed of Algorithm 18.44 is the highest. This figure shows the usefulness of Algorithm 18.44 for computing the zeros of a nonexpansive mapping.

18.6 Numerical Experiments

To further compare the convergence speed and accuracy of the iterative Algorithms 18.27, 18.43, and 18.44 on real-world problems, we conducted numerical experiments for a regression problem on publicly available high-dimensional datasets. Consider the convex minimization problem given by

(18.46) $\min_{x} \; \frac{1}{2}\|Ax - b\|_2^2 + \lambda \|x\|_1,$

where $A : \mathbb{R}^n \to \mathbb{R}^m$ is a linear map, $b \in \mathbb{R}^m$, and $\lambda > 0$ is the sparsity-controlling parameter. According to the Karush–Kuhn–Tucker (KKT) conditions, a point $x^*$ solves (18.46) if and only if

(18.47) $0 \in A^{T}(Ax^* - b) + \lambda\, \partial \|x^*\|_1.$

For any $\gamma > 0$, Eq. (18.47) becomes

(18.48) $x^* = \operatorname{prox}_{\gamma \lambda \|\cdot\|_1}\big(x^* - \gamma A^{T}(Ax^* - b)\big).$

Thus, $x^*$ minimizes (18.46) if and only if $x^*$ is a fixed point of the operator $T = \operatorname{prox}_{\gamma \lambda \|\cdot\|_1} \circ \big(I - \gamma A^{T}(A \cdot {} - b)\big)$.

Note that the operator $T$ is nonexpansive: the $\ell_1$-norm is a proper lower semicontinuous convex function, so its proximal map is (firmly) nonexpansive, and the forward step $I - \gamma A^{T}(A \cdot {} - b)$ is nonexpansive for a suitable choice of $\gamma$. Thus, we can apply Algorithms 18.27, 18.43, and 18.44 to solve the convex minimization problem (18.46).
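The following Python sketch builds this fixed-point operator and runs a plain Picard/Mann-type iteration on synthetic data; the data, $\lambda$, and $\gamma$ are illustrative stand-ins for the experimental setup described below.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal map of t*||.||_1 (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def make_lasso_operator(A, b, lam, gamma):
    """Fixed-point operator T of (18.48):
    T(x) = prox_{gamma*lam*||.||_1}( x - gamma * A^T (A x - b) ).
    Its fixed points are exactly the minimizers of (18.46);
    gamma below 2/||A||^2 keeps the forward step nonexpansive."""
    def T(x):
        return soft_threshold(x - gamma * A.T @ (A @ x - b), gamma * lam)
    return T

# Usage sketch with synthetic data (the actual experiments use the
# Colon-cancer and Carcinom datasets):
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
b = rng.standard_normal(20)
gamma = 1.0 / np.linalg.norm(A, 2) ** 2   # spectral-norm-based step size
T = make_lasso_operator(A, b, lam=0.1, gamma=gamma)
x = np.zeros(50)
for _ in range(300):   # plain iteration on T; inertial variants plug in here
    x = T(x)
```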

For the numerical experiments, we consider the data matrix $A \in \mathbb{R}^{m \times n}$ with $n$ features and $m$ samples, where each sample $a_i$ is an $n$-dimensional vector and $b \in \mathbb{R}^m$ contains the $m$ responses. We have taken datasets from two types of cancer patients: the Colon-cancer dataset and the Carcinom dataset.

(i) Colon-cancer dataset: Colon cancer is a cancer of the large intestine (colon). This dataset was collected from 62 patients using 2000 gene expressions. It contains 40 biopsies from tumors and 22 normal biopsies from healthy parts of the large intestine of the patients.

(ii) Carcinom dataset: Carcinoma occurs because of the uncontrolled growth of cells. This dataset contains 174 samples with 9182 features.

For the numerical experiments, we selected the parameter sequences so that each algorithm satisfies its respective convergence criteria. The sparsity-controlling parameter $\lambda$ is tuned over a prescribed range in multiples of 0.1. The maximum number of iterations is set to 1000, and the procedure is stopped as soon as the difference between consecutive iterates is less than 0.001.

In the first experiment, we compare the convergence speed of Algorithms 18.27, 18.43, and 18.44. We plotted, for each algorithm, the number of iterations against the corresponding objective function value.

Figure 18.2 Semilog graphs of the number of iterations versus the corresponding objective function value for different datasets. (a) Colon and (b) Carcinom.

From Figure 18.2, we observe that, for both datasets, the convergence speed of Algorithm 18.44 is the highest, while the convergence speeds of Algorithms 18.27 and 18.43 are nearly the same.

In the second experiment, we compare Algorithms 18.27, 18.43, and 18.44 on the basis of their accuracy on the Colon and Carcinom datasets. We plotted the number of iterations against the corresponding root mean square error (RMSE).

Figure 18.3 Semilog graphs of the number of iterations versus the corresponding root mean square error. (a) Colon and (b) Carcinom.

From Figure 18.3, we observe that the RMSE value for Algorithm 18.44 is the least, while the RMSE values for Algorithms 18.27 and 18.43 are nearly the same.

Thus, from the observations in Figures 18.2 and 18.3, we conclude that Algorithm 18.44 outperforms Algorithms 18.27 and 18.43.

References

  1. Byrne, C. (2003). A unified treatment of some iterative algorithms in signal processing and image reconstruction. Inverse Problems 20 (1): 103.
  2. Combettes, P.L. and Wajs, V.R. (2005). Signal recovery by proximal forward-backward splitting. Multiscale Modeling and Simulation 4 (4): 1168–1200.
  3. Daubechies, I., Defrise, M., and De Mol, C. (2004). An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Communications on Pure and Applied Mathematics 57 (11): 1413–1457.
  4. Bauschke, H.H. and Borwein, J.M. (1996). On projection algorithms for solving convex feasibility problems. SIAM Review 38 (3): 367–426.
  5. Koh, K., Kim, S.-J., and Boyd, S. (2007). An interior-point method for large-scale l1-regularized logistic regression. Journal of Machine Learning Research 8: 1519–1555.
  6. Minty, G.J. (1962). Monotone (nonlinear) operators in Hilbert space. Duke Mathematical Journal 29 (3): 341–346.
  7. Martinet, B. (1970). Régularisation d'inéquations variationnelles par approximations successives. ESAIM: Modélisation Mathématique et Analyse Numérique 4 (R3): 154–158.
  8. Eckstein, J. and Bertsekas, D.P. (1992). On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Mathematical Programming 55 (1–3): 293–318.
  9. Güler, O. (1992). New proximal point algorithms for convex minimization. SIAM Journal on Optimization 2 (4): 649–664.
  10. Rockafellar, R.T. (1976). Monotone operators and the proximal point algorithm. SIAM Journal on Control and Optimization 14 (5): 877–898.
  11. Moreau, J.-J. (1965). Proximité et dualité dans un espace Hilbertien. Bulletin de la Société Mathématique de France 93: 273–299.
  12. Bauschke, H.H. and Combettes, P.L. (2011). Convex Analysis and Monotone Operator Theory in Hilbert Spaces. New York: Springer.
  13. Alvarez, F. and Attouch, H. (2001). An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping. Set-Valued Analysis 9 (1–2): 3–11.
  14. Alvarez, F. (2004). Weak convergence of a relaxed and inertial hybrid projection-proximal point algorithm for maximal monotone operators in Hilbert space. SIAM Journal on Optimization 14 (3): 773–782.
  15. Peaceman, D.W. and Rachford, H.H. Jr. (1955). The numerical solution of parabolic and elliptic differential equations. Journal of the Society for Industrial and Applied Mathematics 3 (1): 28–41.
  16. Douglas, J. and Rachford, H.H. (1956). On the numerical solution of heat conduction problems in two and three space variables. Transactions of the American Mathematical Society 82 (2): 421–439.
  17. Passty, G.B. (1979). Ergodic convergence to a zero of the sum of monotone operators in Hilbert space. Journal of Mathematical Analysis and Applications 72 (2): 383–390.
  18. Combettes, P.L. and Wajs, V.R. (2005). Signal recovery by proximal forward-backward splitting. Multiscale Modeling and Simulation 4 (4): 1168–1200.
  19. Lions, P.-L. and Mercier, B. (1979). Splitting algorithms for the sum of two nonlinear operators. SIAM Journal on Numerical Analysis 16 (6): 964–979.
  20. Bauschke, H.H., Matoušková, E., and Reich, S. (2004). Projection and proximal point methods: convergence results and counterexamples. Nonlinear Analysis: Theory, Methods & Applications 56 (5): 715–738.
  21. Polyak, B.T. (1964). Some methods of speeding up the convergence of iteration methods. USSR Computational Mathematics and Mathematical Physics 4 (5): 1–17.
  22. Nesterov, Y.E. (1983). A method for solving the convex programming problem with convergence rate $O(1/k^2)$. Doklady Akademii Nauk SSSR 269: 543–547.
  23. Moudafi, A. and Oliny, M. (2003). Convergence of a splitting inertial proximal method for monotone operators. Journal of Computational and Applied Mathematics 155 (2): 447–454.
  24. Lorenz, D.A. and Pock, T. (2015). An inertial forward-backward algorithm for monotone inclusions. Journal of Mathematical Imaging and Vision 51 (2): 311–325.
  25. Alvarez, F. and Attouch, H. (2001). An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping. Set-Valued Analysis 9 (1–2): 3–11.
  26. Bot, R.I., Csetnek, E.R., and Hendrich, C. (2015). Inertial Douglas–Rachford splitting for monotone inclusion problems. Applied Mathematics and Computation 256: 472–487.
  27. Dong, Q.-L., Cho, Y.J., and Rassias, T.M. (2018). General inertial Mann algorithms and their convergence analysis for nonexpansive mappings. In: Applications of Nonlinear Analysis (ed. T.M. Rassias), 175–191. Cham: Springer.
  28. Dixit, A., Sahu, D.R., Singh, A.K., and Som, T. (2019). Application of a new accelerated algorithm to regression problems. Soft Computing 24 (2): 1–14.