D. Guiberta; D. Tromeur-Dervouta,b,* a ICJ UMR5208 Université Lyon 1 - CNRS, Centre de Développement du Calcul Scientifique Parallèle Université Lyon 1
b IRISA/SAGE INRIA Rennes
Time domain decomposition methods can bo a complementary approach to improve parallelism on the forthcoming parallel architectures with thousand of processors. Indeed. for a given modelled problem. e.g. through a partial differential equation with a prescribed discretization- the multiplication of the processors leads to a decrease of the computing load allocated to each processors, then the ratio of the computing time with the communication time decreases, leading to less efficiency. Nevertheless, for the evolution problems, the difficulty comes from the sequential nature of the computing. Previous time steps are required to compute the current time step. This last constraint occurs in all time integrators used in computational fluid dynamics of unsteady problems.
During these last years, some attempts have been proposed to develop parallelism in time solvers. Such algorithms as Parareal [6] or Pita [2] algorithms are multiple shooting type methods and are very sensitive to the Jabobian linearization for the correction stage on stiff non-linear problems as shown in [4].
Moreover, this correction step is an algorithm sequential and often much more time consuming to solve for the time integrator than the initial problem.
In this paper we investigate a new solution to obtain a parallelism in time with a two level algorithm that combines pipelined iterations in time and deferred correction iterations to increase the accuracy of the solution. In order to focus on the benefit, of our algorithm we approach CFD problems as an ODE (eventually DAE) problem. From the numerical point of view the development of time domain decomposition methods for stiff ODE problems allows to focus on the time domain decomposition without perturbation coming from the space decomposition. In other side, classical advantages of the space decomposition on the granularity are not available. Indeed, from the engineering point of view in the design with ODE/DAE. the efficiency of the algorithm is more focus on the saving time in the day to day practice.
The traditional approach to solve CFD problems starts from the partial differential equation, which is discretized in space with a given accuracy and finally a time discretization with a first order scheme or a second order time scheme is performed to solve problems (usually implicitly due to Courant-Friedrich-Lax condition constraint).
Let consider the 2D Navier-Stokes equations in a Stream function - vorticity formulation:
then applying a backward (forward Euler or Runge-Kutta are also possible) time discretisation. we obtain that follows:
The discretization in space of the discrete in time equations leads to solve a linearized system with an efficient preconditionned solver such as Schwarz-Krylov method or multi-grid:
Let us consider an another approach that is the method of lines or ODE to solve CFD problems. Firstly, the Navier-Stokes problem is written in a Cauchy equation form:
then the discretization in space of the f(t, ω, ψ) leads to a system of ODE:
This last equation can be solved by an available ODE solver such as SUNDIALS (SUite of Nonlinear and Differential/ALgebraic equation Solvers (http://www.llnl.gov/casc/sundials/)) that contains the latest developments of classical ODE solvers as LSODA [8] or CVODE [1]
Figure 1 corresponds to the lid driven cavity results computed by the ODE method a Reynold number of 5000 and up to 10000. These results are obtained for the impulse lid driven with a velocity of 1. Let us notice that the convergence is sensitive to the singularities of the solution.
The advantages and drawbacks of the PDEs and ODE are summarized in Table 1
Table 1
Advantages and drawbacks of ODE and PDEs approaches for CFD problems
PDEs | ODE | |
1 | Fast to solvo one time step | slow to solve one time step (control of time step) |
2 | Decoupling of variables | coupling of variables (closest to incompressibility constraint) |
3 | computing complexity is reduced with the decoupling | high computing complexity () control of error, whole system |
4 | B.C. hard to handle with impact on the linear solver (data structure/choice) | B.C. conditions easy to implement |
5 | CFL condition even with implicit solve (decoupling of variables) | CFL condition replaced by the control on time step |
6 | Building of good preconditionner is difficult | Adaptive time step allows to pass stiffness |
7 | Adaptive time step hard to handle | Adaptive time step is a basic feature in ODE |
In conclusion the ODE approach put the computing pressure on the time step control while PDEs approach put the computing pressure on the preconditionned linear/nonlinear solver.
The high time consumming drawback for ODE explains why this approach is not commonly use in CFD. Nevertheless grid computing architecture could give a renewed interest to this ODE approach.
The spectral deferred correction method (SDC) [7] is a method to improve the accuracy of a time integrator iteratively in order to approximate solution of a functional equation with adding corrections to the defect.
Let us summarize the SDC method for the Cauchy equation:
Let an approximation of the searched solution at time tm. Then consider the polynomial q0 that interpolate the function f(t,y) at P points i.e:
with are points around tm. Then if we define the error between the approximate solution and the approximate solution using q0:
Then we can write the error between the exact solution and the approximation :
The defect , satisfies:
For the spectral deferred correction method an approximation of δ(tm) is computed considering for example a Forward Euler scheme:
Once the defect computed, the approximation is updated
Then a new q0 is computed with using the value for a new iterate. This method is a sequential iterative process as follows: compute an approximation y0, then iterate in parallel until convergence compute δ and update the solution y by adding the correction term δ.
We propose in the next section a parallel implementation to combine deferred correction method with time domain decomposition.
The main idea to introduce parallelism in the SDC is to consider the time domain decomposition and to affect a set of time slices to each processors. Then we can pipeline the computation, each processor dealing a different iterate of the SDC.
Table 2 gives the times to run the timo domain decomposition pipelined SDC method on an ALTIX350 with IA64 1.5Ghz/6Mo processors and with numalink communication network. The lid driven parameters are a Reynolds number of 10 and a regular mesh of 20 × 20 points, the final time to be reach is T=5. The number of time slices has been set equal to 805. The number of SDC iteration is 25. The stage [1] correspond to the sequential first evaluation of the solution on the time slices (with an rtol of 10−1). The time [2.2] is the time of the pipelined to reach the last processor, results show that the ratio of this time [2.2] on the computation is increasing up to 1/4 for 8 processors case. The Speed-up corrected line gives the parallel capability of the parallel implementation when the pipeline is full.
Table 2
Time, Speed-up and Efficiency of the Time domain decomposition pipelined SDC
#proc | 1 | 2 | 4 | 8 |
[1] (s) | 17.12 | 18.68 | 17.36 | 16.41 |
[2.2] (s) | 1.35e-5 | 72.39 | 113 | 129 |
[2.3] f(t,y)eval (s) | 1.66e-2 | 7.25e-3 | 3.13e-3 | 1.126e-3 |
[2.3] q0 computing (s) | 5.99e-3 | 5.5e-3 | 5.90e-3 | 5.94e-3 |
euler | 0.288 | 0.25 | 0.19 | 0.16 |
Total (s) | 2062 | 1113 | 742 | 423 |
Efficiency | 100% | 92.6% | 69.4% | 60.9% |
Speed-up | 1 | 1.85 | 2.77 | 4.87 |
Speed up Corrected | 1 | 1.98 | 3.27 | 7.01 |
It exhibits quite reasonable speed-up up to 8 processors for the present implementation.
Figure 2 gives the convergence of the defect on the time interval for different iteration numbers (1.5.9,13,17). Symbols represent the runs on 1 processor (x). 2 processors (o). 8 processors (). Firstly, we remark that the number of processors does not impact the convergence properties of the method. Secondly, the convergence behavior increases for the last times due to the steady state of the solution. Nevertheless, a good accuracy is obtained on the first times in few SDC iterates.
In this work we have investigated the numerical solution of 2D Navier-stokes with ODE solvers in the perspective of the time domain decomposition which can be an issue for the grid computing.
we proposed to combine time domain decomposition and spectral deferred correction in order to have a two level algorithm that can be pipelined. The first results proved some parallel efficiency with a such approach.
Nevertheless, the arithmetical complexity of the ODE solver for the lid driven problem consider here, is too much costly. Some scheme such as the C(p, q, j) [3] schemes can decouple the system variables in order to reduce the complexity.
* This work was funded by the thema “mathématiques” of the Région Rhône-Alpes thru the project: “Développement de méthodologies mathématiques pour le calcul scientifique sur grille”
3.135.190.113