Solvers and Scalability
45 minutes · intermediate
Lesson Objectives
Learn about MFEM's support for efficient solvers and preconditioners.
Note
MFEM is designed to be highly scalable and efficient on a wide variety of platforms: from laptops to GPU-accelerated supercomputers. The solvers described in this lesson play a critical role in this parallel scalability.
Scalable algebraic multigrid preconditioners from hypre
MFEM comes with a large number of example codes that demonstrate different physical applications, finite element discretizations, and linear solvers:
- Example 1 solves a Poisson problem,
- Example 2 solves a linear elasticity problem,
- Example 3 solves a definite Maxwell (electromagnetics) problem, and
- Example 4 solves a grad-div diffusion problem.
The parallel versions of these examples (`ex1p`, `ex2p`, `ex3p`, and `ex4p`) each use suitable algebraic multigrid (AMG) preconditioners from the hypre solvers library. We describe sample runs with each of these examples in more detail below.
Example 1: Poisson problem and AMG

First, make sure you are in the examples subdirectory:
cd ~/mfem/examples

Build the parallel version of Example 1:
make ex1p

Run the parallel version of Example 1, solving a Poisson problem:
./ex1p

After forming the linear system, MFEM uses hypre to construct and apply an AMG preconditioner. Details of the AMG preconditioner are provided in the example output under the headers `BoomerAMG SETUP PARAMETERS` and `BoomerAMG SOLVER PARAMETERS`.
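In the code, this corresponds to constructing a BoomerAMG preconditioner and handing it to a CG solver. A condensed sketch of that setup in the MFEM API is shown below (paraphrased rather than quoted verbatim; in ex1p.cpp, `A`, `B`, and `X` come from the earlier `FormLinearSystem` call):

```cpp
// AMG-preconditioned CG solve, as in ex1p.cpp (simplified sketch).
Solver *prec = new HypreBoomerAMG;   // hypre's BoomerAMG preconditioner
CGSolver cg(MPI_COMM_WORLD);
cg.SetRelTol(1e-12);
cg.SetMaxIter(2000);
cg.SetPrintLevel(1);
cg.SetPreconditioner(*prec);
cg.SetOperator(*A);                  // the operator is also forwarded to the preconditioner
cg.Mult(B, X);
delete prec;
```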

A key feature of AMG methods is their scalability: with default options, convergence is achieved in only 18 conjugate gradient iterations.

Let's see what happens if we increase the mesh refinement. Edit `ex1p.cpp`, changing line 153 as follows:

```diff
@@ -150,7 +150,7 @@ int main(int argc, char *argv[])
    ParMesh pmesh(MPI_COMM_WORLD, mesh);
    mesh.Clear();
    {
-      int par_ref_levels = 2;
+      int par_ref_levels = 3;
       for (int l = 0; l < par_ref_levels; l++)
       {
          pmesh.UniformRefinement();
```

This adds one additional level of refinement, making the problem roughly 4 times as large in 2D, or 8 times as large in 3D.

Rebuild the example (`make ex1p`) and rerun it: `./ex1p`

Although the number of unknowns for this problem has increased by roughly 4x, the iteration count remains at 18 due to the scalability of the AMG preconditioner.

Let's now try a 3D problem. For that, we just need to choose a 3D mesh using the `-m` (or `--mesh`) command line argument.
Because these problems are more computationally expensive, let's first reduce the refinement level, setting `int par_ref_levels = 1;` in the `ex1p.cpp` source code.
Rebuild the example (`make ex1p`) and rerun it using the three-dimensional Fichera mesh: `./ex1p -m ../data/fichera.mesh`. Convergence is attained in only 16 iterations.

Finally, let's take a look at the parallel scalability of the solvers:

- Increase the refinement level: `int par_ref_levels = 2;`
- Recompile: `make ex1p`
- Now run the 3D example on 8 cores: `mpirun -np 8 ./ex1p -m ../data/fichera.mesh`
- This is an example of a weak scaling test: the problem size and the number of processors are both increased by a factor of 8.
- Because the PCG iteration counts remain roughly constant, the total time to solution should remain roughly fixed (minus some overhead and communication cost), even though we are solving a problem that is 8 times larger.
Example 2: Linear Elasticity

This example demonstrates solving a linear elasticity cantilever beam problem with different materials.

This example is designed to work with any of the "beam" meshes provided by MFEM. Run `ls ../data | grep beam` to list the available 2D and 3D meshes: `beam-hex-nurbs.mesh`, `beam-hex.mesh`, `beam-hex.vtk`, `beam-quad-amr.mesh`, `beam-quad-nurbs.mesh`, `beam-quad.mesh`, `beam-quad.vtk`, `beam-tet.mesh`, `beam-tet.vtk`, `beam-tri.mesh`, `beam-tri.vtk`, `beam-wedge.mesh`, and `beam-wedge.vtk`.
The elements and boundaries of these meshes are assigned attributes/materials suitable for the cantilever problem:

```text
             +----------+----------+
boundary --->| material | material |<--- boundary
attribute 1  |    1     |    2     |     attribute 2
(fixed)      +----------+----------+     (pull down)
```

- Build the example with `make ex2p`.
- Try running `./ex2p` in the terminal to run a 2D elasticity problem.
- As in Example 1, the linear system is solved using AMG.

For this example, two types of AMG solvers can be used:

- A special version of AMG designed specifically for elasticity (see this paper).
- AMG for systems.

To enable the special elasticity AMG, add the flag `-elast` to the command line; otherwise, AMG for systems will be used. For example: `./ex2p -elast`.
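A rough sketch of how this choice maps onto the hypre interface is shown below (variable names such as `amg_elast`, `A`, `fespace`, and `dim` follow the surrounding example and are illustrative; see ex2p.cpp for the exact code):

```cpp
// Select the AMG variant according to the -elast command line flag.
HypreBoomerAMG *amg = new HypreBoomerAMG(A);
if (amg_elast)
{
   amg->SetElasticityOptions(&fespace);   // AMG specialized for elasticity
}
else
{
   amg->SetSystemsOptions(dim);           // standard AMG for systems of PDEs
}
HyprePCG pcg(A);
pcg.SetPreconditioner(*amg);
pcg.Mult(B, X);
delete amg;
```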
The polynomial degree (order) can be changed with the `--order` command line argument (`-o` for short). For example: `./ex2p -o 2`. By default, low-order ($p = 1$) elements are used.
Warning

Additionally, static condensation can be used to eliminate interior high-order degrees of freedom and obtain a smaller system. For order 1, this has no effect. For higher-order problems, static condensation can improve efficiency.
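For example, a higher-order run with static condensation enabled might look like the following; the `-sc` (`--static-condensation`) flag name is the convention used by the MFEM example drivers, so check `./ex2p --help` if it differs in your build:

```sh
./ex2p -o 3 -sc
```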
In this example, as before, the mesh refinement level can be controlled in the source code through `par_ref_levels`.
Note
Remember to rebuild the example after changing the source code (`make ex2p`).

Running with more than one MPI rank will partition the mesh and run the problem in parallel. Here is a sample 3D run:
mpirun -np 8 ./ex2p -m ../data/beam-hex.mesh

Try experimenting with different discretization, solver, and parallelization options.
Examples 3 and 4: the de Rham Complex

The next two examples demonstrate the use of vector finite element spaces.

Example 3 solves an electromagnetics problem using $H(\mathrm{curl})$ finite elements.

Example 4 solves a grad-div problem using $H(\mathrm{div})$ finite elements.


Standard multigrid methods don't always work well for these problems, so we need specialized solvers! (See here for a paper on this topic.)

For $H(\mathrm{curl})$ problems, we use the AMS solver from hypre.

For $H(\mathrm{div})$ problems, we use either the ADS solver from hypre or a special hybridization solver.
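As a rough sketch, an AMS-preconditioned CG solve for the $H(\mathrm{curl})$ system looks like the following in the MFEM API (close to what ex3p does; `A`, `B`, `X`, and the Nedelec space `fespace` are assumed from the example, and the $H(\mathrm{div})$ case uses HypreADS analogously):

```cpp
// AMS (Auxiliary-space Maxwell Solver) preconditioned PCG for H(curl).
HypreAMS ams(A, &fespace);   // AMS needs the Nedelec space to build gradients
HyprePCG pcg(A);
pcg.SetTol(1e-12);
pcg.SetMaxIter(500);
pcg.SetPrintLevel(2);
pcg.SetPreconditioner(ams);
pcg.Mult(B, X);
```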

Try experimenting with different options to get a feel for the performance of the discretizations and solvers:

- Change the mesh (2D or 3D) using the `--mesh` (`-m`) command line argument. For example: `mpirun -np 16 ./ex3p -m ../data/beam-hex.mesh`.
- Change the polynomial degree using the `--order` (`-o`) command line argument. For example: `mpirun -np 32 ./ex4p -m ../data/square-disc-nurbs.mesh -o 3`.
- Run problems in parallel using `mpirun`.
- For `ex4p`, enable hybridization using the `-hb` flag. For example: `mpirun -np 48 ./ex4p -m ../data/star-surf.mesh -o 3 -hb`.

Note
Remember to build the examples before running them: `make ex3 ex4 ex3p ex4p`.
MFEM's native Multigrid solver

The previous examples (`ex1p`, `ex2p`, `ex3p`, and `ex4p`) all used algebraic multigrid methods. MFEM also supports geometric ($h$- and $p$-multigrid) methods.
These solvers are illustrated in Example 26 (and its parallel variant); see the ex26.cpp and ex26p.cpp source files.

Mesh refinement can be set using the `--geometric-refinements` (`-gr`) command line argument.
The finite element order can be controlled using the `--order-refinements` (`-or`) command line argument.
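For example, a small parallel run that combines both options could look like this (the mesh and process count are arbitrary illustrative choices):

```sh
mpirun -np 8 ./ex26p -m ../data/fichera.mesh -gr 1 -or 2
```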
Warning

This example runs matrix-free using MFEM's partial assembly algorithms. Matrix-free methods are much more efficient for high-order problems and also work better on GPU architectures.
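As a minimal sketch, partial assembly is requested on the bilinear form before it is assembled; this mirrors the `-pa` code path used in many MFEM examples (`fespace` and the coefficient `one` are assumed from the surrounding code):

```cpp
// Enable matrix-free (partially assembled) operator application.
ParBilinearForm a(&fespace);
a.SetAssemblyLevel(AssemblyLevel::PARTIAL);
a.AddDomainIntegrator(new DiffusionIntegrator(one));
a.Assemble();
// FormLinearSystem then returns a matrix-free operator
// instead of an assembled HypreParMatrix.
```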

Try comparing the performance of `ex1p` and `ex26p` for higher-order problems. For example, compare the run time of the following two runs:

```sh
mpirun -np 32 ./ex26p -m ../data/fichera.mesh -or 2
mpirun -np 32 ./ex1p -m ../data/fichera.mesh -o 4
```

Both examples solve a degree4 Poisson problem with 1,884,545 degrees of freedom, but one of them is significantly faster.

Explore how the number of CG iterations changes as `-or` and `-gr` are increased. (For large problems, it may be worth running `ex26p` in parallel with `mpirun`.)
Low-order-refined methods

Examples 1, 2, 3, and 4 used algebraic methods applied to the discretization matrix for each of the problems. Example 26 showed how to use geometric multigrid together with matrixfree methods.

The low-order-refined (LOR) approach is an alternative matrix-free methodology for solving these problems. The LOR solvers miniapp provides matrix-free solvers for the same problems solved in Examples 1, 3, and 4.
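As a rough sketch of the idea, assuming the `LORSolver` helper class available in recent MFEM versions (check the miniapp source for the exact usage in your version), a high-order bilinear form `a` can be preconditioned by AMG built on its low-order-refined counterpart:

```cpp
// AMG assembled on the low-order-refined operator, used to precondition a
// matrix-free CG solve of the high-order system (a, A, B, X, ess_tdof_list
// are assumed from the surrounding high-order setup).
LORSolver<HypreBoomerAMG> lor_amg(a, ess_tdof_list);
CGSolver cg(MPI_COMM_WORLD);
cg.SetPreconditioner(lor_amg);
cg.SetOperator(*A);   // matrix-free high-order operator
cg.Mult(B, X);
```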

Go to the LOR solvers miniapp directory:
cd ~/mfem/miniapps/solvers

Run `make plor_solvers` to build the parallel LOR solvers miniapp.
The `--fe-type` (or `-fe`) command line argument can be used to choose the problem type:

- `-fe h` solves an $H^1$ problem (Poisson, equivalent to `ex1`).
- `-fe n` solves a Nedelec problem (Maxwell in $H(\mathrm{curl})$, equivalent to `ex3`).
- `-fe r` solves a Raviart-Thomas problem (grad-div in $H(\mathrm{div})$, equivalent to `ex4`).


As usual, the `--mesh` (`-m`) argument can be used to choose the mesh file. (Keep in mind that MFEM's meshes in the data directory are now found in `../../data` relative to the miniapp directory.)
The number of mesh refinements in serial and parallel can be controlled with the `--refine-serial` and `--refine-parallel` (`-rs` and `-rp`) command line arguments.
The polynomial degree can be controlled with the `--order` (`-o`) argument.
Compare the performance of high-order problems with `plor_solvers` to that of Examples 1, 3, and 4. Here are some sample runs to compare:

```sh
# 2D, 5th order, 256,800 DOFs
mpirun -np 8 ./plor_solvers -fe n -m ../../data/star.mesh -rs 2 -rp 2 -o 5 -no-vis
mpirun -np 8 ../../examples/ex3p -m ../../data/star.mesh -o 5

# 3D, 3rd order, 2,378,016 DOFs
mpirun -np 24 ./plor_solvers -fe n -m ../../data/fichera.mesh -rs 2 -rp 2 -o 3 -no-vis
mpirun -np 24 ../../examples/ex3p -m ../../data/fichera.mesh -o 3
```

For more details on how LOR solvers work in MFEM, see the High-Order Matrix-Free Solvers talk (PDF, video) from the 2021 MFEM community workshop.
Additional solver integrations
In addition to the hypre AMG solvers and MFEM's built-in solvers illustrated above, MFEM also integrates with a number of third-party solver libraries, including:

- PETSc — see the `~/mfem/examples/petsc` directory
- SuperLU — see the `~/mfem/examples/superlu` directory
- STRUMPACK — see `~/mfem/examples/ex11p.cpp`
- Ginkgo — see the `~/mfem/examples/ginkgo` directory
- AmgX — see the `~/mfem/examples/amgx` directory
These third-party libraries are not preinstalled in the AWS image, but you can still peruse the example source code to see the capabilities of the various integrations.
Note
Using these integrations requires building MFEM with the corresponding `MFEM_USE_*` configuration variables, e.g., by specifying `MFEM_USE_PETSC=YES`.
Questions?
Next Steps
Back to the MFEM tutorial page