Parallel Performance Impacted by Choice of MPI Library

Is your CFD solver not giving you the parallel performance you expect? Not scaling the way you think it should? Look into using a different implementation of MPI ("Message Passing Interface"). Most parallel CFD codes use MPI, and you might be surprised at how much effect the specific library choice can have.

I learned about this the hard way when I had to port the CHOPA ("Compressible High Order Parallel Acoustics") aeroacoustics solver to an IBM iDataPlex system. I have used Intel compilers for many years, so when I saw that they were available on the system, it was an obvious choice to use them along with the corresponding Intel MPI libraries. CHOPA was soon up and running, but it didn't take long before I realized that the code was running much too slowly. It was running even slower than it had on the old system that was being retired to make way for the new IBM machine.

At the suggestion of one of the support people, I tried OpenMPI instead of Intel's version, and discovered that the code's parallel performance improved dramatically. So I used OpenMPI with CHOPA for a time until, out of the blue, we started seeing random slowdowns on some of our runs (but not others). The problem got worse and worse until the code was essentially unusable.

Working with the support personnel again, I was advised to try IBM's custom flavor of MPI. I did, and again, the performance improved immediately. It was even faster than OpenMPI had been before the odd slowdowns started (we never did find an explanation for those). CHOPA is now running at least four or five times faster than it was when we were trying to use Intel's MPI library for the same cases on the same system.

We have since discovered that the same parallel performance issue affects CHOPA on other systems as well. We see performance improve by factors of two or three (depending on the case) when switching from Intel MPI to MPICH on our in-house cluster. And colleagues have seen similar performance improvements on other systems.

I should emphasize that my point is not to pick on Intel or any other vendor. I have heard other people comment that their codes run faster with Intel's MPI than with others. It just happens that the Intel MPI libraries are especially sensitive to something that CHOPA does (we still don't know what). With another code, it might be a different MPI implementation that causes issues. Unfortunately, I don't have a way to know a priori which version of MPI is going to work best with a given solver, or whether a particular version is going to cause problems. It's very much a trial-and-error situation.

So, my advice is to be alert for otherwise unexplained code slowdowns when moving to a new system. If you can, do some timings using different versions of MPI to see whether the choice has an effect on your particular CFD solver.
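One way to get a quick, solver-independent feel for how the MPI stacks on a system compare is a classic ping-pong microbenchmark. The sketch below is not CHOPA's communication pattern (which we never did pin down); it's just a minimal round-trip timing loop, built and run separately against each MPI installation (e.g. `mpicc -O2 pingpong.c -o pingpong && mpirun -np 2 ./pingpong`), assuming your system's usual compiler wrappers and launcher names.

```c
/* pingpong.c -- a minimal sketch for comparing MPI libraries.
 * Rank 0 bounces messages of increasing size off rank 1 and
 * reports the average round-trip time for each size.
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 2) {
        if (rank == 0) fprintf(stderr, "Run with at least 2 ranks.\n");
        MPI_Finalize();
        return 1;
    }

    const int reps = 1000;
    for (int bytes = 8; bytes <= (1 << 20); bytes *= 8) {
        char *buf = malloc((size_t)bytes);

        MPI_Barrier(MPI_COMM_WORLD);      /* start everyone together */
        double t0 = MPI_Wtime();
        for (int i = 0; i < reps; i++) {
            if (rank == 0) {
                MPI_Send(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double dt = MPI_Wtime() - t0;

        if (rank == 0)
            printf("%8d bytes: %8.2f us average round trip\n",
                   bytes, 1.0e6 * dt / reps);
        free(buf);
    }

    MPI_Finalize();
    return 0;
}
```

A microbenchmark like this won't predict your solver's overall speedup, but if one MPI library shows markedly worse latency or bandwidth on the same hardware, that's a strong hint it's worth rebuilding your code against another one before running full-scale timings.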

Obviously, this tip is mainly for those of you who have the ability to recompile your CFD solver of choice. It won't apply so much to those of you using pre-compiled binaries, though you might find yourselves affected by the issue even if you can't do much about it directly (but be sure to let the developers know so they can look into it).

