Larry blogs that improving your app's performance means concurrent programming. Not just OpenMP, which is very cool, as he points out elsewhere, but all the hard stuff: "disk contention, memory locks, cache corruption, etc". Still, here's a tempting paragraph from that DevX article:
It's perhaps surprising that C++, with its reputation for difficulty, actually provides one of the easiest ways to exploit multi-core and multiprocessor systems. OpenMP, a multiplatform API for C++ and Fortran, uses compiler instructions to automatically generate all of the support code needed to parallelize code sections. In the simplest case, which is what we're going to focus on for this article, simply wrapping a processor-intensive loop in a #pragma block can lead to about a 70 percent performance increase on a dual-core or dual-processor system and enjoy a similar "free lunch" on the quad-core systems that you build in the future.
That's right. Concurrency is vital, and C++ with OpenMP takes care of one kind of concurrency, parallelizing loops, astonishingly easily. Later in the article he plops a #pragma just before each of two loops, and his app runs 70% faster. How's that for fun? Go on, read the article, try it yourself.
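If you want a feel for what "plopping a #pragma before a loop" looks like, here is a minimal sketch. It is not the code from the DevX article: the functions `scale` and `dot` are hypothetical examples I made up, and the loops are deliberately trivial. The only OpenMP pieces are the two `#pragma omp parallel for` lines, plus the `reduction` clause the second loop needs because every iteration adds into the same variable.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: multiply every element by a factor.
// Each iteration touches only v[i], so the iterations are independent
// and the loop can be split across threads as-is.
void scale(std::vector<double>& v, double factor) {
    const int n = static_cast<int>(v.size());
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        v[i] *= factor;
    }
}

// Hypothetical example: dot product of two equally sized vectors.
// All iterations accumulate into `sum`, so the reduction clause gives
// each thread a private copy and adds the copies together at the end.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    const int n = static_cast<int>(a.size());
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

The pragmas only take effect when OpenMP is switched on in the compiler, for example `-fopenmp` with GCC or `/openmp` with Visual C++. Without that switch the pragmas are ignored and the same code runs serially, which makes it easy to compare the two builds. How much faster the parallel build runs depends on the loop and the machine; the 70% figure is the article's result, not something this sketch promises.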
Kate