Internal OpenMP: How the parallel directive works

Let’s start with the parallel directive, which is probably the most widely used directive in OpenMP. In this post, I’ll use the following basic example to demonstrate how the OpenMP runtime library works internally.

void func() {
  int i = 0, j = 1;
#pragma omp parallel
  {
    ++i;
    ++j;
  }
}

After compilation, the compiler transforms it into roughly the following code (different compilers may generate slightly different code, but the result is essentially the same):

void omp_outlined(int gtid, int btid, int *i, int *j) {
  ++(*i);
  ++(*j);
}
void func() {
  int i = 0, j = 1;
  ident_t loc;
  // some operation on loc
  // ...
  __kmpc_fork_call(&loc, 2, omp_outlined, &i, &j);
}

Since our focus is not on the front end, we only briefly describe what the compiler does. The compiler takes all statements in the parallel region and outlines them into a new function, omp_outlined in this case. Its type is effectively void(int, int, ...), where the first argument is the global thread id and the second is the bound thread id (LLVM's runtime declares these two as kmp_int32 * pointers rather than plain ints). We’ll introduce them later. The remaining arguments are variadic and stand for the variables captured by the parallel region. In the example above, the integers i and j are used in the parallel region, so they are captured. The body of the outlined function is simple: it operates on the two variables through their pointers. The compiler also creates a new variable of type ident_t that describes a source location, mainly for debugging. Finally, it generates a call to __kmpc_fork_call, and that’s almost it.
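To make the outlining scheme concrete, here is a minimal sketch that can be compiled without any OpenMP support. Everything prefixed with mock_ or MOCK_, as well as func_demo, is a hypothetical name invented for this illustration and is not part of any real runtime. The mock runs the "team" members one after another on the calling thread, which is an assumption made only to keep the result deterministic; it says nothing about how __kmpc_fork_call is actually implemented.

```c
#include <assert.h>
#include <stdarg.h>

#define MOCK_NUM_THREADS 4 /* hypothetical team size for this sketch */

/* Same shape as the outlined function in the post: two thread ids,
   then the captured variables as a variadic tail. */
typedef void (*mock_microtask_t)(int gtid, int btid, ...);

/* Hypothetical stand-in for __kmpc_fork_call. It invokes the outlined
   function once per team member, sequentially; a real runtime would
   dispatch the calls to worker threads instead. */
void mock_fork_call(int argc, mock_microtask_t task, ...) {
  void *args[2] = {0, 0};
  va_list ap;

  assert(argc >= 0 && argc <= 2); /* this sketch handles at most two captures */
  va_start(ap, task);
  for (int k = 0; k < argc; ++k)
    args[k] = va_arg(ap, void *);
  va_end(ap);

  for (int tid = 0; tid < MOCK_NUM_THREADS; ++tid) {
    /* Assumption: global and bound thread id are the same in this mock. */
    switch (argc) {
    case 0: task(tid, tid); break;
    case 1: task(tid, tid, args[0]); break;
    case 2: task(tid, tid, args[0], args[1]); break;
    }
  }
}

/* Hand-written equivalent of the compiler-generated outlined function. */
void omp_outlined(int gtid, int btid, ...) {
  va_list ap;
  (void)gtid; /* unused in this example */
  (void)btid;
  va_start(ap, btid);
  int *i = va_arg(ap, void *); /* captured variables arrive as pointers */
  int *j = va_arg(ap, void *);
  va_end(ap);
  ++(*i);
  ++(*j);
}

/* Mirrors func() from the post; i and j are copied to out-parameters
   so the effect of the region can be inspected. */
void func_demo(int *i_out, int *j_out) {
  int i = 0, j = 1;
  mock_fork_call(2, omp_outlined, (void *)&i, (void *)&j);
  *i_out = i;
  *j_out = j;
}
```

With the four sequential team members assumed here, func_demo leaves i at 4 and j at 5. In a real parallel run the unsynchronized increments would race, so the final values would not be guaranteed.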

Although the example above is simple, it shows the essence of how an OpenMP compiler handles a directive: it outlines the parallel region into a dedicated function and then emits the corresponding runtime calls. This is convenient for feature development, because the details are hidden inside the runtime functions and new features are less likely to break compilation. However, it also gets in the way of classic compiler optimizations such as constant propagation. We won’t go into detail here, but we will come back to this topic in another series about how the OpenMP compiler works.
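As a rough illustration of the constant propagation point, compare the two functions in the sketch below. The names direct, through_call, and bump are hypothetical and exist only for this example; through_call stands in for a function whose local variable is handed to an outlined region by address.

```c
#include <assert.h>

/* The value of n is fully visible here, so an optimizer is free to
   fold the result to the constant 16. */
int direct(void) {
  int n = 8;
  return n * 2;
}

/* n escapes by address into a callee the compiler may not be able to
   see through, much like a captured variable passed to the fork call.
   The compiler generally has to re-read n after the call. */
int through_call(void (*region)(int *)) {
  int n = 8;
  assert(region != 0);
  region(&n);
  return n * 2;
}

/* Hypothetical region body that modifies the captured variable. */
void bump(int *n) { ++(*n); }
```

Here through_call(bump) returns 18 rather than 16, which is why the compiler cannot simply assume that n still holds 8 after the call.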

Alright, back to our topic.

(To be continued…)
