Chapter 10. OpenMP C/C++ API Multiprocessing Directives

This chapter provides an overview of the multiprocessing directives that MIPSpro C and C++ compilers support. These directives are based on the OpenMP C/C++ Application Program Interface (API) standard, version 2.0, which is available in the 7.4.1 release. Programs that use these directives are portable and can be compiled by other compilers that support the OpenMP standard.

The complete OpenMP standard is available at http://www.openmp.org/specs. See that documentation for complete examples, rules of usage, and restrictions. This chapter provides only an overview of the supported directives and does not give complete details or restrictions.

To enable recognition of the OpenMP directives, specify -mp on the cc or CC command line.

In addition to directives, the OpenMP C/C++ API describes several library functions and environment variables. Information on the library functions can be found on the omp_lock(3), omp_nested(3), and omp_threads(3) man pages. Information on the environment variables can be found on the pe_environ(5) man page.


Note: The SGI multiprocessing directives, including the Origin series distributed shared memory directives, are outmoded. Their preferred alternatives are the OpenMP C/C++ API directives described in this chapter.


Using Directives

Each OpenMP directive starts with #pragma omp, to reduce the potential for conflict with other #pragma directives with the same name. They have the following form:

#pragma omp directive-name [clause[ clause] ...] new-line

Except for starting with #pragma omp, the directive follows the conventions of the C and C++ standards for compiler directives.

Directives are case-sensitive. The order in which clauses appear in directives is not significant. Only one directive name can be specified per directive.

An OpenMP directive applies to at most one succeeding statement, which must be a structured block.

Conditional Compilation

The _OPENMP macro name is defined by OpenMP-compliant implementations as the decimal constant yyyymm, the year and month of the approved specification. This macro must not be the subject of a #define or a #undef preprocessing directive.

#ifdef _OPENMP
iam = omp_get_thread_num() + index;
#endif

If vendors define extensions to OpenMP, they may specify additional predefined macros.

If an implementation is not OpenMP-compliant, or if its OpenMP mode is disabled, it may ignore the OpenMP directives in a program. In effect, an OpenMP directive behaves as if it were enclosed within #ifdef _OPENMP and #endif. Thus, the following two examples are equivalent:

if (cond)
{
   #pragma omp flush (x)
}
x++;

if (cond)
{
   #ifdef _OPENMP
   #pragma omp flush (x)
   #endif
}
x++;

parallel Construct

The #pragma omp parallel directive defines a parallel region, which is a region of the program that is to be executed by multiple threads in parallel.

When a thread encounters a parallel construct and no if clause is present, or the if expression evaluates to a nonzero value, a team of threads is created. The encountering thread becomes the master thread of the team, with a thread number of 0. If the value of the if expression is zero, the region is serialized and executed by the encountering thread alone.

Work-sharing Constructs

A work-sharing construct distributes the execution of the associated statement among the members of the team that encounter it. The work-sharing directives do not launch new threads, and there is no implied barrier on entry to a work-sharing construct.

The sequence of work-sharing constructs and barrier directives encountered must be the same for every thread in a team.

OpenMP defines the following work-sharing constructs:

  • The #pragma omp for directive identifies an iterative work-sharing construct that specifies the iterations of the associated loop should be executed in parallel. The iterations of the for loop are distributed across threads that already exist.

  • The #pragma omp sections directive identifies a non-iterative work-sharing construct that specifies a set of constructs that are to be divided among threads in a team. Each section is executed once by a thread in the team. Each section is preceded by a section directive, although the section directive is optional for the first section.

  • The #pragma omp single directive identifies a construct that specifies that the associated structured block is executed by only one thread in the team (not necessarily the master thread).

Combined Parallel Work-sharing Constructs

Combined parallel work-sharing constructs are shortcuts for specifying a parallel region that contains only one work-sharing construct. The semantics of these directives are identical to those of explicitly specifying a parallel directive followed by a single work-sharing construct.

  • The #pragma omp parallel for directive is a shortcut for a parallel region that contains one for directive.

  • The #pragma omp parallel sections directive provides a shortcut form for specifying a parallel region containing one sections directive.

Master and Synchronization Constructs

The following list describes the synchronization constructs:

  • The #pragma omp master directive identifies a construct that specifies a structured block that is executed by the master thread of the team.

  • The #pragma omp critical directive identifies a construct that restricts execution of the associated structured block to one thread at a time.

  • The #pragma omp barrier directive synchronizes all the threads in a team, each thread waiting until all other threads have reached this point.

  • The #pragma omp atomic directive ensures that a specific memory location is updated atomically.

  • The #pragma omp flush directive, explicit or implied, identifies precise synchronization points at which the implementation is required to provide a consistent view of certain objects in memory. This means that previous evaluations of expressions that reference those objects are complete and subsequent evaluations have not yet begun.

  • A #pragma omp ordered directive must be within the dynamic extent of a for or parallel for construct that has an ordered clause. The structured block following an ordered directive is executed in the same order as iterations in a sequential loop.

Data Environment Constructs

The #pragma omp threadprivate directive makes file-scope, namespace-scope, or static block-scope variables local to a thread but global within the thread. This directive is not implemented for block-scope variables requiring dynamic initialization in C++.

Several directives accept clauses that allow a user to control the scope attributes of variables for the duration of the construct. Not all of the clauses are allowed on all directives, but the clauses that are valid on a particular directive are included with the description of the directive. Usually, if no data scope clauses are specified for a directive, the default scope for variables affected by the directive is shared.

The following list describes the data scope attribute clauses:

  • The private clause declares the variables in list to be private to each thread in a team.

  • The firstprivate clause provides a superset of the functionality provided by the private clause: in addition to making the variables in list private to each thread, it initializes each private copy with the value the original variable had when the construct was encountered.

  • The lastprivate clause provides a superset of the functionality provided by the private clause: in addition to making the variables in list private to each thread, it copies the value from the sequentially last iteration of the loop, or the lexically last section, back to the original variable when the construct completes.

  • The shared clause shares variables that appear in the list among all the threads in a team. All threads within a team access the same storage area for shared variables.

  • The default clause allows the user to specify a default data-sharing attribute, either shared or none, for all variables in the region.

  • The reduction clause performs a reduction on the scalar variables specified, with the operator specified.

  • The copyin clause lets you assign the same value to threadprivate variables for each thread in the team executing the parallel region. For each variable specified, the value of the variable in the master thread of the team is copied to the threadprivate copies at the beginning of the parallel region.

  • The copyprivate clause provides a mechanism to use a private variable to broadcast a value from one member of a team to the other members.

Directive Binding

Some directives are bound to other directives. A binding specifies the way in which one directive is related to another. For instance, a directive is bound to a second directive if it can appear in the dynamic extent of that second directive. The following rules apply with respect to the dynamic binding of directives:

  • The for, sections, single, master, and barrier directives bind to the dynamically enclosing parallel directive, if one exists. If no parallel region is currently being executed, the directives are executed by a team composed of only the master thread.

  • The ordered directive binds to the dynamically enclosing for directive.

  • The atomic directive enforces exclusive access with respect to atomic directives in all threads, not just the current team.

  • The critical directive enforces exclusive access with respect to critical directives in all threads, not just the current team.

  • A directive cannot bind to a directive outside the closest dynamically enclosing parallel directive.

Directive Nesting

Dynamic nesting of directives must adhere to the following rules:

  • A parallel directive dynamically inside another parallel directive logically establishes a new team, which is composed of only the current thread, unless nested parallelism is enabled.

  • for, sections, and single directives that bind to the same parallel directive are not allowed to be nested inside each other.

  • critical directives with the same name are not allowed to be nested inside each other.

  • for, sections, and single directives are not permitted in the dynamic extent of critical, ordered, and master regions if the directives bind to the same parallel as the regions.

  • barrier directives are not permitted in the dynamic extent of for, ordered, sections, single, master, and critical regions if the directives bind to the same parallel as the regions.

  • master directives are not permitted in the dynamic extent of for, sections, and single directives if the master directives bind to the same parallel as the regions.

  • ordered directives are not allowed in the dynamic extent of critical regions if the directives bind to the same parallel as the regions.

  • Any directive that is permitted when executed dynamically inside a parallel region is also permitted when executed outside a parallel region. When executed dynamically outside a user-specified parallel region, the directive is executed with respect to a team composed of only the master thread.