This chapter contains the following sections:
“Overview” explains how to fine-tune program execution using directives and assertions.
“Fine-Tuning Inlining and IPA” describes how to use directives to control inlining and IPA more precisely than command line options allow.
“Circumventing PFA” explains how to use directives to bypass PFA's analysis and leave areas of code unchanged.
“Running Code Serially” explains how to use directives and assertions to stop PFA from running specific code in parallel.
“Running Code in Parallel” explains how to use directives and assertions to tell PFA that it is safe to run specific parts of code in parallel.
“Ignoring Data Dependencies” explains how to tell PFA that apparently data-dependent code is safe to run in parallel.
“Using Equivalenced Variables” explains how to assert that your code uses or does not use equivalenced variables.
“Using Aliasing” describes the assertions used with aliasing.
After you run a Fortran source program through PFA once, you can use directives and assertions to fine-tune program execution. The listing file will show where and why PFA did not parallelize the code.
You can use directives and assertions to force PFA to execute portions of code in various ways. Command line options apply to the program as a whole.
If you want finer control for parallelizing a critical loop or inlining a particular occurrence of a routine, specify directives and assertions directly in the code. You can also use directives and assertions to keep PFA from converting code to run in parallel. In other cases you might want to explicitly force PFA to run segments of code in parallel even though it normally would not.
Chapter 4, “Customizing PFA Execution,” explains how to use inlining and IPA on an entire program (refer to “Performing Inlining and Interprocedural Analysis”). You can fine-tune inlining and IPA using the C*$*[NO] INLINE and C*$*[NO] IPA directives.
The C*$* [NO] INLINE directive behaves much the same as the -INLINE command line option, but with the directive you can specify which occurrences of a routine are actually inlined. The format for this directive is
C*$*[NO]INLINE [(name[,name ... ])] {HERE|ROUTINE|GLOBAL}
where
name: Specifies the routines to be inlined. If you do not specify a name, this directive will affect all routines in the program.
HERE: Applies the INLINE directive only to the next line; occurrences of the named routines on that next line are inlined.
ROUTINE: Inlines the named routines everywhere they appear in the current routine.
GLOBAL: Inlines the named routines throughout the source file.
The C*$*NOINLINE form overrides the -INLINE command line option and so allows you to selectively disable inlining of the named routines at specific points.
In the following code fragment, the C*$*INLINE directive inlines the first call to beta but not the second.
      do i = 1,n
C*$*INLINE (beta) HERE
         call beta (i,1)
      enddo
      call beta (n, 2)
Using the specifier ROUTINE rather than HERE inlines both calls. This routine must be compiled with the -inline_man command line option for the C*$* INLINE directive to be recognized.
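The ROUTINE variant can be sketched as follows. This is a minimal illustration only: the routine names driver and beta and the body of beta are hypothetical, and the directive is left at the same position as in the fragment above with only the specifier changed. A compiler that does not recognize the directive treats that line as an ordinary comment.

```fortran
      subroutine driver (n)
      integer n, i
      do i = 1, n
C*$*INLINE (beta) ROUTINE
         call beta (i, 1)
      enddo
c     With ROUTINE, this second call is also a candidate for inlining.
      call beta (n, 2)
      end

      subroutine beta (i, k)
c     Hypothetical callee, small enough to be worth inlining.
      integer i, k
      print *, i, k
      end
```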
The C*$* [NO] IPA directive is the analogous directive for interprocedural analysis. The format for this directive is
C*$*[NO]IPA [(name [,name...])] {HERE|ROUTINE|GLOBAL}
Sometimes you might need to hand-tune a DO loop so that it will run in parallel. Use the directives in this section to prevent PFA from analyzing your modified code.
The C$ DOACROSS directive tells the Fortran 77 compiler to generate parallel code for the following loop. When PFA encounters this directive on input, it does not modify the accompanying loop and therefore does not interfere with any hand-tuning.
C$ DOACROSS is the standard way to request parallel execution of a loop in Silicon Graphics Fortran 77, and it is the same directive that PFA generates as a result of its analysis. Refer to the Fortran 77 Programmer's Guide for more information about the C$ DOACROSS directive and its optional clauses.
PFA runs the following code as it appears:
C$ DOACROSS
      DO 10 I=1, 100
         A(I) = B(I)
   10 CONTINUE
Use the following assertions and directives to keep PFA from running specific code in parallel.
The C*$* ASSERT DO (SERIAL) assertion tells PFA to run the specified loop serially. PFA does not try to convert the specified loop to run in parallel. It also does not try to run any enclosing loop in parallel. However, PFA can still convert any loops nested inside the serial loop to run in parallel.
Silicon Graphics PFA supports the corresponding Cray directive, CDIR$ NEXT SCALAR. PFA interprets this directive as if it were a C*$* ASSERT DO (SERIAL) assertion and generates scalar code for the next DO loop.
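As a rough sketch of the C*$* ASSERT DO (SERIAL) assertion, consider a nest whose outer loop carries a genuine dependence while its inner loop does not. The subroutine name SWEEP and the array shapes are assumptions made for this illustration, and the assertion is placed directly before the DO statement it is meant to govern.

```fortran
      SUBROUTINE SWEEP (A, N, M)
      INTEGER N, M, I, J
      REAL A(M, N)
C     Column I depends on column I-1, so the I loop stays serial.
C*$* ASSERT DO (SERIAL)
      DO 20 I = 2, N
C        The J iterations are independent of one another, so PFA
C        can still consider this inner loop for parallel execution.
         DO 10 J = 1, M
            A(J, I) = A(J, I) + A(J, I-1)
   10    CONTINUE
   20 CONTINUE
      END
```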
The C*$* ASSERT DO PREFER (SERIAL) assertion indicates that you want to execute a DO loop in serial mode. This assertion directs PFA to leave the DO loop alone, regardless of the setting of the optimization level. You can use this assertion to control which loop (in a nest of loops) PFA chooses to run in parallel. The following example program segment shows how to use the assertion:
      DO 100 I = 1, N
C*$*ASSERT DO PREFER (SERIAL)
         DO 100 J = 1, M
            A(I,J) = B(I,J)
  100 CONTINUE
In the DO loop above, the assertion requests that the J loop be serial. In this construction, PFA tries to run the I loop in parallel but not the J loop. This capability is useful when you know the value of M to be very small or less than N. This assertion applies only to the DO loop that appears directly after the assertion.
This section explains the directives and assertions that allow PFA to determine that specific areas of code are safe to run in parallel.
The C*$*[NO]CONCURRENTIZE directive converts eligible loops to run in parallel. The NO version prevents PFA from converting loops to run in parallel. These directives, when specified globally, have the same effect as the -CONCURRENTIZE and -NOCONCURRENTIZE options (see Chapter 2, “How to Use PFA”).
PFA supports the VAST directive CVD$CONCUR. This directive runs a loop in parallel to optimize performance. PFA interprets this directive as if it were the C*$*CONCURRENTIZE directive.
The C*$* ASSERT DO PREFER (CONCURRENT) assertion directs PFA to run a particular nested loop in parallel if possible. PFA runs another of the nested loops in parallel only if a condition prevents running the selected loop in parallel.
Consider the following code:
C*$* ASSERT DO PREFER (CONCURRENT)
      DO 100 I = 1, N
         DO 100 J = 1, M
            A (I, J) = B (I, J)
  100 CONTINUE
This code directs PFA to prefer to run the I loop in parallel. However, if a data dependence conflict prevents running the I loop in parallel, PFA might run the J loop in parallel. The C*$* ASSERT DO PREFER (CONCURRENT) assertion applies only to the DO loop immediately after it.
PFA avoids running code in parallel that it believes to be data-dependent. Use the assertions described in the following sections to override this behavior.
The C*$* ASSERT DO (CONCURRENT) assertion tells PFA to ignore assumed data dependencies. Normally, PFA is conservative about converting loops to run in parallel.
When PFA analyzes a loop to see if it is safe to run in parallel, it categorizes the loop into one of three groups:
yes (loop is safe to run in parallel)
no (loop contains definite data dependencies)
not sure (PFA cannot prove whether the loop is safe)
Normally, PFA does not run “not sure” loops in parallel. It assumes there are data dependencies. C*$* ASSERT DO (CONCURRENT) tells PFA to go ahead and run “not sure” loops in parallel.
Note: If PFA identifies a loop as containing definite (as opposed to assumed) data dependencies, it does not run the loop in parallel even if you specify a C*$* ASSERT DO (CONCURRENT) assertion.
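A “not sure” loop might look like the following sketch. The subroutine name SHIFT and the precondition on K are assumptions made for this example: the offset K is unknown at compile time, so the analysis cannot tell on its own whether the elements read overlap the elements written.

```fortran
      SUBROUTINE SHIFT (A, N, K)
      INTEGER N, K, I
      REAL A(*)
C     Assumed precondition (not visible to PFA): K .GE. N, so the
C     elements read, A(1+K)..A(N+K), never overlap those written.
C*$* ASSERT DO (CONCURRENT)
      DO 10 I = 1, N
         A(I) = A(I+K) * 2.0
   10 CONTINUE
      END
```

If the precondition does not actually hold, the assertion is false and the parallel results can differ from the serial ones.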
PFA interprets the Cray directive CDIR$ IVDEP as if it were a C*$* ASSERT DO (CONCURRENT) assertion. Some loops with dependencies that run safely on Cray hardware do not run safely on SGI hardware. Therefore, recognition of this directive is turned off by default.
The C*$* ASSERT CONCURRENT CALL assertion tells PFA to ignore assumed dependencies that are due to a subroutine call or a function reference. However, you must ensure that the subroutines and referenced functions are safe for parallel execution. This assertion applies to all subroutine and function references in the accompanying loop, which must appear on the next line.
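The sketch below shows one way the C*$* ASSERT CONCURRENT CALL assertion might be used. The routine names SCALE and TWICE are hypothetical. TWICE is written so that it modifies only its own argument, which is the kind of property you need to verify yourself before making the assertion.

```fortran
      SUBROUTINE SCALE (A, N)
      INTEGER N, I
      REAL A(N)
C     The loop must begin on the line after the assertion.
C*$* ASSERT CONCURRENT CALL
      DO 10 I = 1, N
         CALL TWICE (A(I))
   10 CONTINUE
      END

      SUBROUTINE TWICE (X)
C     Touches only its argument: no COMMON, no SAVE, no I/O.
      REAL X
      X = 2.0 * X
      END
```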
The C*$* ASSERT NO RECURRENCE(variable) assertion tells PFA to ignore all data dependencies associated with variable. PFA ignores not just assumed dependencies (as with the C*$* ASSERT DO (CONCURRENT) assertion) but also real dependencies. Use this assertion to force PFA to parallelize a loop when other, gentler means have failed. Use this assertion with caution, as indiscriminate use can result in illegal parallel code.
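As a hedged illustration of C*$* ASSERT NO RECURRENCE, the following sketch names the array A in the assertion. The subroutine name GATHER, the index arrays IP and IQ, and the placement of the assertion directly before the loop are all assumptions made for this example. The loop is correct in parallel only if the stated preconditions really hold.

```fortran
      SUBROUTINE GATHER (A, IP, IQ, N)
      INTEGER N, I
      INTEGER IP(N), IQ(N)
      REAL A(*)
C     Assumed preconditions, which PFA cannot verify: IP holds no
C     repeated values, and no element written through IP is read
C     through IQ by a different iteration.
C*$* ASSERT NO RECURRENCE (A)
      DO 10 I = 1, N
         A(IP(I)) = A(IQ(I)) + 1.0
   10 CONTINUE
      END
```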
The C*$* ASSERT PERMUTATION(array) assertion tells PFA that array contains no repeated values. This assertion permits PFA to run in parallel certain kinds of loops that use indirect addressing, for example,
      DO I = 1, N
         A(INDEX(I)) = A(INDEX(I)) + B(I)
      ENDDO
You can run this loop in parallel only if the array INDEX has no repeated values (so that each INDEX (I) is unique). PFA cannot determine this, so it does not run such a loop in parallel. However, if you know that every element of INDEX() is unique, you can insert the following line before the loop to permit PFA to run the loop in parallel:
C*$* ASSERT PERMUTATION (INDEX)
The C*$* ASSERT NO EQUIVALENCE HAZARD assertion tells PFA that your code does not use equivalenced variables to refer to the same memory location inside one loop nest. Normally, EQUIVALENCE statements allow your code to use different variable names to refer to the same storage location. The -ASSUME=E command line option acts like the global C*$* ASSERT EQUIVALENCE HAZARD assertion (see “Global Assumptions” in Chapter 4). The C*$* ASSERT EQUIVALENCE HAZARD assertion is active until you reset it or until the end of the program unit.
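The following sketch shows a case where C*$* ASSERT NO EQUIVALENCE HAZARD could be appropriate. The names BUMP, BUF, WORK, LO, and HI, the array sizes, and the placement of the assertion before the loop are assumptions made for this example. LO and HI are both equivalenced to WORK, but they overlay disjoint halves of it, and the loop never refers to WORK itself.

```fortran
      SUBROUTINE BUMP (N)
      INTEGER N, I
      REAL WORK(2000), LO(1000), HI(1000)
      COMMON /BUF/ WORK
      EQUIVALENCE (WORK(1), LO(1)), (WORK(1001), HI(1))
C     LO and HI overlay disjoint halves of WORK, so inside this
C     loop no two names refer to the same storage location.
C     Assumes N .LE. 1000.
C*$* ASSERT NO EQUIVALENCE HAZARD
      DO 10 I = 1, N
         HI(I) = LO(I) + 1.0
   10 CONTINUE
      END
```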
PFA has several assertions for use with aliasing.
The C*$* ASSERT [NO] ARGUMENT ALIASING assertion allows PFA to make assumptions about subprogram arguments in a program. According to the Fortran 77 standard, you can alias a variable only if you do not modify (that is, write to) the aliased variable.
The following subroutine violates the standard, because variable A is aliased in the subroutine (through C and D) and variable X is aliased (through X and E):
      COMMON X,Y
      REAL A,B
      CALL SUB (A, A, X)
      ...
      SUBROUTINE SUB(C,D,E)
      COMMON X,Y
      X = ...
      C = ...
      ...
The command line option -ASSUME=P acts like a global C*$* ASSERT ARGUMENT ALIASING assertion (see Chapter 4, “Customizing PFA Execution”). A C*$* ASSERT ARGUMENT ALIASING assertion is active until it is reset or until the next routine begins.
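A routine that follows the standard's rule might use the NO form, as in the sketch below. The subroutine name ADDV and the position of the assertion inside the routine are assumptions made for this example. The assertion is a promise that no caller passes overlapping storage for C and D. If C and D did overlap, the loop would carry a dependence from one iteration to the next.

```fortran
      SUBROUTINE ADDV (C, D, N)
      INTEGER N, I
      REAL C(N), D(N)
C     Promise: callers never pass overlapping storage for C and D.
C*$* ASSERT NO ARGUMENT ALIASING
      DO 10 I = 2, N
         C(I) = D(I-1) + 1.0
   10 CONTINUE
      END
```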
The C*$* ASSERT RELATION(name.xx.name) assertion indicates the relationship between two variables or between a variable and a constant. name is the variable or constant, and xx is any of the following: GT, GE, EQ, NE, LT, or LE. This assertion applies only to the next DO statement.
Consider the following code:
      DO 100 I = 1, N
         A (I) = A (I+M) + B (I)
  100 CONTINUE
If you know that M is greater than N, use the following assertion to give this information to PFA:
C*$* ASSERT RELATION (M .GT. N)
      DO 100 I = 1, N
         A (I) = A (I+M) + B (I)
  100 CONTINUE
Knowing that M is greater than N, PFA can generate parallel code for this loop. If at run time, M is less than N, the answers produced by the code run in parallel could differ significantly from the answers produced by the original code run serially.
Note: Many relationships of this type can be cheaply tested for at run time. PFA will attempt to answer questions of this sort by generating an IF statement that explicitly tests the relationship at run time. Occasionally, PFA may need assistance, or you may want to squeeze that last ounce of performance out of some critical loop by asserting some relationship rather than repeatedly checking it at run time.
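The idea of a run-time test can be sketched by hand as two versions of the loop guarded by an IF statement. This is an illustration written for this section, not actual PFA output. The subroutine name UPDATE is hypothetical, and the test uses .GE. because the elements read and written already stop overlapping when M equals N.

```fortran
      SUBROUTINE UPDATE (A, B, N, M)
      INTEGER N, M, I
      REAL A(*), B(*)
      IF (M .GE. N) THEN
C        Reads A(1+M)..A(N+M) cannot overlap writes A(1)..A(N).
C$ DOACROSS
         DO 10 I = 1, N
            A(I) = A(I+M) + B(I)
   10    CONTINUE
      ELSE
C        Possible overlap: keep the original serial loop.
         DO 20 I = 1, N
            A(I) = A(I+M) + B(I)
   20    CONTINUE
      ENDIF
      END
```

Asserting the relationship instead removes the test and the serial copy of the loop, at the cost of wrong answers if the assertion is ever false.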