Problems With Fortran Bindings for MPI | PARALLEL.RU - Информационно-аналитический центр по параллельным вычислениям

10.2.2. Problems With Fortran Bindings for MPI

Up: Fortran Support Next: Problems Due to Strong Typing Previous: Overview

This section discusses a number of problems that may arise when using MPI in a Fortran program. It is intended as advice to users, and clarifies how MPI interacts with Fortran. It does not add to the standard, but is intended to clarify the standard.

As noted in the original MPI specification, the interface violates the Fortran standard in several ways. While these cause few problems for Fortran 77 programs, they become more significant for Fortran 90 programs, so that users must exercise care when using new Fortran 90 features. The violations were originally adopted and have been retained because they are important for the usability of MPI. The rest of this section describes the potential problems in detail. It supersedes and replaces the discussion of Fortran bindings in the original MPI specification (for Fortran 90, not Fortran 77).

The following MPI features are inconsistent with Fortran 90.

Named Constants

MPI-1 contained several routines that take address-sized information as input or return address-sized information as output. In C such arguments were of type MPI_Aint and in Fortran of type INTEGER. On machines where integers are smaller than addresses, these routines can lose information. In MPI-2 the use of these functions has been deprecated and they have been replaced by routines taking INTEGER arguments of KIND=MPI_ADDRESS_KIND. A number of new MPI-2 functions also take INTEGER arguments of non-default KIND. See Section Language Binding

for more information.

Up: Fortran Support Next: Problems Due to Strong Typing Previous: Overview

10.2.2.1. Problems Due to Strong Typing

Up: Problems With Fortran Bindings for MPI Next: Problems Due to Data Copying and Sequence Association Previous: Problems With Fortran Bindings for MPI

All MPI functions with choice arguments associate actual arguments of different Fortran datatypes with the same dummy argument. This is not allowed by Fortran 77, and in Fortran 90 is technically only allowed if the function is overloaded with a different function for each type. In C, the use of void* formal arguments avoids these problems.

The following code fragment is technically illegal and may generate a compile-time error.

integer i(5) 
  real    x(5) 
  ... 
  call mpi_send(x, 5, MPI_REAL, ...) 
  call mpi_send(i, 5, MPI_INTEGER, ...)

In practice, it is rare for compilers to do more than issue a warning, though there is concern that Fortran 90 compilers are more likely to return errors.

It is also technically illegal in Fortran to pass a scalar actual argument to an array dummy argument. Thus the following code fragment may generate an error since the buf argument to MPI_SEND is declared as an assumed-size array <type> buf(*).

integer a 
  call mpi_send(a, 1, MPI_INTEGER, ...)

[] Advice to users.

In the event that you run into one of the problems related to type checking, you may be able to work around it by using a compiler flag, by compiling separately, or by using an MPI implementation with Extended Fortran Support as described in Section Extended Fortran Support . An alternative that will usually work with variables local to a routine but not with arguments to a function or subroutine is to use the EQUIVALENCE statement to create another variable with a type accepted by the compiler. ( End of advice to users.)

Up: Problems With Fortran Bindings for MPI Next: Problems Due to Data Copying and Sequence Association Previous: Problems With Fortran Bindings for MPI

10.2.2.2. Problems Due to Data Copying and Sequence Association

Up: Problems With Fortran Bindings for MPI Next: Special Constants Previous: Problems Due to Strong Typing

Implicit in MPI is the idea of a contiguous chunk of memory accessible through a linear address space. MPI copies data to and from this memory. An MPI program specifies the location of data by providing memory addresses and offsets. In the C language, sequence association rules plus pointers provide all the necessary low-level structure.

In Fortran 90, user data is not necessarily stored contiguously. For example, the array section A(1:N:2) involves only the elements of A with indices 1, 3, 5, ... . The same is true for a pointer array whose target is such a section. Most compilers ensure that an array that is a dummy argument is held in contiguous memory if it is declared with an explicit shape (e.g., B(N)) or is of assumed size (e.g., B(*)). If necessary, they do this by making a copy of the array into contiguous memory. Both Fortran 77 and Fortran 90 are carefully worded to allow such copying to occur, but few Fortran 77 compilers do it.Technically, the Fortran standards are worded to allow non-contiguous storage of any array data.

Because MPI dummy buffer arguments are assumed-size arrays, this leads to a serious problem for a non-blocking call: the compiler copies the temporary array back on return but MPI continues to copy data to the memory that held it. For example, consider the following code fragment:

real a(100) 
    call MPI_IRECV(a(1:100:2), MPI_REAL, 50, ...)

Since the first dummy argument to MPI_IRECV is an assumed-size array ( <type> buf(*)), the array section a(1:100:2) is copied to a temporary before being passed to MPI_IRECV, so that it is contiguous in memory. MPI_IRECV returns immediately, and data is copied from the temporary back into the array a. Sometime later, MPI may write to the address of the deallocated temporary. Copying is also a problem for MPI_ISEND since the temporary array may be deallocated before the data has all been sent from it.

Most Fortran 90 compilers do not make a copy if the actual argument is the whole of an explicit-shape or assumed-size array or is a `simple' section such as A(1:N) of such an array. (We define `simple' more fully in the next paragraph.) Also, many compilers treat allocatable arrays the same as they treat explicit-shape arrays in this regard (though we know of one that does not). However, the same is not true for assumed-shape and pointer arrays; since they may be discontiguous, copying is often done. It is this copying that causes problems for MPI as described in the previous paragraph.

Our formal definition of a `simple' array section is

name ( [:,]... [<subscript>]:[<subscript>] [,<subscript>]... )

That is, there are zero or more dimensions that are selected in full, then one dimension selected without a stride, then zero or more dimensions that are selected with a simple subscript. Examples are

A(1:N), A(:,N), A(:,1:N,1), A(1:6,N), A(:,:,1:N)

Because of Fortran's column-major ordering, where the first index varies fastest, a simple section of a contiguous array will also be contiguous.To keep the definition of `simple' simple, we have chosen to require all but one of the section subscripts to be without bounds. A colon without bounds makes it obvious both to the compiler and to the reader that the whole of the dimension is selected. It would have been possible to allow cases where the whole dimension is selected with one or two bounds, but this means for the reader that the array declaration or most recent allocation has to be consulted and for the compiler that a run-time check may be required.

The same problem can occur with a scalar argument. Some compilers, even for Fortran 77, make a copy of some scalar dummy arguments within a called procedure. That this can cause a problem is illustrated by the example

call user1(a,rq)  
      call MPI_WAIT(rq,status,ierr)  
      write (*,*) a  
 
subroutine user1(buf,request) 
      call MPI_IRECV(buf,...,request,...)  
      end

If a is copied, MPI_IRECV will alter the copy when it completes the communication and will not alter a itself.

Note that copying will almost certainly occur for an argument that is a non-trivial expression (one with at least one operator or function call), a section that does not select a contiguous part of its parent (e.g., A(1:n:2)), a pointer whose target is such a section, or an assumed-shape array that is (directly or indirectly) associated with such a section.

If there is a compiler option that inhibits copying of arguments, in either the calling or called procedure, this should be employed.

If a compiler makes copies in the calling procedure of arguments that are explicit-shape or assumed-size arrays, simple array sections of such arrays, or scalars, and if there is no compiler option to inhibit this, then the compiler cannot be used for applications that use MPI_GET_ADDRESS, or any non-blocking MPI routine. If a compiler copies scalar arguments in the called procedure and there is no compiler option to inhibit this, then this compiler cannot be used for applications that use memory references across subroutine calls as in the example above.

Up: Problems With Fortran Bindings for MPI Next: Special Constants Previous: Problems Due to Strong Typing

10.2.2.3. Special Constants

Up: Problems With Fortran Bindings for MPI Next: Fortran 90 Derived Types Previous: Problems Due to Data Copying and Sequence Association

MPI requires a number of special ``constants'' that cannot be implemented as normal Fortran constants, including MPI_BOTTOM, MPI_STATUS_IGNORE, MPI_IN_PLACE, MPI_STATUSES_IGNORE and MPI_ERRCODES_IGNORE. In C, these are implemented as constant pointers, usually as NULL and are used where the function prototype calls for a pointer to a variable, not the variable itself.

In Fortran the implementation of these special constants may require the use of language constructs that are outside the Fortran standard. Using special values for the constants (e.g., by defining them through parameter statements) is not possible because an implementation cannot distinguish these values from legal data. Typically these constants are implemented as predefined static variables (e.g., a variable in an MPI-declared COMMON block), relying on the fact that the target compiler passes data by address. Inside the subroutine, this address can be extracted by some mechanism outside the Fortran standard (e.g., by Fortran extensions or by implementing the function in C).

Up: Problems With Fortran Bindings for MPI Next: Fortran 90 Derived Types Previous: Problems Due to Data Copying and Sequence Association

10.2.2.4. Fortran 90 Derived Types

Up: Problems With Fortran Bindings for MPI Next: A Problem with Register Optimization Previous: Special Constants

MPI does not explicitly support passing Fortran 90 derived types to choice dummy arguments. Indeed, for MPI implementations that provide explicit interfaces through the mpi module a compiler will reject derived type actual arguments at compile time. Even when no explicit interfaces are given, users should be aware that Fortran 90 provides no guarantee of sequence association for derived types or arrays of derived types. For instance, an array of a derived type consisting of two elements may be implemented as an array of the first elements followed by an array of the second. Use of the SEQUENCE attribute may help here, somewhat.

The following code fragment shows one possible way to send a derived type in Fortran. The example assumes that all data is passed by address.

type mytype 
       integer i 
       real x 
       double precision d 
    end type mytype 
 
type(mytype) foo 
    integer blocklen(3), type(3) 
    integer(MPI_ADDRESS_KIND) disp(3), base 
 
call MPI_GET_ADDRESS(foo%i, disp(1), ierr) 
    call MPI_GET_ADDRESS(foo%x, disp(2), ierr) 
    call MPI_GET_ADDRESS(foo%d, disp(3), ierr) 
     
 
base = disp(1) 
    disp(1) = disp(1) - base 
    disp(2) = disp(2) - base 
    disp(3) = disp(3) - base 
 
blocklen(1) = 1 
    blocklen(2) = 1 
    blocklen(3) = 1 
 
type(1) = MPI_INTEGER 
    type(2) = MPI_REAL 
    type(3) = MPI_DOUBLE_PRECISION 
 
call MPI_TYPE_CREATE_STRUCT(3, blocklen, disp, type, newtype, ierr) 
    call MPI_TYPE_COMMIT(newtype, ierr) 
 
! unpleasant to send foo%i instead of foo, but it works for scalar 
! entities of type mytype 
    call MPI_SEND(foo%i, 1, newtype, ...)

Up: Problems With Fortran Bindings for MPI Next: A Problem with Register Optimization Previous: Special Constants

10.2.2.5. A Problem with Register Optimization

Up: Problems With Fortran Bindings for MPI Next: Basic Fortran Support Previous: Fortran 90 Derived Types

MPI provides operations that may be hidden from the user code and run concurrently with it, accessing the same memory as user code. Examples include the data transfer for an MPI_IRECV. The optimizer of a compiler will assume that it can recognize periods when a copy of a variable can be kept in a register without reloading from or storing to memory. When the user code is working with a register copy of some variable while the hidden operation reads or writes the memory copy, problems occur. This section discusses register optimization pitfalls.

When a variable is local to a Fortran subroutine (i.e., not in a module or COMMON block), the compiler will assume that it cannot be modified by a called subroutine unless it is an actual argument of the call. In the most common linkage convention, the subroutine is expected to save and restore certain registers. Thus, the optimizer will assume that a register which held a valid copy of such a variable before the call will still hold a valid copy on return.

Normally users are not afflicted with this. But the user should pay attention to this section if in his/her program a buffer argument to an MPI_SEND, MPI_RECV etc., uses a name which hides the actual variables involved. MPI_BOTTOM with an MPI_Datatype containing absolute addresses is one example. Creating a datatype which uses one variable as an anchor and brings along others by using MPI_GET_ADDRESS to determine their offsets from the anchor is another. The anchor variable would be the only one mentioned in the call. Also attention must be paid if MPI operations are used that run in parallel with the user's application.

The following example shows what Fortran compilers are allowed to do.

The compiler does not invalidate the register because it cannot see that MPI_RECV changes the value of buf. The access of buf is hidden by the use of MPI_GET_ADDRESS and MPI_BOTTOM.

The next example shows extreme, but allowed, possibilities.

MPI_WAIT on a concurrent thread modifies buf between the invocation of MPI_IRECV and the finish of MPI_WAIT. But the compiler cannot see any possibility that buf can be changed after MPI_IRECV has returned, and may schedule the load of buf earlier than typed in the source. It has no reason to avoid using a register to hold buf across the call to MPI_WAIT. It also may reorder the instructions as in the case on the right.

To prevent instruction reordering or the allocation of a buffer in a register there are two possibilities in portable Fortran code:

The compiler may be prevented from moving a reference to a buffer across a call to an MPI subroutine by surrounding the call by calls to an external subroutine with the buffer as an actual argument. Note that if the intent is declared in the external subroutine, it must be OUT or INOUT. The subroutine itself may have an empty body, but the compiler does not know this and has to assume that the buffer may be altered. For example, the above call of MPI_RECV might be replaced by
```
call DD(buf) 
        call MPI_RECV(MPI_BOTTOM,...) 
        call DD(buf) 
```
with the separately compiled
```
subroutine DD(buf) 
          integer buf 
        end 
```
(assuming that buf has type INTEGER). The compiler may be similarly prevented from moving a reference to a variable across a call to an MPI subroutine.
In the case of a non-blocking call, as in the above call of MPI_WAIT, no reference to the buffer is permitted until it has been verified that the transfer has been completed. Therefore, in this case, the extra call ahead of the MPI call is not necessary, i.e., the call of MPI_WAIT in the example might be replaced by
```
call MPI_WAIT(req,..) 
          call DD(buf) 
```
An alternative is to put the buffer or variable into a module or a common block and access it through a USE or COMMON statement in each scope where it is referenced, defined or appears as an actual argument in a call to an MPI routine. The compiler will then have to assume that the MPI procedure ( MPI_RECV in the above example) may alter the buffer or variable, provided that the compiler cannot analyze that the MPI procedure does not reference the module or common block.

In the longer term, the attribute VOLATILE is under consideration for Fortran 2000 and would give the buffer or variable the properties needed, but it would inhibit optimization of any code containing the buffer or variable.

In C, subroutines which modify variables that are not in the argument list will not cause register optimization problems. This is because taking pointers to storage objects by using the & operator and later referencing the objects by way of the pointer is an integral part of the language. A C compiler understands the implications, so that the problem should not occur, in general. However, some compilers do offer optional aggressive optimization levels which may not be safe.