Topic: forall index variables in Support

LitusSaxonicum

Posts: 2284 Yateley, Hants, UK

17 Jan 2013 7:45 #11407

David,

It never snows much in the south of England, despite being at an Alaskan latitude. There were a few flakes today, and some on Monday. I cleaned my snow chains, and checked them for length on the car, but I haven't fitted them to my last 3 cars ....

A statement function is a horrible construct. But it is a function, and therefore has its own scope. Most people reading an old Fortran program don't understand statement functions. Consider the following:

      SUBROUTINE SAMPLE
      DIMENSION A(5), B(5), C(5)
      D(I) = A(I) + B(I) + C(I)
      ... etc
      END

You can't declare D as an array, as the name has a scope within SAMPLE of being a FUNCTION. You can't alter I, which picks up the current value of I from wherever D is called. I suppose D must also be a PURE function.

It would be the same if SAMPLE CONTAINS 'FUNCTION D(I)', but expressed that way, D would have access to all variables in SAMPLE, e.g. A, B and C, and could alter any of them, as it is not required that D is PURE if it is CONTAINSed. I suppose the scope of A, B and C is still limited to SAMPLE, but for me it seems wrong somehow that D could have access to variables not passed via parameters or in COMMON and without explicit reference somewhere. A statement function is compelled by its own limitations to be PURE.

I have only ever used 3 statement functions. 2 of them map real world coordinates to screen coordinates, and never give me a problem. The other one always catches me out when I look at the code it is in, which is about once every 6 or 7 years ... I wish the statement function allowed me to declare it as one, i.e. STATEMENT FUNCTION D(I) = etc - but then a suitable comment helps (especially now they can be in-line).

Eddie

davidb

Posts: 555 UK

17 Jan 2013 8:04 (Edited: 17 Jan 2013 8:40) #11409

You are correct to say that a statement function acts as if it is PURE. But this is a happy accident that is forced by the function being restricted to one line. The notion of a PURE function had not yet been introduced into the language when statement functions were invented (PURE arrived with Fortran 95).

You can write your one line example as a PURE function using CONTAINS if you want. This gives you a more explicit syntax and an explicit interface (and the ability to use more than 1 line).

      SUBROUTINE SAMPLE
      DIMENSION A(5), B(5), C(5)

      ... etc

      CONTAINS
           PURE FUNCTION D(I)
               D = A(I) + B(I) + C(I)
           END FUNCTION D
      END

This function accesses A, B and C using 'Host association'. The local variables of the host subroutine can be accessed but cannot be changed.

Your statement function is doing exactly the same thing. 😉

Note that a function can still be PURE and read variables outside the scope of the program (host scope, module scope, common) but it can't write to any of these values (or write to disc, screen etc).
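For anyone who wants to try this, here is a minimal self-contained sketch in free-form source, assuming a Fortran 95 or later compiler. The program name `pure_demo`, the variable `k`, the result name `total` and the data values are made up for illustration. The commented-out line shows the kind of assignment that a conforming compiler should reject inside a PURE function.

```fortran
program pure_demo
  implicit none
  call sample()
end program pure_demo

subroutine sample()
  implicit none
  real    :: a(5), b(5), c(5)
  integer :: k

  ! Illustrative data only
  a = 1.0
  b = 2.0
  c = 3.0

  do k = 1, 5
    print *, k, d(k)        ! each d(k) is 1.0 + 2.0 + 3.0 = 6.0
  end do

contains

  ! d reads a, b and c by host association but may not modify them
  pure function d(i) result(total)
    integer, intent(in) :: i
    real :: total
    total = a(i) + b(i) + c(i)
    ! a(i) = 0.0   ! uncommenting this should give a compile-time error,
    !              ! because a PURE function may not change host variables
  end function d

end subroutine sample
```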

PS. In the DO i=1,5 loop you posted earlier, I always has the value 6 on exit from the loop. 😃 It won't be 5 or 0 or -1.

LitusSaxonicum

Posts: 2284 Yateley, Hants, UK

17 Jan 2013 8:16 #11410

David,

Until I used Clearwin, I was unaware that functions could be anything other than PURE - as all the books (McCracken etc) always gave dire warnings of what might happen if you were daft enough to try. For decades it never occurred to me that parameters for a function could be anything other than INTENT IN (although for a subroutine they were definitely INOUT). Moreover, due to stack limitations, one always used COMMON in preference anyway, and COMMON is decidedly INOUT.

And quite definitely, a compiler should conform to standards.

Eddie

IanLambley

Posts: 501 Sunderland

18 Jan 2013 10:24 #11411

Sorry - that seemed to be like lobbing in a hand grenade and standing back. All I was asking was how do you use a FORALL construct with arrays which have a rank greater than 1. Do all the subscripts have to be addressed by the FORALL ranges or can they be addressed by means external to the FORALL construct? Ian

LitusSaxonicum

Posts: 2284 Yateley, Hants, UK

18 Jan 2013 2:14 #11412

Ian,

I think it is an interesting discussion, as the concept of PURE has always been around but it is articulated only in the later standards. What happens if you declare a subprogram to be PURE, and it isn't? I think I might give it a go in the spirit of enquiry espoused by David.

You should be able to mix all constructs, unless the standard says you can't. The problem seems to me that if you mix DO and FORALL, you might not get the answer you expected with experience of nested DOs, and you probably won't with nested FORALLs - if they are a fancy assignment and not an alternative loop.

And David, as the loop variable is undefined on normal exit from a loop, then I never relied on it being anything unless I jumped out of a loop, at which time it retained its inside-the-loop value. I didn't believe then (c. 1979) that the loop variable could be anything other than 5 or maybe 6, so I tried it. But it was a long time ago when I had access to a big range of mainframes via cards or terminal input. It was never reliable, and yes, sometimes it was -1 or 0 on exit. Why, I never understood. With the benefit of hindsight, I ought to have tried it when I had a dozen different 16 bit compilers, but although I do have Absoft 32 bit, I've never been able to use it properly, and as far as G95 or even FTN95 for .NET is concerned, my lack of Klingon means that I can't seem to get past the installation instructions .... If FTN95 gives last value + 1 on loop exit, you shouldn't rely on it.

Eddie

davidb

Posts: 555 UK

18 Jan 2013 4:26 (Edited: 18 Jan 2013 4:54) #11413

Ian,

Quoted from IanLambley Sorry - that seemed to be like lobbing in a hand grenade and standing back. All I was asking was how do you use a FORALL construct with arrays which have a rank greater than 1. Do all the subscripts have to be addressed by the FORALL ranges or can they be addressed by means external to the FORALL construct? Ian

No problem.

You can mix DO and FORALL provided that FORALL is inside DO (You can't have DO inside FORALL). The code you posted is OK. So you can write code like the following:

   DO j=1, N
      FORALL(i=1:M)
         A(i,j) = B(i) + C(i,j)
      END FORALL
   END DO

When you do this, the J variable has the scope of the program unit (subroutine etc), and the I variable has its own FORALL scope. The FORALL statement is allowed to access J in the surrounding scope provided it doesn't change its value (making the expression PURE).

You can also do this using only FORALL in two ways:

   FORALL(j=1:N)
      FORALL(i=1:M)
         A(i,j) = B(i) + C(i,j)
      END FORALL
   END FORALL

or

   FORALL(i=1:M, j=1:N)
        A(i,j) = B(i) + C(i,j)
   END FORALL
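A complete program along these lines, as a sketch only: the program name `forall_rank2`, the array sizes and the data values are assumptions made for illustration. The target is rank 2, so every element is assigned exactly once; a single FORALL assignment is not allowed to assign the same element more than once. The first version takes one subscript from an outer DO and the other from the FORALL, the second takes both from the FORALL.

```fortran
program forall_rank2
  implicit none
  integer, parameter :: m = 3, n = 2
  real    :: a(m,n), b(m), c(m,n)
  integer :: i, j

  ! Illustrative data only
  b = (/ 1.0, 2.0, 3.0 /)
  c = 10.0

  ! Second subscript j comes from the outer DO,
  ! first subscript i comes from the FORALL
  do j = 1, n
    forall (i = 1:m)
      a(i,j) = b(i) + c(i,j)
    end forall
  end do
  print *, a      ! 11. 12. 13. 11. 12. 13. (column by column)

  ! Both subscripts come from a single FORALL header
  a = 0.0
  forall (i = 1:m, j = 1:n)
    a(i,j) = b(i) + c(i,j)
  end forall
  print *, a      ! same values as above
end program forall_rank2
```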

davidb

Posts: 555 UK

18 Jan 2013 4:42 #11415

Eddie,

Quoted from LitusSaxonicum

What happens if you declare a subprogram to be PURE, and it isn't?

You will get a compiler error.

Quoted from LitusSaxonicum

The problem seems to me that if you mix DO and FORALL, you might not get the answer you expected with experience of nested DOs, and you probably won't with nested FORALLs - if they are a fancy assignment and not an alternative loop.

That's true. You need to build up some experience using the FORALL syntax. It is different to DOs. If you don't learn how to use it properly, or think they are loops, you could get incorrect results.
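A small sketch of how the two can differ, assuming a Fortran 95 or later compiler (the program name `forall_vs_do` and the data are made up). In a FORALL, every right-hand side is evaluated before any element is assigned, so the result does not depend on an execution order. A forward DO loop carries each new value into the next iteration, while a DO loop run backwards happens to reproduce the FORALL result for this particular assignment.

```fortran
program forall_vs_do
  implicit none
  integer, parameter :: n = 5
  integer :: a(n), b(n), c(n), i

  a = (/ 1, 2, 3, 4, 5 /)
  b = a
  c = a

  ! Forward DO: a(i-1) has already been overwritten when a(i) is set
  do i = 2, n
    a(i) = a(i-1)
  end do

  ! FORALL: all b(i-1) values are read before any b(i) is stored
  forall (i = 2:n)
    b(i) = b(i-1)
  end forall

  ! Backward DO: c(i-1) still holds its old value when c(i) is set
  do i = n, 2, -1
    c(i) = c(i-1)
  end do

  print *, 'Forward DO  :', a   ! 1 1 1 1 1
  print *, 'FORALL      :', b   ! 1 1 2 3 4
  print *, 'Backward DO :', c   ! 1 1 2 3 4
end program forall_vs_do
```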

Quoted from LitusSaxonicum

And David, as the loop variable is undefined on normal exit from a loop, then I never relied on it being anything unless I jumped out of a loop, at which time it retained its inside-the-loop value.

If a loop is exited (using GOTO or EXIT) the value of the index retains its value outside the loop. If the DO loop runs to completion, the index is defined and has a value that is the first value in the counting sequence that exceeds the final value. e.g.

with DO I=1,5 then I = 6 after the loop (counting sequence 1,2,3,4,5,6)
with DO I=1,5,2 then I = 7 after the loop (counting sequence 1,3,5,7)
with DO I=1,6,2 then I = 7 after the loop (counting sequence 1,3,5,7)
with DO I=4,4 then I = 5 after the loop (counting sequence 4,5)
with DO I=4,2 then I = 4 after the loop (no counting, loop isn't executed)

In the most usual case, where the stride is 1 and the loop executes at least once, the value after the loop is the final value + 1.

This is standard Fortran 77 behaviour so has been around since 1978. (It was different in Fortran 66). The only caveat is that REAL indices are allowed in Fortran 77 and there could be some rounding problems, whereas they have to be integers since Fortran 95.
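The cases listed above can be checked with a short program like the following sketch (the program name `do_exit_value` is made up; the loop bodies are deliberately empty). The values in the comments are the ones the Fortran 77 and later standards require for an integer index.

```fortran
program do_exit_value
  implicit none
  integer :: i

  do i = 1, 5
  end do
  print *, 'DO I=1,5   gives I =', i   ! 6

  do i = 1, 5, 2
  end do
  print *, 'DO I=1,5,2 gives I =', i   ! 7

  do i = 1, 6, 2
  end do
  print *, 'DO I=1,6,2 gives I =', i   ! 7

  do i = 4, 4
  end do
  print *, 'DO I=4,4   gives I =', i   ! 5

  ! Zero-trip loop: the index is set to the start value and never incremented
  do i = 4, 2
  end do
  print *, 'DO I=4,2   gives I =', i   ! 4
end program do_exit_value
```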

Most of the time I don't use the value of I after the loop, but sometimes it can be tested to see if a break out of the loop occurred or if the loop ran to completion. e.g.

DO I=1, N
   ...
   IF (...) EXIT
END DO
IF (I <= N) THEN
    ! A jump out occurred.
    ...
END IF

If you don't test I, you have to check for jump out as follows

JUMP = .FALSE.
DO I=1, N
   ...
   IF (...) THEN
      JUMP = .TRUE.
      EXIT
   ENDIF
END DO
IF (JUMP) THEN
    ! A jump out occurred.
    ! Eddie allows I to be used
    ...
END IF
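As a concrete instance of the first pattern, here is a sketch of a linear search that tests the index after the loop. The program name `find_first`, the array `v` and the value `wanted` are invented for illustration.

```fortran
program find_first
  implicit none
  integer, parameter :: n = 5
  integer :: v(n), i, wanted

  v = (/ 3, 8, 1, 9, 4 /)
  wanted = 9

  do i = 1, n
    if (v(i) == wanted) exit   ! jump out; i keeps its current value
  end do

  if (i <= n) then
    ! A jump out occurred, so i is the position of the match
    print *, 'Found at index', i        ! 4 for the data above
  else
    ! The loop ran to completion, so i is n + 1
    print *, 'Not found, i =', i
  end if
end program find_first
```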

IanLambley

Posts: 501 Sunderland

18 Jan 2013 8:15 #11417

David, Thanks that explains things, regarding multiple subscripts. Regarding your first example, the do loop variable 'I' must therefore only have scope outside the FORALL and the FORALL variable 'I' only has scope within the FORALL and if there is a conflict of variables within the statements, the FORALL takes precedence and can be regarded as a separate variable. Perhaps, ensuring that it does not clash would give clarity and ensure that the do loop variable was accessible within the FORALL construct if necessary.

Otherwise the old folk like myself would be confused. Ian

davidb

Posts: 555 UK

18 Jan 2013 9:11 (Edited: 19 Jan 2013 8:48) #11418

Quoted from IanLambley David, Thanks that explains things, regarding multiple subscripts. Regarding your first example, the do loop variable 'I' must therefore only have scope outside the FORALL and the FORALL variable 'I' only has scope within the FORALL and if there is a conflict of variables within the statements, the FORALL takes precedence and can be regarded as a separate variable.

Exactly so. In this case the DO Loop I variable cannot be seen inside the FORALL because it is obscured by the FORALL I variable which takes precedence.

Quoted from IanLambley

Perhaps, ensuring that it does not clash would give clarity and ensure that the do loop variable was accessible within the FORALL construct if necessary. Ian

Agreed. I would never use the construct given in the first example. I would always use different variable names. The example was provided just to illustrate a bug in the compiler.

LitusSaxonicum

Posts: 2284 Yateley, Hants, UK

18 Jan 2013 9:13 #11419

(It was different in Fortran 66)

I can't remember which compiler gave what behaviour, but I do remember that the first Fortran-77 compiler I used was on a VAX, and we didn't buy that until the 80s. Even when I got a PC, some of the compilers were still Fortran-66. It just shows that you have to (re-)question everything. I'm never going to rely on the exit value, though!

Eddie

davidb

Posts: 555 UK

18 Jan 2013 9:18 #11420

Quoted from LitusSaxonicum

(It was different in Fortran 66)

... the first Fortran-77 compiler I used was on a VAX Eddie

SNAP.

But I didn't learn Fortran 77 properly until I had it installed on an Acorn Archimedes and then an Acorn Cambridge Workstation (miss them days).

JohnCampbell

Posts: 2526 Sydney

20 Jan 2013 12:48 #11424

I've been away, so it is interesting to read what has been discussed about FORALL. There are two sides to this: One is that David is right in that there is an error with FTN95 not identifying the error. The other is that I would disagree with David, when he states:

It doesn't help thinking about FORALL as a loop. ... It's not even close to being a loop.

While the syntax is not a do loop, the compiler implements a do loop equivalent. David's example where the loop should be run backwards identifies that: To get the intended result, a DO loop should be run backwards. For a FORALL, it must take a copy of the array, then act on this old array to produce the new array. I would consider that inefficient, similar to the temporary copy for array sections in multi-dimension arrays.

My impression is that FORALL is not an efficient construct, contrary to its original purpose of identifying parallel computation.

I'm old-school and only ever use it in example coding. David's identification of errors in FORALL implementation is also helpful for those who choose to use it.

John

davidb

Posts: 555 UK

20 Jan 2013 1:56 #11426

Quoted from JohnCampbell

While the syntax is not a do loop, the compiler implements a do loop equivalent. David's example where the loop should be run backwards identifies that: To get the intended result, a DO loop should be run backwards. For a FORALL, it must take a copy of the array, then act on this old array to produce the new array. I would consider that inefficient, similar to the temporary copy for array sections in multi-dimension arrays.

It depends on the compiler and the hardware.

Certainly Silverfrost's FTN95 implements the FORALL by making a temporary copy when necessary, and in such cases it isn't very efficient.

Other compilers with better optimization will be able to convert the FORALL to a DO loop which doesn't need the copy operation (like in the example where the DO loop goes backwards).

It is fair to say that most compilers are not very good at implementing FORALL efficiently and in most cases, a well-crafted DO loop will outperform FORALL. On the other hand, there are a small number of compilers (e.g. Intel, Portland) which do a good job optimizing FORALL and produce code which is on par with or better than the equivalent DO loop -- performance always depends on the underlying vectorisation hardware and whether the compiler can take advantage of it.

Now if Silverfrost decided to offer limited support in FTN95 for SSE and AVX instructions when FORALL is used, there may be a cause to switch, but as of now, you should stick to DO loops in your applications if efficiency is paramount.

PaulLaidler

Posts: 7981 Salford, UK

27 Mar 2013 5:22 #11898

I have fixed the bug so that FORALL can now use the same index as an outer DO construct.

This fix will be included in the next release (after 6.35).

davidb

Posts: 555 UK

27 Mar 2013 5:44 #11902

Thank you Paul!

simon

Posts: 314

26 Aug 2013 5:08 #12920

Paul - can you confirm whether FORALL is running any faster than it used to? As David suggests, on most compilers it does not necessarily improve efficiency.

If I run the following program, the various FORALL constructs don't seem to perform very well. I have tried compiling using the following options and the first FORALL seems to perform consistently badly, and always at least as badly as having nested FORALL statements.

/lgo
/lgo /optimise
/lgo /check

Somewhat intriguingly (to me anyway), if a single FORALL statement is used, it seems to make an important difference which index is listed first, and some compilers seem to prefer one ordering whereas others prefer the opposite. So I've avoided using FORALL, simply because it actually seems to make the program run slower! I can't even run these types of tests to see under what conditions FORALL does work faster because it is quite likely to go slower under a different compiler.

PROGRAM t
!
! Compile this program using the following options, and compare times
! FTN95 t.f95 /lgo
! FTN95 t.f95 /lgo /optimise
! FTN95 t.f95 /lgo /check
!
  INTEGER, PARAMETER :: m=9000,n=5000
  REAL :: a(m,n)
!
  CALL CPU_TIME (t1)
  DO i=1,n
     DO j=1,m
        a(j,i)=0.0
     END DO
  END DO
  CALL CPU_TIME (t2)
  PRINT*, 'DO loops in correct order       ',t2-t1
!
  CALL CPU_TIME (t1)
  DO j=1,m
     DO i=1,n
        a(j,i)=0.0
     END DO
  END DO
  CALL CPU_TIME (t2)
  PRINT*, 'DO loops in incorrect order     ',t2-t1
!
  CALL CPU_TIME (t1)
  a(1:m,1:n)=0.0
  CALL CPU_TIME (t2)
  PRINT*, 'a(1:m,1:n)                      ',t2-t1
!
  CALL CPU_TIME (t1)
  a(:,:)=0.0
  CALL CPU_TIME (t2)
  PRINT*, 'a(:,:)                          ',t2-t1
!
  CALL CPU_TIME (t1)
  a=0.0
  CALL CPU_TIME (t2)
  PRINT*, 'a                               ',t2-t1
!
  CALL CPU_TIME (t1)
  FORALL (j=1:m,i=1:n)
     a(j,i)=0.0
  END FORALL
  CALL CPU_TIME (t2)
  PRINT*, 'One FORALL statement            ',t2-t1
!
  CALL CPU_TIME (t1)
  FORALL (i=1:n,j=1:m)
     a(j,i)=0.0
  END FORALL
  CALL CPU_TIME (t2)
  PRINT*, 'One FORALL statement, reversed  ',t2-t1
!
  CALL CPU_TIME (t1)
  FORALL (j=1:m)
     FORALL (i=1:n)
        a(j,i)=0.0
     END FORALL
  END FORALL
  CALL CPU_TIME (t2)
  PRINT*, 'FORALL loops in incorrect order ',t2-t1
!
  CALL CPU_TIME (t1)
  FORALL (i=1:n)
     FORALL (j=1:m)
        a(j,i)=0.0
     END FORALL
  END FORALL
  CALL CPU_TIME (t2)
  PRINT*, 'FORALL loops in correct order   ',t2-t1
END PROGRAM t

PaulLaidler

Posts: 7981 Salford, UK

27 Aug 2013 5:53 #12923

No work has been done on FTN95 to make FORALL more efficient.

If you want to see what FTN95 does with a given FORALL statement then you can look at the output given using /EXPLIST. You don't need to know assembler coding to get an understanding of the ordering.

As a general rule you will get a faster run time when using /opt whilst /check will be slow and is intended for development and testing only.

LitusSaxonicum

Posts: 2284 Yateley, Hants, UK

27 Aug 2013 10:42 #12929

Re:

the various FORALL constructs don't seem to perform very well

I ran it on a quad core Phenom II system with Windows 7 64 bit, with the following results:

                                 Raw timings (s)  Ratio to correct order DO
                                 Base    Opt      Base     Opt
 DO loops in correct order       0.20280 0.07800  1.00000  1.00000
 DO loops in incorrect order     1.54441 1.52881  7.61540 19.60000
 a(1:m,1:n)                      0.15600 0.04680  0.76923  0.60000
 a(:,:)                          0.04680 0.03120  0.23077  0.40000
 a                               0.03120 0.04680  0.15385  0.60000
 One FORALL statement            1.54441 1.49761  7.61540 19.20001
 One FORALL statement, reversed  0.15600 0.04680  0.76923  0.60000
 FORALL loops in incorrect order 1.54441 1.51321  7.61540 19.40000
 FORALL loops in correct order   0.17160 0.04680  0.84615  0.60000

Using OPT or not, the correct order DO loops are bettered by 2 of the FORALL constructs, and also by various array operations, although without OPT the overall winner is different, and the relative timings are more spread out. The raw timings improved with OPT for a(:,:) but worsened for just plain a - which is probably worth Paul taking a look at.

I think the most striking point is that there is a certain amount you can do to make your code run faster, but if you get it wrong, you can slow it down dramatically!
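For anyone wondering why one loop order is "correct": Fortran stores arrays column by column, so the first subscript should vary fastest if memory is to be walked contiguously. A small sketch that shows the storage order (the program name `column_major` and the sizes are made up for illustration):

```fortran
program column_major
  implicit none
  integer, parameter :: m = 3, n = 2
  integer :: a(m,n), i, j

  ! RESHAPE fills the array in storage order, so the value stored in
  ! each element is its position in memory (1 = first, 6 = last)
  a = reshape((/ 1, 2, 3, 4, 5, 6 /), (/ m, n /))

  ! Second subscript in the outer loop, first subscript in the inner loop:
  ! this visits the elements in the order they are stored
  do i = 1, n
    do j = 1, m
      print *, 'a(', j, ',', i, ') is stored at position', a(j,i), &
               ' formula (i-1)*m + j gives', (i-1)*m + j
    end do
  end do
end program column_major
```

The two numbers printed on each line agree, and the positions come out as 1 to 6 in sequence. Swapping the two DO statements visits the same elements with a stride of m between consecutive accesses.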

Eddie

simon

Posts: 314

27 Aug 2013 1:50 #12931

Based on Eddie's helpful summary of timings, it seems to me that FTN95 works reasonably well under certain circumstances. However, it is worth noting that whereas

FORALL (j=1:m,i=1:n)

works more slowly than

FORALL (i=1:n,j=1:m)

with FTN95, I have seen the opposite on other compilers.

I have also compared the following syntaxes:

FORALL (i=1:n) a(j,i)=0.0

and

FORALL (i=1:n)
   a(j,i)=0.0
END FORALL

In some cases the former works faster than the latter, but in others the opposite is the case. However, the differences in timing are typically small.

Based on these results (and a few others not shown), I rather hesitantly conclude (for now) that it makes sense to implement FORALL, but with the following qualifications:

Use array operations of the form a= or a(:,:)= where possible;
It is worth implementing FORALL statements in place of DO where there is only one loop;
Where there are multiple loops, FORALL statements should be nested explicitly in the appropriate order, i.e.:

  FORALL (i=1:n)
     FORALL (j=1:m)
        a(j,i)=0.0
     END FORALL
  END FORALL

in preference to

  FORALL (j=1:m,i=1:n)
     a(j,i)=0.0
  END FORALL

and in preference to

  FORALL (i=1:n, j=1:m)
     a(j,i)=0.0
  END FORALL

LitusSaxonicum

Posts: 2284 Yateley, Hants, UK

27 Aug 2013 4:27 #12938

Simon,

I'll go further. OPT sometimes seems to make me trip over, and as an old dog, FORALL is too new a trick. It needs nibbling at!

However, as a non-OPT user, I can just about get round to using the whole array name = constant instead of nested DO loops if I want to zero all of it, but if say, a is dimensioned 100,100, and one only wants to zero 10x10, the nested DOs are much quicker. Nested DOs seem to me to be more straightforward than FORALL if one wants to use the loop variable itself inside the loop, and the provision for that is probably what slows down the single DO.
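For the 10x10 corner of a 100x100 array, the two spellings look like this. It is only a sketch (the program name `zero_block` is made up) and it makes no claim about which is faster on any given compiler; it just shows that an array section touches the same 100 elements as the nested DOs.

```fortran
program zero_block
  implicit none
  real    :: a(100,100)
  integer :: i, j

  ! Nested DOs, first subscript in the inner loop
  a = 1.0
  do i = 1, 10
    do j = 1, 10
      a(j,i) = 0.0
    end do
  end do
  print *, 'Zeroed by nested DOs    :', count(a == 0.0)   ! 100

  ! The same block written as an array section
  a = 1.0
  a(1:10,1:10) = 0.0
  print *, 'Zeroed by array section :', count(a == 0.0)   ! 100
end program zero_block
```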

Getting the right subscript order was recommended in Kreitzberg & Shneiderman's 'The Elements of FORTRAN Style' in 1972 - plus ça change.

Eddie

(As evidence of the old-doggedness, I can't make head nor tail of even installing some other compilers, let alone using them!).