forums.silverfrost.com Forum Index forums.silverfrost.com
Welcome to the Silverfrost forums
forall index variables
davidb



Joined: 17 Jul 2009
Posts: 560
Location: UK

Posted: Fri Jan 18, 2013 10:18 pm

LitusSaxonicum wrote:
Quote:
(It was different in Fortran 66)

... the first Fortran-77 compiler I used was on a VAX
Eddie


SNAP.

But I didn't learn Fortran 77 properly until I had it installed on an Acorn Archimedes and then an Acorn Cambridge Workstation (I miss those days).
_________________
Programmer in: Fortran 77/95/2003/2008, C, C++ (& OpenMP), java, Python, Perl
JohnCampbell



Joined: 16 Feb 2006
Posts: 2623
Location: Sydney

Posted: Sun Jan 20, 2013 1:48 pm

I've been away, so it is interesting to read what has been discussed about FORALL.
There are two sides to this:
One is that David is right: FTN95 is at fault for not identifying the error.
The other is that I would disagree with David, when he states:
Quote:
It doesn't help thinking about FORALL as a loop.
...
It's not even close to being a loop.

While the syntax is not a do loop, the compiler implements a do loop equivalent.
David's example where the loop should be run backwards identifies that:
To get the intended result, a DO loop should be run backwards.
For a FORALL, the compiler must take a copy of the array, then act on this old array to produce the new array. I would consider that inefficient, similar to the temporary copy made for array sections of multi-dimensional arrays.
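
To make the copy-versus-backwards-loop point concrete, here is a minimal sketch. It is not David's original example from earlier in the thread; the program name, array names and the shift-by-one operation are hypothetical, chosen only to show that FORALL uses the old values on the right-hand side, that a forward DO loop does not, and that a backward DO loop gives the FORALL result without a temporary.

```fortran
PROGRAM forall_shift
  ! Hypothetical illustration: shift the array contents up by one element.
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 5
  INTEGER :: a(n), b(n), c(n), i

  a = (/ (i, i = 1, n) /)   ! 1 2 3 4 5
  b = a
  c = a

  ! FORALL evaluates every right-hand side before any assignment is made,
  ! so each element receives the OLD value of its lower neighbour.
  FORALL (i = 1:n-1) a(i+1) = a(i)

  ! A forward DO loop overwrites values it still needs, so b(1) propagates.
  DO i = 1, n-1
     b(i+1) = b(i)
  END DO

  ! A backward DO loop reads each element before it is overwritten,
  ! which reproduces the FORALL result without a temporary copy.
  DO i = n-1, 1, -1
     c(i+1) = c(i)
  END DO

  PRINT *, 'FORALL      ', a   ! 1 1 2 3 4
  PRINT *, 'DO forward  ', b   ! 1 1 1 1 1
  PRINT *, 'DO backward ', c   ! 1 1 2 3 4
END PROGRAM forall_shift
```

Whether a given compiler actually makes a temporary for the FORALL line, or spots that a backward loop will do, is an implementation detail that this sketch does not measure.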

My impression is that FORALL is not an efficient construct, contrary to its original purpose of identifying parallel computation.

I'm old-school and only ever use it in example coding.
David's identification of errors in FORALL implementation is also helpful for those who choose to use it.

John
davidb



Joined: 17 Jul 2009
Posts: 560
Location: UK

Posted: Sun Jan 20, 2013 2:56 pm

JohnCampbell wrote:

While the syntax is not a do loop, the compiler implements a do loop equivalent.
David's example where the loop should be run backwards identifies that:
To get the intended result, a DO loop should be run backwards.
For a FORALL, the compiler must take a copy of the array, then act on this old array to produce the new array. I would consider that inefficient, similar to the temporary copy made for array sections of multi-dimensional arrays.


It depends on the compiler and the hardware.

Certainly Silverfrost's FTN95 implements the FORALL by making a temporary copy when necessary, and in such cases it isn't very efficient.

Other compilers with better optimization will be able to convert the FORALL to a DO loop which doesn't need the copy operation (like in the example where the DO loop goes backwards).

It is fair to say that most compilers are not very good at implementing FORALL efficiently and, in most cases, a well-crafted DO loop will outperform FORALL. On the other hand, there are a small number of compilers (e.g. Intel, Portland) which do a good job of optimizing FORALL and produce code which is on par with or better than the equivalent DO loop -- performance always depends on the underlying vectorisation hardware and whether the compiler can take advantage of it.

Now if Silverfrost decided to offer limited support in FTN95 for SSE and AVX instructions when FORALL is used, there may be a cause to switch, but as of now, you should stick to DO loops in your applications if efficiency is paramount.
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 8283
Location: Salford, UK

Posted: Wed Mar 27, 2013 6:22 pm

I have fixed the bug so that FORALL can now use the same index as an outer DO construct.

This fix will be included in the next release (after 6.35).
davidb



Joined: 17 Jul 2009
Posts: 560
Location: UK

Posted: Wed Mar 27, 2013 6:44 pm

Thank you Paul!
simon



Joined: 05 Jul 2006
Posts: 308

Posted: Mon Aug 26, 2013 6:08 pm

Paul - can you confirm whether FORALL is running any faster than it used to? As David suggests, on most compilers it does not necessarily improve efficiency.

If I run the following program, the various FORALL constructs don't seem to perform very well. I have tried compiling using the following options and the first FORALL seems to perform consistently badly, and always at least as badly as having nested FORALL statements.

/lgo
/lgo /optimise
/lgo /check

Somewhat intriguingly (to me anyway), if a single FORALL statement is used, it seems to make an important difference which index is listed first, and some compilers seem to prefer one ordering whereas others prefer the opposite. So I've avoided using FORALL, simply because it actually seems to make the program run slower! I can't even run these types of tests to see under what conditions FORALL does work faster because it is quite likely to go slower under a different compiler.

Code:
PROGRAM t
!
! Compile this program using the following options, and compare times
! FTN95 t.f95 /lgo
! FTN95 t.f95 /lgo /optimise
! FTN95 t.f95 /lgo /check
!
  IMPLICIT NONE
  INTEGER, PARAMETER :: m=9000,n=5000
  INTEGER :: i,j      ! loop and FORALL indices
  REAL :: t1,t2       ! start and end CPU times
  REAL :: a(m,n)
!
  CALL CPU_TIME (t1)
  DO i=1,n
     DO j=1,m
        a(j,i)=0.0
     END DO
  END DO
  CALL CPU_TIME (t2)
  PRINT*, 'DO loops in correct order       ',t2-t1
!
  CALL CPU_TIME (t1)
  DO j=1,m
     DO i=1,n
        a(j,i)=0.0
     END DO
  END DO
  CALL CPU_TIME (t2)
  PRINT*, 'DO loops in incorrect order     ',t2-t1
!
  CALL CPU_TIME (t1)
  a(1:m,1:n)=0.0
  CALL CPU_TIME (t2)
  PRINT*, 'a(1:m,1:n)                      ',t2-t1
!
  CALL CPU_TIME (t1)
  a(:,:)=0.0
  CALL CPU_TIME (t2)
  PRINT*, 'a(:,:)                          ',t2-t1
!
  CALL CPU_TIME (t1)
  a=0.0
  CALL CPU_TIME (t2)
  PRINT*, 'a                               ',t2-t1
!
  CALL CPU_TIME (t1)
  FORALL (j=1:m,i=1:n)
     a(j,i)=0.0
  END FORALL
  CALL CPU_TIME (t2)
  PRINT*, 'One FORALL statement            ',t2-t1
!
  CALL CPU_TIME (t1)
  FORALL (i=1:n,j=1:m)
     a(j,i)=0.0
  END FORALL
  CALL CPU_TIME (t2)
  PRINT*, 'One FORALL statement, reversed  ',t2-t1
!
  CALL CPU_TIME (t1)
  FORALL (j=1:m)
     FORALL (i=1:n)
        a(j,i)=0.0
     END FORALL
  END FORALL
  CALL CPU_TIME (t2)
  PRINT*, 'FORALL loops in incorrect order ',t2-t1
!
  CALL CPU_TIME (t1)
  FORALL (i=1:n)
     FORALL (j=1:m)
        a(j,i)=0.0
     END FORALL
  END FORALL
  CALL CPU_TIME (t2)
  PRINT*, 'FORALL loops in correct order   ',t2-t1
END PROGRAM t
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 8283
Location: Salford, UK

Posted: Tue Aug 27, 2013 6:53 am

No work has been done on FTN95 to make FORALL more efficient.

If you want to see what FTN95 does with a given FORALL statement then you can look at the output given using /EXPLIST. You don't need to know assembler coding to get an understanding of the ordering.

As a general rule you will get a faster run time when using /opt whilst /check will be slow and is intended for development and testing only.
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2428
Location: Yateley, Hants, UK

Posted: Tue Aug 27, 2013 11:42 am

Re:

Quote:
the various FORALL constructs don't seem to perform very well


I ran it on a quad core Phenom II system with Windows 7 64 bit, with the following results:

Code:
                                  Raw timings       Ratio to correct order DO
                                  Base     Opt      Base      Opt
 DO loops in correct order        0.20280  0.07800  1.00000   1.00000
 DO loops in incorrect order      1.54441  1.52881  7.61540  19.60000
 a(1:m,1:n)                       0.15600  0.04680  0.76923   0.60000
 a(:,:)                           0.04680  0.03120  0.23077   0.40000
 a                                0.03120  0.04680  0.15385   0.60000
 One FORALL statement             1.54441  1.49761  7.61540  19.20001
 One FORALL statement, reversed   0.15600  0.04680  0.76923   0.60000
 FORALL loops in incorrect order  1.54441  1.51321  7.61540  19.40000
 FORALL loops in correct order    0.17160  0.04680  0.84615   0.60000


Using OPT or not, the correct order DO loops are bettered by 2 of the FORALL constructs, and also by the various array operations, although without OPT the overall winner is different and the relative timings are more spread out. The raw timings improved with OPT for a(:,:) but worsened for just plain a - which is probably worth Paul taking a look at.

I think the most striking point is that there is a certain amount you can do to make your code run faster, but if you get it wrong, you can slow it down dramatically!
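
The "correct order" in these timings follows from Fortran storing arrays with the first subscript varying fastest. A minimal sketch of that, using a small hypothetical 2x3 array (the program and variable names are illustrative and not part of the benchmark):

```fortran
PROGRAM storage_order
  ! Hypothetical illustration of array element (storage) order.
  IMPLICIT NONE
  INTEGER, PARAMETER :: m = 2, n = 3
  INTEGER :: a(m,n), flat(m*n), i, j

  ! Label each element with its subscripts: a(j,i) = 10*j + i
  DO i = 1, n
     DO j = 1, m
        a(j,i) = 10*j + i
     END DO
  END DO

  ! RESHAPE to rank 1 lists the elements in array element order.
  flat = RESHAPE(a, (/ m*n /))

  ! Values appear as 11 21 12 22 13 23: the first subscript changes fastest,
  ! so a(1,1) and a(2,1) are neighbours in memory, a(1,1) and a(1,2) are not.
  PRINT *, flat
END PROGRAM storage_order
```

With the first subscript in the innermost loop, successive iterations touch adjacent memory, which is the likely reason the "incorrect order" rows in the table are so much slower for a 9000x5000 array.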

Eddie
simon



Joined: 05 Jul 2006
Posts: 308

Posted: Tue Aug 27, 2013 2:50 pm

Based on Eddie's helpful summary of timings, it seems to me that FTN95 works reasonably well under certain circumstances. However, it is worth noting that whereas

Code:
FORALL (i=1:n,j=1:m)

works faster than
Code:
FORALL (j=1:m,i=1:n)

with FTN95, I have seen the opposite on other compilers.

I have also compared the following syntaxes:

Code:
FORALL (i=1:n) a(j,i)=0.0


and

Code:
FORALL (i=1:n)
   a(j,i)=0.0
END FORALL


In some cases the former works faster than the latter, but in others the opposite is the case. However, the differences in timing are typically small.

Based on these results (and a few others not shown), I rather hesitantly conclude (for now) that it makes sense to implement FORALL, but with the following qualifications:
1. Use array operations of the form a= or a(:,:)= where possible;
2. It is worth implementing FORALL statements in place of DO where there is only one loop;
3. Where there are multiple loops, FORALL statements should be nested explicitly in the appropriate order, i.e.:
Code:
  FORALL (i=1:n)
     FORALL (j=1:m)
        a(j,i)=0.0
     END FORALL
  END FORALL

in preference to
Code:
  FORALL (j=1:m,i=1:n)
     a(j,i)=0.0
  END FORALL

and in preference to
Code:
  FORALL (i=1:n, j=1:m)
     a(j,i)=0.0
  END FORALL
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2428
Location: Yateley, Hants, UK

Posted: Tue Aug 27, 2013 5:27 pm

Simon,

I'll go further. OPT sometimes seems to make me trip over, and as an old dog, FORALL is too new a trick. It needs nibbling at!

However, as a non-OPT user, I can just about get round to using the whole array name = constant instead of nested DO loops if I want to zero all of it, but if, say, a is dimensioned 100,100 and one only wants to zero 10x10, the nested DOs are much quicker. Nested DOs seem to me to be more straightforward than FORALL if one wants to use the loop variable itself inside the loop, and the provision for that is probably what slows down the single DO.
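
For the 10x10 corner of a 100x100 array, an array section is a third option alongside the whole-array assignment and the nested DOs. The sketch below only checks that the two forms touch the same elements; the names nd, nb and b are hypothetical, and it makes no claim about which form is faster under FTN95 or any other compiler.

```fortran
PROGRAM zero_block
  ! Hypothetical illustration: zero only a 10x10 corner of a 100x100 array.
  IMPLICIT NONE
  INTEGER, PARAMETER :: nd = 100, nb = 10
  REAL :: a(nd,nd), b(nd,nd)
  INTEGER :: i, j

  a = 1.0
  b = 1.0

  ! Nested DO loops with the first subscript innermost.
  DO i = 1, nb
     DO j = 1, nb
        a(j,i) = 0.0
     END DO
  END DO

  ! Array section assignment over the same 10x10 block.
  b(1:nb,1:nb) = 0.0

  PRINT *, 'Same result:  ', ALL(a == b)       ! T
  PRINT *, 'Zeroed count: ', COUNT(b == 0.0)   ! 100
END PROGRAM zero_block
```

Timing the two forms with CPU_TIME, as in Simon's program, would be the way to see which one a particular compiler handles better.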

Getting the right subscript order was recommended in Kreitzberg & Shneiderman's "The Elements of FORTRAN Style" in 1972 - plus ça change.

Eddie

(As evidence of the old-doggedness, I can't make head nor tail of even installing some other compilers, let alone using them!).