|
forums.silverfrost.com Welcome to the Silverfrost forums
|
Author |
Message |
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7938 Location: Salford, UK
|
Posted: Tue Jul 03, 2018 11:00 am Post subject: |
|
|
Klaus
There has been no attempt to "fix" FTN95 with respect to the initial program that you posted on this thread.
FTN95 does not process the following line as you would like...
Code: | c = spread(a,dim=2,ncopies=n) * spread(b,dim=1,ncopies=m ) |
If you want to multiply two matrices together then FTN95 expects you to call MATMUL.
Code: | c = MATMUL(spread(a,dim=2,ncopies=n), spread(b,dim=1,ncopies=m)) |
|
|
|
|
KL
Joined: 16 Nov 2009 Posts: 144
|
Posted: Tue Jul 03, 2018 12:15 pm Post subject: |
|
|
Thank you very much, Paul. I misunderstood what was meant by "the problem had been fixed".
Your proposal works well for m = n (if divided by m). But it is not the fastest method: the method mentioned also in this thread (to eliminate the inner do loop) is faster by a factor of 2-3. However, for m /= n the two array shapes are non-conformant.
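The relationship between the two forms can be checked with a minimal sketch. The program name, the sizes m = n = 3 and the test values are hypothetical and chosen only for illustration. With m = n, each element of the MATMUL result is a sum of n identical terms a(i)*b(j), so dividing by n (equal to m here) recovers the elementwise product; with m /= n the two arguments to MATMUL would not be conformable.

```fortran
program spread_vs_matmul
  ! Illustrative sketch only: m, n and the test data are hypothetical.
  implicit none
  integer, parameter :: m = 3, n = 3   ! MATMUL of the two spreads needs m == n
  real :: a(m), b(n), c_elem(m,n), c_mat(m,n)
  integer :: i

  a = (/ (real(i),    i = 1, m) /)
  b = (/ (real(10*i), i = 1, n) /)

  ! Elementwise product of two (m,n) arrays: c_elem(i,j) = a(i)*b(j)
  c_elem = spread(a, dim=2, ncopies=n) * spread(b, dim=1, ncopies=m)

  ! Matrix product of the same two arrays: the sum over k of a(i)*b(j)
  ! gives n*a(i)*b(j), hence the division by n
  c_mat = matmul(spread(a, dim=2, ncopies=n), &
                 spread(b, dim=1, ncopies=m)) / real(n)

  if (maxval(abs(c_elem - c_mat)) > 1.0e-4) stop 'mismatch'
  write (*,'(3F8.1)') (c_elem(i,:), i = 1, m)
end program spread_vs_matmul
```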
I have rerun the case with Code::Blocks and the GNU compiler. With this compiler, the "spread solution" is by far the fastest method. No idea why, but obviously both intrinsic spread functions (ftn95/GNU) differ in their conception. As mentioned earlier, I have no insight to get any further.
Klaus |
|
|
|
John-Silver
Joined: 30 Jul 2013 Posts: 1520 Location: Aerospace Valley
|
Posted: Wed Jul 04, 2018 12:10 pm Post subject: |
|
|
The ftn.enh file would confirm that it was SPREAD that was fixed, but it isn't included in this beta279 release; only clrwin.enh is.
Maybe in future it would be good to include all relevant .enh files.
_________________
''Computers (HAL and MARVIN excepted) are incredibly rigid. They question nothing. Especially input data. Human beings are incredibly trusting of computers and don't check input data. Together cocking up even the simplest calculation ... " |
|
|
|
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7938 Location: Salford, UK
|
Posted: Wed Jul 04, 2018 1:09 pm Post subject: |
|
|
There is/was no evidence of a fault in SPREAD.
So SPREAD has not been fixed. |
|
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2560 Location: Sydney
|
Posted: Thu Jul 05, 2018 4:51 am Post subject: |
|
|
Paul,
Run mecej4's test program using FTN95 and gFortran. The following adaptation gives an indication that gFortran is running!
I ran it in PLATO, selecting Release Win32 and then Release x64, with Tools > Options > "Use gFortran/gcc for x64".
Code: | program v2m
  ! construct matrix A(n,n) with elements A(i,j) = i for i = 1:n, j=1:n
  implicit none
  integer :: i,k,n
  real, allocatable :: v(:),A(:,:)
  integer*8 :: t1,t2,rate
  real*4 :: sec
  !
  n = 25
  do k=1,4
    allocate(v(n),A(n,n))
    v = (/ (i, i=1,n) /)
    call system_clock (t1,rate)
    A = spread(v, DIM=2, NCOPIES = n)
    call system_clock (t2,rate)
    deallocate(v,A)
    sec = real(t2-t1)/real(rate) ; write (*,*) sec
    write(*,'(I4,2x,F8.4,1x,A)')n,sec,'s'
    n=n*2
  end do
  !
end program |
There may not be a fault, but there is definitely a performance problem. |
|
|
|
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7938 Location: Salford, UK
|
Posted: Thu Jul 05, 2018 6:30 am Post subject: |
|
|
John
There was a fault in FTN95 relating to the way it called SPREAD in certain contexts and this was demonstrated in Mecej4's sample program. This has been fixed in the latest beta download. If there is still a performance hit compared with gFortran then it should be negligible. |
|
|
|
mecej4
Joined: 31 Oct 2006 Posts: 1895
|
Posted: Thu Jul 05, 2018 12:38 pm Post subject: Re: |
|
|
PaulLaidler wrote: | John
There was a fault in FTN95 relating to the way it called SPREAD in certain contexts and this was demonstrated in Mecej4's sample program. This has been fixed in the latest beta download. If there is still a performance hit compared with gFortran then it should be negligible. |
Paul, I agree that there is no justification for blaming SPREAD itself. The problem is that FTN95 compiles some Fortran expressions containing SPREAD in such a way that the resulting program is extremely inefficient and slow, because a large number of calls to SPREAD are made where just a single call would suffice.
Let us note at the outset that no matrix multiplication is involved in the following (or in the example codes that were posted earlier). My v2m example was constructed to show the existence of the inefficiency in the simplest way that I could think of. The v8.30.279 beta release fixes that.
Unfortunately, the inefficiency is still a major problem if the expression containing SPREAD involves anything beyond a simple reference to SPREAD. Klaus and John C. provided example codes where the expression was the product of two references to SPREAD. Below I give timings from an adaptation of John's example with that expression split into two statements. In place of
Code: | c = spread ( a,dim=2,ncopies=n ) * spread ( b,dim=1,ncopies=m ) |
write
Code: | c = spread ( a,dim=2,ncopies=n )
c = c * spread ( b,dim=1,ncopies=m ) ! This is NOT a matrix multiply operation |
Examination of the generated code using /64 /opt /explist shows the problem clearly:
(i) Calculation of each element of the result matrix C involves the MULSD instruction, which occurs only once in the listing;
(ii) The MULSD is in a loop, and a call to SPREAD is located in the same loop. As a result, calculating the final result C involves making m X n calls to SPREAD, which is extremely expensive.
(iii) A single call to SPREAD should suffice for the second Fortran statement above, since C has already been allocated and initialised in the Fortran statement preceding it.
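As a minimal sketch of what point (iii) implies, the same result can be formed with SPREAD evaluated once into a named temporary, or with no SPREAD at all by looping over columns. The program name, the array names (sb, c_temp, c_loop), the sizes and the data are hypothetical. The thread does not establish whether either variant avoids the repeated calls under FTN95, so that part is an assumption to be checked with /explist and timings.

```fortran
program outer_product_once
  ! Illustrative sketch only: m, n and the test data are hypothetical.
  implicit none
  integer, parameter :: m = 4, n = 3
  double precision :: a(m), b(n), sb(m,n)
  double precision :: c_spread(m,n), c_temp(m,n), c_loop(m,n)
  integer :: i, j

  a = (/ (dble(i),   i = 1, m) /)
  b = (/ (dble(2*j), j = 1, n) /)

  ! Form from the thread: SPREAD used inside an array expression
  c_spread = spread(a, dim=2, ncopies=n)
  c_spread = c_spread * spread(b, dim=1, ncopies=m)

  ! Variant 1: evaluate SPREAD once into a named temporary, then multiply
  sb     = spread(b, dim=1, ncopies=m)
  c_temp = spread(a, dim=2, ncopies=n)
  c_temp = c_temp * sb

  ! Variant 2: one explicit loop over columns, no SPREAD at all
  do j = 1, n
     c_loop(:,j) = a * b(j)
  end do

  ! The test data are small integers, so exact comparison is safe here
  if (any(c_temp /= c_spread) .or. any(c_loop /= c_spread)) stop 'mismatch'
  write (*,*) 'all three forms agree'
end program outer_product_once
```

All three forms compute c(i,j) = a(i)*b(j); none of them is a matrix multiplication.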
Here are the timing results (2.1 GHz Intel T4300, Windows 10 X64, FTN95 8.30.279):
Code: | System_clock rate = 10000
   n     time (s)
  10     0.004900
  50     0.047500
 100     0.697100
 200    11.048800 |
And, from Gfortran 7.3 with -O2:
Code: | System_clock rate = 1000000000
   n     time (s)
  10     0.001192
  50     0.000013
 100     0.000057
 200     0.000585 |
|
|
|
|
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7938 Location: Salford, UK
|
Posted: Thu Jul 05, 2018 5:58 pm Post subject: |
|
|
mecej4
My understanding of this issue differs from yours.
I will take another look at it when I get a moment. |
|
|
|
|
|
|