|
forums.silverfrost.com Welcome to the Silverfrost forums
|
View previous topic :: View next topic |
Author |
Message |
KennyT
Joined: 02 Aug 2005 Posts: 317
|
Posted: Mon Nov 27, 2017 1:18 pm Post subject: syntax problem in v8.20? |
|
|
the following code segment crashes (floating stack overflow) at the RMSN= line:
Code: | REAL, ALLOCATABLE :: RA(:), DIFF(:)
ALLOCATE (RA(NE), DIFF(NE), stat=IST); IF( IST.GT.0 ) RETURN
DO I = 1, NE
DIFF(I) = SA(I+1) - SA(I)
END DO
WHERE (ABS(DIFF).GT.1.E15 ) DIFF = 0.
AV = SUM(DIFF)/NE
RA(1:NE) = DIFF(1:NE)- AV
RMSN = SQRT(SUM((ra)*(ra) )/FLOAT(2*NE))
|
but this version doesn't crash:
Code: | REAL, ALLOCATABLE :: RA(:), DIFF(:), r1(:)
ALLOCATE (RA(NE), DIFF(NE), r1(ne), stat=IST); IF( IST.GT.0 ) RETURN
DO I = 1, NE
DIFF(I) = SA(I+1) - SA(I)
END DO
WHERE (ABS(DIFF).GT.1.E15 ) DIFF = 0.
AV = SUM(DIFF)/NE
RA(1:NE) = DIFF(1:NE)- AV
r1(1:ne) = ra(1:ne)*ra(1:ne)
RMSN = SQRT(SUM(r1)/FLOAT(2*NE))
|
is the first version now wrong? - i think it was OK in v8.01...
K |
|
Back to top |
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2556 Location: Sydney
|
Posted: Mon Nov 27, 2017 1:46 pm Post subject: |
|
|
I think there is a problem with a possible temporary array for "SUM((ra)*(ra) )", but there are a lot of questions about this.
Are you using /64 or not ?
Did it work before ?
Would "SUM((ra)*(ra) )" require a temporary array, or should you accumulate the sum in a DO loop, without requiring a temporary array ?
I note that as of ver 8.20 : 64 bit FTN95 now stores large static arrays in the same way as automatic arrays. What would happen to a temporary array ?
Is SUM((ra)*(ra) ) different from SUM(ra*ra) ? Why not use the following. I would try to avoid a temporary array implied by (ra)*(ra)
Code: | AV = SUM(DIFF)/NE
RMSN = 0
DO I = 1, NE
RMSN = RMSN + (DIFF(I) - AV) ** 2
END DO
RMSN = SQRT( RMSN/FLOAT(2*NE) )
|
Also, why don't temporary arrays go on the heap ?
Why wouldn't it be better to eliminate all temporary arrays and have: Code: | sx = 0
sxx = 0
DO I = 1, NE
x = SA(I+1) - SA(I)
if ( abs(x) > 1.E15 ) cycle
sx = sx + x
sxx = sxx + x*x
END DO
AV = sx/NE
RMSN = SQRT ( ( sxx/NE - av*av) / 2 ) |
Last edited by JohnCampbell on Mon Nov 27, 2017 2:06 pm; edited 1 time in total |
|
Back to top |
|
|
KennyT
Joined: 02 Aug 2005 Posts: 317
|
Posted: Mon Nov 27, 2017 2:02 pm Post subject: |
|
|
Thanks John,
no, this is 32-bit code
i think the original version was coded for speed at runtime. the routine gets called a lot!
pretty sure it used to work in v8.01.
K |
|
Back to top |
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2556 Location: Sydney
|
Posted: Mon Nov 27, 2017 2:12 pm Post subject: |
|
|
The use of " if ( abs(x) > 1.E15 ) cycle" may restrict optimisation, but avoiding all the temporary arrays would improve cache usage.
In general, do you find the use of SUM ( array*array ) an efficient approach ? It would not be my choice, but worth considering.
DOT_PRODUCT is an alternative, but is not typically optimised.
John |
|
Back to top |
|
|
KennyT
Joined: 02 Aug 2005 Posts: 317
|
Posted: Mon Nov 27, 2017 3:24 pm Post subject: |
|
|
Is SUM((ra)*(ra) ) different from SUM(ra*ra) ?
YES. it seems that removing the extra brackets around the RA arrays fixes it!
just have to go and look for other occasions where we've used that syntax elsewhere...
K |
|
Back to top |
|
|
KennyT
Joined: 02 Aug 2005 Posts: 317
|
Posted: Mon Nov 27, 2017 5:09 pm Post subject: |
|
|
Paul, if you need a test program...
Code: |
!ftn95$free
program test
real, allocatable :: ra(:)
allocate(ra(10))
do i=1,10
ra(I)=float(I)*1.75
end do
rmsn = sqrt(sum((ra)*(ra))/20.0)
end |
interestingly, this doesn't crash....
Code: | !ftn95$free
program test
real, allocatable :: ra(:)
allocate(ra(10))
do i=1,10
ra(I)=float(I)*1.75
end do
rmsn = sqrt(sum((1.0*ra)*(1.0*ra))/20.0)
end
|
K |
|
Back to top |
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2556 Location: Sydney
|
Posted: Tue Nov 28, 2017 9:16 am Post subject: |
|
|
Ken,
Try this example on either 32-bit or 64-bit.
(I have configured to use /fpp, see noteson64bitftn95.txt)
Code: | !ftn95$free
program test
real, allocatable :: ra(:)
real :: rmsn, rm2
integer :: num, i ,k,kk
integer*8 :: t1,t2,t3, t4
do kk = 1,2 ! 2 pass for less unstable timer
do k = 1,5
num = 10**k
call rdtsc (t1)
allocate(ra(num))
do i=1,num
ra(I)=float(I)*1.75
end do
call rdtsc (t2)
! rmsn = sqrt(sum((ra)*(ra)) /2./num) ! fails
! rmsn = sqrt(sum((1.0*ra)*(1.0*ra)) /2./num) ! works
rmsn = sqrt(sum(ra*ra) /2./num) ! works
call rdtsc (t3)
rm2 = sqrt( dot_product ( ra, ra ) /2./num) ! works
call rdtsc (t4)
write (*,*) 'num=',num, ' rmsn =',rmsn, ' expecting',rm2
write (*,*) t2-t1, t3-t2, t4-t3
deallocate (ra)
end do
end do
end
subroutine rdtsc (tick)
integer*8 tick
! real*10, external :: cpu_clock@ 32-bit
! integer*8, external :: RDTSC_VAL@ 64-bit
CIF(_WIN64)
tick = RDTSC_VAL@ ()
CELSE
tick = cpu_clock@ ()
CENDIF
end subroutine rdtsc |
I'd expect that dot_product would be the best option, as no temporary arrays could be implied. /64 is much better.
The use of the timer was a bit unstable for low num.
I also looked at expanding out sum ( (SA(I+1) - SA(I))^2 ) but did not find a better solution.
To remove the IF from calc lops, could you use:
Code: | N = 1
DO I = 1, NE
DIFF(N) = SA(I+1) - SA(I)
IF (ABS(DIFF(N)).LT.1.E15 ) N = N+1
END DO
N = N-1
AV = SUM(DIFF(1:N))/N
DIFF(1:N) = DIFF(1:N)-AV |
Last edited by JohnCampbell on Tue Nov 28, 2017 9:42 am; edited 1 time in total |
|
Back to top |
|
|
KennyT
Joined: 02 Aug 2005 Posts: 317
|
Posted: Tue Nov 28, 2017 9:40 am Post subject: |
|
|
Thanks again John.
interestingly, I further modified the program to do a variable number of "kk" loops and found that, for 5 loops, "dot product" was slightly faster but for 100 loops, "sum" was faster (which probably means they are actually identical for all practical purposes!)
K |
|
Back to top |
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2556 Location: Sydney
|
Posted: Tue Nov 28, 2017 9:45 am Post subject: |
|
|
Could that imply that for /32, a temporary array is not being generated ?
I always try to remove possible stack overflow constructs.
/64 is different at this time. |
|
Back to top |
|
|
mecej4
Joined: 31 Oct 2006 Posts: 1888
|
Posted: Tue Nov 28, 2017 10:11 am Post subject: Re: |
|
|
JohnCampbell wrote: | I always try to remove possible stack overflow constructs. |
Please note that the initial post of this thread stated that a floating point stack overflow occurred. That is, the generated code probably attempted an FLD x87 instruction at an instant when the registers ST0 to ST7 were already occupied. The compiler should have kept track of how many x87 registers were used up, and should have used the CPU stack space as scratchpad space if necessary to relieve the pressure on the FPU register stack.
I have seen other instances where 32-bit FTN95 had similar problems. Changing the CPU stack size (by using linker options, etc.) will have no effect on this problem, so it is important not to confuse FPU stack overflow, which applies only to the x87 segment of the processor, with the more common stack overflow in main memory.
Earlier threads relating to FPU stack overflow:
http://forums.silverfrost.com/viewtopic.php?t=940
http://forums.silverfrost.com/viewtopic.php?t=718
http://forums.silverfrost.com/viewtopic.php?t=730
A critical article by an expert (Kahan is one of the fathers of the X87):
http://cims.nyu.edu/~dbindel/class/cs279/stack87.pdf
An article that goes into details of the undesirable effects of x87 register spillage, The pitfalls of verifying floating-point computations by David Monniaux:
https://hal.archives-ouvertes.fr/hal-00128124v5/document
Last edited by mecej4 on Tue Nov 28, 2017 11:32 am; edited 5 times in total |
|
Back to top |
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2556 Location: Sydney
|
Posted: Tue Nov 28, 2017 10:18 am Post subject: |
|
|
mecej4,
That is correct, which means we should highlight the post above which exhibits this problem, for Paul to address: Code: | !ftn95$free
program test
real, allocatable :: ra(:)
allocate(ra(10))
do i=1,10
ra(I)=float(I)*1.75
end do
rmsn = sqrt(sum((ra)*(ra))/20.0)
end |
|
|
Back to top |
|
|
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7932 Location: Salford, UK
|
Posted: Thu Nov 30, 2017 7:46 am Post subject: |
|
|
Thanks. I have made a note of this. |
|
Back to top |
|
|
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7932 Location: Salford, UK
|
Posted: Thu Nov 30, 2017 9:27 am Post subject: |
|
|
This regression in v8.20 has now been fixed for the next release. |
|
Back to top |
|
|
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
Powered by phpBB © 2001, 2005 phpBB Group
|