forums.silverfrost.com
Welcome to the Silverfrost forums

 Does this code work under debugger and without ?
DanRRight

Joined: 10 Mar 2008
Posts: 1814
Location: South Pole, Antarctica

Posted: Thu Mar 02, 2017 11:44 pm    Post subject: Does this code work under debugger and without ?

 Code:
 i=1
 k=2
 do while (k>1)
    i=i+1.1
    if(i/1000000*1000000.eq.i) print*,i
 enddo
 end

compile and run it:
>FTN95 a.f95 /64 /debug
>sdbg64 a.exe

This is demonstration code reduced to a minimum. It should give an integer overflow error message, but it does not. Instead it goes into an infinite loop at around 33M.

In the larger code I reduced it from, it gives the wrong error (invalid floating point operation), and the debugger stops on the wrong line (the line after the offending one).
mecej4

Joined: 31 Oct 2006
Posts: 909

Posted: Fri Mar 03, 2017 1:55 am    Post subject:

Perhaps you did not intend to do so, Dan, but you have exposed a property of the code generated by FTN95-64 for processing mixed integer and real expressions in the XMM registers using floating point instructions. The following adaptation of your program shows the problem in a striking way.
 Code:
 program danx
 implicit none
 integer i
 i=33554430
 i=i+1.1
 print*,i
 end

The printed output is 33554432 instead of the expected 33554431 (which is 2^25 - 1). The reason is that the expression i+1.1 is calculated using single-precision floating point arithmetic, and the 24-bit significand of a single-precision real cannot hold 33554431.1 closely enough to give the correct integer on conversion. For related reasons, the following code will not increment i beyond 2^25, so if you have a DO loop with a condition on i that depends on such values, the condition may never be satisfied and the program will loop forever.
 Code:
 program dany
 implicit none
 integer i,j
 i=33554430
 do j=1,5
    i=i+1.1
    print*,i
 end do
 end

The Fortran standard puts the responsibility on the programmer to avoid overflow, and you forced floating point evaluation by adding 1.1 instead of 1. If you write 1.1d0, instead, you will find that the correct result is shown, since the expression is then evaluated using double precision reals.
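
A minimal sketch of why 24 bits stop being enough, using the standard SPACING intrinsic to print the gap between adjacent single-precision values around 2^25. The program and variable names are made up for illustration, and the single-precision result assumes the compiler really rounds i+1.1 to single precision (as SSE code does), not to a wider register format.

```fortran
program spacing_demo
  implicit none
  integer :: i
  real :: r

  ! Gap between neighbouring default-real values just below and at 2**25
  r = 33554430.0          ! 2**25 - 2, exactly representable
  print *, 'spacing below 2**25:', spacing(r)
  r = 33554432.0          ! 2**25
  print *, 'spacing at    2**25:', spacing(r)

  ! Mixed-mode addition: single-precision versus double-precision constant
  i = 33554430
  print *, 'i + 1.1   ->', int(i + 1.1)
  print *, 'i + 1.1d0 ->', int(i + 1.1d0)
end program spacing_demo
```

With IEEE single precision the two gaps are 2 and 4, so 33554431.1 has no representable neighbour closer than 33554432, and from 33554432 onwards adding 1.1 rounds back to the same value. The double-precision line prints 33554431.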
DanRRight

Joined: 10 Mar 2008
Posts: 1814
Location: South Pole, Antarctica

Posted: Fri Mar 03, 2017 3:30 am    Post subject:

WOW! One more dead monster besides 16 and 32 bits, the 24-bit one, has got out of its grave in 64-bit code... And we have to keep this devilry in mind? That will be a source of numerous troubles in the future, because this is a rare thing and the red flags will always be forgotten. Because of this feature, such hidden errors in 64-bit code will never be found. Thanks mecej4, I am shocked.

After getting out of shock, here is what I was initially trying to demonstrate:

 Code:
 i=1
 k=2
 do while (k>1)
    i=i+1.1d0
    if(i/10000000*10000000.eq.i) print*,i
 enddo
 end

Besides that, the red double line in SDBG64 would also be good to fix.
mecej4

Joined: 31 Oct 2006
Posts: 909

Posted: Fri Mar 03, 2017 5:36 am    Post subject:

 Quote: Besides that, the red double line in SDBG64 would also be good to fix.

I don't know what that is. I am waiting for the personal edition of 8.1 to be made available. The 8.05 version of SDBG64 is hardly of any use to me. I cannot view assembly and even attempting to make the font larger causes SDBG64 to self-destruct.
JohnCampbell

Joined: 16 Feb 2006
Posts: 1947
Location: Sydney

Posted: Fri Mar 03, 2017 8:17 am    Post subject:

It is surprising that " i = i + 1.1 " would round down below "I = I + 1".
Definitely something to remember, although I rarely use real*4 constants with 24-bit accuracy.

I made some other changes to make the print test "better", which again resulted in a different round-off problem. Something else to avoid.

64-bit, with its expectation of larger problems and values, is going to throw up more of these.

 Code:
 integer i,k, next
 i=1
 k=2
 next = 0
 do while (k>1)
    i=i+1.1d00
    if (i >= next) then
       print*,i
       next = i+1000000
    end if
 end do
 end
PaulLaidler

Joined: 21 Feb 2005
Posts: 5354
Location: Salford, UK

Posted: Fri Mar 03, 2017 8:47 am    Post subject:

At first sight I think that we should be able to fix this. I have made a note that it needs investigating.
mecej4

Joined: 31 Oct 2006
Posts: 909

Posted: Fri Mar 03, 2017 2:22 pm    Post subject:

Paul, my initial reaction on running Dan's example was that a compiler bug was involved. However, further consideration leads me to think that this is a programmer error in the sense that a calculation is performed that causes overflow on some processor (or processor FPU).

Here are some results for my first test program of this thread from the competition, on Windows XP-SP3, 32-bit, Athlon X2 4200+ ('S' = sequential numbers, last digit 1, 2, 3, 4, 5; 'F' = fixed last digit 2).
 Code:
 Compiler                Option         Result
 gfortran 4.5, 32 bit    -march=i386    S
                         -march=i686    S
                         -msse2         F
 ifort 2013SP1U6         -Qxhost        F
                         -QxSSE2        F
                         -QxSSE3        S

As you can see, the results are "processor-dependent". Perhaps the best solution is to write code such as
 Code: IVAR = + INT()

 Code: IVAR = +
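
The two code lines above seem to have lost their placeholders when the page was extracted, so the following is only a guess at the kind of code meant: convert the real increment to an integer explicitly (with INT or NINT) so that the addition itself is done in integer arithmetic. The names ivar and step are hypothetical.

```fortran
program explicit_conversion
  implicit none
  integer :: ivar
  real :: step

  ivar = 33554430
  step = 1.1

  ! Truncate the real increment first; the addition is then pure integer
  ivar = ivar + int(step)
  print *, ivar            ! 33554431

  ! NINT rounds to nearest instead of truncating
  ivar = ivar + nint(step)
  print *, ivar            ! 33554432
end program explicit_conversion
```

Because the original statement truncates the sum on every pass anyway, adding INT(step) gives the value the programmer presumably expected, and it no longer depends on how the processor rounds the intermediate real.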
PaulLaidler

Joined: 21 Feb 2005
Posts: 5354
Location: Salford, UK

Posted: Fri Mar 03, 2017 3:26 pm    Post subject:

mecej4

Thanks for the feedback. I understand that the code is not good and that the result may be processor dependent but I am fairly sure that there is also a bug in FTN95 in this context.
PaulLaidler

Joined: 21 Feb 2005
Posts: 5354
Location: Salford, UK

Posted: Fri Mar 03, 2017 4:04 pm    Post subject:

I understand it now.

 Code:
 i=33554430
 i=i+1.1

In the second line, i is converted to real, then 1.1 is added, then the result is truncated to an integer.

But the key is that the real value suffers from round-off error.

So, yes it is a programming error.
DanRRight

Joined: 10 Mar 2008
Posts: 1814
Location: South Pole, Antarctica

Posted: Fri Mar 03, 2017 7:12 pm    Post subject:

But the problem is that there is no such error in 32 bit mode! Conversion of older 32-bit code to 64 bits must be straightforward, without any side effects, because formally we still stay with 32-bit arithmetic. The weird 24-bit mode must be excluded as the default, period, because it is far too obscure a feature and no one will remember to avoid it.

Just try to realize this: you take legacy 32-bit code which worked OK, switch from 32 bits to 64, and get a 24-bit downgrade in what programmers use most, mixing real and integer numbers. Wow, how absurd the situation is. Worst idiocy ever. Is this what the Standard prescribes??? No words. If this craziness is kept, the compiler must at least report it as a warning.
PaulLaidler

Joined: 21 Feb 2005
Posts: 5354
Location: Salford, UK

Posted: Fri Mar 03, 2017 7:53 pm    Post subject:

Dan

I think that you misunderstand the issue. If it works for 32 bits then it's just luck. The round-off error must turn out to be different. A REAL value has only a limited number of significant figures.

As far as the compiler is concerned, the code is treated in the same way and any difference is in the associated assembler instructions and the way in which these are implemented by the central processor in use.
mecej4

Joined: 31 Oct 2006
Posts: 909

Posted: Fri Mar 03, 2017 7:59 pm    Post subject: Re:

 DanRRight wrote: But the problem is that there is no such error in 32 bit mode!
That is true for FTN95, but not for Gfortran, Intel or Lahey, all in 32-bit mode (see my previous post for results from those compilers).

FTN95 uses only X87 instructions for FP in 32-bit mode. The other compilers let you choose between X87 and SSE/SSE2/SSE3. The X87 FPU has only 80 bit registers (64 bit mantissa, 15 bit biased exponent, 1 bit sign), so the overflow problem would occur only with much larger numbers than in your test program.

It only adds to the confusion when the terms "32-bit" and "64-bit" are used in vain. Those are address sizes, and have very little to do with FPU registers, X87 or SSE/XMM.

Last edited by mecej4 on Sun Mar 05, 2017 8:16 pm; edited 1 time in total
DanRRight

Joined: 10 Mar 2008
Posts: 1814
Location: South Pole, Antarctica

Posted: Fri Mar 03, 2017 9:24 pm    Post subject: Re:

 PaulLaidler wrote: I think that you misunderstand the issue
How could I misunderstand if I fixed it in my example 2, a few messages up?

 PaulLaidler wrote: As far as the compiler is concerned, the code is treated in the same way and any difference is in the associated assembler instructions and the way in which these are implemented by the central processor in use.

Obviously this cannot be the same. Just comparing the speed of integer and FP operations in the old 32-bit and the new 64-bit compiler shows that the new one is way faster (up to 5x, compile with /opt).

 Code:
 CALL CPU_TIME(tStart)
 k=1
 1 j=1
 do i=1,10000000
    j=j+1
 enddo
 k=k+1
 if(k.lt.100) goto 1
 CALL CPU_TIME(tFinish)
 RunTime=tFinish-tStart
 OpPerSInt = 1e9/Runtime
 Print*, RunTime, OpPerSInt

 k=1
 CALL CPU_TIME(tStart)
 2 a=1.
 do i=1,10000000
    a=a+1.
 enddo
 k=k+1
 if(k.lt.100) goto 2
 CALL CPU_TIME(tFinish)
 RunTime=tFinish-tStart
 OpPerSfp = 1e9/Runtime
 Print*, RunTime, OpPerSfp
 end

 32bit  Time          Op/second
 INT    1.64063       6.095238E+08
 FP     2.20313       4.539007E+08
 64bit
 INT    0.281250      3.555556E+09       6x   speedup
 FP     0.890625      1.122807E+09       2.5x speedup

 mecej4 wrote: It only adds to the confusion when the terms "32-bit" and "64-bit" are used in vain. Those are address sizes, and have very little to do with FPU registers, X87 or SSE/XMM.

One more confusion is added here: all his life the user expected 32-bit integer + 32-bit FP to run to 2B before crashing, not to 30M. These integers were used as array indices, so the change in accuracy directly influences the address space. The compiler manufacturers, together with the processor designers, chose speed. They switched to faster FP units with a smaller mantissa, and to SSE for integer operations. Did they warn in the compilation LOG file that INT*4 + FP*4 could now be misleading?

OK, this way is faster, but then the compiler must warn, when real*4 and integer*4 are used together, that the song may end much sooner than expected, and suggest switching at least to real*8, because with the new FP processors this has no performance penalty (not sure about SSE), while 64-bit vs 32-bit resolves the memory space penalty. Or it must implement a run-time crash of the integer at 33M. Hell, otherwise you will never find the hidden bugs in large codes, this one specifically.
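
The run-time check asked for here can be approximated by hand. A hedged sketch follows: a hypothetical helper, fits_in_real, reports whether a default integer lies in the range where every integer is held exactly by a default real, using the standard RADIX and DIGITS intrinsics. This is not an FTN95 feature, only an illustration of a test a program could make before mixing the two types.

```fortran
module real_guard
  implicit none
contains
  ! True when every integer of magnitude <= |i| is exactly representable
  ! as a default real (|i| <= radix**digits, i.e. 2**24 for IEEE single).
  logical function fits_in_real(i)
    integer, intent(in) :: i
    integer :: limit
    limit = radix(1.0)**digits(1.0)
    fits_in_real = (i >= -limit .and. i <= limit)
  end function fits_in_real
end module real_guard

program guard_demo
  use real_guard
  implicit none
  print *, fits_in_real(16777216)   ! T : 2**24
  print *, fits_in_real(16777217)   ! F : 2**24 + 1 needs 25 bits
  print *, fits_in_real(33554430)   ! F : in the region where the gap is 2
end program guard_demo
```

The comments assume IEEE single precision for the default real. Note that exactness is already lost above 2^24 (about 16.7M), somewhat earlier than the 33M point at which the loop in this thread stalls.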

Last edited by DanRRight on Fri Mar 03, 2017 11:43 pm; edited 2 times in total
JohnCampbell

Joined: 16 Feb 2006
Posts: 1947
Location: Sydney

Posted: Fri Mar 03, 2017 11:13 pm    Post subject:

Dan,

There is a bug in the code: it is a real*4 round-off error, but it doesn't appear with some configurations. I have experienced lots of examples of coding bugs that don't appear with some compilers, but do with others. This often happens when moving code to different compilers or hardware.

It's good you now know that real*4 is only accurate to 24 bits and not 32 bits. That's 7 figures of accuracy, compared to "9" for Integer*4.

John
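
These figures can be queried from the compiler itself. A small sketch using the standard numeric inquiry intrinsics; the values in the comments assume IEEE single and double precision and a 32-bit default integer, which is the usual case but is not guaranteed by the standard.

```fortran
program precision_inquiry
  implicit none
  print *, 'bits in real significand :', digits(1.0)       ! 24
  print *, 'decimal digits, real     :', precision(1.0)    ! 6
  print *, 'decimal digits, real*8   :', precision(1.0d0)  ! 15
  print *, 'decimal digits, integer  :', range(1)          ! 9
  print *, 'largest default integer  :', huge(1)           ! 2147483647
end program precision_inquiry
```

PRECISION reports 6 for single precision because it counts only the decimal digits that are guaranteed; 24 bits correspond to roughly 7.2 decimal digits, which is where the usual "7 figures" comes from.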
DanRRight

Joined: 10 Mar 2008
Posts: 1814
Location: South Pole, Antarctica

Posted: Fri Mar 03, 2017 11:19 pm    Post subject:

John,

See? Even the most experienced people here, like you, did not know about that damn feature from hell. Would you like to revisit Polyhedron with /64 /opt and maybe catch a few possible bugs? The new debugger is much better, but it sometimes misses the offending place by 1-2 lines, causing confusion.
 All times are GMT + 1 Hour. Page 1 of 2.