forums.silverfrost.com Forum Index forums.silverfrost.com
Welcome to the Silverfrost forums
 

Optimisation bug, 32-bit FTN95 8.51

 
mecej4



Joined: 31 Oct 2006
Posts: 1885

Posted: Thu Sep 19, 2019 2:29 am    Post subject: Optimisation bug, 32-bit FTN95 8.51

The following reproducer contains only one line where the local variable jww is assigned a value. The program is error-free, which can be checked by compiling with /checkmate or with another Fortran compiler.

When a 32-bit EXE is built using FTN95 8.51 with /opt and run, the output is
Code:
  IWEL  JWW  KWW
    1   25   11
 jww at line-40 =            0
**** STOP: Bug encountered


instead of the correct output
Code:
  IWEL  JWW  KWW
    1   25   11
    2   19    6
    3   20    6
    4   14   11
    5   15   11
    6   16   11
 dz =      12.5000


How did the variable jww get set to zero?

The bug does not occur with FTN95 7.20 (note: that version will require that initialisation expressions be written with '(/ ... /)' instead of '[ ... ]'). Nor does the bug occur when 64-bit EXEs are built, with or without /opt .
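As a small illustration of the note about initialisation expressions, the two constructor spellings are shown side by side here, using values from the reproducer. The program name and the names jw_new, jw_old, x_new and x_old are hypothetical, and the sketch assumes a compiler that accepts both forms (so not 7.20, which by the remark above needs the '(/ ... /)' form throughout).

```fortran
program constructor_forms
   implicit none
   integer :: i
   integer :: jw_new(6), jw_old(6)      ! hypothetical names, not in the reproducer
   real    :: x_new(36), x_old(36)

   ! Square-bracket form, as used in the reproducer
   jw_new = [25, 19, 20, 14, 15, 16]
   x_new  = [(-525.0+i*25.0, i=6,41)]

   ! Older '(/ ... /)' form of the same constructors
   jw_old = (/ 25, 19, 20, 14, 15, 16 /)
   x_old  = (/ (-525.0+i*25.0, i=6,41) /)

   ! Both spellings should build identical arrays
   print *, all(jw_new == jw_old), all(x_new == x_old)
end program constructor_forms
```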

The source code:
Code:
module wells
   implicit none
   integer, parameter :: NWM = 6, NXM = 41, NZ = 18
   integer :: nw, nx
   integer, dimension(NWM)  :: jw, kw, lcbotw, lctopw
   real, dimension(NXM)  :: x
end module wells

subroutine initwh(dz)
   use wells
   implicit none
   integer  ::  iwel, k, l, is, jww, kww, jww0
   real :: dz, wisec(4)

   print *,' IWEL  JWW  KWW'
   do iwel = 1 , nw
      kww = kw(iwel)
      jww = jw(iwel)                 ! only place where jww is set
      print '(3I5)',iwel,jww,kww
      jww0 = jww                     ! save jww for checking later
     
      do k = lcbotw(iwel) , lctopw(iwel)
         do is = 1 , 4               ! This loop has no purpose other than
            wisec(is) = 0.           ! to instigate the bug, in this abridged
         enddo                       ! test program. It is needed in the full program.

         do l = 1 , 2
            if ( k==1 ) then
               if ( l==1 ) cycle
               dz = 0.5*(x(2)-x(1))
            elseif ( k==nx ) then
               if ( l==2 ) cycle
               dz = 0.5*(x(k)-x(k-1))
            elseif ( l==1 ) then
               dz = 0.5*(x(k)-x(k-1))
            else
               dz = 0.5*(x(k+1)-x(k))
            endif
            if (jww /= jww0) then       ! should never be .true.
               print *,'jww at line-40 = ',jww
               stop 'Bug encountered'
            endif
         enddo
      enddo
   enddo
   
   return
end subroutine initwh

program sim
use wells
implicit none
integer i
real dz
!
nw  = NWM
nx = NXM
jw      = [25, 19, 20, 14, 15, 16]
kw      = [11, 6, 6, 11, 11, 11]
lcbotw  = [10, 15, 17, 19, 20, 23]
lctopw  = [22, 17, 27, 20, 23, 32]
x(6:41) = [(-525.0+i*25.0, i=6,41)]
x(1:5)  = [-750., -625., -525., -450., -410.]

call initwh(dz)
print *,'dz = ',dz

end program
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

Posted: Thu Sep 19, 2019 5:40 am    Post subject:

I can confirm that it fails with my install of Ver 8.51.

I made some additions to the code, which do not appear to change the error.
I report the addresses of the local stack variables to see their relative locations, and the values of wisec and jww on either side of the "do is" loop.
jww is changed during execution of the optimised "do is" loop.
Code:
module wells
   implicit none
   integer, parameter :: NWM = 6, NXM = 41, NZ = 18
   integer :: nw, nx
   integer, dimension(NWM)  :: jw, kw, lcbotw, lctopw
   real, dimension(NXM)  :: x
end module wells

subroutine initwh(dz)
   use wells
   implicit none
   integer  ::  iwel, k, l, is, jww, kww, jww0
   real :: dz, wisec(4)
!
   write (*,*) 'is   ',loc(is)
   write (*,*) 'jww  ',loc(jww)
   write (*,*) 'kww  ',loc(kww)
   write (*,*) 'jww0 ',loc(jww0)
   write (*,*) 'wisec',loc(wisec)
   write (*,*) 'dz   ',loc(dz)
!
   is = 0
   wisec = 1
   print *,' IWEL  JWW  KWW'
   do iwel = 1 , nw
      kww = kw(iwel)
      jww = jw(iwel)                 ! only place where jww is set
      print '(3I5)',iwel,jww,kww
      jww0 = jww                     ! save jww for checking later
     
      do k = lcbotw(iwel) , lctopw(iwel)
            write (*,*) k,is,jww,kww, wisec
         do is = 1 , 4               ! This loop has no purpose other than
!zz            write (*,*) k,is,jww,kww, wisec    ! this print changes the bug
            wisec(is) = 0.           ! to instigate the bug, in this abridged
         enddo                       ! test program. It is needed in the full program.
            write (*,*) k,is,jww,kww, wisec

         do l = 1 , 2
            if ( k==1 ) then
               if ( l==1 ) cycle
               dz = 0.5*(x(2)-x(1))
            elseif ( k==nx ) then
               if ( l==2 ) cycle
               dz = 0.5*(x(k)-x(k-1))
            elseif ( l==1 ) then
               dz = 0.5*(x(k)-x(k-1))
            else
               dz = 0.5*(x(k+1)-x(k))
            endif
            write (*,*) k,l,jww
            if (jww /= jww0) then       ! should never be .true.
               print *,'jww at line-40 = ',jww
               stop 'Bug encountered'
            endif
         enddo
      enddo
   enddo
   
   return
end subroutine initwh

program sim
use wells
implicit none
integer i
real dz
!
nw  = NWM
nx = NXM
jw      = [25, 19, 20, 14, 15, 16]
kw      = [11, 6, 6, 11, 11, 11]
lcbotw  = [10, 15, 17, 19, 20, 23]
lctopw  = [22, 17, 27, 20, 23, 32]
x(6:41) = [(-525.0+i*25.0, i=6,41)]
x(1:5)  = [-750., -625., -525., -450., -410.]

call initwh(dz)
print *,'dz = ',dz

end program
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

Posted: Thu Sep 19, 2019 6:17 am    Post subject:

I can confirm, for the revised test code version I have posted:

FTN95 Ver 8.51 fails
FTN95 Ver 8.50 fails
FTN95 Ver 8.40 works ok
FTN95 Ver 8.30 works ok

Replacing the "do is" loop with wisec = 0 also removes the appearance of the bug.
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7916
Location: Salford, UK

Posted: Thu Sep 19, 2019 7:19 am    Post subject:

Many thanks for the bug report and comments. I have made a note that this needs fixing.
mecej4



Joined: 31 Oct 2006
Posts: 1885

Posted: Thu Sep 19, 2019 10:15 am    Post subject: Re:

JohnCampbell wrote:
... replacing the "do is" loop with wisec = 0 also removes the appearance of the bug.


That kind of response to minor changes is typical of optimiser bugs. Simply adding a PRINT statement to display the values of suspected variables can make the bug disappear, and this has caused some programmers to call such bugs "Heisenbugs".

This property makes it troublesome to prepare a reproducer. We may try paring away a few lines of seemingly unrelated source code, hoping that the bug will not disappear. If it does disappear, as it often does, we have to revert to the previous version of the source code, and look for something else to cut out. The reduction in size is definitely very slow until, say, something like 50 percent reduction has been reached. Near the end, progress is often faster than one expects.

The fast compilation of FTN95 helps to make all this less of a burden, but it is an obstacle that we are unable to use SDBG with code compiled with /opt. Other related problems are the choice of a proprietary format for 64-bit OBJ files and the unavailability of tools (such as Microsoft's DUMPBIN /disasm) to list the instructions in the OBJ files. The /EXP listings are useful in other contexts, but the display of local variables by name (rather than by RBP or RSP offsets), or even pseudovariable names such as "extracted_expression_73" is often not enough when an addressing error is being investigated.

John, thanks for testing with other versions of the compiler and confirming the bug. Your comments regarding the DO IS loop are also helpful. I had observed that while preparing the reproducer, but decided not to describe it in order to fit the important matters into a single posting.

P.S. My shorter reproducer, posted hours after I wrote this posting, leads to an explanation of why replacing the DO IS loop by an array assignment removes the bug. The rules of Fortran require that after the DO IS=1,4 loop completes four iterations, the DO index IS should be 4+1, i.e., 5. Thus, the DO loop has the same effect as the array assignment, with one addition: setting the index variable at loop exit. The bug happens with setting the index variable. More generally, I think that the bug occurs whenever the compiler unrolls a DO loop and sets the value of the DO index variable to its expected value from the normal (unoptimised) execution of a DO loop.
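The difference between the two forms can be seen in a minimal sketch like the following. The program name and the second array wisec2 are hypothetical, added only for the comparison. It zeroes one array with the explicit loop and another with an array assignment, then prints the DO index.

```fortran
program do_index_after_loop
   implicit none
   integer :: is
   real    :: wisec(4), wisec2(4)   ! wisec2 is a hypothetical second copy

   wisec  = 1.0
   wisec2 = 1.0

   ! Explicit loop: zeroes the array and leaves IS defined as 4+1 on exit
   do is = 1, 4
      wisec(is) = 0.0
   end do

   ! Array assignment: same effect on the array, no index variable involved
   wisec2 = 0.0

   print *, 'is after loop = ', is
   print *, 'arrays equal  = ', all(wisec == wisec2)
end program do_index_after_loop
```

With a conforming compiler the value printed for IS should be 5 and the two arrays should compare equal. The only extra work in the loop version is the final store to IS, which is the store implicated in this bug.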


Last edited by mecej4 on Sat Sep 21, 2019 12:35 pm; edited 3 times in total
mecej4



Joined: 31 Oct 2006
Posts: 1885

Posted: Thu Sep 19, 2019 6:04 pm    Post subject:

Here is a shorter reproducer for the issue.
Code:
program bwel
integer iw(4), jw
call wel(iw)
print *,iw
end program

subroutine wel(iw)
integer iw(4)
integer i, jww
jww = 5
do i=1,4
   iw(i) = i
end do
print *,'jww = ',jww
return
end


Compile with /opt /p6, link and run. With V 8.51, the printed value of JWW is 0, whereas with V 7.2 it is 5.

The /exp listing of this code with V 8.51 illustrates how the nature of the listing makes the problem a bit obscure. Here is an extract:
Code:
   0010   jww = 5
   0011   do i=1,4
   0012      iw(i) = i
      0000000d(52/7/53)          mov       eax,address of IW
      00000010(51/3/19)          mov       JWW,=5              ; <<=== [ebp-10h]
      00000017(53/7/53)          mov       [eax],=1
      0000001d(54/6/58)          mov       [eax+4],=2
      00000024(55/5/64)          mov       [eax+8],=3
      0000002b(56/4/70)          mov       [eax+12],=4
      00000032(57/4/76)          mov       I[4],=0             ; <<=== [ebp-10h]
      00000039(58/4/76)          mov       I,=5


There is nothing in the listing to indicate that JWW and I[4] occupy the same address. Running dumpbin /disasm on the OBJ file shows that both these occupy dword ptr [ebp-10h].

The last two instructions in the extract are setting the local variable I to the value that it should have after the DO loop terminates, but the compiler seems to think that I is an 8-byte integer. For some reason, the upper 4 bytes overlap the local variable JWW.
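To see why an 8-byte store of the value 5 would leave 0 in a neighbouring 4-byte variable, the following sketch splits a 64-bit integer into its two 32-bit halves. It is only an illustration of the byte layout, not of what the compiler does internally. The program name and the names i4, i8, five and halves are hypothetical, and the ordering of the halves assumes a little-endian machine such as x86.

```fortran
program split_int64
   implicit none
   integer, parameter :: i4 = selected_int_kind(9)    ! 4-byte integer kind
   integer, parameter :: i8 = selected_int_kind(18)   ! 8-byte integer kind
   integer(i8) :: five
   integer(i4) :: halves(2)

   five = 5

   ! Reinterpret the 8 bytes of FIVE as two consecutive 4-byte integers
   halves = transfer(five, halves)

   print *, 'low  dword = ', halves(1)   ! compare with "mov I,=5"
   print *, 'high dword = ', halves(2)   ! compare with "mov I[4],=0"
end program split_int64
```

On x86 the low half is expected to be 5 and the high half 0. That matches the pair of stores in the listing, where the zero lands at [ebp-10h], the address that also holds JWW.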


Last edited by mecej4 on Fri Sep 20, 2019 4:27 am; edited 3 times in total
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Fri Sep 20, 2019 1:38 am    Post subject:

It is good that mecej4 is, not for the first time, addressing an optimization issue in the compiler. This is an age-old problem. It is a pity that no one at SF likes to dig into the optimization, which might bring the largest advancement of the compiler, together with F90 and 64-bit support, since its first days as FTN77.

I hope that some day somebody will also demand compatibility with MPI/CUDA parallel instructions as an added option, and that an FTNXX for supercomputers (Linux and Windows) will finally be made. The personal supercomputing era is coming soon. Even with a mere 128 cores in personal use, which is currently just two chips, our PIC code runs as fast as on a 1000-core supercomputer with its time sharing and highly congested usage (you get 24 hours, then you wait in the queue for a few days to continue, and then your usage limit ends). Prices should drop like a rock with the AMD competition. You can currently buy previous-generation 16-core Intel Xeon E5 chips for $100-200 on eBay. And for supercomputers, it is often not the processor speed but the RAM speed and the interconnect that limit performance. Rewriting codes from C to Fortran may also give a factor of 3 speedup in some cases, probably.
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2388
Location: Yateley, Hants, UK

Posted: Fri Sep 20, 2019 12:41 pm    Post subject:

Dan has a point. I always used the optimisation options automatically with every compiler until I moved over to FTN95, and then (probably wrongly) I put some of the problems down to asynchronicity within Windows. As a result, I've avoided /opt, and never encountered such bugs afterwards.

I'm in the lucky position that my stuff runs adequately quickly on any computer I care to run it on even without /opt, but each time I see the Polyhedron benchmarks (which I try to avoid) I have a pang of envy that other, lesser, compilers have faster runtimes. It's like the benchmarks for Intel v. AMD. When I look at the scores, Intel wins. But when in the real world I've had to use Intel machines, I simply don't experience that difference.

And as for PC v. supercomputer, I can remember when my 16Mb RAM machine could run problems that many mainframes couldn't tackle, and the runtime was only a minor component of the time between job submission and receiving the results, as the immediacy of the PC beat a timesharing system hands down.

So please, Paul, do take the optimisation issues seriously.

Eddie
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Sun Sep 22, 2019 9:10 am    Post subject:

Yes, Eddie, the Polyhedron results may be irrelevant, or may be an example of bad code, but for the general public unfamiliar with the subject they are like a hit below the belt, undermining the whole idea of using FTN95, and even Fortran, as a fast language. It is painful, and there is no excuse for seeing that for 20 years!

I personally did not suffer from that, though, because the key for me is fast development, deep error checking and GUI, while I get super-fast runs from parallel libraries which use multi-core processors and are built with all the different compilers, so you can choose the fastest.

We have also started using supercomputers more recently. There, FTN95 also offers the fastest read speed, clearly exceeding HDF5, while doing that with ultimate simplicity (we have not been able to use HDF5 directly with FTN95 yet; maybe the Silverfrost developers will help and compile its sources in Fortran or C, or make a DLL).

Still, this Polyhedron blow is totally unacceptable and has to be resolved somehow, don't you agree at SF?
John-Silver



Joined: 30 Jul 2013
Posts: 1520
Location: Aerospace Valley

Posted: Thu Sep 26, 2019 7:20 am    Post subject:

The published Polyhedron results are getting quite an airing again in several recent posts.

.... so after starting to write some comments on here I decided to create a new thread dedicated to it.

You can find it HERE
_________________
"Computers (HAL and MARVIN excepted) are incredibly rigid. They question nothing. Especially input data. Human beings are incredibly trusting of computers and don't check input data. Together cocking up even the simplest calculation ... :-)"


Last edited by John-Silver on Thu Sep 26, 2019 9:31 am; edited 2 times in total
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7916
Location: Salford, UK

Posted: Fri Oct 11, 2019 12:03 pm    Post subject:

This bug has now been fixed for the next release of FTN95.
All times are GMT + 1 Hour