forums.silverfrost.com Forum Index forums.silverfrost.com
Welcome to the Silverfrost forums
 

Array expressions mishandled when /opt is used

 
mecej4



Joined: 31 Oct 2006
Posts: 1031

Posted: Fri Oct 12, 2018 3:39 pm    Post subject: Array expressions mishandled when /opt is used

Here is another instance of an optimization bug. This one occurs when array sections are copied and /opt /64 has been requested. The bug is rather fragile, and minor changes to the code make it disappear. To make the bug easy to notice, I made two copies of a single subroutine, which modifies an input 2-D array. The two copies are identical except for one line. The first version has, on line 64,
Code:
         v (:mp1, j) = v (:mp1, jcol)               ! array section copy

The second version has, instead, a DO loop, on lines 91-93:
Code:
         Do k = 1, mp1;  v (k, j) = v (k, jcol);  End Do

The main program compares the returned arrays v from the two versions. In the absence of bugs, the two should be identical. With 32-bit compilations, or with /64 but not /opt, they are identical. With /64 /opt, the bug surfaces.

The source code:
Code:
Program tst
      Implicit None
      Integer :: m = 306, nvar = 4, n = 5, ns = 2
      Double Precision :: eq (4, 5)
      Double Precision, Dimension (307, 6) :: v, v1, v2
      Integer :: i, j
      Data eq/0.d0,1.d0,3*0.d0,1.d0,3*0.d0,1.d0,2*0.d0,1.d0,3*0.d0, &
             1.d0,3.d0,2*0.d0/
!
      do i = 1, 307
         do j = 1,6
            v(i, j) = i*(3.d0-j)
         end do
      end do
!
      v1 = v
      Call scrch1 (m, nvar, eq, v1)! Uses vector assignment, v (:mp1, j) = v (:mp1, jcol)
!
      v2 = v
      Call scrch2 (m, nvar, eq, v2)! Uses DO loop, DO k=1, mp1; v(k, j) = v(k, jcol)
!
! Check that the two results match, as they should
!
      If (any(Abs(v1-v2) > 1d-6)) Then
         Write (*, 10) m, nvar, n, ns
         Write (*,*) 'SCRCH, diffs found'
         Do i = 1, 307
            Do j = 1, 6
               If (Abs(v1(i, j)-v2(i, j)) > 1d-5) &
              &    write (*, 20) i, j,v1 (i, j), v2 (i, j)
            End Do
         End Do
      Else
         Write (*,*) 'SCRCH, no diffs'
      End If
      Stop
!
10    Format ('m, nvar, n, ns = ', 4 I5)
20    Format (2 I4, 2 x, 2 ES15.7)
!
End Program
!
Subroutine scrch1 (m, nvar, eq, v)
      Implicit None
      Integer, Intent (In) :: m, nvar
      Integer :: nscol (2) = (/ 2, 3 /), iresl (4) = (/ 1, 4, 2, 1 /)
      Double Precision, Intent (In) :: eq (4, 5)
      Double Precision, Dimension (307, 6), Intent (Inout) :: v
      Integer :: l, j, lrow, lcol, jcol, n, np1, mp1, ns
      Double Precision :: fact
      n = nvar + 1; np1 = n + 1;  mp1 = m + 1; ns = nvar - 2
!
      Do l = 1, 2
         lrow = iresl (2*l-1); lcol = iresl (2*l)
         Do j = 1, ns
            jcol = nscol (j); fact = eq (lrow, jcol)
            v (:mp1, jcol) = v (:mp1, jcol) - fact * v (:mp1, lcol)
         End Do
         fact = eq (lrow, n)
         v (:mp1, np1) = v (:mp1, np1) - fact * v (:mp1, lcol)
      End Do
      Do j = 1, ns
         jcol = nscol (j); If (jcol == j) Cycle
         v (:mp1, j) = v (:mp1, jcol)               ! array section copy
      End Do
      v (:mp1, n-2) = v (:mp1, n); v (:mp1, n-1) = v (:mp1, np1)
      Return
End Subroutine scrch1

NOTE: The second version of the subroutine is posted in a follow-on posting below, because of the forum line limits.
mecej4



Joined: 31 Oct 2006
Posts: 1031

Posted: Fri Oct 12, 2018 3:41 pm    Post subject: ..continued.., last part of source code

Here is the second version of the subroutine. Put the two pieces together into a single source file, compile and run with /64 /opt, then with /64 alone.
Code:
Subroutine scrch2 (m, nvar, eq, v)
      Implicit None
      Integer, Intent (In) :: m, nvar
      Integer :: nscol (2) = (/ 2, 3 /), iresl (4) = (/ 1, 4, 2, 1 /)
      Double Precision, Intent (In) :: eq (4, 5)
      Double Precision, Dimension (307, 6), Intent (Inout) :: v
      Integer :: l, j, k, lrow, lcol, jcol, n, np1, mp1, ns
      Double Precision :: fact
      n = nvar + 1; np1 = n + 1; mp1 = m + 1; ns = nvar - 2
!
      Do l = 1, 2
         lrow = iresl (2*l-1); lcol = iresl (2*l)
         Do j = 1, ns
            jcol = nscol (j); fact = eq (lrow, jcol)
            v (:mp1, jcol) = v (:mp1, jcol) - fact * v (:mp1, lcol)
         End Do
         fact = eq (lrow, n)
         v (:mp1, np1) = v (:mp1, np1) - fact * v (:mp1, lcol)
      End Do
      Do j = 1, ns
         jcol = nscol (j); If (jcol == j) Cycle
         Do k = 1, mp1                                    ! Do Loop copy
            v (k, j) = v (k, jcol)
         End Do
      End Do
      v (:mp1, n-2) = v (:mp1, n); v (:mp1, n-1) = v (:mp1, np1)
      Return
End Subroutine scrch2
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 5609
Location: Salford, UK

Posted: Sat Oct 13, 2018 8:05 am

Thank you for this report. I have logged it for investigation.
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 5609
Location: Salford, UK

Posted: Thu Nov 08, 2018 2:39 pm

An initial investigation indicates that the fault lies with item 30 of the optimiser. So a temporary "fix" is to add /inhibit_opt 30.

The issue remains outstanding.
mecej4



Joined: 31 Oct 2006
Posts: 1031

Posted: Thu Nov 08, 2018 3:14 pm

Paul, where can we see a list of these "optimisation items"? How many of them are there, and are they grouped into categories?
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 1879
Location: Yateley, Hants, UK

Posted: Thu Nov 08, 2018 3:21 pm

I agree with Mecej4 - it would be interesting to see what optimisations are performed, and which might simply be skipped.

It's a long time since I even attempted to use /OPT, as my codes run adequately fast without it, especially on my new computer that is considerably faster than the old one.

Eddie
mecej4



Joined: 31 Oct 2006
Posts: 1031

Posted: Thu Nov 08, 2018 3:51 pm

Incidentally, Eddie, I was also brought up with "subexpression optimisation" in my blood. More recently, I have been weaning myself from that habit, because (i) it makes the code less readable, as it always did, and (ii) these days, it usually makes the code slower as well.

The variables that are created for the purpose of holding many of these subexpressions are very short-lived and, if they are declared in the subprogram declarations section, the compiler ends up generating code that will be doing lots of copying to and from main memory. Left to itself, the compiler optimiser can do a better job of recognising and confining these variables to registers.
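
A minimal sketch of the two styles under discussion, for illustration only. The program name cse_demo, the arrays and the particular expression are assumptions made up for this example and are not taken from any code in this thread. Style A keeps the repeated subexpression in a declared variable t; Style B writes it out and leaves any reuse to the optimiser. The sketch only checks that both styles give the same values. Which one runs faster depends on the compiler and has to be measured.

```fortran
Program cse_demo
      Implicit None
      Integer, Parameter :: n = 5
      Double Precision :: x (n), a (n), b (n), ya (n), yb (n)
      Double Precision :: t          ! hand-made temporary, declared up front
      Integer :: i
!
      Do i = 1, n
         x (i) = 0.1d0 * i; a (i) = 1.d0 + i; b (i) = 2.d0 - i
      End Do
!
! Style A: the common subexpression is stored in a named variable
      Do i = 1, n
         t = Sin (x (i)) * Exp (-x (i))
         ya (i) = a (i) * t + b (i) * t
      End Do
!
! Style B: the expression is written out in full; spotting the
! repeated part is left to the compiler
      Do i = 1, n
         yb (i) = a (i) * (Sin (x (i)) * Exp (-x (i))) &
                + b (i) * (Sin (x (i)) * Exp (-x (i)))
      End Do
!
      Write (*, '(A, ES10.2)') 'max |ya - yb| = ', Maxval (Abs (ya - yb))
End Program cse_demo
```

Comparing the timings or the generated code of the two loops, with and without /opt, is one way to test the claim on a particular compiler.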


Last edited by mecej4 on Sun Nov 11, 2018 12:05 pm; edited 1 time in total
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 5609
Location: Salford, UK

Posted: Thu Nov 08, 2018 4:42 pm

I will make enquiries.
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 1879
Location: Yateley, Hants, UK

Posted: Thu Nov 08, 2018 5:51 pm

Mecej4,

I think it may well be the case that common subexpression removal by an efficient and working optimization is probably better than doing it yourself, but surely, for readability it depends how you write it, doesn’t it? And that, in part, depends on the naming of the variables that hold the pre-calculated common subexpressions. 'm1' doesn't cut it for me.

One programmer whose work I admired always used ‘GASH’ (British military slang (specifically from the Royal Navy and Royal Marines) for rubbish (garbage), or for something that is considered useless, broken or otherwise of little value, rather than any other definition) for such a variable, and when more than one was required, supplemented them with ‘GESH’, ‘GISH’, ‘GOSH’ and ‘GUSH’ if he had to, which wasn’t often.

As for me, I tended to prefer COEF with various other suffixes for the same job. (COEF_A, COEF_B etc)

In both cases, the readability seems to me to be enhanced rather than degraded.

Eddie
JohnCampbell



Joined: 16 Feb 2006
Posts: 2007
Location: Sydney

Posted: Sun Nov 11, 2018 1:11 am

I looked at array sections in scrch1 and found:

v (:mp1, j) = v (:mp1, jcol) ! opt fails
v (1:mp1, j) = v (1:mp1, jcol) ! opt fails
v (:, j) = v (:, jcol) ! opt is OK

The first 2 may take a section copy of the array, while the 3rd might not.

For " v (:, np1) = v (:, np1) - fact * v (:, lcol) ",
I would usually substitute the following:
fact = -eq (lrow, n)
call daxpy (mp1, fact, v(1, lcol), v(1, np1) )

As with mecej4's observations of compiler optimisation performance, it is now better left to the compiler to clean this up. ( most of my F77 wrapper approaches are no longer effective)
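
A minimal sketch of the F77 wrapper approach, under stated assumptions. The four-argument call above does not match the reference BLAS DAXPY, which takes six arguments (n, da, dx, incx, dy, incy), so it is presumably a private unit-stride wrapper. The routine below uses the hypothetical name axpy_col to avoid any clash with BLAS, and the array sizes and the value of fact are made up for the example. It compares the wrapper call with the equivalent array-section update.

```fortran
Program wrapper_demo
      Implicit None
      Integer, Parameter :: ld = 8, mp1 = 5   ! illustrative sizes only
      Double Precision :: va (ld, 3), vb (ld, 3), fact
      Integer :: i, j
!
      Do j = 1, 3
         Do i = 1, ld
            va (i, j) = i * (3.d0 - j)
         End Do
      End Do
      vb = va
      fact = 0.25d0
!
! Array-section form: column 3 updated from column 1
      va (:mp1, 3) = va (:mp1, 3) - fact * va (:mp1, 1)
!
! Wrapper form: pass the first element of each column and negate the factor
      Call axpy_col (mp1, -fact, vb (1, 1), vb (1, 3))
!
      Write (*, *) 'results identical: ', All (va == vb)
End Program wrapper_demo
!
Subroutine axpy_col (n, a, x, y)    ! hypothetical name, not the BLAS routine
      Implicit None
      Integer, Intent (In) :: n
      Double Precision, Intent (In) :: a, x (n)
      Double Precision, Intent (Inout) :: y (n)
      Integer :: k
      Do k = 1, n
         y (k) = y (k) + a * x (k)   ! y := y + a*x over the first n elements
      End Do
End Subroutine axpy_col
```

The wrapper relies on sequence association: passing v(1, lcol) hands the routine the start of that column, and the explicit length n limits the update to the rows in use.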
mecej4



Joined: 31 Oct 2006
Posts: 1031

Posted: Sun Nov 11, 2018 2:34 pm    Post subject: Re:

JohnCampbell wrote:
I looked at array sections in scrch1 and found:

v (:mp1, j) = v (:mp1, jcol) ! opt fails
v (1:mp1, j) = v (1:mp1, jcol) ! opt fails
v (:, j) = v (:, jcol) ! opt is OK

The first 2 may take a section copy of the array, while the 3rd might not.


Since the upper bound of the first index of v is > mp1, we should not write

v (:, j) = v (:, jcol) ! opt is OK

as that would clobber v(mp1+1:,j). Consequently, in chasing after efficiency, we may have to redo the whole program, changing the dimensions of v to exactly fit the problem being treated, rather than something "big enough".
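
A minimal sketch of the clobbering effect, assuming an array whose declared first bound exceeds mp1. In the test program posted in this thread both are 307, so the sizes ld = 8 and mp1 = 5 below are illustrative assumptions, as is the program name clobber_demo.

```fortran
Program clobber_demo
      Implicit None
      Integer, Parameter :: ld = 8     ! declared ("big enough") first bound
      Integer, Parameter :: mp1 = 5    ! rows actually in use
      Double Precision :: va (ld, 2), vb (ld, 2)
      Integer :: i
!
      Do i = 1, ld
         va (i, 1) = 100.d0 + i; va (i, 2) = i
      End Do
      vb = va
!
      va (:mp1, 1) = va (:mp1, 2)      ! copies only the rows in use
      vb (:, 1) = vb (:, 2)            ! copies every declared row
!
! Show rows mp1+1 .. ld of column 1 after each kind of copy
      Write (*, '(A, 3F7.1)') ' bounded section: ', va (mp1+1:, 1)
      Write (*, '(A, 3F7.1)') ' whole column   : ', vb (mp1+1:, 1)
End Program clobber_demo
```

With the bounded section, rows beyond mp1 keep their original values; with the whole-column form they are overwritten from column 2.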
JohnCampbell



Joined: 16 Feb 2006
Posts: 2007
Location: Sydney

Posted: Mon Nov 12, 2018 5:30 am

I was mainly trying to identify which forms of array syntax fail, in the hope of helping to pin down the bug.
I also tried a third option that uses F77 wrappers, but it is not as clearly presented as the array syntax.

https://www.dropbox.com/s/tso7bfyhwxikh5l/tst_opt.f90?dl=0
John-Silver



Joined: 30 Jul 2013
Posts: 961
Location: Aerospace Valley

Posted: Tue Nov 13, 2018 9:38 am

This post is enough to put anyone off using optimisation for life!
For starters ...
mecej4 wrote:
Quote:
... and (ii) these days, it usually makes the code slower as well.

How is that possible? What has happened to make this a reality?
_________________
"Computers are incredibly rigid. They question nothing. Especially input data. Human beings are incredibly trusting of computers and don't check input data. Together they are capable of cocking up even the simplest calculation ... :)"
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 1879
Location: Yateley, Hants, UK

Posted: Tue Nov 13, 2018 12:08 pm

The reason why it might make the code slower is that when you put a subexpression result in a named variable, that variable has to be assigned its value, then retrieved each time, rather than the compiler holding the result in a register for re-use.

Why the compiler can't recognise that the named variable is simply a subexpression to hold in a register, I don't know. That would cost you only the code to assign it to the named variable, and of course the value will probably still be in the CPU cache anyway, but there you are.

Of course, how effective the compiler's common subexpression recognition is depends in part on (a) being able to recognise the subexpression in the first place, (b) how far ahead the lookahead goes, (c) how many subexpressions are dealt with, and, of course, whether the compiler does it correctly.

Personally, I gave up on /opt when I got funny results, believing that they were the result of re-ordering, which is a well-known issue with optimising compilers (I am usually in the fortunate position of being able to detect a stupid result). If it is the result of a bug, then perhaps I did the right thing.

I still always do simple, straightforward hand optimization, and believe the idea that it slows things down is an urban myth, but that may be my own myth.

Eddie
John-Silver



Joined: 30 Jul 2013
Posts: 961
Location: Aerospace Valley

Posted: Tue Nov 13, 2018 9:15 pm

Thanks for your insight Eddie.
You kinda second-guessed several other things in the back of my mind.

Now some babbling, in no particular direction ! .....

What I'm surprised at is all the talk of registers, etc .....
So, optimisation isn't rewriting the code 'openly' but doing stuff 'behind the scenes' (in a very clever way one assumes).
We're concentrating on the case of 'common expressions' here of course, but that's relatively simple in comparison to, say, optimising loops (especially multi-nested loops).

The 'problem example' here of course uses the new-fangled (?) implicit array expressions/operations.
I would assume that there is a set of ftn95 'standard tests' used to assess the level of optimisation for several typical (and maybe eventually not-so-typical) cases.

Take an example of 3 nested DO loops to define values in an array.
Its optimisation depends not only on the position and inter-relations of the various arrays and variables but also on what's contained within them.

Start simple and get more complicated as time permits and need determines is a good maxim.

What happens, for example, if a double DO loop containing a 2-D array is set up and then optimised ?
If the original code is ill-defined and the indices are not in the optimum order, does optimisation correct that, leading to quicker code ?

Or is there a risk that if the indices were set up in the 'optimum order' that optimisation might 'disorder' them ?

Is using an array expression actually quicker than just sticking in the necessary DO loops ?
Or is it just a way of 'reducing code writing '

And how does all that work out with optimisation !
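
A minimal sketch of the three forms these questions refer to, for illustration only; the program name loop_order_demo and the array sizes are assumptions. Fortran stores arrays in column-major order, so the first index is the one that moves through adjacent memory. The sketch shows only that the three forms produce the same values. Whether the ftn95 optimiser reorders loops, and which form is quicker, is not stated anywhere in this thread and would have to be measured.

```fortran
Program loop_order_demo
      Implicit None
      Integer, Parameter :: m = 4, n = 3
      Double Precision :: a (m, n), b (m, n), c (m, n)
      Integer :: i, j
!
! First index varies fastest in the inner loop (matches storage order)
      Do j = 1, n
         Do i = 1, m
            a (i, j) = i * (3.d0 - j)
         End Do
      End Do
!
! Same values, with the indices traversed in the other order
      Do i = 1, m
         Do j = 1, n
            b (i, j) = i * (3.d0 - j)
         End Do
      End Do
!
! Array expression: no explicit loops, traversal order is left to the compiler
      c = 2.d0 * a
!
      Write (*, *) 'loop orders agree: ', All (a == b)
      Write (*, *) 'array expr agrees: ', All (c == a + b)
End Program loop_order_demo
```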

It's not easy to produce an optimum optimisation.

It's sort of held back by the philosophy of my post 'signature', is it not ! :O) Two irresistible forces coming together to do magical things, and simply getting mixed up, not knowing what each one is actually trying to achieve.
Forum: forums.silverfrost.com Forum Index -> 64-bit. All times are GMT + 1 Hour.