 |
forums.silverfrost.com Welcome to the Silverfrost forums
|
View previous topic :: View next topic |
Author |
Message |
mecej4
Joined: 31 Oct 2006 Posts: 1899
|
Posted: Fri Oct 12, 2018 3:39 pm Post subject: Array expressions mishandled when /opt is used |
|
|
Here is another instance of an optimization bug. This one occurs when array sections are copied and /opt /64 has been requested. The bug is rather fragile, and minor changes to the code make it disappear. To make the bug easy to notice, I made two copies of a single subroutine, which modifies an input 2-D array. The two copies are identical except for one line. The first version has, on line 64,
Code: | v (:mp1, j) = v (:mp1, jcol) ! array section copy |
The second version has, instead, a DO loop, on lines 91-93:
Code: | Do k = 1, mp1; v (k, j) = v (k, jcol); End Do |
The main program compares the returned arrays v from the two versions. In the absence of bugs, the two should be identical. With 32-bit compilations, or with /64 but not /opt, they are identical. With /64 /opt, the bug surfaces.
The source code:
Code: | Program tst
Implicit None
Integer :: m = 306, nvar = 4, n = 5, ns = 2
Double Precision :: eq (4, 5)
Double Precision, Dimension (307, 6) :: v, v1, v2
Integer :: i, j
Data eq/0.d0,1.d0,3*0.d0,1.d0,3*0.d0,1.d0,2*0.d0,1.d0,3*0.d0, &
1.d0,3.d0,2*0.d0/
!
do i = 1, 307
do j = 1,6
v(i, j) = i*(3.d0-j)
end do
end do
!
v1 = v
Call scrch1 (m, nvar, eq, v1)! Uses vector assignment, v (:mp1, j) = v (:mp1, jcol)
!
v2 = v
Call scrch2 (m, nvar, eq, v2)! Uses DO loop, DO k=1, mp1; v(k, j) = v(k, jcol)
!
! Check that the two results match, as they should
!
If (any(Abs(v1-v2) > 1d-6)) Then
Write (*, 10) m, nvar, n, ns
Write (*,*) 'SCRCH, diffs found'
Do i = 1, 307
Do j = 1, 6
If (Abs(v1(i, j)-v2(i, j)) > 1d-5) &
& write (*, 20) i, j,v1 (i, j), v2 (i, j)
End Do
End Do
Else
Write (*,*) 'SCRCH, no diffs'
End If
Stop
!
10 Format ('m, nvar, meqa, n, ns = ', 5 I5)
20 Format (2 I4, 2 x, 2 ES15.7)
!
End Program
!
Subroutine scrch1 (m, nvar, eq, v)
Implicit None
Integer, Intent (In) :: m, nvar
Integer :: nscol (2) = (/ 2, 3 /), iresl (4) = (/ 1, 4, 2, 1 /)
Double Precision, Intent (In) :: eq (4, 5)
Double Precision, Dimension (307, 6), Intent (Inout) :: v
Integer :: l, j, lrow, lcol, jcol, n, np1, mp1, ns
Double Precision :: fact
n = nvar + 1; np1 = n + 1; mp1 = m + 1; ns = nvar - 2
!
Do l = 1, 2
lrow = iresl (2*l-1); lcol = iresl (2*l)
Do j = 1, ns
jcol = nscol (j); fact = eq (lrow, jcol)
v (:mp1, jcol) = v (:mp1, jcol) - fact * v (:mp1, lcol)
End Do
fact = eq (lrow, n)
v (:mp1, np1) = v (:mp1, np1) - fact * v (:mp1, lcol)
End Do
Do j = 1, ns
jcol = nscol (j); If (jcol == j) Cycle
v (:mp1, j) = v (:mp1, jcol) ! array section copy
End Do
v (:mp1, n-2) = v (:mp1, n); v (:mp1, n-1) = v (:mp1, np1)
Return
End Subroutine scrch1 |
NOTE: The second version of the subroutine is posted in a follow-on posting below, because of the forum line limits. |
|
Back to top |
|
 |
mecej4
Joined: 31 Oct 2006 Posts: 1899
|
Posted: Fri Oct 12, 2018 3:41 pm Post subject: ..continued.., last part of source code |
|
|
Here is the second version of the subroutine. Put the two pieces together into a single source file, compile and run with /64 /opt, then with /64 alone.
Code: | Subroutine scrch2 (m, nvar, eq, v)
Implicit None
Integer, Intent (In) :: m, nvar
Integer :: nscol (2) = (/ 2, 3 /), iresl (4) = (/ 1, 4, 2, 1 /)
Double Precision, Intent (In) :: eq (4, 5)
Double Precision, Dimension (307, 6), Intent (Inout) :: v
Integer :: l, j, k, lrow, lcol, jcol, n, np1, mp1, ns
Double Precision :: fact
n = nvar + 1; np1 = n + 1; mp1 = m + 1; ns = nvar - 2
!
Do l = 1, 2
lrow = iresl (2*l-1); lcol = iresl (2*l)
Do j = 1, ns
jcol = nscol (j); fact = eq (lrow, jcol)
v (:mp1, jcol) = v (:mp1, jcol) - fact * v (:mp1, lcol)
End Do
fact = eq (lrow, n)
v (:mp1, np1) = v (:mp1, np1) - fact * v (:mp1, lcol)
End Do
Do j = 1, ns
jcol = nscol (j); If (jcol == j) Cycle
Do k = 1, mp1 ! Do Loop copy
v (k, j) = v (k, jcol)
End Do
End Do
v (:mp1, n-2) = v (:mp1, n); v (:mp1, n-1) = v (:mp1, np1)
Return
End Subroutine scrch2 |
|
|
Back to top |
|
 |
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 8211 Location: Salford, UK
|
Posted: Sat Oct 13, 2018 8:05 am Post subject: |
|
|
Thank you for this report. I have logged it for investigation. |
|
Back to top |
|
 |
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 8211 Location: Salford, UK
|
Posted: Thu Nov 08, 2018 2:39 pm Post subject: |
|
|
An initial investigation indicates that the fault lies with item 30 of the optimiser. So a temporary "fix" is to add /inhibit_opt 30.
The issue remains outstanding. |
|
Back to top |
|
 |
mecej4
Joined: 31 Oct 2006 Posts: 1899
|
Posted: Thu Nov 08, 2018 3:14 pm Post subject: |
|
|
Paul, where can we see a list of these "optimisation item"s? How many of these are there, and are they grouped into categories? |
|
Back to top |
|
 |
LitusSaxonicum
Joined: 23 Aug 2005 Posts: 2402 Location: Yateley, Hants, UK
|
Posted: Thu Nov 08, 2018 3:21 pm Post subject: |
|
|
I agree with Mecej4 - it would be interesting to see what optimisations are performed, and which might simply be skipped.
It's a long time since I even attempted to use /OPT, as my codes run adequately fast without it, especially on my new computer that is considerably faster than the old one.
Eddie |
|
Back to top |
|
 |
mecej4
Joined: 31 Oct 2006 Posts: 1899
|
Posted: Thu Nov 08, 2018 3:51 pm Post subject: |
|
|
Incidentally, Eddie, I was also brought up with "subexpression optimisation" in my blood. More recently, I have been weaning myself from that habit, because (i) it makes the code less readable, as it always did, and (ii) these days, it usually makes the code slower as well.
The variables that are created for the purpose of holding many of these subexpressions are very short-lived and, if they are declared in the subprogram declarations section, the compiler ends up generating code that will be doing lots of copying to and from main memory. Left to itself, the compiler optimiser can do a better job of recognising and confining these variables to registers.
Last edited by mecej4 on Sun Nov 11, 2018 12:05 pm; edited 1 time in total |
|
Back to top |
|
 |
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 8211 Location: Salford, UK
|
Posted: Thu Nov 08, 2018 4:42 pm Post subject: |
|
|
I will make enquiries. |
|
Back to top |
|
 |
LitusSaxonicum
Joined: 23 Aug 2005 Posts: 2402 Location: Yateley, Hants, UK
|
Posted: Thu Nov 08, 2018 5:51 pm Post subject: |
|
|
Mecej4,
I think it may well be the case that common subexpression removal by an efficient and working optimization is probably better than doing it yourself, but surely, for readability it depends how you write it, doesn�t it? And that, in part, depends on the naming of the variables that hold the pre-calculated common subexpressions. 'm1' doesn't cut it for me.
One programmer whose work I admired always used �GASH� (British military slang (specifically from the Royal Navy and Royal Marines) for rubbish (garbage), or for something that is considered useless, broken or otherwise of little value, rather than any other definition) for such a variable, and when more than one was required, supplemented them with �GESH�, �GISH�, �GOSH� and �GUSH� if he had to, which wasn�t often.
As for me, I tended to prefer COEF with various other suffixes for the same job. (COEF_A, COEF_B etc)
In both cases, the readability seems to me to be enhanced rather than degraded.
Eddie |
|
Back to top |
|
 |
JohnCampbell
Joined: 16 Feb 2006 Posts: 2615 Location: Sydney
|
Posted: Sun Nov 11, 2018 1:11 am Post subject: |
|
|
I looked at array sections in scrch1 and found:
v (:mp1, j) = v (:mp1, jcol) ! opt fails
v (1:mp1, j) = v (1:mp1, jcol) ! opt fails
v (:, j) = v (:, jcol) ! opt is OK
The first 2 may take a section copy of the array, while the 3rd might not.
for " v (:, np1) = v (:, np1) - fact * v (:, lcol) ",
I would usually replace it by the following:
fact = -eq (lrow, n)
call daxpy (mp1, fact, v(1, lcol), v(1, np1) )
As with mecej4's observations of compiler optimisation performance, it is now better left to the compiler to clean this up. ( most of my F77 wrapper approaches are no longer effective) |
|
Back to top |
|
 |
mecej4
Joined: 31 Oct 2006 Posts: 1899
|
Posted: Sun Nov 11, 2018 2:34 pm Post subject: Re: |
|
|
JohnCampbell wrote: | I looked at array sections in scrch1 and found:
v (:mp1, j) = v (:mp1, jcol) ! opt fails
v (1:mp1, j) = v (1:mp1, jcol) ! opt fails
v (:, j) = v (:, jcol) ! opt is OK
The first 2 may take a section copy of the array, while the 3rd might not. |
Since the upper bound of the first index of v is > mp1, we should not write
v (:, j) = v (:, jcol) ! opt is OK
as that would clobber v(mp1+1:,j). Consequently, in chasing after efficiency, we may have to redo the whole program, changing the dimensions of v to exactly fit the problem being treated, rather than something "big enough". |
|
Back to top |
|
 |
JohnCampbell
Joined: 16 Feb 2006 Posts: 2615 Location: Sydney
|
Posted: Mon Nov 12, 2018 5:30 am Post subject: |
|
|
I was more trying to identify which types of array syntax are failing, hopefully to assist with the bug identification
I also tried a 3rd option that used F77 wrappers, but it is not as clearly presented as the array syntax.
https://www.dropbox.com/s/tso7bfyhwxikh5l/tst_opt.f90?dl=0 |
|
Back to top |
|
 |
LitusSaxonicum
Joined: 23 Aug 2005 Posts: 2402 Location: Yateley, Hants, UK
|
Posted: Tue Nov 13, 2018 12:08 pm Post subject: |
|
|
The reason why it might make the code slower is that when you put a subexpression result in a named variable, that variable has to be assigned its value, then retrieved each time, rather than the compiler holding the result in a register for re-use.
Why the compiler can't recognise the single variable as a subexpression to hold in a register I don't know, which costs you only the code to assign it to a named variable, and then of course, it will probably be still retained in the cpu cache anyway, but there you are.
Of course, the common subexpression recognition in the compiler and how effective it is depends in part on (a) being able to recognise it in the first place, and (b) how far ahead is the lookahead, (c) how many subexpressions are dealt with, and of course, whether the compiler does it correctly.
Personally, I gave up on /opt when I got funny results, believing that they were the result of re-ordering, which is a well-known issue with optimising compilers (I am usually in the fortunate position of being able to detect a stupid result). If it is the result of a bug, then perhaps I did the right thing.
I still always do simple, straightforward hand optimization, and believe the idea that it slows things down is an urban myth, but that may be my own myth.
Eddie |
|
Back to top |
|
 |
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 8211 Location: Salford, UK
|
Posted: Mon Feb 04, 2019 3:22 pm Post subject: |
|
|
The original bug on this thread has now been fixed for the next release of FTN95. |
|
Back to top |
|
 |
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
Powered by phpBB © 2001, 2005 phpBB Group
|