forums.silverfrost.com

mecej4 · Joined: 31 Oct 2006 Posts: 1899

I have been working on a groundwater flow program with about 12,000 lines of Fortran code. FTN95 (V 8.71) has been quite helpful in finding and fixing bugs related to subscript bounds, uninitialised variables, etc. However, in one run, I encountered strange program behaviour.

When I compiled the program with /checkmate and built a 32-bit EXE, the program ran for a few seconds and then aborted with INTEGER OVERFLOW on a line that contains just a subroutine call:

JohnCampbell · Joined: 16 Feb 2006 Posts: 2615 Location: Sydney

Would abmult need an interface?

mecej4 · Joined: 31 Oct 2006 Posts: 1899

Thanks for taking a look. Good point.

When an array section is used as an actual argument, an explicit interface may be required. However, ABMULT is a contained (internal) subroutine of the caller, so its interface is already explicit in the host routine.
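A minimal sketch of that point, not taken from the program under discussion: the names interface_demo and fill_column are hypothetical, and the dummy argument is made assumed-shape here only to show a case where an explicit interface is mandatory. Because the subroutine is internal, the host gets that interface without any INTERFACE block.

```fortran
program interface_demo
   implicit none
   integer, parameter :: kdp = selected_real_kind(15)
   real(kind=kdp) :: ap(4,0:2)

   ap = 0.0_kdp
   ! Array section as actual argument, matched to an assumed-shape dummy.
   call fill_column(ap(:,1))
   print *, ap(:,1)

contains

   ! Internal procedure: its interface is explicit in the host,
   ! so the assumed-shape dummy x(:) is legal without an INTERFACE block.
   subroutine fill_column(x)
      real(kind=kdp), intent(out) :: x(:)
      integer :: i
      do i = 1, size(x)
         x(i) = real(i, kdp)
      end do
   end subroutine fill_column

end program interface_demo
```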

The issue persists if the call is changed to

PaulLaidler · Posted: Thu Apr 15, 2021 2:56 pm

This is my impression...

The argument ap(1,L) of abmult is a scalar whereas an array is expected.

If I change to ap(1,L:L), it is an array but only with one element.

Within abmult, x(irn) = s is putting values into multiple elements of the actual argument ap. So the destination is not valid.

mecej4 · Joined: 31 Oct 2006 Posts: 1899

Paul, your remarks relate to the F77 conventions for passing contiguous array sections and their variance from the F90+ conventions for the same. Indeed, the F2003 standard comments at length on these differences in section C.9.5.

However, every current Fortran compiler that we are likely to use today, including FTN95, handles such F77 style calls perfectly well. Here is a very short example that illustrates that capability. FTN95 has no problem with this code, nor does the NAG compiler. Similarly, if /check or /debug is used instead of /checkmate on the larger test code that I posted to Dropbox, there is no problem.
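For readers without access to the posted example, here is a minimal sketch of the kind of F77-style call being discussed. It is not the code from the post: the names seqassoc_demo and scale_copy, the array sizes, and the arithmetic are all assumptions made for illustration. An array element is passed as the actual argument, and the assumed-size dummy is associated with the storage sequence that starts at that element.

```fortran
program seqassoc_demo
   implicit none
   integer, parameter :: kdp = selected_real_kind(15)
   integer, parameter :: n = 5
   real(kind=kdp) :: ap(n,0:2), bp(n,0:2)
   integer :: L

   ap = 0.0_kdp
   bp = 1.0_kdp
   L  = 0                      ! valid: the second dimension starts at 0

   ! ap(1,L) is syntactically a scalar, but with an assumed-size dummy it
   ! stands for the sequence ap(1,L), ap(2,L), ... (sequence association).
   call scale_copy(n, ap(1,L), bp(1,L))
   print *, ap(:,L)

contains

   ! x and y are assumed-size, so their extent is not known here;
   ! the caller must pass the number of elements to process.
   subroutine scale_copy(m, x, y)
      integer, intent(in) :: m
      real(kind=kdp), intent(out) :: x(*)
      real(kind=kdp), intent(in)  :: y(*)
      integer :: i
      do i = 1, m
         x(i) = 2.0_kdp*y(i)
      end do
   end subroutine scale_copy

end program seqassoc_demo
```

Only the first m elements of column L are written, so the call stays inside the storage of ap as long as m does not exceed the number of elements from ap(1,L) to the end of the array.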

PaulLaidler · Posted: Thu Apr 15, 2021 5:03 pm

mecej4

My impression was that the first argument is a scalar or an array with one element, whereas the routine sets multiple elements of the corresponding dummy argument.

If that is correct then the code is at fault regardless of the compiler or the Fortran standard.

JohnCampbell · Joined: 16 Feb 2006 Posts: 2615 Location: Sydney

For the 32-bit version, I ran in SDBG.
I changed routine init.f90 to report that ap was allocated, along with its dimensions, which appeared to be correct.

mecej4 · Joined: 31 Oct 2006 Posts: 1899

John,

The caller is passing the first columns of the two-dimensional arrays AP and BP as the actual arguments to match the one-dimensional arrays X and Y in ABMULT. Yes, L = 0, but the arrays AP and BP have been declared with a matching lower bound of 0 in their second dimension.

With /checkmate, I expect FTN95 to set INTENT(OUT) arguments to 'UNDEFINED'. To do so correctly, it needs the actual bounds of those arguments, which are normally not available for assumed-size arguments, but we expect /checkmate to pass whatever extra information is needed for checking and for initialising to 'UNDEFINED'.

The original code is the HST3D Version 2 from the USGS, see https://wwwbrr.cr.usgs.gov/projects/GW_Solute/hst/index.shtml and http://priede.bf.lu.lv/ftp/pub/TIS/datu_analiize/WaterFlow/HST3D/ .

Thanks for your comments.

JohnCampbell · Joined: 16 Feb 2006 Posts: 2615 Location: Sydney

mecej4 · Joined: 31 Oct 2006 Posts: 1899

John, the full story is that the original code passes assumed shape arrays to BLAS-like routines such as ABMULT, but for calls with assumed shape arrays, there is a major performance penalty with FTN95 (but not with Gfortran or Intel Fortran). Here are run times (FTN95 8.71 /opt /64, Ifort 21 /O2, CPU: i7-10710-1.1 GHz)

JohnCampbell · Joined: 16 Feb 2006 Posts: 2615 Location: Sydney

I have further looked at the code you provided.
In iter.f90 from line 55 I have included:

JohnCampbell · Joined: 16 Feb 2006 Posts: 2615 Location: Sydney

I am using FTN95 Ver 8.64 32-bit and SDBG Ver 8.62

Using CALL abmult (ap(1,L),bp(:,L)):
when it crashes, the Vars:GCGRIS window lists two entries named BP:
a variable BP and
an array "BP = REAL*8 (280774289,0:214748646)"
These have different memory addresses,
which looks confusing to me.

-- new test --
I returned to using CALL abmult (ap,bp);
The program terminates normally, but SDBG is still listing a variable BP and an array BP ??? FTN95/SDBG error ?
Arrays AP and BP have incorrect (random) sizes.
(I should put a breakpoint prior to the call)

-- new test --
Now in SDBG with breakpoint, checked prior to call abmult
AP has wrong size (57735933,0:2147483646)
BP has wrong size, (4604097,0:2147483646), plus a variable BP exists.
write statement reports size(ap,1) correctly as 6479

Their first dimension should be correct ?? FTN95/SDBG error ?
I would suggest this is a possible cause of the integer overflow.
Their second dimension is reported as 0:2147483646, a plausible extent for *, so that could be OK.

In abmult
X and Y have the same memory size as the module MCS2 variables AP and BBP.
X = Y = REAL*8 (38874) { this is 6479*6 }

(my comment about "-nrn" appears invalid, as CI might be adjusted for this offset, i.e. my ngood = 36805 vs nbad = 0)

-- new test --
Further test: I changed the declarations for AP and BP to
REAL(KIND=kdp), DIMENSION(nrn,*) :: ap
REAL(KIND=kdp), DIMENSION(nbn,*) :: bp

SDBG still reports their first dimension incorrectly

Other arrays ra and rr are reported with correct dimensions by SDBG.
array h is also correct. (automatic array where nsdr is module variable)
REAL(KIND=kdp), DIMENSION(lrcgd1,*), INTENT(INOUT) :: ra
REAL(KIND=kdp), DIMENSION(*), INTENT(IN OUT) :: rr
REAL(KIND=kdp), DIMENSION(0:nsdr-2,0:nsdr-2) :: h

mecej4 · Joined: 31 Oct 2006 Posts: 1899

Encouraged by your comments, John, I succeeded in creating a short reproducer, which I hope Paul will consider.

PaulLaidler · Posted: Sat Apr 17, 2021 7:40 am

mecej4

Thank you for this. I have made a note to investigate.

JohnCampbell · Joined: 16 Feb 2006 Posts: 2615 Location: Sydney

mecej4,

I had a look at your short reproducer.
For my testing with FTN95 Ver 8.64, on entry to gcgris, the array ap has valid dimensions, so I am not sure if this identifies the previous problem (that I am seeing).

I have produced a single file reproducer from your larger test.

https://www.dropbox.com/s/jvwul6p11ol83gi/Agcgris_test.f90?dl=0

https://www.dropbox.com/s/gdw1ndc10g5iccw/Agcgris_test3.f90?dl=0

For my testing of this reproducer, on entry to gcgris, the arrays ap, bp have incorrect dimensions in SDBG, but array ra has valid dimensions.
I think the invalid dimensions for ap and bp is a problem to be addressed.

I compiled as:
ftn95 agcgris_test /checkmate /link
sdbg agcgris_test

I set breakpoints at:
line 210 : at call to gcgris : shows arrays ap, bbp and ra defined as expected
line 115 : entry to gcgris : shows arrays ap, bp with invalid dimensions
line 145 : entry to abmult : shows X,Y with correct dimensions
F6 to start test

Paul,
I see an error at line 115 : entry to gcgris that AP and BP have invalid dimensions in SDBG, reported as (1,7), but first dimension should = nbn = 100.
I identified this as a problem in the bigger program.
( note write (*,*) ... size(ap,1) reports correct value of first dimension, different from SDBG ?? )
In my example:
lines 96:101 show alternative definitions of AP and BP, although there was no change to outcome.
lines 130:132 show alternative calls to abmult. The first 2 resulted in a crash in the larger program, while the 3rd ran to completion.
I have only tested the 3rd option with this reproducer.

I thought this identified the problem for FTN95/SDBG that needs checking.

edit:
The second link above, for Agcgris_test3.f90, reproduces the original crash. It uses the 1st set of declarations (as in the originally posted program) and makes the 3rd call followed by the 2nd call.
The 3rd call works, but the 2nd call, which requires the array dimensions, fails.

I hope this easily demonstrates the problem.

ps.
I do find the use of "SAVE" in a module to be annoying.
I do not know of any F95+ compiler that requires explicit use of SAVE in a module. SAVE does nothing as modules do not go out of scope.
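A small sketch of the SAVE point, with hypothetical names (state_mod, set_state, get_state). As far as I know, Fortran 2008 gives module variables the SAVE attribute implicitly, while earlier standards formally allowed an unsaved module variable to become undefined once no active scoping unit referenced the module; in practice compilers keep the value either way. The main program below deliberately does not USE the module, so the module is only in scope inside the two subroutines.

```fortran
module state_mod
   implicit none
   integer :: nstate          ! no SAVE attribute, no initialiser
end module state_mod

program save_demo
   implicit none
   integer :: n

   call set_state(42)         ! module is in scope only inside set_state
   call get_state(n)          ! ... and again only inside get_state
   print *, 'nstate read back by a later call: ', n

contains

   subroutine set_state(v)
      use state_mod, only: nstate
      integer, intent(in) :: v
      nstate = v
   end subroutine set_state

   subroutine get_state(v)
      use state_mod, only: nstate
      integer, intent(out) :: v
      v = nstate
   end subroutine get_state

end program save_demo
```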