forums.silverfrost.com Forum Index forums.silverfrost.com
Welcome to the Silverfrost forums
 

Perplexing bug in program (or compiler?)
JohnCampbell



Joined: 16 Feb 2006
Posts: 2351
Location: Sydney

Posted: Sun Apr 18, 2021 6:50 am

Could a possible explanation be that, as abmult is CONTAINed, it has an implied INTERFACE?
The calls "CALL abmult (ap(1,L),bp(:,L))" and "CALL abmult (ap(1,L),bp(1,L))" are both transferring elements of the array (plus a start address for an F77 wrapper).
Both of these calls could fail with the following declaration of x with INTENT(OUT):
REAL(KIND=kdp), DIMENSION(*), INTENT(OUT) :: x
although removing ", INTENT(OUT) " still crashes.

Could the implied INTERFACE be causing corruption of the executable?

The following is a cleaned out test that fails for both 32-bit and 64-bit with FTN95 Ver 8.64

https://www.dropbox.com/s/basdey4fkw4t5sw/Agcgris_test4.f90?dl=0

When running any of the versions I have posted in SDBG, on entry to GCGRIS the first dimensions of AP and BP are wrong in the Vars window, but are reported correctly in the program using SIZE. Using subscripts for AP or BP causes a crash.
mecej4



Joined: 31 Oct 2006
Posts: 1580

Posted: Sun Apr 18, 2021 1:55 pm

Within a subroutine or function, being viewed by the user in the debugger variables pane, a dummy argument has a split personality. Attributes (such as array bounds of assumed size dummy arguments) that are not visible in a Fortran program are needed when a checking option has been used, and the designer of the debugger has an option whether to allow "sneak-peeks".

At the same time, one has to avoid conflating the actual and dummy arguments. I think that FTN95 has made correct choices in this regard, since I find myself having to move back to the caller in the stack window when I want to check the actual argument for shape, size, etc., and I have no reason to be confused as to whether I am viewing the attributes of the dummy argument or the actual argument.
mecej4



Joined: 31 Oct 2006
Posts: 1580

Posted: Sun Apr 18, 2021 3:08 pm    Post subject: Re:

JohnCampbell wrote:

...
ps.
I do find the use of "SAVE" in a module to be annoying.
I do not know of any F95+ compiler that requires explicit use of SAVE in a module. SAVE does nothing as modules do not go out of scope.


The Fortran 2003 Standard says in Section 16.5.6:

(3) When execution of an instance of a subprogram completes,
(a) its unsaved local variables become undefined,
...
(c) unsaved nonfinalizable local variables of a module become undefined unless another active scoping unit is referencing the module, and


Clause (c) is what makes it prudent to specify SAVE for module variables. Fortran 95 has similar wording. I think (I have not checked carefully) that one needs to move to F2008 to make doing this redundant.
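As an illustration of the situation clause (c) describes, here is a minimal sketch. The names state_mod, total, set_total and show_total are made up for this example and do not come from the reproducer. The main program does not USE the module, so no scoping unit references the module between the two calls; under the F95/F2003 wording quoted above, the explicit SAVE is what guarantees that total is still defined in the second call.

```fortran
MODULE state_mod
  IMPLICIT NONE
  ! Hypothetical module variable; explicit SAVE keeps it defined
  ! even when no active scoping unit references the module.
  INTEGER, SAVE :: total
END MODULE state_mod

SUBROUTINE set_total(n)
  USE state_mod
  IMPLICIT NONE
  INTEGER, INTENT(IN) :: n
  total = n
END SUBROUTINE set_total

SUBROUTINE show_total()
  USE state_mod
  IMPLICIT NONE
  PRINT *, 'total =', total
END SUBROUTINE show_total

PROGRAM save_demo
  ! Note: no USE state_mod here, so the module is referenced
  ! only while set_total or show_total is executing.
  IMPLICIT NONE
  CALL set_total(5)
  CALL show_total()
END PROGRAM save_demo
```

In practice most compilers keep module variables in static storage, so the program would likely behave the same without SAVE; the attribute only removes the reliance on that behaviour.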

I tend to use many compilers, new and old, when tracking down bugs in code and compilers, so a reproducer may contain additional "annoying" features. Creating a reproducer involves a lot of "CUT" and some "PASTE", and most parts of a reproducer can be expected to be unstylish, if not pointless.
JohnCampbell



Joined: 16 Feb 2006
Posts: 2351
Location: Sydney

Posted: Sun Apr 18, 2021 3:14 pm

mecej4,

I am not sure where your comments about believing the dummy argument's bounds leave my understanding of the problem.

In your latest cut-down example,
entering gcgris, ap has correct dimensions
entering abmult, x has incorrect size of 1, so for irtn = 2 produces a subscript out of bounds error, different from the previous failure report.
I think this is a different presentation of the error/bug.

In my latest cut-down example,
entering gcgris, ap and bp have incorrect first dimension.
CALL abmult (ap,bp) works successfully, while
CALL abmult (ap(1,L),bp(1,L)) fails with integer overflow at the call.
(I expect CALL abmult (ap(1,L),bp(:,L)) would be similar.)

I am assuming that either:
resolving the memory address for ap(1,L) or bp(1,L) produced integer overflow,
or some other problem with stack info for the call.
Note also, this example produces the same error report if running in SDBG or as Agcgris_test4.exe.

Can you or Paul identify where the integer overflow is being generated?
mecej4



Joined: 31 Oct 2006
Posts: 1580

Posted: Sun Apr 18, 2021 4:17 pm

John, the behaviour of my test program differs between 32-bit and 64-bit versions.

The 32-bit EXE terminates with integer overflow, probably while computing how many bytes to set to 'undefined', and thinking that that number is more than 32 bits permit.

The 64-bit version terminates with access violation, probably trying to set gigabytes of memory to 'undefined', but the count of bytes is adequately represented using more than 32 but less than 64 bits.

Turning now to your short test program (the one that you posted to Dropbox), we see different behaviour with 32 and 64 bit EXEs compiled with /check. With Version 8.71, the 64-bit EXE runs to completion, whereas the 32-bit EXE aborts with integer overflow.

Your question regarding where the integer overflow happens is tricky to answer. Normally, integer overflow occurs billions of times and is ignored. If code is produced to detect and trap integer overflow, and such a trap is sprung, the question is whether the overflow originated in the arithmetic performed by the user's code, or in the calculations of addresses, array bounds, etc. These secondary calculations are more numerous when /check or /checkmate has been specified.

I suspect that in the present case the overflow occurred in the secondary address calculations and not in the user code.
JohnCampbell



Joined: 16 Feb 2006
Posts: 2351
Location: Sydney

Posted: Mon Apr 19, 2021 1:01 am

Mecej4,
I modified your latest reproducer to produce both errors.
Compile FTN95 /checkmate /link
Run SDBG
place breakpoints at lines 22 and 34 and run
stop at 22 : looks ok
stop at 34 : 1st > X(1) loop k to allow continue; incorrect bounds
stop at 34 : 2nd > X(42)
no 3rd stop showing integer overflow in call
Code:
MODULE mcs
  IMPLICIT NONE
  INTEGER, PARAMETER :: LRCGD1 = 19
  INTEGER :: NRN
  REAL, DIMENSION(:,:), ALLOCATABLE, SAVE :: app
END Module

Program HST3d
USE mcs
IMPLICIT NONE
NRN = 7
allocate(app(NRN,0:5))
CALL gcgris(app)
print *,app(3,0)
end Program

SUBROUTINE gcgris(ap)
  USE mcs
  IMPLICIT NONE
  REAL, DIMENSION(NRN,0:*), INTENT(IN OUT) :: ap
  integer :: L = 0
  CALL abmult( ap(1:NRN,0) )    ! > sdbg reports x(1) ; use do 1,1 to continue
  CALL abmult( ap )             ! > sdbg reports x(42)
  CALL abmult( ap(1,L) )        ! > sdbg reports integer overflow
end subroutine gcgris

SUBROUTINE abmult(x)
  USE mcs
  IMPLICIT NONE
  REAL, DIMENSION(*), INTENT(OUT) :: x
  INTEGER :: irn
  integer :: k = 0
!
  k = k+1
  write (*,*) 'at abmult',k
  DO irn=1,k ! nrn
     x(irn) = 0.1
  END DO
END SUBROUTINE abmult
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7142
Location: Salford, UK

Posted: Wed Apr 21, 2021 8:36 am

mecej4

Initial reactions.

1) The code appears to run successfully when /CHECK is not used.

2) For 32 bits, it fails when using /CHECK with array subscript out of bounds. This appears at first sight to be a false error report.

3) For 64 bits, it appears to run successfully when /UNDEF is used but hangs completely with /FULL_UNDEF.

4) I am doubtful about the validity of the Fortran. The use of INTENT means that interfaces are required but then NRN is declared in a module. So I suspect that the interfacing needs to be applied via the module and this would make more sense from an object-oriented point of view.

5) Then there is also the array APP that needs to be considered. It is declared in the module and passed as an argument to gcgris. But gcgris has direct access to APP via the USE statement.

My initial conclusion is that gcgris must be defined in the module and that abmult needs an interface.
Theo1002



Joined: 19 Apr 2021
Posts: 15
Location: Ingelheim am Rhein, Germany

Posted: Wed Apr 21, 2021 11:25 am    Post subject: Different results for 32Bit and 64 Bit version

I don't know if my problem fits under this topic, but I'll give it a try.
Using Silverfrost FTN95 to reactivate older Fortran programs which ran in the 80s on an IBM 3032 mainframe generally works fine, once IBM-specific commands/conventions have been replaced.
Currently, however, I'm stuck with a program that calculates matrix elements. What puzzles me is the fact that the 32-bit version delivers different results from the 64-bit version (I'm using Plato for compilation and SDBG version 8.70.00).
Whereas the 32-bit version produces correct results, the ones calculated with the 64-bit version are obviously wrong. Trying to isolate the problem has kept me busy for the last few days, but it is not so easy due to 20 subroutines and many function calls. Since I have not yet succeeded in isolating the root cause of the differences between the results of the 32-bit and 64-bit versions, I would like to ask the following questions:
Did anybody experience similar deviations between results of 32-bit and 64-bit versions?
Can certain compiler options help to isolate the problem?
Please note that I have no experience with command line use of SDBG, but use the Plato environment.
Best regards
Theo
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7142
Location: Salford, UK

Posted: Wed Apr 21, 2021 11:58 am

Theo1002

Can I suggest that you copy this post to a new thread. Then I could make some recommendations. There is nothing to suggest that this is related to the current issue.
mecej4



Joined: 31 Oct 2006
Posts: 1580

Posted: Wed Apr 21, 2021 1:59 pm

Paul,

Thanks for looking into the problem.

Perhaps I can show that it is not the absence of an interface that is causing problems by just modifying the reproducer, instead of invoking sections of the Fortran standard.

The following version contains the subroutines inside module MCS, so an interface is provided, whether or not one is required.

Code:
MODULE mcs
  IMPLICIT NONE
  INTEGER, PARAMETER :: LRCGD1 = 19
  INTEGER :: NRN
  REAL, DIMENSION(:,:), ALLOCATABLE, SAVE :: app
CONTAINS

SUBROUTINE gcgris(ap)
  IMPLICIT NONE
  REAL, DIMENSION(NRN,0:*), INTENT(IN OUT) :: ap
  CALL abmult(ap(1:NRN,0))
end subroutine gcgris

SUBROUTINE abmult(x)
  IMPLICIT NONE
  REAL, DIMENSION(*), INTENT(OUT) :: x
  INTEGER :: irn
!
  DO irn=1,nrn
     x(irn) = 0.1
  END DO
END SUBROUTINE abmult

END Module

Program HST3d
USE mcs
IMPLICIT NONE
NRN = 7
allocate(app(NRN,0:5))
CALL gcgris(app)
print *,app(3,0)
end Program


Using FTN95 8.71, latest DLLS80.

The following options lead to a normal run, printing "0.10000"

/debug
/64 /debug
/check /64
/undef /64

The following options cause the program to end abnormally:

/check : subscript out of bounds (X is incorrectly displayed as having a size of 1 in SDBG)

/full_undef /64: access violation, need to run inside SDBG64 to see details of violation.

Please note the inconsistency in the behaviours with /check /64 (normal) and /check (subscript bounds violation).

P.S. Apologies, I erred in copying and pasting in the code, with the result that the program was incomplete. I have corrected the code (I hope!).


Last edited by mecej4 on Wed Apr 21, 2021 2:34 pm; edited 2 times in total
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7142
Location: Salford, UK

Posted: Wed Apr 21, 2021 2:31 pm

mecej4

I must be missing something. abmult is external to gcgris. abmult is called from gcgris and the compiler is not informed that abmult uses INTENT(OUT).

I don't doubt that FTN95 is not coping with this but I think that we need to start with well-formed Fortran code.
mecej4



Joined: 31 Oct 2006
Posts: 1580

Posted: Wed Apr 21, 2021 2:44 pm

Paul, my apologies -- I had pasted in a piece of the OLD code instead of the complete NEW code. I have corrected the code.

Regarding INTENT, I think that the position taken by the Fortran standard is that keeping the code in conformity with stated INTENT is the programmer's responsibility. However, one of the strengths of FTN95 is its ability to help the errant programmer in such matters, and it is to catch errors of this type that I often turn to FTN95.

In fact many, many old Fortran programs, such as those in Alan Miller's repository (https://wp.csiro.au/alanmiller/), often declare output array arguments as INTENT(OUT), when only a part of the array will be updated in a subprogram. Until a few years ago, when Mr. Ian Hounam of NAG advised me to the contrary, I myself did not know that INTENT(OUT) meant that all elements of the array became undefined as soon as the subprogram was entered.
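To make the INTENT(OUT) point concrete, here is a small sketch of the safer pattern for a routine that updates only part of an array. The names partial_mod and set_first are invented for this illustration. Because only x(1:n) is assigned, the dummy is declared INTENT(INOUT), so the remaining elements keep the values they had on entry instead of becoming undefined.

```fortran
MODULE partial_mod
  IMPLICIT NONE
CONTAINS
  SUBROUTINE set_first(x, n)
    ! Hypothetical routine: only the first n elements are written,
    ! so INTENT(INOUT) is used rather than INTENT(OUT).
    REAL, DIMENSION(:), INTENT(INOUT) :: x
    INTEGER, INTENT(IN) :: n
    x(1:n) = 0.1
  END SUBROUTINE set_first
END MODULE partial_mod

PROGRAM intent_demo
  USE partial_mod
  IMPLICIT NONE
  REAL :: a(5)
  a = 1.0                 ! define every element before the call
  CALL set_first(a, 2)    ! a(3:5) remain defined and equal to 1.0
  PRINT *, a
END PROGRAM intent_demo
```

Had x been declared INTENT(OUT), referencing a(3:5) after the call would be non-conforming, which is the kind of error a checking option such as /undef may be able to flag.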
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7142
Location: Salford, UK

Posted: Wed Apr 21, 2021 4:14 pm

mecej4

I suspect that DIMENSION(*) in abmult is incorrect. It means "pass by size".

It should be DIMENSION(:) which means "pass by shape".

Anyway this change appears to fix the problem.

I would also remove the * in gcgris and the SAVE.
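For reference, a sketch of the module version of the reproducer with these suggestions applied might look like the following. It is one possible reading of the advice, not a tested fix: the dummy arrays become assumed-shape, the * in gcgris and the SAVE are removed, the unused LRCGD1 parameter is dropped, and using SIZE(x) as the loop bound is an additional choice made here for the sketch.

```fortran
MODULE mcs
  IMPLICIT NONE
  INTEGER :: NRN
  REAL, DIMENSION(:,:), ALLOCATABLE :: app
CONTAINS

SUBROUTINE gcgris(ap)
  ! Assumed-shape dummy; the lower bound 0 of the second
  ! dimension is restated because bounds are not inherited.
  REAL, DIMENSION(:,0:), INTENT(IN OUT) :: ap
  CALL abmult(ap(:,0))
END SUBROUTINE gcgris

SUBROUTINE abmult(x)
  ! Assumed-shape dummy: the extent travels with the argument.
  REAL, DIMENSION(:), INTENT(OUT) :: x
  INTEGER :: irn
  DO irn = 1, SIZE(x)
     x(irn) = 0.1
  END DO
END SUBROUTINE abmult

END MODULE mcs

PROGRAM HST3d
  USE mcs
  IMPLICIT NONE
  NRN = 7
  ALLOCATE(app(NRN,0:5))
  CALL gcgris(app)
  PRINT *, app(3,0)
END PROGRAM HST3d
```

With assumed-shape dummies the callee receives the bounds of the actual argument, so a run-time checker has real extents to work with rather than having to infer them for an assumed-size array.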
Theo1002



Joined: 19 Apr 2021
Posts: 15
Location: Ingelheim am Rhein, Germany

Posted: Wed Apr 21, 2021 10:31 pm    Post subject: Re:

PaulLaidler wrote:
Theo1002

Can I suggest that you copy this post to a new thread. Then I could make some recommendations. There is nothing to suggest that this is related to the current issue.


O.K. Paul, I tried to do so. Unfortunately the source code was cut off (maybe too long). I will try again tomorrow.
All times are GMT + 1 Hour