FP Stack fault in /CHECKMATE
wahorger



Joined: 13 Oct 2014
Posts: 1214
Location: Morrison, CO, USA

Posted: Sat Jan 27, 2018 8:28 pm    Post subject: FP Stack fault in /CHECKMATE

I'm getting an FP stack fault in my main code. I've isolated the routine and established that the fault occurs only when the code is compiled with /CHECKMATE. The error is:

Quote:
Runtime error from program:f:\temp\checkmate\win32\forumtesting.exe
Floating point stack fault
Floating point stack fault at address 1012d2d2
1012d227 IO_convert_long_double_to_ascii [+00ab]
1013975e R_WSF_main [+081c]
101396f6 D8__WSF [+0024]
ALASKA - in file forumtesting.for at line 123 [+18b8]
MAIN# - in file forumtesting.for at line 4 [+00bd]

eax=0360f7d8 ebx=000d7f3b ecx=00000000
edx=00000001 esi=0360fc80 edi=0360cfe2
ebp=0360d028 esp=0360cfd4 IOPL=0
ds=002b es=002b fs=0053
gs=002b cs=0023 ss=002b
flgs=00010202 [NC OP NZ SN DN NV]

1012d2d2 fxtract
1012d2d4 dfstp [ebp-0x28]
1012d2d7 fldlg2


The link to the sample Plato project is: https://www.dropbox.com/s/p5c5k9um3v41bsy/FPStackFaultSample.zip?dl=0
mecej4



Joined: 31 Oct 2006
Posts: 1884

Posted: Sun Jan 28, 2018 1:44 am

I built and ran your code with the 8.1 version 32-bit compiler, with /checkmate. Instead of an FPU stack overflow, the program ended after detection of the use of the undefined variable Q on line 149 (P is also undefined at the same line).

I certainly believe that X87 stack overflow occurs with FTN95-compiled 32-bit programs more often than it should occur. In fact, given the number of long expressions that I see in your program, I expect X87 stack overflow to happen. Fixing the problem can be expedited if you provide a test program that exhibits it without other errors (such as undefined variables) clouding up the issue.
JohnCampbell



Joined: 16 Feb 2006
Posts: 2551
Location: Sydney

Posted: Sun Jan 28, 2018 2:46 am

The example code has many old-style calculations in which a complex expression is spread over many lines.
I may be wrong, but isn't the "floating point stack overflow" simply the parser running out of registers, instead of breaking the expression up into smaller pieces?
Did F77 do it this way?

I got this earlier error on line 123.
my patch was:
Code:
!z        S = (EP**(FACT) +EP**(AFACT))/2.0D0
!z        AMU = (1.0D0/(2.0D0*B))*DLOG((S+FF*R+G*DSIN(U/D))/(S-FF*R -
!z     $ G*DSIN(U/D))) -(C/B)
        z1 = EP**(FACT)
        z2 = EP**(AFACT)
        S = ( z1 + z2 )/2.0D0
        print *,s
        zz = ( S+FF*R + G*DSIN(U/D) )
     $     / ( S-FF*R - G*DSIN(U/D) )
        AMU = (1.0D0/(2.0D0*B)) * DLOG ( zz ) - (C/B)

Note that there are lots of <HT> (horizontal tab) characters in this code.

certainly P and Q are not defined for goto 1400 from 1200 ( which my use of SDBG and /full_debug did not report ?)

I've always found it worrying that floating point stack overflow was not overcome by breaking up the expression and using fewer registers.

Does /64 have this same problem ?
wahorger



Joined: 13 Oct 2014
Posts: 1214
Location: Morrison, CO, USA

Posted: Sun Jan 28, 2018 7:18 am

mecej4: This was run under V8.20.0. YMMV re: 8.1
wahorger



Joined: 13 Oct 2014
Posts: 1214
Location: Morrison, CO, USA

Posted: Sun Jan 28, 2018 3:33 pm

JohnCampbell, after looking at the code in some detail, I can see the problem you detected (UNDEFINED), and where it likely originates.

This code was obtained from the National Park Service under its Public Domain policy. So this issue dates back to the 1980s.

Thanks for pointing it out.

The problem is that P is used for the Z1 forward calculation, while R is used for the Z1 inverse calculation. Since the computation of both P and R is the same, this is the likely cause. I've searched my archives back to 1992 and this issue is there.

The ALASKA conversion is one that is rarely (if ever) used. Indeed the data set that caused the error was created in 1996 and was used by me while testing another section of code when this problem popped up.

I will be looking through the other sections of code to see if this was a "standard" coding problem in the other coordinate routines.

Again, thanks for the heads up.

The Stack Overflow issue is separate; I'll let Paul address this.

Bill
mecej4



Joined: 31 Oct 2006
Posts: 1884

Posted: Sun Jan 28, 2018 4:00 pm    Post subject: Re:

JohnCampbell wrote:
...

certainly P and Q are not defined for goto 1400 from 1200 ( which my use of SDBG and /full_debug did not report ?)

John, /full_debug (and /debug) do not insert code to check for subscript errors and undefined variables. In the debugger, I see the undefined variables showing value = 6.013470016999068e-154, or, in hexadecimal integer notation, Z'2020202020202020'.
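
The correspondence between that bit pattern and the displayed value can be checked with a few lines of standard Fortran 95. This is only a sketch: the program and variable names are made up for illustration, and it says nothing about why the memory holds that pattern. It only reinterprets eight bytes of Z'20' (the ASCII space character) as a double.

```fortran
program undef_pattern
  implicit none
  ! Kinds chosen portably: a 64-bit integer and an IEEE double are assumed
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: dp = selected_real_kind(15, 307)
  integer(i8) :: bits
  real(dp)    :: x
  ! Eight bytes of 0x20; a BOZ constant in a DATA statement is standard Fortran 95
  data bits / Z'2020202020202020' /

  ! Reinterpret the same 64 bits as a double, without any numeric conversion
  x = transfer(bits, x)

  print '(A, Z16.16)',    ' bits = ', bits
  print '(A, ES23.15E3)', ' x    = ', x
end program undef_pattern
```

On a machine with IEEE 754 doubles, x should print as a value close to 6.0135E-154, in line with the number seen in the debugger.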
Quote:
I've always found it worrying that floating point stack overflow was not overcome by breaking up the expression and using fewer registers.

I share the sentiment. The compiler writers, on the other hand, have to make a compromise between running out of registers, keeping temporary results in FPU registers as much as possible, and making as few accesses to memory as possible.

Quote:
Does /64 have this same problem ?

The problem is slightly different. Take the expression in Line 124:
Code:
(1.0D0/(2.0D0*B))*DLOG((S+FF*R+G*DSIN(U/D))/(S-FF*R - G*DSIN(U/D)))-C/B

The sub-expression FF*R+G*DSIN(U/D) appears twice. Will the compiler recognise this fact, and avoid calculating the sub-expression twice? I translated the full expression to X87 instructions by hand, ordering the operations as I would on an RPN calculator, proceeding from left to right. I found that I used six FPU registers, but I also observed that it took extra book-keeping to note the number of registers that were occupied. Next, I redid the calculation, trying to use fewer registers. I found that I could do the calculation using only three registers. The 8.10 compiler put out code that used four FPU registers.

The same compiler, with /64, used only three XMM registers, but did a lot more memory <-> XMM transfers than necessary. The 64-bit code must, of necessity, keep track of XMM register usage and use no more than the eight available (ignoring conventions regarding XMM register usage). With the X87 instructions, the stack-oriented instructions encourage one to assume that one has as many registers as needed. Most of the time, we are working with just ST0 and ST1 and, most of the time, we get away with not keeping track of register occupancy!
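
To make the sub-expression point concrete, here is a minimal sketch that evaluates the expression both as a single statement and with the repeated term FF*R + G*DSIN(U/D) held in a temporary. The numeric values are hypothetical, chosen only so that the argument of the logarithm stays positive; they are not taken from the ALASKA routine, and the temporary names t and zz are illustrative.

```fortran
program common_subexpr
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Hypothetical sample values (assumption): picked only so that s - t > 0
  real(dp), parameter :: s = 2.0_dp, ff = 0.5_dp, r = 1.0_dp, g = 0.25_dp
  real(dp), parameter :: u = 1.0_dp, d = 2.0_dp, b = 1.5_dp, c = 0.3_dp
  real(dp) :: t, zz, amu_full, amu_split

  ! Single-statement form, as in the original source line
  amu_full = (1.0_dp/(2.0_dp*b))*log((s+ff*r+g*sin(u/d))/(s-ff*r-g*sin(u/d))) - c/b

  ! Split form: the repeated sub-expression is computed once and stored
  t  = ff*r + g*sin(u/d)
  zz = (s + t)/(s - t)
  amu_split = log(zz)/(2.0_dp*b) - c/b

  print *, 'single statement: ', amu_full
  print *, 'split form:       ', amu_split
  print *, 'difference:       ', abs(amu_full - amu_split)
end program common_subexpr
```

The two results should agree to within rounding. Whether the split form changes how many FPU registers a given compiler uses is something to confirm from the generated code, not something this sketch demonstrates.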
JohnCampbell



Joined: 16 Feb 2006
Posts: 2551
Location: Sydney

Posted: Mon Jan 29, 2018 6:50 am

mecej4,

Thanks for your detailed explanation. I think I understand.
My (remove the problem) approach to that line was to first calculate zz, which I hoped would not overflow the registers:
zz = ( S+FF*R + G*DSIN(U/D) ) &
& / ( S-FF*R - G*DSIN(U/D) )
(I should have used zz = FF*R + G*DSIN(U/D) first.)
However, the error might have been with line 123, so I simplified that as well, using z1 and z2.

Bill,

I once needed to convert Lat-Long values from charts for Maputo in Mozambique into X Y for the projection (I don't recall which one).
I thought I could use a few sin cos conversions, until I googled the transformation equations, similar to what you have posted.
I can't believe the complexity, and was worried about register overload back then. I gave up, found a GIS expert and sent him the 30 points I needed.
I would always try to "tidy" the calculation and use a few temporary z's to make the equation readable / auditable.
EP**FACT is going to be a problem, if EP is out of range. It is also something I would not expect in a coordinate transformation.

I'll stick to FE calculations !!

John
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7912
Location: Salford, UK

Posted: Mon Jan 29, 2018 8:42 am

Bill

Can you post a short and simple form of the issue that you want investigated?
wahorger



Joined: 13 Oct 2014
Posts: 1214
Location: Morrison, CO, USA

Posted: Mon Jan 29, 2018 5:54 pm

Paul, other than the example I've put in DropBox (link supplied in the original message), I don't know what else to do. It works when compiled as /RELEASE, and not as /CHECKMATE.

If I restructure the calculation that appears to be the culprit, the stack overflow does not occur. So my assumption is that it is a compiler code generation problem, but only when /CHECKMATE is used.
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Mon Jan 29, 2018 6:56 pm

Paul,
I also have several cases where the compiler in 64-bit mode does not even compile the code if both /check and /undef are used simultaneously, but I cannot provide a small demo. In small programs things very often work fine. So please take any chance, like this one for example, to corner the last remaining bugs and make the compiler rock stable.
mecej4



Joined: 31 Oct 2006
Posts: 1884

Posted: Mon Jan 29, 2018 9:32 pm

After looking at the error traceback (in the initial post) in more detail, I am quite puzzled. The FPU stack overflow occurred in Salflibc.dll, in the Fortran I/O routines, starting with a call to D8__WSF in the user's code. Since the X86 register convention requires that "The x87 floating point registers ST0 to ST7 must be empty (popped or freed) when calling a new function, and ST1 to ST7 must be empty on exiting a function" (see https://en.wikipedia.org/wiki/X86_calling_conventions), an FPU stack overflow in the I/O routine can have nothing to do with the complexity of the arithmetic expression on line 122 or elsewhere.

There is some inconsistency here, but I cannot check more precisely because I have the 8.1 compiler and I cannot reproduce the X87 stack overflow with Wahorger's source code.

On the other hand, once P and Q are initialised in the test code one should be able to compile and run the program without errors.
JohnCampbell



Joined: 16 Feb 2006
Posts: 2551
Location: Sydney

Posted: Wed Jan 31, 2018 1:48 am

Could the error occurring with V8.20 actually be occurring on line 123, as reported:
print *,s
This would explain the I/O path.
Could it be that, when using /check, the register usage of line 122 is not finalised properly before the I/O at line 123?

I have my own devilry with V8.20, which does not occur in V8.10 and I'm not getting close to the problem.
wahorger



Joined: 13 Oct 2014
Posts: 1214
Location: Morrison, CO, USA

Posted: Wed Jan 31, 2018 3:49 am

John, actually, no. This is the way the error is "trapped" even without the PRINT. The traceback always shows the next executable line.

It was because of this that it took me a while to find the offender......
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7912
Location: Salford, UK

Posted: Wed Jan 31, 2018 9:57 am

This is a bug in 32 bit FTN95 that we are working on.
narayanamoorthy_k



Joined: 19 Jun 2014
Posts: 142
Location: Chennai, IN

Posted: Wed Jan 31, 2018 10:14 am

Is an FTN95.exe release with this fix planned soon?
_________________
Thanks and Regards
Moorthy
All times are GMT + 1 Hour