forums.silverfrost.com Forum Index forums.silverfrost.com
Welcome to the Silverfrost forums
 

FP Stack Fault in Formula-Why does it occur, How to Fix it ?

 
John-Silver



Joined: 30 Jul 2013
Posts: 1520
Location: Aerospace Valley

Posted: Wed May 13, 2015 3:48 pm

Ok, I've got heritage code which I've been converting into FTN95 compatible format.
On truncating a very long line (Z132 chars - however do other compilers do that ?)
I get a FLOATING POINT STACK FAULT.

After some iterations I've determined that its cause is one specific formula.

I've concocted the following simple test program to show the problem:

Code:
      Program Test
      REAL*8 x,y,z,ans2,ans2a,ans3a

      x=1.0
      y=2.0
      z=3.0

      ans2=x/y+x/z+y/z+x*cos(x)+y*sin(y)+z*tan(z)
      print*,'ans2 = ',ans2
      ans2a=ans2+z
      print*,'ans2a = ',ans2a
      ans3a=x/y+x/z+y/z+x*cos(x)+y*sin(y)+z*tan(z)+z
      print*,'ans3a = ',ans3a

      end program


As you will see, ans2 and ans2a are computed fine, but ans3a fails with the FP stack fault, i.e. just by adding a '+z' to the formula for ans2!

So, please :-
a) If I increase the STACK size (I remember several past posts discussing this), will that solve the problem and, more importantly, how exactly do I do that?
(Searching for STACK in the forums is useless, and the only reference I've found in the manuals says it's only available for .NET!)

b) Can someone explain in simple layman's terms what exactly this fault is?
I've tried searching for explanations, but all I find are discussions that are incomprehensible (to me) and seem to fly off on a tangent.

Thanks John

P.S.
The 60 character limit for Post 'Titles' really needs to be increased you know, it's getting really annoying ! Please add to list of proposed forum changes (post length, etc....). It might make the forum search more effective also because then more keywords could be included in post title and hence more identifiable in returned search results.
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7924
Location: Salford, UK

Posted: Wed May 13, 2015 6:10 pm

It runs OK for me using fixed or free format.

It is nothing to do with the stack size.

Make sure that there are no tabs in the code.
mecej4



Joined: 31 Oct 2006
Posts: 1886

Posted: Wed May 13, 2015 6:59 pm

John-Silver: You are confusing the X87 register stack with the in-memory CPU stack. Either can overflow, but the issue here is overflow of the X87 register stack. There are only eight X87 registers, ST0-ST7. It is up to the compiler to keep track of how many X87 registers are already in use before it pushes another value onto the X87 stack, and to spill the excess real/double/tbyte values to memory once the X87 stack is full. If it does not do this correctly, the FPU raises a runtime exception.

The /stack:nnnnnnnn option of some Windows/DOS linkers pertains to the in-memory stack. Changing it can do nothing about the hardware limit of eight FPU X87 registers.

I don't know which compiler version you used, but with 7.10 the program that you gave does not cause an X87 stack overflow with or without /OPT. The following extension of your program, however, does.
Code:
      Program Test
      REAL*8 x,y,z,u,ans2,ans2a,ans3a

      x=1.0
      y=2.0
      z=3.0
      u=4.0

      ans2=x/y+x/z+y/z+z/u+x*cos(x)+y*sin(y)+z*tan(z) +u/tan(u)
      print*,'ans2 = ',ans2
      ans2a=ans2+z+u
      print*,'ans2a = ',ans2a
      ans3a=x/y+x/z+y/z+x*cos(x)+y*sin(y)+z*tan(z)+u/tan(u)+u+z
      print*,'ans3a = ',ans3a

      end program
It is possible for the compiler to emit code that calculates this longer expression without causing an X87 stack overflow. The key idea is to perform each '+' operation as soon as feasible, instead of calculating all nine terms, putting their values on the X87 stack and postponing the additions until every term has been evaluated.
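
To make the difference between the two evaluation orders concrete, here is a toy count of how many registers a nine-term sum would need. It is only a sketch under simplifying assumptions: each term is treated as a single push, the registers needed while a term itself is being computed are ignored, and it does not represent FTN95's actual code generator. The program name and variable names are invented for illustration.

```fortran
      Program Depth
      ! Toy model (not FTN95's actual code generator): count how
      ! many X87 registers a sum of nterms terms needs under two
      ! evaluation orders.
      integer, parameter :: nregs = 8    ! ST0..ST7
      integer :: nterms, i, depth, peak_deferred, peak_eager

      nterms = 9

      ! Deferred: push every term, then do all the additions.
      depth = 0
      peak_deferred = 0
      do i = 1, nterms
         depth = depth + 1               ! push term i
         peak_deferred = max(peak_deferred, depth)
      end do

      ! Eager: add each term to the running sum at once.
      depth = 0
      peak_eager = 0
      do i = 1, nterms
         depth = depth + 1               ! push term i
         peak_eager = max(peak_eager, depth)
         if (i > 1) depth = depth - 1    ! add pops one register
      end do

      print*,'registers available  = ',nregs
      print*,'peak depth, deferred = ',peak_deferred
      print*,'peak depth, eager    = ',peak_eager

      end program
```

In this model the deferred order peaks at nine values, one more than the eight registers available, while the eager order never holds more than two.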
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7924
Location: Salford, UK

Posted: Wed May 13, 2015 8:25 pm

I have logged this bug for investigation.
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2388
Location: Yateley, Hants, UK

Posted: Wed May 13, 2015 9:29 pm

When faced with expressions such as are given for ans2a and ans3a, ans2a saves 3 trig functions and a number of divides and multiplies as well as adds, so good practice prior to the availability of optimising compilers would be to follow the ans2a route.

An extremely common optimisation is the automatic removal of common subexpressions (i.e. calculating them just once), and in a way I'm surprised that FTN95 doesn't do this when /OPT is specified (or appears not to, given that John-S got an FPU stack overflow).

Much more surprising, however, is that the fpu stack overflows at all, as the expression is hardly that complicated and managing the fpu is central to every version of the compiler since FTN77/386 !

Eddie
mecej4



Joined: 31 Oct 2006
Posts: 1886

Posted: Thu May 14, 2015 1:25 am

I suspect that the compiler takes care of the X87 stack limit correctly in most cases, which is why this problem has not been the subject of many complaints. The case where it produces code that causes X87 stack overflow is, I think, when the expression being calculated contains one or more terms containing the tan() function.

This function differs from the other trigonometric functions. With sin, cos, etc., the instruction replaces the argument in ST0 with the function value. With the FPTAN X87 instruction, the tangent value ends up in ST1 and 1.0 is pushed into ST0, so one extra register is needed. The 1.0 in ST0 is discarded by the next instruction, FSTP ST0. Unfortunately, the X87 stack will already have overflowed before the FSTP instruction is reached. This may be caused by a simple error in keeping tabs on how many X87 registers are in use.

Here is a shorter example which exhibits the problem. Replace the tan() by some other function such as sqrt() or exp() [but not atan()!], and the X87 stack overflow will go away.
Code:
      Program Test
      REAL*8 x,y,z,u,ans

      x=1.0
      y=2.0
      z=3.0
      u=4.0

      ans=x/y + x/z + y/z + z/u + x/u + sin(y) + tan(z) + 1.5*u
      print*,'ans = ',ans

      end

It is a bit curious that FTN77, given the same Fortran code, evaluates the terms from right to left, and calls a separate function to calculate tan(z), instead of using an inline FPTAN instruction. In this case, the FTN77-compiled program runs with no X87 stack overflow. Nevertheless, FTN77 shares the FTN95 characteristic of holding off on adding terms until all the terms have been evaluated, so FTN77-compiled code is also at risk of causing X87 stack overflow or stack spillover, when given long arithmetic expressions.
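
If the inline FPTAN inside a long sum is indeed the trigger, one source-level workaround would be to compute the tangent in a statement of its own and to build the sum in shorter statements. The following is only a sketch of that idea: it has not been checked against an affected FTN95 version, the variable tanz and the program name are invented here, and sin(z)/cos(z) may differ from tan(z) in the last digits.

```fortran
      Program Workaround
      REAL*8 x,y,z,u,ans,tanz

      x=1.0
      y=2.0
      z=3.0
      u=4.0

      ! Evaluate the tangent in its own statement, from sin and
      ! cos, so that no FPTAN result sits inside the long sum.
      tanz=sin(z)/cos(z)

      ! Build the sum in short statements to keep few values live.
      ans=x/y + x/z + y/z
      ans=ans + z/u + x/u
      ans=ans + sin(y) + tanz + 1.5*u
      print*,'ans = ',ans

      end
```

Splitting the statement trades a little readability for a shallower expression; once the compiler bug is fixed, the original single-line form should be preferable.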


Last edited by mecej4 on Thu May 14, 2015 11:13 pm; edited 1 time in total
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2388
Location: Yateley, Hants, UK

Posted: Thu May 14, 2015 10:56 am

Hi Mecej4,

In the face of such a cogent explanation, I see why I never encountered such a problem: for me, the tan function is a 'new trick' and this dog was trained in an era where it hadn't been implemented. We had to make do with sin and cos, and get tan ourselves! (As you point out that FTN77 does).

My habit of manual subexpression removal dates from my reading of an early book on Fortran optimisation, and the difficulties of getting things on a card, especially if one needed the expression to be readable AND had a limited number of continuation cards available!

I note that ST(7) must be empty to avoid an invalid-operation exception with FPTAN, and perhaps this isn't checked or allowed for (and I'm not going looking as I have better things to do!)

The general points still stand, however, and they are:

1. Common subexpression removal - whether done manually or as an optimisation - should help,

2. Managing the fpu stack is somewhat critical.

Perhaps the FTN77 treatment harks back to the differences between the behaviour of FPTAN from 8087-80187-80287 where the size of the argument was limited to the range 0-pi/4. This confuses me, as I understood FTN77 to require a 386/7 or better. However, you could at one stage use a Weitek coprocessor instead of a 387, and maybe that was a factor, as Weiteks were rather different to Intel coprocessors in their architecture.

Eddie
John-Silver



Joined: 30 Jul 2013
Posts: 1520
Location: Aerospace Valley

Posted: Thu May 14, 2015 11:38 am

I'm amazed: I came back to my PC this morning and the test code I posted now works!
Why on earth could this be? Is the compiler sensitive to temperature or something! LOL

However, I ran mecej4's slightly modified version and it FAILED !

John-S

For Paul's info: when researching before posting I came across this (as Don McLean might say, from a long, long time ago: 2003 to be precise). It might be of some use; it just confused the hell out of me .... popping n pushing ....

http://compgroups.net/comp.lang.fortran/floating-point-stack-fault/593764
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2388
Location: Yateley, Hants, UK

Posted: Thu May 14, 2015 11:59 am

Perhaps it never got reported ....

It certainly needs fixing.

The book "8087 applications and programming for the IBM PC, XT, and AT" by Richard Startz never got updated as far as I know, and the guy who borrowed and didn't return my copy died some years ago. I seem to remember something in it about stack overflow.

E
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7924
Location: Salford, UK

Posted: Thu May 14, 2015 4:42 pm

The bug reported in the compgroups thread appears to have been fixed.
John-Silver



Joined: 30 Jul 2013
Posts: 1520
Location: Aerospace Valley

Posted: Thu May 14, 2015 5:55 pm

Another possible help: a bug reported on the newlib mailing list, explained along the same lines as mecej4 did above. It seems to be a conversation between 2 developers on a mailing list.

http://sourceware.org/ml/newlib/2009/msg01053.html

Quote:
and what happens is that the x87 fptan instruction leaves two
values on the fpu register stack, a constant +1.0 as well as the actual tan
result. The _f_tan and _f_tanf routines advance the fp stack pointer to skip
over the useless (to us in this context) constant, but that doesn't work the
same as when you advance the sp of an ordinary stack, because the x87 fpu
tracks which registers are occupied by valid values, and next time the stack
pointer reaches down to this same register, it sees the value is valid (still
contains +1.0) and assumes the stack pointer has wrapped round, leaving QNaNs
in the result and signaling stack underflow as a result. This breaks both the
printf (that first one actually is a +0 when it's passed to printf, but it
comes out as -0 in the end) and tan itself (the second call to _f_tan returns
NaN).

The intel manual warns that using fincstp to advance the stack pointer isn't
the same thing as actually popping the stack, and the simple solution is to
add an ffree instruction to mark the register containing the constant unused
before we skip over it. The attached patch adds ffree instructions before the
only two fincstp instructions I could find by grepping libm/, and it fixes the
testcase.
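
(Outside the quote.) The tag-bit behaviour described there can be mimicked with a toy model. This is a sketch only, not real FPU code: the routine names push, pop and tan_call are invented for illustration, and the model merely records whether a push lands on a register still marked valid. The first pass skips the 1.0 the way FINCSTP does, without freeing it; the second pass frees it first, as FFREE would.

```fortran
      Program TagModel
      ! Toy model of the X87 tag bits; not real FPU code.
      logical :: tag(0:7), fault
      integer :: top, pass
      logical :: use_ffree

      do pass = 1, 2
         use_ffree = (pass == 2)
         tag = .false.
         top = 0
         fault = .false.
         call tan_call()               ! first tan: always fine
         call tan_call()               ! second tan: may fault
         print*,'ffree used: ',use_ffree,'  fault: ',fault
      end do

      contains

      subroutine push()
         top = mod(top+7, 8)           ! move down one register
         if (tag(top)) fault = .true.  ! slot still marked valid
         tag(top) = .true.
      end subroutine

      subroutine pop()
         tag(top) = .false.            ! a real pop frees the slot
         top = mod(top+1, 8)
      end subroutine

      subroutine tan_call()
         call push()                   ! load the argument
         call push()                   ! FPTAN pushes 1.0 on top
         if (use_ffree) tag(top) = .false.   ! FFREE ST(0)
         top = mod(top+1, 8)           ! FINCSTP: skip, no free
         call pop()                    ! store result and pop
      end subroutine

      end program
```

In this model the pass without the free reports a fault on the second tangent call, because the register holding the stale 1.0 is still tagged as valid; the pass with the free does not.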
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2388
Location: Yateley, Hants, UK

Posted: Thu May 14, 2015 8:22 pm

The constant 1.0 is placed so that it and tan(x) are in the right place to compute cotan(x) (from 1/tan(x)).
mecej4



Joined: 31 Oct 2006
Posts: 1886

Posted: Fri May 15, 2015 1:58 pm

John-Silver:
Quote:
It might be of some use, just confused the hell out of me .... popping n pushing ....


The terms "pop" and "push" in the context of stacks have been with us for decades. There were, in fact, some mainframes with a stack-oriented architecture (Burroughs, for example). The 1980s TI99 home computer had no general-purpose registers on the CPU; instead, its instructions allowed a small portion of RAM to be used as a register file. The X86 family of processors has POP and PUSH instructions for loading and unloading registers using values on the stack, and specialised versions for pushing and popping the flags register.

The X87 instruction set, for the most part, is stack oriented. FLD is equivalent to a "push" and "FSTP" explicitly has the terminal P to remind us that it stands for "floating point store and pop".
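
For anyone who finds the terms easier to follow in code, here is a minimal sketch of a fixed-size stack with eight slots, the same number as the X87 registers. It is an ordinary array-based model written for illustration, not a description of how the FPU is implemented; the comments only draw the analogy with FLD and FSTP.

```fortran
      Program PushPop
      ! Toy eight-slot stack; names here are illustrative only.
      integer, parameter :: nregs = 8
      real*8 :: st(nregs)
      integer :: depth, i

      depth = 0
      do i = 1, 9
         if (depth == nregs) then
            print*,'push ',i,' refused: all 8 slots in use'
         else
            depth = depth + 1          ! push, like FLD
            st(depth) = dble(i)
         end if
      end do

      do while (depth > 0)
         print*,'pop -> ',st(depth)    ! store and pop, like FSTP
         depth = depth - 1
      end do

      end program
```

The ninth push is the one that has nowhere to go, and the values come back off in the reverse of the order in which they were pushed.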
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7924
Location: Salford, UK

Posted: Wed May 25, 2016 7:47 am

This bug has now been fixed.
All times are GMT + 1 Hour