forums.silverfrost.com

John-Silver · Joined: 30 Jul 2013 Posts: 1520 Location: Aerospace Valley

Ok, I've got heritage code which I've been converting into FTN95 compatible format.
On truncating a very long line (Z132 chars - however do other compilers do that ?)
I get a FLOATING POINT STACK FAULT.

After some iterations I've determined that its cause is one specific formula.

I've concocted the following simple test program to show the problemo ....

PaulLaidler · Posted: Wed May 13, 2015 6:10 pm Post subject:

It runs OK for me using fixed or free format.

It is nothing to do with the stack size.

Make sure that there are no tabs in the code.

mecej4 · Joined: 31 Oct 2006 Posts: 1886

John-Silver: You are confusing the X87 register stack with the CPU memory stack. You can have stack overflow with either or both, but the issue here is with overflow of the X87 register stack. There are only eight X87 registers, ST0-7. It is up to the compiler to keep track of how many X87 registers are already in use before attempting to push another value on to the X87 stack, and spill the remaining real/double/tbyte values into memory if the X87 stack has been used up. If it does not do this task correctly, a runtime exception is taken by the FPU.
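To make the bookkeeping concrete, here is a toy sketch (in Python for brevity, not Fortran) of the counting a compiler has to do. It assumes a naive left-to-right evaluation order, and the tuple-based expression format and all function names are invented for illustration; it is not a description of how FTN95 actually allocates X87 registers.

```python
X87_REGISTERS = 8  # ST0..ST7, the hardware limit


def stack_depth(expr):
    """Peak number of X87 registers needed to evaluate expr left to right.

    expr is a string (an operand), a pair (function, argument), or a
    triple (operator, left, right). This format is hypothetical.
    """
    if isinstance(expr, str):
        return 1                      # loading one operand uses one register
    if len(expr) == 2:
        return stack_depth(expr[1])   # a unary function replaces ST0 in place
    _, left, right = expr
    # The left result stays on the stack while the right side is evaluated.
    return max(stack_depth(left), 1 + stack_depth(right))


def needs_spill(expr):
    """True if the expression cannot be evaluated within eight registers."""
    return stack_depth(expr) > X87_REGISTERS


def right_nested(n):
    """Build x + (x + (x + ...)) with n operands."""
    expr = "x"
    for _ in range(n - 1):
        expr = ("+", "x", expr)
    return expr


def left_nested(n):
    """Build ((x + x) + x) + ... with n operands."""
    expr = "x"
    for _ in range(n - 1):
        expr = ("+", expr, "x")
    return expr
```

In this model a right-nested sum of nine operands needs nine registers, so something has to be spilled to memory, while the left-nested form of the same sum never needs more than two.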

The /stack:nnnnnnnn option of some Windows/DOS linkers pertains to the in-memory stack. Changing it can do nothing about the hardware limit of eight FPU X87 registers.

I don't know which compiler version you used, but with 7.10 the program that you gave does not cause an X87 stack overflow with or without /OPT. The following extension of your program, however, does.

PaulLaidler · Posted: Wed May 13, 2015 8:25 pm Post subject:

I have logged this bug for investigation.

LitusSaxonicum · Posted: Wed May 13, 2015 9:29 pm Post subject:

When faced with expressions such as are given for ans2a and ans3a, ans2a saves 3 trig functions and a number of divides and multiplies as well as adds, so good practice prior to the availability of optimising compilers would be to follow the ans2a route.

An extremely common optimisation is the automatic removal of common subexpressions (i.e. pre-calculating them just once), and in a way, I'm surprised that FTN95 doesn't do this when /OPT is specified (or at least appears not to, judging by John-S getting an FPU stack overflow).
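As a minimal sketch of what manual common subexpression removal looks like (in Python for brevity; the formula and the names `direct` and `hoisted` are invented for illustration and are not the ans2a/ans3a expressions):

```python
import math


def direct(x, a, b):
    # tan(x) is written out, and evaluated, four times.
    return (a * math.tan(x)
            + b * math.tan(x) * math.tan(x)
            + math.tan(x) / (a + b))


def hoisted(x, a, b):
    # The common subexpression is evaluated once and reused.
    t = math.tan(x)
    return a * t + b * t * t + t / (a + b)
```

Both functions return the same value; the second simply does less work per call and keeps fewer intermediate results alive at once.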

Much more surprising, however, is that the fpu stack overflows at all, as the expression is hardly that complicated and managing the fpu is central to every version of the compiler since FTN77/386 !

Eddie

mecej4 · Joined: 31 Oct 2006 Posts: 1886

I suspect that the compiler takes care of the X87 stack limit correctly in most cases, which is why this problem has not been the subject of many complaints. The case where it produces code that causes X87 stack overflow is, I think, when the expression being calculated contains one or more terms containing the tan() function.

This function differs from the other trigonometric functions in that, instead of ST0 being replaced by the function (sin, cos, etc.) value with argument equal to ST0, the tangent value appears in ST1, and 1.0 appears in ST0 when the FPTAN X87 instruction is used. The 1.0 in ST0 is about to be discarded in the next instruction, which is FSTP ST0. Unfortunately, the X87 stack will have overflowed already before the FSTP instruction is reached. This may be caused by a simple error in keeping tabs on how many X87 registers are used up.
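The difference can be mimicked with a small software model of the register stack (a Python sketch, not real X87 code). The class and function names are made up for illustration, and the model simply raises an exception on overflow, whereas real hardware behaviour depends on whether the invalid-operation exception is masked.

```python
import math


class X87StackFault(Exception):
    """Raised when an instruction needs a free register and none is left."""


class X87Stack:
    SIZE = 8  # ST0..ST7

    def __init__(self):
        self.regs = []  # regs[-1] plays the role of ST0

    def fld(self, value):
        if len(self.regs) >= self.SIZE:
            raise X87StackFault("FLD: no free register")
        self.regs.append(value)

    def fsin(self):
        # Replaces ST0 in place; stack depth is unchanged.
        self.regs[-1] = math.sin(self.regs[-1])

    def fptan(self):
        # Replaces ST0 by tan(ST0) and then pushes 1.0, so it needs
        # one free register even though the 1.0 is discarded right after.
        if len(self.regs) >= self.SIZE:
            raise X87StackFault("FPTAN: no free register for the 1.0")
        self.regs[-1] = math.tan(self.regs[-1])
        self.regs.append(1.0)

    def fstp_st0(self):
        self.regs.pop()  # discard ST0


def run(live, x, use_tan):
    """Keep `live` values on the stack, then compute tan(x) or sin(x)."""
    st = X87Stack()
    for i in range(live):
        st.fld(float(i))
    st.fld(x)
    if use_tan:
        st.fptan()
        st.fstp_st0()
    else:
        st.fsin()
    return st.regs[-1]
```

With seven values already live, `run(7, 0.5, False)` (the sine) succeeds with all eight registers in use, while `run(7, 0.5, True)` faults inside `fptan` before the `fstp_st0` that would have freed the register again. A register count that treats tan like sin is therefore off by one at exactly this point.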

Here is a shorter example which exhibits the problem. Replace the tan() by some other function such as sqrt() or exp() [but not atan()!], and the X87 stack overflow will go away.

LitusSaxonicum · Posted: Thu May 14, 2015 10:56 am Post subject:

Hi Mecej4,

In the face of such a cogent explanation, I see why I never encountered such a problem: for me, the tan function is a 'new trick' and this dog was trained in an era where it hadn't been implemented. We had to make do with sin and cos, and get tan ourselves! (As you point out that FTN77 does).

My habit of manual subexpression removal dates from my reading of an early book on Fortran optimisation, and the difficulties of getting things on a card, especially if one needed the expression to be readable AND had a limited number of continuation cards available!

I note that ST(7) must be empty to avoid an invalid-operation exception with FPTAN, and perhaps this isn't checked or allowed for (and I'm not going looking as I have better things to do!)

The general points still stand, however, and they are:

1. Common subexpression removal - whether done manually or as an optimisation - should help,

2. Managing the fpu stack is somewhat critical.

Perhaps the FTN77 treatment harks back to the differences between the behaviour of FPTAN from 8087-80187-80287 where the size of the argument was limited to the range 0-pi/4. This confuses me, as I understood FTN77 to require a 386/7 or better. However, you could at one stage use a Weitek coprocessor instead of a 387, and maybe that was a factor, as Weiteks were rather different to Intel coprocessors in their architecture.

Eddie

John-Silver · Joined: 30 Jul 2013 Posts: 1520 Location: Aerospace Valley

I'm amazed, I come back to my PC this morning and the test code I posted now works !!!
Why on earth could this be ???? is the compiler sensitive to temperature or something ! LOL

However, I ran mecej4's slightly modified version and it FAILED !

John-S

For Paul's info, when researching before posting I came across this (as Don McLean might say: from a long, long time ago - 2003 to be precise). It might be of some use; it just confused the hell out of me .... popping n pushing ....

http://compgroups.net/comp.lang.fortran/floating-point-stack-fault/593764

LitusSaxonicum · Posted: Thu May 14, 2015 11:59 am Post subject:

Perhaps it never got reported ....

It certainly needs fixing.

The book "8087 applications and programming for the IBM PC, XT, and AT" by Richard Startz never got updated as far as I know, and the guy who borrowed and didn't return my copy died some years ago. I seem to remember something in it about stack overflow.

E

PaulLaidler · Posted: Thu May 14, 2015 4:42 pm Post subject:

The bug reported in compgroups appears to have been fixed.

John-Silver · Joined: 30 Jul 2013 Posts: 1520 Location: Aerospace Valley

Another possible help: a bug in gcc, explained along the same lines as mecej4 did above. It seems to be a conversation between 2 developers on a bb.

http://sourceware.org/ml/newlib/2009/msg01053.html

LitusSaxonicum · Posted: Thu May 14, 2015 8:22 pm Post subject:

The constant 1.0 is placed so that it and tan(x) are in the right place to compute cotan(x) (from 1/tan(x)).
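A rough sketch of that, modelling the register stack as a Python list whose last element is ST0 (the helper names are invented, and `fdivrp` follows the Intel-syntax meaning of FDIVRP ST(1), ST(0)):

```python
import math


def fptan(stack):
    # ST0 becomes tan(ST0), then 1.0 is pushed on top of it.
    x = stack.pop()
    stack.append(math.tan(x))
    stack.append(1.0)


def fdivrp(stack):
    # ST(1) = ST(0) / ST(1), then pop: leaves 1.0 / tan(x) in ST0.
    st0 = stack.pop()
    st1 = stack.pop()
    stack.append(st0 / st1)


def cotangent(x):
    stack = [x]
    fptan(stack)    # stack is now [tan(x), 1.0]
    fdivrp(stack)   # stack is now [1.0 / tan(x)]
    return stack[-1]
```

So one reversed divide straight after FPTAN gives the cotangent, with no extra load of a constant.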

mecej4 · Joined: 31 Oct 2006 Posts: 1886

John-Silver:

PaulLaidler · Posted: Wed May 25, 2016 7:47 am Post subject:

This bug has now been fixed.