forums.silverfrost.com

Stack error
 
wahorger



Joined: 13 Oct 2014
Posts: 1217
Location: Morrison, CO, USA

Posted: Thu Dec 03, 2015 12:40 am    Post subject: Stack error

I have a section of code that used to run against hundreds or thousands of data files without error. Now it fails after processing around 57 files. Here are the details.

I am getting a stack overflow on a WRITE to a character variable (Checkmate compilation). If I run the Release compilation, I get no stack overflow.

The WRITE operations are (generically):
Code:

      CHARACTER*256 RAW_DATA
      REAL FPNUM
.
.
.
      WRITE(RAW_DATA,66000) FPNUM
66000 FORMAT(F11.3)




The Checkmate code now performs somewhere between 126980 and 131545 of these WRITE operations before the stack overflow occurs.

Each of the WRITE operations is within a SELECT CASE statement.

Any ideas?
wahorger



Joined: 13 Oct 2014
Posts: 1217
Location: Morrison, CO, USA

Posted: Thu Dec 03, 2015 12:57 am

Even after I removed the WRITE for the character data (about 20 places), the error still occurs on the WRITE operation to format the data to character format for manipulation later on.

I am using TRIM() on every data record as well as lots of concatenation.

Any thoughts?
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7916
Location: Salford, UK

Posted: Thu Dec 03, 2015 8:52 am

Personally I would be happy with the release mode calculation. The stack overflow indicates a lack of memory rather than a fault in your program.
Generally I would use /checkmate for code development rather than production runs.

/checkmate does have an overhead, both in reduced memory and in increased run times.
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2388
Location: Yateley, Hants, UK

Posted: Thu Dec 03, 2015 6:54 pm

I find the stack overflow business really rather interesting because I have never experienced it myself. I put this down to the way I program which is to communicate between routines with a lot of named common blocks so that most of the variables are statically allocated. I very rarely use locally defined arrays, and I very rarely pass anything larger than a couple of individual variables via the arguments in a subroutine call – except of course when I am sending stuff to system routines. I’m quite prepared to be told that this is old-fashioned, or possibly not even very efficient, but it does minimise the use of the stack.

You put a big load on the stack by having lots of local variables, particularly long ones like arrays, and by passing lots of things in great nested packages of subroutine and function calls. You can’t do anything about these nested packages once you’ve written your program, but you can force FTN95 not to use the stack for local variables. Although the documentation says that you should not use the compiler option /save, it does make all the local variables allocated statically and this removes a lot of stress on the stack. I think if I were you I might give that a try when using the Checkmate option. On the other hand, for production work I’d go along with Paul’s advice and use the release mode which does seem to work.

It did occur to me to look through the documentation to find out how the stack size might be increased. I’ve assumed that you are not using absolutely every single byte in your computer, and that therefore you are not using it like Dan Right or John Campbell. If you are, then trying to increase the stack size won’t be the slightest bit of use to you.

And it was there, looking for the way to increase the stack size, that I got into difficulty. I’ve used several versions of FTN77/FTN95 and there was a time that one set the stack size along with the WINAPP directive like this:
Code:
WINAPP stack_size, something_else_size

I had a cursory look for this, and the documentation I have to hand is either too old or too new to explain it! I then looked for a /stack option in either the compiler or the linker. I found both eventually, but it did seem that you need to possess some other knowledge in order to use them. FTN95.CHM states that the compiler directive is a .NET only facility, although I suspect it isn’t - it used not to be back in the time of DBOS. Let’s suppose that some kind soul is prepared to explain the use of either the compiler or the linker (SLINK) stack setting options to us in words of one syllable, then there is still a problem in working out how FTN95 sizes its stack in the first place.

So here is my suggestion for the questions to ask:

How do we find out what stack size has been generated by default?
What methods are there to increase the stack size from the default?
Which one is best? – If there is more than one method.
How do we decide how much extra stack might be needed, or is it just trial and error?

It is worth remembering that a stack overflow can be by as little as a byte or two, and it may be that the default stack size is very close to what is genuinely needed. No one is better placed to discover how much extra is required than you are, because you already experience stack overflow and for some modes of programming the default stack is probably amply adequate. If you did discover that the problem was solved with a few bytes, then it might pay Paul to revisit the sizing algorithm.
mecej4



Joined: 31 Oct 2006
Posts: 1885

Posted: Thu Dec 03, 2015 7:06 pm

If you have the tools DUMPBIN and EDITBIN from one of the Windows SDKs, or Visual C++ Express, Visual Studio, etc., or you obtained these tools packaged with a language compiler from some other vendor, you can find the information concerning the stack by running DUMPBIN /headers on the EXE, and you can raise the stack size by running EDITBIN /stack on the EXE. For more information on these tools, you can run a search on MSDN or Google. These tools work by reading and modifying the standard WIN32/PE header of a Windows EXE file.
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

Posted: Fri Dec 04, 2015 1:19 am

It has always been very confusing to me how to modify the stack.
One of the problems is that I have not identified an easy way of reporting what the stack size is or how to change it. Sizes can be reported or defined as octal, decimal or hex, and there is no easy way to identify which is expected.

I have attempted to change the stack and the result does not change. After failing in this approach (probably years ago!), I have given up.

What my knowledge does cover is how to avoid stack overflows and how to minimise local variable size. Shifting arrays from the stack to the heap (using ALLOCATE) or static storage (using MODULE or COMMON) does fix the problem.
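
As a rough illustration of the two approaches, here is a minimal sketch that keeps a large work array off the stack, once as a saved module variable (static storage) and once as an allocatable array (heap). The names stack_friendly, work_static, sum_odds_static and sum_odds_heap are made up for this example, and where a given compiler actually places each array is an assumption that should be checked against its documentation.

```fortran
module stack_friendly
  implicit none
  integer, parameter :: nmax = 70000
  ! Saved module variable: expected to live in static storage, not on the stack.
  integer, save :: work_static(nmax)
contains

  ! Sum of the first n odd numbers using the static work array.
  ! n is clipped to nmax because the static array has a fixed size.
  function sum_odds_static(n) result(s)
    integer, intent(in) :: n
    double precision :: s
    integer :: i, m
    m = min(n, nmax)
    s = 0.0d0
    do i = 1, m
      work_static(i) = 2*i - 1
    end do
    do i = 1, m
      s = s + work_static(i)
    end do
  end function sum_odds_static

  ! Same calculation with an allocatable work array (heap storage).
  ! Returns -1 if the allocation fails.
  function sum_odds_heap(n) result(s)
    integer, intent(in) :: n
    double precision :: s
    integer, allocatable :: x(:)
    integer :: i, ierr
    s = 0.0d0
    allocate(x(n), stat=ierr)
    if (ierr /= 0) then
      s = -1.0d0
      return
    end if
    do i = 1, n
      x(i) = 2*i - 1
    end do
    do i = 1, n
      s = s + x(i)
    end do
    deallocate(x)
  end function sum_odds_heap

end module stack_friendly
```

Either function can be called from a main program after "use stack_friendly". The heap version has the advantage that the array size is chosen at run time, while the static version avoids any allocation cost.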

It would be good if there was a clear description of how to change or report the stack size. Perhaps I am stupid, but I can't reliably do this at the moment. Could (does) SDBG have this capability ?

John
wahorger



Joined: 13 Oct 2014
Posts: 1217
Location: Morrison, CO, USA

Posted: Fri Dec 04, 2015 2:47 am

John, all my routines are compiled with the /SAVE option on. And, I do make liberal use of common blocks, even for data that are local to the routine. This particular routine does have a lot of local variables, but not much in the way of local space requirements.

What I'm intending to do is to check the value of the stack at various locations within the code and attempt to isolate the cause of the crash in this way. I'm hoping that this will tell me, in an unambiguous way, what is getting "honked up".

Paul: I have almost gotten to the point where the code has been so stable that I've considered switching to the RELEASE version entirely. I may have to do this before I feel ready. I still am doing some development, but don't get many CHECKMATE related errors more than once a week, even when adding hundreds of SLOC.

Just FYI, the heap and stack allocations in the SLINK step were:
/heap:1000000 /stack:600000 when the problem first occurred, so I changed it to
/heap:6000000 /stack:4800000,2400000
with no change in where the stack crashes.
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

Posted: Fri Dec 04, 2015 6:44 am

Quote:
Just FYI, the heap and stack allocations in the SLINK step were:
/heap:1000000 /stack:600000 when the problem first occurred, so I changed it to
/heap:6000000 /stack:4800000,2400000
with no change in where the stack crashes.

This is the experience I also found, where changing the stack size had no apparent effect on the stack overflow error. There is no clear indication that the stack size has been reset and can take effect.
It really annoys me when the reporting of what is being achieved is not clearly provided. It certainly could be improved.

My load map says: Stack = 3200000 Heap = 100000 : Is this decimal or hex, like most other numbers in the load map ?
wahorger, do you see different values reported for this after using your new settings ?
If it does, you should take the default value and add a zero

ALLOCATE must use a different heap ? or not use the Heap for large arrays ?

All my comments are based on assuming this is caused by large local arrays being allocated to the stack. If you use /save this should not be the case ? My solution has always been to identify the large arrays being placed on the stack.
Do you use recursion or array sections as these can create large temporary arrays.

There must be a cause of the stack overflowing, or is it being corrupted ?

John
mecej4



Joined: 31 Oct 2006
Posts: 1885

Posted: Fri Dec 04, 2015 1:39 pm

The PE32 file structure, in particular the header formats, is set by Microsoft and widely published. There are tools to view and print the contents of the headers. I chose to use Prof. Agner Fog's excellent ObjConv utility ( http://www.agner.org/optimize/#objconv ) to help demystify the issue. You could use the MS DUMPBIN program similarly.

Here is an example program with a substantial stack consumption.
Code:

program stkovfl
implicit none
call sub()
end program

subroutine sub()
implicit none
integer, parameter :: N = 70000
integer x(N),i
double precision s
do i=1,N
   x(i)=2*i-1
end do
s=0
do i=1,N
   s=s+x(i)
end do
write(*,*)s
return
end subroutine

By default, SLINK provides enough stack for this program to run with no special effort but, for illustration, I built it with a smaller stack:
Code:

ftn95 stk.f90 & slink stk.obj /stack:0x40000,0x20000

The program crashes with stack overflow. So, I double the second stack size parameter, i.e.,
Code:

slink stk.obj /stack:0x40000,0x40000

This time, the program runs. Running OBJCONV on the EXE with the -df option gave, among other things, the following:
Code:

Size of stack reserve: 0x40000
Size of stack commit: 0x40000

As you can see, even if you have an EXE with no information regarding which linker was used, you can use a utility such as OBJCONV to obtain the stack size information that is contained in the file headers. This particular tool also displays the "0x" prefix to remove an ambiguity about the numbers being decimal/hexadecimal.

How much stack does your program need? That can be a bit hard to ascertain, if your compiler system does not provide help. Some versions of MS Link warn when they build an EXE with an insufficient stack size. You may need to experiment. For instance, if I had started with /stack:0x1000,0x1000 and repeatedly doubled the two numbers until the program ran, I would have seen "no apparent effect" after five attempts, but the program would have run on the sixth attempt.

If your program is big and linking is slow, it would be more efficient to link once and simply modify the stack fields in the EXE header using a tool such as EDITBIN (included in MS C, Windows SDKs, etc):
Code:

editbin /stack:0x40000,0x40000 stk.exe


For details regarding what "commit" and "reserve" mean, please see <https://msdn.microsoft.com/en-us/library/windows/desktop/ms686774(v=vs.85).aspx> .
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7916
Location: Salford, UK

Posted: Fri Dec 04, 2015 5:38 pm

Just a couple of points to add to mecej4's excellent explanation.

The integer values after WINAPP were used with the former DBOS operating system. These values are now ignored and should be left out.

You can sometimes get an idea of how the stack is expanded by looking at the /explist for the code. In the listing for mecej4's program you will see "sub esp,=280024" and the 280000 is 4x70000 with an integer occupying 4 bytes. Also there is a similar call to __adjust_stack_f at the end of the subroutine. The register esp is used for the "stack pointer". Here the association is simple. In other situations the stack might be extended to handle automatic arrays or temporary array sections or even for passing a non-scalar result of a function back to the caller.
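
The arithmetic behind that figure can be written down as a small helper. This is only a sketch: the names stack_arith and int_array_bytes are made up, and it assumes a default INTEGER of 4 bytes, as in the example program. It accounts for the array alone; the remaining 24 bytes in the listing presumably cover the other locals, but that part is a guess.

```fortran
module stack_arith
  implicit none
contains

  ! Bytes occupied by a default-integer array of n elements.
  ! bit_size gives the number of bits in a default integer (32 is assumed here).
  function int_array_bytes(n) result(nbytes)
    integer, intent(in) :: n
    integer :: nbytes
    nbytes = n * (bit_size(n) / 8)
  end function int_array_bytes

end module stack_arith
```

With N = 70000 as in the example program, int_array_bytes(70000) gives 280000, which is the bulk of the value subtracted from esp.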

All of this relates to Win32 only.
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2388
Location: Yateley, Hants, UK

Posted: Fri Dec 04, 2015 11:47 pm

I knew that the WINAPP additions were a historical relic, but they did give me the introduction to asking the questions that have been so admirably answered above.

I followed up the Agner Fog website, and found answers to some (actually unasked) questions there, so thanks to Mecej4 for the introduction.
wahorger



Joined: 13 Oct 2014
Posts: 1217
Location: Morrison, CO, USA

Posted: Sat Dec 05, 2015 12:34 am

I supply the parameters as decimal, and they are reported in the linker in hex.
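
Because the values go in as decimal and come back as hex, a small converter makes it easier to check that the linker picked up what was asked for. The following is a minimal sketch; hex_util and to_hex are made-up names, and the input is assumed to be non-negative and small enough to fit in 8 hex digits.

```fortran
module hex_util
  implicit none
contains

  ! Convert a non-negative integer to a left-justified hexadecimal string.
  ! The digits are built by hand to avoid doing I/O inside the function.
  function to_hex(n) result(s)
    integer, intent(in) :: n
    character(len=8) :: s
    character(len=16), parameter :: digits = '0123456789ABCDEF'
    integer :: m, pos, d
    s = ' '
    m = n
    pos = 8
    do
      d = mod(m, 16)
      s(pos:pos) = digits(d+1:d+1)
      m = m / 16
      pos = pos - 1
      if (m == 0 .or. pos == 0) exit
    end do
    s = adjustl(s)
  end function to_hex

end module hex_util
```

For the SLINK settings quoted earlier in the thread, to_hex(600000) gives 927C0, to_hex(4800000) gives 493E00 and to_hex(2400000) gives 249F00, so those are the values to look for in the load map if it reports hex.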

I have had several unexplained crashes due to this stack overflow error. It still mystifies me, even after doubling the stack commit and reserve, that it will still crash, always at the same place during execution. If I then change something, it may suddenly start working. But what I changed is some relatively innocuous line of code somewhere. Maybe I moved a data declaration (but not into/out-of common). It baffles me.

It is my intent to do some more detailed tracking of the stack using a "probe" at various points in the code to see if there is a leak somewhere, and running the same code without /CHECKMATE to track its stack usage as well (just to make sure).
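
One way such a probe could look is sketched below. It records the address of a local variable at program start and later reports how far the address of another local has moved from that baseline. This is an assumption-laden sketch, not an FTN95 facility: the names stack_probe, probe_init and probe_depth are made up, it relies on the Fortran 2003 ISO_C_BINDING module, it assumes the address converts cleanly to an integer with TRANSFER, and the module itself must be compiled without /SAVE, otherwise the marker variables would be static and the result meaningless.

```fortran
module stack_probe
  use, intrinsic :: iso_c_binding, only: c_loc, c_intptr_t
  implicit none
  ! Address of a local variable recorded near the top of the call chain.
  integer(c_intptr_t), save :: base_addr = 0
contains

  ! Call once, as early as possible in the main program.
  subroutine probe_init()
    integer, target :: marker
    marker = 0
    base_addr = transfer(c_loc(marker), base_addr)
  end subroutine probe_init

  ! Approximate number of bytes between the baseline and the current
  ! stack position. Only differences between successive calls are meaningful.
  function probe_depth() result(depth)
    integer(c_intptr_t) :: depth
    integer, target :: marker
    marker = 0
    depth = abs(base_addr - transfer(c_loc(marker), base_addr))
  end function probe_depth

end module stack_probe
```

Calling probe_init at the start of the run and printing probe_depth() at the same point for each data file would show whether the figure creeps upward from file to file, which is what a leak would look like, or stays flat.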

So, if the error is actually there, I should be able to track it down. If it still occurs, but the stack appears to be just fine, then there might be some other problem.

I'll do what I can. But at some point.....
Bill
mecej4



Joined: 31 Oct 2006
Posts: 1885

Posted: Sat Dec 05, 2015 1:16 am

Bill W.:

If your application is a character mode program, not very large (the data files of reasonable size, as well) and you do not object to posting them here or somewhere in the cloud, I have some forensic tools that I can use to try and catch the source of the stack overflow problem.

I have noticed with some other Fortran compilers that the I/O runtime sometimes leaks a few bytes of memory. If so, perhaps the heap is growing towards the stack, and at some point a stack allocation can fail. I have no reason to suspect that this is the case with FTN95, but it may be worth checking this possibility.
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

Posted: Sat Dec 05, 2015 2:41 am

Mecej4,

I am still puzzled by part of the report you provided.

If "ftn95 stk.f90 & slink stk.obj /stack:0x40000,0x20000" crashes with stack overflow.
But "slink stk.obj /stack:0x40000,0x40000" the program runs, what is going on ?

The two examples have a different commit, but the same reserve. Was the error in the first case that the stack of Reserve:0x40000 was not contiguous ?

Isn't the stack reserve: 0x40000 the maximum stack size, while commit: 0x20000 the initial size ?
I presume:
stack:4000000 ( assume read as decimal this would allocate about 4mb for stack ) is similar to
stack:0x400000 ( read as hex )

Have I been wrong in the past when I provided "stack:40000000", whereas I should have provided "stack:40000000,40000000", i.e. is it wrong to not provide a commit value?

My approach to these problems has always been to identify what is using up the stack and try to minimise this, rather than change the stack size, which can be default sized differently with different linkers.

John
mecej4



Joined: 31 Oct 2006
Posts: 1885

Posted: Sat Dec 05, 2015 3:31 am

If you read the Microsoft page that I referred to above, you can see something that may answer you: "The system commits additional pages from the reserved stack memory as they are needed, until either the stack reaches the reserved size minus one page (which is used as a guard page to prevent stack overflow)". I am not knowledgeable about corner cases where reserve size = commit size, etc., nor do I know if the description on that MS page applies to all versions of their VC compilers and tools.

When I write a program, I try to keep the stack small by using allocatable or statically allocated, saved arrays. However, when helping with some old F77 or F66 code, I find it necessary to specify a large enough stack to allow the program to be run with few changes.
All times are GMT + 1 Hour
Page 1 of 3