Silverfrost Forums

Illegal Pointer problem

28 Oct 2016 8:16 #18252

I have raised this issue in the General forum, and perhaps I should really have raised it here. Below is a summary of the discussion so far, which I am now posting here in the hope that someone can give me an idea of how to go forward and find a solution.

Thanks for reading.

Hi, I have been using FORTRAN for some time (Salford 77 and now Silverfrost Ftn95) and I am now getting an 'illegal pointer' error, and thus an access violation, during execution. The program has been growing slowly with no problems until now. As far as I am aware, I am not using pointers (certainly none that I have defined, anyway), so I am wondering if anyone can shed some light on what is or may be going on. I am running under DOS.

The code at which the problem occurs has been executed at least once before failing. The program is very basic and unsophisticated but I do need to overcome this problem to proceed.

Any comments, views, suggestions welcome!

Thanks

Can you reduce the size of the malfunctioning program and post the source code here? If the program needs to read data files or uses include files, those will also be needed.

When a program crashes, you usually receive a traceback. Please record and provide the traceback information, since we can determine from it whether the illegal access occurred in your code, the runtime library or in the OS.

Hi, I was afraid that this would be necessary! The program has about 50 subroutines and reads data from 30 or so tables in different directories (it is a game I am working on). In addition, there are saved games that need selection and loading, so it would be a substantial undertaking (for me and for anyone looking at it also).

When you refer to traceback, do you refer to the list of called routines? That is certainly available.

If there is no other reasonable way to investigate the matter I will get a zipped file together; please let me know.

Thank you for your interest and help.

Yes. The traceback will let you know whether the crash occurred in one of your Fortran subroutines, the Fortran RTL or the OS. Having that information would let you look at the offending routine more closely than otherwise.

I am developing the program using the Salford debugger, SDBG, so I know just where the fault occurs. I give below the point at which it fails.

The ORGPLY routine is executed (at least) once and all seems to be OK. Then before the last HEADER call the variable P is 5 (P is not in a common block). When the problem is to happen, the HEADER call executes OK as far as the results are concerned (a screen commentary on progress) but when control comes back to the ORGPLY routine, P is already declared as an illegal pointer.

So the HEADER routine seems to do something to upset things. I have stepped through HEADER and checked the screen and I cannot see anything amiss - but clearly something odd is happening.

I tried putting P into a common block and this did allow the MENU call to work OK; but then the illegal pointer error happened in an input routine in the MENU routine.

Do you have any ideas for a way forwards from here?

Thanks!

      SUBROUTINE ORGPLY
      (Text plus common blocks in INCLUDE statements.)
      INTEGER K, I, L, P
   10 CALL HEADER
      P=5
      CALL MENU(P, K)
      IF(K.EQ.1) THEN
      ..........


Any help would be appreciated.

Thanks

28 Oct 2016 10:28 #18256

It looks as if you may have an interface problem with MENU. You could try compiling all code with /check, then run in SDBG. Another option would be to put all code in a single file and compile with /check.
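One way to rule out an interface mismatch is to give the routine an explicit interface, so that the compiler checks every call. The sketch below is only an illustration: the body of MENU and the module name MENUS are invented here, since the real routine was not posted.

```fortran
module menus
  implicit none
contains
  ! Hypothetical stand-in for MENU: P is the number of options offered,
  ! K returns the option chosen (here simply the last one).
  subroutine menu(p, k)
    integer, intent(in)  :: p
    integer, intent(out) :: k
    k = p
  end subroutine menu
end module menus

program interface_demo
  use menus, only: menu
  implicit none
  integer :: p, k
  p = 5
  call menu(p, k)        ! argument count and types are checked at compile time
  ! call menu(p)         ! would be rejected: missing argument K
  ! call menu(5.0, k)    ! would be rejected: REAL passed where INTEGER expected
  if (k /= 5) stop 1     ! self-check for the sketch
  print *, 'k =', k
end program interface_demo
```

With the routine in a module, a wrong argument list becomes a compile-time error rather than a run-time surprise.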

29 Oct 2016 6:23 #18257

Thanks for your suggestion. I am already compiling with /check though!

I will try the 'single file' approach but I am wondering how this might help - although it would add a substantial re-compilation overhead to program development if it works.

You are pointing to possible problems with MENU but before the HEADER call all seems OK and on return P becomes an illegal pointer before the MENU call. Could it not be some problem in HEADER? Not that I have any idea what!

29 Oct 2016 7:26 #18258

Does it run OK if you don't use /check?

29 Oct 2016 8:19 #18259

No, it does not. It fails as before with P being an illegal pointer on return from HEADER.

29 Oct 2016 11:18 #18269

Have you tried the /SAVE option, which makes all local variables statically allocated (assuming that is compatible with how your code is designed)?

30 Oct 2016 6:06 #18270

Thanks for your suggestion. I added /save to the compiler options and the program ran OK, well past the point at which it failed previously.

I am now a little outside my comfort zone in understanding just what is happening. OK, I can see from the manual that the program is now using dynamic storage rather than the stack so does it imply that the stack was getting corrupted in some way before? If so, is it a case of waiting to see what the next problem is since something unusual is going on?

I also notice that 'its use should usually be avoided' according to the manual; so while I am pleased to be able to continue working I wonder what may be in store.

Or am I simply worrying too much?

Whatever, many thanks for your help.

30 Oct 2016 10:44 #18272

Alan,

Like you I have used FTN77 and later FTN95 for many years. I came across a problem like this (that is solved by /SAVE) once in all that time.

Essentially, in traditional Fortran, anything in a COMMON block has storage allocated to it for the whole program run (I declare it in the PROGRAM unit to be sure, as I'm not a trusting soul), but all other variables have space allocated when they are used, and this storage is released when the subprogram exits.

It happened to me in association with the return value for a WINIO@ call, so when you write:

IR = WINIO@( ...

and you have a subroutine that sets up a window that continues in existence after the subroutine exits, the return path is lost because the storage associated with IR is freed.

Decades ago, many mainframe compilers allocated everything statically anyway, but FTN is smart.

If you program with MODULEs, then the rules of what is statically allocated and what is not are more complex (no doubt Mecej4 would say 'marginally').

If anything is helped by /SAVE, or indeed by SAVE in a subprogram, it is because you are relying on a variable continuing to be in existence when its storage has been released. I used to think I was ace at not assuming this, but WINIO@ caught me out. (I never before used FUNCTIONs that returned an error/success code, as I regard this as a sort of abomination, having been brought up on a more restricted view of programming styles).

I have a feeling that if you spread WINIO@ calls across several routines the same problem arises. I certainly do this for very complicated windows.

(I made the assumption that you were using Clearwin+ from your mention of games, but there must be analogous situations even if you are not).

Eddie

30 Oct 2016 1:19 #18275

Quoted from LitusSaxonicum

Essentially, in traditional Fortran, anything in a COMMON block has storage allocated to it for the whole program run (I declare it in the PROGRAM unit to be sure, as I'm not a trusting soul).

Variables in a named COMMON block are not given an implicit SAVE attribute if that block does not occur in the main program unit. In such a case, every block that the user wants to save must be explicitly given the SAVE attribute in each subprogram where that block is used. (paraphrasing 5.2.4 of Fortran 95 standard).
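As a minimal sketch of that rule (the block name /STATE/ and the routine names are made up for illustration), here is a named COMMON block that does not appear in the main program, so each subprogram that uses it carries its own SAVE statement:

```fortran
program common_save_demo
  implicit none
  integer :: total
  call reset_state()
  call bump(total)
  call bump(total)
  print *, 'total =', total
  if (total /= 2) stop 1   ! self-check for the sketch
end program common_save_demo

subroutine reset_state()
  implicit none
  integer :: ncalls
  common /state/ ncalls
  save /state/             ! the block is not declared in the main program
  ncalls = 0
end subroutine reset_state

subroutine bump(total)
  implicit none
  integer, intent(out) :: total
  integer :: ncalls
  common /state/ ncalls
  save /state/             ! repeated in every subprogram that uses the block
  ncalls = ncalls + 1
  total = ncalls
end subroutine bump
```

Without the SAVE statements, most compilers would still keep the value in practice, but the standard would not oblige them to.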

30 Oct 2016 2:11 #18276

Well there you go. So even COMMON isn't the friend I thought it was. But then all my named COMMON blocks are always in my main program unit, because I started that way. Does BLOCK DATA give them a SAVE? It would be logical if it did, not that I use BLOCK DATA either! (I do my initialisation in SUBROUTINE BLOCK_DATA, and then I can re-initialise things if I feel the need).

What happens with blank COMMON - seems that has an implicit SAVE.

So what are the rules for MODULEs? Not that I need to know, but no doubt someone does.

Eddie

30 Oct 2016 3:01 #18278

Your common sense and experience serve you well, but standards writers have to cater to us pedestrians, too.

Then, there are historical reasons for the distinction between blank common and labelled common: on mainframes, blank common could not be initialized because the memory area in which blank common was placed was used by the linker/loader itself, and became available only when the loader was done and handed control to the just-loaded program.

I never liked BLOCK DATA. It is so easy to forget to include the corresponding OBJ file at link time. If you do forget, you won't see any linker errors, but your program will do unexpected things because the intended initialisations did not take place.

Section 5.5.2.4 of Fortran 95 says this about blank common:

Differences between named common and blank common

A blank common block has the same properties as a named common block, except for the following:

(1) Execution of a RETURN or END statement may cause data objects in a named common block to become undefined unless the common block name has been declared in a SAVE statement, but never causes data objects in blank common to become undefined (14.7.6).

(2) Named common blocks of the same name shall be of the same size in all scoping units of a program in which they appear, but blank common blocks may be of different sizes.

(3) A data object in a named common block may be initially defined by means of a DATA statement or type declaration statement in a block data program unit (11.4), but objects in blank common shall not be initially defined.

30 Oct 2016 4:19 #18281

Thanks to all who have commented. I hope that the /SAVE will keep things running, at least for a while. In passing, I did try the idea of putting all the code into a single file but that did not help at all.

So, thanks again, your help is really appreciated.

30 Oct 2016 4:31 #18282

Neat! But I hardly call you a pedestrian.

Ultimately, the point of the original request was to track down why the pointer error occurred, and subsequently why /SAVE fixed the problem. The answer lay with something becoming undefined. Now most of us who program by experience as distinct from a reading of the Fortran standard know about variables being undefined on exit from a program unit, but it creeps up on you with WINIO@, and I suppose too in named COMMON blocks that aren't defined in a main program unit.

The problem you describe (of a BLOCK DATA subprogram being missed at link time) is one I never encountered, but Les Hatton who used to write on such matters once wrote that if you used named BLOCK DATA routines, and used those names in an EXTERNAL statement, you at least got an error if they were missed. I thought that was dodgy, partly because I used machines/compilers that didn't even have BLOCK DATA, let alone named BLOCK DATA, and I regarded EXTERNAL as being intended for something else so his usage was abusing it. (I only ever use EXTERNAL in connection with Clearwin+).
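For anyone who has not seen the named BLOCK DATA plus EXTERNAL trick, a minimal sketch follows. The names INIT_TABLES, /TABLES/ and NROWS are invented for illustration. Whether a missing object file actually produces a link error depends on the compiler and linker, so treat this as something to try rather than a guarantee.

```fortran
block data init_tables
  implicit none
  integer :: nrows
  common /tables/ nrows
  data nrows /30/          ! initial value lives in the block data unit
end block data init_tables

program block_data_demo
  implicit none
  external init_tables     ! creates a reference the linker must resolve
  integer :: nrows
  common /tables/ nrows
  print *, 'nrows =', nrows
  if (nrows /= 30) stop 1  ! self-check: fails if the initialisation was lost
end program block_data_demo
```

The EXTERNAL statement is allowed to name a block data program unit, so this usage is within the standard even if it looks odd.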

I suppose that I reconcile my frequent reading of the user manuals with extremely rare recourse to the Fortran standard as the result of decades-long experience that compilers don't always follow the standard, but then compilers don't always follow their own manual ...

So, why don't you spill the beans on what you are likely to find happening to variables defined in MODULEs? If such a module is USEd in a main program unit, then is the data in it effectively SAVEd, but if not, then is it effectively lost when one exits the last subprogram that USEd it?

That could be the problem described in another thread.

Eddie

30 Oct 2016 5:18 #18284

Quoted from LitusSaxonicum

...What are you likely to find happening to variables defined in MODULEs? If such a module is USEd in a main program unit, then is the data in it effectively SAVEd, but if not, then is it effectively lost when one exits the last subprogram that USEd it?

Yes. In regard to this question, module variables and labelled common block variables are very similar. Here is the relevant extract of the standard, section 14.7.6:

(3) The execution of a RETURN statement or an END statement within a subprogram causes all variables local to its scoping unit or local to the current instance of its scoping unit for a recursive invocation to become undefined except for the following:

...

(c) Variables in a named common block that appears in the subprogram and appears in at least one other scoping unit that is making either a direct or indirect reference to the subprogram.

...

(e) Variables accessed from a module that also is referenced directly or indirectly by at least one other scoping unit that is making either a direct or indirect reference to the subprogram.

You may observe that even the wording is similar. The main difference is that module variables are accessed by name (as modified by rename clauses, if any) whereas named COMMON block variables are accessed by their positions relative to the base of the block.

To protect against module variables becoming undefined, one can specify SAVE for a subset or all the variables in a module, as an alternative to USEing the module in the main program.
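A minimal sketch of the SAVE route, with invented names (GAME_STATE, TURN, START_GAME, NEXT_TURN): the module is deliberately not USEd in the main program, and a blanket SAVE in the module keeps its variable defined between calls.

```fortran
module game_state
  implicit none
  save                     ! blanket SAVE: applies to every variable in the module
  integer :: turn
end module game_state

subroutine start_game()
  use game_state
  implicit none
  turn = 0
end subroutine start_game

subroutine next_turn(t)
  use game_state
  implicit none
  integer, intent(out) :: t
  turn = turn + 1          ! relies on TURN surviving the previous RETURN
  t = turn
end subroutine next_turn

program module_save_demo
  implicit none            ! note: no USE game_state here
  integer :: t
  call start_game()
  call next_turn(t)
  call next_turn(t)
  print *, 'turn =', t
  if (t /= 2) stop 1       ! self-check for the sketch
end program module_save_demo
```

To save only some variables, the attribute can instead be put on individual declarations, for example `integer, save :: turn`.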

30 Oct 2016 6:42 #18285

Excellent. I'm sure your summaries will prove to be cross referenced when there is another user with a similar problem.

Eddie

31 Oct 2016 12:26 #18286

The idea that COMMON (whether labelled or blank) or MODULE variables could go out of scope and lose their value is very worrying at first. However, the only case where I have experienced that is with an overlaying linker. Apart from that case, none of the other linkers I have ever used lose values in COMMON. The FTN95 linker SLINK allocates a static memory address to all COMMON, so this is not a problem for /32. SLINK64 does allocate memory addresses for COMMON at run time, so they are dynamically allocated, but I have not seen any indication that they could go out of scope and be de-allocated.

The idea that COMMON could be deallocated if going out of scope could stop many library routines from working; most of mine and I suspect many others. It is a shame about this scare campaign, as the concept of COMMON being out of scope should have been limited only to overlaying linkers. I don't see any other situation where it is required.

The apparent problem being reported in this thread appears to be that local variables, which are typically dynamically allocated to the stack (or heap, via malloc if large arrays) are being lost on exit from the subroutine/function. This is a much more common problem with very old static code, as I think dynamic allocation of local variables has been around since F77.

I would discourage the /save solution, as rather than use /static, the better solution is to identify which variables should have their value retained and allocate them with the SAVE attribute. This would document the problem for others who may need to identify the same problem.

31 Oct 2016 1:29 #18288

Quoted from JohnCampbell

I would discourage the /save solution, as rather than use /static, the better solution is to identify which variables should have their value retained and allocate them with the SAVE attribute. This would document the problem for others who may need to identify the same problem.

I agree with the recommendation, but there are many cases (of modernising old code, e.g., in Netlib/TOMS) where the original programmer(s) assumed /SAVE, but it turns out that SAVE is needed only for a small subset of the variables.

In such cases, an initial run with /SAVE can be used to produce reference results, following which one keeps pruning away the SAVE attributes until the program starts producing incorrect results. Or, one can remove SAVE from all variables, and keep adding SAVE until the program starts producing correct results.

31 Oct 2016 7:28 #18290

Eddie, just a quick comment to make something clear: I am not using WINIO@ but running in DOS - very humble stuff. Thanks for your comments, though.

As far as the discussion on /save and /static is concerned, I am getting less clear about the best way to go forward; and this is not helped by the fact that I can find no reference to /static in the Ftn95 manual...

For the moment, at least, I guess that I am staying with /save!

31 Oct 2016 8:36 #18291

Sorry, my mistype /static should have been /save.

What I am suggesting is that you should identify the local variables whose value is expected to be retained between calls. This can be coded using a statement like:

integer, save :: call_count

A data statement also makes a local variable static.

Variables that retain their values are described as having a static memory address, while those that are not saved can have a new dynamic address each time the routine is called, as they are allocated dynamically onto the stack on each call then released from the stack when exiting the routine. This is the way most (standard conforming) compilers work.
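As a minimal sketch of the difference (the routine name COUNT_CALLS is made up and is not from the program in this thread), a local variable with the SAVE attribute keeps its value from one call to the next:

```fortran
program save_demo
  implicit none
  integer :: i, n
  do i = 1, 3
    call count_calls(n)
  end do
  print *, 'calls so far:', n
  if (n /= 3) stop 1       ! self-check for the sketch
contains
  subroutine count_calls(total)
    integer, intent(out) :: total
    ! SAVE gives call_count a static address, so it survives RETURN.
    ! The initial value is applied once, not on every call.
    integer, save :: call_count = 0
    call_count = call_count + 1
    total = call_count
  end subroutine count_calls
end program save_demo
```

If call_count were an ordinary local variable that the routine assigned on one call and read on the next, the standard would leave its value undefined on re-entry, even though some compilers happen to preserve it.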

While it may appear to be a lot of work, your efforts would provide documentation of the non-standard behaviour of this old code. As you get more familiar with the code and old coding style, it may make it easier for you to identify this non-conforming practice. Or you may take the easy way out and use /save.

Most old programmers have styles that don't fully conform with new standards. I have many, but assuming dynamic allocation of variables is a new change that I have adopted. It is a good change, as few available compilers default to /save.

31 Oct 2016 9:33 #18292

John,

On the other hand, there is a good reason to use /SAVE, and that is that one then knows that every variable behaves in the same way. Using SAVE selectively means that some variables behave one way, and others in another way. Sure, you can document which ones behave differently, but that means in-code comments, and they have to be on every instance of the use of the SAVEd variable, or reading the code years hence you still forget and get confused. (And you have to know or find which ones are the root of the problem).

My personal rule is to know that the scope of a variable name is always limited to the subprogram, and I very rarely use local arrays at all. The overhead of /SAVE is then tiny, and as my meanest (laptop) PC has 2Gb RAM, my programs that take less than 100Mb aren't competing with anything except MS bloatware for it.

When I used overlays, it was to squeeze code into a tiny space left after all the variables, COMMON etc took up most of it.

Alan, I do recommend Clearwin+ as it is an excellent way to get a modern feel to your program. I made my breakthrough with it over a decade ago when students started turning their noses up at DOS based stuff that they'd used for a decade and a half before that.

Eddie
