forums.silverfrost.com Forum Index forums.silverfrost.com
Welcome to the Silverfrost forums
 

Fortran 77 problem - think I have a poltergeist

 
technophobe



Joined: 28 May 2007
Posts: 29

Posted: Mon Feb 16, 2009 11:09 pm    Post subject: Fortran 77 problem - think I have a poltergeist

Hi guys, I'm working on a CFD code that my predecessor developed, and it's written in Fortran 77. I have a good amount of experience in Fortran 90 but am having a few teething troubles with his code. If someone could shed some light on these issues I'd be very grateful.

The code uses a common file which stores all the global variables; this file is "included" in every subroutine. There is only one common file and it contains many variables. In order to introduce some changes to the code (it's massive and a rewrite would take weeks) I have had to introduce some new variables. Two of these are common to the whole program, so I added them to the common file; let's call them Array1 and Array2. When doing this I remembered to declare their type as double precision and gave them the correct array size of (10,10). They are also initialised at the start of the program so that all their array elements are equal to 0.0. In this prototype code Array1 and Array2 are not altered in any way (in other words they should stay at 0.0; once I get this bit working, Array1 and Array2 will be read from a file, which is the next step).

There are two subroutines in the program; let's call them Asub and Bsub. Asub calculates the variables A, B and C and passes them to Bsub; Bsub then performs some calculations and returns an answer. The data passed between the two routines are local variables and are not required at any other point in the program, so I didn't add them to the 'common' file. The code looks something like:

-------------------------------------

Code:
Subroutine Asub

Include Codecommon.cmn

Double Precision A,B,C

A = Array1(1,1)

B = Array2(1,1)

C = 5

CALL Bsub(A,B,C,D)

-------------------------------------

Code:
Subroutine Bsub

Include Codecommon.cmn

Double Precision A,B,C

D = function(A,B,C)

-------------------------------------

My problem is that the solutions coming out of Bsub aren't what I expect. They seem to indicate that Array1 and Array2 have had their values changed. I'm not sure how this is possible, since they are not used at any other point in the program. To verify this I searched for every mention of the variable names; they only appear where I expect them to. I also added a line to Bsub that prints the values of A and B to a file every time the subroutine is called, and I found several non-zero values (magnitude around 4e-4).

Once Array1 and Array2 have been initialised to 0.0 I expect them to stay that way, since they are not altered at any point in the code. I suspect this is something to do with the use of "include" and common files, but I'm rather unfamiliar with these concepts. When I use global variables I always put them in modules.

If someone could please help, you may just save my sanity :)
Bren
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

Posted: Tue Feb 17, 2009 3:36 am

My apologies if I am missing your problem.

Prior to modules, it was good programming practice to use include files for all common definitions. With this approach there are two main things that must be provided in the include file: the type and size declarations, and the common block list.

The file codecommon.cmn should look like:
!
! Declare variable types and size
real*8 other_real_variables
integer*4 other_integer_variables
real*8 array1(10,10)
real*8 array2(10,10)
!
! Allocate them to common list
common /cmd_blk/ other_real_variables, other_integer_variables, &
array1, array2

In an ideal world, all variables in the include file are both declared and listed in the common block variable list.

In a less than ideal world:-
1) The variables may not be declared explicitly. If they are then declared in other routines, the declared type can change their length: e.g. mixing real*8 and integer*4 for the same variable in different routines changes the length of the common block and shifts the implied location of every variable after it. It was common not to declare variables at all, but to rely on the first letter for an implicit type. If the program was developed on a machine where integers and reals had the same byte length, this error would not have been noticed, e.g. on the CDC Cyber machines in the 1970s.
2) Some declared variables may not be listed in the common block variable list. This is a bug, and results in those variables not being shared between routines.
3) In a really bad programming style, variables may be added to the common block list locally in each subroutine, via extra continuation lines or by EQUIVALENCE. It can happen !!
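
To illustrate how two different views of one named common block can change a value with no visible assignment, here is a minimal sketch. It is not taken from the original CFD code: the block name demo_blk, the routine name uses_other_layout and the variable names are all made up for the example, and it assumes free-form source.

```fortran
program common_mismatch
  implicit none
  ! View 1 of the block: one scratch value followed by A.
  double precision :: scratch, a
  common /demo_blk/ scratch, a

  scratch = 0.0d0
  a = 0.0d0
  print *, 'A before call:', a
  call uses_other_layout()
  ! A has changed although no statement of the form "A = ..." was executed.
  print *, 'A after call: ', a
end program common_mismatch

subroutine uses_other_layout()
  implicit none
  ! View 2 of the same block (hypothetical): a two-element work array.
  ! work(2) occupies the same storage as A in the main program.
  double precision :: work(2)
  common /demo_blk/ work

  work(2) = 4.0d-4
end subroutine uses_other_layout
```

A text search for "A" finds nothing suspicious here, because the write goes through a different name that happens to share the same storage.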

The solutions can include:-
1) Make variable declaration explicit, using IMPLICIT NONE. This may result in a lot of work for an old program, however. I adopted IMPLICIT NONE when I started using FTN95; it finds a lot of bugs.
2) You could experiment by only making sure that all variables in the include file are explicitly declared, say one per line (see clearwin.ins and other FTN95 include files as examples). This should flag double declarations as errors when compiling with FTN95.
3) Experiment with converting the common include into a module, then bring it in with a USE statement. This removes the problem of arrays overwriting each other and may pick up the problem of an array's type being declared outside the module.
4) Be aware that with a module, old-fashioned implicit equivalence can no longer be guaranteed, as it was the common block variable list that guaranteed the location of arrays. E.g. "common a,b,c" could also be passed to a subroutine as an array a(3), to address b and c. I've done this one ! It can change the rank and size of arrays.
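
As a rough sketch of option 3, the include file could become a module along the following lines. The module name code_globals is hypothetical, only the two new arrays are shown, and the expression assigned to d is a placeholder rather than the real calculation.

```fortran
module code_globals
  implicit none
  ! Declared and initialised in exactly one place; no common block
  ! list whose order has to be kept in step across routines.
  double precision :: array1(10,10) = 0.0d0
  double precision :: array2(10,10) = 0.0d0
end module code_globals

program module_demo
  implicit none
  double precision :: d
  call asub(d)
  print *, 'D =', d
end program module_demo

subroutine asub(d)
  ! Replaces the INCLUDE line; ONLY documents which globals are used.
  use code_globals, only: array1, array2
  implicit none
  double precision, intent(out) :: d
  ! Placeholder for the real calculation.
  d = array1(1,1) + array2(1,1) + 5.0d0
end subroutine asub
```

The module has to be compiled before any routine that uses it, which is one reason stale .mod files matter.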

The aim is to minimise the changes to the original code.

Good luck,
John
technophobe



Joined: 28 May 2007
Posts: 29

Posted: Tue Feb 17, 2009 2:08 pm    Post subject: Thanks

Thanks for your reply John - I really appreciate you taking the time to respond to my rather lengthy post.

Unfortunately I still haven't resolved the issue. The code I am changing has been very well written, so the fault must lie with the additions that I have made. The original author has used IMPLICIT NONE everywhere, so there are no problems with mistyped variable names. The new variables have all been declared in a common block and explicitly set to zero.

I have run some tests to try to locate my problem. The difficulty seems to be that I set variable A equal to an array element, Array1(1); at the moment there is only one element in the array and it is equal to zero. I then pass A to a subroutine. The subroutine uses A in some calculations but never reassigns it. It then outputs some results and writes A to a file for post-processing.

This all happens inside a loop which is repeated many times. Other variables are involved elsewhere in the program, but these all behave as I would expect. At some stage, however, the value of Array1(1) gets reassigned to some non-zero value. Later in the loop it returns to zero again! There is only one assignment statement for Array1 in the entire program, and this sets each element (one at the moment) equal to zero.

I have checked this code with some work colleagues and no-one can work out what's going on! It simply doesn't make sense for the code to work this way. The author of the original code now works at another company so I'm going mad trying to work it out. If someone could shed some light on this I'd be very grateful.
Bren
brucebowler
Guest





Posted: Tue Feb 17, 2009 2:15 pm    Post subject: Re: Fortran 77 problem - think I have a poltergeist

technophobe wrote:

Code:
Subroutine Asub
...
CALL Bsub(A,B,C,D)

-------------------------------------
Code:
Subroutine Bsub
...
D = function(A,B,C)



If that call and that subroutine declaration are REALLY the way you have them, then the subroutine declaration is your problem.

It should be
Code:
Subroutine Bsub(A,B,C,D)


If that's not the case, can you post the actual code and actual declarations so we can look at them?
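
For reference, a matching call and declaration might look like the sketch below. The values and the expression for D are placeholders, since the real relationship was not posted.

```fortran
program arg_list_demo
  implicit none
  call asub()
end program arg_list_demo

subroutine asub()
  implicit none
  double precision :: a, b, c, d
  a = 0.0d0
  b = 0.0d0
  c = 5.0d0
  ! Four actual arguments, all double precision ...
  call bsub(a, b, c, d)
  print *, 'D =', d
end subroutine asub

! ... matched by four dummy arguments of the same type and order.
subroutine bsub(a, b, c, d)
  implicit none
  double precision :: a, b, c, d
  ! Placeholder for the real calculation.
  d = a + b*c
end subroutine bsub
```

With external subroutines like these the compiler is not required to check that the two lists agree, so a mismatch in count or type can go unreported.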

Bruce
technophobe



Joined: 28 May 2007
Posts: 29

Posted: Tue Feb 17, 2009 2:45 pm    Post subject: Subroutine

Thanks Bruce, you're right: I should have included the variables in the subroutine call and the subroutine declaration. In the real code I have made sure that the subroutine and the calling program both use the same variable list. In both cases the variables are local, so they are declared at the start of each routine. I made sure of this by copy-pasting the variables.

Unfortunately I can't post the actual code because it would violate IPR regs at work. I'm sorry - I know that is a pain :(
technophobe



Joined: 28 May 2007
Posts: 29

Posted: Tue Feb 17, 2009 3:55 pm    Post subject: Simpler problem

Hello again guys,

To try to resolve this issue I have removed the arrays entirely and am just using real variables. I still have the same problem, although I am unsure why. It appears that a variable is being overwritten somewhere, although I have checked every single assignment statement where the variable appears and it is only assigned once.

I now have variable A, set equal to zero early in the code. The simplified code looks like

Code:
---------------------------------------------------------------
SUBROUTINE ASUB
DOUBLE PRECISION B,C,D !Local variables, A is common
B = A
C = 5
CALL BSUB(B,C,D)
---------------------------------------------------------------
SUBROUTINE BSUB(B,C,D)
DOUBLE PRECISION B,C,D
D=function(B,C)
---------------------------------------------------------------


The subroutine BSUB does a few things but none of them assign any values to A or B. When I write the A and B variables to a file I notice that they are identical, beginning at zero and then taking some non-zero value. They then go back to zero.

Some process must be overwriting the data in A but without an assignment A=? I cannot see how this is possible.

Could someone please advise?

P.S. If you've got this far then thank you for reading my posts - this isn't an easy problem to describe and I appreciate your attention.
brucebowler
Guest





Posted: Tue Feb 17, 2009 4:40 pm

How is "function" declared? Is it double precision? If not, it probably should be, and if so, there should probably be a
Code:
double precision function

in the declarations section of bsub
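
A minimal sketch of that declaration, assuming an external function with the made-up name relation and a placeholder body:

```fortran
program function_type_demo
  implicit none
  double precision :: d
  call bsub(2.0d0, 5.0d0, d)
  print *, 'D =', d
end program function_type_demo

subroutine bsub(b, c, d)
  implicit none
  double precision :: b, c, d
  ! Declares the result type of the external function. Without this line
  ! and without IMPLICIT NONE, the name would be typed by its first letter
  ! (default REAL) and the result would be misinterpreted.
  double precision :: relation
  d = relation(b, c)
end subroutine bsub

double precision function relation(b, c)
  implicit none
  double precision :: b, c
  ! Placeholder for the real relationship.
  relation = b*c
end function relation
```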
Martin



Joined: 09 Sep 2004
Posts: 43

Posted: Tue Feb 17, 2009 10:08 pm

If a variable's value is changing but you can't see why I'd suggest it is a memory corruption error in some other part of the code, e.g. a buffer overflow.

I'd recommend compiling with checkmate and/or setting a write breakpoint on the variable being corrupted in SDBG - this will pause execution of the program when that variable is modified so you can see which statement is doing the damage.
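
As an illustration of the kind of overflow meant here, the deliberately broken sketch below writes one element past the end of an array that sits just before A in a common block. All names are hypothetical. The out-of-bounds write is an error, so the result is not guaranteed: without bounds checking, many compilers will let the stray write land on A, whereas a build with run-time array bounds checking should stop at the offending statement instead.

```fortran
program overflow_demo
  implicit none
  double precision :: buf(3), a
  common /demo_blk/ buf, a

  a = 0.0d0
  ! Deliberate bug: asks for 4 elements of a 3-element array.
  call fill(4)
  print *, 'A =', a
end program overflow_demo

subroutine fill(n)
  implicit none
  integer :: n, i
  double precision :: buf(3), a
  common /demo_blk/ buf, a

  do i = 1, n
     ! When i = 4 this is out of bounds; A is the next item in the block.
     buf(i) = 4.0d-4
  end do
end subroutine fill
```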
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2388
Location: Yateley, Hants, UK

Posted: Wed Feb 18, 2009 12:11 am

In Fortran 77 style programming, you can pass variables between routines using the subprogram parameter lists OR through named COMMON blocks. If one method gives you grief, try the other. My advice is that if you are passing variables between routines via a parameter list and getting corruption, then an answer is to put the variables into a uniquely named COMMON block, mentioned in your ASUB and BSUB routines and possibly nowhere else.

If the problem then goes away (relative to passing variables via the parameter list) then your bug is probably in the way you set up the variables to be passed. If it doesn't go away, then the values are being reassigned somewhere else.
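
A sketch of that experiment, assuming a block name ab_link that appears nowhere else in the program and a placeholder calculation in BSUB:

```fortran
program private_block_demo
  implicit none
  call asub()
end program private_block_demo

subroutine asub()
  implicit none
  double precision :: b, c, d
  ! Block shared only by ASUB and BSUB (hypothetical name).
  common /ab_link/ b, c, d

  b = 0.0d0
  c = 5.0d0
  call bsub()
  print *, 'D =', d
end subroutine asub

subroutine bsub()
  implicit none
  ! Same names, types and order as in ASUB.
  double precision :: b, c, d
  common /ab_link/ b, c, d

  ! Placeholder for the real calculation.
  d = b + c
end subroutine bsub
```

Keeping the declaration of the block identical in both routines is the point of the exercise; a single include file holding these two lines would enforce that.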

Eddie
technophobe



Joined: 28 May 2007
Posts: 29

Posted: Wed Feb 18, 2009 12:33 pm    Post subject: Thanks guys

Thanks for the replies guys, there's some very helpful information there.

I think I confused matters when I wrote
Code:
D=function(B,C)

by this I meant that D depends only upon the variables B and C, it's a horrible relationship that I won't bore you with. I haven't defined any functions in my code, nor have I used any intrinsic ones.

I think the last two comments are correct - the problem must be something to do with the way data is passed from one subroutine to the other. I will double check this and see if it fixes things.

Many thanks,
Bren
technophobe



Joined: 28 May 2007
Posts: 29

Posted: Thu Feb 19, 2009 10:12 am    Post subject: For the record

Hi Guys,
I think I have finally fixed my problem. Unfortunately, I made two changes at the same time and am not sure which one to attribute the success to.

When I changed the code initially I added six new variables. These went into the common file, where I added them to some existing named common blocks. I didn't think that this would cause any problems, as the program never refers to the common blocks by name. I decided to change this and put all the new variables in a new named common block of their own.

Finally, I deleted the executable and recompiled everything. There are several dozen subroutines in separate files, and this was a last-ditch attempt to see if it was a compiler error.

I think the former change is likely to have fixed things, but I can't completely rule out the second. I thought I should let you guys know in case someone comes across a similar problem someday.

Thanks very much for your help
Bren
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

Posted: Fri Feb 20, 2009 1:57 am

I think both changes would have been required.

Modules were introduced to replace include files: they need only a single declaration of each variable, with no separate common block list whose order must also be maintained. They are a much more robust method, especially when changing a program.
While common blocks have the usable feature of defining the variable order, you do have to be careful when using different variable lists for the same common block. A good coding rule for common is never to put common declarations in the code itself, but always to define them in a single include file. Unfortunately I break this rule many times, although I am migrating to modules. (I have yet to try EQUIVALENCE for variables in a module, to replicate some of the things you could do with COMMON. They are probably better not being done !)

With regard to obsolete .obj files, and especially .mod files, I have never mastered the use of the make utility, and instead still use my make.bat file, where the first two lines are :-
del *.obj
del *.mod

This approach has been a good way of guaranteeing that recent coding changes take effect, as I don't think the date stamp method of the make utility is sufficiently robust, especially when developing on multiple computers.

Similarly, it is my impression that if the .mod file is newer than the .f95 (or .for) file that defines it, then FTN95 does not replace it. If I copy a newer .mod file from another computer, which had an older version of the .f95 definition, this leads to many problems. It is safer to delete the files and recompile everything. There are few compiles that take too long these days.

John
All times are GMT + 1 Hour