forums.silverfrost.com

jcherw · Joined: 27 Sep 2018 Posts: 57 Location: Australia

I am working on a complex Finite Difference code and testing it's results vs. a standard (published) analytical solution. The code is from a reputable source (US government agency) and has been 'widely' used. When compiling the code with the Salford v254, I get the correct result for the test problem. However with the latest Silverfrost v8 32b compiler I get a significantly different and incorrect result. Could anyone advise me if there are some initialization or precision differences between these 2 compiler versions and/or if there are any compile options I could use to address this.

PaulLaidler · Posted: Thu Feb 28, 2019 10:10 am Post subject:

The results should be the same if both are compiled with FTN95.

Are you using the same FTN95 command line options in both cases?

JohnCampbell · Joined: 16 Feb 2006 Posts: 2618 Location: Sydney

If you are using Ver 8 32-bit, then both tests should be using x87 computation.
If they are different, then there may be different optimisation in place. (what compile options are you using)
You need to determine if the differences are due to round-off variability or are significantly different.
For significantly different, you need to identify where the differences come from.
You could try FTN95 /64, as this is a different instruction set, that doesn't use x87 instructions.

I would suggest you use /checkmate to utilise the run time checking available in FTN95.

Unfortunately "complex Finite Difference" suggests a difficult task

It might not be the last time that old legacy codes show up long hidden bugs.

jcherw · Joined: 27 Sep 2018 Posts: 57 Location: Australia

As far as OI am aware I am using teh same options - here is the detail:

Here is the Salford 254 command line to compile

C:\Progra~2\salford\FTN95 mod_array.f95 /LIST mod_array.f95.lis

and than SLINK simply load all obj files and create exe file

For the Silverfrost v8 I used Plato w project properties as following ticked / added

Miscellaneous
- extra compiler options - /FIXED_FORMAT
- output file name - HST2D.exe
- output file type - EXE
- pre-process source files is ticked

Switches /FPP /CFPP /FIXED_FORMAT

all other compiler/linker options are NOT ticked

Let me know if there is a difference after all.

jcherw · Joined: 27 Sep 2018 Posts: 57 Location: Australia

I did use the 32 b and 64 b compilation for Silverfros v8 and I also used release and checkmate.

All the above gave the same result ie. different from Salford 254.

mecej4 · Joined: 31 Oct 2006 Posts: 1899

I noticed the name "HST2D" in your post. It so happens that I worked with the successor USGS code HST3D 2.2.16 in 2016. I found several bugs (at least 15 of them) in the source code, using FTN95 and /checkmate. There were variables that needed initialisation, arrays that were too small or were referenced with indices outside their bounds, and variables that needed to be SAVEd.

You may see some posts in the thread http://forums.silverfrost.com/viewtopic.php?t=3255 that touch HST3D, although in that thread we were more concerned with the (then) new 64 bit FTN95 compiler and speed.

Assuming that HST3D is a corrected and augmented version of HST2D, I suspect that HST2D may contain similar errors as well. I cannot locate the source of HST2D, but HST3D is available at https://wwwbrr.cr.usgs.gov/projects/GW_Solute/hst/index.shtml .

Such bugs are not unusual, and finding them need not cause loss of respect for the authors of the software. Note that the Silverfrost compilers may also have had bugs that affected the results, as you may see from the list at https://www.silverfrost.com/default.aspx?id=19 . A bug in the compiler can hide a bug in the software being compiled!

If you can provide a link to the HST2D source (the original from USGS or your modified version) along with instructions to reproduce the behaviour that you noticed, I may be able to help.

JohnCampbell · Joined: 16 Feb 2006 Posts: 2618 Location: Sydney

mecej4 · Joined: 31 Oct 2006 Posts: 1899

JohnCampbell · Joined: 16 Feb 2006 Posts: 2618 Location: Sydney

It is interesting what we consider to be a compiler bug.

This accusation is often applied to a new compiler, because the previous compiler tolerated, what we accepted to be my "old" Fortran. A number of these include:
# Un-initialised variables, because the old compiler set memory to zero.
# Local variables going out of scope, because the old compiler defaulted all local variables to SAVE. (this is a big problem for multi-thread codes where SAVE should not be used)

Those who have used static allocation compilers or linkers may disagree, but to me these are two serious coding bugs that are repeatedly appearing in complaints about most modern Fortran compilers.

I am not immune to this accusation, as I struggle with Intel iFort's management of array sections, by using non-contiguous memory for array arguments. I am probably now wrong, but I have always assumed that arrays are contiguous, and the array argument transfers the start memory address of the array. (for me to break type mixing rules)

The point is that just because the old compiler got what you now accept as the correct answer, doesn't mean that there are no bugs in the code. The bugs are still there, but did not cause a problem with the old compiler.

LitusSaxonicum · Posted: Fri Mar 01, 2019 10:40 am Post subject:

John_C,

I'm something of a fan of static allocation, but SAVE in all its guises is an abomination, and it is the job of the programmer, not the compiler to initialise variables before first use. It's a lucky accident, no more, if variables are initialised automatically at run time.

Those coding bugs may be present in any code, and I'll give you a win at the argument because they surface in old code when it is being resurrected after a decade or more when nobody looked at it.

There's also a situation where the fixing of a compiler bug invalidates something that was accepted in the past, and now declares it to be an error. The nightmare situation is when a really old code has been 'bodged' ('botched' in some versions of English) and then becomes rather unreadable.

Eddie

jcherw · Joined: 27 Sep 2018 Posts: 57 Location: Australia

Thanks for all comments so far.

As per my initial post, there must be an algorithm in the code that gets compiled by Salford v254 as intended, but with Silverfrost v8 somewhat differently making it working incorrectly.

Please review below document (link below) to see what the difference is. The code calculates the effect of an underground heat exchange cell where a negative heat influx reduces the temperature originally at 10.5 degrees C (Graph 1 - heat exchange cell). The second graph shows the temperature at an observation point at some distance.

https://1drv.ms/w/s!AuTT_gAwgmEIh4YoSh_vjzS_bwq7NA

As graph 2 (observation well) shows the calculation of the SF254 compiled code follows a published analytical solution very closely and the calculation of Silverfrost v8 shows a significant difference.

JohnCampbell · Joined: 16 Feb 2006 Posts: 2618 Location: Sydney

I looked at your results, which are (disturbingly) only slightly different.

Did you use ftn95 /64 or ftn95 32-bit ?
32-bit will use x87 80-bit calculations, while I think 64-bit will use 64-bit calculations, which have a lower accuracy.

Apart from that, I would be compiling with /checkmate to check for out of range addressing and possible undefined values.

jcherw · Joined: 27 Sep 2018 Posts: 57 Location: Australia

I used for Silverfrost ftn95 v8 both 64 and 32 bit and obtained exactly the same (incorrect) result.

I also did compile Silverfrost ftn95 v8 both 64 and 32 bit checkmate This produced a number of warnings re un-initialized variables. After fixing this by initializing variables properly I still got the same incorrect result.

Note that the Salford ftn v254 compiled version, which produced the correct results, was NOT compiled with checkmate option. I am currently preparing a version that can be compiled w checkmate (needs the same fix to initialise variables as per above)

jcherw · Joined: 27 Sep 2018 Posts: 57 Location: Australia

I finally managed to track down the issue after some further discussion with mecej4, who kindly recompiled the code.

It turns out the the version of Salford v2.54 which was passed on to me as a straight folder under c:\Program Files (x86) (ie not an install) compiled by default with double precision reals. I am not sure whether this is standard or influenced by the original setup. I am also not sure if the previous developed was aware of this. I had inherited BAT compile files from the previous developer which did not stipulate 'all double precision' compilation flags.

Silverfrost v8.3 compiles by default with single precision reals.

For most of the test problems I had at hand this did not cause any issue other than totally insignificant differences.

However, when I further analysed the test problem I posted (and that created the problem) did change the basic input via a unit conversion and single precision truncation significantly reduced the heat influx.

I am still not sure if the code require requires a blanket double precision for all other calcs, but obviously as a matter of safety it can't hurt. Most codes of this style I have worked with before had the double precision declarations explicitly inside the code, obviously to avoid the problem I ran into.

Lessons learned (for the umpteenth time): never make assumptions about defaults, and understanding the physics (which was pointed out to me by mecej4) is very valuable when tracking down bug.

All comments from various contributors to this thread are much appreciated.

PaulLaidler · Posted: Mon Mar 04, 2019 8:57 am Post subject:

These things are easy to miss. The original compilations may have used /config to configure the compiler to use /dreal by default.