jcherw
Joined: 27 Sep 2018 Posts: 57 Location: Australia
Posted: Thu Feb 28, 2019 7:45 am    Post subject: Numerical difference Silverfrost v8 vs Salford 2.54
I am working on a complex finite-difference code and testing its results against a standard (published) analytical solution. The code is from a reputable source (a US government agency) and has been widely used. When I compile the code with Salford v2.54, I get the correct result for the test problem. However, with the latest Silverfrost v8 32-bit compiler I get a significantly different, incorrect result. Could anyone advise me whether there are initialisation or precision differences between these two compiler versions, and/or whether there are any compile options I could use to address this?
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7925 Location: Salford, UK
Posted: Thu Feb 28, 2019 10:10 am
The results should be the same if both are compiled with FTN95.
Are you using the same FTN95 command line options in both cases?
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
Posted: Thu Feb 28, 2019 11:49 am
If you are using v8 32-bit, then both tests should be using x87 computation.
If the results differ, there may be different optimisation in place (what compile options are you using?).
You need to determine whether the differences are due to round-off variability or are significantly different.
If they are significantly different, you need to identify where the differences come from.
You could try FTN95 /64, as this is a different instruction set that doesn't use x87 instructions.
I would suggest you use /checkmate to make use of the run-time checking available in FTN95.
Unfortunately "complex finite difference" suggests a difficult task.
It might not be the last time that old legacy codes show up long-hidden bugs.
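The suggestions above can be sketched as command lines (a minimal sketch only; "mycode.f95" is a placeholder for your own source files, and other switches from your project may also apply):

```
rem 32-bit build with full run-time checking (x87 arithmetic)
ftn95 mycode.f95 /checkmate

rem 64-bit build, which uses a different instruction set than x87
ftn95 mycode.f95 /64 /checkmate
```

Comparing the output of these two builds helps separate round-off differences (due to the instruction set) from genuine bugs (caught by the run-time checks).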
jcherw
Joined: 27 Sep 2018 Posts: 57 Location: Australia
Posted: Thu Feb 28, 2019 11:54 am
As far as I am aware I am using the same options. Here is the detail.
Here is the Salford 2.54 command line to compile:
C:\Progra~2\salford\FTN95 mod_array.f95 /LIST mod_array.f95.lis
and then SLINK simply loads all the .obj files and creates the .exe file.
For Silverfrost v8 I used Plato, with the following project properties ticked/added:
Miscellaneous
- extra compiler options: /FIXED_FORMAT
- output file name: HST2D.exe
- output file type: EXE
- pre-process source files is ticked
Switches: /FPP /CFPP /FIXED_FORMAT
All other compiler/linker options are NOT ticked.
Let me know if there is a difference after all.
jcherw
Joined: 27 Sep 2018 Posts: 57 Location: Australia
Posted: Thu Feb 28, 2019 12:11 pm
I used both the 32-bit and 64-bit compilation for Silverfrost v8, and I also used both release and checkmate configurations.
All of the above gave the same result, i.e. different from Salford 2.54.
mecej4
Joined: 31 Oct 2006 Posts: 1886
Posted: Thu Feb 28, 2019 1:37 pm
I noticed the name "HST2D" in your post. It so happens that I worked with the successor USGS code HST3D 2.2.16 in 2016. I found several bugs (at least 15) in the source code, using FTN95 and /checkmate. There were variables that needed initialisation, arrays that were too small or were referenced with indices outside their bounds, and variables that needed to be SAVEd.
You may see some posts in the thread http://forums.silverfrost.com/viewtopic.php?t=3255 that touch on HST3D, although in that thread we were more concerned with the (then) new 64-bit FTN95 compiler and speed.
Assuming that HST3D is a corrected and augmented version of HST2D, I suspect that HST2D may contain similar errors. I cannot locate the source of HST2D, but HST3D is available at https://wwwbrr.cr.usgs.gov/projects/GW_Solute/hst/index.shtml .
Such bugs are not unusual, and finding them need not cause loss of respect for the authors of the software. Note that the Silverfrost compilers may also have had bugs that affected the results, as you may see from the list at https://www.silverfrost.com/default.aspx?id=19 . A bug in the compiler can hide a bug in the software being compiled!
If you can provide a link to the HST2D source (the original from USGS or your modified version), along with instructions to reproduce the behaviour that you noticed, I may be able to help.
John-Silver
Joined: 30 Jul 2013 Posts: 1520 Location: Aerospace Valley
Posted: Thu Feb 28, 2019 11:57 pm
JohnC wrote:
Quote: | If they are different, then there may be different optimisation in place. |
... which, if true, is worrying.
Surely optimisation should only affect the efficiency of execution, not the accuracy of the results? Or am I missing something?
If it does affect accuracy, then it should be in the positive sense, i.e. the results should be 'better'.
Note I'm opining philosophically, not challenging whether the results would change, nor asking for an explanation.
_________________
''Computers (HAL and MARVIN excepted) are incredibly rigid. They question nothing. Especially input data. Human beings are incredibly trusting of computers and don't check input data. Together, cocking up even the simplest calculation ...''
John-Silver
Joined: 30 Jul 2013 Posts: 1520 Location: Aerospace Valley
Posted: Fri Mar 01, 2019 12:04 am
Of course, it would also be useful to the discussion to see what these 'differences' actually are.
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
Posted: Fri Mar 01, 2019 3:11 am
John-Silver wrote: | ... which if true is worrying.
Optimisation should only touch the efficiency of execution not accuracy of the results surely. Or am I missing something.
|
For floating-point calculations, optimisation can change the instruction set or change the order of calculations. Especially for FTN95 /32, x87 register use can also change what is stored in registers and when values are transferred (truncated) to memory.
This can result in different round-off errors, which then need to be assessed for their significance.
The worst round-off might not occur at the end of the calculation, resulting in greater-than-expected errors (e.g. subtracting two very similar values with differing round-off errors).
For iterative solutions, this can change the number of cycles to convergence, which can induce a significant change in the calculated values due to an extra iteration.
Optimisation is many-faceted, and definitely not aimed at optimum accuracy.
And if there is a bug in the code, e.g. uninitialised values, optimisation can make a meal of it.
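A minimal Fortran sketch of the cancellation problem described above (the values are made up purely for illustration):

```fortran
program cancel
   implicit none
   real :: x, y
   ! x and y agree in their first seven significant digits,
   ! which is close to the limit of single precision
   x = 1.0000001
   y = 1.0000000
   ! the subtraction cancels the agreeing digits, so the result
   ! consists almost entirely of round-off; two builds that round
   ! x and y differently can give very different values for x - y
   print *, x - y
end program cancel
```

Run the same subtraction through two different builds (e.g. x87 versus SSE) and the small residual can differ, even though both builds are "correct" to within machine precision.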
mecej4
Joined: 31 Oct 2006 Posts: 1886
Posted: Fri Mar 01, 2019 3:23 am
John-Silver wrote: | Optimisation should only touch the efficiency of execution not accuracy of the results surely. |
Only if exact arithmetic is used (i.e., integer, character and boolean variables) and no integer overflow occurs. If finite precision floating point calculations are performed, the usual rules of algebra, e.g., a+b-a = b, are not always obeyed. Here is an example program to illustrate this.
Code: | program inexact
      implicit none
      real a, b, s
      integer i
      a = 1.0
      b = 2e-7
      do i = 1, 5
         s = a + b
         print '(1x,i2,2ES15.7)', i, b, s - a
         b = 0.5*b
      end do
      end program |
FTN95, and Intel ifort with /Od (optimisations disabled), give
Code: | 1 2.0000000E-07 2.3841858E-07
2 1.0000000E-07 1.1920929E-07
3 5.0000001E-08 0.0000000E+00
4 2.5000000E-08 0.0000000E+00
5 1.2500000E-08 0.0000000E+00 |
Note that the second and third columns, which show b and (a+b)-a, do not agree at all.
Quote: | If it does touch accuracy then it should be in the positive sense i.e. results should be 'better'. |
Usually, the opposite is true. There is a trade-off between at least three factors: (i) speed of compilation and linking, (ii) speed of the resulting program, and (iii) accuracy of results. Many compilers provide "aggressive optimisations" or "unsafe optimisations".
Sometimes, you may get more accuracy than you expect, but for the wrong reasons. For the above program, if optimisations are allowed, the results from Ifort are:
Code: | 1 2.0000000E-07 2.0000000E-07
2 1.0000000E-07 1.0000000E-07
3 5.0000001E-08 5.0000001E-08
4 2.5000000E-08 2.5000000E-08
5 1.2500000E-08 1.2500000E-08 |
Looks great, does it not? The compiler realised that the results could be precomputed at compile time, and produced a program that simply prints the precomputed results. The Fortran standard allows the compile-time arithmetic to be completely different from the run-time arithmetic; in this case, the compile-time arithmetic was probably done in double precision.
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
Posted: Fri Mar 01, 2019 4:12 am
It is interesting what we consider to be a compiler bug.
This accusation is often levelled at a new compiler because the previous compiler tolerated what we accepted as "old" Fortran. Examples include:
# Uninitialised variables, because the old compiler set memory to zero.
# Local variables going out of scope, because the old compiler defaulted all local variables to SAVE. (This is a big problem for multi-threaded codes, where SAVE should not be used.)
Those who have used static-allocation compilers or linkers may disagree, but to me these are two serious coding bugs that appear repeatedly in complaints about most modern Fortran compilers.
I am not immune to this accusation myself, as I struggle with Intel ifort's management of array sections, which can use non-contiguous memory for array arguments. I am probably wrong now, but I have always assumed that arrays are contiguous and that an array argument transfers the start memory address of the array (which lets me break type-mixing rules).
The point is that just because the old compiler got what you now accept as the correct answer doesn't mean that there are no bugs in the code. The bugs are still there, but they did not cause a problem with the old compiler.
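A contrived Fortran sketch of the two bug classes listed above (the routine and names are invented purely for illustration):

```fortran
subroutine accumulate(x, total)
   implicit none
   real, intent(in)  :: x
   real, intent(out) :: total
   ! BUG 1: "running" is never initialised. A static-allocation
   ! compiler happened to zero it; a modern compiler may leave
   ! garbage here, and /checkmate should flag the first use.
   ! BUG 2: without the SAVE attribute, "running" is not guaranteed
   ! to keep its value between calls. The old default of SAVE-ing
   ! all locals hid both problems at once.
   real :: running
   running = running + x
   total = running
end subroutine accumulate
```

With the old compiler this routine would quietly behave as a running accumulator starting at zero; with a modern compiler the behaviour is undefined, which is exactly the kind of "new compiler bug" complaint described above.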
LitusSaxonicum
Joined: 23 Aug 2005 Posts: 2388 Location: Yateley, Hants, UK
Posted: Fri Mar 01, 2019 10:40 am
John_C,
I'm something of a fan of static allocation, but SAVE in all its guises is an abomination, and it is the job of the programmer, not the compiler, to initialise variables before first use. It's a lucky accident, no more, if variables are initialised automatically at run time.
Those coding bugs may be present in any code, and I'll concede the argument because they surface in old code when it is resurrected after a decade or more in which nobody looked at it.
There's also the situation where fixing a compiler bug invalidates something that was accepted in the past, and it is now declared to be an error. The nightmare scenario is when a really old code has been 'bodged' ('botched' in some versions of English) and has become rather unreadable.
Eddie
jcherw
Joined: 27 Sep 2018 Posts: 57 Location: Australia
Posted: Fri Mar 01, 2019 1:19 pm
Thanks for all the comments so far.
As per my initial post, there must be an algorithm in the code that gets compiled by Salford v2.54 as intended, but by Silverfrost v8 somewhat differently, so that it works incorrectly.
Please review the document (link below) to see what the difference is. The code calculates the effect of an underground heat-exchange cell, where a negative heat influx reduces the temperature, originally at 10.5 degrees C (Graph 1 - heat exchange cell). The second graph shows the temperature at an observation point at some distance.
https://1drv.ms/w/s!AuTT_gAwgmEIh4YoSh_vjzS_bwq7NA
As graph 2 (observation well) shows, the calculation from the Salford 2.54-compiled code follows a published analytical solution very closely, while the calculation from Silverfrost v8 shows a significant difference.
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
Posted: Sat Mar 02, 2019 6:33 am
I looked at your results, which are (disturbingly) only slightly different.
Did you use FTN95 /64 or FTN95 32-bit?
32-bit will use x87 80-bit calculations, while I think 64-bit will use 64-bit calculations, which have lower accuracy.
Apart from that, I would compile with /checkmate to check for out-of-range addressing and possibly undefined values.
jcherw
Joined: 27 Sep 2018 Posts: 57 Location: Australia
Posted: Sun Mar 03, 2019 1:36 am
For Silverfrost FTN95 v8 I used both 64-bit and 32-bit compilation and obtained exactly the same (incorrect) result.
I also compiled with Silverfrost FTN95 v8, both 64-bit and 32-bit, using /checkmate. This produced a number of warnings about uninitialised variables. After fixing this by initialising the variables properly, I still got the same incorrect result.
Note that the Salford FTN v2.54-compiled version, which produced the correct results, was NOT compiled with the checkmate option. I am currently preparing a version that can be compiled with checkmate (it needs the same fix to initialise variables as above).