forums.silverfrost.com
Welcome to the Silverfrost forums

 Unexpected FP div-by-zero with /64 Goto page 1, 2  Next
Author Message
mecej4

Joined: 31 Oct 2006
Posts: 1839

Posted: Sat Mar 06, 2021 8:15 pm    Post subject: Unexpected FP div-by-zero with /64

This short program runs when it is compiled with FTN95 8.70 as a 32-bit EXE, and prints -5.000000E+07.

When the source is then compiled with /64 and run, the run is aborted after reporting "floating point division by zero".

 Code: program Utanh implicit none real :: disp = 0.0125, vel =-2.5e-10, wetab, z, mytanh ! mytanh(z) = (exp(z)-exp(-z))/(exp(z)+exp(-z)) z = vel/disp !print *,z wetab = 1.0/mytanh(z)-1.0/z print *,wetab end program

Is the difference attributable to the range of values that the intrinsic EXP() is capable of in each case?
JohnCampbell

Joined: 16 Feb 2006
Posts: 2503
Location: Sydney

 Posted: Sun Mar 07, 2021 5:36 am    Post subject: Ver 8.62 produces an "Access violation" for /64 /lgo Ver 8.62 produces an "Floating point division by zero" for /64 /debug /lgo Gfortran 10.2 produces : GCC version 10.2.0 Infinity
mecej4

Joined: 31 Oct 2006
Posts: 1839

Posted: Sun Mar 07, 2021 10:01 am    Post subject:

Thanks, John.

The test program is an offshoot of an attempt to understand some odd behavior (in the Hydrus6 software from USGS). Really old Fortran did not have hyperbolic functions as intrinsic functions, so the authors used an ASF to implement it, and large parts of these old codes are in single precision.

Here is a different attempt on the issue (approximate 1/tanh(x) - 1/x for small x).
 Code: program mytanh real x,y,z integer i x = 1.0 do i=1,15    y = 1.0/x - 1.0/tanh(x) !error in 1st approx    z = y + x/3.0           !error in 2nd approx    print '(i3,3ES12.4)',i,x,y,z    x=0.2*x end do end program mytanh

Results from Gfortran (may vary with version, compile options):
 Code: 1  1.0000E+00 -3.1304E-01  2.0298E-02   2  2.0000E-01 -6.6489E-02  1.7745E-04   3  4.0000E-02 -1.3332E-02  9.6764E-07   4  8.0000E-03 -2.6627E-03  4.0082E-06   5  1.6000E-03 -4.8828E-04  4.5052E-05   6  3.2000E-04 -2.4414E-04 -1.3747E-04   7  6.4000E-05  0.0000E+00  2.1333E-05   8  1.2800E-05  0.0000E+00  4.2667E-06   9  2.5600E-06  0.0000E+00  8.5333E-07  10  5.1200E-07  0.0000E+00  1.7067E-07  11  1.0240E-07  1.0000E+00  1.0000E+00  12  2.0480E-08  0.0000E+00  6.8267E-09  13  4.0960E-09  0.0000E+00  1.3653E-09  14  8.1920E-10  0.0000E+00  2.7307E-10  15  1.6384E-10  0.0000E+00  5.4613E-11

Line-11 has rather surprising results. On this line x became < eps_mach.
JohnCampbell

Joined: 16 Feb 2006
Posts: 2503
Location: Sydney

Posted: Sun Mar 07, 2021 11:17 am    Post subject:

I changed the code slightly to
 Code: program mytanh !\$ use iso_fortran_env real x,y,z integer i !\$ write (*,*) compiler_version() x = 1.0 do i=1,15    y = 1.0/x - 1.0/tanh(x) !error in 1st approx    z = y + x/3.0           !error in 2nd approx    print '(i3,3ES12.4)',i,x,y,z    x=0.2*x end do end program mytanh

I compiled with:
gfortran mytanh.f90 -o mytanh.exe -ffast-math -fopenmp

resulting in
 Code: GCC version 10.2.0   1  1.0000E+00 -3.1304E-01  2.0298E-02   2  2.0000E-01 -6.6489E-02  1.7745E-04   3  4.0000E-02 -1.3332E-02  9.6764E-07   4  8.0000E-03 -2.6627E-03  4.0082E-06   5  1.6000E-03 -4.8828E-04  4.5052E-05   6  3.2000E-04  0.0000E+00  1.0667E-04   7  6.4000E-05  0.0000E+00  2.1333E-05   8  1.2800E-05  0.0000E+00  4.2667E-06   9  2.5600E-06  0.0000E+00  8.5333E-07  10  5.1200E-07  0.0000E+00  1.7067E-07  11  1.0240E-07  0.0000E+00  3.4133E-08  12  2.0480E-08  0.0000E+00  6.8267E-09  13  4.0960E-09  0.0000E+00  1.3653E-09  14  8.1920E-10  0.0000E+00  2.7307E-10  15  1.6384E-10  0.0000E+00  5.4613E-11

I also printed out "t = tanh(x)"
and tried a variety of alternative compile options.

I get similar results on 5900x and i5-2300 (with gf 9.2.0)
line 6 is also different
PaulLaidler

Joined: 21 Feb 2005
Posts: 7767
Location: Salford, UK

 Posted: Sun Mar 07, 2021 11:35 am    Post subject: This is just a general comment... In the first program z is small and tanh(z) is approximately z. So wetab is the difference of two large values that are almost equal. When the calculation is successful it will be highly suseptible to round-off error. 32 bit extended FTN95 precision real(3) using the intrinsic tanh may give the best result. 32 bit double FTN95 precision real(2) might use extended precision internally for part of the calculation. Single precision is not an option here and double precision probably gives a very poor result.
mecej4

Joined: 31 Oct 2006
Posts: 1839

 Posted: Sun Mar 07, 2021 11:55 am    Post subject: Thanks again, John. I obtained the peculiar line-11 with Gfortran 9.3-64-bit on Cygwin, 4.8.5-64-bit as well as 6.2.0-64-bit on WSL1/Ubuntu14.04.1, and 7.1.1 on a Linux cloud service. I'll have to install a newer 10.x version and see.
JohnCampbell

Joined: 16 Feb 2006
Posts: 2503
Location: Sydney

Posted: Sun Mar 07, 2021 12:35 pm    Post subject:

mecej4,

I only use a pre-built gFortran for windows from equation.com, which is what clearwin_64 uses.
Don't have 9.3, 4.8.5 or 7.1.1 but do have 6.2.0 which produces on i5-2300
 Code: GCC version 6.2.0   1  1.0000E+00 -3.1304E-01  2.0298E-02   2  2.0000E-01 -6.6489E-02  1.7745E-04   3  4.0000E-02 -1.3332E-02  9.6764E-07   4  8.0000E-03 -2.6627E-03  4.0082E-06   5  1.6000E-03 -4.8828E-04  4.5052E-05   6  3.2000E-04  0.0000E+00  1.0667E-04   7  6.4000E-05  0.0000E+00  2.1333E-05   8  1.2800E-05  0.0000E+00  4.2667E-06   9  2.5600E-06  0.0000E+00  8.5333E-07  10  5.1200E-07  0.0000E+00  1.7067E-07  11  1.0240E-07  0.0000E+00  3.4133E-08  12  2.0480E-08  0.0000E+00  6.8267E-09  13  4.0960E-09  0.0000E+00  1.3653E-09  14  8.1920E-10  0.0000E+00  2.7307E-10  15  1.6384E-10  0.0000E+00  5.4613E-11
looks to be identical
also tried mingw-w64 6.2.0 with same results, although mingw-w64 appears to have limited support.
A shame 9.3.0 is not available for windows (or ZEN3 support)

It is surprising you can get similar results on a variety of versions, but I get my similar results. Could it be different hardware instructions being selected ?
It does look to be different round-off which I thought IEEE standardised ?
I tested on i5-2300 and Ryzen 5900X which do use different instructions.
-ffast-math did not appear to modify my results.
mecej4

Joined: 31 Oct 2006
Posts: 1839

 Posted: Sun Mar 07, 2021 3:04 pm    Post subject: John, there IS something strange happening here, but it seems to affect only Gfortran, and probably depends on how various versions of Gfortran react to the precision loss that Paul commented upon. On my Dell XPS-17 laptop with an i7-2720QM, running W10-64, Gfortran 9.3 from Cygwin gives for y the strange value 1.0000E+00 on Line-11, but Gfortran 9.2 from Equation.com gives 0.0000E+00. I even asked the two compilers to generate an assembler file. The two assembler files were identical except for the lines containing .ascii "GCC version 9.3.0" in one and .ascii "GCC version 9.2.0" in the other. Therefore, the differences must be caused by the Gfortran RTLs and how those differ in the two distributions of GCC.Last edited by mecej4 on Mon Mar 08, 2021 1:13 am; edited 1 time in total
JohnCampbell

Joined: 16 Feb 2006
Posts: 2503
Location: Sydney

 Posted: Mon Mar 08, 2021 1:05 am    Post subject: Could you include: print '(i3,4ES12.4)',i,x,y,z,tanh(x) or t = tanh(x) print '(i3,4ES12.4)',i,x,y,z,t It is a strange value of x to produce a different result !
mecej4

Joined: 31 Oct 2006
Posts: 1839

Posted: Mon Mar 08, 2021 2:01 am    Post subject:

John, I did what you asked. Starting from line-5, the values of tanh(x) were the same as that of x to the five significant digits printed. Here are lines 10-12, as produced by Gfortran 9.3.0 (Cygwin):

 Code: 10  5.1200E-07  0.0000E+00  1.7067E-07  5.1200E-07  11  1.0240E-07  1.0000E+00  1.0000E+00  1.0240E-07  12  2.0480E-08  0.0000E+00  6.8267E-09  2.0480E-08
JohnCampbell

Joined: 16 Feb 2006
Posts: 2503
Location: Sydney

Posted: Mon Mar 08, 2021 4:46 am    Post subject:

y = 1/x - 1/tanh = (1-x/tanh)/x

So x/tanh is not 1.0 for 1.024e-7 but is for 5.12e-7 and 2.048e-8 ?

tanh - x must be non-zero in your compiler, which implies an error in the TANH function interpolation.
See if the following identifies the non-zero "tanh - x" (due to rounding ?)

 Code: program mytanh_test !\$ use iso_fortran_env real*4 x,y,z,t real*4 :: one = 1, three = 3, five = 5 integer i !\$ write (*,*) compiler_version() write (*,*) ' i       X           Y           Z         tanh-x' x = one do i=1,15    y = one/x - one/tanh(x)   !error in 1st approx    z = y + x/three           !error in 2nd approx    t = tanh(x)    print '(i3,6ES12.4)',i,x,y,z, t-x    x = x/five end do end program mytanh_test
mecej4

Joined: 31 Oct 2006
Posts: 1839

 Posted: Mon Mar 08, 2021 10:54 am    Post subject: John, even with those versions of Gfortran that gave the two 1.0000 s on Line-11, I find that replacing x = x*0.2 by x = x/5.0 restores expected/normal results. Strange to find the difference between 0.2 and 1/5.0 to cause a change in a result from 0.0E0 to 1.0E0. Perhaps, a striking example for use in a course on programming?
JohnCampbell

Joined: 16 Feb 2006
Posts: 2503
Location: Sydney

 Posted: Mon Mar 08, 2021 11:31 am    Post subject: I am thinking that the actual value x = 1.0240E-07, that is generated using x = x*0.2 is causing a roundoff error in the interpolation formula for tanh (x), so that tanh(x) /= x, so y /= 0 The value of x is very close to epsilon(x), so that a roundoff error would produce y = 1 If tanh(x) is returned as x*(1-epsilon) ; the nearest representable value less than x, ie roundoff error in the calculation of tanh y = one/x - one/tanh = 1/x - 1 / ( x*(1-epsilon) ) = ( x*(1-eps) - x ) / ( x * x*(1-eps) ) = x*eps / ( x * x ) = 1 when x = epsilon (x) you have "stumbled" on a unique value of x where tanh (x) fails due to roundoff, which is only present in some versions of the gFortran tanh library.
mecej4

Joined: 31 Oct 2006
Posts: 1839

Posted: Mon Mar 08, 2021 12:08 pm    Post subject:

John, the following program may help us get closer to the heart of the problem. It does the same calculation as before, but with values of x close to the problematic value. With versions of Gfortran on Windows that use the implementation of tanh() in the Microsoft VC RTL, such as Mingw, the last column is always zero. With Cygwin on Windows, MS WSL-1 and on Linux, we can speculate that the underlying GNU math library causes different behavior, with the last column giving either 1.00000E+00 or 0.00000E+00.

 Code: program mterr use iso_fortran_env implicit none integer i,ix real x,tnhx,f ! write (*,*) compiler_version() print * print *,'           x      x(IEEE hex)    tanh(x)    1/x - 1/tanh(x)' print *,'     ----------   --------    -----------   ---------------'      ix = Z'33DBE6F8' do i=1,16    x = transfer(ix,x)    tnhx = tanh(x)    f = 1.0/x - 1.0/tnhx    print 10,i,x,ix,tnhx,f    ix = ix+1 end do 10 format(i2,ES15.5,2x,Z8,2ES15.5) end program

I'd appreciate your trying this program with your Gfortran 10.x. On a Linux server in the Cloud, here is what I got:

 Code: \$gfortran -std=gnu *.f95 -o main \$main  GCC version 7.1.1 20170622 (Red Hat 7.1.1-3)             x      x(IEEE hex)    tanh(x)    1/x - 1/tanh(x)       ----------   --------    -----------   ---------------  1    1.02400E-07  33DBE6F8    1.02400E-07    1.00000E+00  2    1.02400E-07  33DBE6F9    1.02400E-07    1.00000E+00  3    1.02400E-07  33DBE6FA    1.02400E-07    0.00000E+00  4    1.02400E-07  33DBE6FB    1.02400E-07    1.00000E+00  5    1.02400E-07  33DBE6FC    1.02400E-07    1.00000E+00  6    1.02400E-07  33DBE6FD    1.02400E-07    0.00000E+00  7    1.02400E-07  33DBE6FE    1.02400E-07    1.00000E+00  8    1.02400E-07  33DBE6FF    1.02400E-07    1.00000E+00  9    1.02400E-07  33DBE700    1.02400E-07    0.00000E+00 10    1.02400E-07  33DBE701    1.02400E-07    1.00000E+00 11    1.02400E-07  33DBE702    1.02400E-07    1.00000E+00 12    1.02400E-07  33DBE703    1.02400E-07    1.00000E+00 13    1.02400E-07  33DBE704    1.02400E-07    0.00000E+00 14    1.02400E-07  33DBE705    1.02400E-07    1.00000E+00 15    1.02400E-07  33DBE706    1.02400E-07    1.00000E+00 16    1.02400E-07  33DBE707    1.02400E-07    0.00000E+00

For readers who have not been following this thread closely: FTN95 is not afflicted with this problem.
JohnCampbell

Joined: 16 Feb 2006
Posts: 2503
Location: Sydney

Posted: Mon Mar 08, 2021 1:04 pm    Post subject:

I used 10 format(i2,ES16.7,3x,Z8,ES16.7,ES15.5)

and got on i5-2300
 Code: Z:\Temp\mecej4>set prog=mytanh4 Z:\Temp\mecej4>del mytanh4.exe Z:\Temp\mecej4>gfortran mytanh4.f90 -o mytanh4.exe -ffast-math -fopenmp -v Z:\Temp\mecej4>mytanh4  GCC version 9.2.0              x       x(IEEE hex)     tanh(x)    1/x - 1/tanh(x)      -------------   --------   -------------   ---------------  1   1.0239995E-07   33DBE6F8   1.0239995E-07    0.00000E+00  2   1.0239996E-07   33DBE6F9   1.0239996E-07    0.00000E+00  3   1.0239997E-07   33DBE6FA   1.0239997E-07    0.00000E+00  4   1.0239997E-07   33DBE6FB   1.0239997E-07    0.00000E+00  5   1.0239998E-07   33DBE6FC   1.0239998E-07    0.00000E+00  6   1.0239999E-07   33DBE6FD   1.0239999E-07    0.00000E+00  7   1.0239999E-07   33DBE6FE   1.0239999E-07    0.00000E+00  8   1.0240000E-07   33DBE6FF   1.0240000E-07    0.00000E+00  9   1.0240001E-07   33DBE700   1.0240001E-07    0.00000E+00 10   1.0240002E-07   33DBE701   1.0240002E-07    0.00000E+00 11   1.0240002E-07   33DBE702   1.0240002E-07    0.00000E+00 12   1.0240003E-07   33DBE703   1.0240003E-07    0.00000E+00 13   1.0240004E-07   33DBE704   1.0240004E-07    0.00000E+00 14   1.0240004E-07   33DBE705   1.0240004E-07    0.00000E+00 15   1.0240005E-07   33DBE706   1.0240005E-07    0.00000E+00 16   1.0240006E-07   33DBE707   1.0240006E-07    0.00000E+00
 Display posts from previous: All Posts1 Day7 Days2 Weeks1 Month3 Months6 Months1 Year Oldest FirstNewest First
 All times are GMT + 1 HourGoto page 1, 2  Next Page 1 of 2