Here is a small program that demonstrates the huge slowdowns caused by underflow interrupts in DAXPY-type Fortran code. The slowdowns occur when underflow exceptions are handled in the default mode of code compiled with FTN95. Most of the issues were written up in another long thread, see https://forums.silverfrost.com/Forum/Topic/2171&postdays=0&postorder=asc&start=0, but here is a compact test code.
program xundfl
   implicit none
   integer, parameter :: NMAX=2000, NREP=50000
   double precision, dimension(NMAX) :: X,Y,Z
   double precision A,t1,t2
   integer :: i,k,n,cnt1,cnt2
   ! Read n, the scalar a and the interleaved vectors x, y from one record
   open(15,file='vecsub.bin',form='unformatted',status='old')
   read(15)n,a,(x(i),y(i),i=1,n)
   close(15)
   ! Record the start time and the underflow count (FTN95 library routines)
   call dclock@(t1)
   call underflow_count@(cnt1)
   ! Repeat the DAXPY-type update z = y - a*x NREP times
   do k=1,NREP
      do i = 1,n
         z(i) = y(i) - a*x(i)
      end do
   end do
   call underflow_count@(cnt2)
   call dclock@(t2)
   ! Report elapsed time and the number of underflows per sweep
   write(*,10)n,t2-t1,(cnt2-cnt1)/NREP
10 format('Vectors of size ',I4,2x,' time = ',f6.3,' ufls = ',i4)
end program
The data file for the test code is an unformatted Fortran data file that contains the length n = 1679, the scalar a, and two double precision vectors x and y of length n. The test program performs the operation z = y - a*x fifty thousand times. You can download the file (27 KB) from the public link https://dl.dropboxusercontent.com/u/88464747/vecsub.bin .
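If the download is not available, a stand-in file with the same record layout can be generated. The following is only a sketch: the program name mkvecsub and every value in it are hypothetical and are not the original data. It sets y to zero and x to a tiny value at every 19th element, so that y - a*x is a subnormal number there (88 elements out of 1679). The underflow count and the timings you get with it may well differ from the ones reported here.

```fortran
program mkvecsub
   ! Writes a stand-in vecsub.bin with the layout read by xundfl:
   ! one unformatted record holding n, a, (x(i), y(i), i=1,n).
   ! ASSUMPTION: all values are invented for illustration only.
   implicit none
   integer, parameter :: n = 1679
   double precision :: x(n), y(n), a
   integer :: i

   a = 1.0d-160                      ! hypothetical multiplier

   ! Ordinary elements: y - a*x is a normal number, no underflow
   do i = 1, n
      x(i) = 1.0d0 + dble(i)/dble(n)
      y(i) = 1.0d0
   end do

   ! Every 19th element: a*x is about 1d-315, below the smallest
   ! normal double (about 2.2d-308), and y = 0 keeps the result tiny
   do i = 19, n, 19
      x(i) = 1.0d-155
      y(i) = 0.0d0
   end do

   open(15, file='vecsub.bin', form='unformatted', status='replace')
   write(15) n, a, (x(i), y(i), i=1,n)
   close(15)
end program mkvecsub
```

Build and run this generator once, with the same compiler and in the same directory, before going on to the test program. Note that it overwrites any existing vecsub.bin.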
Compile with /opt /p6 and link. Run the program and record the output. Set the environment variable SALFENVAR=MASK_UNDERFLOW, and run again. You will see something similar to the following:
s:\lang\JCampBell\SAL>set SALFENVAR=
s:\lang\JCampBell\SAL>xundfl
Vectors of size 1679 time = 33.860 ufls = 88
s:\lang\JCampBell\SAL>set SALFENVAR=MASK_UNDERFLOW
s:\lang\JCampBell\SAL>xundfl
Vectors of size 1679 time = 0.688 ufls = 0
From these results, the extra 33.2 seconds of the first run is spread over 88 x 50,000 = 4.4 million underflows, or about 7.5 microseconds each. That puts the time spent per execution of the FPU underflow interrupt handler at the equivalent of about 12,000 CPU cycles. This agrees with a similar estimate made in John Campbell's thread linked above, where the program can take hours to run with the default underflow processing versus about 20 seconds with SSE2 code.
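The arithmetic behind that estimate can be checked with a few lines of standard Fortran. This is a sketch using the two timings and the underflow count from the runs shown above. The clock rate of the test machine is not stated in this post, so the 1.6 GHz used here is an assumption, chosen because it is the rate at which the measured time per underflow corresponds to roughly 12,000 cycles.

```fortran
program uflcost
   ! Estimates the cost of one trapped underflow from the two runs above.
   implicit none
   double precision, parameter :: t_default = 33.860d0  ! s, underflows trapped
   double precision, parameter :: t_masked  = 0.688d0   ! s, MASK_UNDERFLOW set
   integer, parameter :: nrep = 50000                   ! sweeps over the vectors
   integer, parameter :: ufl_per_sweep = 88             ! reported "ufls" value
   ! ASSUMPTION: clock rate is not given in the post; 1.6 GHz is a guess
   double precision, parameter :: clock_hz = 1.6d9
   double precision :: total_ufl, sec_per_ufl

   total_ufl   = dble(nrep)*dble(ufl_per_sweep)          ! 4.4 million underflows
   sec_per_ufl = (t_default - t_masked)/total_ufl

   write(*,'(a,f6.2,a)') ' time per underflow   = ', sec_per_ufl*1.0d6, ' microseconds'
   write(*,'(a,f8.0)')   ' cycles per underflow = ', sec_per_ufl*clock_hz
end program uflcost
```

With these inputs the program reports about 7.5 microseconds per underflow. The cycle count scales directly with the assumed clock rate, so substitute the actual rate of your own machine.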