Quoted from PaulLaidler
I think that you misunderstand the issue
How could I misunderstand it if I fixed it in my example 2 a few messages up? 😃
Quoted from PaulLaidler
As far as the compiler is concerned, the code is treated in the same way and any difference is in the associated assembler instructions and the way in which these are implemented by the central processor in use.
Obviously this cannot be the same. Simply comparing the speed of integer and FP operations in the old 32-bit and the new 64-bit compiler shows that the new one is much faster (up to about 6x in the test below; compile with /opt):
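! Integer test: 99 passes of 10,000,000 additions (about 1e9 operations)
CALL CPU_TIME(tStart)
k=1
1 j=1
do i=1,10000000
   j=j+1
enddo
k=k+1
if(k.lt.100) goto 1
CALL CPU_TIME(tFinish)
RunTime=tFinish-tStart
OpPerSInt = 1e9/RunTime   ! 1e9 is a round figure; the exact count is 9.9e8
Print*, RunTime, OpPerSInt
! Floating-point test: same loop structure with a default REAL (real*4)
k=1
CALL CPU_TIME(tStart)
2 a=1.
do i=1,10000000
   a=a+1.
enddo
k=k+1
if(k.lt.100) goto 2
CALL CPU_TIME(tFinish)
RunTime=tFinish-tStart
OpPerSfp = 1e9/RunTime    ! same round figure as in the integer test
Print*, RunTime, OpPerSfp
end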
32-bit   Time (s)    Op/second
INT      1.64063     6.095238E+08
FP       2.20313     4.539007E+08
64-bit
INT      0.281250    3.555556E+09   (about 6x speedup)
FP       0.890625    1.122807E+09   (about 2.5x speedup)
Quoted from mecej4
It only adds to the confusion when the terms '32-bit' and '64-bit' are used in vain. Those are address sizes, and have very little to do with FPU registers, X87 or SSE/XMM.
One more confusion is added here: all his life the user expected a 32-bit integer plus a 32-bit FP to run to 2B before crashing, not to 30M. These integers were used as array indices, so the change in accuracy directly affects the address space. The compiler manufacturers, together with the processor designers, chose speed: they switched to faster FP units with a smaller mantissa, and to SSE for integer operations. Did they warn in the compilation LOG file that INT4 + FP4 could now be misleading?
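For anyone who wants to see the effect in isolation, here is a minimal sketch. It is not the code from my earlier example; the program name and the count n are made up for illustration. It assumes that the default REAL is IEEE single precision and that a is rounded to single precision on every pass, as SSE code does. If the compiler keeps a in a wider x87 register, the counter may run further. For an increment of exactly 1.0 the counter should stop growing at 2**24 (about 16.7M); other increments move that limit.

```fortran
program real4_counter_stall          ! hypothetical name, for illustration only
  implicit none
  integer, parameter :: n = 20000000 ! assumed count, chosen to exceed 2**24
  integer :: i
  real    :: a                       ! default REAL, assumed IEEE single (real*4)

  a = 0.0
  do i = 1, n
     a = a + 1.0                     ! exact only while a < 2**24
  end do

  print *, 'additions performed   :', n
  print *, 'value held by real*4 a:', a
  print *, 'mantissa bits (digits):', digits(a)
  print *, 'stall point 2**digits :', 2.0**digits(a)
end program real4_counter_stall
```

The integer knows that 20 million additions were made, while the real*4 counter is silently stuck at a smaller value. No warning, no crash.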
OK, this way is faster, but then the compiler must warn that when real*4 and integer*4 are used together the song may end much sooner than expected, and suggest switching at least to real*8, because on the new FP processors this carries no performance penalty (not sure about SSE), while 64-bit vs 32-bit resolves the memory space penalty. Or it must implement a runtime crash of the integer at 33M. Hell, otherwise you will never find the hidden bugs in large codes, this one specifically.
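A small sketch of why real*8 is the safe partner for integer*4. The program and variable names are hypothetical, and it assumes a 32-bit default INTEGER and IEEE single and double precision reals. It compares the number of significant bits of each type and checks whether the largest integer*4 value survives a trip through real*4 and real*8.

```fortran
program real8_holds_int4             ! hypothetical name, for illustration only
  implicit none
  integer, parameter :: dp = kind(1.0d0) ! double precision kind (real*8)
  integer  :: imax
  real     :: r4
  real(dp) :: r8

  imax = huge(1)                     ! largest default integer
  r4   = real(imax)                  ! rounded to the nearest real*4 value
  r8   = real(imax, dp)              ! exact if real*8 has enough mantissa bits

  print *, 'significant bits: real*4, real*8, integer*4'
  print *, digits(r4), digits(r8), digits(imax)
  ! Compare in double precision so that nothing overflows in the test itself
  print *, 'real*4 holds huge(1) exactly:', real(r4, dp) == real(imax, dp)
  print *, 'real*8 holds huge(1) exactly:', int(r8) == imax
end program real8_holds_int4
```

Under those assumptions this should report 24, 53 and 31 bits, and only the real*8 line should come out true. So every integer*4 value fits in a real*8, while real*4 runs out of bits long before the integer does.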