forums.silverfrost.com :: View topic - EOSHIFT is very slow?
EOSHIFT is very slow?

 
KennyT



Joined: 02 Aug 2005
Posts: 318

Posted: Mon Jul 23, 2012 3:35 pm    Post subject: EOSHIFT is very slow?

While benchmarking some code (comparing some old F77 code with new improved F95 code) we noticed a severe speed impact in the new code. The test program below illustrates the issue:

Code:

!ftn95$free
PROGRAM NDLIS

  REAL, ALLOCATABLE   :: TR(:), XX(:)
  REAL*8  :: HIGH_RES_CLOCK@


!   Set up arrays
    write(*,*)  "Setting up arrays"
   ALLOCATE (TR(500), XX(500), stat=IST)
   DO I = 1, size(TR)
    TR(I)   =  FLOAT(I)/100
   END DO

    WRITE (*,*)
    WRITE (*,*) 'EOSHIFT'
    NE   =  size(TR)
    NT1      =  4
!    Do array data shift using F95 intrinsic function   
    E1   =  HIGH_RES_CLOCK@ (.false.)
    DO K = 1, 100000
     XX   =  EOSHIFT(TR(1:NE), NT1)
    END DO
    E2   =  HIGH_RES_CLOCK@ (.false.)
!    Do array data shift using "F77-style" code   
    E3   =  HIGH_RES_CLOCK@ (.false.)
    DO K = 1, 100000
     DO I = 1, NE-NT1
      XX(I)   =  TR(I+NT1)
     END DO
    END DO
    E4   =  HIGH_RES_CLOCK@ (.false.)
     T1   =  E2-E1
     T2   =  E4-E3
     TT   =  (T1 + T2) / 100.
    WRITE (*,*)  ' F95 ', NINT(T1/TT)   ! typical    90%  (2.13s)
    WRITE (*,*)  ' F77 ', NINT(T2/TT)   !      10%  (0.25s)

END

So, it appears that EOSHIFT is 8-9x slower than the equivalent "F77" looping version!!!

We're busy recoding our code the "old" way, but thought you might be interested in examining what EOSHIFT is doing!
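
For reference, here is a minimal sketch in standard free-form Fortran of what EOSHIFT computes in this test. It is not the code from our application; the program name and the `via_*` array names are made up for illustration. With a positive shift, EOSHIFT moves elements towards the lower index and zero-fills the vacated tail, whereas the F77-style loop in the benchmark leaves the last NT1 elements of XX untouched. The sketch adds that tail fill and checks that the intrinsic, an array-section assignment and the loop all give the same array:

```fortran
program eoshift_equivalence
  implicit none
  integer, parameter :: ne = 500, nt1 = 4
  real    :: tr(ne), via_intrinsic(ne), via_section(ne), via_loop(ne)
  integer :: i

  ! Same test data as the benchmark above
  do i = 1, ne
    tr(i) = real(i) / 100
  end do

  ! 1) Fortran 90/95 intrinsic: shift left by nt1, zero-fill the vacated tail
  via_intrinsic = eoshift(tr, nt1)

  ! 2) Array-section assignment with an explicit zero fill
  via_section(1:ne-nt1)    = tr(1+nt1:ne)
  via_section(ne-nt1+1:ne) = 0.0

  ! 3) F77-style loop, plus the tail fill that the benchmark loop leaves out
  do i = 1, ne - nt1
    via_loop(i) = tr(i+nt1)
  end do
  do i = ne - nt1 + 1, ne
    via_loop(i) = 0.0
  end do

  ! The values are copied, not computed, so exact comparison is appropriate
  if (all(via_intrinsic == via_section) .and. all(via_intrinsic == via_loop)) then
    write (*,*) 'all three forms agree'
  else
    write (*,*) 'MISMATCH'
  end if
end program eoshift_equivalence
```

The array-section form (2) may be worth timing as well, since it keeps the F90 notation without calling the intrinsic. Whether it is any faster than EOSHIFT under FTN95 has not been measured here.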

K
JohnCampbell



Joined: 16 Feb 2006
Posts: 2615
Location: Sydney

Posted: Tue Jul 24, 2012 2:26 am    Post subject:

Kenny,
It probably will not influence the result, but high_res_clock@ can sometimes produce problems. An alternative is to use QueryPerformanceCounter, which is fast and accurate as a calibrated clock.
Code:
      SUBROUTINE ELAPSE_SECOND (ELAPSE)
!
!     Returns the total elapsed time in seconds
!     based on QueryPerformanceCounter
!     This is the fastest and most accurate timing routine
!
      real*8,   intent (out) :: elapse
!
      STDCALL   QUERYPERFORMANCECOUNTER 'QueryPerformanceCounter' (REF):LOGICAL*4
      STDCALL   QUERYPERFORMANCEFREQUENCY 'QueryPerformanceFrequency' (REF):LOGICAL*4
!
      real*8    :: freq  = 1
      logical*4 :: first = .true.
      integer*8 :: start = 0
      integer*8 :: num
      logical*4 :: ll
!      integer*4 :: lute
!
!   Calibrate this time using QueryPerformanceFrequency
      if (first) then
         num   = 0
         ll    = QueryPerformanceFrequency (num)
         freq  = 1.0d0 / dble (num)
         start = 0
         ll    = QueryPerformanceCounter (start)
         first = .false.
!         call get_echo_unit (lute)
!         WRITE (lute,*) 'Elapsed time counter :',num,' ticks per second'
      end if
!
      num    = 0
      ll     = QueryPerformanceCounter (num)
      elapse = dble (num-start) * freq
      return
      end
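
For anyone not on Windows or FTN95, the same measurement can be sketched with the standard SYSTEM_CLOCK intrinsic instead of the Win32 calls. This is only a sketch under assumptions: it relies on the compiler accepting 64-bit integer arguments to SYSTEM_CLOCK (Fortran 2003 behaviour, not checked against FTN95 in this thread), and the program and variable names are invented for illustration. The array size, shift and repeat count are taken from Kenny's test program.

```fortran
program time_shift
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)   ! 64-bit integer kind
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: ne = 500, nt1 = 4, nrep = 100000
  real        :: tr(ne), xx(ne)
  integer     :: i, k
  integer(i8) :: c0, c1, c2, rate
  real(dp)    :: t_intrinsic, t_loop

  do i = 1, ne
    tr(i) = real(i) / 100
  end do

  call system_clock(count_rate=rate)     ! ticks per second, 0 if no clock
  if (rate <= 0) stop 'no system clock available'

  ! Time the intrinsic
  call system_clock(count=c0)
  do k = 1, nrep
    xx = eoshift(tr, nt1)
  end do
  call system_clock(count=c1)

  ! Time the F77-style loop (as in the benchmark, the tail is not refilled)
  do k = 1, nrep
    do i = 1, ne - nt1
      xx(i) = tr(i+nt1)
    end do
  end do
  call system_clock(count=c2)

  t_intrinsic = real(c1 - c0, dp) / real(rate, dp)
  t_loop      = real(c2 - c1, dp) / real(rate, dp)
  write (*,*) 'EOSHIFT seconds: ', t_intrinsic
  write (*,*) 'loop seconds:    ', t_loop
  ! Printing a value derived from xx discourages the optimiser from
  ! discarding the timed loops; it does not guarantee it.
  write (*,*) 'checksum: ', sum(xx)
end program time_shift
```

The resolution of SYSTEM_CLOCK is compiler dependent, so for short runs the repeat count may need to be raised before the two times can be compared meaningfully.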
 
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2402
Location: Yateley, Hants, UK

Posted: Tue Jul 24, 2012 8:27 pm    Post subject:

Long experience tells me that compilers have what Les Hatton would have called "Safe Subsets" (he retired at the same time as me, so his web presence may not be as good as previously if you google him). He meant a subset of code elements that the majority of users could be relied upon to use correctly, most of the time. Taking it further, one can see that a compiler must have a core of facilities that users of the safe subset test and test again. These are likely to work well, and relative to comparatively newer facilities, are likely to be more optimised. In FTN95, therefore, one ought to see the FTN77 core perform rather better (reliably? faster? bug-free?) than the Fortran 90 and 95 parts, particularly anything a bit esoteric. You will see fewer bug fixes to Fortran 77 (nearly none) than to the 90/95 facilities in the published bug fix list. Indeed, apart from fixes to things that Microsoft broke with newer Windows and Visual Studio, and a few upgraded facilities, that's about all there is.

I'm not saying that this is how it should be, only how it most likely is ...

If EOSHIFT is slow and you are the first to discover it, then it may be because no-one used it in earnest, it was fast enough for them, they never compared it to anything else and so don't know that it is slow, or they know it is slow but prefer to use a later Fortran facility rather than coding it by hand in the old-fashioned way (or something else!)

If the slowness of EOSHIFT is a show-stopper for you, then you could always try dumping the assembly language code from your Fortran 77 style version, and then use CODE ... EDOC to write a faster version using more modern CPU instructions, as was done with dot products in another thread (Fortran 2003/2008). I know several posters who would be interested in the outcome - for names, consult the thread I mentioned.

As for me, I'm an old dog, and this is a new trick that I never knew existed, much less ever imagined that I needed! (On reflection I might, but not any time soon .... MRU file lists being a case in point).

Best of luck fixing it.

Eddie
KennyT



Joined: 02 Aug 2005
Posts: 318

Posted: Fri Jul 27, 2012 9:21 am    Post subject:

Les Hatton - a star! I remember a trial he did of the various seismic processing companies' results. From memory, I think he defined exactly what parameters he wanted applied to a test dataset except one, which he left to the operators' best judgement. When he compared the results he got back, the range of results was staggering - one set being 180 degrees out of phase with the others! I think he also wrote a witty article where he announced the best coloured pencils for doing seismic interpretation!


K
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 2402
Location: Yateley, Hants, UK

Posted: Fri Jul 27, 2012 5:10 pm    Post subject:

Yes, I'm a big fan. But ALL of his articles are witty.

My safe subset of Fortran 9x is Fortran 77 - and then not all of it.

E
All times are GMT + 1 Hour