forums.silverfrost.com Forum Index forums.silverfrost.com
Welcome to the Silverfrost forums
 

Fortran 2003/2008
DanRRight



Joined: 10 Mar 2008
Posts: 1879
Location: South Pole, Antarctica

Posted: Tue Oct 29, 2013 12:57 pm

Yes, date and time are correct
JohnCampbell



Joined: 16 Feb 2006
Posts: 2004
Location: Sydney

Posted: Tue Nov 05, 2013 1:36 pm

Dan,

I have created a timing module (library) containing wrappers for most of the timing routines I have used with FTN95, and have copied it to Dropbox. It includes estimates of each timer's calling overhead and of its precision.
For each timer I provide three interface functions:
integer*8 function timer_tick ()      ! returns the number of ticks
integer*8 function timer_frequency () ! returns ticks per second
real*8    function timer_sec ()       ! returns the time in seconds
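
As a rough illustration of this three-function interface (not the code in the Dropbox file), here is a minimal sketch built on the standard SYSTEM_CLOCK intrinsic. The module name timer_sketch and the use of SYSTEM_CLOCK are assumptions made for the example; the library itself wraps FTN95-specific timers such as CPU_CLOCK@.

```fortran
module timer_sketch
   ! Hypothetical wrapper module: SYSTEM_CLOCK stands in for the
   ! FTN95-specific timers discussed in the post.
   implicit none
   private
   integer, parameter :: i8 = selected_int_kind (18)        ! 8-byte integer
   integer, parameter :: r8 = selected_real_kind (15, 307)  ! 8-byte real
   public :: timer_tick, timer_frequency, timer_sec
contains

   function timer_tick () result (ticks)
      ! Current tick count of the underlying clock
      integer(i8) :: ticks
      call system_clock (count=ticks)
   end function timer_tick

   function timer_frequency () result (rate)
      ! Ticks per second of the underlying clock
      integer(i8) :: rate
      call system_clock (count_rate=rate)
   end function timer_frequency

   function timer_sec () result (sec)
      ! Clock reading in seconds = ticks / ticks per second
      ! (assumes the processor has a clock, i.e. the rate is not zero)
      real(r8) :: sec
      sec = real (timer_tick (), r8) / real (timer_frequency (), r8)
   end function timer_sec

end module timer_sketch
```

An elapsed time is then the difference of two timer_sec () readings, or the difference of two timer_tick () values divided by timer_frequency ().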

For elapsed time, CPU_CLOCK@ and QueryPerformanceCounter perform well.
Most other timers perform very poorly, because their value is updated only 64 times per second. This includes all estimates of processor time. It is important to understand that every timer has an update frequency, which is distinct from its tick rate, and timers with an update frequency of 64 cycles per second are not worth using. This includes the intrinsics CPU_TIME and DATE_AND_TIME, and Salford's DCLOCK@.
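
One way to see the difference between tick rate and update frequency is to spin on a timer and record how far its value jumps each time it changes. The sketch below does this for the standard SYSTEM_CLOCK intrinsic; the module and routine names are hypothetical, and this is not the measurement code from the library. A timer that is refreshed 64 times per second would report a smallest step of about 1/64 = 0.0156 seconds, however high its nominal tick rate.

```fortran
module timer_granularity
   ! Hypothetical helper: estimates the smallest observable step of
   ! SYSTEM_CLOCK. Assumes the processor has a working clock; without
   ! one the loop below would never terminate.
   implicit none
   integer, parameter :: i8 = selected_int_kind (18)
   integer, parameter :: r8 = selected_real_kind (15, 307)
contains

   subroutine measure_updates (n_updates, min_step_sec)
      ! Spin until the clock value has changed n_updates times and
      ! return the smallest positive step seen, in seconds.
      integer,  intent(in)  :: n_updates
      real(r8), intent(out) :: min_step_sec
      integer(i8) :: rate, prev, now, step, min_step
      integer     :: seen

      call system_clock (count_rate=rate)
      call system_clock (count=prev)
      min_step = huge (min_step)
      seen = 0
      do while (seen < n_updates)
         call system_clock (count=now)
         if (now /= prev) then              ! the timer value was refreshed
            step = now - prev
            if (step > 0 .and. step < min_step) min_step = step
            prev = now
            seen = seen + 1
         end if
      end do
      min_step_sec = real (min_step, r8) / real (rate, r8)
   end subroutine measure_updates

end module timer_granularity
```

Calling measure_updates (100, step) and printing step gives an estimate of the timer's effective resolution; the same loop could be pointed at any other timer by replacing the SYSTEM_CLOCK calls.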

My preference is to use CPU_CLOCK@ and ignore the warning message, as I have never experienced the problem reported.
I have provided a routine to calculate the clock rate for CPU_CLOCK@, which runs fairly quickly (< 0.001 seconds).

John


https://www.dropbox.com/s/yvsck1xysec4crm/timing_routines.f90
DanRRight



Joined: 10 Mar 2008
Posts: 1879
Location: South Pole, Antarctica

Posted: Wed Nov 06, 2013 10:54 pm

It would be good to check these timers with my parallel library, which might interfere with them, but I'm away from my usual machine right now. Want to try it yourself?
JohnCampbell



Joined: 16 Feb 2006
Posts: 2004
Location: Sydney

Posted: Mon Nov 11, 2013 12:02 am

Dan,

Email me the links to the library etc. and I'll give it a test.
I don't think I have the libraries you are referring to.

John
mecej4



Joined: 31 Oct 2006
Posts: 1031

Posted: Mon Feb 16, 2015 4:27 am

DavidB,

Thanks for contributing the fast_asm_ddotprod code. I had occasion to look at it as part of John Campbell's code, where he experienced huge slowdowns caused by similar routines compiled from Fortran code.

Here is a minor tweak to take care of the part where you wrote:
Code:
  ; Can't get final reduction to work, so will do this in Fortran for now
  80  movupd v, xmm0%            ; move xmm0 to v array
   edoc
   
   ! Final reduction, the result is the sum of the two values in v
   fast_asm_ddotprod = sum(v)

end function fast_asm_ddotprod

The change, which will not affect the speed much but makes your code "clean SSE", is as follows:
Code:

  80    movaps   xmm1,xmm0
        unpckhpd xmm1,xmm0
        addsd    xmm0,xmm1
        movsd    v,xmm0
   edoc
   
   fast_asm_ddotprod = v

end function fast_asm_ddotprod

The declaration of the local variable v should be changed to REAL*8 v.
Unfortunately, FTN95 does not know the instruction unpckhpd, so instead of using the inline assembler one may (i) produce an obj file from your original code, (ii) run dumpbin /disasm on it and redirect the output to an asm file, (iii) edit that file to make the changes indicated above, and (iv) assemble the asm file.
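
For readers who do not read SSE assembler, the plain-Fortran sketch below mimics what a two-lane vectorised dot product and its final reduction compute: two partial sums, playing the role of the low and high halves of xmm0, which are added together at the end. The function name is hypothetical, and the handling of an odd trailing element is an assumption, since that part of the original routine is not shown here.

```fortran
module lane_sketch
   ! Hypothetical scalar model of a 2-lane SSE dot product;
   ! not DavidB's routine.
   implicit none
   integer, parameter :: r8 = selected_real_kind (15, 307)
contains

   function two_lane_ddotprod (a, b, n) result (dot)
      integer,  intent(in) :: n
      real(r8), intent(in) :: a(n), b(n)
      real(r8) :: dot
      real(r8) :: lane(2)      ! models the two halves of xmm0
      integer  :: i

      lane = 0.0_r8
      do i = 1, n - 1, 2       ! each pass models one packed multiply-add
         lane(1) = lane(1) + a(i)   * b(i)
         lane(2) = lane(2) + a(i+1) * b(i+1)
      end do

      ! Assumed remainder handling: an odd last element goes to lane 1
      if (mod (n, 2) == 1) lane(1) = lane(1) + a(n) * b(n)

      ! Final reduction: low half + high half, the step that the
      ! unpckhpd/addsd pair (or sum(v) in Fortran) performs
      dot = lane(1) + lane(2)
   end function two_lane_ddotprod

end module lane_sketch
```

Because the two lanes are summed separately, the result can differ from a strictly sequential sum in the last few bits, which is worth remembering when comparing against the Fortran intrinsic.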

Perhaps you felt all this was not worth the trouble, and that is what you meant by "cannot get ... to work".


Last edited by mecej4 on Mon Feb 16, 2015 2:07 pm; edited 1 time in total
davidb



Joined: 17 Jul 2009
Posts: 553
Location: UK

Posted: Mon Feb 16, 2015 9:12 am

When I found that the instruction I needed (unpckhpd) wasn't supported by the FTN95 inline assembler, I just fell back on Fortran, as I felt it was easier. (Paul did look into adding support for unpckhpd and others, but found it was not trivial, so it wasn't pursued.)

Your solution is better but requires more work to get a working subroutine. It probably does not make much difference in terms of efficiency.
_________________
Programmer in: Fortran 77/95/2003/2008, C, C++ (& OpenMP), java, Python, Perl
DanRRight



Joined: 10 Mar 2008
Posts: 1879
Location: South Pole, Antarctica

Posted: Mon Apr 09, 2018 10:25 am

How about 64-bit versions of Vec_Add_SSE and Vec_Sum_SSE, which were previously written in assembler for 32 bits? Has anyone tried FTN95's own routines or written their own 64-bit version?
JohnCampbell



Joined: 16 Feb 2006
Posts: 2004
Location: Sydney

Posted: Mon Apr 09, 2018 1:21 pm

Dan,

The 64-bit routines work well and provide vector instruction speed-up.
For real*8, I have used:
Code:
 real*8 function Vec_Sum_SSE ( a, b, n )
!
!   Performs the vector operation  Vec_Sum_SSE = [a] . [b]

    integer*4 n
    real*8    a(n), b(n), s
!
    integer*8 n8
    real*8    DOT_PRODUCT8@
    external  DOT_PRODUCT8@
!
    if (n > 1) then
       n8 = n
       s = DOT_PRODUCT8@ (a, b, n8 )
    else if (n==1) then
       s = a(1) * b(1)
    else
       s = 0
    end if
!
   vec_sum_SSE = s

 end function vec_sum_SSE

 subroutine Vec_add_SSE ( Y, X, a, n )
!
!   Performs the vector operation  [Y] = [Y] + a * [X]
!
   integer*4 n
   real*8    Y(n), X(n), a
   integer*8 n8
!
     if ( n > 1 ) then
!       Y = Y + a * X
       n8 = n
       call AXPY8@ (y, x, n8, a)

     else if ( n == 1 ) then
       Y(1) = Y(1) + a * X(1)

     end if

 end subroutine Vec_add_SSE


Note that the length argument passed to these routines must be integer*8 (n8 above).
Check the 64-bit documentation for other routines.
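
When switching to the library routines, it can help to check them against plain intrinsic versions. The sketch below gives two reference routines with the same argument lists as Vec_Sum_SSE and Vec_add_SSE; the module and routine names are hypothetical, and only standard Fortran is used, so it makes no assumptions about DOT_PRODUCT8@ or AXPY8@ beyond what the code above shows.

```fortran
module vec_reference
   ! Hypothetical reference versions built on standard intrinsics,
   ! intended only for checking results.
   implicit none
   integer, parameter :: r8 = selected_real_kind (15, 307)
contains

   function vec_sum_ref (a, b, n) result (s)
      ! Reference for Vec_Sum_SSE:  s = [a] . [b]
      integer,  intent(in) :: n
      real(r8), intent(in) :: a(n), b(n)
      real(r8) :: s
      s = dot_product (a(1:n), b(1:n))    ! zero-length arrays give 0
   end function vec_sum_ref

   subroutine vec_add_ref (y, x, a, n)
      ! Reference for Vec_add_SSE:  [Y] = [Y] + a * [X]
      integer,  intent(in)    :: n
      real(r8), intent(inout) :: y(n)
      real(r8), intent(in)    :: x(n), a
      y(1:n) = y(1:n) + a * x(1:n)
   end subroutine vec_add_ref

end module vec_reference
```

Since a vectorised routine may add the terms in a different order, compare results with a small relative tolerance rather than testing for exact equality.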
DanRRight



Joined: 10 Mar 2008
Posts: 1879
Location: South Pole, Antarctica

Posted: Tue Apr 10, 2018 6:02 am

Cool, thanks John,
I owe you!
And great job, Silverfrost.
All times are GMT + 1 Hour