forums.silverfrost.com Forum Index forums.silverfrost.com
Welcome to the Silverfrost forums
 
New Topic "NET"
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Sat May 04, 2013 5:58 am

Jalih,
As the lines I added show, the code does not wait for the threads to complete properly. What's wrong?

Code:

module test
  INCLUDE <windows.ins>
  STDCALL attach_thread 'attach_thread' (REF, REF, REF):integer*4
  STDCALL wait_object 'wait_object' (VAL):integer*4
  STDCALL close_handle 'close_handle' (VAL):integer*4
  STDCALL create_mutex 'create_mutex' (VAL):integer*4
  STDCALL release_mutex 'release_mutex' (VAL):integer*4
 
  integer :: hMutex
  integer :: t(8)
  integer :: values(8) = (/1,2,3,4,5,6,7,8/)
  integer :: nEmployedThreads

  contains
    subroutine thread(p)
!      implicit none
      integer :: p, i

      i = wait_object(hMutex)
      write(*,*) 'Hello from thread ', p ! , nEmployedThreads
      i=release_mutex(hMutex)

      d =2.22
!      nEmployedThreads = 8
      do i=1,200000000/nEmployedThreads
       d=alog(exp(d))
      enddo

      call ExitThread(0)

    end subroutine thread

end module test


WINAPP
  use test
  implicit none
  integer :: i, x,  nEmployedThreads

  hMutex = create_mutex(1)
  write(*,*) 'Multithreading test'
  i = release_mutex(hMutex)

  nEmployedThreads = 8
  call clock@ (time_start)

  do i=1,8,1
    x = attach_thread(thread,values(i),t(i))
  end do

  do i=1,8,1
    x = wait_object(t(i))
  end do

  call clock@ (time_finish)

   time = time_finish-time_start
   time2= time * nEmployedThreads
   print*, 'Elapsed time, total CPU time=', time, time2

END
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

Posted: Sat May 04, 2013 9:24 am

Dan,

I am not sure what .NET does, but could you achieve the same in a Win32 environment?
Could you wrap the API calls in Fortran 95 wrappers and achieve what you want?
Lots of ClearWin+ does this.
You could create a library of FTN95-callable thread routines.

One of the problems with multi-thread testing is producing a non-trivial example, as most thread examples do next to nothing significant.
I have had problems converting trivial OpenMP parallel examples into meaningful solutions.

John
jalih



Joined: 30 Jul 2012
Posts: 196

Posted: Sat May 04, 2013 6:47 pm

DanRRight wrote:
Jalih,
As the lines I added show, the code does not wait for the threads to complete properly. What's wrong?

Sorry for the late reply, Dan...

Actually, it was my mistake: the return value of attach_thread() is the thread handle, and wait_object() should be called with that thread handle as its parameter.

Updated example available here

I spotted another problem: write(*,*) in a WINAPP program doesn't seem to be thread-safe and should only be called from the main thread.
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Sat May 04, 2013 10:40 pm

Jalih,
Well, there is definitely something even more wrong here, maybe on my side, I do not know... This one does not seem to work at all. Please add these lines to the launched thread and check in Task Manager: the threads do not even start.
Code:

      d =2.22
      nEmployedThreads = 16
      do i=1,200000000/nEmployedThreads
       d=alog(exp(d))
      enddo


John,
It's a WinAPI scenario: what jalih is doing above is trying to create a functional testbed with thread-safe locks and the ability to write(*,*) from threads, as is done in .NET (without that, all the effort would be useless, because we would lose the last chance to do at least some debugging in threads).
The .NET way is dead for me and you, because the only way I can make it work is by writing a small program completely in .NET, passing the needed data not via common blocks or modules but via file exchange (or something similar, maybe the slightly better newer way of transferring data via RAM allocation that Paul promised), and launching the .NET part from the x86 program using CISSUE or similar. It does not compile my program and has no /3GB. Try this testbed, John, and check if it works in your case; compilation is simple:
Code:
FTN95 main.f95
slink main.obj t.dll
jalih



Joined: 30 Jul 2012
Posts: 196

Posted: Sun May 05, 2013 11:10 am

DanRRight wrote:
Jalih,
Well, there is definitely something even more wrong here, maybe on my side, I do not know... This one does not seem to work at all...


Try this multithreaded matrix multiply test.

Remember, write(*,*) into a ClearWin window from a worker thread in a WINAPP application seems to lock things up, so it can't be used. Maybe Paul can help with this issue?
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Sun May 05, 2013 12:03 pm

Thanks jalih, this works. Pity the locking of threads does not work though. Also, one more thing puzzled me: when I parallelized the same code as in the .NET example in the other thread, substituting your matrix multiplication DO loop with the DO loop from here

http://forums.silverfrost.com/viewtopic.php?t=2534

the speedup is only 3+ times versus 7+ times in the .NET case. Your matrix case was also around 3.9 times. Any clues why? Here is the text for convenience and simplicity:
Code:

module test
  INCLUDE <windows.ins>
  STDCALL attach_thread 'attach_thread' (REF, VAL):integer*4
  STDCALL wait_object 'wait_object' (VAL):integer*4
  STDCALL close_handle 'close_handle' (VAL):integer*4
  STDCALL create_mutex 'create_mutex' (VAL):integer*4
  STDCALL release_mutex 'release_mutex' (VAL):integer*4

  integer, parameter :: threads = 8
  real d

  contains
    subroutine thread(ptr)
      integer :: ptr, i, j, x

      d =2.22
      nEmployedThreads = 8
      do i=1,200000000/nEmployedThreads
       d=alog(exp(d))
      enddo
     
      call ExitThread(0)
    end subroutine thread

end module test


WINAPP
  use test
  integer :: i, j, x
  integer :: thandle(threads)
  integer :: nEmployedThreads

  write(*,*) 'Single threaded :'
  call clock(start)
      d =2.22
      do i=1,200000000
       d=alog(exp(d))
      enddo
  call clock(finish)
  write(*,*) 'Total time in seconds:', finish-start

! Calculate work unit size for threads and assign starting positions for each thread

  write(*,*) 'Multi threaded  with 8 threads:'
  call clock(start)

! Start threads
  do i=1,threads,1
    thandle(i) = attach_thread(thread,loc(i))
  end do

! Wait for threads to finish
  do i=1,threads,1
    x = wait_object(thandle(i))
  end do

  call clock(finish)
  write(*,*) 'Total time in seconds:', finish-start
  write(*,*) 'All done.'

END


Last edited by DanRRight on Sun May 05, 2013 1:06 pm; edited 2 times in total
jalih



Joined: 30 Jul 2012
Posts: 196

Posted: Sun May 05, 2013 12:43 pm

DanRRight wrote:
Thanks jalih, this works. Pity the locking of threads does not work though

Actually, locking of threads works fine. Compile and try the example below as a console application.

Only writing into a ClearWin window doesn't work currently, probably because handling of window messages is blocked while waiting for threads. I will write you a non-blocking version and post an update soon.

Code:

module test
  INCLUDE <windows.ins>
  STDCALL attach_thread 'attach_thread' (REF, VAL):integer*4
  STDCALL wait_object 'wait_object' (VAL):integer*4
  STDCALL close_handle 'close_handle' (VAL):integer*4
  STDCALL create_mutex 'create_mutex' (VAL):integer*4
  STDCALL release_mutex 'release_mutex' (VAL):integer*4

 
  integer :: hMutex
  integer :: values(8) = (/1,2,3,4,5,6,7,8/)

  contains
    subroutine thread(ptr)
      integer :: ptr, i

      i = wait_object(hMutex)
      write(*,*) 'Hello from thread', ptr
      i=release_mutex(hMutex)
      call ExitThread(0)
    end subroutine thread

end module test


program mt
  use test
  implicit none

  integer :: i, x
  integer :: thandle(8)
 
  hMutex = create_mutex(1)
  write(*,*) 'Multithreading test'
  x=release_mutex(hMutex)

  do i=1,8,1
    thandle(i) = attach_thread(thread,loc(values(i)))
  end do

  do i=1,8,1
    x = wait_object(thandle(i))
  end do
 
  x = close_handle(hMutex)
 
  write(*,*) 'All done. Bye!'
end program mt
jalih



Joined: 30 Jul 2012
Posts: 196

Posted: Sun May 05, 2013 1:24 pm

jalih wrote:

Only writing into a ClearWin window doesn't work currently, probably because handling of window messages is blocked while waiting for threads. I will write you a non-blocking version and post an update soon.


Added a non-blocking version of wait_object() and a new example

Have fun!
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Sun May 05, 2013 1:51 pm

Jalih, nice attempts, but I think a thread that is too short is very confusing; it seems the lock actually does not work in the last example. Or maybe it works but the threads are not launched. Please (always!) include this or a similar snippet, so that you can see the thread actually launch and work for a second or two in Task Manager:
Code:

      d =2.22
      nEmployedThreads = 8
      do i=1,200000000/nEmployedThreads
       d=alog(exp(d))
      enddo


After that there will never be any confusion about whether the code works or not. Also, what do you think about the speedup numbers in my previous post from a couple of hours ago?
jalih



Joined: 30 Jul 2012
Posts: 196

Posted: Sun May 05, 2013 4:09 pm

DanRRight wrote:
Jalih, nice attempts, but I think a thread that is too short is very confusing; it seems the lock actually does not work in the last example.

Starting threads and locking should work fine in my last non-blocking thread wait example. I updated my previous example with your suggested code and the program runs as it is supposed to. Just re-download and test it.

picture here

You can see more clearly that locking really works by adding your extra code inside the mutex-guarded region, so that the threads execute one by one.
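
As a rough illustration of that last suggestion, the sketch below moves the busy loop inside the mutex-guarded region of a console test. It assumes the t.dll wrappers with the interfaces declared earlier in this thread (create_mutex, wait_object, release_mutex, attach_thread); the program name mt_serialized and the loop counter k are made up for the example. If the locking works, each "entered" message should be followed by the "leaving" message of the same thread, and Task Manager should show roughly one busy core at a time.

```fortran
! Sketch only: needs the wrapper DLL (t.dll) discussed in this thread
module test
  INCLUDE <windows.ins>
  STDCALL attach_thread 'attach_thread' (REF, VAL):integer*4
  STDCALL wait_object 'wait_object' (VAL):integer*4
  STDCALL close_handle 'close_handle' (VAL):integer*4
  STDCALL create_mutex 'create_mutex' (VAL):integer*4
  STDCALL release_mutex 'release_mutex' (VAL):integer*4

  integer :: hMutex
  integer :: values(8) = (/1,2,3,4,5,6,7,8/)

  contains
    subroutine thread(ptr)
      integer :: ptr, i, k
      real :: d

      ! Take the mutex: only one thread at a time gets past this point
      i = wait_object(hMutex)
      write(*,*) 'Thread', ptr, 'entered the guarded region'

      ! The work happens while the mutex is held, so threads run one by one
      d = 2.22
      do k=1,200000000/8
        d = alog(exp(d))
      end do

      write(*,*) 'Thread', ptr, 'leaving, d =', d
      i = release_mutex(hMutex)

      call ExitThread(0)
    end subroutine thread

end module test

program mt_serialized
  use test
  implicit none

  integer :: i, x
  integer :: thandle(8)

  ! Same mutex set-up as in the console example posted earlier in this thread
  hMutex = create_mutex(1)
  write(*,*) 'Serialized threads test'
  x = release_mutex(hMutex)

  do i=1,8,1
    thandle(i) = attach_thread(thread,loc(values(i)))
  end do

  ! Block until every worker has finished
  do i=1,8,1
    x = wait_object(thandle(i))
  end do

  x = close_handle(hMutex)
  write(*,*) 'All done.'
end program mt_serialized
```

Moving the loop back outside the guarded region should let the threads overlap again, which gives a simple before/after comparison.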
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Sun May 05, 2013 7:59 pm

All works, Jalih, which is very good news, thanks for the efforts. The last thing left is to understand why exactly the same .NET example is almost twice as fast. I'd say that is kind of surprising, even unreasonable, on 4 CPU cores: the CPU may have 8 integer units, perfect for 8 threads, but it has only 4 FP ones, so increasing the number of threads to 8 should not give a boost to the .NET example, but it DOES. With my 4-core/8-thread processor I get the .NET version executed in 2.5+ seconds versus 5 seconds with your last modification using WinAPI. Getting a speedup of around 3 times is kind of smallish; it would be good to see it close to 4, but it is not unreasonable. I will do more testing, and of course I invite others to do that too.
Here is Jalih's code, modified to match the .NET one exactly for the purpose of comparison.

Code:

! Compilation:
! FTN95 main.f95
! SLINK main.obj t.dll

module test
  INCLUDE <windows.ins>
  STDCALL attach_thread 'attach_thread' (REF, VAL):integer*4
  STDCALL wait_object 'wait_object' (VAL):integer*4
  STDCALL check_object 'check_object' (VAL):integer*4
  STDCALL close_handle 'close_handle' (VAL):integer*4
  STDCALL create_mutex 'create_mutex' (VAL):integer*4
  STDCALL release_mutex 'release_mutex' (VAL):integer*4

 
  integer :: hMutex
  integer :: values(8) = (/1,2,3,4,5,6,7,8/)
  integer :: nEmployedThreads

  contains
    subroutine thread(ptr)
      integer :: ptr, i
      real d

      i = wait_object(hMutex)
      write(*,*) 'Starting calculation in thread', ptr
      i=release_mutex(hMutex)
     
      d = 2.22
      do i=1,200000000/nEmployedThreads,1
        d=alog(exp(d))
      end do
     
      call ExitThread(0)
    end subroutine thread

end module test

winapp
program mt
  use test
  implicit none

  integer :: i, x
  integer :: thandle(8)
  real d, finish, start
 

  hMutex = create_mutex(1)
  write(*,*) 'Multithreading test with up to 8 threads:'
  x=release_mutex(hMutex)

1 print*,' Enter number of parallel threads <= 8'
  read(*,*)   nEmployedThreads
  if(nEmployedThreads.lt.1.or.nEmployedThreads.gt.8) nEmployedThreads=4

  call clock(start)

  do i=1,nEmployedThreads,1
    thandle(i) = attach_thread(thread,loc(values(i)))
  end do

  do i=1,nEmployedThreads,1
10 call temporary_yield@()
    x = check_object(thandle(i))
    if (x == 0) goto 10
  end do
 
  x = close_handle(hMutex)
 
  call clock(finish)
  write(*,*) 'Total time in seconds:', finish-start
  goto 1

end program mt



Questions
1) I just noticed that even when not running but waiting for user input, the code grabs CPU time. Why is that?
2) Do we need this lock in the main program (as opposed to in the threads, where we definitely need a lock)? Threads should never collide or conflict in the main program, should they?
hMutex = create_mutex(1)
write(*,*) 'Multithreading test with up to 8 threads:'
x=release_mutex(hMutex)
3) Why was BASIC used in the DLL to get the WinAPI functions? Aren't these functions available straight from FTN95?


Last edited by DanRRight on Mon May 06, 2013 7:38 am; edited 2 times in total
jalih



Joined: 30 Jul 2012
Posts: 196

Posted: Mon May 06, 2013 4:35 am

DanRRight wrote:

Questions
1) I just noticed that even when not running but waiting for user input, the code grabs CPU time. Why is that?

A program using a ClearWin window must process messages, so the main thread in this case can't just sleep while waiting for the threads to finish. If you just made a console application, then wait_object() could be used instead of polling with check_object(), and the situation would be better.
Quote:

2) Do we need this lock in the main program (as opposed to in the threads, where we definitely need a lock)? Threads should never collide or conflict in the main program, should they?
hMutex = create_mutex(1)
write(*,*) 'Multithreading test with up to 8 threads:'
x=release_mutex(hMutex)

You are right, the locking is only necessary after the threads have been attached (from that point on it is also necessary in the main program). You can change the mutex creation line to hMutex = create_mutex(0) and remove the x = release_mutex(hMutex).
Quote:

3) Why was BASIC used in the DLL to get the WinAPI functions? Aren't these functions available straight from FTN95?

It's a personal preference. FTN95 doesn't have as complete header definitions as some other compilers do, and I like MiniBASIC's syntax. It's more readable than C and offers the same functionality.
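
Putting the answers to questions 1 and 2 together, a console variant of the timing test might look roughly like the sketch below. It assumes the t.dll wrappers declared earlier in this thread, that create_mutex(0) creates the mutex without the main thread owning it, and that close_handle can also be used on thread handles (a guess, on the assumption that it wraps CloseHandle). The program name mt_console, the counter k and the fixed thread count are made up for the example, and the standard system_clock intrinsic is swapped in for clock simply because it is portable.

```fortran
! Sketch only: console build, needs the wrapper DLL (t.dll) from this thread
module test
  INCLUDE <windows.ins>
  STDCALL attach_thread 'attach_thread' (REF, VAL):integer*4
  STDCALL wait_object 'wait_object' (VAL):integer*4
  STDCALL close_handle 'close_handle' (VAL):integer*4
  STDCALL create_mutex 'create_mutex' (VAL):integer*4
  STDCALL release_mutex 'release_mutex' (VAL):integer*4

  integer :: hMutex
  integer :: values(8) = (/1,2,3,4,5,6,7,8/)
  integer :: nEmployedThreads

  contains
    subroutine thread(ptr)
      integer :: ptr, i, k
      real :: d

      ! Serialize the output: take the mutex, write, release
      i = wait_object(hMutex)
      write(*,*) 'Starting calculation in thread', ptr
      i = release_mutex(hMutex)

      d = 2.22
      do k=1,200000000/nEmployedThreads
        d = alog(exp(d))
      end do

      call ExitThread(0)
    end subroutine thread

end module test

program mt_console
  use test
  implicit none

  integer :: i, x
  integer :: thandle(8)
  integer :: c0, c1, rate

  ! Argument 0: the main thread does not own the mutex initially,
  ! so no release_mutex call is needed before the workers start
  hMutex = create_mutex(0)

  nEmployedThreads = 4
  call system_clock(c0, rate)

  do i=1,nEmployedThreads,1
    thandle(i) = attach_thread(thread,loc(values(i)))
  end do

  ! No window messages to process in a console program,
  ! so the main thread can block instead of polling
  do i=1,nEmployedThreads,1
    x = wait_object(thandle(i))
    x = close_handle(thandle(i))   ! assumption: releases the thread handle
  end do

  call system_clock(c1)

  ! All workers have finished, so this write needs no lock
  write(*,*) 'Total time in seconds:', real(c1-c0)/real(rate)

  x = close_handle(hMutex)
end program mt_console
```

Because the main thread sleeps inside wait_object, it should not compete with the workers for CPU time, which makes this variant a reasonable baseline when comparing against the ClearWin version.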
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Mon May 06, 2013 8:19 am

A bit more polishing and we are done. What you have been doing is very important. Single-processor Fortran is almost dead, but autoparallelization techniques are not in the Fortran standard yet. So the only more or less portable approach, at least within Windows, is to use WinAPI (since the other way, using FTN95 for .NET, is not yet polished for extreme uses, and OpenMP is not compatible with this compiler). I expect it should work OK with up to a few dozen threads on a good multi-core PC.

I'm OK with using third-party libraries, but I suspect that people will be reluctant to touch parallelization/multithreading with third-party DLLs, so please consider whether, with time, the BASIC DLL could be changed to use FTN95's existing definitions; with your knowledge of multiple languages that should not be that hard. That part was never the best with this compiler. I know that the Intel compiler has Fortran-friendly definitions of WinAPI functions; it would be great if the FTN95 developers adopted them too.

Is message processing the reason that one core is not doing the calculations, so that we see only a 3 times speedup instead of at least 3.95 on 4 cores? (I lost your non-windows version; what is best to change in the code above to try and check this idea?)

Maybe Paul will also look at this and suggest something to improve this method, and even include it in the package, kind of standardizing it.
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7916
Location: Salford, UK

Posted: Mon May 06, 2013 7:01 pm

I am keeping an eye on this conversation with the hope that I can include something in the FTN95 Win32 library.
jalih



Joined: 30 Jul 2012
Posts: 196

Posted: Mon May 06, 2013 7:39 pm

I made some small changes to my wrapper functions. Now check_object(hObject) returns directly what WaitForSingleObject(hObject, 0) returns, so it now returns 0 instead of 1 if the thread is finished.

Also, because Dan doesn't seem to like BASIC, I rewrote it in assembler. It makes the WinAPI call directly using the import library label, just for fun. The DLL, a sample console application, a sample application using a ClearWin window, and the source for the DLL are included.


Available here
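
With this change, the polling loop from the earlier ClearWin example (which looped while check_object returned 0) would need its test inverted. The sketch below shows one way that might look, assuming the revised DLL keeps the attach_thread and check_object interfaces declared earlier in this thread; the program name mt_poll and the counter k are made up for the example. WaitForSingleObject returns WAIT_OBJECT_0 (0) once the object is signalled and WAIT_TIMEOUT (258) while the thread is still running.

```fortran
! Sketch only: needs the revised wrapper DLL, where check_object returns
! the WaitForSingleObject(hObject, 0) result unchanged
module test
  INCLUDE <windows.ins>
  STDCALL attach_thread 'attach_thread' (REF, VAL):integer*4
  STDCALL check_object 'check_object' (VAL):integer*4

  integer :: values(8) = (/1,2,3,4,5,6,7,8/)

  contains
    subroutine thread(ptr)
      integer :: ptr, k
      real :: d

      ! Busy work only, so the threads are visible in Task Manager
      d = 2.22
      do k=1,200000000/8
        d = alog(exp(d))
      end do

      call ExitThread(0)
    end subroutine thread

end module test

winapp
program mt_poll
  use test
  implicit none

  integer :: i, x
  integer :: thandle(8)

  write(*,*) 'Starting 8 threads'

  do i=1,8,1
    thandle(i) = attach_thread(thread,loc(values(i)))
  end do

  do i=1,8,1
    do
      call temporary_yield@()        ! let ClearWin process window messages
      x = check_object(thandle(i))
      if (x == 0) exit               ! 0 = WAIT_OBJECT_0: this thread has finished
    end do
  end do

  write(*,*) 'All done.'
end program mt_poll
```

A more defensive loop could leave on any value other than 258 (WAIT_TIMEOUT), so that an invalid handle does not keep the main thread spinning forever.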
All times are GMT + 1 Hour