forums.silverfrost.com Welcome to the Silverfrost forums
DanRRight
Joined: 10 Mar 2008 Posts: 2826 Location: South Pole, Antarctica
|
Posted: Sat May 04, 2013 5:58 am Post subject: |
|
|
Jalih,
As the added lines show, the code does not wait for the threads to finish properly. What's wrong?
Code: |
module test
INCLUDE <windows.ins>
STDCALL attach_thread 'attach_thread' (REF, REF, REF):integer*4
STDCALL wait_object 'wait_object' (VAL):integer*4
STDCALL close_handle 'close_handle' (VAL):integer*4
STDCALL create_mutex 'create_mutex' (VAL):integer*4
STDCALL release_mutex 'release_mutex' (VAL):integer*4
integer :: hMutex
integer :: t(8)
integer :: values(8) = (/1,2,3,4,5,6,7,8/)
integer :: nEmployedThreads
contains
subroutine thread(p)
! implicit none
integer :: p, i
i = wait_object(hMutex)
write(*,*) 'Hello from thread ', p ! , nEmployedThreads
i=release_mutex(hMutex)
d =2.22
! nEmployedThreads = 8
do i=1,200000000/nEmployedThreads
d=alog(exp(d))
enddo
call ExitThread(0)
end subroutine thread
end module test
WINAPP
use test
implicit none
integer :: i, x, nEmployedThreads
hMutex = create_mutex(1)
write(*,*) 'Multithreading test'
i = release_mutex(hMutex)
nEmployedThreads = 8
call clock@ (time_start)
do i=1,8,1
x = attach_thread(thread,values(i),t(i))
end do
do i=1,8,1
x = wait_object(t(i))
end do
call clock@ (time_finish)
time = time_finish-time_start
time2= time * nEmployedThreads
print*, 'Elapsed time, total CPU time=', time, time2
END
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2556 Location: Sydney
|
Posted: Sat May 04, 2013 9:24 am Post subject: |
|
|
Dan,
I am not sure what .NET does, but could you achieve the same in a Win32 environment?
Could you wrap the API calls in Fortran 95 wrappers and achieve what you want?
Lots of ClearWin+ routines do this.
You could create a library of FTN95 callable thread routines.
One of the problems with multi-thread testing is producing a non-trivial solution, as most thread examples do next to nothing significant.
I have had problems converting trivial OpenMP parallel examples to meaningful solutions.
John |
jalih
Joined: 30 Jul 2012 Posts: 196
|
Posted: Sat May 04, 2013 6:47 pm Post subject: Re: |
|
|
DanRRight wrote: | Jalih,
As follows from added lines the code does not wait for proper threads completion. What's wrong?
|
Sorry for the late reply, Dan...
Actually it was my mistake: the return value of attach_thread() is the thread handle, and wait_object() should be called with the thread handle as its parameter.
Updated example available here
I spotted another problem: write(*,*) in a WINAPP program doesn't seem to be thread safe and should only be called from the main thread. |
DanRRight
Joined: 10 Mar 2008 Posts: 2826 Location: South Pole, Antarctica
|
Posted: Sat May 04, 2013 10:40 pm Post subject: |
|
|
Jalih,
Well, something is definitely even more wrong here, maybe on my side, I do not know. This one does not seem to work at all. Please add these lines to the thread being launched and watch in Task Manager: the threads do not even start
Code: |
d =2.22
nEmployedThreads = 16
do i=1,200000000/nEmployedThreads
d=alog(exp(d))
enddo
|
John,
What jalih is doing above is the WinAPI route: trying to create a working testbed with thread-safe locks and the ability to write(*,*) from threads, as can be done in .NET (without that, all the effort would be useless, because we would lose the last chance to do at least some debugging in threads).
The .NET way is dead for me and for you, because the only way I can make it work is to write a small program entirely in .NET, pass the data it needs not via common blocks or modules but via file exchange (or, perhaps a bit better, the newer transfer of data via RAM allocation that Paul promised), and launch the .NET part from the x86 program using CISSUE or similar. The .NET version does not compile my program and has no /3GB. Try this testbed, John, and check if it works in your case; compilation is simple
Code: | FTN95 main.f95
slink main.obj t.dll |
jalih
Joined: 30 Jul 2012 Posts: 196
|
Posted: Sun May 05, 2013 11:10 am Post subject: Re: |
|
|
DanRRight wrote: | Jalih,
Well, definitely here is something even more wrong, may be on my side, i do not know.... It does not seem this one works at all... |
Try this multithreaded matrix multiply test.
Remember, write(*,*) into a ClearWin window from a worker thread in a WINAPP application seems to lock things up, so it can't be used. Maybe Paul can help with this issue? |
DanRRight
Joined: 10 Mar 2008 Posts: 2826 Location: South Pole, Antarctica
|
Posted: Sun May 05, 2013 12:03 pm Post subject: |
|
|
Thanks jalih, this works. Pity that locking of threads does not work, though. One more thing puzzled me. I parallelized the same code as in the NET example in the other thread, substituting your matrix multiplication DO loop with the DO loop from here
http://forums.silverfrost.com/viewtopic.php?t=2534
and the speedup is only 3+ times versus 7+ times in the NET case. Your matrix case was also around 3.9 times. Any clues why? Here is the text for convenience and simplicity
Code: |
module test
INCLUDE <windows.ins>
STDCALL attach_thread 'attach_thread' (REF, VAL):integer*4
STDCALL wait_object 'wait_object' (VAL):integer*4
STDCALL close_handle 'close_handle' (VAL):integer*4
STDCALL create_mutex 'create_mutex' (VAL):integer*4
STDCALL release_mutex 'release_mutex' (VAL):integer*4
integer, parameter :: threads = 8
real d
contains
subroutine thread(ptr)
integer :: ptr, i, j, x
d =2.22
nEmployedThreads = 8
do i=1,200000000/nEmployedThreads
d=alog(exp(d))
enddo
call ExitThread(0)
end subroutine thread
end module test
WINAPP
use test
integer :: i, j, x
integer :: thandle(threads)
integer :: nEmployedThreads
write(*,*) 'Single threaded :'
call clock(start)
d =2.22
do i=1,200000000
d=alog(exp(d))
enddo
call clock(finish)
write(*,*) 'Total time in seconds:', finish-start
! Calculate work unit size for threads and assign starting positions for each thread
write(*,*) 'Multi threaded with 8 threads:'
call clock(start)
! Start threads
do i=1,threads,1
thandle(i) = attach_thread(thread,loc(i))
end do
! Wait for threads to finish
do i=1,threads,1
x = wait_object(thandle(i))
end do
call clock(finish)
write(*,*) 'Total time in seconds:', finish-start
write(*,*) 'All done.'
END
|
Last edited by DanRRight on Sun May 05, 2013 1:06 pm; edited 2 times in total |
jalih
Joined: 30 Jul 2012 Posts: 196
|
Posted: Sun May 05, 2013 12:43 pm Post subject: Re: |
|
|
DanRRight wrote: | Thanks jalih, this works. Pity lock of threads does not work though |
Actually, locking of threads works fine. Compile and try the example below as a console application.
Only writing into a ClearWin window doesn't work currently, probably because handling of window messages is blocked while waiting for threads. I will write you a non-blocking version and post an update soon.
Code: |
module test
INCLUDE <windows.ins>
STDCALL attach_thread 'attach_thread' (REF, VAL):integer*4
STDCALL wait_object 'wait_object' (VAL):integer*4
STDCALL close_handle 'close_handle' (VAL):integer*4
STDCALL create_mutex 'create_mutex' (VAL):integer*4
STDCALL release_mutex 'release_mutex' (VAL):integer*4
integer :: hMutex
integer :: values(8) = (/1,2,3,4,5,6,7,8/)
contains
subroutine thread(ptr)
integer :: ptr, i
i = wait_object(hMutex)
write(*,*) 'Hello from thread', ptr
i=release_mutex(hMutex)
call ExitThread(0)
end subroutine thread
end module test
program mt
use test
implicit none
integer :: i, x
integer :: thandle(8)
hMutex = create_mutex(1)
write(*,*) 'Multithreading test'
x=release_mutex(hMutex)
do i=1,8,1
thandle(i) = attach_thread(thread,loc(values(i)))
end do
do i=1,8,1
x = wait_object(thandle(i))
end do
x = close_handle(hMutex)
write(*,*) 'All done. Bye!'
end program mt
|
jalih
Joined: 30 Jul 2012 Posts: 196
|
Posted: Sun May 05, 2013 1:24 pm Post subject: Re: |
|
|
jalih wrote: |
Only writing into Clearwin window don't work currently, that is probably because handling of window messages is blocked while waiting for threads. I will write you a non-blocking version and post update soon.
|
Added a non-blocking version of wait_object() and a new example
Have fun! |
DanRRight
Joined: 10 Mar 2008 Posts: 2826 Location: South Pole, Antarctica
|
Posted: Sun May 05, 2013 1:51 pm Post subject: |
|
|
Jalih, nice attempts, but I think a thread that is too short is very confusing: it seems the lock actually does not work in the last example. Or maybe it works but the threads are not launched. Please (always!) include this or a similar snippet, so that each thread visibly starts and works for a second or two in Task Manager
Code: |
d =2.22
nEmployedThreads = 8
do i=1,200000000/nEmployedThreads
d=alog(exp(d))
enddo
|
After that there will never be any confusion about whether the code works or not. Also, what do you think about the speedup numbers in my previous post a couple of hours ago? |
jalih
Joined: 30 Jul 2012 Posts: 196
|
Posted: Sun May 05, 2013 4:09 pm Post subject: Re: |
|
|
DanRRight wrote: | Jalih, nice attempts, but I think too short thread confuses very much, seems the lock actually does not work in last example. |
Starting threads and locking should work fine in my last non-blocking thread wait example. I updated my previous example with your suggested code and the program runs as it is supposed to. Just re-download and test it.
picture here
You can see more clearly that locking really works by adding your extra code inside the mutex-guarded region, so that the threads execute one by one. |
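A minimal sketch of that suggestion, assuming the same t.dll wrappers and declarations as in the console listing earlier in the thread. The parameter nThreads and the loop counter k are names made up for this illustration; the downloadable example itself is not reproduced here. With the work loop moved between wait_object and release_mutex, only one thread can compute at a time, so Task Manager should show roughly one busy core instead of several.

```fortran
module test
  INCLUDE <windows.ins>
  STDCALL attach_thread 'attach_thread' (REF, VAL):integer*4
  STDCALL wait_object 'wait_object' (VAL):integer*4
  STDCALL close_handle 'close_handle' (VAL):integer*4
  STDCALL create_mutex 'create_mutex' (VAL):integer*4
  STDCALL release_mutex 'release_mutex' (VAL):integer*4
  integer :: hMutex
  integer :: values(8) = (/1,2,3,4,5,6,7,8/)
  integer, parameter :: nThreads = 8     ! hypothetical name: number of worker threads
contains
  subroutine thread(ptr)
    integer :: ptr, i, k
    real :: d
    i = wait_object(hMutex)              ! acquire the mutex: one thread at a time from here
    write(*,*) 'Hello from thread', ptr
    d = 2.22
    do k = 1, 200000000/nThreads         ! Dan's test load, now inside the guarded region
      d = alog(exp(d))
    end do
    i = release_mutex(hMutex)            ! release only after the work is done
    call ExitThread(0)
  end subroutine thread
end module test
```

The main program from the console listing can be used unchanged. If the total run time becomes close to the single-threaded time, the mutex is serializing the threads as intended.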
DanRRight
Joined: 10 Mar 2008 Posts: 2826 Location: South Pole, Antarctica
|
Posted: Sun May 05, 2013 7:59 pm Post subject: |
|
|
All works, Jalih, which is very good news, thanks for the efforts. The last thing left is to understand why exactly the same NET example is almost twice as fast. I'd say that is surprising, even unreasonable, on 4 CPU cores: the CPU may have 8 integer units, perfect for 8 threads, but it has only 4 FP ones, so increasing the number of threads to 8 should not give a boost to the NET example, but it DOES. With my 4-core/8-thread processor the NET version executes in 2.5+ seconds versus 5 seconds with your last modification using WinAPI. A speedup of around 3 times is smallish; I would expect it to be close to 4, but it is not unreasonable. I will do more testing, and of course I invite others to do that too.
Here is Jalih's code, modified to match the NET one exactly for the purpose of comparison.
Code: |
! Compilation:
! FTN95 main.f95
! SLINK main.obj t.dll
module test
INCLUDE <windows.ins>
STDCALL attach_thread 'attach_thread' (REF, VAL):integer*4
STDCALL wait_object 'wait_object' (VAL):integer*4
STDCALL check_object 'check_object' (VAL):integer*4
STDCALL close_handle 'close_handle' (VAL):integer*4
STDCALL create_mutex 'create_mutex' (VAL):integer*4
STDCALL release_mutex 'release_mutex' (VAL):integer*4
integer :: hMutex
integer :: values(8) = (/1,2,3,4,5,6,7,8/)
integer :: nEmployedThreads
contains
subroutine thread(ptr)
integer :: ptr, i
real d
i = wait_object(hMutex)
write(*,*) 'Starting calculation in thread', ptr
i=release_mutex(hMutex)
d = 2.22
do i=1,200000000/nEmployedThreads,1
d=alog(exp(d))
end do
call ExitThread(0)
end subroutine thread
end module test
winapp
program mt
use test
implicit none
integer :: i, x
integer :: thandle(8)
real d, finish, start
hMutex = create_mutex(1)
write(*,*) 'Multithreading test with up to 8 threads:'
x=release_mutex(hMutex)
1 print*,' Enter number of parallel threads <= 8'
read(*,*) nEmployedThreads
if(nEmployedThreads.lt.1.or.nEmployedThreads.gt.8) nEmployedThreads=4
call clock(start)
do i=1,nEmployedThreads,1
thandle(i) = attach_thread(thread,loc(values(i)))
end do
do i=1,nEmployedThreads,1
10 call temporary_yield@()
x = check_object(thandle(i))
if (x == 0) goto 10
end do
x = close_handle(hMutex)
call clock(finish)
write(*,*) 'Total time in seconds:', finish-start
goto 1
end program mt
|
Questions
1) I just noticed that even when it is not running but waiting for user input, the code grabs CPU time. Why is that?
2) Do we need this lock in the main program (as opposed to the threads, where we definitely need a lock)? Threads in the main program should never collide or conflict, should they?
hMutex = create_mutex(1)
write(*,*) 'Multithreading test with up to 8 threads:'
x=release_mutex(hMutex)
3) Why was BASIC used in the DLL to get the WinAPI functions? Aren't these functions available straight from FTN95?
Last edited by DanRRight on Mon May 06, 2013 7:38 am; edited 2 times in total |
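To put the figures quoted in this thread on a common footing, here is a small standard-Fortran sketch that divides each quoted speedup by the number of physical cores. The numbers are the approximate values mentioned in the posts ("3+", "around 3.9", "7+", "2.5+" and 5 seconds), not new measurements. The program and variable names are made up for this illustration.

```fortran
program speedup_figures
  implicit none
  integer, parameter :: cores = 4        ! physical cores of the 4-core/8-thread CPU
  ! Approximate speedups quoted in the thread, in the order of the labels below.
  real, parameter :: speedup(3) = (/ 3.0, 3.9, 7.0 /)
  character(len=16), parameter :: label(3) = (/ &
       'WinAPI log/exp  ', &
       'WinAPI matrix   ', &
       'NET log/exp     ' /)
  ! Run times in seconds as quoted for the 8-thread log/exp test.
  real, parameter :: t_winapi = 5.0, t_net = 2.5
  integer :: k

  do k = 1, 3
     write(*,'(a,a,f4.1,a,f5.2)') label(k), ' speedup ', speedup(k), &
          '  per physical core ', speedup(k)/real(cores)
  end do
  write(*,'(a,f4.1)') 'WinAPI time / NET time: ', t_winapi/t_net
end program speedup_figures
```

A value above 1.0 per physical core means the run gained more than the four physical cores alone could deliver, which would be consistent with the extra logical threads contributing. This does not by itself explain the difference between the two builds.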
jalih
Joined: 30 Jul 2012 Posts: 196
|
Posted: Mon May 06, 2013 4:35 am Post subject: Re: |
|
|
DanRRight wrote: |
Questions
1) I just noticed that even not running but waiting user input the code grabs CPU time. Why it is that?
|
A program using a ClearWin window must process messages, so the main thread in this case can't just sleep while waiting for the threads to finish. If you just made a console application, then wait_object() could be used instead of polling with check_object(), and the situation would be better.
Quote: |
2) Do we need this lock in main program (versus in threads where we definitely need a lock)? Threads in main program should never collide and conflict, isn't it?
hMutex = create_mutex(1)
write(*,*) 'Multithreading test with up to 8 threads:'
x=release_mutex(hMutex)
|
You are right, the locking is only necessary after the threads have been attached (from then on it is also necessary in the main program). You can change the mutex creation line to hMutex = create_mutex(0) and remove the x = release_mutex(hMutex) line.
Quote: |
3) Why BASIC was for used in DLL to get WinAPI functions, aren't these functions available straight from FTN95? |
It's a personal preference. FTN95 doesn't have as complete header definitions as some other compilers do, and I like MiniBASIC's syntax. It's more readable than C and offers the same functionality. |
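The answers to questions 1 and 2 can be combined into a console variant of Dan's test. This is a sketch only, assuming the t.dll wrappers behave as described in the thread: create_mutex(0) creates a mutex that is not initially owned, wait_object blocks until the handle is signalled, and close_handle also accepts thread handles (the last point is an assumption). Timing uses the standard system_clock intrinsic instead of clock; the program name mt_console and the counter k are made up for this illustration.

```fortran
! Build as in the thread (requires jalih's t.dll):
!   FTN95 main.f95
!   SLINK main.obj t.dll
module test
  INCLUDE <windows.ins>
  STDCALL attach_thread 'attach_thread' (REF, VAL):integer*4
  STDCALL wait_object 'wait_object' (VAL):integer*4
  STDCALL close_handle 'close_handle' (VAL):integer*4
  STDCALL create_mutex 'create_mutex' (VAL):integer*4
  STDCALL release_mutex 'release_mutex' (VAL):integer*4
  integer :: hMutex
  integer :: values(8) = (/1,2,3,4,5,6,7,8/)
  integer :: nEmployedThreads = 4
contains
  subroutine thread(ptr)
    integer :: ptr, i, k
    real :: d
    i = wait_object(hMutex)              ! guard the shared console output
    write(*,*) 'Starting calculation in thread', ptr
    i = release_mutex(hMutex)
    d = 2.22
    do k = 1, 200000000/nEmployedThreads
      d = alog(exp(d))
    end do
    call ExitThread(0)
  end subroutine thread
end module test

program mt_console                       ! no WINAPP line: console application
  use test
  implicit none
  integer :: i, x, c0, c1, rate
  integer :: thandle(8)
  hMutex = create_mutex(0)               ! 0: not initially owned, so no release_mutex here
  call system_clock(c0, rate)
  do i = 1, nEmployedThreads
    thandle(i) = attach_thread(thread, loc(values(i)))
  end do
  do i = 1, nEmployedThreads
    x = wait_object(thandle(i))          ! blocking wait: the main thread sleeps, no polling
    x = close_handle(thandle(i))         ! assumption: the wrapper also closes thread handles
  end do
  call system_clock(c1)
  x = close_handle(hMutex)
  write(*,*) 'Total time in seconds:', real(c1 - c0)/real(rate)
end program mt_console
```

Because the main thread blocks in wait_object instead of looping over temporary_yield@ and check_object, it should not use CPU time while the workers run.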
DanRRight
Joined: 10 Mar 2008 Posts: 2826 Location: South Pole, Antarctica
|
Posted: Mon May 06, 2013 8:19 am Post subject: |
|
|
A bit more polishing and we are done. What you have been doing is very important. Single-processor Fortran is almost dead, but autoparallelization techniques are not in the Fortran standard yet. So the only more or less portable approach, at least within Windows, is to use WinAPI (since the other way, FTN95 for NET, is not yet polished for extreme uses, and OpenMP is not compatible with this compiler). I expect it to work OK with up to a few dozen threads on a good multi-core PC.
I'm OK with using third-party libraries, but I suspect that people will be reluctant to touch parallelization/multithreading through third-party DLLs, so please consider replacing the BASIC DLL with FTN95's existing definitions in time; with your knowledge of multiple languages that should not be that hard. That part was never this compiler's strongest. I know that the Intel compiler has Fortran-friendly definitions of WinAPI functions; it would be great if the FTN95 developers adopted something similar.
Is message processing the reason why one core is not doing calculations, so that we see only a 3 times speedup instead of at least 3.95 on 4 cores? (I lost your non-windows version; what is best to change in the code above to try and check this idea?)
Maybe Paul will also look at this and suggest something to improve this method, or even include it in the package, kind of standardizing it. |
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7933 Location: Salford, UK
|
Posted: Mon May 06, 2013 7:01 pm Post subject: |
|
|
I am keeping an eye on this conversation with the hope that I can include something in the FTN95 Win32 library. |
jalih
Joined: 30 Jul 2012 Posts: 196
|
Posted: Mon May 06, 2013 7:39 pm Post subject: |
|
|
I made some small changes to my wrapper functions. Now check_object(hObject) returns directly what WaitForSingleObject(hObject, 0) returns, so it now returns 0 instead of 1 if the thread is finished.
Also, because Dan doesn't seem to like BASIC, I re-wrote it in assembler. It makes the WinAPI call directly using the import library label, just for fun. The DLL, a sample console application, a sample application using a ClearWin window, and the source for the DLL are included.
Available here |
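With the changed return convention, the polling loop in Dan's WINAPP listing needs its test inverted. Below is a minimal sketch, assuming check_object is the t.dll wrapper described above and passes through the raw WaitForSingleObject(hObject, 0) result: 0 (WAIT_OBJECT_0) once the thread has finished and 258 (WAIT_TIMEOUT, 0x102) while it is still running. The module name poll_wait, the subroutine wait_all and the constant STILL_RUNNING are names made up for this illustration.

```fortran
module poll_wait
  INCLUDE <windows.ins>
  STDCALL check_object 'check_object' (VAL):integer*4
  ! Win32 WAIT_TIMEOUT (0x102): the object is not yet signalled.
  ! Given a hypothetical name here to avoid clashing with anything in windows.ins.
  integer, parameter :: STILL_RUNNING = 258
contains
  ! Poll each thread handle until it is no longer reported as running,
  ! yielding to ClearWin+ between polls so the window stays responsive.
  subroutine wait_all(thandle, n)
    integer, intent(in) :: n
    integer, intent(in) :: thandle(n)
    integer :: i, x
    do i = 1, n
      do
        call temporary_yield@()
        x = check_object(thandle(i))
        if (x /= STILL_RUNNING) exit     ! 0 = finished; any other value = error, stop polling
      end do
    end do
  end subroutine wait_all
end module poll_wait
```

In Dan's listing, call wait_all(thandle, nEmployedThreads) would take the place of the labelled goto 10 loop. Exiting on any value other than WAIT_TIMEOUT also avoids spinning forever if the wait fails.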