jalih
Joined: 30 Jul 2012 Posts: 196
|
Posted: Sun Jun 15, 2014 4:05 pm Post subject: Thread Pool API |
|
|
I wrote a simple DLL with FTN95-callable wrapper functions for using the Thread Pool API.
This should simplify multithreading code in applications. I will post the DLL with source code as soon as I have written some example code.
|
jalih
Joined: 30 Jul 2012 Posts: 196
|
Posted: Fri Aug 15, 2014 5:47 am Post subject: |
|
|
A test project for the thread pool wrapper is available here.
The tp.mba file contains the minibasic source code for the wrapper DLL and may be useful when working out the function parameters.
|
DanRRight
Joined: 10 Mar 2008 Posts: 2815 Location: South Pole, Antarctica
|
Posted: Sat Aug 16, 2014 8:18 pm Post subject: |
|
|
Jalih, it looks like you have upgraded to a PC with an 8-core (or 8-thread) processor :-)
Please compare how this scales with the number of threads from 1 to 8.
I'm away from my PC, and although I can control it and run everything from my phone, that is not a convenient way to work (I would probably need VR glasses and a virtual keyboard and mouse for that).
|
jalih
Joined: 30 Jul 2012 Posts: 196
|
Posted: Mon Aug 18, 2014 10:33 am Post subject: Re: |
|
|
DanRRight wrote: | Jalih, looks like you have upgraded to 8 cores/threaded processor PC:-) |
I wish that were the case, but no, I am still using my six-year-old PC.
Quote: | Please compare how this scales with number of threads from 1 to 8. |
I have not done any timings. Accurate timing of multithreaded code has its difficulties. I only tried the Clock@() function, and that gave an erroneous result and a program hang.
Used correctly, I expect the thread pool to perform quite well. It can reduce overhead considerably and makes multithreaded code easy to manage. For simple work you can use the default application thread pool, so you only have to create and submit work items, wait for the work callbacks to complete, and close the work items. A process can also make use of multiple thread pools.
|
jalih
Joined: 30 Jul 2012 Posts: 196
|
Posted: Mon Aug 18, 2014 3:44 pm Post subject: |
|
|
Another thread pool sample.
The sample above is a ClearWin+ application that uses the application's default thread pool to run the update and drawing code in a separate thread. It could be used as a simple game template.
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
|
Posted: Tue Aug 19, 2014 6:09 am Post subject: |
|
|
Jalih,
SYSTEM_CLOCK is based on QueryPerformanceCounter, which is a good real-time (elapsed) timer. While there can be discrepancies in the timers between different CPUs, the error is typically less than the accuracy of the timer call.
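As a minimal sketch of elapsed-time measurement with SYSTEM_CLOCK: the program name, the placeholder workload and the 64-bit kinds from iso_fortran_env are assumptions for illustration, written in standard Fortran 2008 rather than taken from anyone's code in this thread. Elapsed time is the quantity of interest for threaded code, because on many compilers CPU_TIME reports CPU time summed over all threads.

```fortran
program time_region
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  integer(int64) :: t0, t1, rate
  real(real64)   :: elapsed, s
  integer        :: i

  call system_clock(count_rate=rate)   ! clock ticks per second
  call system_clock(t0)                ! tick count before the work

  s = 0.0_real64
  do i = 1, 10000000                   ! placeholder workload (assumption)
     s = s + sqrt(real(i, real64))
  end do

  call system_clock(t1)                ! tick count after the work
  elapsed = real(t1 - t0, real64) / real(rate, real64)

  print '(a,f10.6,a)', 'elapsed: ', elapsed, ' s'
  print *, 'checksum: ', s             ! uses s so the loop is not optimised away
end program time_region
```

To time a threaded region, the two SYSTEM_CLOCK calls would go on the master thread, immediately before and after the parallel region.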
Lately, I have been testing gfortran for OMP programming, before venturing into Clearwin_64. OMP manages all the stack issues and the replication of private variables. One of its disadvantages is the overhead of initialising multiple threads.
Code suited to OMP must give each thread enough computation to make the threads effective. I have also found that memory access speed and cache utilisation have a significant influence on performance. The general rule for multi-loop code structures is to use OMP on the outer loop and vector instructions on the inner loop.
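The outer-loop rule can be sketched with a matrix multiply. This is an illustrative example under stated assumptions (the program name, the array size n = 200 and the random test data are all made up here), not code from any of the designs discussed in this thread. The OMP directive is on the outer loop over columns, and the innermost loop runs down a column, which is contiguous in Fortran and so a candidate for vector instructions.

```fortran
program outer_loop_omp
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
  integer, parameter :: n = 200        ! illustrative size (assumption)
  real(real64), allocatable :: a(:,:), b(:,:), c(:,:)
  integer :: i, j, k

  allocate(a(n,n), b(n,n), c(n,n))
  call random_number(a)
  call random_number(b)
  c = 0.0_real64

  ! Outer loop over columns of c: each thread owns whole columns,
  ! so no two threads ever write to the same element.
  !$OMP PARALLEL DO PRIVATE(i, j, k) SHARED(a, b, c)
  do j = 1, n
     do k = 1, n
        ! Inner loop walks down a column (contiguous memory),
        ! which the compiler can vectorise.
        do i = 1, n
           c(i,j) = c(i,j) + a(i,k) * b(k,j)
        end do
     end do
  end do
  !$OMP END PARALLEL DO

  ! Compare against the intrinsic as a sanity check.
  print *, 'max |c - matmul(a,b)| =', maxval(abs(c - matmul(a, b)))
end program outer_loop_omp
```

With gfortran this would be built with -fopenmp; without that flag the directives are treated as comments and the same code runs serially.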
A single simple DO loop, such as Dot_Product below, is not suitable for OMP coding, due to the thread overheads.
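Code: | ! Simple Dot_Product using !$OMP
! (not recommended, due to OMP overheads)
!
! s is initialised before the directive, because PARALLEL DO
! must be followed immediately by the DO loop it applies to.
s = 0
!$OMP PARALLEL DO PRIVATE (i), SHARED (a,b,n), &
!$OMP& REDUCTION(+ : s)
do i = 1,n
   s = s + a(i) * b(i)
end do
!$OMP END PARALLEL DO |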
What is the advantage of the multi-threading approach you are describing? Does its simpler thread structure reduce the thread overhead enough to overcome some of these OMP issues?
John |
|
DanRRight
Joined: 10 Mar 2008 Posts: 2815 Location: South Pole, Antarctica
|
Posted: Wed Aug 20, 2014 4:23 am Post subject: |
|
|
We have
- FTN95 for .NET parallelization
http://forums.silverfrost.com/viewtopic.php?t=2534&start=15
- Two parallel designs by Jalih
- Paul's parallel design
- John's approach using OpenMP, but that's on a different compiler
Which design is most efficient when scaling to multiple independent threads?
The .NET example above uses long-running independent threads, so the overhead does not matter there. For some unknown reason it produced the best parallel scaling, far better than we could expect. Specifically, on a 4-core 4770K with 8 threads it gives a speed-up closer to 8 than to 4, while the other methods give 3-4.
Last edited by DanRRight on Wed Aug 20, 2014 10:01 pm; edited 1 time in total |
|
jalih
Joined: 30 Jul 2012 Posts: 196
|
Posted: Wed Aug 20, 2014 9:53 am Post subject: Re: |
|
|
JohnCampbell wrote: | What is the advantage of the multi-threading approach you are describing? Does its simpler thread structure have reduced thread overhead to overcome some of these OMP issues? |
Most OpenMP implementations probably use thread pools. I think the idea is to minimize overhead by not creating and destroying threads for each parallel region. A pool of workers is created at the first parallel region, and these threads exist for the duration of program execution. The threads are not destroyed until the last parallel region has executed.
Basically, the threads in the pool are queued and wait for work to become available. After a thread has processed a work item, it returns to the queue to get more work.
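The "return to the queue for more work" behaviour can be illustrated with OpenMP dynamic scheduling. To be clear, this is a stand-in for the idea only: it is not the Thread Pool API, it is not the wrapper DLL, and the program name, item count and workload are assumptions made up for the sketch. With SCHEDULE(DYNAMIC, 1), a thread that finishes one item takes the next unclaimed one, so the items end up spread over the threads according to who was free.

```fortran
program work_queue_sketch
  !$ use omp_lib
  implicit none
  integer, parameter :: n_items = 32   ! illustrative number of work items
  integer, allocatable :: done_by(:)   ! items processed per thread
  integer :: item, tid, k, t, n_threads
  real    :: x, checksum

  n_threads = 1                        ! serial fallback when OpenMP is off
  !$ n_threads = omp_get_max_threads()
  allocate(done_by(0:n_threads-1))
  done_by  = 0
  checksum = 0.0

  !$OMP PARALLEL DO SCHEDULE(DYNAMIC, 1) PRIVATE(item, tid, k, x) &
  !$OMP& SHARED(done_by) REDUCTION(+ : checksum)
  do item = 1, n_items
     tid = 0
     !$ tid = omp_get_thread_num()
     x = 0.0
     do k = 1, item * 100000           ! deliberately uneven work per item
        x = x + sin(real(k))
     end do
     checksum = checksum + x
     done_by(tid) = done_by(tid) + 1   ! each thread updates only its own slot
  end do
  !$OMP END PARALLEL DO

  do t = 0, n_threads - 1
     print '(a,i0,a,i0,a)', 'thread ', t, ' processed ', done_by(t), ' items'
  end do
  print '(a,i0)', 'total items: ', sum(done_by)
  print *, 'checksum: ', checksum      ! uses the results so the work is kept
end program work_queue_sketch
```

The lines starting with !$ are only compiled when OpenMP is enabled, so the program also builds and runs serially, with thread 0 processing every item.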
|