|
forums.silverfrost.com Welcome to the Silverfrost forums
|
View previous topic :: View next topic |
Author |
Message |
JohnCampbell
Joined: 16 Feb 2006 Posts: 2556 Location: Sydney
|
Posted: Mon Dec 11, 2017 12:58 pm Post subject: |
|
|
The following is an example where the number of threads can be varied more easily.
The same routine is called for multiple threads, with the "arg" indicating which thread is being called.
It appears to work, giving the same answer.
Code: | module data_mod
implicit none
integer, parameter :: dp=kind(1.d0)
real(kind=dp) ds
real(kind=dp) r1,x1,x2,r2,v1,ns,tmax(4),smax(4)
integer,parameter::IO_LOCK = 42
integer max_threads
real*4 times(2)
contains
! Record the current wall-clock/CPU times into the shared module array
! "times" as the start reference of a timed section (paired with time_end).
subroutine time_start
call get_times (times)
end subroutine time_start
! Close a timing section opened by time_start: overwrite the shared module
! array "times" with (current - start), i.e. elapsed wall-clock seconds in
! times(1) and elapsed CPU seconds in times(2).
subroutine time_end
real*4 t_stop(2)
call get_times (t_stop)
times(1) = t_stop(1) - times(1)
times(2) = t_stop(2) - times(2)
end subroutine time_end
! Sample both clocks at once:
!   now(1) = wall-clock seconds (via elapse_time / system_clock)
!   now(2) = CPU seconds (standard cpu_time intrinsic)
subroutine get_times (now)
real*4 now(2)
call elapse_time (now(1))
call cpu_time (now(2))
end subroutine get_times
! Wall-clock time in seconds since an arbitrary epoch, derived from the
! integer*8 system_clock counter.
! NOTE(review): returning absolute seconds as real*4 limits the resolution
! of differences when the raw tick count is large; callers needing fine
! resolution should difference the integer ticks instead.
subroutine elapse_time (seconds)
real*4 seconds
integer*8 tick, rate
call system_clock (tick, rate)
if (rate > 0) then
  seconds = dble(tick) / dble(rate)
else
  ! The standard sets rate to zero when no clock is available;
  ! avoid dividing by zero and report zero elapsed time.
  seconds = 0.0
end if
end subroutine elapse_time
! Thread worker: step a thread-private accumulator across this thread's
! share of the interval (0,1] in increments of the shared step ds,
! counting the iterations.  Console output is serialized via IO_LOCK.
subroutine search1 (ithread)
include<windows.ins>
integer, intent(in) :: ithread
real(kind=dp) s, s_upper
integer n_steps
call lock@(IO_LOCK)
write(6,*) 'Starting A thread ',ithread
call unlock@(IO_LOCK)
! This thread covers ( (ithread-1)/max_threads , ithread/max_threads ].
s_upper = dble(ithread)/dble(max_threads)
s = epsilon(1.0d0) + s_upper - 1.0d0/dble(max_threads)
n_steps = 0
do
if (s > s_upper) exit
s = s + ds
n_steps = n_steps+1
end do
call lock@(IO_LOCK)
write(6,*) 'Completed A thread ',ithread, n_steps, s
call unlock@(IO_LOCK)
end subroutine search1
! Thread worker: scan this thread's slice of the range (0,1] for the
! maximum of t_loc, then store the per-thread maximum and its location
! in the shared arrays tmax(ithread)/smax(ithread).
! NOTE(review): the formula looks like an induction-machine torque-vs-slip
! search (r1,x1 series with r2/s) -- confirm the physical interpretation.
subroutine searchX (ithread)
include<windows.ins>
integer, intent(in) :: ithread
! Thread-private locals; r1,r2,x1,x2,v1,ns,ds,max_threads,tmax,smax are
! the shared module variables.
real(kind=dp) s_loc, slim, tmax_loc, t_loc, k_loc, smax_loc
complex(kind=dp) i_loc
integer n_do
! Serialize console output between threads.
call lock@ (IO_LOCK)
write(6,*) 'Starting C1 thread',ithread
call unlock@ (IO_LOCK)
tmax_loc = 0.d0
! 4*atan(1) = pi, so k_loc = 90/(pi*ns).
k_loc = 90.d0/(4.d0*atan(1.d0)*ns)
! This thread covers ( (ithread-1)/max_threads , ithread/max_threads ].
slim = dble(ithread)/dble(max_threads)
s_loc = epsilon(1.0d0) + slim - 1.0d0/dble(max_threads)
n_do = 0
do while (s_loc <= slim)
i_loc = v1/cmplx((r1+r2/s_loc),(x1+x2),kind=dp)
!!! t_loc = (k_loc)*(abs(i_loc)*abs(i_loc)*(r2/s_loc))
t_loc = (k_loc)*((abs(i_loc)**2)*(r2/s_loc))
! smax_loc is assigned here on the first pass (t_loc > 0 > tmax_loc),
! so it is defined whenever the loop body runs at least once.
if (t_loc > tmax_loc) then
tmax_loc = t_loc
smax_loc = s_loc
end if
s_loc = s_loc + ds
n_do = n_do+1
end do
! Each thread writes only its own array element, so no lock is taken
! here; the commented-out lock calls below were an experiment.
!z call lock@ (ithread) ! ithread should be hh(ithread) ??
tmax(ithread) = tmax_loc
smax(ithread) = smax_loc
!z call unlock@ (ithread)
call lock@ (IO_LOCK)
write(6,*) 'Completed C1 thread',ithread, n_do, tmax_loc
call unlock@ (IO_LOCK)
end subroutine searchX
|
|
|
Back to top |
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2556 Location: Sydney
|
Posted: Mon Dec 11, 2017 12:59 pm Post subject: |
|
|
The test routine is: Code: | subroutine do_tests
! Driver: for each of the two workers (kk=1 -> searchX, kk=2 -> search1)
! time (a) a serial run of 4 calls, (b) the same work as 4 explicit
! start_thread@ calls, and (c) the same work started from a DO loop,
! using the arg(:) array so each thread receives a unique argument address.
include<windows.ins>
real*4 tserial(2), tparallel(2)
! NOTE(review): kind=7 is presumably FTN95's address-sized integer kind
! for thread handles -- confirm against the FTN95 documentation.
integer(kind=7) h1,h2,h3,h4, hh(4)
integer kk, j, arg(4)
! Initialize the shared module data used by searchX.
r1 = 0.295d0 ; r2 = 0.150d0 ; x1 = 0.510d0 ; x2 = 0.210d0 ; v1 = 200.d0/sqrt(3.d0) ; ns = 1200.d0
max_threads = 4
do kk = 1,2
write(6,*) ' '
write(6,11) 'Serial ', kk
call time_start
! Serial baseline: call the worker once per "thread" slice.
if ( kk==2 ) then
ds = 1.d-9
call search1 (1)
call search1 (2)
call search1 (3)
call search1 (4)
else
ds = 1.d-8
call searchX (1)
call searchx (2)
call searchx (3)
call searchx (4)
end if
call time_end
tserial = times
write(6,11) ' Serial ', kk, ' Elapsed time ', tserial
11 format (a,i2,a,2f9.5)
write(6,11) 'Parallel', kk
call time_start
! Parallel run: literal arguments (1..4) each have a distinct address,
! so every thread sees its own value.
if ( kk==2 ) then
h1 = start_thread@ (search1,1)
h2 = start_thread@ (search1,2)
h3 = start_thread@ (search1,3)
h4 = start_thread@ (search1,4)
else
h1 = start_thread@ (searchX,1)
h2 = start_thread@ (searchx,2)
h3 = start_thread@ (searchx,3)
h4 = start_thread@ (searchx,4)
end if
! Argument 0: wait for all outstanding threads to complete.
call wait_for_thread@ (0)
call time_end
tparallel = times
write(6,11) ' Parallel', kk, ' Elapsed time ', tparallel
write(6,12) ' x increase in speed ', tserial/tparallel
if ( kk==1 ) then
! Reduce the per-thread maxima collected by searchX.
j = maxloc( tmax(1:max_threads), 1 )
write (*,*) 'max val is',tmax(j),' at s=',smax(j)
end if
! Test of calling start_thread@ in a DO loop
! hh(j) = start_thread@ (search1,j) failed, as j is changing its value
! while arg(j) appears to transfer correct address
!
write(6,11) 'Test arg', kk
call time_start
! DO-loop launch: arg(j) gives each thread a stable, unique address,
! unlike the loop index j which keeps changing while threads start.
if ( kk==2 ) then
do j = 1,max_threads
arg(j) = j
hh(j) = start_thread@ (search1,arg(j))
end do
else
do j = 1,max_threads
arg(j) = j
hh(j) = start_thread@ (searchx,arg(j))
end do
end if
call wait_for_thread@ (0)
call time_end
tparallel = times
write(6,11) ' Test arg', kk, ' Elapsed time ', tparallel
write(6,12) ' x increase in speed ', tserial/tparallel
if ( kk==1 ) then
j = maxloc( tmax(1:max_threads), 1 )
write (*,*) 'max val is',tmax(j),' at s=',smax(j)
end if
12 format (a,2f9.5)
end do
end subroutine do_tests
end module data_mod
program main1
! Entry point: all of the work is in data_mod::do_tests.
use data_mod
write (*,*) '[Ken_thread2b.f90 ]'
call do_tests
end |
I ran it on my pc that supports 4 threads. |
|
Back to top |
|
|
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7933 Location: Salford, UK
|
Posted: Mon Dec 11, 2017 4:35 pm Post subject: |
|
|
Ken's original code raises two further issues.
1) The code fragment
Code: | call lock@(ithread)
ds_loc = ds
call unlock@(ithread) |
appears in two places where the actual value of ithread is different. This is probably not what is intended. In working code (rather than this cut-down demo) the variable ds would be shared and needing read/write protection. In which case the locking ID would need to be the same in both instances.
2) The other issue is for us to sort out. The use of start_thread@ in a 64 bit environment raises the question: what makes a call "thread safe" for 64 bits? We need to look into this and (at least) provide some documentation that explains what kind of call is considered to be safe. |
|
Back to top |
|
|
John-Silver
Joined: 30 Jul 2013 Posts: 1520 Location: Aerospace Valley
|
Posted: Mon Dec 11, 2017 6:53 pm Post subject: |
|
|
how can x64 possibly be 60% slower than x32 ?
Browsing haphazardly through the on-line documentation for Threading I note that it all seems to be tied to .NET programming, why is that ?
I also note that threading for implementing multi-processing appears to be non-automatic, in that it needs a lot of knowledge to start with about portions of a code that could benefit from it, and then careful disaggregation of such code and 'enveloping' in the necessary threading code.
It seems, like most too-good-to-be-true mega-methods, laden with potential penelope pitfalls, not to mention dastardly Dan devils, until it evolves to be peter perfect.
(This is a global observation not a critique in any way of ftn95) |
|
Back to top |
|
|
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7933 Location: Salford, UK
|
Posted: Mon Dec 11, 2017 8:49 pm Post subject: |
|
|
John-Silver
Some of the results in this Forum thread are erroneous. If correct results are 60% slower then that will be a particular case. In general my experience is that 64 bit results are usually on a par with 32 bit results (for speed).
The real motivation for moving to 64 bits is the increase in address space.
start_thread@ is for Win32 and x64 and not .NET.
Yes, multi-threading can be tricky. |
|
Back to top |
|
|
Kenneth_Smith
Joined: 18 May 2012 Posts: 697 Location: Hamilton, Lanarkshire, Scotland.
|
Posted: Mon Dec 11, 2017 9:53 pm Post subject: |
|
|
Paul, John,
Thanks to you both for comments and detective work.
I can see how to make this work now.
Ken |
|
Back to top |
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2556 Location: Sydney
|
Posted: Tue Dec 12, 2017 1:27 am Post subject: |
|
|
Based on the examples I have posted, there does not appear to be significant performance problems with 64-bit multi-threading.
I have used two of Ken's examples, I called ;
Search1 : simple loop count, where 64-bit is faster
SearchX : loop of COMPLEX calculation, where 64-bit is slower.
I have not identified the problem, but I suspect it is the use of cmplx (x,y,kind=dp) or COMPLEX i_loc in the following:
i_loc = v1/cmplx((r1+r2/s_loc),(x1+x2),kind=dp)
t_loc = (k_loc)*((abs(i_loc)**2)*(r2/s_loc))
I rarely use complex type variables, so am not familiar with this type of problem.
My "Test arg" example posted above is an interesting example of multi-threading, as for my first try below with j as the arg value, this fails as the variable "j" changes through the loop and so changes in each of the initiated threads. This is because the same address of the variable j is transferred to each thread:
do j = 1,max_threads
hh(j) = start_thread@ (searchx,j)
end do
By changing to arg(j), a unique address for arg(j) is transferred to each thread and so it appears to work ok
do j = 1,max_threads
arg(j) = j
hh(j) = start_thread@ (searchx,arg(j))
end do
Further tests need to be done for varying max_threads to identify the multi-thread efficiency and overheads. My pc's support up to 4 or 8 threads. ( It has been a long time since I used a pc that supported only 2 threads. )
John |
|
Back to top |
|
|
John-Silver
Joined: 30 Jul 2013 Posts: 1520 Location: Aerospace Valley
|
Posted: Wed Dec 13, 2017 6:56 am Post subject: |
|
|
Paul wrote:
Quote: | If correct results are 60% slower then that will be a particular case |
I was referring to John C's results for Example Code 1 (both serial & parallel results are 60% longer) so they are not erroneous results.
John has since commented above. |
|
Back to top |
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2556 Location: Sydney
|
Posted: Wed Dec 13, 2017 10:39 am Post subject: |
|
|
Thanks to Ken for providing an example of using multi-threading.
I have taken this and produced a program that tests multi-threading and demonstrates:
# how to vary the number of threads and call in a DO loop
# DO loop approach allows for variable number of threads.
# use the same thread-safe routine for multiple threads, using local private variables and shared variables in a module.
# how to transfer a unique argument to each thread call.
# how to share work between threads to improve performance
The attached example works for both 32-bit and 64-bit applications.
I hope it could be helpful for others to create useful solutions.
John
https://www.dropbox.com/s/5u4ojctkshhq87z/ken_thread_test4.zip?dl=0 |
|
Back to top |
|
|
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7933 Location: Salford, UK
|
Posted: Wed Dec 13, 2017 2:57 pm Post subject: |
|
|
The failure of 64 bit CLOCK@ (and DCLOCK@) has now been fixed for the next release of clearwin64.dll. |
|
Back to top |
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2556 Location: Sydney
|
Posted: Fri Apr 05, 2019 2:27 am Post subject: |
|
|
Ken,
Have you made any progress with the threading ?
It would be good to be able to use a DO loop to manage the number of available threads.
We could have a DO for the threads available or tasks to perform and then associate the thread number with each task.
I am having a bit of a problem managing the private do index for each thread.
Interested to hear how you proceeded.
John |
|
Back to top |
|
|
Kenneth_Smith
Joined: 18 May 2012 Posts: 697 Location: Hamilton, Lanarkshire, Scotland.
|
Posted: Fri Apr 05, 2019 10:22 am Post subject: |
|
|
John,
Afraid this got put on the back burner last year when I was involved in some work for an arbitration - which consumed all my time. Thereafter I decided it was time for a change in the direction of my career, so I gave up work at the end of 2018. Now have my own one man business up and running and I am presently focusing on new clients - with some success , but not yet found the time to come back to this, although I do have a long list of "what happens if" scenarios I need to test. I will get back to this - once the business clears the Director's Loan and pays me a dividend (July hopefully).
Ken |
|
Back to top |
|
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2556 Location: Sydney
|
Posted: Sun Aug 30, 2020 11:40 am Post subject: |
|
|
Paul, Ken and others,
I have posted a multi-thread example using !$OMP in http://forums.silverfrost.com/viewtopic.php?t=4297&start=15
I am wondering how well this may be reproduced in the FTN95 parallel processing approach.
The basic !$OMP PARALLEL DO loop approach is a minimal approach for doing parallel processing. The code example is: Code: | call omp_set_num_threads (4)
!
!$OMP PARALLEL DO &
!$OMP& SHARED ( block_array_records, max_blocks ) &
!$OMP& PRIVATE ( i ) &
!$OMP& SCHEDULE (DYNAMIC)
do i = 1, max_blocks
if ( block_array_records(i)%block_size <= 0 ) cycle
!
call process_block ( i, block_array_records(i)%block_size, block_array_records(i)%block )
!
end do ! i
!$OMP END PARALLEL DO |
In this approach, a DO loop is processed using multiple threads.
'i' : the DO loop index, is a special private variable, unique to each thread/process, so that i, the DO index, has a different memory address for each thread/process. I have struggled with this in my FTN95 testing.
'max_blocks' defines the loop count, so a variable defines the number of thread events to be processed, while 'call omp_set_num_threads (4)' defines the number of threads that process these events.
To package each event, the task is processed through 'call process_block'
An important distinction in OpenMP is between SHARED and PRIVATE variables. This could be managed in FTN95 and OpenMP via call process_block by having all shared variables/arrays as arguments to the routine, while all private variables/arrays are declared as local in the called routine (except for Private "i"). Can FTN95 allow a general routine like process_block, with flexibility in the arguments.
Returned values can be either in the shared arrays or as a shared accumulator. !$OMP& REDUCTION(+ : n_dot) accumulation could be emulated via an argument array n_dot(Max_blocks) to return values and sum after the end of the loop.
The argument "block_array_records(i)%block_size" provides a unique address to the routine, using "i".
While processing, we need to know both the loop counter 'i' and the thread id 'id = omp_get_thread_num ()'
The allocation of threads to each loop iteration is via loop "!$OMP& SCHEDULE (DYNAMIC)"; in this, the next iteration "i" is allocated to the next available thread, when they become available. SCHEDULE (STATIC) is an alternative where each loop iteration "i" has a pre-defined thread "id". These two alternative thread allocation cases would be necessary for load management between threads.
Are you able to comment on how some of these approaches are available or may be available in multi-core processors with 64 bit FTN95. |
|
Back to top |
|
|
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7933 Location: Salford, UK
|
Posted: Sun Aug 30, 2020 2:08 pm Post subject: |
|
|
John
Sorry but my knowledge of this subject is very limited. |
|
Back to top |
|
|
Kenneth_Smith
Joined: 18 May 2012 Posts: 697 Location: Hamilton, Lanarkshire, Scotland.
|
Posted: Mon Aug 31, 2020 9:57 am Post subject: |
|
|
John, this is a variation on one of the examples for the parallel processing approach. Unlike Gfortran, with FTN95 you cannot simply define a section of code to be executed in parallel. So all serial code prior to the parallel section must be within the IF( .not. IsSlaveProcess@()) THEN ...... END IF block.
It took me ages to get this example to work this way, and then I went off to do something else and never came back to it.
Code: | program main
! Demonstration of FTN95 multi-process parallelism: np processes
! cooperatively sum the integers 1..10^10 in an interleaved fashion,
! each writing its partial sum into a named shared array.
implicit none
INCLUDE <windows.ins>
DOUBLE PRECISION start_time,end_time,sum
double precision duration, sum1
DOUBLE PRECISION,allocatable::partial_answer(:)
! NOTE(review): FTN95 kind numbering is assumed here, where kind=3 is
! 32-bit and kind=4 is 64-bit -- confirm; k must hold 10^10.
INTEGER(kind=4) ID
INTEGER(kind=4) k
integer(kind=4) :: np=4, i, j
!>> TEST TO FIND MAIN PROCESS. Note if IF/ENDIF is commented out, the subroutine is called NP times
IF( .not. IsSlaveProcess@()) THEN
call set_parameters(np)
ENDIF
!>> Start np-1 additional tasks. ID will be returned thus:
!>> Master task ID=0
!>> Slave task ID=1,2,3 in the different processes
ID=GetParallelTaskID@(np-1) !##
IF(ID .eq. 0) print*, 'Number of processors', np
!>> Allocate a shared array. SHARENAME="shared_stuff" names the memory
!>> block shared between the master and slave processes.
ALLOCATE(partial_answer(np),SHARENAME="shared_stuff")
CALL TaskSynchronise@()
!>> Time the task using wall clock elapsed time
CALL dclock@(start_time)
sum=0d0
!>> All np processes compute the sum in an interleaved fashion:
!>> process ID sums 10^10-ID, 10^10-ID-np, ... down to 1.
k = 10000000000_4 - ID
WHILE(k > 0)DO
sum = sum + k
k = k - np
ENDWHILE
!>> Copy the partial sum into the array shared between the processes
partial_answer(ID+1)=sum
CALL TaskSynchronise@()
CALL dclock@(end_time)
IF(ID==0)THEN
!>> We are the master task, so print out the results and the timing
sum1 = 0.d0
do i = 1, np
sum1 = sum1 + partial_answer(i)
end do
PRINT *,"Sum=",sum1
duration=end_time-start_time
PRINT *,"Parallel computation time = ",duration
ENDIF
CALL TaskSynchronise@()
!>> Kill off the slave process
IF(ID .ne. 0) STOP
DEALLOCATE(partial_answer)
END PROGRAM
! Prompt on stdout and read from stdin until a valid (positive)
! processor count is supplied.
!   np (out) : number of processes to use, guaranteed >= 1 on return.
! Fixes: the original ended with "end set_parameters", which is not
! valid Fortran (the "subroutine" keyword is required when a name is
! given); the labeled goto is replaced by a structured loop.
subroutine set_parameters(np)
implicit none
integer(kind=4), intent(out) :: np
do
  write(6,*)
  write(6,*) 'Enter number of processors to use'
  read(5,*) np
  if (np .ge. 1) exit
end do
end subroutine set_parameters |
|
|
Back to top |
|
|
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
Powered by phpBB © 2001, 2005 phpBB Group
|