forums.silverfrost.com Welcome to the Silverfrost forums
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7932 Location: Salford, UK
|
Posted: Fri May 10, 2013 5:47 am Post subject: |
|
|
I have not had the opportunity to look at this subject but I am hopeful that I will be able to include all of Jalih's good work in salflibc.dll. |
jalih
Joined: 30 Jul 2012 Posts: 196
|
Posted: Fri May 10, 2013 8:30 am Post subject: Re: |
|
|
DanRRight wrote: "I hope that in future everything you have done could be rewritten either to allow all definitions to be directly in the Fortran code or in C for SCC, because no matter how great a speedup is achieved, the ability to debug efficiently is more important."
I haven't really played around much with the SDBG debugger, but I think it doesn't support debugging of multiple threads.
As my DLL is just a wrapper for native Win32 API functions, it should not matter what language it is written in.
DanRRight
Joined: 10 Mar 2008 Posts: 2826 Location: South Pole, Antarctica
|
Posted: Fri May 10, 2013 11:12 am Post subject: |
|
|
Jalih, what inherently prevents SDBG from debugging in an initially defined thread? If I set a specific breakpoint condition in a specific thread, shouldn't the debugger stop at that condition and display whatever you want? Here, for example, you tell the debugger to stop in thread #3 when i=1000.
Code: |
!...ptr is the thread handle number
do i=1,200000000/8,1
   d=alog(exp(d))
   ! the branch below is taken only in thread 3 at i=1000: set the breakpoint here
   if(i.eq.1000.and.ptr.eq.3) then
      i2=i+1
   endif
end do
|
Let the other threads do whatever they want at the moment the debugger interrupts, and even finish the run (or the debugger could stop the other threads too, whatever). Right now something is preventing the debugger from displaying debug information.
A debugger is an absolutely necessary thing. Without it you can only debug by printing from the code, and you can write only small, well-structured subprograms. And the programmer must be well organized (that's not me, I make 3 errors and a dozen typos per line). If someone else touches it, the whole code is dead and you will never find where and why. But if the debugger were able to report a thread conflict when two threads write to the same variable, that would be the best help ever; writing multithreaded code would be super easy.
My initial experience with parallelizing real code? Right now I'm at the end of my second 16-hour day rebuilding a relatively small subroutine of 2K lines (the postprocessor part of a large code) into parallel code. It feels like I am in the middle of a dark forest, with hope constantly leaving me and coming back. I'm placing debug prints and planting fake code to find which variable is still not local, so that it conflicts and freezes the code. Thousands of freezes have been resolved by such blind carpet bombing and I'm just halfway done. I would not even have started if this were not my own code, in which I know each and every letter. But if the debugger worked in threads, or at least told which variables are conflicting, I'd do that rebuild in 2 hours.
jalih
Joined: 30 Jul 2012 Posts: 196
|
Posted: Sat May 11, 2013 8:37 am Post subject: Re: |
|
|
DanRRight wrote: "Jalih, what inherently prevents SDBG from debugging in an initially defined thread? If I set a specific breakpoint condition in a specific thread, shouldn't the debugger stop at that condition and display whatever you want?"
The problem is that SDBG knows nothing about the threads. Remember, all threads in the process share memory, but each has its own stack.
Maybe trace buffer support could be added to SDBG to provide a log of the thread events? A simple example and more information are available here.
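The trace-buffer idea can be sketched in plain Fortran, independent of SDBG. This is only an illustration under stated assumptions: the module name trace_buffer, the routines trace_add and trace_dump, and the buffer size are hypothetical and are not part of SDBG, FTN95, or mt.dll. The buffer is not thread-safe on its own, so in threaded code each trace_add call would have to be made while holding a mutex.

```fortran
module trace_buffer
  implicit none
  integer, parameter :: trace_size = 8      ! capacity (arbitrary, hypothetical)
  integer :: trace_thread(trace_size) = 0   ! thread that logged each event
  integer :: trace_event(trace_size)  = 0   ! user-defined event code
  integer :: trace_count = 0                ! total number of events logged so far
contains
  ! Record one event, overwriting the oldest entry once the buffer is full.
  ! Not thread-safe: call it only while holding a lock in threaded code.
  subroutine trace_add(thread_id, event)
    integer, intent(in) :: thread_id, event
    integer :: slot
    slot = mod(trace_count, trace_size) + 1
    trace_thread(slot) = thread_id
    trace_event(slot)  = event
    trace_count = trace_count + 1
  end subroutine trace_add

  ! Print the events still held in the buffer, oldest first.
  subroutine trace_dump()
    integer :: k, first, slot
    first = max(0, trace_count - trace_size)
    do k = first, trace_count - 1
      slot = mod(k, trace_size) + 1
      write(*,'(a,i0,a,i0,a,i0)') 'event #', k + 1, &
           ': thread ', trace_thread(slot), ', code ', trace_event(slot)
    end do
  end subroutine trace_dump
end module trace_buffer

program trace_demo
  use trace_buffer
  implicit none
  integer :: i
  ! Log 10 events into an 8-slot buffer: the two oldest are overwritten.
  do i = 1, 10
    call trace_add(mod(i, 3) + 1, 100 + i)
  end do
  call trace_dump()
end program trace_demo
```

Because only the most recent entries survive, such a log can be dumped after a freeze or crash to see which thread reached which point last, without flooding the console with prints.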
DanRRight
Joined: 10 Mar 2008 Posts: 2826 Location: South Pole, Antarctica
|
Posted: Sat May 11, 2013 1:51 pm Post subject: |
|
|
I was fighting all day and night today to find the reason for a freeze and crash inside a subroutine that is called by the main thread subroutine. The whole subroutine is completely legitimate: all variables are defined, all written variables are local, they do not use common blocks, and their values are passed back to the thread via the dummy argument list. Any suggestions for possible reasons why calling one more subroutine could cause a crash? For example, can even reading the same array variables (which are in a common block) by two competing threads cause a conflict and freeze? Or does FTN95 lack a /multi_threading switch like the one .NET has to make a thread-safe environment?
Anyway, I cornered the conflict causing the freeze to a single line, with all variables printed before the crash absolutely OK and local. Still I get the crash. Yes, the code is bouncing near an unsafe place by using an enlarged stack and /3GB, which could also be the reason.
My enthusiasm is still high (I hope the FTN95 developers will help), but right now things are not great after finding a couple of deadly pitfalls. Do not try to parallelize anything larger than a few tens of lines. The worst thing is that the method currently basically kills the debugger, and that means that for large codes the game is over. Parallelizing numerical methods is the most that is worth doing right now.
A real application code needs a complete rebuild and is not parallelizable, because anything may freeze the code and finding where becomes an impossible task. The FTN95 developers are the first who need to go through the whole process from beginning to end. The debugger now has to show local and global variables separately. The most important addition to a parallel-thread debugger would be revealing "writing to the same global variable" conflicts, pointing exactly at the spot in the Fortran code. The code has become too fragile. Any variable not set causes the code to freeze. That is what is called "bewitched". LOL. The debugger does not provide any information besides useless assembler text. Since SDBG is not working, the debugging process looks like an exact copy of how people debugged in the 1970s: 10000 recompilations, printing everything to the display line by line, cutting the code by 2, 4, 8 until the error is found, etc. The worst part is that after all legitimate reasons for a crash were sorted out, the code still crashes in some places for no reason.
The good news is that we are gaining experience in the parallel programming we are all moving to.
DanRRight
Joined: 10 Mar 2008 Posts: 2826 Location: South Pole, Antarctica
|
Posted: Sun May 12, 2013 1:56 pm Post subject: |
|
|
Well... after a crazy week of non-stop debugging I give up until better times. The work was so absurdly intense that on Saturday evening I found, with big surprise, that it was not Thursday LOL
It's impossible to find the source of the conflict or violation. Probably some undefined variable exists, or threads write to the same variable, and there is almost no way to find WHERE without a debugger... The method disables the debugger entirely, in all places and not just in threads, which is unacceptable for large codes and kills all the potential benefits of large speedups.
It's time for the developers to step in.
DanRRight
Joined: 10 Mar 2008 Posts: 2826 Location: South Pole, Antarctica
|
Posted: Tue May 14, 2013 6:29 pm Post subject: |
|
|
I made one step further (these large speedups are very appealing!) and found the reason for the crash in one specific place. It is a mystery to me. It has followed me for 25 years. Basically, if the code has exp(-A) with A more than 50-70, which causes underflow, then this may cause an access violation. I cannot confirm that with simple code because it usually works OK, but that's definitely the place. When I added the restriction A=min(A,50.) the error was gone. Is it possible that the processor issues an interrupt reporting the underflow, which is ignored by the compiler (as it generally should be), but when two interrupts are issued at the same time by two threads this crashes the whole code?
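To see where exp(-A) actually leaves the normal real*4 range on a given machine, a small single-threaded check like the following may help. It is a sketch, not part of the original test: the program name underflow_demo and the sampled values of A are arbitrary choices, and it assumes IEEE 754 single precision, where tiny(1.0) is about 1.18E-38 and the threshold -log(tiny(1.0)) is therefore about 87.3. The clamp mirrors the A=min(A,50.) workaround.

```fortran
program underflow_demo
  implicit none
  real :: a, limit
  integer :: i
  ! Largest A for which exp(-A) is still a normal single-precision number.
  limit = -log(tiny(1.0))
  write(*,*) 'tiny(1.0)              =', tiny(1.0)
  write(*,*) 'exp(-A) is normal up to A =', limit
  do i = 40, 100, 20
    a = real(i)
    ! Compare the raw exponential with the clamped version of the workaround.
    write(*,*) 'A =', a, ' exp(-A) =', exp(-a), &
               ' clamped =', exp(-min(a, 50.))
  end do
end program underflow_demo
```

If the raw value for the largest A prints as a denormal number or zero while the clamped column stays well inside the normal range, the clamp is indeed keeping the calculation away from underflow.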
Last edited by DanRRight on Thu May 16, 2013 8:52 am; edited 2 times in total |
DanRRight
Joined: 10 Mar 2008 Posts: 2826 Location: South Pole, Antarctica
|
Posted: Wed May 15, 2013 1:06 pm Post subject: |
|
|
Paul, yes, I made a simple demo code, and it seems the reason is as suspected above. Here is that old roach which was hiding under the rock for 25 years, crashing this and even the FTN77 compiler. The code freezes the threads or not depending on which of the two marked lines is used. This sometimes crashed even regular single-threaded code, causing me to lose weeks; see this older post:
http://forums.silverfrost.com/viewtopic.php?t=2291&highlight=freezes
Compilation
>FTN95 mt_cwin.f95
>slink mt_cwin.obj mt.dll
You will need Jalih's mt.dll from the links above. I keep Jalih's original Fortran text just as a multithreading example, so the DO loops here that were used for speed testing are not needed. I also decreased the single-threaded DO range by a factor of 10, which is likewise irrelevant to demonstrating this bug. The code takes exp(-A) one value at a time until A becomes too large, around 70, and exp(-A) becomes too small compared to the real*4 machine zero, which is somewhere below 10**(-37). The co-processor in this case must issue an underflow interrupt, and potentially the hardware logic that handles it is either slow or buggy (in the IBM PC XT that my friends and I built on our knees in the 80s out of copycat TTL parts, including processors from some European country, it was even absent, always crashing the computer on any FP error or underflow... LOL), or it can only handle one core, and when two co-processors issue interrupts the whole thing crashes.
Code: |
module test
  INCLUDE <windows.ins>
  ! Wrappers exported by Jalih's mt.dll
  STDCALL attach_thread 'attach_thread' (REF, VAL):integer*4
  STDCALL wait_object 'wait_object' (VAL):integer*4
  STDCALL check_object 'check_object' (VAL):integer*4
  STDCALL close_handle 'close_handle' (VAL):integer*4
  STDCALL create_mutex 'create_mutex' (VAL):integer*4
  STDCALL release_mutex 'release_mutex' (VAL):integer*4
  integer :: hMutex
  integer :: values(8) = (/1,2,3,4,5,6,7,8/)
  real AAA(100)
contains
  subroutine thread(ptr)
    integer :: ptr, i
    real :: d, underexp
    ! Serialize the console output between threads
    i = wait_object(hMutex)
    write(*,*) 'Starting calculation in thread', ptr
    i = release_mutex(hMutex)
    do i=1,100
      underexp = aaa(i)               ! <--- bug
      ! underexp = min(50.,aaa(i))    ! <--- works
      d = exp(-underexp)
    enddo
    d = 2.22
    ! Speed-test loop (was "do i=i,...", which started from i=101)
    do i=1,200000000/8,1
      d=alog(exp(d))
    end do
    call ExitThread(0)
  end subroutine thread
end module test

winapp
program mt
  use test
  implicit none
  integer :: i, x
  real :: start, finish, d
  integer :: thandle(8)
  ! Fill the shared array: exp(-AAA(i)) underflows for the larger i
  do i=1,100
    AAA(i)=i+0.1
  enddo
  write(*,*) 'Single threaded test'
  call clock(start)
  d = 2.22
  do i=1,20000000 ! 0
    d=alog(exp(d))
  end do
  call clock(finish)
  write(*,*) 'Total time in seconds:', finish-start
  write(*,*) 'Multithreading test'
  hMutex = create_mutex(0)
  call clock(start)
  do i=1,8,1
    thandle(i) = attach_thread(thread,loc(values(i)))
  end do
  ! Poll each thread handle until the thread has finished
  do i=1,8,1
10  call temporary_yield@()
    x = check_object(thandle(i))
    if (x /= 0) goto 10
  end do
  call clock(finish)
  x = close_handle(hMutex)
  write(*,*) 'All done. Bye!'
  write(*,*) 'Total time in seconds:', finish-start
end program mt
|
Last edited by DanRRight on Thu May 16, 2013 8:50 am; edited 6 times in total |
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7932 Location: Salford, UK
|
Posted: Thu May 16, 2013 7:19 am Post subject: |
|
|
I have logged this for investigation. |
DanRRight
Joined: 10 Mar 2008 Posts: 2826 Location: South Pole, Antarctica
|
Posted: Fri May 17, 2013 8:03 pm Post subject: |
|
|
Finally I finished parallelizing my now even more bewitched code. The worst were the problems with the exponent as I described above, then finding which variables are local and which are global, and debugging by printouts from the code. The tremendous difficulty of debugging parallel code without a debugger is mainly that errors which take place in one spot (and you get information about them only through printouts, a Neanderthal method of debugging) will confuse you and usually get reported in another. That will go on until you plant literally hundreds of prints inside the code. Then, after crazy weeks of such debugging, the wild horse is yours.
So this approach works, thanks to Jalih; he has done a pretty good job, with everything operating reliably. Now it's time to adopt it into FTN95 in the most logical way and raise it to the next level. For example:
- merging some syntax with .NET by replacing wait_object(ihMutex)/release_mutex(ihMutex) with LOCK/UNLOCK,
- taking the C WinAPI definitions,
- allowing debugging in threads,
- making the debugger point to the exact place of a thread collision/conflict or access violation in the Fortran source code,
- separating the debug windows for local and global variables (or marking them by color or in some other way).
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7932 Location: Salford, UK
|
Posted: Tue Jul 02, 2013 12:03 pm Post subject: |
|
|
Jalih
Is the code for attach_thread available for inspection?
I confess that I am missing a trick here somewhere.
Paul
p.s. Problem sorted. |