forums.silverfrost.com Forum Index forums.silverfrost.com
New Topic "NET"
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 5046
Location: Salford, UK

Posted: Fri May 10, 2013 5:47 am

I have not had the opportunity to look at this subject but I am hopeful that I will be able to include all of Jalih's good work in salflibc.dll.
jalih



Joined: 30 Jul 2012
Posts: 189

Posted: Fri May 10, 2013 8:30 am

DanRRight wrote:
I hope that in future everything you have done could be rewritten either to allow all definitions to be directly in the Fortran code or in C for SCC, because no matter how great a speedup is achieved, the ability to debug efficiently is more important.


I haven't really played around much with the SDBG debugger, but I think it doesn't support debugging multiple threads.

As my DLL is just a wrapper for native Win32 API functions, it should not matter what language it is written in.
DanRRight



Joined: 10 Mar 2008
Posts: 1649
Location: South Pole, Antarctica

Posted: Fri May 10, 2013 11:12 am

Jalih, what inherently prevents SDBG from debugging in the initially defined thread? If you set a specific breakpoint condition in a specific thread, shouldn't the debugger stop at that condition and display whatever you want? Here, for example, you tell the debugger to stop in thread #3 when i=1000.

Code:


! ptr is the thread number

   do i=1,200000000/8,1
     d=alog(exp(d))
     if(i.eq.1000.and.ptr.eq.3) then
       i2=i+1      ! dummy statement: a place to set the breakpoint
     endif
   end do


Let the other threads do whatever they want while the debugger is stopped, even finish the run (or the debugger could stop the other threads too, whatever). Right now something is preventing the debugger from displaying debug information.

A debugger is an absolutely necessary thing. Without it you can only debug by printing from the code, and you can only write small, well-structured subprograms. And the programmer must be well organized (that's not me, I make 3 errors and a dozen typos per line). If someone else touches it, the whole code is dead and you will never find where and why. But if the debugger were able to report thread conflicts, when two threads write to the same variable, that would be the best help ever; writing multithreaded code would be super easy.

My initial experience with parallelizing real code? Right now I'm at the end of my second 16-hour day rebuilding a relatively small subroutine of 2K lines (the postprocessor part of a large code) into parallel code. It feels like I am in the middle of a dark forest, with hope constantly leaving me and returning. I'm placing debug prints and planting fake code to find which variable is still not local and so causes a conflict that freezes the code. Thousands of freezes have been resolved by such blind carpet bombing and I'm only halfway done. I would not even have started if this were not my own code, in which I know each and every letter. But if the debugger worked in threads, or at least told me which variables are conflicting, I'd do that rebuild in 2 hours.
jalih



Joined: 30 Jul 2012
Posts: 189

Posted: Sat May 11, 2013 8:37 am

DanRRight wrote:
Jalih, what inherently prevents SDBG from debugging in the initially defined thread? If you set a specific breakpoint condition in a specific thread, shouldn't the debugger stop at that condition and display whatever you want?

The problem is that SDBG knows nothing about the threads. Remember that all threads in a process share memory, but each has its own stack.

Maybe trace buffer support could be added to SDBG to provide a log of thread events? A simple example and more information are available here.
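
To make the trace buffer idea concrete, here is a minimal sketch in plain Fortran of a fixed-size ring buffer that records (thread number, event code) pairs. It is not part of SDBG or mt.dll; the module name `trace_buffer`, the routines `trace_event`, `trace_count` and `trace_dump`, and the capacity of 1024 are all assumptions made for illustration. The buffer itself is not thread-safe, so in threaded code each call would have to be made while holding a mutex.

```fortran
! Hypothetical trace buffer: every name here is illustrative,
! not part of SDBG, FTN95 or mt.dll.
module trace_buffer
  implicit none
  private
  public :: trace_event, trace_count, trace_dump

  integer, parameter :: capacity = 1024     ! number of records kept
  integer :: thread_ids(capacity) = 0
  integer :: event_codes(capacity) = 0
  integer :: total = 0                      ! events recorded so far

contains

  ! Record one event; once the buffer is full the oldest record is overwritten.
  ! Not thread-safe by itself: the caller must hold a mutex around this call.
  subroutine trace_event(thread_id, event_code)
    integer, intent(in) :: thread_id, event_code
    integer :: slot
    slot = mod(total, capacity) + 1
    thread_ids(slot) = thread_id
    event_codes(slot) = event_code
    total = total + 1
  end subroutine trace_event

  ! Number of events currently held in the buffer.
  integer function trace_count()
    trace_count = min(total, capacity)
  end function trace_count

  ! Print the retained events, oldest first.
  subroutine trace_dump()
    integer :: k, first, slot
    first = max(0, total - capacity)
    do k = first, total - 1
      slot = mod(k, capacity) + 1
      write(*,'(a,i0,a,i0,a,i0)') 'event ', k + 1, ': thread ', &
            thread_ids(slot), ', code ', event_codes(slot)
    end do
  end subroutine trace_dump

end module trace_buffer
```

A thread routine could call `trace_event(ptr, code)` between its mutex acquire and release calls, and the main program could call `trace_dump()` after all threads have finished, which gives an ordered log of what each thread reached without a debugger.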
DanRRight



Joined: 10 Mar 2008
Posts: 1649
Location: South Pole, Antarctica

Posted: Sat May 11, 2013 1:51 pm

I was fighting all day and night today to find the reason for a code freeze and crash inside a subroutine which is called by the main thread subroutine. The whole code of this subroutine is completely legitimate: all variables are defined, all written variables are local, and they do not use common blocks, passing their values back to the thread via the dummy argument list. Any suggestions for possible reasons why calling one more subroutine could cause a crash? For example, can even reading the same array variables (which are in a common block) by two competing threads cause a conflict and freeze? Or does FTN95 lack a key similar to the /multi_threading one that NET has to make a thread-safe environment?
Anyway, I cornered the conflict causing the freeze to a single line, with all variables printed before the crash absolutely OK and local. Still I get the crash. Yes, the code is bouncing near an unsafe place by using an enlarged stack and /3GB, which could also be the reason.

My enthusiasm is still high (I hope the FTN95 developers will help), but right now things are not great after finding a couple of deadly pitfalls. Do not try to parallelize anything larger than a few tens of lines. The worst thing is that the method right now basically kills the debugger, and that means that for large codes the game is over. Parallelizing numerical methods is the most that is worth doing right now.

The real application code needs a complete rebuild and is not parallelizable, because anything may freeze the code and finding where becomes an impossible task. The FTN95 developers are the first who need to go through the whole process from beginning to end. The debugger now has to show local and global variables separately. The most important addition to a parallel-thread debugger would be revealing "writing to the same global variable" conflicts, pointing at the exact spot in the Fortran code. The code has now become too fragile. Any variable that is not set causes the code to freeze. That is what is called "bewitched". LOL The debugger does not provide any information besides useless assembler text. Since SDBG is not working, the debugging process looks like an exact copy of how people were debugging in the 1970s: 10000 recompilations, line-by-line printing of everything to the display, cutting the code by 2, 4, 8 until the error is found, etc. The worst part is that after all legitimate reasons for a crash were sorted out, the code still crashes in some places for no reason.
The good news is that we are gaining experience in parallel programming, which we are all moving to.
DanRRight



Joined: 10 Mar 2008
Posts: 1649
Location: South Pole, Antarctica

Posted: Sun May 12, 2013 1:56 pm

Well... after a crazy week of non-stop debugging I give up until better times. The work was so absurdly intense that on Saturday evening I found, to my big surprise, that it was not Thursday LOL

It's impossible to find the source of the conflict. Probably some undefined variable exists, or threads write to the same variable, and there is almost no way to find WHERE without a debugger... The method disables the debugger entirely, everywhere and not just in threads, which is unacceptable for large codes and kills all potential benefits of large speedups.

It's time for the developers to step in.
DanRRight



Joined: 10 Mar 2008
Posts: 1649
Location: South Pole, Antarctica

Posted: Tue May 14, 2013 6:29 pm

I made one step further (these large speedups are very appealing!), finding the reason for the crash in one specific place. This is a mystery to me. It has followed me for 25 years. Basically, if the code has exp(-A) with A greater than 50-70, which causes underflow, then this may cause an access violation. I cannot confirm that with simple code because it usually works OK, but that's definitely the place. When I added the restriction A=min(A,50.) the error was gone. Is it possible that the processor issues an interrupt reporting underflow, which is ignored by the compiler (as it generally should be), but when two interrupts are issued at the same time by two threads this crashes the whole code?
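
The A=min(A,50.) workaround can be wrapped in a small helper so that it is applied the same way everywhere. The following is only a sketch: the module name `safe_exp`, the function name `exp_neg_clamped` and the optional `limit` argument are made up for illustration and are not part of FTN95. For reference, in IEEE single precision exp(-A) only drops below the smallest normal value (about 1.2e-38) once A exceeds roughly 87, so a limit of 50 leaves a wide margin.

```fortran
! Illustrative helper; the names and the default limit of 50.0 are
! assumptions taken from the workaround in the post, not FTN95 library code.
module safe_exp
  implicit none
  private
  public :: exp_neg_clamped

contains

  ! Return exp(-min(a, limit)) so the argument never grows large enough
  ! to underflow. With limit = 50.0 the smallest possible result is
  ! exp(-50), roughly 1.9e-22.
  elemental real function exp_neg_clamped(a, limit)
    real, intent(in) :: a
    real, intent(in), optional :: limit
    real :: cap
    cap = 50.0
    if (present(limit)) cap = limit
    exp_neg_clamped = exp(-min(a, cap))
  end function exp_neg_clamped

end module safe_exp
```

Note that clamping changes the numerical result for large A (it returns exp(-limit) instead of a value near zero), so it is only appropriate where such tiny terms are negligible anyway.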

Last edited by DanRRight on Thu May 16, 2013 8:52 am; edited 2 times in total
DanRRight



Joined: 10 Mar 2008
Posts: 1649
Location: South Pole, Antarctica

Posted: Wed May 15, 2013 1:06 pm

Paul, yes, I made a simple demo code, and it seems the reason is as suspected above. Here is that old roach which was hiding under the rock for 25 years, crashing this and even the FTN77 compiler. The code freezes the threads or not depending on which of the two marked lines is used. This sometimes crashed even regular single-threaded code, causing me to lose weeks; see this older post:

http://forums.silverfrost.com/viewtopic.php?t=2291&highlight=freezes

Compilation
>FTN95 mt_cwin.f95
>slink mt_cwin.obj mt.dll

You will need Jalih's mt.dll from the links above. I kept Jalih's original Fortran text just as a multithreading example, so the DO loops used here for speed testing are not needed. I also decreased the single-threaded DO range by a factor of 10, which is likewise irrelevant to demonstrating this bug. The code takes exp(-A) one by one until A becomes too large, around 70, and exp(-A) too small compared to the real*4 machine zero, which is somewhere below 10**(-37). The co-processor in this case must issue an underflow interrupt, and potentially the hardware logic which handles it is either slow or buggy, or can only handle one core, so that when two co-processors issue interrupts the whole thing crashes. (In the IBM PC XT clone that my friends and I built on our knees in the 80s out of copycat TTL parts, including processors from some European country, that logic was absent altogether, so the computer always crashed on any FP error or underflow... LOL)

Code:

module test
  INCLUDE <windows.ins>
  STDCALL attach_thread 'attach_thread' (REF, VAL):integer*4
  STDCALL wait_object 'wait_object' (VAL):integer*4
  STDCALL check_object 'check_object' (VAL):integer*4
  STDCALL close_handle 'close_handle' (VAL):integer*4
  STDCALL create_mutex 'create_mutex' (VAL):integer*4
  STDCALL release_mutex 'release_mutex' (VAL):integer*4

 
  integer :: hMutex
  integer :: values(8) = (/1,2,3,4,5,6,7,8/)

  real AAA(100)   

  contains
    subroutine thread(ptr)
      integer :: ptr, i
      real d


      i = wait_object(hMutex)
      write(*,*) 'Starting calculation in thread', ptr
      i=release_mutex(hMutex)
     
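      ! Take exp(-A) for A = 1.1 ... 100.1; the largest arguments underflow in real*4
      do i=1,100
        underexp = aaa(i)             !  <--- bug
!       underexp = min(50.,aaa(i))    !  <--- works
        d = exp(-underexp)
      enddo
      ! Timing loop kept from Jalih's original example
      d = 2.22
      do i=1,200000000/8,1
        d=alog(exp(d))
      end do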
     
      call ExitThread(0)
    end subroutine thread

end module test

winapp
program mt
  use test
  implicit none

  integer :: i, x
  real :: start, finish, d
  integer :: thandle(8)


  ! Fill the exponent arguments: 1.1, 2.1, ..., 100.1
  do i=1,100
    AAA(i)=i+0.1
  enddo

  write(*,*) 'Single threaded test'
  call clock(start)

  d = 2.22
  do i=1,20000000   ! range reduced by a factor of 10 from the original
    d=alog(exp(d))
  end do

  call clock(finish)
  write(*,*) 'Total time in seconds:', finish-start

  write(*,*) 'Multithreading test'
 
  hMutex = create_mutex(0)

  call clock(start)
 
  do i=1,8,1
    thandle(i) = attach_thread(thread,loc(values(i)))
  end do

  do i=1,8,1
    10 call temporary_yield@()
    x = check_object(thandle(i))
    if (x /= 0) goto 10
  end do

  call clock(finish)
 
  x = close_handle(hMutex)
 
  write(*,*) 'All done. Bye!'
  write(*,*) 'Total time in seconds:', finish-start
end program mt


Last edited by DanRRight on Thu May 16, 2013 8:50 am; edited 6 times in total
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 5046
Location: Salford, UK

Posted: Thu May 16, 2013 7:19 am

I have logged this for investigation.
DanRRight



Joined: 10 Mar 2008
Posts: 1649
Location: South Pole, Antarctica

Posted: Fri May 17, 2013 8:03 pm

Finally I finished parallelizing my now even more bewitched code. The worst were the problems with the exponent, as I described above, then finding which variables are local and which global, and debugging by printouts from the code. The tremendous difficulty of debugging parallel code without a debugger is mainly that errors which take place in one place will confuse you and usually get reported in another (and you only get information about them from printouts, the Neanderthal method of debugging). That will happen until you plant literally hundreds of prints inside the code. Then, after crazy weeks of such debugging, the wild horse is yours.

So this approach works. Thanks to Jalih, he has done a pretty good job, with everything operating reliably. Now it's time to adopt it into FTN95 in the most logical way and raise it to the next level. For example:
- merging some syntax with NET by replacing wait_object(ihMutex)/release_mutex(ihMutex) with LOCK/UNLOCK,
- taking the C WinAPI definitions,
- allowing debugging in threads,
- having the debugger point to the exact place of a thread collision/conflict access violation in the Fortran source code,
- separating the debug windows for local and global variables (or marking them by color or in some other way)
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 5046
Location: Salford, UK

Posted: Tue Jul 02, 2013 12:03 pm

Jalih

Is the code for attach_thread available for inspection?
I confess that I am missing a trick here somewhere.

Paul


p.s. Problem sorted.
All times are GMT + 1 Hour