DANGLING FORTRAN POINTER

 
PaulMather



Joined: 19 Sep 2017
Posts: 10
Location: Nottingham

Posted: Sat Nov 18, 2017 6:02 pm    Post subject: DANGLING FORTRAN POINTER

I get the run-time error "dangling Fortran pointer" in a section of code that mixes INTEGER*1 and character data. All arrays are initialised to zero and are allocated in the calling routine.

Case (1)

do i=0,2
  do j=0,windows(id)%npix-1
    do k=0,windows(id)%nl-1
      ! *** FAILS HERE
      images1(i,j,k)=char(windows(id)%lut(ichar(images1(i,j,k)),i))
    enddo
    call temporary_yield@
  enddo
enddo

I am not using any pointers.

I have a version of the program, compiled in 2015, that works.

Does anyone have an idea of what's wrong?
JohnCampbell



Joined: 16 Feb 2006
Posts: 1979
Location: Sydney

Posted: Sun Nov 19, 2017 2:18 am

Paul,

There is probably a compiler bug related to char or ichar, but to make it easier to isolate I would first break the failing statement into simpler steps.
I would also change the DO loop order so that memory is processed sequentially. For large images1 arrays this would be significant.

I assume you are transforming the colour palette, based on lut(0:255,0:2).
Although the following is more verbose, I am sure it would not have a performance penalty.
Code:
  type win_dim
     integer npix
     integer nl
     integer lut(0:255,0:2)  ! colour palette transformation ?
  end type win_dim
  type (win_dim) :: windows(2)

  integer, parameter :: mx = 500
  integer, parameter :: my = 300
  character images1(0:2,0:mx,0:my), ch, cc(0:2)

  integer id, i,j,k, ic,jc

  id = 1

  do k=0,windows(id)%nl-1
    do j=0,windows(id)%npix-1
! *** FAILS HERE images1(i,j,k)=char(windows(id)%lut(ichar(images1(i,j,k)),i))
      do i=0,2
        ch = images1(i,j,k)           
        ic = ichar(ch)
        jc = windows(id)%lut(ic,i)
        images1(i,j,k) = char(jc)
      end do
    end do
    call temporary_yield@
  end do

 end


I wonder whether it would help to change ch to cc(0:2), change lut to lut(0:2,0:255) and replace the inner loop with array syntax. As I am assuming LUT is a transformation lookup array, that may not be easy; perhaps use: call LUT_transform ( images1(:,j,k), id )
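As a rough sketch of what such a LUT_transform routine might look like: the module name lut_demo, the small image size and the "invert each band" palette are all assumptions made up for illustration, and lut keeps the (0:255,0:2) layout from the type above. It is not taken from Paul's program.

```fortran
module lut_demo
  implicit none
  type win_dim
     integer npix
     integer nl
     integer lut(0:255,0:2)        ! colour palette transformation (assumed meaning)
  end type win_dim
  type (win_dim) :: windows(2)
contains
  ! Hypothetical helper: apply the lookup table of window id to one pixel (3 bands)
  subroutine LUT_transform (pixel, id)
    character, intent(inout) :: pixel(0:2)
    integer,   intent(in)    :: id
    integer i
    do i = 0,2
      pixel(i) = char(windows(id)%lut(ichar(pixel(i)),i))
    end do
  end subroutine LUT_transform
end module lut_demo

program lut_transform_demo
  use lut_demo
  implicit none
  character images1(0:2,0:3,0:1)   ! tiny image: 3 bands, 4 pixels, 2 lines
  integer id, i, j, k

  id = 1
  windows(id)%npix = 4
  windows(id)%nl   = 2
  do i = 0,2                       ! made-up palette: invert each band
    do j = 0,255
      windows(id)%lut(j,i) = 255 - j
    end do
  end do
  images1 = char(10)

  do k = 0,windows(id)%nl-1        ! last subscript in the outermost loop
    do j = 0,windows(id)%npix-1
      call LUT_transform (images1(:,j,k), id)   ! the 3 bands are adjacent in memory
    end do
  end do
  write (*,*) 'band 0 of first pixel is now', ichar(images1(0,0,0))
end program lut_transform_demo
```

Passing images1(:,j,k) hands the routine three adjacent bytes, so the sequential memory access of the reordered loops is kept.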
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 5496
Location: Salford, UK

Posted: Mon Nov 20, 2017 10:01 am

Paul

Please supply details of the user-defined type in this code so that the fragment can be compiled and run on its own.
John-Silver



Joined: 30 Jul 2013
Posts: 913
Location: Aerospace Valley

Posted: Mon Nov 20, 2017 1:12 pm

I'm always confused by the term 'pointer' and usually avoid any discussion of it like the plague. But, attracted like a foolish lamb to the slaughter by the beguiling adjective 'dangling', I googled for a simple (yes, I'm forever the optimist) explanation of this terminology and dropped on this first time, which may help in pointing (sic) towards an explanation of the problem:

https://software.intel.com/en-us/articles/pointer-checker-to-detect-buffer-problems-and-dangling-pointers-part-2

Quote:
"A dangling pointer arises when you use the address of an object after its lifetime. This may occur in situations like returning addresses of the automatic variables from a function or using the address of the memory block after it is freed.
The dangling pointer manifests when a programmer erases the allocated object being pointed to by using runtime function free() or the delete() operator. Also, if the programmer erases or kills the pointer instance by setting it to NULL, there will be no issue because the program halts with segmentation fault if the NULL pointer is used. So, the freed pointer should always be set to NULL. If the developer forgets and keeps the pointer in the code creating a dangling pointer, which can be maliciously exploited, this leads to significant quality and safety issues."
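For a Fortran flavour of the quoted definition, here is a minimal sketch of how a pointer can be left dangling and then made safe again. It is only an illustration with made-up names (a, p); it is not taken from the Intel article or from Paul's program, which he says uses no pointers.

```fortran
program dangling_demo
  implicit none
  integer, allocatable, target :: a(:)
  integer, pointer             :: p(:)

  allocate ( a(3) )
  a = (/ 1, 2, 3 /)
  p => a                    ! p now refers to the storage of a
  write (*,*) 'associated with a before deallocate: ', associated(p, a)

  deallocate ( a )          ! the storage is released, so p is now dangling
  ! Referencing p(1) at this point would use memory after its lifetime.

  nullify ( p )             ! give p a defined, disassociated status
  write (*,*) 'associated after nullify:            ', associated(p)
end program dangling_demo
```

The first line printed should show T and the second F. Between the deallocate and the nullify the association status of p is undefined, which is the situation the quote describes.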

PaulMather



Joined: 19 Sep 2017
Posts: 10
Location: Nottingham

Posted: Mon Nov 20, 2017 3:18 pm    Post subject: Re: Dangling Fortran Pointers

Dear John, Paul
Thanks for answering my query. Yes, LUT is a colour table with 3 bands. I am trying to resurrect some Fortran code that I had running in 2015; now it won't work. I had suspected a compiler fault, but I am puzzled by the fact that the section of code that I quoted is now working and an image is written to the screen. Now it fails a few lines later with the dangling pointer error.

I will report back if your suggestions make any difference.

I was under the impression that the order in which arrays are accessed was important (if not fundamental) when data were being transferred from disc to physical memory in the days of virtual memory. Now the physical memory on my PC is 16Gb and there is presumably no penalty in accessing the contents in any order.

Many thanks,

Paul





JohnCampbell



Joined: 16 Feb 2006
Posts: 1979
Location: Sydney

Posted: Mon Nov 20, 2017 3:34 pm

Quote:
Now the physical memory on my PC is 16Gb and there is presumably no penalty in accessing the contents in any order.

There is now a penalty in transferring between memory and cache.
These cache transfers are managed in cache lines (typically 64 bytes on current x86 processors), so if you are addressing 1 byte, the whole line is transferred. If you step all over memory, there are a lot of cache-line transfers and most of each line goes unused.
Code:
do I = 1,l
  do j = 1,m
    do k = 1,n
      byte = g_array(k,j,i)  ! this is good sequential memory access for Fortran
      byte = b_array(I,j,k)  ! this is skipping all over memory
    end do
  end do
end do

You should try testing this for arrays of, say, 100 MB or more and time the two different access sequences.
Code:
    real*4, allocatable :: b_array(:,:,:)
    real*4, allocatable :: g_array(:,:,:)
!
    integer*4 :: l =   15
    integer*4 :: m = 3000
    integer*4 :: n = 3000
    integer*4 :: i,j,k
    real*4    :: byte
    real*4    :: elapse_sec, t1
   
    allocate ( g_array(n,m,l) )
    t1 = elapse_sec ()
    do I = 1,l
      do j = 1,m
        do k = 1,n
          g_array(k,j,i) = i+j+k ! this is good sequential memory access for Fortran
        end do
      end do
    end do
    byte = 0
    do I = 1,l
      do j = 1,m
        do k = 1,n
          byte = byte + g_array(k,j,i)  ! this is good sequential memory access for Fortran
        end do
      end do
    end do
    t1 = elapse_sec () - t1
    write (*,*) t1, ' good_array', byte
    deallocate ( g_array )
!
    allocate ( b_array(l,m,n) )
    t1 = elapse_sec ()
    do I = 1,l
      do j = 1,m
        do k = 1,n
          b_array(i,j,k) = i+j+k ! this is skipping all over memory 
        end do
      end do
    end do
    byte = 0
    do I = 1,l
      do j = 1,m
        do k = 1,n
          byte = byte + b_array(i,j,k)  ! this is skipping all over memory 
        end do
      end do
    end do
    t1 = elapse_sec () - t1
    write (*,*) t1, ' bad_array', byte
    deallocate ( b_array )
!
  end

  real*4 function elapse_sec ()
    integer*4 tick, rate
    call system_clock ( tick, rate )
    elapse_sec = real(tick) / real(rate)
  end function elapse_sec
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 5496
Location: Salford, UK

Posted: Mon Nov 20, 2017 4:11 pm

Paul

If you would like us to investigate and fix a potential bug then we would need a sample program from you that demonstrates the failure at runtime. As far as I can see John has not provided such a program - or am I missing something?
JohnCampbell



Joined: 16 Feb 2006
Posts: 1979
Location: Sydney

Posted: Mon Nov 20, 2017 11:46 pm

Paul L,
Yes, my first example does not address the compiler problem and the code is not complete, as the arrays are not defined.

Paul M,
You may want to consider the example below, where I have tested more options with character arrays, although the results are the same. I was looking at the effect of the relative sizes of the DO loops, but it made little difference.
Code:
    character*1, allocatable :: b_array(:,:,:)
    character*1, allocatable :: g_array(:,:,:)
!
    integer*4 :: i,j,k, l,m,n, test, big
    real*4    :: byte
    real*4    :: elapse_sec, t1

    big = 12000
    do test = 1,3

      select case (test)
        case (1)
          l = big ; m = big ; n = big ; l = 3
        case (2)
          l = big ; m = big ; n = big ; m = 3
        case (3)
          l = big ; m = big ; n = big ; n = 3
      end select

      allocate ( g_array(n,m,l) )
      t1 = elapse_sec ()
      do I = 1,l ; do j = 1,m ; do k = 1,n
         g_array(k,j,i) = char(mod(i+j+k,256))  ! char needs 0..255; this is good sequential memory access for Fortran
      end do     ; end do     ; end do
 
      byte = 0
      do I = 1,l ; do j = 1,m ; do k = 1,n
         byte = byte + ichar(g_array(k,j,i))  ! this is good sequential memory access for Fortran
      end do     ; end do     ; end do
      t1 = elapse_sec () - t1
      write (*,*) t1, ' good_array', byte
      deallocate ( g_array )

      allocate ( b_array(l,m,n) )
      t1 = elapse_sec ()
      do I = 1,l ; do j = 1,m ; do k = 1,n
         b_array(i,j,k) = char(mod(i+j+k,256))  ! char needs 0..255; this is skipping all over memory
      end do     ; end do     ; end do
 
      byte = 0
      do I = 1,l ; do j = 1,m ; do k = 1,n
         byte = byte + ichar(b_array(i,j,k))  ! this is skipping all over memory 
      end do     ; end do     ; end do
      t1 = elapse_sec () - t1
      write (*,*) t1, ' bad_array ', byte
      deallocate ( b_array )

    end do
!
  end

  real*4 function elapse_sec ()
    integer*4 tick, rate
      call system_clock ( tick, rate )
      elapse_sec = real(tick) / real(rate)
  end function elapse_sec
John-Silver



Joined: 30 Jul 2013
Posts: 913
Location: Aerospace Valley

Posted: Tue Nov 21, 2017 5:21 am

I ran JohnC's code and got a not insignificant, almost 5-fold, increase in speed on my lowly laptop.
I'd read about the need to reverse the intuitive looping order when I first read up about changes between F77 and F90/95.
Is there a good reason for doing this, or was it just being done wrong under the previous Fortran standard without anyone realising it?
JohnCampbell



Joined: 16 Feb 2006
Posts: 1979
Location: Sydney

Posted: Tue Nov 21, 2017 9:59 am    Post subject: Re:

John-Silver wrote:
I'd read about the need to reverse the intuitive looping order when I first read up about changes between F77 and F90/95.


I am not sure what you mean by "intuitive". The idea is to process memory sequentially, so the following is good:
do I = 1,l ; do j = 1,m ; do k = 1,n
byte = byte + ichar(g_array(k,j,i)) ! this is good sequential memory access for Fortran
end do ; end do ; end do

while the following is bad:
do I = 1,l ; do j = 1,m ; do k = 1,n
byte = byte + ichar(b_array(i,j,k)) ! this is skipping all over memory
end do ; end do ; end do

I suspect you are saying the following is intuitive
do I; do j; do k ; array(I,j,k)

.. but it is not; the same principle of sequential memory access applied to virtual memory then and applies to cached memory now. There is no change of approach from F77 on a mini to F90 on a cached multiprocessor.
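To make the storage order concrete, here is a small sketch (the array name and sizes are made up for illustration). RESHAPE fills an array in array element order, so after the assignment each element holds its own position in memory, and you can read off which subscript moves fastest.

```fortran
program storage_order
  implicit none
  integer :: a(2,3,4), n

  ! Fill a with 1..24 in memory order: each element holds its linear position
  a = reshape ( (/ (n, n=1,24) /), (/ 2,3,4 /) )

  write (*,*) 'a(1,1,1), a(2,1,1) =', a(1,1,1), a(2,1,1)   ! 1 and 2: first subscript is adjacent
  write (*,*) 'a(1,2,1)           =', a(1,2,1)             ! 3: second subscript strides by 2
  write (*,*) 'a(1,1,2)           =', a(1,1,2)             ! 7: third subscript strides by 2*3
end program storage_order
```

Stepping the first subscript moves one element at a time, while stepping the last moves a whole 2*3 plane, which is why the innermost loop should run over the first subscript.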

Other problems come when using multiple arrays which have different subscript orders. This can often occur when arrays are used for multiple phases of an analysis.

I have just been reading chapter 5 of "Using OpenMP" by Chapman, Jost & van der Pas, which discusses some of the issues of using a processor with cache. Unfortunately with typically 3 levels of cache their explanation is still a bit simplified.

It is good your testing showed that the speed savings (5x) are comparable to other approaches like multi-threading, where cache to memory transfers become even more of a bottleneck for performance.

John
PaulMather



Joined: 19 Sep 2017
Posts: 10
Location: Nottingham

Posted: Wed Nov 22, 2017 3:08 pm    Post subject: Re: Dangling Fortran Pointers

Many thanks for all that interesting stuff. Looks as if I will need to change a large number of multiple do-loops! The differences in time are remarkable.

Re the previous correspondence on the three level do loop that used char and ichar to define the loop parameters: I'll leave that on hold for a while. I now get errors occurring elsewhere in the program to do with window handles. This is all very strange as I wrote the win-handle stuff years ago and lots of students have used it without problems. There must be something that is apparently acting in a random way to generate errors. I will continue digging and will report back when or if I find something.

Paul

John-Silver



Joined: 30 Jul 2013
Posts: 913
Location: Aerospace Valley

Posted: Wed Nov 22, 2017 9:29 pm

JohnC wrote ....
Quote:
I suspect you are saying the following is intuitive
do I; do j; do k ; array(I,j,k)


Exactly, John: by intuitive I meant 'in order' (of the array indexes), as you've written.

I'll have to search around and try to find where I saw that it had changed in F90; it was on one of those 'how to convert F77 code to F90/95' sites. I'll come back if I find it.

I have a lot of old code which will need changing to optimise it. I was never taught to loop in the way which has apparently always been optimum, but then I'm self-taught in a practical environment after a beginners course at uni and I don't always rtfm :O)

These type of tips would have come in useful running to the limit on 32Mb of memory on an IBM mainframe in the early 90's !!!!
JohnCampbell



Joined: 16 Feb 2006
Posts: 1979
Location: Sydney

Posted: Thu Nov 23, 2017 1:28 am    Post subject: Re:

John-Silver wrote:
These type of tips would have come in useful running to the limit on 32Mb of memory on an IBM mainframe in the early 90's !!!!

John, if only you were a bit older. Running a 3-loop your "intuitive" way on a disk based virtual memory mini could have been the difference between a few seconds and tens of minutes. The delay gave you time to RTFM and re-write the code. I do recall in 1977 when I started an analysis, saw it was going slow, read a text book (no internet!), rewrote the analysis, compiled, ran and got the correct answer (well at least the same); all before the first program had finished. That was on a Pr1me mini.
I would not bother looking for that F77 > F90 advice site, as it was probably written by a Cxx expert who did not bother to learn Fortran array index order. The C array convention is to store in the reverse index order.

What is "intuitive" ?

John
DanRRight



Joined: 10 Mar 2008
Posts: 1872
Location: South Pole, Antarctica

Posted: Thu Nov 23, 2017 2:38 am

I hope future compilers will be able to virtualize this problem so that it will not matter in which order the indices go.
JohnCampbell



Joined: 16 Feb 2006
Posts: 1979
Location: Sydney

Posted: Thu Nov 23, 2017 3:41 am

Dan,

You may hope for loop-order optimisation, but I have seen recent benchmark tests with FORALL, for multiple compilers, that show this is not being achieved.
When you are using multiple arrays with different subscript order, the best order can be a bit confusing to predict, but with simple loops, like above, it could be achieved.

A good example (I have posted previously) is matrix multiplication, where a change of approach from dot_product to daxpy can reduce memory access delays. Future compilers may be able to provide these solutions. gFortran has certainly improved MATMUL at Ver 7.2.0, but I have not seen documentation of what was done.
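As a rough illustration of the two approaches (a sketch with an assumed size n = 200 and made-up array names, not the code from the earlier post): the dot_product form reads a row a(i,:) that strides through memory, while the daxpy form updates a whole column of the result from a column of a, so every array is walked sequentially.

```fortran
program matmul_orders
  implicit none
  integer, parameter :: n = 200          ! assumed size, small enough to run quickly
  real*8  :: a(n,n), b(n,n), c1(n,n), c2(n,n)
  integer :: i, j, k

  call random_number (a)
  call random_number (b)

  ! dot_product form: a(i,:) is a row, so it is read with stride n
  do j = 1,n
    do i = 1,n
      c1(i,j) = dot_product ( a(i,:), b(:,j) )
    end do
  end do

  ! daxpy form: column j of c accumulates b(k,j) times column k of a
  c2 = 0
  do j = 1,n
    do k = 1,n
      c2(:,j) = c2(:,j) + a(:,k) * b(k,j)
    end do
  end do

  ! Both should agree with the intrinsic MATMUL to within rounding
  write (*,*) 'max difference, dot_product form:', maxval(abs(c1 - matmul(a,b)))
  write (*,*) 'max difference, daxpy form      :', maxval(abs(c2 - matmul(a,b)))
end program matmul_orders
```

To see a timing difference you would need a much larger n and a timer around each loop nest, as in the elapse_sec examples earlier in the thread.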

John
All times are GMT + 1 Hour