forums.silverfrost.com Forum Index

Bringing a slice of a rank 3 into a rank 2 array

 
johannes



Joined: 21 Jan 2011
Posts: 65
Location: Leimen, Germany

Posted: Fri Sep 26, 2014 9:30 am    Post subject: Bringing a slice of a rank 3 into a rank 2 array

Hi all,
could anyone give me a clue how to accelerate this simple loop using the more advanced F90 features?
Code:
 
       k=...
       FORALL (i=1:n,j=1:n)
          b2(i,j)=b3(i,j,k)
       end forall

Is it possible to use RESHAPE to map b3 to b2?
BR
johannes
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 1878
Location: Yateley, Hants, UK

Posted: Fri Sep 26, 2014 10:22 am    Post subject:

Just a guess, Johannes, but couldn't you do something of the kind with EQUIVALENCE? I suppose it depends how big the maximum for k is before that gets tedious. (It would be handy if k=1.)

If n is small, you may get some benefit from loop unrolling, or, if n is always a multiple of some other number, from partial unrolling.

Eddie
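
To illustrate the partial unrolling suggested above, here is a minimal sketch that assumes the first extent of b2 is a multiple of 4. The module and routine names (slice_unroll, copy_slice_unrolled) are made up for this example, and an optimising compiler may well perform the same transformation on its own, so any gain would need to be measured.

```fortran
module slice_unroll
  implicit none
contains
  ! Copy slice k of b3 into b2, unrolling the inner (first-index) loop by 4.
  ! Assumption: the first extent of b2 is a multiple of 4.
  subroutine copy_slice_unrolled(b3, k, b2)
    real, intent(in)    :: b3(:,:,:)
    integer, intent(in) :: k
    real, intent(out)   :: b2(:,:)
    integer :: i, j, n1, n2

    n1 = size(b2, 1)
    n2 = size(b2, 2)
    if (mod(n1, 4) /= 0) stop 'first extent must be a multiple of 4'

    do j = 1, n2
      do i = 1, n1, 4
        ! Four assignments per pass, so the loop test runs a quarter as often
        b2(i,   j) = b3(i,   j, k)
        b2(i+1, j) = b3(i+1, j, k)
        b2(i+2, j) = b3(i+2, j, k)
        b2(i+3, j) = b3(i+3, j, k)
      end do
    end do
  end subroutine copy_slice_unrolled
end module slice_unroll
```

A caller would `use slice_unroll` and then write `call copy_slice_unrolled(b3, k, b2)`.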
johannes



Joined: 21 Jan 2011
Posts: 65
Location: Leimen, Germany

Posted: Fri Sep 26, 2014 10:34 am    Post subject:

Hi Eddie,
even back in the days of f70 I never touched EQUIVALENCE.
Let me try later. I'll come back.
johannes
johannes



Joined: 21 Jan 2011
Posts: 65
Location: Leimen, Germany

Posted: Fri Sep 26, 2014 2:00 pm    Post subject:

The b2 and b3 arrays are allocatable, so EQUIVALENCE is not allowed.

johannes
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 5600
Location: Salford, UK

Posted: Fri Sep 26, 2014 5:05 pm    Post subject:

Code:
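program main
implicit none
integer :: i, j, k
real :: b2(4,5), b3(4,5,3)
! Fill b3 so that each value encodes its own indices (i, j, k)
do i = 1,4
  do j = 1,5
    do k = 1,3
      b3(i,j,k) = 100*i+10*j+k
    end do
  end do
end do
k = 2
! Copy slice k of the rank 3 array into the rank 2 array
b2(:,:) = b3(:,:,k)
! Print b2 one row per line
print*, b2(1,:)
print*, b2(2,:)
print*, b2(3,:)
print*, b2(4,:)
end program main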
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 1878
Location: Yateley, Hants, UK

Posted: Fri Sep 26, 2014 8:47 pm    Post subject:

Brilliant, Paul!

The three nested loops aren't part of the answer, of course, but wouldn't it be better to have the i loop as the innermost one?

Eddie
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 5600
Location: Salford, UK

Posted: Sat Sep 27, 2014 7:17 am    Post subject:

I don't know offhand. One could look at the /EXPLIST output and also check whether /OPT makes any difference.
JohnCampbell



Joined: 16 Feb 2006
Posts: 2004
Location: Sydney

Posted: Tue Sep 30, 2014 1:44 am    Post subject:

Eddie,

The following change would certainly improve cache usage, especially when the array sizes increase.
Code:
do k = 1,3
  do j = 1,5
    do i = 1,4
      b3(i,j,k)= 100*i+10*j+k
    end do
  end do
end do


This becomes much more significant for larger arrays. In the equation solver SYMSOL, for example, storing the stiffness matrix as ST(equations,band) takes much longer to run than storing it as ST(band,equations). The run-time difference can be a factor of 10 to 100, which swamps any other attempt at coding efficiency. It occurs when the matrix is much larger than the processor cache.

John
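
The cache argument above follows from Fortran's column-major storage: for an array with extents (n1,n2,n3) and lower bounds of 1, element (i,j,k) sits at storage position i + (j-1)*n1 + (k-1)*n1*n2. The sketch below just encodes that formula; the names colmajor and linear_pos are invented for illustration.

```fortran
module colmajor
  implicit none
contains
  ! 1-based storage position of element (i,j,k) in an array whose
  ! first two extents are n1 and n2 (all lower bounds assumed to be 1).
  pure function linear_pos(i, j, k, n1, n2) result(pos)
    integer, intent(in) :: i, j, k, n1, n2
    integer :: pos
    pos = i + (j-1)*n1 + (k-1)*n1*n2
  end function linear_pos
end module colmajor
```

With i as the innermost loop, successive iterations touch adjacent positions (stride 1). With k innermost, each iteration jumps n1*n2 elements, which is what causes the cache misses once the array outgrows the cache.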
LitusSaxonicum



Joined: 23 Aug 2005
Posts: 1878
Location: Yateley, Hants, UK

Posted: Tue Sep 30, 2014 7:06 am    Post subject:

Hi John,

I made it a question, not an assertion, so as not to offend anyone. This business of index order was something I first encountered in Kreitzberg and Shneiderman's book, c. early 1970s.

I had little idea that it was so costly in time, but I had forgotten about cache misses, and was thinking only about how big the arrays can be. Paul's example is probably too small to incur much delay.

Eddie
johannes



Joined: 21 Jan 2011
Posts: 65
Location: Leimen, Germany

Posted: Mon Oct 06, 2014 3:22 pm    Post subject:

Hi all,
accessing every array element costs time.

Isn't there a solution using pointers or something similar?
Something like: ptr=>b3(1,1,k)   ! pointing to the first element in the slice
and then b2(1,1) = something using the pointer ptr?

best regards
johannes
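
For reference, standard Fortran does allow a rank 2 pointer to be associated with a whole slice, which avoids the copy altogether. The sketch below is one way to write it, not something proposed in the thread; the names slice_ptr and slice_view are hypothetical, and it assumes b3 can be declared with the TARGET attribute.

```fortran
module slice_ptr
  implicit none
contains
  ! Return a rank 2 pointer that aliases slice k of b3. No data is copied.
  ! The actual argument must have the TARGET attribute for the pointer
  ! to remain associated after the call.
  function slice_view(b3, k) result(p)
    real, target, intent(inout) :: b3(:,:,:)
    integer, intent(in)         :: k
    real, pointer               :: p(:,:)
    p => b3(:,:,k)
  end function slice_view
end module slice_ptr
```

In calling code the single statement `p => b3(:,:,k)` does the same job, given `real, allocatable, target :: b3(:,:,:)` and `real, pointer :: p(:,:)`. Note that p is a view, not a copy: later changes to b3 show through p, and p must not be used after b3 is deallocated. If b2 has to be an independent array, the elements still need to be copied.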
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 5600
Location: Salford, UK

Posted: Mon Oct 06, 2014 7:01 pm    Post subject:

The essence of the answer is in my code:

Code:
b2(:,:)=b3(:,:,k)


The rest is just for illustration.
johannes



Joined: 21 Jan 2011
Posts: 65
Location: Leimen, Germany

Posted: Tue Oct 07, 2014 8:21 am    Post subject:

Hi Paul,
did you mean that b2(:,:)=b3(:,:,k) does not store b2 element by element, but is instead more or less like shifting some address?
BR
johannes
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 5600
Location: Salford, UK

Posted: Tue Oct 07, 2014 9:58 am    Post subject:

No, it does not move addresses, but it will significantly reduce the evaluation of addresses. It may lead to a "block copy" rather than copying element by element. This will depend on whether or not the elements are contiguous in memory and on how clever the optimiser is. (Note that FTN95 does some optimisation even when /OPT is not switched on. Note also that Fortran uses "column major" ordering of array elements.)

The usual approach is to wrap a timer (e.g. "call system_clock(clock_count)") around the relevant code and try some experiments.
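
A rough sketch of such an experiment, using only the standard SYSTEM_CLOCK and RANDOM_NUMBER intrinsics, might look like the following. The routine name time_slice_copy, the array shape n x n x 3 and the repetition count are arbitrary choices for illustration, and the results will vary with compiler, options and array size.

```fortran
module slice_timing
  implicit none
contains
  ! Time 'reps' copies of slice k of an n x n x 3 array, once with explicit
  ! loops (i innermost) and once with an array-section assignment.
  ! Both times are returned in seconds.
  subroutine time_slice_copy(n, reps, t_loop, t_section)
    integer, intent(in) :: n, reps
    real, intent(out)   :: t_loop, t_section
    real, allocatable   :: b2(:,:), b3(:,:,:)
    integer :: i, j, k, r, c0, c1, rate
    real :: checksum

    allocate(b2(n,n), b3(n,n,3))
    call random_number(b3)
    k = 2
    checksum = 0.0

    call system_clock(c0, rate)          ! start count and ticks per second
    do r = 1, reps
      do j = 1, n
        do i = 1, n
          b2(i,j) = b3(i,j,k)
        end do
      end do
      checksum = checksum + b2(1 + mod(r, n), 1)   ! read one copied element
    end do
    call system_clock(c1)
    t_loop = real(c1 - c0) / real(rate)

    call system_clock(c0)
    do r = 1, reps
      b2(:,:) = b3(:,:,k)
      checksum = checksum + b2(1 + mod(r, n), 1)
    end do
    call system_clock(c1)
    t_section = real(c1 - c0) / real(rate)

    ! Never true for values in [0,1); it only keeps checksum in use so the
    ! optimiser is less likely to discard the copies.
    if (checksum < 0.0) print *, checksum
  end subroutine time_slice_copy
end module slice_timing
```

Calling, say, `call time_slice_copy(2000, 100, t_loop, t_section)` and printing both times with and without /OPT gives a first impression. The clock resolution is limited, so n and reps should be large enough for each timed region to run for a noticeable fraction of a second.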
JohnCampbell



Joined: 16 Feb 2006
Posts: 2004
Location: Sydney

Posted: Wed Oct 08, 2014 11:16 am    Post subject:

You would certainly have to be careful where the arrays have different sizes, such as:
Code:
      integer, parameter :: n=5
      integer b2(n,n)
      integer b3(10,10,3)
        k=...
        FORALL (i=1:n,j=1:n)
           b2(i,j)=b3(i,j,k)
        end forall

The forall would work here, but the use of (:,:) requires the same size for the first 2 dimensions.
If you are looking for a faster approach, move@ might help, although (:,:) should work well. If you need array sections, or the arrays are not the same size, then you should consider an alternative, such as using move@ in a loop:
Code:
do j = 1,n
  call move@ ( b3(1,j,k), b2(1,j), n*4 )   ! n*4 bytes: one column of n 4-byte elements
end do

Another alternative, which relies on b2 and b3 having the same first extent n (so that the copied elements are contiguous in both arrays) and on compiling without any /check option, could be the following:
Code:
do I = 1,n*n
  b2(I,1) = b3(I,1,k)
end do

or
Code:
call move@ ( b3(1,1,k), b2, n*n*4 )
John
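
For the case where the arrays differ in size, standard array-section syntax with explicit bounds is another option that needs no compiler-specific routine. A small sketch follows; the names slice_section and copy_slice_corner are invented for this example, and integer arrays are used to match the declarations above.

```fortran
module slice_section
  implicit none
contains
  ! Copy the leading corner of slice k of b3 into b2. The corner has the
  ! extents of b2, so b3 may be larger than b2 in its first two dimensions.
  subroutine copy_slice_corner(b3, k, b2)
    integer, intent(in)  :: b3(:,:,:)
    integer, intent(in)  :: k
    integer, intent(out) :: b2(:,:)
    b2 = b3(1:size(b2,1), 1:size(b2,2), k)
  end subroutine copy_slice_corner
end module slice_section
```

Written inline with the declarations from the post above, this is simply `b2 = b3(1:n,1:n,k)`. The section is not contiguous when b3 is larger than b2, so the compiler copies column by column, much like the move@ loop.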
All times are GMT + 1 Hour