forums.silverfrost.com Forum Index forums.silverfrost.com
Welcome to the Silverfrost forums
 
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Direct access binary speed - again!

 
Post new topic   Reply to topic    forums.silverfrost.com Forum Index -> Support
View previous topic :: View next topic  
Author Message
Kenneth_Smith



Joined: 18 May 2012
Posts: 697
Location: Hamilton, Lanarkshire, Scotland.

PostPosted: Mon Jan 26, 2015 7:39 pm    Post subject: Direct access binary speed - again! Reply with quote

By strange coincidence, I too have been surprised with the poor performance writing to binary direct access files. I've not used this approach before, but I was aware that sometimes my programming style does not led itself to speed - lots of writing and then reading the data by different program units - so I thought I look at this some wet winter evening.

On my machine the code below, takes 47 ms to allocate and populate the R4 1000x1000 array A, 17.2 s to write it to the direct access file, and 296 ms to recover same the data into array B.

Is this typical performance for writing, or am I doing something silly here?

Cheers Ken

Code:

      PROGRAM TEST
      IMPLICIT NONE     
      INTEGER RECLEN, NROW, NCOL, I, J, RECORD
      REAL, ALLOCATABLE:: A(:,:), B(:,:)
      REAL T1,T2,T3,T4
     
      CALL CLOCK@(T1)
     
      NROW = 1000
      NCOL = 1000
      ALLOCATE(A(NROW,NCOL))
      DO J = 1, NCOL, 1
        DO I = 1, NROW, 1
          A(I,J) = (J-1)*NCOL + I
        END DO
      END DO
     
      CALL CLOCK@(T2)
     
      WRITE(6,*)'TIME ALLOCATE AND POLULATE ARRAY IN MEMORY', T2-T1
      INQUIRE(IOLENGTH=RECLEN)A(1,1)
      OPEN(UNIT=10,FILE='data1.bin',FORM='UNFORMATTED',ACCESS='DIRECT',
     &     RECL=reclen)
      DO J = 1, NCOL, 1
        DO I = 1, NROW, 1
          RECORD = (J-1)*NCOL + I
          WRITE(10,REC=RECORD) A(I,J)
        END DO
      END DO
     
      CALL CLOCK@(T3)
     
      WRITE(6,*)'TIME TO WRITE ARRAY TO DIRECT ACCESS FILE',T3-T2
      ALLOCATE(B(NROW,NCOL))
      DO J = 1, NCOL, 1
        DO I = 1, NROW, 1
          RECORD = (J-1)*NCOL + I
          READ(10,REC=RECORD) B(I,J)
        END DO
      END DO
     
      CALL CLOCK@(T4)
     
      WRITE(6,*)'TIME TO READ ARRAY FROM DIRECT ACCESS FILE',T4-T3
      CLOSE(UNIT=10,STATUS='DELETE')
      END
Back to top
View user's profile Send private message Visit poster's website
davidb



Joined: 17 Jul 2009
Posts: 560
Location: UK

PostPosted: Mon Jan 26, 2015 9:03 pm    Post subject: Reply with quote

For me with your example, I get 1.6 seconds for the write section and 0.16 for the read section. So there is certainly a difference.
_________________
Programmer in: Fortran 77/95/2003/2008, C, C++ (& OpenMP), java, Python, Perl
Back to top
View user's profile Send private message
Kenneth_Smith



Joined: 18 May 2012
Posts: 697
Location: Hamilton, Lanarkshire, Scotland.

PostPosted: Mon Jan 26, 2015 9:58 pm    Post subject: Reply with quote

Interesting David, with that write speed I would not have raised the query!
I realise it will be machine/hardware dependant, but it's intriguing to see how much difference there can be. Out of curiosity I've bug out the old machines I have sitting at home, listed below in ascending order of increasing performance spec, along with their windows version and the time to perform the write operation:-

Code:

Windows XP                      7.2s
Windows 7 Home Premium 17.2s
Windows 7 Enterprise         8.5s


Suddenly my old XP machine looks good - but it's not good when there's a lot of number crunching to be carried out!

Ken
Back to top
View user's profile Send private message Visit poster's website
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

PostPosted: Mon Jan 26, 2015 10:16 pm    Post subject: Reply with quote

Ken,

I have not seen "INQUIRE(IOLENGTH=RECLEN)A(1,1)" used in this way before. Is this legal ?

I tried to improve the performance, by reducing the number of records from 1,000,000 to 1,000. This does have a significant improvement in speed, even though your original code wrote the records out sequentially. I would have expected better buffering.

When I changed this, "INQUIRE(IOLENGTH=RECLEN)A(:,1)" did not work, so a different approach is needed for this. The intrinsic SIZEOF would be a great addition to FTN95!

I would also have expected an improvement with Windows 7xx over XP, but you are not finding this. Perhaps the improvement is more noticeable with larger files.
Anyway, the problem is 1,000,0000 records of 4 bytes is much slower than 1,000 records of 4,000 bytes. Reading is faster, as it is being buffered. Note that clock@ is only accurate to 1/64 second, so the read from buffer can be quicker than this. Hence 0 seconds for read. Use system_clock for better accuracy.

My modified code to see the change is:
Code:
      PROGRAM TEST
       IMPLICIT NONE     
       INTEGER RECLEN, NROW, NCOL, I, J, RECORD, K
       REAL, ALLOCATABLE:: A(:,:), B(:,:)
       REAL T1,T2,T3,T4
       LOGICAL :: big_blocks = .true.
       do k = 1,2
       big_blocks = k > 1
       CALL CLOCK@(T1)
       
       NROW = 1000
       NCOL = 1000
       ALLOCATE(A(NROW,NCOL))
       DO J = 1, NCOL, 1
         DO I = 1, NROW, 1
           A(I,J) = (J-1)*NCOL + I
         END DO
       END DO
       
       CALL CLOCK@(T2)
       
       WRITE(6,*)'TIME ALLOCATE AND POLULATE ARRAY IN MEMORY', T2-T1
!       INQUIRE(IOLENGTH=RECLEN)A(:,1)
       reclen = 4
       if ( big_blocks) reclen = 4*nrow
       write(6,*)'RECL=',reclen
       OPEN (UNIT=10,FILE='data1.bin',FORM='UNFORMATTED',ACCESS='DIRECT',  &
      &     RECL=reclen)
       DO J = 1, NCOL, 1
         if ( big_blocks) then
          RECORD = J
          WRITE(10,REC=RECORD) A(:,J)
         else
          DO I = 1, NROW, 1
            RECORD = (J-1)*NCOL + I
            WRITE(10,REC=RECORD) A(I,J)
          END DO
         end if
       END DO
       
       CALL CLOCK@(T3)
       
       WRITE(6,*)'TIME TO WRITE ARRAY TO DIRECT ACCESS FILE',T3-T2
       ALLOCATE(B(NROW,NCOL))
       DO J = 1, NCOL, 1
         if ( big_blocks) then
          RECORD = j
          READ(10,REC=RECORD) B(:,J)
         else
          DO I = 1, NROW, 1
            RECORD = (J-1)*NCOL + I
            READ(10,REC=RECORD) B(I,J)
          END DO
         end if
       END DO
       
       CALL CLOCK@(T4)
       
       WRITE(6,*)'TIME TO READ ARRAY FROM DIRECT ACCESS FILE',T4-T3
       CLOSE(UNIT=10,STATUS='DELETE')
       deallocate (a,b)
       end do
       END
Back to top
View user's profile Send private message
davidb



Joined: 17 Jul 2009
Posts: 560
Location: UK

PostPosted: Mon Jan 26, 2015 10:52 pm    Post subject: Re: Reply with quote

JohnCampbell wrote:

I have not seen "INQUIRE(IOLENGTH=RECLEN)A(1,1)" used in this way before. Is this legal ?

I think you can use a list of variables which matches the list of variables written or read, so A(1,1), which is a real scalar value is OK.
_________________
Programmer in: Fortran 77/95/2003/2008, C, C++ (& OpenMP), java, Python, Perl
Back to top
View user's profile Send private message
davidb



Joined: 17 Jul 2009
Posts: 560
Location: UK

PostPosted: Mon Jan 26, 2015 10:54 pm    Post subject: Re: Reply with quote

Kenneth_Smith wrote:

I realise it will be machine/hardware dependant, but it's intriguing to see how much difference there can be.


Perhaps this depends on the hard disc you have fitted. Mine is pretty zippy.

Perhaps, also, writes are slower than reads?
_________________
Programmer in: Fortran 77/95/2003/2008, C, C++ (& OpenMP), java, Python, Perl
Back to top
View user's profile Send private message
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

PostPosted: Mon Jan 26, 2015 11:03 pm    Post subject: Reply with quote

INQUIRE (IOLENGTH=RECLEN) A(:,1) and
INQUIRE (IOLENGTH=RECLEN) A(1:NROW,1)
both produces a compiler error or stack overflow.

The following is F95 standard conforming for SIZEOF

reclen = size ( transfer ( a(1,1), (/'A'/) ) )
if ( big_blocks) reclen = reclen*size (A(:,1))
write(6,*)'RECL=',reclen

John
Back to top
View user's profile Send private message
Kenneth_Smith



Joined: 18 May 2012
Posts: 697
Location: Hamilton, Lanarkshire, Scotland.

PostPosted: Mon Jan 26, 2015 11:40 pm    Post subject: Reply with quote

Thanks John, that is super fast!

As I understand you code B(:,J) is writing a whole column of data as a single record - I must admit that construction does not jump out to me as the obvious way to do things - guess I still think too much in the F77 world!

I did try writing rows of data via an implied do loop - without much success and with hindsight that would still be the same number of writes and my serial approach. I did think that, one possible advantage of the serial approach might be that there is no need to recover the whole array into memory, simply access the required element i,j when required by it's record number, but clearly it may well be easier to recover the whole column that contains i,j.

I see I have much to learn, i would have written
Code:
big_blocks = k > 1

as
Code:
IF (K.GT.1) THEN
  BIG_BLOCKS = .TRUE.
ELSE
  BIG_BLOCKS = .FALSE.
END IF

Thanks again!

Ken
Back to top
View user's profile Send private message Visit poster's website
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

PostPosted: Thu Jun 30, 2016 6:56 am    Post subject: Reply with quote

Ken,

I was just looking at IOLENGTH and noticed a problem with your example.
The following code is in error :
Code:
      DO J = 1, NCOL, 1
         DO I = 1, NROW, 1
           RECORD = (J-1)*NCOL + I
           WRITE(10,REC=RECORD) A(I,J)
         END DO
       END DO


It should be : RECORD = (J-1)*NROW + I

If NROW = NCOL all would be ok !

John
Back to top
View user's profile Send private message
John-Silver



Joined: 30 Jul 2013
Posts: 1520
Location: Aerospace Valley

PostPosted: Thu Jun 30, 2016 5:57 pm    Post subject: Reply with quote

reading this post I observe a typical non-anglo-saxon useage of the word INQUIRE.
Maybe all programmers should be forced to read the Oxford ENGLISH Dictionary by heart instead of that woman's Wink version over the pond.
....

Quote:
Enquire’ or ‘inquire’?

The traditional distinction between the verbs enquire and inquire is that enquire is to be used for general senses of ‘ask’, while inquire is reserved for uses meaning ‘make a formal investigation’.

In practice, however, enquire, and the associated noun enquiry, are more common in British English while inquire (and the noun inquiry) are more common in American English, but otherwise there is little discernible distinction in the way the words are used. Some style guides require that only inquire or only enquire be used.


Clearly it should be called ENQUIRE !!!!! ... no bobbies required to discover the lngth ! LOL
"I was proceding in my investigations concerning said array length and made some enquiries during the course of my inquiry " LOL
Back to top
View user's profile Send private message
wahorger



Joined: 13 Oct 2014
Posts: 1217
Location: Morrison, CO, USA

PostPosted: Fri Jul 01, 2016 3:25 am    Post subject: Reply with quote

What I found a long time ago (http://forums.silverfrost.com/viewtopic.php?t=2992) was the record length has an effect, but it's really the total number of I/O calls that determines the performance. Short record, lots of I/O, poor performance as compared to doing the identical kinds of I/O in "C" (but without the flexibility [and overhead] that FTN95 I/O lists gives).

I have abandoned almost all of the direct access file I/O in my software because the performance is so poor. And, for still unknown reasons. Running my benchmarks (written long ago also) on a solid state drive that is blindingly fast still shows the lack of performance discovered a couple of years ago. Running the same benchmark on a RAM disk also showed the same poor performance (albeit less poor than on a hard drive, but the percentages still held true).

For me, that which used to be done in a direct access temporary file is now done in memory. Which limits the size of the data sets that can be processed. Not good, right?

Still modifying code to remove the file limitations of direct access and placing all the files in memory......
Back to top
View user's profile Send private message Visit poster's website
JohnCampbell



Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

PostPosted: Fri Jul 01, 2016 6:28 am    Post subject: Reply with quote

Bill,

I read the link you posted. I must admit I don't use Win 8.1 except for 1 program on that PC.
I run most analysis runs on Win 7 desktops as single user, ie no change to SHARE. Apart from your test example from Jan-15, my experience of Direct Access has always been of very good performance and I would always recommend that approach for substantial disk I/O. Did you come up with any fixes to the problem ?

Perhaps the combination of file buffers and some SHARE options is a problem, but reading the post from 18 months ago that may have been dismissed as a major cause. The other problem could have been the virus checker, as it can clash with I/O performance. Perhaps you should split the files, with a small token file with SHARE status, while the large database files do not require share write to work. (Share read should not be a problem, although I don't remember where the previous thread finished up)

My experience is direct access works very well and combines well with the file cache/buffers. All my programs that use it perform well for disk I/O. I would always recommend this approach.

John
Back to top
View user's profile Send private message
wahorger



Joined: 13 Oct 2014
Posts: 1217
Location: Morrison, CO, USA

PostPosted: Fri Jul 01, 2016 2:15 pm    Post subject: Reply with quote

John, it works well, but..... The performance hit is too great to ignore. And, it was the same hit whether running the same benchmark under Windows 8 or Windows 2000.

I think, when I get some time, I'll repeat the benchmarking, this time linking in an equivalent "C" implementation and getting the number for that to help in the comparisons. The source of the performance penalty, should we be able to find it, would be of interest to many, I think.

Just FYI, under Windows 3.1 up through XP, I used a different FORTRAN compiler to create the operational code. None of these effects were seen. The same code also harkens back to CP/M days using an early Microsoft compiler. The same FORTRAN code was ported to a VAX and a PDP-11. In all these cases, there was not a performance hit due to the direct-access unformatted I/O.

I may have a platform that can support an old compiler to compile and run the same benchmark. I'll look at adding that to the mix. That would at least eliminate the OS as a culprit.
Back to top
View user's profile Send private message Visit poster's website
Display posts from previous:   
Post new topic   Reply to topic    forums.silverfrost.com Forum Index -> Support All times are GMT + 1 Hour
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2005 phpBB Group