forums.silverfrost.com Forum Index forums.silverfrost.com
Welcome to the Silverfrost forums
 

Fails to save arrays > 4GB
PaulLaidler
Site Admin


Joined: 21 Feb 2005
Posts: 7916
Location: Salford, UK

Posted: Tue Oct 11, 2022 8:27 am

John

At the moment "pos = end_pos" only works for 32 bit end_pos.

I will aim to fix this and then upload a new set of DLLs. We can then consider other issues after you have tested this fix.

PaulLaidler

Posted: Tue Oct 11, 2022 11:56 am

John

I have sent you a pm with links to a trial fix.

DanRRight

Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Thu Oct 13, 2022 12:52 am

Thanks, Paul.

JohnCampbell

Joined: 16 Feb 2006
Posts: 2554
Location: Sydney

Posted: Tue Oct 18, 2022 12:48 pm

The following is my latest stream I/O test program, which demonstrates problems with stream I/O in FTN95 when overwriting files.
Others may be able to confirm my tests or identify problems with my approach.

This program can be run in PLATO, using either FTN95 /64 or gfortran.

https://www.dropbox.com/s/cxe6wr8vdd3vl6n/stream_readc.f90?dl=0

I am using gfortran as a reference, as it appears to perform correctly.
The comments in the program describe the 3 tests in more detail. The tests produce 3 files, which should all be the same.

FTN95 fails in tests 2 and 3, for WRITE ( pos=address ) when rewriting and for INQUIRE.

Please build using ftn95 /64 stream_readc.f90 /link then run.

You can compare the 3 files from the 3 tests.
stream_filec.bin
stream_filez.bin
stream_file$.bin
( You can delete these files before the test, but this is not necessary to confirm the errors. )

They should all be the same, but with FTN95 they are not.

The file stream_readc.log is appended with the results of each test run, to confirm the errors.

DanRRight

Posted: Tue Oct 18, 2022 5:43 pm

John, do your tests need the new DLLs with the latest fix for the 4GB issue? And what are the speeds of READ / WRITE ( pos=address ) in comparison with Method2 above? I suspect it might not be worth using any more.

JohnCampbell

Posted: Wed Oct 19, 2022 5:15 am

Dan : Do your tests need new dll with latest fix of 4GB issue?

Ans: The test I am doing uses version 8.91.1.
This version has a fix for integer*8 addresses.
I have not yet tested 4GB+ files.

The main purpose of my tests is to confirm that READ ( ..., pos=address,...) and WRITE ( ..., pos=address,...) are working for stream I/O.

What I am finding is that "WRITE ( lu, pos=address, ...) IO_list" appears to fail when rewriting to the file.
I have found that "READ ( lu, pos=address, ...) IO_list" appears to work in the test program I attached.
This test also uses integer*8, so FTN95 Ver 8.91.1 does address the integer*8 problems, but fails when rewriting data to the file.
The "pos=address" capability provides for an addressable file data structure.
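
For anyone who wants to try the rewrite case in isolation, here is a minimal sketch of that kind of check. It is not the stream_readc.f90 program linked above. The file name pos_demo.bin, the unit number and the values are illustrative assumptions, and it assumes 1-byte file storage units. It writes ten 8-byte integers, reopens the file, overwrites one value with WRITE ( pos= ), then checks that the position, the file size and the neighbouring values are as expected.

```fortran
program pos_rewrite_demo
  use iso_fortran_env, only: int64
  implicit none
  integer, parameter :: lu = 11
  integer, parameter :: n = 10
  integer(int64) :: vals(n), back(n), address, next_pos, fsize
  integer :: i

  vals = [(int(i, int64), i = 1, n)]

  ! Pass 1: create the file holding n 8-byte integers.
  open (lu, file='pos_demo.bin', form='unformatted', access='stream', status='replace')
  write (lu) vals
  close (lu)

  ! Pass 2: reopen the existing file and overwrite element 4 in place.
  open (lu, file='pos_demo.bin', form='unformatted', access='stream', status='old')
  address = 1 + 3 * 8                  ! stream positions are 1-based byte offsets
  write (lu, pos=address) -4_int64
  inquire (unit=lu, pos=next_pos)      ! should now point just past the rewritten value
  read (lu, pos=1) back                ! read the whole file back from the start
  close (lu)

  inquire (file='pos_demo.bin', size=fsize)

  if (next_pos /= address + 8) stop 'FAILED: INQUIRE pos= is wrong after rewrite'
  if (fsize /= 8 * n)          stop 'FAILED: file size changed by rewrite'
  if (back(4) /= -4)           stop 'FAILED: rewritten value not found'
  if (any(back(1:3) /= vals(1:3)) .or. any(back(5:n) /= vals(5:n))) &
                               stop 'FAILED: neighbouring values were changed'
  print *, 'rewrite with pos= OK, file size =', fsize
end program pos_rewrite_demo
```

With unformatted stream access, a WRITE in the middle of the file should leave the data after it intact, so any of the FAILED messages would point at the rewrite problem described above.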

Dan : And what are speeds of READ / WRITE ( pos=address ) in comparison with Method2 above ?
I suspect it might not be worth to use it anymore

Ans: The stream read/write ( pos=address ) should be supported by Windows buffered I/O, with performance similar to Fortran unformatted, fixed-length record I/O. It should be a good solution.
Stream I/O could be a much better solution than "Method 2", which appears to be Fortran unformatted sequential I/O for records larger than 2 GBytes.
I still have not seen the header/footer syntax for this FTN95 solution, but I suspect it is not compatible with ifort/gfortran. I have read that ifort and gfortran are compatible for Fortran unformatted sequential I/O, including records larger than 2 GBytes, although the header/footer syntax is a bit of a mess, using 2 GByte sub-records.

I would expect stream I/O to be a better solution, although it is probably safer to partition the data into records smaller than 2 GBytes.

It is probably better to have a file dump something like:
Code:
    write (11) nB, size(Arr4(:,1))
    do i = 1,nB
      write (11) i, Arr4(:,i)
    enddo
    write (11) -1
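
The matching read for a dump laid out like this could look something like the sketch below. It assumes the file is opened with access="STREAM", so that the record id can be read before its data, and it uses small illustrative sizes. The file name dump_demo.bin and the array Brr4 are hypothetical names introduced for the example.

```fortran
program dump_readback_demo
  implicit none
  integer, parameter :: lu = 11
  integer, parameter :: nA = 6, nB = 1000        ! small illustrative sizes
  real, allocatable :: Arr4(:,:), Brr4(:,:)
  integer :: i, id, nA_in, nB_in, ios

  allocate (Arr4(nA, nB))
  do i = 1, nB
    Arr4(:, i) = real(i)
  end do

  ! Write: a header, one tagged "record" per column, then a -1 terminator.
  open (lu, file='dump_demo.bin', form='unformatted', access='stream', status='replace')
  write (lu) nB, size(Arr4(:, 1))
  do i = 1, nB
    write (lu) i, Arr4(:, i)
  end do
  write (lu) -1
  close (lu)

  ! Read back: the header gives the sizes, the tag says where each column goes.
  open (lu, file='dump_demo.bin', form='unformatted', access='stream', status='old')
  read (lu) nB_in, nA_in
  allocate (Brr4(nA_in, nB_in))
  do
    read (lu, iostat=ios) id
    if (ios /= 0) stop 'FAILED: terminator record not found'
    if (id == -1) exit                           ! end-of-dump marker
    if (id < 1 .or. id > nB_in) stop 'FAILED: bad record id'
    read (lu) Brr4(:, id)                        ! stream access: no record boundary to cross
  end do
  close (lu)

  if (any(Brr4 /= Arr4)) stop 'FAILED: data mismatch'
  print *, 'read back', nB_in, 'records of', nA_in, 'values'
end program dump_readback_demo
```

Because each column carries its own id, the reader can inspect or validate the data as it arrives rather than waiting for the whole file.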

DanRRight

Posted: Wed Oct 19, 2022 9:29 pm

Depending on the array dimension sizes, stream I/O done this way will be up to 10x slower than the fast Method2. I don't remember whether the files made by the fast Method2 are compatible with other compilers and Linux. I am temporarily not using Method2. Method1 is compatible.
Using READF@ blows them all away.

JohnCampbell

Posted: Thu Oct 20, 2022 4:14 am    Post subject: Re:

DanRRight wrote:
Depending on array dimension sizes this way stream I/O will be up to 10x slower than by fast Method2


A bit of over-reach in the 10x !

I just think it is better to write "records" that are smaller than 4 GBytes, although I do not recall what sizes nB and size(Arr4(:,1)) are. Splitting the binary dump into pieces does give the option of inspecting the data as it is read, rather than waiting for many gigabytes to be loaded.

It is hard to keep up with the latest PCIe 4.0 M.2 SSDs, with PCIe 5.0 to be available shortly. My understanding is that these 4.0 drives are over 5,000 MBps, which I think is 5 gigabytes per second! At these rates, you have to look carefully at whether the resulting Fortran code is able to process the data this fast.

DanRRight

Posted: Thu Oct 20, 2022 8:47 am

1) There is no good reason to chop the files into pieces smaller than 4GB and complicate the code.

2) Put 6 into my test above instead of 11 and you will get speeds of 0.28GB/sec --> 10x slower than Method2 and 20x slower than with READF@.

JohnCampbell

Posted: Fri Oct 21, 2022 6:34 am

Dan,

I updated your program and tested with FTN95 and gfortran.
Method 1 produces 0.4 GBytes/sec, while
Method 2 is up to 2.9 GBytes/sec

This is on a Samsung 970 EVO Plus NVMe M.2 drive which I think is PCIe Gen 3.0

The program is
Code:
! compilation: ftn95 aaa.f90 /link /64 >z
!
  use iso_fortran_env
 
    integer*4, parameter :: million = 1000000
    real*4, dimension(:,:), allocatable :: Arr4
    integer*4 :: nA, nB, ierr, i, pass
    real*4    :: SpeedGBps, t0, t1, dt
    real*4, external :: delta_sec

    write (*,*) 'Compiler Version :',compiler_version ()
    write (*,*) 'Compiler Options :',compiler_options ()

    dt = delta_sec ()
    nA = 6 ! 11
    nB = 200 * million

!...Allocating array

    Print*, 'Trying to allocate GB of RAM :', 1.d-9 * 4. * nA * nB
    allocate ( Arr4 (nA, nB), stat = ierr)

    if (ierr.eq.0) then
       Print*, 'Allocation success'
    else
       Print*, 'Fail to allocate'
       goto 1000
    endif

!...Filling the array with some data
    do i=1,nB
      Arr4(:,i) = real(i)   ! scalar fill conforms for any nA; the 11-element constructor only conforms when nA = 11
    enddo
    dt = delta_sec ()
      SpeedGBps = 4. * nA * nB / 1024.**3 /(dt+1.e-10)
      print*,' Speed of initialising array =', SpeedGBps,' GB/sec',dt,' sec'   !   typically  ~0.5 GB/s

    do pass = 1,2     ! do 2 passes for file not/already exists
   
       Print*,'Trying to save the data Method 1 '
       call cpu_time(t0)
       dt = delta_sec ()
       open (11, file='LargeFile.dat', FORM='UNFORMATTED', access="STREAM", err=900)
       do i=1,nB
         write(11) Arr4(:,i)
       enddo
       close(11)   
       call cpu_time(t1)
       dt = delta_sec ()
   
   !...Speed of writing Method 1
         SpeedGBps = 4. * nA * nB / 1024.**3 /(dt+1.e-10)
         print*,' Speed of write Method 1 =', SpeedGBps,' GB/sec',dt,' sec',(t1-t0)   !   typically  ~0.5 GB/s
   
       Print*,'Trying to save the data Method 2'
       call cpu_time(t0)
       dt = delta_sec ()
       open (11, file='LargeFile.dat', FORM='UNFORMATTED', access="STREAM", err=900)
       write(11) Arr4
       close(11)   
       call cpu_time(t1)
       dt = delta_sec ()
   
   !...Speed of writing Method 2
         SpeedGBps = 4. * nA * nB / 1024.**3 /(dt+1.e-10)
         print*,' Speed of write  Method 2=', SpeedGBps,' GB/sec',dt,' sec',(t1-t0)   !   typically  ~2.6 GB/s
   
    end do !  pass

    write (*,*) 'File LargeFile.dat test completed OK'
      goto 1000

!...............
!...Errors
900 Print*,'Can not open file LargeFile.dat'
    goto 1000
910 Print*,'Can not save file LargeFile.dat'


1000 Continue

    End
 
    real*4 function delta_sec ()
      integer*8 :: tick, rate, last=0
      call system_clock ( tick, rate)
      delta_sec = dble (tick-last) / dble (rate)
      last = tick
    end function delta_sec

JohnCampbell

Posted: Fri Oct 21, 2022 6:38 am

The .bat file to test is
Code:
set program=dan_stream

now  > %program%.log

del LargeFile.dat
del %program%.exe
del %program%.obj
del %program%.o

ftn95 %program%.f90 /64 /debug /link >>%program%.log
%program% >>%program%.log

del LargeFile.dat
del %program%.exe
del %program%.obj
del %program%.o

gfortran %program%.f90 -g -fimplicit-none -O2 -o %program%.exe >>%program%.log
%program% >>%program%.log

notepad %program%.log


The results are:
Code:
  It is now Friday, 21 October 2022 at 16:18:53.036
[FTN95/x64 Ver. 8.91.1.0 Copyright (c) Silverfrost Ltd 1993-2022]
     Licensed to:  John Campbell
     Organisation: John Campbell

[Current options] 64;DEBUG;ERROR_NUMBERS;IMPLICIT_NONE;INTL;LINK;LOGL;

0089) 910 Print*,'Can not save file LargeFile.dat'
WARNING - 21: Label 910 is declared, but not used
    NO ERRORS, 1 WARNING  [<main program> FTN95 v8.91.1.0]
    NO ERRORS  [<DELTA_SEC> FTN95 v8.91.1.0]
[SLINK64 v3.04, Copyright (c) Silverfrost Ltd. 2015-2022]
Loading C:\temp\forum\stream_io\lgotemp@.obj
Creating executable file dan_stream.exe
 Compiler Version :FTN95 v8.91.1
 Compiler Options :64;DEBUG;ECHO_OPTIONS;ERROR_NUMBERS;IMPLICIT_NONE;INTL;LINK;LOGL;
 Trying to allocate GB of RAM :          4.80000000000   
 Allocation success
  Speed of initialising array =     1.85777     GB/sec     2.40630     sec
 Trying to save the data Method 1
  Speed of write Method 1 =    0.387883     GB/sec     11.5250     sec     11.4062   
 Trying to save the data Method 2
  Speed of write  Method 2=     2.69314     GB/sec     1.65990     sec     1.56250   
 Trying to save the data Method 1
  Speed of write Method 1 =    0.386933     GB/sec     11.5533     sec     11.5312   
 Trying to save the data Method 2
  Speed of write  Method 2=     2.87482     GB/sec     1.55500     sec     1.56250   
 File LargeFile.dat test completed OK
 Compiler Version :GCC version 11.1.0
 Compiler Options :-mtune=generic -march=x86-64 -g -O2 -fimplicit-none
 Trying to allocate GB of RAM :   4.8000000000000007     
 Allocation success
  Speed of initialising array =   4.88828278      GB/sec  0.914502800      sec
 Trying to save the data Method 1
  Speed of write Method 1 =  0.441231251      GB/sec   10.1315317      sec   10.0937500   
 Trying to save the data Method 2
  Speed of write  Method 2=   1.66670907      GB/sec   2.68214083      sec   2.68750000   
 Trying to save the data Method 1
  Speed of write Method 1 =  0.448078960      GB/sec   9.97669792      sec   9.96875000   
 Trying to save the data Method 2
  Speed of write  Method 2=   1.75272882      GB/sec   2.55050778      sec   2.53125000   
 File LargeFile.dat test completed OK


I prefer elapsed time testing, but CPU_time is similar in this case.
Paul's implementation of 4GB+ stream I/O is very efficient !!

DanRRight

Posted: Fri Oct 21, 2022 10:05 pm

John, You confirmed my numbers. How about your method? Is it indeed slow?

JohnCampbell

Posted: Sat Oct 22, 2022 7:06 am

Dan,

You are comparing the overhead of writing 200 million 24 byte records vs 1 gigantic 4.8 gigabyte record.
I have tested a middle ground of writing 240 byte or 2,400 byte records where the performance is not so different.

My preference would be to target "records" of about 1 kb to 100 kbytes, although smaller records can be easier to manage in post processing.
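
As a rough sketch of that middle ground, the array can be written in blocks of columns, so that each WRITE transfers a "record" of a chosen size. The name cols_per_write, the file name blocked_demo.bin and the small sizes are illustrative assumptions, not values from the tests above.

```fortran
program blocked_write_demo
  use iso_fortran_env, only: int64, real32
  implicit none
  integer, parameter :: lu = 11
  integer, parameter :: nA = 6, nB = 100000      ! small illustrative sizes
  integer, parameter :: cols_per_write = 1000    ! 1000 columns * 24 bytes = 24,000-byte "records"
  real(real32), allocatable :: Arr4(:,:)
  integer :: i, i1, i2
  integer(int64) :: fsize

  allocate (Arr4(nA, nB))
  do i = 1, nB
    Arr4(:, i) = real(i, real32)
  end do

  open (lu, file='blocked_demo.bin', form='unformatted', access='stream', status='replace')
  do i1 = 1, nB, cols_per_write
    i2 = min(i1 + cols_per_write - 1, nB)        ! last block may be short
    write (lu) Arr4(:, i1:i2)                    ! one contiguous block of columns per WRITE
  end do
  close (lu)

  ! The file should hold exactly the same bytes as a single "write (lu) Arr4".
  inquire (file='blocked_demo.bin', size=fsize)
  if (fsize /= 4_int64 * nA * nB) stop 'FAILED: unexpected file size'
  print *, 'wrote', fsize, 'bytes in blocks of', 4 * nA * cols_per_write, 'bytes'
end program blocked_write_demo
```

Setting cols_per_write to 1 gives Method 1 and setting it to nB gives Method 2, so the same loop can be timed across the whole range of record sizes.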

It is good we have seen the extremes of method 1 vs method 2.

Stream I/O is certainly more portable, as there is no clash with different header/footer formats. It is also easy with stream I/O to replicate or read ifort or gfortran sequential binary file formats.

Another option with STREAM I/O is to read a sequential binary file format and construct a table of record addresses. This can be later used to randomly access the records using "read ( lu, pos=rec_address(rec_id) ) IO_list", which opens up a much more flexible way of accessing the data.
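
A minimal sketch of that idea follows. It assumes each sequential record is stored as a 4-byte length, the data, then the same 4-byte length again, which to my understanding is the gfortran default layout for records under 2 GBytes; other compilers may differ. To keep the example independent of the compiler, the test file is itself written with stream I/O in that layout. The names index_demo.bin, rec_bytes and payload are hypothetical.

```fortran
program record_index_demo
  use iso_fortran_env, only: int32, int64
  implicit none
  integer, parameter :: lu = 11
  integer, parameter :: nrec = 5
  integer(int64) :: rec_address(nrec), address, fsize
  integer(int32) :: rec_bytes(nrec), head, tail
  integer(int32), allocatable :: payload(:)
  integer :: rec_id, n

  ! Build a test file in the assumed layout: [length][data][length] per record.
  open (lu, file='index_demo.bin', form='unformatted', access='stream', status='replace')
  do rec_id = 1, nrec
    n = 10 * rec_id                              ! records of varying length
    allocate (payload(n))
    payload = rec_id
    write (lu) int(4 * n, int32), payload, int(4 * n, int32)
    deallocate (payload)
  end do
  close (lu)
  inquire (file='index_demo.bin', size=fsize)

  ! Pass 1: walk the length markers and note where each record's data starts.
  open (lu, file='index_demo.bin', form='unformatted', access='stream', status='old')
  address = 1
  rec_id = 0
  do while (address < fsize)
    read (lu, pos=address) head
    rec_id = rec_id + 1
    if (rec_id > nrec) stop 'FAILED: more records than expected'
    rec_address(rec_id) = address + 4            ! first data byte of this record
    rec_bytes(rec_id) = head
    read (lu, pos=address + 4 + head) tail
    if (tail /= head) stop 'FAILED: header/footer mismatch'
    address = address + 4 + head + 4             ! skip header, data and footer
  end do

  ! Pass 2: random access to record 3 using the address table.
  rec_id = 3
  allocate (payload(rec_bytes(rec_id) / 4))
  read (lu, pos=rec_address(rec_id)) payload
  close (lu)

  if (size(payload) /= 30 .or. any(payload /= 3)) stop 'FAILED: wrong record returned'
  print *, 'record 3 starts at byte', rec_address(3), 'and holds', size(payload), 'values'
end program record_index_demo
```

For files larger than 2 GBytes the address table needs to be integer*8, as it is here.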

Stream I/O is a very useful addition to Fortran

John

JohnCampbell

Posted: Sun Oct 23, 2022 3:43 am

Dan,

I have increased the test options for record size (ie number of records) vs I/O performance. ( Do pass = 1,4 )

However, depending on the amount of memory installed, the test appears to be testing Microsoft disk buffering rather than the actual disk performance. This is because the test program generates the file in memory and does not ensure that the data is transferred to the disk.

I have an HP i7-6800 notebook with a SATA SSD, but only 8 GBytes of memory. Its performance is nothing like the NVMe M.2 drives we have been reporting, mainly because there is not sufficient memory for both storing the 6 GB array and buffering the 6 GB file.
I think the estimated disk I/O performance of 2 to 4 GBytes/sec being reported is due mainly to memory buffering and not the SSD drives (which also have memory buffers).

I thought the background to your tests was reading large data sets from disk into memory. Again, these tests cannot be repeated, because once the data has been read from disk into the memory buffers, the reported transfer rates no longer represent disk I/O performance.

I think if you want to understand the I/O performance, you should test reading a terabyte data file (i.e. much larger than installed memory).

Also, while you are claiming 2 to 6 gigabytes/sec disk transfer rates (really memory buffer transfer rates), the performance for processing the real data can be much less, probably less than 0.5 GBytes/sec.

In real data tests, the NVMe M.2 drive performance rates for I/O will decline significantly once the PC memory buffering and SSD drive memory buffering capacities are exhausted.

( Hopefully, my next PC will have 128 gbytes of DDR5 memory !! )

DanRRight

Posted: Sun Oct 23, 2022 5:47 am

I assure you that 128GB, like 640k in the past, is not "enough to fit everyone".

All times are GMT + 1 Hour
Page 2 of 7