forums.silverfrost.com Welcome to the Silverfrost forums
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7928 Location: Salford, UK
|
Posted: Tue Oct 11, 2022 8:27 am Post subject: |
|
|
John
At the moment "pos = end_pos" only works for 32 bit end_pos.
I will aim to fix this and then upload a new set of DLLs. We can then consider other issues after you have tested this fix. |
PaulLaidler Site Admin
Joined: 21 Feb 2005 Posts: 7928 Location: Salford, UK
|
Posted: Tue Oct 11, 2022 11:56 am Post subject: |
|
|
John
I have sent you a pm with links to a trial fix. |
DanRRight
Joined: 10 Mar 2008 Posts: 2820 Location: South Pole, Antarctica
|
Posted: Thu Oct 13, 2022 12:52 am Post subject: |
|
|
Thanks, Paul. |
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
|
Posted: Tue Oct 18, 2022 12:48 pm Post subject: |
|
|
The following is my latest stream I/O test program; it demonstrates problems with stream I/O in FTN95 when overwriting files.
Others may be able to confirm my tests or identify problems with my approach.
This is a program that can be run in PLATO, using either FTN95 /64 or gfortran:
https://www.dropbox.com/s/cxe6wr8vdd3vl6n/stream_readc.f90?dl=0
I am using gfortran as a reference, as it appears to perform correctly.
The comments in the program describe the 3 tests in more detail; the tests produce 3 files which should all be the same.
FTN95 fails in tests 2 and 3, for WRITE ( pos=address ) when rewriting and for INQUIRE.
Please build using ftn95 /64 stream_readc.f90 /link then run.
You can compare the 3 files from the 3 tests:
stream_filec.bin
stream_filez.bin
stream_file$.bin
( You can delete these files before the test, but it is not necessary in order to confirm the errors. )
They should all be the same, but with FTN95 they are not.
The file stream_readc.log is appended with the results of each test run to confirm the errors. |
DanRRight
Joined: 10 Mar 2008 Posts: 2820 Location: South Pole, Antarctica
|
Posted: Tue Oct 18, 2022 5:43 pm Post subject: |
|
|
John, do your tests need the new DLL with the latest fix for the 4GB issue? And what are the speeds of READ / WRITE ( pos=address ) in comparison with Method 2 above? I suspect it might not be worth using any more. |
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
|
Posted: Wed Oct 19, 2022 5:15 am Post subject: |
|
|
Dan : Do your tests need the new DLL with the latest fix for the 4GB issue?
Ans : The tests I am doing use version 8.91.1.
This version has a fix for integer*8 addresses.
I have not yet tested 4gb+ files.
The main purpose of my tests is to confirm that READ ( ..., pos=address, ... ) and WRITE ( ..., pos=address, ... ) are working for stream I/O.
What I am finding is that "WRITE ( lu, pos=address, ... ) IO_list" appears to fail on rewriting to the file.
I have found that "READ ( lu, pos=address, ... ) IO_list" appears to work in the test program I attached.
This test also uses integer*8, so FTN95 Ver 8.91.1 does address the integer*8 problems, but fails on re-writing data to the file.
The "pos=address" capability provides for an addressable file data structure.
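The failing pattern reduces to a few lines. This is a minimal sketch, not taken from the attached test program; the unit number, file name and values are illustrative:

```fortran
program pos_rewrite
  implicit none
  integer*8 :: address
  integer*4 :: val, check
  open (11, file='pos_test.bin', form='unformatted', access='stream')
  address = 1_8
  val = 111
  write (11, pos=address) val      ! first write at this position works
  val = 222
  write (11, pos=address) val      ! rewriting the same position is what appears to fail
  read  (11, pos=address) check    ! a correct compiler reads back 222
  print *, 'expected 222, got', check
  close (11)
end program pos_rewrite
```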
Dan : And what are the speeds of READ / WRITE ( pos=address ) in comparison with Method 2 above? I suspect it might not be worth using any more.
Ans : Stream read/write ( pos=address ) should be supported by Windows buffered I/O, similar to Fortran unformatted, fixed-length record performance. It should be a good solution.
Stream I/O could be a much better solution than "Method 2", which appears to be Fortran unformatted sequential I/O for records larger than 2 gbytes.
I still have not seen the header/footer syntax for this FTN95 solution, but I suspect it is not compatible with ifort/gfortran. I have read that ifort and gfortran are compatible for Fortran unformatted sequential I/O, including records larger than 2 gbytes, although the header/footer syntax is a bit of a mess, using 2 gbyte sub-records.
I would expect that stream I/O is a better solution, although it is probably safer to partition the data into records smaller than 2 gbytes.
It is probably better to have a file dump something like:
Code: | write(11) nB,size(Arr4(:,1))
do i=1,nB
write(11) i,Arr4(:,i)
enddo
write (11) -1
|
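For completeness, a round-trip sketch of that dump layout (sizes and values are illustrative). Note the record index and the row must be read in one statement because they share a record, and the terminator is a separate record holding only -1:

```fortran
program dump_roundtrip
  implicit none
  integer*4 :: nA, nB, i, k
  real*4, allocatable :: Arr4(:,:)
  nA = 6 ; nB = 4                          ! small illustrative sizes
  allocate ( Arr4(nA,nB) )
  do i = 1, nB
     Arr4(:,i) = real(i)                   ! arbitrary test data
  enddo
  open (11, file='dump.bin', form='unformatted')
  write (11) nB, size(Arr4(:,1))           ! header, as in the dump above
  do i = 1, nB
     write (11) i, Arr4(:,i)               ! index and row share one record
  enddo
  write (11) -1                            ! terminator record
  rewind (11)
  read (11) nB, nA
  do k = 1, nB
     read (11) i, Arr4(:,i)                ! i is read first, then indexes the row
  enddo
  read (11) i
  if (i /= -1) stop 'bad terminator'
  print *, 'round trip OK'
  close (11)
end program dump_roundtrip
```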
DanRRight
Joined: 10 Mar 2008 Posts: 2820 Location: South Pole, Antarctica
|
Posted: Wed Oct 19, 2022 9:29 pm Post subject: |
|
|
Depending on the array dimension sizes, stream I/O this way will be up to 10x slower than the fast Method 2. I don't remember if the files made by the fast Method 2 are compatible with other compilers and Linux. I currently temporarily do not use Method 2. Method 1 is compatible.
Using READF@ beats them all. |
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
|
Posted: Thu Oct 20, 2022 4:14 am Post subject: Re: |
|
|
DanRRight wrote: | Depending on array dimension sizes this way stream I/O will be up to 10x slower than by fast Method2 |
A bit of an over-reach in the 10x !
I just think it is better to write "records" that are smaller than 4 gbytes, although I do not recall what size nB and size(Arr4(:,1)) are. Splitting the binary dump into pieces does give the option of inspecting the data as it is read, rather than waiting for many gigabytes to be loaded.
It is hard to keep up with the latest available PCIe 4.0 M.2 SSDs, with ver 5.0 to be available shortly. My understanding is that these 4.0 drives are over 5,000 MBps, which I think is 5 gigabytes per second ! At these rates, you have to look carefully at whether the resulting Fortran code can process data this fast. |
DanRRight
Joined: 10 Mar 2008 Posts: 2820 Location: South Pole, Antarctica
|
Posted: Thu Oct 20, 2022 8:47 am Post subject: |
|
|
1) There is no good reason to chop the files by size < 4GB and complicate the code.
2) Put 6 into my test above instead of 11 and you will get speeds of 0.28 GB/sec --> 10x slower than Method 2 and 20x slower than with READF@. |
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
|
Posted: Fri Oct 21, 2022 6:34 am Post subject: |
|
|
Dan,
I updated your program and tested with FTN95 and gfortran.
Method 1 produces 0.4 GBytes/sec, while
Method 2 is up to 2.9 GBytes/sec.
This is on a Samsung 970 EVO Plus NVMe M.2 drive, which I think is PCIe Gen 3.0.
The program is Code: | ! compilation: ftn95 aaa.f90 /link /64 >z
!
use iso_fortran_env
integer*4, parameter :: million = 1000000
real*4, dimension(:,:), allocatable :: Arr4
integer*4 :: nA, nB, ierr, i, pass
real*4 :: SpeedGBps, t0, t1, dt
real*4, external :: delta_sec
write (*,*) 'Compiler Version :',compiler_version ()
write (*,*) 'Compiler Options :',compiler_options ()
dt = delta_sec ()
nA = 6 ! 11
nB = 200 * million
!...Allocating array
Print*, 'Trying to allocate GB of RAM :', 1.d-9 * 4. * nA * nB
allocate ( Arr4 (nA, nB), stat = ierr)
if (ierr.eq.0) then
Print*, 'Allocation success'
else
Print*, 'Fail to allocate'
goto 1000
endif
!...Filling the array with some data
do i=1,nB
Arr4(:,i) = real(i)   ! arbitrary test data; a literal constructor here would need exactly nA elements
enddo
dt = delta_sec ()
SpeedGBps = 4. * nA * nB / 1024.**3 /(dt+1.e-10)
print*,' Speed of initialising array =', SpeedGBps,' GB/sec',dt,' sec' ! typically ~0.5 GB/s
do pass = 1,2 ! do 2 passes for file not/already exists
Print*,'Trying to save the data Method 1 '
call cpu_time(t0)
dt = delta_sec ()
open (11, file='LargeFile.dat', FORM='UNFORMATTED', access="STREAM", err=900)
do i=1,nB
write(11) Arr4(:,i)
enddo
close(11)
call cpu_time(t1)
dt = delta_sec ()
!...Speed of writing Method 1
SpeedGBps = 4. * nA * nB / 1024.**3 /(dt+1.e-10)
print*,' Speed of write Method 1 =', SpeedGBps,' GB/sec',dt,' sec',(t1-t0) ! typically ~0.5 GB/s
Print*,'Trying to save the data Method 2'
call cpu_time(t0)
dt = delta_sec ()
open (11, file='LargeFile.dat', FORM='UNFORMATTED', access="STREAM", err=900)
write(11) Arr4
close(11)
call cpu_time(t1)
dt = delta_sec ()
!...Speed of writing Method 2
SpeedGBps = 4. * nA * nB / 1024.**3 /(dt+1.e-10)
print*,' Speed of write Method 2=', SpeedGBps,' GB/sec',dt,' sec',(t1-t0) ! typically ~2.6 GB/s
end do ! pass
write (*,*) 'File LargeFile.dat test completed OK'
goto 1000
!...............
!...Errors
900 Print*,'Can not open file LargeFile.dat'
goto 1000
910 Print*,'Can not save file LargeFile.dat'
1000 Continue
End
real*4 function delta_sec ()
integer*8 :: tick, rate, last=0
call system_clock ( tick, rate)
delta_sec = dble (tick-last) / dble (rate)
last = tick
end function delta_sec
|
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
|
Posted: Fri Oct 21, 2022 6:38 am Post subject: |
|
|
The .bat file to test is Code: | set program=dan_stream
now > %program%.log
del LargeFile.dat
del %program%.exe
del %program%.obj
del %program%.o
ftn95 %program%.f90 /64 /debug /link >>%program%.log
%program% >>%program%.log
del LargeFile.dat
del %program%.exe
del %program%.obj
del %program%.o
gfortran %program%.f90 -g -fimplicit-none -O2 -o %program%.exe >>%program%.log
%program% >>%program%.log
notepad %program%.log
|
The results are: Code: | It is now Friday, 21 October 2022 at 16:18:53.036
[FTN95/x64 Ver. 8.91.1.0 Copyright (c) Silverfrost Ltd 1993-2022]
Licensed to: John Campbell
Organisation: John Campbell
[Current options] 64;DEBUG;ERROR_NUMBERS;IMPLICIT_NONE;INTL;LINK;LOGL;
0089) 910 Print*,'Can not save file LargeFile.dat'
WARNING - 21: Label 910 is declared, but not used
NO ERRORS, 1 WARNING [<main program> FTN95 v8.91.1.0]
NO ERRORS [<DELTA_SEC> FTN95 v8.91.1.0]
[SLINK64 v3.04, Copyright (c) Silverfrost Ltd. 2015-2022]
Loading C:\temp\forum\stream_io\lgotemp@.obj
Creating executable file dan_stream.exe
Compiler Version :FTN95 v8.91.1
Compiler Options :64;DEBUG;ECHO_OPTIONS;ERROR_NUMBERS;IMPLICIT_NONE;INTL;LINK;LOGL;
Trying to allocate GB of RAM : 4.80000000000
Allocation success
Speed of initialising array = 1.85777 GB/sec 2.40630 sec
Trying to save the data Method 1
Speed of write Method 1 = 0.387883 GB/sec 11.5250 sec 11.4062
Trying to save the data Method 2
Speed of write Method 2= 2.69314 GB/sec 1.65990 sec 1.56250
Trying to save the data Method 1
Speed of write Method 1 = 0.386933 GB/sec 11.5533 sec 11.5312
Trying to save the data Method 2
Speed of write Method 2= 2.87482 GB/sec 1.55500 sec 1.56250
File LargeFile.dat test completed OK
Compiler Version :GCC version 11.1.0
Compiler Options :-mtune=generic -march=x86-64 -g -O2 -fimplicit-none
Trying to allocate GB of RAM : 4.8000000000000007
Allocation success
Speed of initialising array = 4.88828278 GB/sec 0.914502800 sec
Trying to save the data Method 1
Speed of write Method 1 = 0.441231251 GB/sec 10.1315317 sec 10.0937500
Trying to save the data Method 2
Speed of write Method 2= 1.66670907 GB/sec 2.68214083 sec 2.68750000
Trying to save the data Method 1
Speed of write Method 1 = 0.448078960 GB/sec 9.97669792 sec 9.96875000
Trying to save the data Method 2
Speed of write Method 2= 1.75272882 GB/sec 2.55050778 sec 2.53125000
File LargeFile.dat test completed OK
|
I prefer elapsed time testing, but CPU_time is similar in this case.
Paul's implementation of 4GB+ stream I/O is very efficient !! |
DanRRight
Joined: 10 Mar 2008 Posts: 2820 Location: South Pole, Antarctica
|
Posted: Fri Oct 21, 2022 10:05 pm Post subject: |
|
|
John, you confirmed my numbers. How about your method? Is it indeed slow? |
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
|
Posted: Sat Oct 22, 2022 7:06 am Post subject: |
|
|
Dan,
You are comparing the overhead of writing 200 million 24-byte records vs 1 gigantic 4.8 gigabyte record.
I have tested a middle ground of writing 240-byte or 2,400-byte records, where the performance is not so different.
My preference would be to target "records" of about 1 kbyte to 100 kbytes, although smaller records can be easier to manage in post processing.
It is good we have seen the extremes of Method 1 vs Method 2.
Stream I/O is certainly more portable, as there is no clash between different header/footer formats. It is also easy with stream I/O to replicate or read the ifort or gfortran sequential binary file formats.
Another option with STREAM I/O is to read a sequential binary file format and construct a table of record addresses. This can be later used to randomly access the records using "read ( lu, pos=rec_address(rec_id) ) IO_list", which opens up a much more flexible way of accessing the data.
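A sketch of that indexing idea, assuming a simple illustrative file layout (a count header followed by fixed-length real*4 records, not any particular compiler's sequential format); INQUIRE ( pos= ) captures the byte address at the start of each record:

```fortran
program index_stream
  implicit none
  integer*8, allocatable :: rec_address(:)
  integer*8 :: pos
  integer*4 :: nrec, len_rec, i
  real*4, allocatable :: record(:)
  open (12, file='data.bin', form='unformatted', access='stream')
  read (12) nrec, len_rec                   ! illustrative header: record count and length
  allocate ( rec_address(nrec), record(len_rec) )
  do i = 1, nrec
     inquire (12, pos=pos)                  ! byte position of the start of record i
     rec_address(i) = pos
     read (12) record                       ! step over record i
  end do
  ! later: random access to any record via its stored address
  read (12, pos=rec_address(3)) record      ! e.g. fetch record 3 directly
  close (12)
end program index_stream
```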
Stream I/O is a very useful addition to Fortran.
John |
JohnCampbell
Joined: 16 Feb 2006 Posts: 2554 Location: Sydney
|
Posted: Sun Oct 23, 2022 3:43 am Post subject: |
|
|
Dan,
I have increased the test options for record size (ie number of records) vs I/O performance. ( Do pass = 1,4 )
However, depending on the amount of memory installed, the test appears to be testing Microsoft disk buffering more than actual disk performance. This is because the test program generates the file in memory and does not enforce that the data is transferred to the disk.
I have an HP i7-6800 notebook with a SATA SSD, but only 8 gbytes of memory. Its performance is nothing like the NVMe M.2 drives we have been reporting, mainly because there is not sufficient memory for both storing the 6gb array and buffering the 6gb file.
I think the estimated disk I/O performance of 2 to 4 gbytes/sec being reported is due mainly to memory buffering and not to the SSD drives (which also have memory buffers).
I thought the background to your tests was reading large data sets from disk into memory. These tests cannot be repeated either, as once the data has been read from disk into the memory buffers, the reported transfer rates no longer represent disk I/O performance.
I think if you want to understand the I/O performance, you should test reading a terabyte data file (ie much larger than installed memory).
Also, while you are claiming 2 to 6 gigabytes/sec disk transfer rates (really memory buffer transfer rates), the performance for processing the real data can be much less, probably less than 0.5 GBytes/sec.
In real data tests, the NVMe M.2 drive I/O performance will decline significantly once the PC memory buffering and SSD drive memory buffering capacities are exhausted.
( Hopefully, my next PC will have 128 gbytes of DDR5 memory !! ) |
DanRRight
Joined: 10 Mar 2008 Posts: 2820 Location: South Pole, Antarctica
|
Posted: Sun Oct 23, 2022 5:47 am Post subject: |
|
|
I can assure you that 128GB, like 640K in the past, is not "enough for everyone". |