Bug in SCC 3.88
mecej4



Joined: 31 Oct 2006
Posts: 1884

Posted: Mon Dec 19, 2016 4:04 pm

DanRRight wrote:
Something deeply wrong is here

1) Does the CrystalDiskMark test use parallelism to leave all the test results here in shameful misery?

Test results don't feel shame. Question for a philosopher, perhaps?

I look out of my window and I see a bird flying around, chirping happily. I don't feel shame for not being able to fly, and the bird probably is not jealous because it cannot talk Fortran with DanRRight.

Quote:
Also as a note, ReadF@ and ReadFA@ may be fast at reading a big chunk of data, but they are still very slow at reading line by line (10 numbers or ~160 characters per line)

Yes, these are among the facts of life that one has to accept and cope with. I think that we have covered these points already, and repeatedly.
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Tue Dec 20, 2016 2:04 am

Then perhaps I did not articulate my points clearly. Leaving philosophy off the table (one could take an entirely different view here: that what you hear is not your bird's happy song but rather a "swan song". Every spring a bird triples the size of its family, only for the population to be the same size next spring. That means that out of a potential life expectancy of a decade the poor birdy actually lives about a quarter of a year. Isn't it total misery to be food for others or to die of hunger?), here are the questions put a bit differently.

1) I see other read/write tests that are almost an order of magnitude faster than anything anyone in this forum can show. What are the reasons for that? Can we get similar speeds if everything is "MS DLLs"?

2) Why is there no way to load 12 GB into RAM in one second directly, presumably by somehow bypassing the slow format processing, when we see that it is possible to unload these 12 GB into RAMdisk space, which is supposed to be slower than RAM itself?

OK, in our case we are kind of slow; the free C compiler leaves us to bite the dust and listen to the birds laughing, but we can still load data into a 1-dimensional array Arr(X) at 1.8 GB/s. Can we load the data into a 3D array Arr(X,Y,Z) at the same speed?

For the sake of discussion, let me illustrate the last point by suggesting one potential way of doing that. I need to put the 10 numbers from each line of the file into the array Arr(X,Y,Z), or Arr(10,1000000,100) to be exact, which holds 1 billion numbers. The data on the disk is formatted a bit differently from what we played with before in this and other threads. The first 2 additional numbers in each line are the array indices Y and Z, and the remaining 10 numbers go into the X elements. This is done so that X, Y and Z need not be calculated, and to eliminate the processing needed to compute the position of an element in the array Arr. This index calculation overhead could actually be negligible compared with the additional time spent reading the indices; I have not checked that yet. Adding two numbers per line decreases the reading speed by just 20%, which means that instead of 12 GB/s we will get 10. "Big" deal....

Again, the superfast reading routine, let's call it ReadSuperFast2@, will read 12 numbers, of which the first two are the indices Y1 and Z1, and place the other 10 numbers into first-index positions 1 to 10:

Arr(1:10, Y1, Z1),

then

Arr(1:10, Y2, Z2) etc.

The simpler case of a lower-rank array would require only one index Y and a routine ReadSuperFast1@.

The more general case would require a routine ReadSuperFast3@, which would use all 3 indices X, Y and Z to fill sparse array data. Even in this case the read speed would be 12/4 = 3 GB/s.
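
The line layout described above can be sketched with an ordinary list-directed READ. This only illustrates the data layout, not the hypothetical ReadSuperFast2@: the names indexed_lines and load_indexed are made up for the sketch, and a formatted READ like this one should be expected to run at formatted-read speeds, not GB/s.

```fortran
module indexed_lines
   implicit none
contains
   ! Read lines from an already opened formatted unit until end of file.
   ! Each line holds: iy, iz, then size(arr,1) values for arr(:, iy, iz).
   subroutine load_indexed(iu, arr, nlines)
      integer, intent(in)  :: iu
      real, intent(inout)  :: arr(:,:,:)
      integer, intent(out) :: nlines     ! number of lines stored
      real    :: vals(size(arr,1))
      integer :: iy, iz, ios

      nlines = 0
      do
         read (iu, *, iostat=ios) iy, iz, vals
         if (ios /= 0) exit                          ! end of file or bad line
         if (iy < 1 .or. iy > size(arr,2)) exit      ! index out of range
         if (iz < 1 .or. iz > size(arr,3)) exit
         arr(:, iy, iz) = vals                       ! one contiguous column
         nlines = nlines + 1
      end do
   end subroutine load_indexed
end module indexed_lines
```

Because the first index varies fastest in Fortran, each line lands in one contiguous run of memory, which is the reason for putting the 10 values on the first index.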
mecej4



Joined: 31 Oct 2006
Posts: 1884

Posted: Tue Dec 20, 2016 10:58 am

The CrystalDiskMark benchmark program is, as far as I can see, just a GUI placed on top of the Microsoft DiskSpd command line utility. Instead of arbitrarily picking the highest speed reported by CrystalDiskMark, which corresponds to using multiple threads, and feeling miserable, read through the DiskSpd options in https://github.com/Microsoft/diskspd/blob/master/DiskSpd_Documentation.pdf , select the options best matching your intended usage of I/O, and rejoice.

Your own reported speed of 1.8 GiB/s on your PC for block binary I/O is actually about the same as the speed reported by CrystalDisk for single-thread sequential I/O on large files. You can try this out yourself. Open a command window in Administrator mode, change to the directory containing your large input file, and run the command
Code:
<DiskSpeed directory>\amd64fre>diskspd.exe -fs -F1 -b64K <big_data_file>


Even these speeds are out of reach if you must do formatted READ. As we saw in http://forums.silverfrost.com/viewtopic.php?t=3359&postdays=0&postorder=asc&start=30, formatted READ using standard Fortran gives speeds of about 30 MB/s. If we assume that the input data contains no errors and we do the format conversions ourselves, we can raise the speed to about 300 MB/s.

That is approximately the best that you can do with a single thread, even if you could do disk I/O with infinite speed.
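
To make "doing the format conversions ourselves" concrete, here is a minimal sketch of a hand-written text-to-number conversion. It is not the code from the other thread; fast_conv and fast_real are names invented for this illustration. It assumes error-free input consisting of an optional sign, digits, and an optional decimal point with more digits, with no exponent field. Whether it gets anywhere near 300 MB/s depends on the machine and the compiler.

```fortran
module fast_conv
   implicit none
contains
   ! Convert text such as '-12.375' to DOUBLE PRECISION without using a
   ! formatted READ. Assumes well-formed input: optional sign, digits,
   ! optional '.' followed by digits. No exponent, no error checking.
   pure function fast_real(s) result(x)
      character(len=*), intent(in) :: s
      double precision :: x
      double precision :: scale
      integer :: i, d
      logical :: neg, frac

      x = 0d0
      scale = 1d0
      neg = .false.
      frac = .false.
      do i = 1, len_trim(s)
         select case (s(i:i))
         case ('-')
            neg = .true.
         case ('+', ' ')
            continue                      ! skip plus sign and leading blanks
         case ('.')
            frac = .true.                 ! digits from here on are fractional
         case default
            d = iachar(s(i:i)) - iachar('0')
            x = 10d0*x + d                ! accumulate all digits as an integer
            if (frac) scale = 10d0*scale  ! remember how far to shift back
         end select
      end do
      x = x/scale
      if (neg) x = -x
   end function fast_real
end module fast_conv
```

The speed gain in such a routine comes from everything it leaves out: no format parsing, no exponent handling, no checks for bad characters. That is only safe when the input is known to be clean.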


Last edited by mecej4 on Tue Dec 20, 2016 12:58 pm; edited 2 times in total
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Tue Dec 20, 2016 12:45 pm

Such oil does exist, but unfortunately no one sells it to the Fortraneers in this group. Parallel NetCDF and HDF5 are just a few examples. All the libraries for that exist, but again it would be good to find Fortraneers who will do the initial testing with FTN95. I've heard the complaints too, but slow handling of large data is more than a nail in the foot.
mecej4



Joined: 31 Oct 2006
Posts: 1884

Posted: Tue Dec 20, 2016 1:43 pm

Even if we don't agree on what I/O speeds are possible, there is one good outcome from these discussions.

Herman Cain, a US Republican Primary Presidential candidate in 2012, became well known for his 9-9-9 tax plan. We have come up with something similar and quite useful in planning large programs.

On a PC circa 2016, we can use this rule of thumb:
Code:

      30 MB/s      -     300 MB/s      -      3 GB/s
     formatted            custom            unformatted
       read           formatted read           read

are the upper limits of what is possible with a single thread processing a large input file.
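
As a worked example of the rule, take the 12 GB data set mentioned earlier in the thread. At 30 MB/s it needs 400 s, at 300 MB/s 40 s, and at 3 GB/s 4 s. A tiny helper for this kind of planning arithmetic might look as follows (io_estimate and read_seconds are hypothetical names, and decimal units are assumed):

```fortran
module io_estimate
   implicit none
contains
   ! Seconds needed to transfer size_gb gigabytes at rate_mb_s MB/s.
   ! Decimal units are assumed: 1 GB = 1000 MB.
   pure function read_seconds(size_gb, rate_mb_s) result(t)
      real, intent(in) :: size_gb, rate_mb_s
      real :: t
      t = 1000.0*size_gb/rate_mb_s
   end function read_seconds
end module io_estimate
```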
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Thu Dec 22, 2016 6:33 am

Mecej4, did you change writef to readf or to readfa when reading the file? With the first one my code crashes; with the second it gives 3x lower performance than writef. Please post your code, I couldn't find what is wrong.

I still hope to find a much simpler mechanism for loading binary data directly into RAM, bypassing all kinds of deciphering. 300 MB/s is not 2, 3 or 5 but 40x lower than the peak I/O speed. Are we living in the times of Big Data or not?

The only difference between what I need and what we have with readf@/readfa@ (which hopefully can reach the same GB/s as writef, as you claim in your case) is that they read data into a 1D array Arr(ix), while I need to read it into 2D or 3D ones like Arr(ix,iy,iz). There may be some tricks and workarounds to solve that problem; I'd like to test a couple of them (EQUIVALENCE, if it is not yet totally obsolete, or cutting a structured 1D array into pieces).

/* What was the average level of taxation in the current US territory when it was still under the UK, versus the average US taxation rate today?
mecej4



Joined: 31 Oct 2006
Posts: 1884

Posted: Thu Dec 22, 2016 1:23 pm

DanRRight wrote:
Did you change writef to readf or readfa when reading file?

Readf@. Readfa@ is for text files, and only reads one line with each call.

Quote:
The only difference between what we have with readf@/readfa@ is that they read data into 1D array Arr(ix) while i need to read it into 2D or 3D ones like Arr(ix,iy,iz).

At the machine code level (or even assembler level) there is no such thing as a 2D or 3D array. File I/O, whether binary or formatted, moves bytes between the file and a memory buffer designated by its base address. Fortran uses the column-major convention for multi-dimensional arrays, so given the declaration DIMENSION A(imax,jmax), the statement READ (iu) A is the same as READ (iu) (A(1:imax, j), j = 1, jmax). If you do I/O with only a section of A, the compiler will need to emit extra code to break up the incoming data into chunks and put them into discontiguous parts of memory (for READ), or to assemble the data from different blocks of memory and send it to the file (for WRITE). The fastest I/O is possible when doing unformatted/binary transfers to/from whole arrays.
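
The column-major rule can be written down as a formula. The sketch below gives the 1-based position of A(i,j,k) within the storage of an array declared A(n1,n2,n3); colmajor and offset3 are names made up for the illustration. Default integers are used, so for arrays with more than about 2 billion elements a 64-bit integer kind would be needed.

```fortran
module colmajor
   implicit none
contains
   ! 1-based position of A(i,j,k) in storage order for A(n1,n2,n3).
   ! The first index varies fastest; n3 does not enter the formula.
   pure function offset3(i, j, k, n1, n2) result(pos)
      integer, intent(in) :: i, j, k, n1, n2
      integer :: pos
      pos = i + n1*((j - 1) + n2*(k - 1))
   end function offset3
end module colmajor
```

In other words, if flat is a 1D array holding the bytes as read and a = reshape(flat, shape(a)), then a(i,j,k) equals flat(offset3(i,j,k,n1,n2)). A whole-array binary read into a 3D array therefore needs no rearrangement at all, provided the file was written in the same order.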

Here are the test source codes. First, the binary I/O code:
Code:
program FIOBIN
implicit none
!
! Test raw file I/O speeds to a 64 MiB file. Check for space before running!
!
integer, parameter :: I2 = selected_int_kind(4), I4 = selected_int_kind(9), &
                      I8 = selected_int_kind(18)
integer, parameter :: BSIZ = Z'4000000'   ! 64 MiB
character (Len=1) :: buf(BSIZ)
integer (I2) :: hndl, ecode
integer (I8) :: nbytes = BSIZ, nread
real :: t1,t2
character(len=7) :: fil='BIG.BIN'
!
call openw@(fil,hndl,ecode)
if(ecode /= 0)stop 'Error opening file BIG.BIN for writing'
call cpu_time(t1)
call writef@(buf,hndl,nbytes,ecode)
call cpu_time(t2)
if(ecode /= 0)stop 'Error writing file BIG.BIN'
call closef@(hndl,ecode)
if(ecode /= 0)stop 'Error closing file'
write(*,'(A,2x,F7.3,A)')'Time for writing 64 MB file: ',t2-t1,' s'
write(*,'(A,6x,F6.0,A)')'Estimated write throughput = ',64.0/(t2-t1),' MiB/s'
!
call openr@(fil,hndl,ecode)
if(ecode /= 0)stop 'Error opening file BIG.BIN for reading'
call cpu_time(t1)
call readf@(buf,hndl,nbytes,nread,ecode)
call cpu_time(t2)
if(ecode /= 0)stop 'Error reading file BIG.BIN'
call closef@(hndl,ecode)
if(ecode /= 0)stop 'Error closing file'
write(*,'(A,2x,F7.3,A)')'Time for reading 64 MB file: ',t2-t1,' s'
write(*,'(A,6x,F6.0,A)')'Estimated read throughput  = ',64.0/(t2-t1),' MiB/s'
!
call erase@(fil,ecode)
call doserr@(ecode)
end program

On my laptop, the output:
Code:
s:\FTN95>fiobin
Time for writing 64 MB file:     0.063 s
Estimated write throughput =        1024. MiB/s
Time for reading 64 MB file:     0.031 s
Estimated read throughput  =        2048. MiB/s

Next, the ASCII I/O code:
[***HAVE TO BREAK UP THE POST -- FORUM line limit reached***]


Last edited by mecej4 on Thu Dec 22, 2016 1:27 pm; edited 1 time in total
mecej4



Joined: 31 Oct 2006
Posts: 1884

Posted: Thu Dec 22, 2016 1:26 pm

[CONTINUED]
Next, the ASCII I/O test code:
Code:
program FIOASC
implicit none
!
! Test text file I/O speeds to a 64 MiB file. Check for space before running!
!
integer, parameter :: I2 = selected_int_kind(4), I4 = selected_int_kind(9), &
                      I8 = selected_int_kind(18)
integer, parameter :: FSIZ = Z'100000'   ! File size, 2^20 lines
character (Len=64) :: lbuf               ! Line buffer
integer (I2) :: hndl, ecode
integer (I8) :: nlines = FSIZ
real :: t1,t2
character(len=7) :: fil='BIG.TXT'
integer :: i,j,nread
character(len=1) :: c
!
lbuf = 'If at first you do not succeed, try, try, try again, she said.'
call openw@(fil,hndl,ecode)
if(ecode /= 0)stop 'Error opening file BIG.TXT for writing'
call cpu_time(t1)
do i=1,nlines
   call writefa@(lbuf,hndl,ecode)
   if(ecode /= 0)stop 'Error writing file BIG.TXT'
   j=mod(i,64)+1         ! swap characters to provide variation in lines written
   c=lbuf(j:j)
   lbuf(j:j)=lbuf(1:1)
   lbuf(1:1)=c
end do   
call cpu_time(t2)
call closef@(hndl,ecode)
if(ecode /= 0)stop 'Error closing file'
write(*,'(A,2x,F7.3,A)')'Time for writing 64 MB file: ',t2-t1,' s'
write(*,'(A,6x,F6.0,A)')'Estimated write throughput = ',64.0/(t2-t1),' MiB/s'
!
call openr@(fil,hndl,ecode)
if(ecode /= 0)stop 'Error opening file BIG.TXT for reading'
call cpu_time(t1)
do i=1,nlines
   call readfa@(lbuf,hndl,nread,ecode)
   if(ecode /= 0)stop 'Error reading file BIG.TXT'
end do   
call cpu_time(t2)
call closef@(hndl,ecode)
if(ecode /= 0)stop 'Error closing file'
write(*,'(A,2x,F7.3,A)')'Time for reading 64 MB file: ',t2-t1,' s'
write(*,'(A,6x,F6.0,A)')'Estimated read throughput  = ',64.0/(t2-t1),' MiB/s'
!
call erase@(fil,ecode)
call doserr@(ecode)
end program

My output from this:
Code:
s:\FTN95>fioasc
Time for writing 64 MB file:     0.141 s
Estimated write throughput =         455. MiB/s
Time for reading 64 MB file:     0.266 s
Estimated read throughput  =         241. MiB/s

Expect even slower formatted I/O if format conversions of floating point REALs or DOUBLE PRECISION values are to be done. We have seen some examples of this in a recent thread, http://forums.silverfrost.com/viewtopic.php?t=3359 .


Last edited by mecej4 on Thu Dec 22, 2016 2:08 pm; edited 1 time in total
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Thu Dec 22, 2016 1:59 pm

Thanks, I found my error; it was due to a missing parameter... damn, it seems that besides possible Alzheimer's I am also getting ADT ;)

By the way, I got the following result for the second code (I increased the file size to ~1 GB; 64 MB is too small to measure the time correctly):

Code:
Time for writing file of size MB:       819  1.141 s
Estimated throughput =         718. MB/s
Time for reading file of size MB:       819  1.953 s
Estimated throughput =         419. MB/s


But more interesting was the first test

Code:
Time for writing file of size MB:      1024  0.516 s
Estimated throughput =        1986. MB/s
Time for reading file of size MB:      1024  0.234 s
Estimated throughput =        4369. MB/s


It shows a reading speed of 4.4 GB/s. At that speed it is already interesting to work. Now, if after reading the data we fill the columns of another array of the same size, but 3D, from this 1D data, pre-formed and saved column by column beforehand (I am thinking about it but do not yet know how), we may get tremendous reading speed... In this case the loaded multi-GB array buf(BSIZ) will be wasted, but who cares: memory gets cheaper and cheaper, and the 64-bit compiler is hopefully close to complete.

What we need is a way of transferring real*4 and integer*4 numbers into bytes, saving them in a character array, saving this array to disk, and then reading this binary array back exactly as the 4 bytes that represent real*4 and integer*4 numbers inside the computer. This way we avoid the burden of processing.
mecej4



Joined: 31 Oct 2006
Posts: 1884

Posted: Thu Dec 22, 2016 5:19 pm

DanRRight wrote:
... 64bit compiler is hopefully close to be complete.


If you try the FIOASC program with /64, you will see that there is a performance problem. The writing phase is about ten times slower than with FTN95-32 bit, although the reading speed is about the same.
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Thu Dec 22, 2016 11:59 pm

Then we don't have to use it; it is slow anyway, even if the excessive slowness gets fixed in future. Binary readf@ is OK. Or am I missing something?
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Fri Dec 23, 2016 4:46 am

I wonder why Salford made the byte routine readf@ and the character routine readfa@ but did not make real*4 and real*8 utilities. What is the best way to convert a real*4 number into 4 character*1 values and vice versa? Ideally in a way that is portable across all platforms and languages, as with HDF5?
mecej4



Joined: 31 Oct 2006
Posts: 1884

Posted: Fri Dec 23, 2016 6:40 am

That cannot be done.

Text files use LF (or CR+LF) to separate lines. These characters are not used for any other purpose or with any other meaning in a text file. The READFA@ subroutine reads one line for each call to it. The buffer that you provide is filled with all characters in the line up to, but not including, the LF.

Real numbers in their internal binary format cannot be placed in text files. Why? Consider, for example, the REAL*4 number 552.0. It has an IEEE representation of Z'440A0000'. Look at the second most significant byte, 0A. How do you tell that that is part of a number and not a record separator? How to tell that the next byte, Z'44', is not the letter 'D'?

Another reason is that such files cannot be printed or viewed by most people, who are not proficient at mental calculations using hexadecimal numbers.
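
The 552.0 example can be checked with the standard TRANSFER intrinsic, which copies a bit pattern from one type to another without any conversion. This is a sketch in standard Fortran rather than anything FTN95-specific; real_bytes and its three functions are names invented here. The order of the four bytes in memory and in a file depends on the machine (least significant byte first on x86).

```fortran
module real_bytes
   use, intrinsic :: iso_fortran_env, only: real32, int32
   implicit none
contains
   ! The 4 bytes of a 32-bit real, in memory order, as CHARACTER*1 values.
   pure function real_to_bytes(x) result(b)
      real(real32), intent(in) :: x
      character(len=1) :: b(4)
      b = transfer(x, b, 4)
   end function real_to_bytes

   ! The inverse: reassemble a 32-bit real from its 4 bytes.
   pure function bytes_to_real(b) result(x)
      character(len=1), intent(in) :: b(4)
      real(real32) :: x
      x = transfer(b, x)
   end function bytes_to_real

   ! The same bit pattern viewed as a 32-bit integer, handy for hex output.
   pure function real_bits(x) result(i)
      real(real32), intent(in) :: x
      integer(int32) :: i
      i = transfer(x, i)
   end function real_bits
end module real_bytes
```

Printing real_bits(552.0_real32) with the edit descriptor Z8.8 should show 440A0000, and the array from real_to_bytes then contains both a line-feed character (Z'0A') and the letter 'D' (Z'44'). That is harmless in a binary file and fatal in a text file.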
DanRRight



Joined: 10 Mar 2008
Posts: 2813
Location: South Pole, Antarctica

Posted: Fri Dec 23, 2016 12:09 pm

I do not think you are right, mecej4; the data is converted perfectly. The problem is solved unless you find errors. Here is the bottom line: we have some big data in a real*4 array A(ix,iy,iz), save it to a RAM disk (at 2 GB/s), and then read and recover it into a real*4 array C(ix,iy,iz) at a previously unseen

*** 4+ GB/s reading speed *** without losing anything to format conversion. Binary I/O is used as the carrier. The same can be done for any other big data.

Code:

Time for reading file of size MB:       280  0.063 s
Estimated throughput =        4480. MB/s


Code:

integer, parameter :: ix = 10, iy=700000, iz=10 
integer, parameter :: BSIZ = 4 * ix * iy * iz 
character (Len=1) :: buf(BSIZ), bufRead(BSIZ)
integer*2 :: hndl, ecode
integer*8 :: nbytes = BSIZ, nread

real*4 A(ix,iy,iz)
equivalence (A,Buf)
real*4 C(ix,iy,iz)
equivalence (C,bufRead)

do iiz=1,iz
 do iiy=1,iy
  do iix=1,ix
   A(iix,iiy,iiz) = iix
  enddo
 enddo
enddo

print*, 'A=', A(1,1,1), A(2,1,1)

call openw@('Y.bin',hndl,ecode)
if(ecode /= 0)stop 'Error opening file Y.BIN for writing'
call writef@(buf,hndl,nbytes,ecode)
if(ecode /= 0)stop 'Error writing file Y.BIN'
call closef@(hndl,ecode)

! .............................

call openr@('Y.bin',hndl,ecode)
if(ecode /= 0)stop 'Error opening file Y.BIN for reading'
call cpu_time(t1)
call readf@(bufRead,hndl,nbytes,nread,ecode)
call cpu_time(t2)
if(ecode /= 0)stop 'Error reading file Y.BIN'
call closef@(hndl,ecode)

write(*,'(A,2x,i7, F7.3,A)')'Time for reading file of size MB: ',BSIZ/1000000, t2-t1,' s'
write(*,'(A,6x,F6.0,A)')'Estimated throughput = ',BSIZ/1000000/max(1.e-6,(t2-t1)),' MB/s'

print*, 'Checking if C=A', C(1,1,1), C(2,1,1)

end
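
For comparison, the same whole-array transfer can be sketched in standard Fortran without EQUIVALENCE or a separate byte buffer, using unformatted stream access (Fortran 2003, with NEWUNIT from Fortran 2008). This is an assumption-laden sketch, not a tested replacement for the code above: stream_io, save_array and load_array are made-up names, and neither the speed of this route nor its support in the FTN95 version discussed here has been checked.

```fortran
module stream_io
   implicit none
contains
   ! Write a whole 3D array as raw bytes, in column-major storage order.
   subroutine save_array(fname, a)
      character(len=*), intent(in) :: fname
      real, intent(in) :: a(:,:,:)
      integer :: iu
      open (newunit=iu, file=fname, access='stream', form='unformatted', &
            status='replace', action='write')
      write (iu) a
      close (iu)
   end subroutine save_array

   ! Read the raw bytes straight back into a 3D array of the same shape.
   ! ios is zero on success and non-zero if the open or the read failed.
   subroutine load_array(fname, c, ios)
      character(len=*), intent(in) :: fname
      real, intent(out) :: c(:,:,:)
      integer, intent(out) :: ios
      integer :: iu
      open (newunit=iu, file=fname, access='stream', form='unformatted', &
            status='old', action='read', iostat=ios)
      if (ios /= 0) return
      read (iu, iostat=ios) c
      close (iu)
   end subroutine load_array
end module stream_io
```

The file holds nothing but the array elements, so the reader has to know the shape and the element type from somewhere else, exactly as with the readf@ version.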


Last edited by DanRRight on Fri Dec 23, 2016 12:40 pm; edited 1 time in total
mecej4



Joined: 31 Oct 2006
Posts: 1884

Posted: Fri Dec 23, 2016 12:37 pm

Dan, problem not solved. Problem under rug. Try reading the data file in a text editor.

You are still calling READF@ and WRITEF@. These subroutines simply read and write bytes with no awareness of what those bytes represent. The code that you just posted is the same as my binary I/O example with a few lines added to initialize the array A. You cannot view the file or print it and make sense of its contents. No conversions involved.

What I said you cannot do is to perform I/O of REAL variables to text files without format conversion, by calling WRITEFA@ and READFA@. You can certainly convert your terabyte-sized data files to binary files and then process the binary files. The conversion is an unavoidable and time-consuming process. The less often you need to do the conversion, the better.

If your data is coming from someone else, you can work with them to define a custom file exchange format or use HDF/NETCDF. If you receive text files from them, you cannot avoid slow format conversions.

If you end up using binary files, you had better add some safety features to protect and verify the integrity of the "non-human-readable" data in them. For example, you can add check-sums after every MiB of data, a separate companion check-sum file, etc.
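
As one possible form of such a check-sum, here is a sketch of the well-known Adler-32 algorithm applied to a byte buffer of the kind passed to READF@ and WRITEF@. The choice of Adler-32 is only an example, not a recommendation over CRC or hash functions, and checksum and adler32 are names made up for the sketch.

```fortran
module checksum
   use, intrinsic :: iso_fortran_env, only: int64
   implicit none
contains
   ! Adler-32 check-sum of a byte buffer, returned in a 64-bit integer so
   ! that the unsigned 32-bit result never overflows.
   pure function adler32(buf) result(sum32)
      character(len=1), intent(in) :: buf(:)
      integer(int64) :: sum32
      integer(int64), parameter :: base = 65521_int64  ! largest prime < 2**16
      integer(int64) :: a, b
      integer :: i

      a = 1
      b = 0
      do i = 1, size(buf)
         a = mod(a + ichar(buf(i)), base)   ! running sum of the bytes
         b = mod(b + a, base)               ! running sum of the sums
      end do
      sum32 = b*65536_int64 + a
   end function adler32
end module checksum
```

One way to use it would be to compute the value for each 1 MiB block before writing, store it after the block or in a companion file, and recompute and compare after reading.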


Last edited by mecej4 on Fri Dec 23, 2016 1:46 pm; edited 1 time in total
forums.silverfrost.com Forum Index -> General   All times are GMT + 1 Hour
Page 2 of 4