forums.silverfrost.com Forum Index forums.silverfrost.com
Welcome to the Silverfrost forums
 

Can REAL*4 array store more than 4GB?
DanRRight



Joined: 10 Mar 2008
Posts: 2450
Location: South Pole, Antarctica

Posted: Mon Sep 27, 2021 4:51 am    Post subject: Can REAL*4 array store more than 4GB?

In small test programs I can allocate real*4 arrays larger than 4 GB with no problem. In a large program, however, they unpredictably crash the program with an access violation, sometimes even when they are smaller than 4 GB. Is this because the indexes of real*4 arrays are held internally as integer*4 numbers and somehow overflow?

PaulLaidler
Site Admin

Joined: 21 Feb 2005
Posts: 7099
Location: Salford, UK

Posted: Mon Sep 27, 2021 8:23 am

The addressing uses 64 bit integers for both 32 bit and 64 bit applications so I assume the fault lies elsewhere.
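
Even when the compiler addresses arrays with 64-bit integers, counts that the program itself holds in default (integer*4) variables can still overflow. The following is only a sketch with illustrative dimensions (10 x 500 million, chosen here as an assumption) showing how to check whether an element count still fits in integer*4:

```fortran
program big_index_check
  implicit none
  integer, parameter :: i4 = selected_int_kind(9)
  integer, parameter :: i8 = selected_int_kind(18)
  ! Illustrative dimensions only (assumed for this sketch).
  integer(i8), parameter :: n1 = 10_i8, n2 = 500000000_i8
  integer(i8) :: n_elements, n_bytes

  n_elements = n1 * n2              ! 64-bit product, cannot overflow here
  n_bytes    = 4_i8 * n_elements    ! real*4 takes 4 bytes per element

  write (*,*) 'elements          =', n_elements
  write (*,*) 'bytes             =', n_bytes
  write (*,*) 'largest integer*4 =', huge(0_i4)
  if (n_elements > int(huge(0_i4), i8)) then
    ! Any counter, size() result or byte count kept in integer*4 would wrap.
    write (*,*) 'element count does not fit in integer*4'
  end if
end program big_index_check
```

If the count does not fit, loop counters and totals in the user code need to be integer*8, independently of what the compiler does internally.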

DanRRight

Posted: Mon Sep 27, 2021 9:13 pm

I do not know where to look for the error. Here is where the crash happens:

Code:
real*4, dimension(:,:), allocatable :: Partcl4

write(19,err=970) partcl4


It crashes for arrays larger than a few GB; smaller ones usually go through OK.
Maybe FTN95 has a limit on the size of an array written this way? The first dimension is around 10, the second around 100-500 million.

And if I substitute the write with a DO loop then everything works, but 10x slower:

Code:
  do iii=1,NumbOfPartInData(ion)
  write(19,err=970) partcl4(:,iii)
  enddo
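
A possible middle ground between one huge record and one record per column is to write blocks of columns, sized so that each record stays under a chosen byte limit. This is only a sketch: the array sizes are tiny so that it runs quickly, and `max_record_bytes` is an assumed budget, not a documented FTN95 limit.

```fortran
program chunked_write
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  ! Small illustrative sizes; the real array is far larger.
  integer, parameter :: idim1 = 11, idim2 = 1000
  ! Assumed byte budget per record; for real data pick a value below 2**31.
  integer(i8), parameter :: max_record_bytes = 16384_i8
  real*4, allocatable :: partcl4(:,:), check(:,:)
  integer :: cols_per_record, j1, j2, nrecords, ios

  allocate (partcl4(idim1,idim2), check(idim1,idim2))
  call random_number(partcl4)

  ! Whole columns that fit in one record (each column is 4*idim1 bytes).
  cols_per_record = int(max_record_bytes / (4_i8*idim1))

  open (unit=19, file='chunked.bin', form='unformatted', &
        status='replace', iostat=ios)
  if (ios /= 0) stop 'open failed'
  nrecords = 0
  do j1 = 1, idim2, cols_per_record
    j2 = min(j1 + cols_per_record - 1, idim2)
    write (19, iostat=ios) partcl4(:, j1:j2)   ! one record per block of columns
    if (ios /= 0) stop 'write failed'
    nrecords = nrecords + 1
  end do

  ! Read back with the same blocking and compare with the original.
  rewind (19)
  do j1 = 1, idim2, cols_per_record
    j2 = min(j1 + cols_per_record - 1, idim2)
    read (19, iostat=ios) check(:, j1:j2)
    if (ios /= 0) stop 'read failed'
  end do
  close (19, status='delete')

  if (any(check /= partcl4)) stop 'mismatch after read-back'
  write (*,*) 'records written =', nrecords
end program chunked_write
```

With a second dimension beyond the integer*4 range, `j1` and `j2` would need to be integer*8 as well.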


Could denormal numbers somehow be involved? Such numbers could appear when a large data set is loaded (I read the values as real*8 and then assign them to real*4 to save space and time).

PaulLaidler

Posted: Tue Sep 28, 2021 9:13 am

Dan

If you have a reasonably small sample program that demonstrates the failure then we could investigate. We would need a "working" program and details of the command line arguments (did you mention if it's 64 bits?).

JohnCampbell

Joined: 16 Feb 2006
Posts: 2341
Location: Sydney

Posted: Wed Sep 29, 2021 9:13 am

Paul,

This is an example of the problem.
It appears that unformatted binary file records cannot exceed 2 GB in size.
I presume the record header/trailer does not support it.
The test code is:
Code:
    real*4, parameter :: gb = 2.5     ! = 1.5 works, but = 2.5 fails
    real*4    :: size_6gb = gb * 2.**30
    integer*4 :: n,m,i,j, iostat

    integer*1 :: header(17)
    integer*4 :: H4(4)
    equivalence ( header(2), H4(1) )

    integer*4, dimension(:,:), allocatable :: Partcl4

    m = 2**20
    N = size_6gb / (4.*m)
    write ( *,*) 'target size =',gb,' GBytes'
    write ( *,*) 'target dimensions =',n,m
    write ( *,*) size_6gb / (n*m), '  should = 4.0 bytes per word'

    open (unit=19, file='large_binary.bin', form='unformatted', iostat=iostat)
    write (*,*) ' file=large_binary.bin : opening unformatted file : iostat=',iostat

    allocate ( Partcl4(N,M), stat=iostat )
    write ( *,*) ' Partcl4(N,M) allocated, stat=',iostat
    write ( *,*) ' size(Partcl4) =',size(Partcl4)

    do j = 1,m
      do i = 1,n
        Partcl4(i,j) = i+j
      end do
    end do
    write ( *,*) ' Partcl4(N,M) initialised'

    write (19,iostat=iostat) partcl4
    write (*,*) ' large record written, iostat=',iostat
    close (unit=19)

!z    open (unit=11, file='large_binary.bin', access='transparent', iostat=iostat)
    open (unit=11, file='large_binary.bin', access='stream', iostat=iostat)
    write (*,*) ' opening transparent file : iostat=',iostat

    read (11) header
    close (unit=11)
    write (*,*) 'record header'
    do i = 1,size(header)
      write (*,*) i, header(i)
    end do
    do i = 1,size(h4)
      write (*,*) i, h4(i)
    end do

    end

My build test is
Code:
del %1.exe                       >> %1.log
del large_binary_file.bin        >> %1.log
ftn95 %1 /64 /debug /link        >> %1.log
del large_binary.bin             >> %1.log
%1                               >> %1.log
dir *.bin                        >> %1.log

type %1.log


My test results (for the 1.5 GB case) were:
Code:
[FTN95/x64 Ver. 8.74.0 Copyright (c) Silverfrost Ltd 1993-2021]

[Current options] 64;DEBUG;ERROR_NUMBERS;IMPLICIT_NONE;INTL;LINK;LOGL;

    NO ERRORS  [<main program> FTN95 v8.74.0]
[SLINK64 v3.02, Copyright (c) Silverfrost Ltd. 2015-2021]
Loading c:\temp\forum\danR\lgotemp@.obj
Creating executable file gb6_file.exe
 target size =     1.50000     GBytes
 target dimensions =         384     1048576
      4.00000      should = 4.0 bytes per word
  file=large_binary.bin : opening unformatted file : iostat=           0
  Partcl4(N,M) allocated, stat=           0
  size(Partcl4) =            402653184
  Partcl4(N,M) initialised
  large record written, iostat=           0
  opening transparent file : iostat=           0
 record header
            1          -1
            2           0
            3           0
            4           0
            5          96
            6           2
            7           0
            8           0
            9           0
           10           3
           11           0
           12           0
           13           0
           14           4
 Volume in drive C is Acer
 Volume Serial Number is 5CF1-CCA3

 Directory of c:\temp\forum\danR

29/09/2021  05:54 PM     1,610,612,746 large_binary.bin
               2 File(s)  1,610,612,746 bytes
               0 Dir(s)  689,366,872,064 bytes free



JohnCampbell

Posted: Wed Sep 29, 2021 9:56 am

/ctd

It would be good to:
1) support records greater than 2GBytes
2) have a compiler option to support gFortran and iFort record header formats, which may support larger records. I don't know the format.

It appears that gFortran writes records larger than ~2gb (or ~ 4gb) as multiple records.

My gFortran test code is
Code:
    real*4, parameter :: gb = 4.5     ! = 1.5 works, but = 2.5 fails
    real*4    :: size_6gb = gb * 2.**30
    integer*4 :: n,m,i,j, iostat
    integer*1 :: header(16)
    integer*4 :: H4(4)
    equivalence ( header(1), H4(1) )

    integer*4, dimension(:,:), allocatable :: Partcl4

    m = 2**20
    N = size_6gb / (4.*m)
    write ( *,*) 'target size =',gb,' GBytes'
    write ( *,*) 'target dimensions =',n,m
    write ( *,*) size_6gb / (n*m), '  should = 4.0 bytes per word'

    open (unit=19, file='large_binary.bin', form='unformatted', iostat=iostat)
    write (*,*) ' file=large_binary.bin : opening unformatted file : iostat=',iostat

    allocate ( Partcl4(N,M), stat=iostat )
    write ( *,*) ' Partcl4(N,M) allocated, stat=',iostat
    write ( *,*) ' size(Partcl4) =',size(Partcl4)

    do j = 1,m
      do i = 1,n
        Partcl4(i,j) = i+j
      end do
    end do
    write ( *,*) ' Partcl4(N,M) initialised'

    write (19,iostat=iostat) partcl4
    write (*,*) ' large record written, iostat=',iostat
    close (unit=19)

!z    open (unit=11, file='large_binary.bin', access='transparent', iostat=iostat)
    open (unit=11, file='large_binary.bin', access='stream', iostat=iostat)
    write (*,*) ' opening transparent file : iostat=',iostat

    read (11) header
    close (unit=11)
    write (*,*) 'record header'
    do i = 1,size(header)
      write (*,*) i, header(i)
    end do
    do i = 1,size(h4)
      write (*,*) i, h4(i)
    end do

    end


the build is
Code:
del %1.exe                       >> %1.log
del large_binary_file.bin        >> %1.log
gfortran %1.f90 -fimplicit-none -O2 -o %1.exe   >> %1.log 2>&1
del large_binary.bin             >> %1.log
%1                               >> %1.log
dir *.bin                        >> %1.log

type %1.log


You can experiment with record sizes of 1.5, 2.5 and 4.5 gb to get the header format, but it appears to use a 4-byte header and reads/writes multiple records when oversized.

PaulLaidler

Posted: Wed Sep 29, 2021 2:46 pm

Thanks John

I have made a note of the issue that you have raised.

JohnCampbell

Posted: Thu Sep 30, 2021 9:27 am

FTN95 uses a 1-byte or 5-byte record header/trailer format.

The 1-byte form is for records smaller than 255 bytes (byte 1 /= -1),
while for records >= 255 bytes, byte 1 = -1 and bytes 2-5 hold the record size.
I assume the largest header-data-trailer is limited to 2^31-1 bytes?

It appears that in FTN95 there is no support for records larger than 2^31-1 bytes in size. This is based on the example above.
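
The header layout described above can be decoded with a few lines of code. This is a sketch based only on that description: it assumes a little-endian machine, and it assumes that in the 1-byte form the byte itself holds the record length as an unsigned value, which the thread does not confirm. The five bytes used are the ones printed in the 1.5 GB test log earlier in the thread.

```fortran
program decode_ftn95_header
  implicit none
  integer, parameter :: i1 = selected_int_kind(2)
  integer, parameter :: i4 = selected_int_kind(9)
  ! First five bytes of the file, taken from the 1.5 GB test log.
  integer(i1) :: header(5) = (/ -1_i1, 0_i1, 0_i1, 0_i1, 96_i1 /)
  integer(i4) :: record_bytes
  integer :: header_bytes

  if (header(1) == -1_i1) then
    ! 5-byte form: marker byte followed by a 4-byte length.
    header_bytes = 5
    record_bytes = transfer(header(2:5), record_bytes)
  else
    ! 1-byte form: assumed to hold the length as an unsigned byte.
    header_bytes = 1
    record_bytes = iand(int(header(1), i4), 255_i4)
  end if

  write (*,*) 'header bytes =', header_bytes
  write (*,*) 'record bytes =', record_bytes
end program decode_ftn95_header
```

On a little-endian machine the decoded length is 1610612736 bytes; adding a 5-byte header and a 5-byte trailer gives 1,610,612,746, which matches the file size in the directory listing.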

gFortran, however, uses a 4-byte record header/trailer format.
The largest record payload is limited to 2^31-9 bytes (2^31-1 bytes including the header and trailer).

However, unformatted read/write supports larger "records" by writing a group record consisting of:
multiple records of 2^31-9 bytes (where the header is -ve and the trailer is +ve),
followed by a final record (where the header is +ve and the trailer is -ve).
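
To make the group-record layout concrete, here is a sketch that only does the arithmetic for a 4.5 GB payload (the size used in the gFortran test); it does not write a file. It assumes the sign convention as I understand it from the gfortran documentation: the header is negative when another sub-record follows, and the trailer is negative when a sub-record precedes. Under that assumption any sub-record in the middle of a group has both markers negative.

```fortran
program subrecord_layout
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  integer(i8), parameter :: max_sub = 2147483639_i8   ! 2**31 - 9 bytes
  integer(i8) :: total, remaining, nbytes, lead, trail
  integer :: k

  total = 4831838208_i8        ! 4.5 * 2**30 bytes of user data
  remaining = total
  k = 0
  do while (remaining > 0)
    k = k + 1
    nbytes = min(remaining, max_sub)
    remaining = remaining - nbytes
    lead = nbytes
    if (remaining > 0) lead = -nbytes    ! another sub-record follows
    trail = nbytes
    if (k > 1) trail = -nbytes           ! a sub-record precedes this one
    write (*,'(a,i0,3(a,i0))') 'sub-record ', k, ': bytes=', nbytes, &
                               ' header=', lead, ' trailer=', trail
  end do
  ! Each sub-record carries a 4-byte header and a 4-byte trailer.
  write (*,'(a,i0)') 'file bytes including markers = ', total + 8_i8*k
end program subrecord_layout
```

For this size the payload splits into two full sub-records of 2147483639 bytes and a final one of 536870930 bytes.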

The maximum record size can be changed via compiler options:
an option to change the maximum (component) record size (-fmax-subrecord-length), or
an option to change the header and trailer format to 8 bytes (-frecord-marker=8).

If FTN95 does not support records larger than 2^31-1 bytes in size, it has the option of either going to
i) a (1+4+8) 13-byte format or
ii) using a group record approach similar to the one gFortran uses.

The gFortran approach could both extend the record size supported by use of a /gFortran_sequential_binary compiler option and also support a gFortran unformatted binary file interchange.

At present I use a Fortran direct-access fixed length record interface file to share data between FTN95 and gFortran. This library also overlays a variable length record format on the direct access file by generating a table of record addresses and sizes which must be stored on another database file.

Unformatted sequential access binary files are a very simple file interchange.
By combining these with stream/transparent access, they can provide a random access variable length format, by creating a list of record start addresses.
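
A minimal sketch of the "list of record start addresses" idea, using plain stream access with a 4-byte length in front of each record. The record lengths, file name and the 4-byte length prefix are assumptions made for this example, and the addresses are computed by hand rather than taken from INQUIRE. It relies on standard Fortran 2003 `pos=`, so whether it runs on a given FTN95 version is not something this sketch can confirm.

```fortran
program record_index
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: nrec = 3
  ! Hypothetical record lengths, in 4-byte words.
  integer, parameter :: lengths(nrec) = (/ 5, 2, 7 /)
  integer(i8) :: start(nrec), next_pos
  integer :: k, i, nw, ios
  real*4 :: buffer(7)

  open (unit=11, file='indexed.bin', access='stream', form='unformatted', &
        status='replace', iostat=ios)
  if (ios /= 0) stop 'open for write failed'
  next_pos = 1                        ! stream positions are 1-based bytes
  do k = 1, nrec
    nw = lengths(k)
    do i = 1, nw
      buffer(i) = real(100*k + i)
    end do
    start(k) = next_pos               ! remember where this record begins
    write (11) nw, buffer(1:nw)       ! 4-byte length, then the data
    next_pos = next_pos + 4 + 4*nw
  end do
  close (11)

  ! Random access: jump straight to record 2 using the address table.
  open (unit=11, file='indexed.bin', access='stream', form='unformatted', &
        status='old', iostat=ios)
  if (ios /= 0) stop 'open for read failed'
  read (11, pos=start(2)) nw
  read (11) buffer(1:nw)
  close (11, status='delete')

  if (nw /= 2 .or. buffer(1) /= 201.0) stop 'unexpected record contents'
  write (*,*) 'record 2 starts at byte', start(2), ' and holds', nw, 'words'
end program record_index
```

In a real file the address table would also have to be stored somewhere, for example at the end of the file or in a companion file.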

Robert

Joined: 29 Nov 2006
Posts: 390
Location: Manchester

Posted: Thu Sep 30, 2021 5:05 pm

When would you use a record length greater than 2GB?

DanRRight

Posted: Thu Sep 30, 2021 6:43 pm

Robert,
Typical PIC code files can be 50-100 GB, with a total size for the run of up to 2-10 TB. In this case using real*4 instead of real*8 is critically important.

John,
Thanks for the insights

PaulLaidler

Posted: Thu Sep 30, 2021 7:05 pm

Are there multiple very large records or just one per file?

If there is just one then there may be a case for changing to STREAM access which is now available with FTN95.

DanRRight

Posted: Thu Sep 30, 2021 9:31 pm

Paul,
The first few records are a header that says what version of the file it is,
what the dimensions idim1, idim2 of the array Partcl4(idim1,idim2) are, etc.

Then one large record follows. But because this currently does not work, I write hundreds of millions of records of 11 real*4 numbers each. Or, for a 4D array, something like 1000x1000x1000 records of 3 real*4 numbers each.

Is "stream" usable in this case? If yes, how exactly would I do that?
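
As a rough illustration of what stream access could look like for this layout, here is a sketch that writes a small header (a version number and the two dimensions) followed by the whole array, then reads it back. The file name, the `file_version` field and the tiny array sizes are assumptions for the example. Whether a given FTN95 version can transfer more than 2 GB in one stream write is not established in this thread.

```fortran
program stream_dump
  implicit none
  integer, parameter :: file_version = 1     ! hypothetical header field
  integer :: idim1, idim2, n1, n2, version_in, ios
  real*4, allocatable :: partcl4(:,:), check(:,:)

  idim1 = 11
  idim2 = 1000                               ! small sizes for the sketch
  allocate (partcl4(idim1,idim2))
  call random_number(partcl4)

  ! Stream access adds no record headers or trailers of its own.
  open (unit=19, file='partcl4.bin', access='stream', form='unformatted', &
        status='replace', iostat=ios)
  if (ios /= 0) stop 'open for write failed'
  write (19) file_version, idim1, idim2      ! header values
  write (19) partcl4                         ! whole array as raw bytes
  close (19)

  ! Read the header first, allocate, then read the array in one go.
  open (unit=19, file='partcl4.bin', access='stream', form='unformatted', &
        status='old', iostat=ios)
  if (ios /= 0) stop 'open for read failed'
  read (19) version_in, n1, n2
  allocate (check(n1,n2))
  read (19) check
  close (19, status='delete')

  if (any(check /= partcl4)) stop 'mismatch after read-back'
  write (*,*) 'version', version_in, ' dimensions', n1, n2
end program stream_dump
```

For dimensions beyond the integer*4 range the header values would need to be integer*8.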

Despite the humongous size compared to Gates' "640k which will fit everyone", saving such an array as one record takes just a second or two, because the write speed on NVMe drives may reach 3-7 GB/second, and with PCIe5 next year it will reach 15 GB/second. But because this does not work for larger files and I write the records sequentially, it takes tens of seconds. Reading is similarly fast when it is just one record, and takes tens of seconds when all the records are read one by one (around 0.5 GB/second).

Such dimensions are not considered large. I use around 1000-2500 cores at most. Some colleagues have access to 100,000 cores, and even that is nothing now that supercomputers with millions of cores are becoming widespread and cheap. So my files could easily be 100x larger: the day I get access to a monstrous supercomputer, it is a matter of changing two or three setup numbers.


Last edited by DanRRight on Fri Oct 01, 2021 4:12 am; edited 2 times in total

JohnCampbell

Posted: Fri Oct 01, 2021 3:07 am

A couple of points:
1) Large (virtual) records > 2 GB are used to save arrays larger than 2GBytes, so the capability is required.
2) The main benefit of unformatted I/O is the use of an I/O list, so more than a single array can be transferred. eg writing derived type components in F95.
3) The main problem of unformatted I/O is incompatible header/trailer formats between different Fortran compilers. (stream or direct access files can be better for transfers)
4) There can be multiple very large records in a file. For gFortran, unformatted read/write manages each very large record as a set of (2gb-9) size records + a final record , as a group of records.
5) Stream I/O can do all this in a single record, but you need to code the header/trailer yourself. Yesterday I wrote a simple stream I/O program to test records using an 8-byte file address and an 8-byte header-data-trailer format. It works well. The library routines demonstrate a solution, but without the I/O flexibility + automatic header-data-trailer structure.
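
Point 5 can be illustrated with a very small sketch of a record that carries its own 8-byte header and 8-byte trailer in a stream file. This is not the linked stream_test.f90; the file name and the choice to store the byte count in both markers are assumptions made for the example.

```fortran
program stream_record_8byte
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: nw = 1000
  real*4 :: array(nw), check(nw)
  integer(i8) :: nbytes, header, trailer
  integer :: ios

  call random_number(array)
  nbytes = 4_i8 * nw                   ! payload size in bytes

  open (unit=11, file='stream_rec.bin', access='stream', form='unformatted', &
        status='replace', iostat=ios)
  if (ios /= 0) stop 'open for write failed'
  ! Hand-coded record: 8-byte header, data, 8-byte trailer.
  write (11) nbytes, array, nbytes
  close (11)

  open (unit=11, file='stream_rec.bin', access='stream', form='unformatted', &
        status='old', iostat=ios)
  if (ios /= 0) stop 'open for read failed'
  read (11) header
  if (header /= nbytes) stop 'unexpected header'
  read (11) check(1:int(header/4))     ! payload length comes from the header
  read (11) trailer
  close (11, status='delete')

  if (trailer /= header) stop 'header/trailer mismatch'
  if (any(check /= array)) stop 'data mismatch'
  write (*,*) 'record of', header, 'bytes read back correctly'
end program stream_record_8byte
```

Keeping the length in the trailer as well as the header is what would allow a reader to step backwards through the file.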

Below are links to :
stream_test.f90 : stream I/O record demonstration with 8-byte header-data-trailer. ( this is "in development" so some redundant or incomplete info, eg managing record data structure tables.)
gf6_file.f90 : demonstrates the record structure for unformatted records larger than 2GBytes. This also has routines to test the FTN95 header/trailer format, although I haven't tested them as yet.
test_xx.bat : a test batch file for gFortran.

Dan, as a short term test, the stream_test.f90 could be converted to FTN95, although it is structured as a single array per record.
You can use a more complex I/O list with stream I/O, but you would need to provide the header-data-trailer structure if you wanted to use this for stepping through the file and recognising the structure. All achievable with stream.

https://www.dropbox.com/s/itvo676lbxv60sd/stream_test.f90?dl=0
https://www.dropbox.com/s/63qddfo5bf36pbi/gf6_file.f90?dl=0
https://www.dropbox.com/s/ea0galabqsmpur7/test_xx.bat?dl=0

JohnCampbell

Posted: Fri Oct 01, 2021 2:19 pm

Paul,

I have been trying to utilise stream I/O in FTN95 Ver 8.00.0 but I am not getting "pos=" to work for:
READ (...,pos=address, ...)
WRITE (...,pos=address, ...)
INQUIRE (...,pos=address, ...)

FTN95.chm does reference "pos=" in the OPEN ( ACCESS='STREAM' description.

I am trying to test a random access binary file format using stream access and an 8-byte header/data/trailer record format (which no compiler supports !!)

The syntax I am using is:
integer*8 :: address, stream_address
integer*4 :: stream_unit, iostat
open ( unit=stream_unit, file=stream_file_name, access='STREAM', form='UNFORMATTED', status='UNKNOWN', iostat=iostat )
READ (unit=stream_unit, pos=address, iostat=iostat ) array(1:nw)
WRITE (unit=stream_unit, pos=address, iostat=iostat ) array(1:nw)
inquire ( unit=stream_unit, pos=stream_address, iostat=iostat )

Is this supported, or is there another way to get or set the file address ?

John

JohnCampbell

Posted: Fri Oct 01, 2021 3:28 pm

Paul,

Could it be that "pos=" does not support an integer*8 address ?

I changed to integer*4 address ( which is not adequate ) and the program worked better, providing a more realistic address.

However "inquire ( unit=stream_unit, pos=stream_address, iostat=iostat )"
now returns the last address in use, rather than the next address to be used in the file,
except when at the start of the file, when it returns 1 : the next address to be used. (gFortran returns the next byte address to be used in all cases)

John
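
For anyone wanting to check how their compiler behaves here, a small test along these lines may help. It is a sketch, with an assumed file name, that queries `pos=` with an integer*8 variable before and after writing 12 bytes. As far as I understand the Fortran 2003 standard, POS= in INQUIRE gives the position of the next file storage unit to be read or written, so the expected values are 1 and 13.

```fortran
program inquire_pos_check
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  integer(i8) :: pos_start, pos_after
  integer :: ios
  real*4 :: x(3)

  x = (/ 1.0, 2.0, 3.0 /)
  open (unit=11, file='pos_test.bin', access='stream', form='unformatted', &
        status='replace', iostat=ios)
  if (ios /= 0) stop 'open failed'

  inquire (unit=11, pos=pos_start)     ! position before any transfer
  write (11) x                         ! 3 x 4 = 12 bytes
  inquire (unit=11, pos=pos_after)     ! position after the write
  close (11, status='delete')

  write (*,*) 'pos at start       =', pos_start
  write (*,*) 'pos after 12 bytes =', pos_after
  if (pos_start /= 1 .or. pos_after /= 13) then
    write (*,*) 'pos= is not reporting the next byte to be used'
  end if
end program inquire_pos_check
```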

All times are GMT + 1 Hour